Today, scientists in biomedical fields rely on biological data sources in their research. Large amounts of information concerning genes, proteins and diseases are available on the internet and are used daily for acquiring knowledge. Typically, biological data is spread across multiple sources, which has led to heterogeneity and redundancy. This thesis suggests grouping as one way of computationally managing biological data. A conceptual model for this purpose is presented, which takes properties specific to biological data into account. The model defines sub-tasks and key issues where multiple solutions are possible, and describes which approaches have been used in earlier work. Further, an implementation of this model is described, together with test cases that show the model is useful. Since the use of ontologies is relatively new in the management of biological data, the main focus of the thesis is on how semantic similarity of ontological annotations can be used for grouping. The results of the test cases show, for example, that the implementation of the model, using Gene Ontology, is capable of producing groups of data entries with similar molecular functions.
Book Details:
ISBN-13: 978-3-8383-1056-5
ISBN-10: 383831056X
EAN: 9783838310565
Book language: English
By (author): David Rundqvist
Number of pages: 96
Published on: 2010-05-21
Category: Informatics