Motivation and Purpose
The Ontologies Mapping project has been set up to create better tools or services and to establish best practices for ontology management in the Life Sciences.
Ontologies can include hierarchical relationships, taxonomies, classifications and/or vocabularies, all of which are becoming increasingly important in supporting research and development. They have numerous applications, such as knowledge management, data integration and text mining, where researchers need to analyse large quantities of complex data as part of their daily work. The Ontologies Mapping project will give users access to standardised tools, methodologies and services that enable them to map and visualise ontologies and to understand ontology structure, potential overlaps and equivalence of meaning. The impact of this project will be to help users integrate, understand and analyse their data more effectively.
Business case and Achievements
The idea for the Ontologies Mapping project was proposed through the Pistoia Alliance Ideas Portfolio Platform (IP3) and was selected by the Operations Team and Pistoia Board for development of a formal business case, as published in IP3. GSK, Merck & Co, Novartis, Roche and BIOVIA 3DS are funding the project throughout 2016, as shown in the Project Timeline and Deliverables figure.
Phase 1 of the project in 2015 delivered: 1) the selection of the disease, phenotype and experimental investigation domains as "test cases"; 2) guidelines for best practice and a "checklist" to support the application and mapping of source ontologies; and 3) requirements for an Ontologies Mapping tool. For Phase 2 in 2016, we have 1) developed an RFI process to evaluate existing ontologies mapping tools; 2) organised a new track for the evaluation of ontology matching algorithms in OM-2016; 3) defined the requirements for an ontologies mapping service; and 4) conducted a questionnaire to understand the demand for such a service, potentially for Phase 3 in 2017.
Project structure and Communication
The Project Steering Committee (including the funders) is responsible for making decisions, informed by recommendations from the Project Team, which executes the project tasks. The Project Team comprises Pistoia Alliance members (including the funders) and meets biweekly. Each month the Project Team consults with the Community of Interest, which is open to any organisation or individual with relevant skills and experience. The relationship between the three groups is illustrated in the Project Structure and Communication figure.
Community of Interest and Ontologies Guidelines
The project has built a Community of Interest of considerable size and influence in the ontologies field. It has delivered a set of guidelines for best practice and a checklist to exemplify their value. The Ontologies Guidelines for Best Practice are available on a publicly accessible wiki: https://pistoiaalliance.atlassian.net/wiki/display/PUB/Ontologies+Mapping+Resources
Evaluation of Ontologies Mapping Tools
Understanding the detailed requirements for an ontologies mapping tool enabled us to evaluate existing tools during Phase 2 of the project. Seven tool providers participated in the request for information (RFI) to support our systematic evaluation. Details of this process and our findings are available on the public project resource wiki. Our RFI evaluation considered 1) tool capability against our requirements and 2) mapping performance on two mapping tasks, each between a pair of ontologies in the disease and phenotype domain: Human Phenotype (HP) Ontology vs. Mammalian Phenotype (MP) Ontology, and Human Disease Ontology (DOID) vs. Orphanet and Rare Diseases Ontology (ORDO).
We identified the three top-performing tools able to substantially meet our requirements and to match equivalence and similarity. The anonymised results are publicly available, whereas individual tool identity is available only to Pistoia members for internal use. Summaries of these results have been presented as poster papers at conferences; these, together with a Pistoia Debates webinar on ontologies, are also available on the public project resource wiki.
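To make the idea of mapping performance concrete, the sketch below scores a tool's candidate mappings against a reference alignment. The text above does not say which metrics the RFI used; precision, recall and F-measure are assumed here because they are the usual measures in ontology matching evaluation. The `Mapping` type, the `score_alignment` function and all term identifiers are hypothetical placeholders, not project data.

```python
from typing import Dict, NamedTuple, Set


class Mapping(NamedTuple):
    """One proposed correspondence between a source and a target term."""
    source: str
    target: str


def score_alignment(candidate: Set[Mapping], reference: Set[Mapping]) -> Dict[str, float]:
    """Compare candidate mappings with a reference alignment.

    Precision: fraction of candidate mappings that are in the reference.
    Recall:    fraction of reference mappings that the candidate found.
    F-measure: harmonic mean of precision and recall.
    """
    true_positives = len(candidate & reference)
    precision = true_positives / len(candidate) if candidate else 0.0
    recall = true_positives / len(reference) if reference else 0.0
    total = precision + recall
    f_measure = 2 * precision * recall / total if total else 0.0
    return {"precision": precision, "recall": recall, "f_measure": f_measure}


# Placeholder identifiers only; these are not real HP or MP terms.
reference = {
    Mapping("HP:x1", "MP:y1"),
    Mapping("HP:x2", "MP:y2"),
    Mapping("HP:x3", "MP:y3"),
    Mapping("HP:x4", "MP:y4"),
}
candidate = {
    Mapping("HP:x1", "MP:y1"),  # correct
    Mapping("HP:x2", "MP:y2"),  # correct
    Mapping("HP:x3", "MP:y9"),  # incorrect target
}

scores = score_alignment(candidate, reference)
print(scores)
```

In this toy case two of the three candidate mappings are correct and two of the four reference mappings are found, so precision is 2/3 and recall is 1/2. A real evaluation would also need an agreed reference alignment, which is itself a substantial piece of work.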
Evaluation of Ontology Matching algorithms
Ontology Matching algorithms are fundamental components of any modern Ontologies Mapping tool. To engage meaningfully with the substantial community of algorithm developers, we sponsored and organised the disease and phenotype track of the 2016 campaign of the Ontology Alignment Evaluation Initiative (OAEI), which evaluated the performance of algorithms for ontology matching. The results of this evaluation have been published in the paper entitled “Matching disease and phenotype ontologies in the ontology alignment evaluation initiative” (https://doi.org/10.1186/s13326-017-0162-9). Our involvement in this annual challenge has continued for the 2017 campaign.
Implementation of a prototype Ontologies Mapping Service for Phase 3
An important output of the OM project has been the development of our requirements for an Ontologies Mapping Service. An Ontologies Mapping Tool finds matches of equivalence and similarity between ontologies, and these mappings power applications; however, such valuable mappings lose value as the source ontologies change. A mapping service, as illustrated here, is therefore required to maintain the ontologies mappings, a task which would otherwise consume valuable internal resource in any organisation using them. We have been making good progress with the implementation of a prototype Ontologies Mapping Service, working with the Samples, Phenotypes and Ontologies Team at EMBL-EBI. We expect to be in a position to share our work early in 2018, and we plan to publish it.
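One maintenance task such a service might perform is flagging mappings that refer to terms no longer present in a new ontology release. The sketch below illustrates that single check under simple assumptions: each release is represented as a set of active term identifiers, and each mapping as a source term, a target term and a relation. The data model, the `find_stale_mappings` function and the identifiers are hypothetical and are not taken from the prototype service.

```python
from typing import List, NamedTuple, Set, Tuple


class Mapping(NamedTuple):
    """A stored correspondence between two ontology terms."""
    source: str
    target: str
    relation: str  # assumed labels, e.g. "equivalent" or "similar"


def find_stale_mappings(
    mappings: List[Mapping],
    source_terms: Set[str],
    target_terms: Set[str],
) -> List[Tuple[Mapping, List[str]]]:
    """Return each mapping with an endpoint missing from the current releases.

    source_terms and target_terms hold the active term identifiers of the
    latest release of the source and target ontology respectively.
    """
    stale = []
    for mapping in mappings:
        missing = []
        if mapping.source not in source_terms:
            missing.append(mapping.source)
        if mapping.target not in target_terms:
            missing.append(mapping.target)
        if missing:
            stale.append((mapping, missing))
    return stale


# Placeholder identifiers only; these are not real DOID or ORDO terms.
stored_mappings = [
    Mapping("DOID:a1", "ORDO:b1", "equivalent"),
    Mapping("DOID:a2", "ORDO:b2", "similar"),
    Mapping("DOID:a3", "ORDO:b3", "equivalent"),
]

# Assume a new target release has retired the term ORDO:b2.
current_source_terms = {"DOID:a1", "DOID:a2", "DOID:a3"}
current_target_terms = {"ORDO:b1", "ORDO:b3"}

for mapping, missing in find_stale_mappings(
    stored_mappings, current_source_terms, current_target_terms
):
    print(f"Needs review: {mapping.source} -> {mapping.target} (missing: {missing})")
```

This check only catches terms that have disappeared. Changes in a term's label, definition or position in the hierarchy can also invalidate a mapping and would need additional comparison between releases.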