Delivering Data Driven Value

FAIR Implementation with Mondo

This webinar explores how the Mondo ontology bridges clinical and translational research, featuring insights from Chris Mungall and hosted by the Pistoia Alliance’s FAIR Implementation Community of Interest.

The Cellosaurus: A FAIR Repository to Help Researchers Navigate the Confusing Universe of Cell Lines

By Amos Bairoch, University of Geneva and Swiss Institute of Bioinformatics

This webinar presents the Cellosaurus, a manually curated knowledge resource that aims to describe all cell lines used in biomedical research. It provides information on immortalized, naturally immortal, and finite-life cell lines. Its taxonomic scope encompasses both vertebrates and invertebrates. It currently describes over 122,000 cell lines from 684 species. For each cell line, it provides a wealth of information, cross-references, and literature citations.

The Cellosaurus is available on the ExPASy server (https://web.expasy.org/cellosaurus/) and can be downloaded in different formats under the CC BY 4.0 license. The Cellosaurus is a key resource for helping researchers identify potentially contaminated or misidentified cell lines, thereby improving the quality and reproducibility of research in the life sciences. It is part of the Resource Identification Initiative (RII), which aims to enable resource transparency within the biomedical literature through the use of Research Resource Identifiers (RRIDs). Some of the information in the Cellosaurus is uploaded to Wikidata, allowing cell lines to be connected semantically to other biological objects. We would like to expand its use in the FAIRification of biological data by providing an RDF version of the resource and a SPARQL query endpoint.

FAIR by Design

This webinar will explore how the FAIR (Findable, Accessible, Interoperable, and Reusable) data principles can serve as a key enabler to automate and accelerate R&D process workflows. Through the lens of a real-world use case, the session will illustrate the practical implementation of FAIR, highlighting its role in driving faster and more impactful science. By making data more reusable and enhancing its value, FAIR also facilitates greater collaboration and partnership through improved data sharing. Ultimately, the webinar aims to show how FAIR interoperability makes data truly actionable, unlocking its full potential across research ecosystems.

Knowledge Graphs and Semantic Models for Drug Discovery and Healthcare

Data for drug discovery and healthcare is often trapped in silos, which hampers effective interpretation and reuse. To remedy this, such data needs to be linked both internally and to external sources to create a FAIR data landscape that can power semantic models and knowledge graphs.

Semantics of Data Matrices & the STATO Ontology

This webinar presents the Statistics Ontology (STATO), a semantic framework that supports the creation of standardized analysis reports to help with the review of results in the form of data matrices. STATO includes a hierarchy of classes and a vocabulary for annotating the statistical methods used in life, natural, and biomedical sciences investigations, text mining, and statistical analyses.

Data Market Evolution: A Future Shaped by FAIR

This presentation reviewed the challenges in identifying, acquiring and utilizing research data in relation to an evolving data market. Strategic solutions were examined in which the FAIR principles play a key role in the future of data management.

CEDAR Work Bench for Metadata Management

With the explosion of interest in both enhanced knowledge management and open science, the past few years have seen considerable discussion about making scientific data “FAIR”: findable, accessible, interoperable, and reusable. The problem is that most scientific datasets are not FAIR. When left to their own devices, scientists do an absolutely terrible job creating the metadata that describe the experimental datasets that make their way into online repositories. The lack of standardization makes it extremely difficult for other investigators to locate relevant datasets, to re-analyse them, and to integrate them with other data.

The Center for Expanded Data Annotation and Retrieval (CEDAR) has the goal of enhancing the authoring of experimental metadata to make online datasets more useful to the scientific community. The CEDAR work bench for metadata management will be presented in this webinar. CEDAR illustrates the importance of semantic technology in driving open science. It also demonstrates a means of simplifying access to scientific datasets and enhancing the reuse of the data to drive new discoveries.

Open Interoperability Standards, Tools And Services At EMBL-EBI

Ontologies and Semantic Web technologies play an important role in the life sciences by helping to make data more interoperable and reusable. EMBL-EBI contributes to the development of biomedical ontologies and makes extensive use of them in the annotation of public datasets, especially for large-scale data integration efforts. There is increasing recognition of the role of ontologies in making data Findable, Accessible, Interoperable and Reusable (FAIR).

The ontologies team (https://www.ebi.ac.uk/spot/ontology/) at EMBL-EBI provide a suite of services to make ontologies more accessible for both humans and machines. We work with scientific data curators and software developers to integrate ontologies and semantics into both the data generation and data presentation workflows. We provide:

  • An ontology lookup service (OLS) for search and visualisation of over 200 ontologies
  • Services for automating and predicting the annotation of data with ontologies (Zooma)
  • An ontology mapping and alignment service (OxO)
  • Tools for generating ontologies from spreadsheets (Webulous)
  • Software for enriching documents in search engines to support semantic search

This webinar will present how we are using these services at EMBL-EBI to scale up the annotation of data and deliver added value through ontologies and semantics to our users.

Computational approaches to therapeutic antibody design: established methods and emerging trends

Antibodies are proteins that recognize the molecular surfaces of potentially noxious molecules to mount an adaptive immune response or, in the case of autoimmune diseases, molecules that are part of healthy cells and tissues. Due to their binding versatility, antibodies are currently the largest class of biotherapeutics, with five monoclonal antibodies ranked in the top 10 blockbuster drugs. Computational advances in protein modelling and design can have a tangible impact on antibody-based therapeutic development. Antibody-specific computational protocols currently benefit from an increasing volume of data provided by next generation sequencing and application to related drug modalities based on traditional antibodies, such as nanobodies. Here we present a structured overview of available databases, methods, and emerging trends in computational antibody analysis and place them in the context of engineering candidate antibody therapeutics.

HELM for Bioregistration

GSK will be making use of the Pistoia Alliance’s Hierarchical Editing Language for Macromolecules (HELM) notation to represent therapeutic large molecules in its bio-registration system, facilitated by the deployment of Dassault Systèmes BIOVIA’s Biological Registration solution. GSK scientists at sites around the world will use the system.