IDMP Viewer is a Java application designed to make both the IDMP structure and its content as easy to access as possible. It can serve both as a web application and as a REST API. IDMP Viewer is an open-source project hosted by the EDM Council. See https://github.com/edmcouncil/onto-viewer for details.
FAIR4Clin Implementation Guide
A Guide for Clinical Trial and Healthcare Data
This work was done and is maintained as part of the FAIR implementation project – an initiative by the Pistoia Alliance, a not-for-profit organization that facilitates pre-competitive collaboration in the life science industry. The FAIR4Clin guide consists of three parts: Introduction, Metadata and Application.
Visualizing Data At Scale: Complex Science, Unruly Users, And The Vitruvian Triad
Drug discovery and development are becoming increasingly complex and data-intensive. Advances in high-throughput, high-content, imaging, ‘omics, digital and other technologies have enabled scientists to generate enormous amounts of data on a routine basis and use them to generate new insights and make more informed decisions. Yet, despite concurrent advances in software technology, bringing this data to life and making it accessible to “ordinary” users who lack data science expertise remains a challenge.
In this talk, the speaker will discuss some of the common challenges and pitfalls in visualizing data and highlight a number of analytic solutions that he and his teams have developed over his 30-year career in pharmaceutical R&D, reflecting on the design, engineering, and organizational principles that underpinned their success. In a way, this is a talk about the invigorating diversity of data science, the power of data visualization, and the elusive art of story-telling.
Speaker
- Dimitris K. Agrafiotis, Vice President, Digital, Worldwide Research, Development and Medical, Pfizer Digital
Data Visualization To Support Signal Detection In Early Clinical Development
This presentation is a call to ‘put data before visualization’. We take the example of a popular display used in the pharmaceutical industry to present results of clinical trials in solid tumor oncology. A ‘waterfall plot’ is typically used to show the maximum change in tumor size during the treatment period compared to baseline. Unfortunately, these plots mask critical information about changes in tumor size over time. We show how simple ‘spaghetti’ plots offer a useful alternative to waterfall plots, in order to understand drug effects and support decision-making in early clinical development.
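The contrast described above can be illustrated with a minimal sketch. The patient labels and tumor sizes below are invented for illustration and are not taken from the presentation, and the waterfall summary is assumed here to be the largest shrinkage from baseline. Two hypothetical patients end up with the same waterfall value even though their trajectories over time differ sharply.

```python
# Hypothetical tumor-size measurements (mm) per patient at successive visits.
# The first value is the baseline. All numbers are invented for illustration.
measurements = {
    "patient_A": [50.0, 30.0, 45.0, 60.0],  # shrinks, then regrows past baseline
    "patient_B": [50.0, 45.0, 35.0, 30.0],  # shrinks steadily
}


def percent_change_series(sizes):
    """Percent change from baseline at every visit (the 'spaghetti' view)."""
    baseline = sizes[0]
    return [100.0 * (s - baseline) / baseline for s in sizes]


def best_percent_change(sizes):
    """One number per patient (the 'waterfall' view).

    Assumption: the summary is the largest shrinkage observed after baseline.
    """
    return min(percent_change_series(sizes)[1:])


waterfall = {p: best_percent_change(s) for p, s in measurements.items()}
spaghetti = {p: percent_change_series(s) for p, s in measurements.items()}

for patient in measurements:
    print(patient, "waterfall:", waterfall[patient], "spaghetti:", spaghetti[patient])
```

In this sketch both patients share one waterfall value, while the per-visit series shows that one tumor has regrown beyond its baseline size by the last visit and the other is still shrinking.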
Speaker
Dr. Francois Mercier, Senior Principal Scientist, Roche
Francois Mercier is a biostatistician in oncology clinical development. He holds a Ph.D. in the Analysis and Modeling of Biological Systems and currently works at Roche-Genentech, in Basel. He has published >40 scientific papers and has given >30 public talks about biostatistics and clinical pharmacology in various forums.
Over the past 20+ years, he has had many opportunities to appreciate the impact that “good” data visualization can have in enabling decision-making and problem-solving in drug development. He has also observed the opposite, that is, the disastrous effect that junk charts can have in introducing mistrust and confusion.
This is why, in his daily work, he promotes “unbiased and effective data viz” in early clinical development.
Results of the Ontology Alignment Evaluation Initiative 2021
The Ontology Alignment Evaluation Initiative (OAEI) aims at comparing ontology matching systems on precisely defined test cases. These test cases can be based on ontologies of different levels of complexity and use different evaluation modalities (e.g., blind evaluation, open evaluation, or consensus).
The OAEI 2021 campaign offered 13 tracks and was attended by 21 participants. This paper is an overall presentation of that campaign.
Enhancing Access to Clinical Trial Data for Secondary Use
While the life sciences R&D industry is re-imagining clinical trial design in the age of digitalization, historical clinical trial data remains an important source of evidence that could inform today’s drug discovery and development.
In many pharma companies, it takes considerable time for an internal business function, such as research, to gain access to the company’s historical clinical data assets. Specifically, secondary use of clinical trial data needs approval by the appropriate internal governance function (legal/compliance/medical). Inter alia, this approval is granted upon verifying that the informed consent form (signed by the patient involved in the clinical trial) provides permission for the company to access the data for secondary use.
Furthermore, in a global clinical trial, patients will be recruited from many different nation-states, with their different languages, and each nation-state will have a national competent agency (regulatory agency) which may have a view on the eligibility of the secondary use of the data from that clinical trial.
At the Pistoia Alliance, we are exploring how the FAIRification of such data, along with advanced analytics including AI/ML/NLP-powered systems, can be used to access, share, and reuse patient data from historical clinical trials. All this takes into account any applicable regulations and legislation.
In this talk, our presenters will highlight the key aspects to be considered in our path towards finding a common solution to this common problem.
Featured Topics
- F.A.I.R. and Shared Data, Transforming the Ecosystem to drive Insights and Advanced Analytics
- The principal legal and ethical considerations regarding the repurposing of clinical trial data
- Policy-based data access
Learning Objectives
At the conclusion of this session, participants should be able to:
- Recognize the role of advanced analytics in transforming data sharing
- Consider the pathways afforded to us for the repurposing of (‘the secondary use of’) clinical trial data within the current legal and ethical frameworks for data privacy and confidentiality.
- Evaluate data access through purpose and policy
Speakers
- Benjamin Szilagyi, MSc, VP, Head Insights Data & Experimental Analytics, Roche
- Francis Crawley, Executive Director, GCP Alliance
- Chris Edwards, Solution Architect, Patient Data, AstraZeneca
Data Curation as a Key Element in Successful Data Science Strategy
Data science is becoming a key discipline in pharmaceutical research. A successful Data Science strategy requires high-quality structured, integrated data collected from internal sources (for example bioassays) and external sources (literature, patents, drug labels, etc). Manual curation by domain experts is the key approach that allows companies to get from unstructured data spread across thousands of sources to high-quality datasets in order to gain new insights, make discoveries, and speed up drug development.
Curation at scale is a complex process that involves the development of strict protocols and standards, sophisticated infrastructure, and management of a large number of curators. The complexity of manual curation makes it a costly – but necessary – process, and that is why it is so important to get it right. This webinar will explore the processes, best practices, and pitfalls of manual data curation.
Data Science is the future: Join this webinar to learn about how to get there with high-quality, manual curation.
Featured Topics
- Key learnings about manual curation best practices from the leader in manual curation based on 20 years of experience in curating biological knowledge and data.
- FAIRification of Clinical Trial Data at Roche
Learning Objective
At the conclusion of this session, participants should be able to:
- Employ key strategies and avoid pitfalls when curating data to generate high-quality structured knowledge and datasets.
Speaker Bios
Frank Schacherer, PhD, VP Products and Solutions, QIAGEN Digital Insights, QIAGEN GmbH
Dr. Frank Schacherer leads QIAGEN’s program for information systems, knowledge content and machine learning in discovery research. He joined QIAGEN in 2014 with the acquisition of BIOBASE where he served as a managing director. Dr. Schacherer has more than 20 years of experience in management, software, and database development. He holds a Ph.D. in bioinformatics. His current interest is in translating the promise of data science and AI into useful solutions for understanding biological systems.
Rama Balakrishnan, PhD, Biomedical Ontology Specialist, Genentech
Rama received her Ph.D. in Biophysics from SUNY Buffalo (NY) and was a post-doctoral researcher in the Biochemistry Department at Stanford University (CA). She then moved to managing genomics databases and developing ontologies for biomedical domains, also at Stanford. She continues to contribute to data curation and ontology development at Genentech/Roche.
Joshua Bernal, Data Curator, Genentech
Josh studied Biology at UC Berkeley and moved into Data Management shortly after. He has 15 years of combined CRO, Vendor, and Pharma Data Management and Data Curation experience.
HELM Wiki
Home of the HELM notation – representing complex biomolecules
Unified Data Model (UDM)
Collaborative Observational Health Research Using OHDSI Methods
This webinar will present clinical data standardization and harmonization using the Observational Medical Outcomes Partnership (OMOP) Common Data Model (CDM), standard vocabularies, and standardized analytics. It will include contributions to the European Health Data & Evidence Network (EHDEN), an Innovative Medicines Initiative (IMI 2) consortium with 22 partners operating in Europe.
Speaker
Maxim Moinat is a Data Engineer specializing in Medical Informatics at The Hyve in Utrecht, NL.
The Architecture of FAIR Data Platforms: COLID and EDISON at Bayer and Roche
Bayer and Roche are leading biopharmaceutical companies, each with a diverse and distributed ecosystem of platforms for managing the data and metadata used by different parts of the organization.
Corporate Linked Data Made FAIR (COLID) was developed at Bayer as an open-source technical solution for corporate environments that provides a FAIR metadata repository for corporate assets based upon semantic models. COLID assigns URIs as persistent and globally unique identifiers to any resource. The incorporated network proxy ensures that these URIs are resolvable and can be used to directly access those assets. The data model of COLID uses RDF and provides content through a SPARQL endpoint to consumers. COLID is both a management system for resolvable identifiers and an asset catalog. It is the core service to realize Linked Data in corporate environments and therefore an essential cornerstone for FAIR data management at Bayer.
The EDISON platform at Roche enables prospective FAIRification of data at the point of entry to the company, by harmonizing, automating, and integrating very heterogeneous and complex processes across multiple departments, building in data standards and quality checks at every step of the process. The EDISON platform is built as an ecosystem of self-contained microservices to ensure maximum performance, scalability, and low maintenance. The current scope of EDISON is clinical non-CRF data, but the platform is scalable and flexible to cover a large variety of data models, both clinical and non-clinical.
This webinar will present the technical details for each of these FAIR data platforms. They each enable seamless access across their respective corporate data ecosystems. They both exploit machine-readable, FAIR Knowledge Graphs to allow for accessing and combining multiple and disparate reference data systems which serve non-experts with intuitive and user-friendly ways of finding and exploring FAIR data. The two webinar presentations will be followed by a Q&A panel discussion.
Speakers
- Goekhan Coskun, Principal Information Strategist, Bayer, Germany
- Holmfridur Thorsteinsdottir, Head of Clinical & Biomarker Informatics, Roche, Switzerland
For more information about the Pistoia Alliance’s FAIR Implementation project, please contact us.