Delivering Data Driven Value

Pharmaceutical CMC Process Ontology CoI Meeting – December 2025

The Pistoia Alliance is building an advanced semantic architecture to create a Pharmaceutical CMC Process Ontology and a shared lexicon and taxonomy that extend the ISA88 framework. The work aims to standardize laboratory and plant production process recipes, enable seamless digital technology transfers, and improve process data integration for genealogy tracking and advanced analytics. The upcoming Community of Interest meeting will showcase progress to date and outline how you can get involved as we define the scope for Phase 3 of this project.

Speakers

ZS Associates and the CMC Process Ontology Steering Committee

Data Driven Innovation Session Executive Summary

Part of the Pistoia Alliance’s 2025 Fall Conference, this session explored how FAIR data, ontology standardization, and AI-ready data foundations are becoming essential to accelerating innovation across life sciences. Speakers from AstraZeneca, Amino Data, Pfizer, CAS, and Genentech showcased large-scale efforts to harmonize assays, implement AI-enabled data management, quantify the business value of FAIR, and unify key scientific definitions through the Pharma General Ontology. Across presentations and panel discussions, a clear theme emerged: AI success depends on trusted, governed, interoperable data, supported by pre-competitive collaboration, measurable ROI, and hybrid human–AI workflows. The session emphasized that FAIR is now a strategic prerequisite—ensuring data is findable, reliable, and ready for AI-driven science.

IDMP Ontology Community of Interest Meeting December 2025

Collaborative Implementation of IDMP Standards

Bringing Stakeholders to the Table

Join us for a special IDMP-O Community of Interest webinar. We’ll kick off with a brief recap of the great progress achieved by the IDMP-O collaboration over the last 4 years, followed by an interactive panel discussion on why a strong IDMP-O community is essential to accelerate IDMP adoption across the pharma industry. We are delighted to welcome EMA representatives to the Community of Interest and look forward to an insightful conversation with industry leaders driving IDMP implementation. Together we’ll explore the future of this community-driven collaboration and its impact on global standards.

Agenda
  • Introduction
  • IDMP-O Status Overview – A short recap of the last 4 years and current status
  • Panel discussion on the importance of building an IDMP-O community
  • Wrap-up and next steps
Meeting Host
  • Aditya Tyagi, IDMP-O Project Manager at Pistoia Alliance
Panelists
  • Sheila Elz, Senior Regulatory Information Manager at Boehringer Ingelheim
  • Isabel Chícharo, Head of Regulatory Data Management at European Medicines Agency
  • Gang Xue, Senior Director at Johnson & Johnson
  • Frits Stulp, Partner Life Sciences at Implement Consulting Group
  • Heiner Oberkampf, CEO & Co-Founder at ACCURIDS

FAIR Business Value Framework: An Introduction

As data becomes a defining asset in life sciences, organizations are shifting from viewing FAIR data principles (Findable, Accessible, Interoperable, and Reusable) as a compliance or IT objective to recognizing them as a strategic enabler of business value. Yet, despite growing adoption, a structured understanding of FAIR’s financial and operational return remains challenging.

This webinar presents the latest outcomes of the Pistoia Alliance’s FAIR working group to qualify and quantify the business impact of FAIR. Drawing on a collaborative review of case studies, workshops, surveys, and executive interviews, the resulting FAIR Business Value Report reveals how FAIR maturity drives measurable outcomes in productivity, innovation, and data-driven decision-making.

The session will introduce a qualitative framework that structures FAIR’s business drivers across three levels: strategic drivers, benefit areas, and quantifiable metrics. The webinar will also provide a first look at the forthcoming FAIR Value Calculator, a practical tool to help organizations model and benchmark the ROI of their FAIR investments.

Unified Product Data Strategy

The presentation introduces Accurids’ Unified Product Data Strategy, a comprehensive approach to help pharmaceutical companies comply with the European Medicines Agency’s new IDMP and PMS structured data mandates. It highlights the regulatory urgency, where non-compliance could mean loss of marketing authorizations, and outlines how Accurids’ IDMP Data Standardization Fabric enables automation, interoperability, and data quality monitoring across the entire product lifecycle. By transforming EMA data into a unified knowledge graph, enabling automated quality checks, and aligning internal and external data, the solution reduces manual effort, enhances efficiency, and ensures audit readiness. With phased implementation, measurable ROI, and direct integration into regulatory systems, the strategy empowers pharma companies to achieve faster, safer, and more reliable submissions while supporting patient safety and innovation.

The Unified Product Data Strategy

Using the ACCURIDS IDMP Data Standardization Fabric to Master Your Product Data Lifecycle for Faster, Streamlined Submissions

What if you could master the quality of both internal and external product data (such as data from the EMA) and turn the entire IDMP mandate into a commercial advantage? Discover how the ACCURIDS IDMP Data Standardization Fabric creates a unified data strategy: the foundation for streamlining submissions and enabling a smoother product lifecycle.

Agenda
  • The Medicinal Product Data Governance Gap: How fragmented data can create significant risks and delays in submissions and slow your time to peak sales
  • Introducing the ACCURIDS IDMP Data Fabric: A strategic way to unify product data with smart connectors and built-in quality checks
  • The Path to Production: Our proven, phased approach – from a 100-day pilot to a full-scale Enterprise Data Registry in just over a year
  • The Business Case & ROI: A practical look at the return on investment from reducing manual data reconciliation effort, using EMA PMS data alignment as the example
Speakers
  • Arne Balzer, Life Science Expert, Accurids
  • Heiko Waldmüller, Senior Consultant: Pharma & Digital Health Solutions, Accurids

Evolving Challenges in Chemical Interoperability – Rob Owen

Rob Owen’s presentation explores Pfizer’s journey in managing chemical data formats, beginning with the transition from the V2000 to the V3000 molfile format and the interoperability challenges this created, particularly around stereochemistry, reactions, and degeneracy in molecular representations. It details the company’s strategic decision to maintain compatibility with both formats, the reliance on ChemDraw interpretations, and the move toward CXSMILES to address the “lossy” nature of SMILES and InChI. It also emphasizes the shift toward web-based, text-friendly formats like CDXML, the complexities of copy-paste and reaction handling, and the need for consistent rendering across multiple toolkits. Owen advocates for open standards, broader vendor support, and a focus on functionality rather than file-format lock-in, while acknowledging the evolving role of SaaS solutions and the importance of enabling chemists with flexible, interoperable tools.
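
For readers unfamiliar with the two versions: V2000 and V3000 are versions of the CTfile (molfile) format, and the version tag sits at the end of the counts line, which is the fourth line of the file. The Python sketch below is an illustration only and is not taken from the presentation. The function name `molfile_version` and the two methane examples are hypothetical. It shows the kind of check a pipeline that supports both versions might run before choosing a parser.

```python
def molfile_version(molblock: str) -> str:
    """Return 'V2000' or 'V3000' based on the counts line of a molfile.

    Hypothetical helper for illustration; it does not validate the rest
    of the file.
    """
    lines = molblock.splitlines()
    if len(lines) < 4:
        raise ValueError("expected a 3-line header followed by a counts line")
    # The counts line is line 4; the version tag is its last token.
    tokens = lines[3].split()
    tag = tokens[-1] if tokens else ""
    if tag not in ("V2000", "V3000"):
        raise ValueError(f"no recognised version tag in counts line: {lines[3]!r}")
    return tag


# Two invented single-atom examples (methane), one per version.
V2000_EXAMPLE = "\n".join([
    "methane",
    "  example",
    "",
    "  1  0  0  0  0  0  0  0  0  0999 V2000",
    "    0.0000    0.0000    0.0000 C   0  0  0  0  0  0  0  0  0  0  0  0",
    "M  END",
])

V3000_EXAMPLE = "\n".join([
    "methane",
    "  example",
    "",
    "  0  0  0     0  0            999 V3000",
    "M  V30 BEGIN CTAB",
    "M  V30 COUNTS 1 0 0 0 0",
    "M  V30 BEGIN ATOM",
    "M  V30 1 C 0 0 0 0",
    "M  V30 END ATOM",
    "M  V30 END CTAB",
    "M  END",
])

if __name__ == "__main__":
    print(molfile_version(V2000_EXAMPLE))
    print(molfile_version(V3000_EXAMPLE))
```

Detecting the version is the easy part. The interoperability problems the talk describes come from how toolkits interpret what each version can express, which a check like this does not address.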

Interoperability in Cheminformatics – Gerd Blanke

The challenges of data exchange in a FAIR world

In his presentation, Gerd Blanke of StructurePendium Technologies GmbH addresses the persistent challenge of interoperability in cheminformatics, focusing on the critical role of reliable chemical exchange formats for structures and reactions. He explains that existing formats are often lossy, leading to data quality issues, higher costs, and unpredictable workloads—especially during database mergers where discrepancies and legacy structures emerge. Poorly implemented formats undermine FAIR principles and hinder AI readiness. Blanke calls for regular cross-industry dialogue among data engineers, vendors, and format owners to share experiences, identify limitations, and agree on standardized improvements, positioning the Pistoia Alliance as an ideal forum to coordinate these efforts.

Interoperability in Cheminformatics – Susan Leung

The challenges of data exchange in a FAIR world

In this presentation, Susan Leung from AstraZeneca examines the interoperability challenges in cheminformatics, especially in multi-vendor DMTA (Design–Make–Test–Analyse) ecosystems where diverse tools, formats, and modalities must exchange data. She highlights that current data exchange formats (e.g., SMILES, molfile, CDXML, HELM) can be lossy, inconsistent, and subject to conflicting standards, creating problems in representation, search, and identity. Case studies illustrate issues with stereochemistry encoding, biopolymer representation, and toolkit incompatibilities, particularly when multiple standards or format extensions are in play. Leung emphasizes the need for better education, transparent communication, and systematic feedback processes, proposing that whether improving existing formats or creating new ones, the guiding principles must be clarity, documentation, and collaboration.

Challenges in Cheminformatics: The View of an Independent Consultant

In this presentation, independent consultant Thomas Doerner outlines five major challenges in cheminformatics from his experience working with large pharmaceutical and chemical companies: unFAIR chemical data (born in ELNs without early standardization), inconsistent representation of complex compounds (e.g., organometallics, polymers, nanomaterials), limitations of traditional chemical graphs for real-life substances that require additional contextual data, hesitancy and technical gaps in adopting open-source cheminformatics tools, and the need to integrate cheminformatics into “non-classic” environments like cloud-native platforms and corporate data lakes. He stresses that these issues hinder data findability, interoperability, and reuse, and calls for the Pistoia Alliance community to agree on priority challenges, understand the business value of solving them, and form representative working groups to develop solutions collaboratively.

How to Keep Linear Compute Scaling with Ever-Growing Data?

This presentation by Ramil Nugmanov addresses the challenge of maintaining linear compute scaling when working with ever-growing datasets in AI-assisted drug discovery, particularly for DNA-encoded libraries (DELs) containing billions of molecules. By breaking combinatorial libraries into fragments and using SMILES concatenation with placeholder atoms, the method avoids storing every entity, reducing memory use from gigabytes to ~10 MB and compute time to minutes. Search efficiency is improved by fragmenting queries, reconstructing fingerprints on the fly, and using bitwise operations to calculate Tanimoto similarity without full molecule reconstruction. The approach enables rapid similarity search, efficient CPU cache usage, and parallelization, while avoiding deep neural architectures that are inefficient for such data. The key message: don’t apply standard solutions blindly—design fragment-based, resource-efficient methods for massive chemical spaces.
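
The bitwise similarity step described above can be sketched in plain Python. This is a minimal illustration, not the presenter's implementation. The fragment names and 8-bit fingerprints are invented. The sketch also assumes that a product's fingerprint can be approximated by OR-ing the fingerprints of its fragments, which ignores any bits that would arise at the bond joining them. It does not cover the SMILES concatenation side of the method.

```python
from itertools import product


def popcount(x: int) -> int:
    """Number of set bits in a non-negative integer."""
    return bin(x).count("1")


def tanimoto(a: int, b: int) -> float:
    """Tanimoto similarity of two bit-vector fingerprints stored as ints."""
    union = popcount(a | b)
    if union == 0:
        return 0.0  # convention chosen here for two empty fingerprints
    return popcount(a & b) / union


# Hypothetical two-position combinatorial library. Only the fragment
# fingerprints are stored: N + M entries instead of N * M products.
POSITION_A = {"A1": 0b0000_1111, "A2": 0b0011_0011}
POSITION_B = {"B1": 0b0101_0000, "B2": 0b1100_0000}


def search(query: int, threshold: float):
    """Return (fragment_a, fragment_b, similarity) for products at or above threshold.

    Product fingerprints are rebuilt on the fly as the OR of the fragment
    fingerprints (a simplifying assumption) and are never stored.
    """
    hits = []
    for (name_a, fp_a), (name_b, fp_b) in product(
        POSITION_A.items(), POSITION_B.items()
    ):
        similarity = tanimoto(query, fp_a | fp_b)
        if similarity >= threshold:
            hits.append((name_a, name_b, similarity))
    # Most similar products first.
    return sorted(hits, key=lambda hit: hit[2], reverse=True)


if __name__ == "__main__":
    for name_a, name_b, similarity in search(0b1100_1111, threshold=0.7):
        print(name_a, name_b, round(similarity, 3))
```

The design point is that the library of N × M products is represented by N + M stored fingerprints, and each comparison is a handful of integer operations. A production system would use fixed-width arrays and hardware popcount rather than Python integers.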

IDMP Ontology July 2025 Community of Interest Meeting

Join us for another IDMP Ontology webinar where we discuss important achievements of our IDMP-O project. These sessions are designed to improve data alignment and interoperability across the pharmaceutical industry.

Agenda
  • Introduction
  • Status and Progress of Phase 4 use cases
    • Batch Tracking
    • Regulatory Data Alignment: EMA PMS and other jurisdictions
  • Recent Progress of IDMP Ontology
  • Introduction to Sustainability Model of IDMP-O
  • Upcoming Events
Speakers
  • Aditya Tyagi, Pistoia Alliance
  • Fabian Muttach, Boehringer Ingelheim
  • Raphael Sergent, Accurids
  • Elisa Kendall, EDMC
  • Toby Broom, CrownPoint Technologies
  • Cameron Gibbs, CrownPoint Technologies