Delivering Data Driven Value

Unified Product Data Strategy

The presentation introduces Accurids’ Unified Product Data Strategy, a comprehensive approach to help pharmaceutical companies comply with the European Medicines Agency’s new IDMP and PMS structured data mandates. It highlights the regulatory urgency, where non-compliance could mean loss of marketing authorizations, and outlines how Accurids’ IDMP Data Standardization Fabric enables automation, interoperability, and data quality monitoring across the entire product lifecycle. By transforming EMA data into a unified knowledge graph, enabling automated quality checks, and aligning internal and external data, the solution reduces manual effort, enhances efficiency, and ensures audit readiness. With phased implementation, measurable ROI, and direct integration into regulatory systems, the strategy empowers pharma companies to achieve faster, safer, and more reliable submissions while supporting patient safety and innovation.

The Unified Product Data Strategy

Using the ACCURIDS IDMP Data Standardization Fabric to Master Your Product Data Lifecycle for Faster, Streamlined Submissions

What if you could master the quality of both internal and external product data (such as data from the EMA) and turn the entire IDMP mandate into a commercial advantage? Discover how the ACCURIDS IDMP Data Standardization Fabric creates a unified data strategy: the foundation for streamlining submissions and enabling a smoother product lifecycle.

Agenda

The Medicinal Product Data Governance Gap: How fragmented data can create significant risks and delays in submissions and slow your time to peak sales
Introducing the ACCURIDS IDMP Data Fabric: A strategic way to unify product data with smart connectors and built-in quality checks
The Path to Production: Our proven, phased approach – from a 100-day pilot to a full-scale Enterprise Data Registry in just over a year
The Business Case & ROI: A practical look at the return on investment, based on reducing manual data reconciliation efforts, using EMA PMS Data Alignment as an example

Speakers

Arne Balzer, Life Science Expert, Accurids
Heiko Waldmüller, Senior Consultant: Pharma & Digital Health Solutions, Accurids

Evolving Challenges in Chemical Interoperability – Rob Owen

Rob Owen’s presentation explores Pfizer’s journey in managing chemical data formats, beginning with the transition from V2000 to V3000 and the interoperability challenges this created, particularly around stereochemistry, reactions, and degeneracy in molecular representations. It details the company’s strategic decision to maintain compatibility with both formats, the reliance on ChemDraw interpretations, and the move toward CXSMILES to address the “lossy” nature of SMILES and InChI. It also emphasizes the shift toward web-based, text-friendly formats like CDXML, the complexities of copy-paste and reaction handling, and the need for consistent rendering across multiple toolkits. Owen advocates for open standards, broader vendor support, and focusing on functionality rather than file-format lock-in, all while acknowledging the evolving role of SaaS solutions and the importance of enabling chemists with flexible, interoperable tools.

Interoperability in Cheminformatics – Gerd Blanke

The challenges of data exchange in a FAIR world

In his presentation, Gerd Blanke of StructurePendium Technologies GmbH addresses the persistent challenge of interoperability in cheminformatics, focusing on the critical role of reliable chemical exchange formats for structures and reactions. He explains that existing formats are often lossy, leading to data quality issues, higher costs, and unpredictable workloads—especially during database mergers where discrepancies and legacy structures emerge. Poorly implemented formats undermine FAIR principles and hinder AI readiness. Blanke calls for regular cross-industry dialogue among data engineers, vendors, and format owners to share experiences, identify limitations, and agree on standardized improvements, positioning the Pistoia Alliance as an ideal forum to coordinate these efforts.

Interoperability in Cheminformatics – Susan Leung

The challenges of data exchange in a FAIR world

In this presentation, Susan Leung from AstraZeneca examines the interoperability challenges in cheminformatics, especially in multi-vendor DMTA (Design–Make–Test–Analyse) ecosystems where diverse tools, formats, and modalities must exchange data. She highlights that current data exchange formats (e.g., SMILES, molfile, CDXML, HELM) can be lossy, inconsistent, and subject to conflicting standards, creating problems in representation, search, and identity. Case studies illustrate issues with stereochemistry encoding, biopolymer representation, and toolkit incompatibilities, particularly when multiple standards or format extensions are in play. Leung emphasizes the need for better education, transparent communication, and systematic feedback processes, proposing that whether improving existing formats or creating new ones, the guiding principles must be clarity, documentation, and collaboration.

Challenges in Cheminformatics: The View of An Independent Consultant

In this presentation, independent consultant Thomas Doerner outlines five major challenges in cheminformatics from his experience working with large pharmaceutical and chemical companies: unFAIR chemical data (born in ELNs without early standardization), inconsistent representation of complex compounds (e.g., organometallics, polymers, nanomaterials), limitations of traditional chemical graphs for real-life substances that require additional contextual data, hesitancy and technical gaps in adopting open-source cheminformatics tools, and the need to integrate cheminformatics into “non-classic” environments like cloud-native platforms and corporate data lakes. He stresses that these issues hinder data findability, interoperability, and reuse, and calls for the Pistoia Alliance community to agree on priority challenges, understand the business value of solving them, and form representative working groups to develop solutions collaboratively.

How to Keep Linear Compute Scaling with Ever-Growing Data?

This presentation by Ramil Nugmanov addresses the challenge of maintaining linear compute scaling when working with ever-growing datasets in AI-assisted drug discovery, particularly for DNA-encoded libraries (DELs) containing billions of molecules. By breaking combinatorial libraries into fragments and using SMILES concatenation with placeholder atoms, the method avoids storing every entity, reducing memory use from gigabytes to ~10 MB and compute time to minutes. Search efficiency is improved by fragmenting queries, reconstructing fingerprints on the fly, and using bitwise operations to calculate Tanimoto similarity without full molecule reconstruction. The approach enables rapid similarity search, efficient CPU cache usage, and parallelization, while avoiding deep neural architectures that are inefficient for such data. The key message: don’t apply standard solutions blindly—design fragment-based, resource-efficient methods for massive chemical spaces.
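The bitwise part of this idea can be illustrated with a minimal sketch. It is not the speaker's implementation: it assumes fingerprints are stored as Python integers used as bit vectors, and that a product's fingerprint can be approximated by OR-ing the fingerprints of its building blocks (which ignores any bits that would arise from the bonds formed between fragments). The names `tanimoto`, `search_library`, and `positions` are hypothetical.

```python
from itertools import product


def popcount(x: int) -> int:
    """Number of set bits in a non-negative integer."""
    return bin(x).count("1")


def tanimoto(a: int, b: int) -> float:
    """Tanimoto similarity of two bit-vector fingerprints held as ints."""
    union = popcount(a | b)
    if union == 0:
        return 0.0  # two empty fingerprints: define similarity as 0
    return popcount(a & b) / union


def search_library(query_fp, positions, threshold):
    """Score every building-block combination without storing products.

    positions: one list of fragment fingerprints per library position.
    Assumption: the product fingerprint is the bitwise OR of the chosen
    fragment fingerprints.
    Returns (index tuple, score) pairs with score >= threshold.
    """
    hits = []
    for combo in product(*(range(len(frags)) for frags in positions)):
        fp = 0
        for frags, idx in zip(positions, combo):
            fp |= frags[idx]  # rebuild the fingerprint on the fly
        score = tanimoto(query_fp, fp)
        if score >= threshold:
            hits.append((combo, score))
    return hits


# Toy two-position library with 4-bit fingerprints (illustrative values only)
toy_positions = [[0b0011, 0b0100], [0b1000, 0b0001]]
toy_hits = search_library(0b1011, toy_positions, threshold=0.6)
```

Only the per-position fragment fingerprints are kept in memory; each combination's fingerprint exists briefly while it is scored, which is the source of the memory saving in this sketch.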

IDMP Ontology July 2025 Community of Interest Meeting

Join us for another IDMP Ontology webinar where we discuss important achievements of our IDMP-O project. These sessions are designed to improve data alignment and interoperability across the pharmaceutical industry.

Agenda

Introduction
Status and Progress of Phase 4 use cases
- Batch Tracking
- Regulatory Data Alignment: EMA PMS and other jurisdictions
Recent Progress of IDMP Ontology
Introduction to Sustainability Model of IDMP-O
Upcoming Events

Speakers

Aditya Tyagi, Pistoia Alliance
Fabian Muttach, Boehringer Ingelheim
Raphael Sergent, Accurids
Elisa Kendall, EDMC
Toby Broom, CrownPoint Technologies
Cameron Gibbs, CrownPoint Technologies

Pharma General Ontology (PGO) Phase 1

Data interoperability is a key enabler for accelerating life science workflows, yet many concrete hurdles exist. For example, organizations use different definitions for similar core concepts, requiring a significant amount of mapping across data sets, even within a given company. The first objective of the Pharma General Ontology (PGO) project is to define a set of agreed-upon core entities, the PGO “core concepts”, and recommendations for associating controlled terminologies to support data exchange among Pharmaceutical Industry stakeholders. By reusing community-agreed terms and definitions, PGO aims to help organizations enhance data interoperability, integration, discovery, and reuse.

The webinar will address:

PGO vision
Deliverables Phase 1 & Next Steps
Q&A Session with the participants

Speakers

Philippe Rocca-Serra / AstraZeneca / Senior Director – FAIR Collaboration – Data Office
Martin Romacker / Roche / Product Manager Roche Data Marketplace (RDM)
Markus Hartmann / Merck Group / Global Product Lead Data Semantics
Birgit Meldal / Pfizer / Senior Manager, Enterprise Data Standards and Ontologies
Peter McQuilton / GSK / Senior Product Owner – Reference Data, Information & Data Architecture and Ontologies
Elena Businaro / Chiesi Group / R&D and PPM Digital Strategy and Execution Support Head
Joshua Daniel Valdez / Novo Nordisk / Head of Ontology and Semantic Engineering

Moderation: Giovanni Nisato / Pistoia Alliance / Project Manager PGO

FAIR 2024 Business Survey Report

Insights and Recommendations

The FAIR 2024 Business Survey Report presents key insights and recommendations on how pharmaceutical and life science companies are applying FAIR Data Principles across their organizations. Based on input from survey respondents and industry experts, the report highlights emerging business value, challenges, and the growing need for clear ROI frameworks to support FAIR’s strategic impact.

CMC Process Ontology Community of Interest May 2025

The Pistoia Alliance is building a pharmaceutical CMC process ontology based on the ISA88/95 framework.

Our aim is to standardize laboratory and plant production process recipes: to establish shared definitions, facilitate digital technology transfers, and integrate with execution systems that capture structured process data. This supports material lot genealogy tracking, streamlined technology transfers, and advanced process analytics, enhancing efficiency and transparency throughout the pharmaceutical production lifecycle.

Having shown the value of a semantic approach to CMC process management during the initial PoC phase, we have now moved on to the next phase of the work: going beyond the PoC to a usable ontology.

For more information please visit our dedicated project page: CMC Ontology.

AI-Ready Data and Why FAIR Data Matters in Life Science Companies

As life-science organizations race to adopt (generative) AI, one point begins to stand out: your AI is only as good as your data. While large language models (LLMs) offer powerful capabilities, they are not tailored to specialized scientific data and need a solid data foundation. Making data Findable, Accessible, Interoperable, and Reusable (FAIR) enables AI systems to deliver more accurate, reliable, and cost-effective outcomes.

Key points include:

Why many AI projects are still fundamentally reliant on robust data management
How FAIR Data complements LLMs through explicit semantics and structure
The critical role of data quality and governance in AI success

Whether you’re a data steward, scientist, or innovation leader, this session will help you gain perspective on aligning your data and AI strategies for maximum impact. Join us on May 21 to explore why a durable AI strategy needs a robust data strategy that includes the FAIR Principles. Don’t let unstructured data hold your AI back: make it FAIR.

Speakers

Angelika Fuchs, Roche, Chapter Lead, Data Products & Platforms
Martin Robbins, Ontoforce, Product Manager
Tom Plasterer, XponentL Data, Managing Director, Knowledge Graph & FAIR Data Capability
Ted Slater, EPAM, Managing Principal, Scientific Informatics Consulting

Hosted by Giovanni Nisato, Project Manager, Pistoia Alliance
