ChEMBL 20 takes the HELM

ChEMBL 20 incorporates the Pistoia Alliance’s HELM annotation.

3rd February 2015 – The European Bioinformatics Institute (EMBL-EBI) has released version 20 of ChEMBL, the database of compound bioactivity data and drug targets. ChEMBL now incorporates the Hierarchical Editing Language for Macromolecules (HELM), the macromolecular representation standard recently released by the Pistoia Alliance.

HELM can be used to represent simple macromolecules (e.g. oligonucleotides, peptides and antibodies), complex entities (e.g. those with unnatural amino acids) or conjugated species (e.g. antibody-drug conjugates). Including the HELM notation for ChEMBL’s peptide-derived drugs and compounds will, in future, enable researchers to query that content in new ways, for example in sequence- and chemistry-based searches.

Initially created at Pfizer, HELM was released as an open standard with an accompanying toolkit through a Pistoia Alliance initiative, funded and supported by its member organisations. EMBL-EBI joins the growing list of HELM adopters and contributors, which includes Biovia, ACD Labs, Arxspan, Biomax, BMS, ChemAxon, eMolecules, GSK, Lundbeck, Merck, NextMove, Novartis, Pfizer, Roche, and Scilligence. All of these organisations have either built HELM-based infrastructure, enabled HELM import/export in their tools, initiated projects for the incorporation of HELM into their workflows, published content in HELM format, or supplied funding or in-kind contributions to the HELM project.

“Bridging the biological and chemical worlds is becoming more important as drug structures become more sophisticated,” says John Overington, creator of ChEMBL and head of Computational Chemical Biology at EMBL-EBI. “We’re pleased to use HELM as a way to represent the many bioactive peptides in ChEMBL.”

Sergio Rotstein, Director of Research Business Technology at Pfizer, adds: “By being the first major content provider to include HELM notation for biomolecules in their product, EMBL-EBI continues to set itself apart as a cutting-edge player in bioinformatics. The computational representation of complex macromolecules with a level of rigor previously afforded only to small molecules is a fundamental enabler for biologics research. This release will hopefully pave the way for EMBL-EBI and indeed other content providers to include HELM for the ever-growing set of interesting biological entities in their products.”

According to Overington, the latest release of ChEMBL also offers an extensive set of structural alerts that can be used to identify features of compounds that may be undesirable in a drug-discovery setting, and mechanism-of-action classifications for known herbicides, fungicides and insecticides. In addition, new bioactivity data includes 12 new sets of screening results from the MMV Malaria Box and a large set of in vitro DMPK and physicochemical property data for more than 5,700 publicly disclosed drugs and compounds, deposited by AstraZeneca.

About the Pistoia Alliance

The Pistoia Alliance is a global, not-for-profit alliance of life science companies, vendors, publishers, and academic groups that work together to lower barriers to innovation in R&D. Our projects transform R&D innovation through pre-competitive collaboration. We bring together the key constituents to identify the root causes that lead to R&D inefficiencies. We develop best practices and technology pilots to overcome common obstacles. Our members collaborate as equals on open projects that generate significant value for the worldwide life sciences community.


About EMBL-EBI

The European Bioinformatics Institute (EMBL-EBI) is a global leader in the storage, analysis and dissemination of large biological datasets. By sharing our expertise and through collaboration, we help researchers realise the potential of ‘big data’, enhancing their ability to exploit complex information to make discoveries that benefit mankind. We are part of the European Molecular Biology Laboratory (EMBL), a non-profit, intergovernmental organisation funded by 21 member states and two associate member states. We are located on the Wellcome Genome Campus in Hinxton, Cambridge in the United Kingdom.
