It is easy to be preoccupied with the day-to-day tasks of a project, particularly when development work is progressing at full pace. However, it is interesting to look more widely and consider the impact our work has had on the scientific community.
HELM was released in 2013 with a single user, Pfizer (who invented it), which was shortly followed by ChemAxon and a steady stream of organisations representing a wide section of the informatics community. We have also gained recognition from regulators, who endorsed HELM as an acceptable format in ISO 11238.
The list of HELM users is now very healthy, and we appreciate our enthusiastic and engaged community. Here are some of the groups who are using HELM.
Many of the HELM users will be discussing their implementations at the upcoming CINF symposium at the ACS Boston meeting, 19th-23rd August. This will be an excellent opportunity to find out the latest information from those working in the field.
Novartis makes extensive use of HELM for nucleotide registration and analysis. The open-source HELM tools are integrated with the internal informatics landscape.
Yohann Potier said, “HELM allows Novartis to accurately describe its chemically-modified constructs using an industry standard for registration.”
As the originators of the HELM standard, Pfizer has based their entire macromolecular registration infrastructure on HELM and its associated biomolecule toolkit.
Sergio Rotstein said, “While the enablement of biomolecular registration was already of great value to Pfizer, the establishment of HELM as an industry standard provided even greater value by facilitating cross-company interoperability and biomolecular data exchange, a very desirable outcome in our increasingly collaborative industry.”
Starting with HELM, Roche developed the HELM Antibody Editor (HAbE), in particular to enable convenient handling of complex antibodies in innovative formats for analysis, visualization, manipulation and registration.
Most recently, Roche implemented HELM2 to describe, register and manage therapeutic oligonucleotides and their derivatives. This was facilitated by the improved monomer handling and support for ambiguous nucleotides in the HELM2 toolkit.
Merck has been slowly adopting the HELM notation across our Discovery Chemistry Modalities organization, focusing first on simple linear peptides and oligonucleotides. Using the Pistoia HELM editor for creation, editing and registration of monomers and chemical modifiers, our Modalities chemists can now work confidently with their monomers across multiple environments, including our biopolymer registration system, our BioviaDraw platform and our tools within Insight for Excel. In 2018 we anticipate incorporating complex macrocyclic biopolymers, peptide metabolite identification support and antibody-peptide conjugates into the HELM-supported workflows. All of this is facilitated by easy-to-use tools that leverage HELM notation as a foundation.
Internal registration systems and tools are all based on HELM.
We are grateful to all funders, including the above plus GSK and BMS.
Scientific software providers
Applications use Biovia’s proprietary SCSR (Self-Contained Sequence Representation, an extension of the V3000 molfile) format, but there is extensive ability to import, export and convert to and from HELM. Pipeline Pilot Chemistry Collection contains importers and exporters, HELM readers and writers including XHELM, and components to interconvert between macromolecules represented by HELM, full chemistry and SCSR. HELM support is available in Insight, the Draw and Pipette sketchers, biological registration and the chemistry cartridge.
Biomolecule Toolkit and the macromolecule sketcher BioEddie natively support the HELM standard. The tools provide capabilities for managing a centralized monomer library, registering and performing uniqueness checks of macromolecules, generating HELM notation from small molecule representations and sequences, and representing modalities with partially or fully unknown chemical structures.
Roland Knispel, Project Lead for Biologics Informatics at ChemAxon, said, “Our HELM-based tools are helping our customers to manage chemically modified sequence-based modalities. A single environment for various types of modalities, improved data quality and utilizing an industry-wide standard for data exchange are the key benefits reported back to us by our users. By market demand our platform is now being integrated into solutions provided by IDBS and other partners.”
Dotmatics has adopted the Pistoia Alliance’s HELM notation as part of its biologics discovery suite. Dotmatics Bioregister reads and writes sequence entities in HELM format, allowing users to work with these entities within the Dotmatics Suite and also to exchange HELM-format data with other HELM-compliant systems. Additionally, we are currently implementing support for HELM in Dotmatics’ analysis and visualization application, Vortex, allowing advanced analytical techniques to be applied directly to HELM-represented entities. These capabilities are available to current Vortex customers in the daily stable builds accessible from the Dotmatics Support website.
IDBS leverages ChemAxon’s Biomolecule Toolkit and BioEddie in its E-WorkBook suite and therefore includes HELM support.
Paul Gouldson, Vice President Strategic Solutions said, “IDBS has been supporting open standards in EWB since its inception. We use and develop integrations to open source tools and have supported examples with AniML, HELM, ADF, SVG and HTML tools.”
NextMove Software:
The HELM format is supported by Sugar&Splice both for reading and writing peptides and nucleic acids, thus enabling conversion of all-atom structures from SMILES (for example) through HELM and back to SMILES. This support includes inline HELM (allowing structural data to be roundtripped even when monomers are missing from the HELM database), xHELM and partial support for HELM 2.0 ambiguity codes. The NCBI uses Sugar&Splice to generate HELM strings for all biopolymer entries in PubChem.
RDKit includes the ability to convert DNA, RNA and peptides to and from HELM and a large number of other notations, including FASTA, PDBBlock and standard sequence notation.
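To make the idea of converting between sequence notation and HELM concrete, here is a minimal sketch in plain Python. It covers only the simplest case: a single linear peptide built from the 20 natural amino acids, with no connections, groups or annotations, written in HELM1-style notation. The function names `peptide_to_helm` and `helm_to_peptide` are hypothetical and are not part of RDKit or any HELM toolkit; real toolkits also handle modified monomers, nucleic acids, cyclisation and monomer libraries, none of which is attempted here.

```python
import re

# One-letter codes of the 20 natural amino acids.
NATURAL_AMINO_ACIDS = set("ACDEFGHIKLMNPQRSTVWY")


def peptide_to_helm(sequence: str, polymer_id: str = "PEPTIDE1") -> str:
    """Build a HELM string for a linear peptide of natural amino acids.

    Hypothetical helper for illustration only.
    """
    sequence = sequence.strip().upper()
    if not sequence:
        raise ValueError("empty sequence")
    unknown = sorted(set(sequence) - NATURAL_AMINO_ACIDS)
    if unknown:
        raise ValueError(f"unsupported residue codes: {unknown}")
    # Monomers are separated by dots inside braces; the four trailing
    # dollar signs delimit the (empty) remaining sections of the notation.
    monomers = ".".join(sequence)
    return polymer_id + "{" + monomers + "}$$$$"


def helm_to_peptide(helm: str) -> str:
    """Recover the one-letter sequence from a simple linear peptide HELM string.

    Hypothetical helper; rejects anything beyond a single unmodified peptide.
    """
    match = re.fullmatch(r"PEPTIDE\d+\{([A-Z](?:\.[A-Z])*)\}\$\$\$\$", helm)
    if match is None:
        raise ValueError("not a simple linear peptide HELM string")
    return match.group(1).replace(".", "")


if __name__ == "__main__":
    helm = peptide_to_helm("AGC")
    print(helm)
    print(helm_to_peptide(helm))
```

The sketch deliberately fails on anything it does not understand, such as bracketed modified monomers or multiple chains, rather than guessing; those cases are exactly where a full HELM toolkit and monomer library are needed.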
PerkinElmer integrated the Pistoia Alliance HELM standard into ChemDraw®, enabling chemists and biologists to easily describe complex molecular structures, rapidly create biopolymeric structures, and share their information in an industry-standard, publication-ready format. “We look forward to continuing to work with HELM as a standard to better serve the research community by providing modern tools that foster collaboration and enable faster discoveries,” says Pierre Morieux, ChemDraw Global Marketing Manager at PerkinElmer.
Having been involved in HELM tool development from the earliest days, quattro research provides solutions for registration of biomolecules based on HELM notation. With a focus on antibodies and ADCs (antibody-drug conjugates), we have developed the HELM Antibody Editor (HAbE) together with Roche. The xHELM format for data exchange, the ambiguity support of the HELM2 toolkit and a monomer service are additional Pistoia-hosted projects developed and maintained by quattro research, in addition to our internal development and research. Many of these tools are now open source and hosted on GitHub, available to all who are looking into adopting HELM as a standard.
ChEMBL is one of the largest public drug discovery databases, containing information about approved drugs, clinical candidates and lead optimization data, including 1.7 million distinct compounds and more than 11,000 targets.
“A large proportion of new drugs are now biotherapeutics, but many could not be adequately represented by our traditional sequence or structure formats. HELM gives us a great solution, allowing us to accurately describe drugs such as modified peptides and antibodies,” says Anna Gaulton, Senior Data Integration Officer for the EMBL-EBI Chemogenomics Team.
The ChEMBL database team have worked with the HELM project on methods to fragment peptide structures and define new monomers. These have been used to convert more than 20,000 natural and modified peptide structures to HELM and create a publicly-available library of more than 2800 peptide monomers.
PubChem is an open chemistry database at the National Institutes of Health (NIH), which provides information on chemical structures, identifiers, chemical and physical properties, biological activities, patents, health, safety, toxicity data, and many others to several million scientists worldwide.
PubChem contains over 500,000 structures represented in the HELM notation. Many of these are complex; for example, only 65% of the HELM peptides are exclusively made up of amino acids.