Cast your mind back to the nineteen seventies.  Think Glam Rock, Morecambe and Wise, Watney’s Red Barrel, the Ford Capri, endless strikes and every man sporting long hair, a dodgy tie and enormous sideburns.

In those turbulent times, many pharmaceutical companies were still managing some of their R&D information systems on paper. Those that had moved to computer-based systems were tied to the mainframe, with little or no interactive access. All updates and queries were performed in batch mode – apart from online searches of external systems, which were carried out mostly on teletypes.

And yet, even then, pharmaceutical companies were actively collaborating on an IT project that they believed would be of mutual benefit.

In the early seventies, ICI Plant Protection produced an index of 20,000 organic chemicals from a dozen or so supplier catalogues. Paper copies of this index were shown to a meeting of the UK chapter of the Chemical Notation Association (now the Chemical Structure Association), and an inter-company project was launched with the aim of producing a database of all organic chemical suppliers’ catalogues.  The name of the project was CAOCI: the Commercially Available Organic Chemicals Index.

Coding into Wiswesser Line Notation (WLN – a predecessor of SMILES) and punching (on to 80-column punch cards – the preferred input format) were carried out by UK and US R&D Information staff in six pharmaceutical companies: ICI, Wellcome, Glaxo, Boots, Pfizer and Beecham. ICI co-ordinated and consolidated the system using their internally developed software, CROSSBOW, and within 18 months produced printouts of Molecular Formula and WLN KWIC indexes which were easily decipherable by trained information scientists and some adventurous chemists – such folk existed, even then.

ICI sold the CROSSBOW system to its collaborators, who now had a standard product for substructure searching using bit, string and connection table searches – not so different from what is done today.  With CAOCI, they also had most organic chemical supplier catalogues on a single searchable database.

CAOCI went from strength to strength: suppliers began to get much more involved, and it went on to be managed (along with CROSSBOW and other ICI database systems) by Fraser-Williams Scientific Systems. Later on, it was renamed the Fine Chemicals Directory and, after a couple of changes in ownership, became the Available Chemicals Directory (ACD).

ACD is still with us today, and is now owned by Accelrys. (For details, please contact Keith Taylor ([email protected]).)

So the principles underlying PISTOIA have a firm foundation, and there is a strong precedent for collaboration across the industry leading to enduring value.

Just ask any R&D chemist!

Now, here are some questions to (hopefully) stimulate discussion:

  • Are there any other examples of long-lived, successful collaborations?
  • Is today’s environment more hostile or more open to such initiatives?
  • What role should commercial partners play in cross-industry collaborations?
  • If CAOCI were launched from scratch today, what would it look like, and how would it operate?


S. Barrie Walker: J. Chem. Inf. Comput. Sci., 1983, 23 (1), pp 3–5

(Paul Hadland started working on CAOCI in 1978 and has been involved with Pharmaceuticals R&D ever since.)

Posted in Cheminformatics, Collaboration & community, Standards & deliverables | 2 Comments

We don’t suffer a lack of ideas for potential Pistoia Alliance projects. The trick is getting a critical mass (including people and funds) behind an idea to turn it into an active project driving toward a solution. With the latest round of portfolio projects, we worked with David Seemungal of Cubase Consulting to conduct an indicative valuation of the various proposed activities. I asked David to explain the methodology, which we plan to employ as a way of calibrating expectations around future proposed activities.

My first exposure to the Pistoia Alliance was at the Dragons’ Den meeting in London, where I served as one of the dragons listening to the pitches and determining whether to invest in any of the proposals. Frankly, I ended up holding onto my play money, because none of the proposals were clear about their long-term value and the return investors would see on their investment. That’s not to say the projects didn’t have value—they just weren’t making that value clear in the pitch. So it’s been rewarding to be working with the Alliance to help put some quantitative measures of value onto its activities.

Assessing value is a bit easier for my firm, Cubase Consulting, which is primarily concerned with drugs or diagnostics, where you can track things like future sales. Pistoia Alliance efforts are “virtual” projects aimed at “softer” things like time savings, where the value is seen in freeing people’s time to work on other tasks. We defined value as having five components:

  • Cost: Savings in annual operating costs
  • Productivity: Organization is able to achieve more output at the same cost
  • Time to decision: Organization is able to make a go/no go decision more quickly at the same cost
  • Quality: Organization is able to deliver higher quality output at the same cost
  • Risk reduction: Able to comply with regulatory environment and avoid penalties

The first step in performing the valuation is to determine how these components factor in each project. The result is a “valuation tapestry” that assesses whether each component is a large (red), medium (orange), or low (yellow) component of the project’s value, as shown in Figure 1. So looking at the controlled substance compliance effort (first row in the table), the largest component of the value of this project is in risk reduction, whereas a project like tranSMART (last row in the table) spans most of the other four components.

Table showing the relative values of new Pistoia Alliance projects.

Figure 1: The valuation tapestry for proposed new Pistoia Alliance projects.

Assigning numbers to the values required us to make some assumptions. First, we had to come up with some measures of how many scientists at what levels would be impacted within organizations. We made some rough estimates about how head counts in global R&D are typically apportioned between discovery, development, and operations and admin, as well as how external costs and alliances compare to internal payrolls and administrative costs. Second, we looked specifically at discovery scientists and technicians and broke out their typical daily duties as percentages. These numbers were in turn related back to the head count estimates determined in the first step.

Our valuation methodology then implemented a rather simple algorithm.

A = [Current prevalence of issue] x [Current resolution method] x [Current resolution resources]

B = [Current prevalence of issue] x [Proposed resolution method] x [Proposed resolution resources]

A – B = resources saved through adoption of Pistoia Alliance solution
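The A – B calculation above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the post does not define units for the three factors, so the reading used here (occurrences per year, hours per occurrence, cost per hour) and all of the numbers are hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Resolution:
    """One way of resolving an issue (current or proposed).

    Both fields are assumed interpretations of the post's terms,
    not definitions taken from the original methodology.
    """
    hours_per_occurrence: float  # assumed reading of "resolution method"
    cost_per_hour: float         # assumed reading of "resolution resources"


def annual_cost(prevalence: float, resolution: Resolution) -> float:
    """Prevalence x method x resources, as in the formulas for A and B."""
    return prevalence * resolution.hours_per_occurrence * resolution.cost_per_hour


def resources_saved(prevalence: float, current: Resolution, proposed: Resolution) -> float:
    """A - B: the prevalence is the same on both sides; only the resolution changes."""
    a = annual_cost(prevalence, current)
    b = annual_cost(prevalence, proposed)
    return a - b


# Hypothetical inputs: 1,000 occurrences a year, 4 hours each today, 1 hour each afterwards.
current = Resolution(hours_per_occurrence=4, cost_per_hour=100)
proposed = Resolution(hours_per_occurrence=1, cost_per_hour=100)
print(resources_saved(1000, current, proposed))
```

Note that the prevalence term appears in both A and B, so the saving comes entirely from a cheaper or faster resolution, not from the issue occurring less often.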

For simplicity, we assumed each project had an implementation cost of $0.5 million incurred in 2013, with benefits from projects commencing in 2014. We also assumed a rather high private equity discount rate of 25% and a valuation period of five years. We based all valuations on the net present value (NPV), which is essentially the sum of all future cash flows over that five-year period, with each cash flow discounted back to the present.
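For readers who want to see the arithmetic, here is a minimal NPV sketch in Python. The $0.5 million cost and the 25% discount rate come from the text; the annual benefit figure is hypothetical, and treating 2013 as an undiscounted year 0 followed by five benefit years is an assumption about how the valuation period was laid out.

```python
def net_present_value(cash_flows, discount_rate):
    """Discount each cash flow back to year 0 and sum the results.

    cash_flows[t] is the net cash flow t years after the base year.
    """
    return sum(cf / (1.0 + discount_rate) ** t for t, cf in enumerate(cash_flows))


IMPLEMENTATION_COST = 0.5  # $ million, incurred in 2013 (from the text)
DISCOUNT_RATE = 0.25       # private equity discount rate (from the text)
ANNUAL_BENEFIT = 0.4       # $ million per year, hypothetical
BENEFIT_YEARS = 5          # assumption: benefits in each of 2014-2018

# Year 0 is the 2013 outlay; years 1..5 carry the assumed annual benefit.
cash_flows = [-IMPLEMENTATION_COST] + [ANNUAL_BENEFIT] * BENEFIT_YEARS
project_npv = net_present_value(cash_flows, DISCOUNT_RATE)
print(f"NPV: ${project_npv:.2f} million")
```

At a 25% rate, a benefit arriving five years out is worth only about a third of its face value, which is why a high discount rate favors projects that pay back early.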

This kind of quantitation of ROI adds an important dimension to assessing the value of pursuing projects that can be taken together with other, softer reasons to determine which projects to carry forward. Of course, value, like beauty, is often in the eye of the beholder. What do you think of our methodology? And does your organization attach quantitative measures of ROI to determine which projects you should undertake?

Posted in Workflows & business processes

The Pistoia Alliance is partnering with the TM Forum (a non-profit industry association focused on enabling service provider agility and innovation) to explore the use of mobile apps in life science R&D. An initial meeting was held at the TM Forum headquarters in Morristown, New Jersey on 26 Oct 2012. The Pistoia Alliance “App Strategy” team shared its vision for the creation of a vibrant mobile community for life science researchers that would foster innovative thinking among researchers and accelerate research breakthroughs globally. Supporting mobile apps in this space will necessitate back-end cloud environments to support data-sharing, information navigation, and high-performance computation to enable desired workflows, elements that are firmly in the wheelhouse of TM Forum member companies.

The focus of the meeting was information sharing—to learn from each other and share perspectives. Invited speakers covered a wide range of topics relevant to the mobile app world. Speakers from the Pistoia Alliance community covered chemistry, biology/bioinformatics, translational medicine, and health IT, while TM Forum covered all things “cloud” including standards, security and computer systems validation, app store models, and novel visualization capabilities.

Chemistry is one area that already has seen some app development, and current mobile app functionality was reviewed, including advanced cheminformatics tools, access to public databases like ChemSpider, and novel ways to foster collaboration and data sharing via apps like ODDT and data appification (such as TB Mobile, which makes available a set of molecules with activity against Mycobacterium tuberculosis and known targets in CDD). The Pistoia members described some of the current limitations in mobile apps for chemistry, in particular the difficulty involved in scaling up app capabilities in terms of data size. Mobile apps are making excellent progress in providing functionality for working on small data sets, but working with large data collections will require a new cloud-based infrastructure. It was indeed striking to see that, in general, biology mobile apps were not as prevalent as chemistry apps, though Life Technologies has collected a few examples.

Another topic covered by several groups was the use of cloud environments to power compute-intensive tasks and support workflows in gene sequencing and genomics. This was predicted to become increasingly routine in both life sciences research and clinical environments. An intriguing discussion covered how, and even whether, mobile apps could actually enhance capabilities in these areas. In the healthcare domain, crowdsourcing approaches are being used extensively to transform how doctors and patients interact with each other and with their data. Moving from complex static web pages to simplified app designs that make complex information easy to understand has led to accelerated adoption in healthcare and provides a useful framework to emulate as we move to harness the power of apps to both simplify and transform life science R&D. Further, the emerging use of open source platforms such as tranSMART (an effort that the Pistoia Alliance is championing) drives the need for cloud-based data-sharing environments that apps could leverage to drive communication and data analysis.

Many presentations discussed the differentiating features of mobile apps (due to their design) that might make them compelling to life scientists. The camera standard on mobile platforms is a powerful differentiator, and the group brainstormed on what that capability could do to transform thinking. One novel app using the camera could potentially be used to render static pages into 3D models of a molecule or a protein, or even a pathway analysis view. QR codes or other visual recognition modalities might also be useful—could pointing the camera at an instrument bring information on that instrument (such as when it was last used or calibrated) directly to the user?

It was enlightening to hear such a diverse set of opinions from the audience at this meeting. The Pistoia Alliance App Strategy team is now working on developing potential projects we could do in partnership with the TM Forum. We welcome input for this ongoing discussion.

Posted in Bioinformatics, Cheminformatics, The life science cloud

The Pistoia Alliance is signposting the Hierarchical Editing Language for Macromolecules (HELM), developed at Pfizer, as a way to solve the problem of how to consistently represent large molecules, such as proteins, peptides, oligonucleotides, and small molecule drugs. These complex structures have long challenged informatics because they are large enough to be unwieldy and impractical to represent at the atomic level, while the presence of non-natural chemical modifications makes it impossible to represent them by sequence alone. HELM offers a solution, and a group at Pfizer has recently published a paper on this language for macromolecule representation. We urge interested parties to give it a read.

A sample of a HELM representation.
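To give a feel for the monomer-level idea, the sketch below splits a HELM-style string into its polymers and their monomers. It is a rough illustration, not the HELM toolkit: the example string and the monomer IDs in it (meA, PEG2) are illustrative assumptions rather than anything taken from the paper, and the parser only reads the first, simple-polymer section of the notation.

```python
import re


def parse_simple_polymers(helm: str) -> dict[str, list[str]]:
    """Return {polymer_id: [monomer_id, ...]} for the first '$'-delimited section.

    Simplifications (assumptions of this sketch): polymers are separated by '|',
    monomers by '.', and bracketed IDs such as [meA] contain no '.', '|' or '$'.
    Branch notation and the later sections (connections, annotations) are ignored.
    """
    polymers_section = helm.split("$", 1)[0]
    polymers: dict[str, list[str]] = {}
    for polymer in polymers_section.split("|"):
        match = re.fullmatch(r"(\w+)\{(.*)\}", polymer)
        if match is None:
            raise ValueError(f"Not a simple polymer: {polymer!r}")
        polymer_id, body = match.groups()
        # Strip the brackets that wrap multi-character monomer IDs.
        polymers[polymer_id] = [monomer.strip("[]") for monomer in body.split(".")]
    return polymers


# Illustrative string: a three-residue peptide with one non-natural residue,
# linked to a chemical modifier. The connection section after the first '$'
# is carried along but not interpreted here.
example = "PEPTIDE1{A.[meA].C}|CHEM1{[PEG2]}$PEPTIDE1,CHEM1,3:R3-1:R1$$$"
print(parse_simple_polymers(example))
```

The point of the exercise is that a non-natural residue is named as a monomer in the sequence, so the molecule stays readable without falling back to an atom-by-atom description.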

Posted in Cheminformatics, Standards & deliverables

The Pistoia Alliance App Strategy was outlined during a webinar presentation on 9 November. The presentation began with a brief overview of the three proposed phases of the Pistoia Alliance AppStore, and continued with a lively question and answer session that covered all facets of our strategy and “appification” in life science R&D. Nearly 100 Pistoia members and other interested parties signed on for the webinar. If you missed it, you can check out the entire 70-minute session, which was recorded.

Posted in The life science cloud, Workflows & business processes

This Wednesday, October 3, I’ll be delivering a webinar titled “Practical cheminformatics workflows with mobile apps.” The webinar is free and runs only half an hour. It describes a few of the common cheminformatics workflows that can currently be handled with mobile apps, as well as the general R&D philosophy behind the Pistoia appification strategy. You can get all the details on the webinar and links to view it over at the American Chemical Society website.

Posted in Cheminformatics, The life science cloud

In Part I of this series, Sean Ekins outlined the need for a way to share data about rare and neglected disease research. This need inspired the Open Drug Discovery Teams (ODDT) mobile app, developed for the Dragons’ Den session at the Pistoia F2F meeting in February and launched on the Apple app store in April 2012. The full evolution of ODDT is chronicled on my blog. In this entry, I invited Sean to discuss how the app works. The work on ODDT demonstrates the importance of the Pistoia Alliance’s new appification strategy, which aims to make informatics tools accessible to scientists and the broader community interested in using mobile devices to conduct and communicate about science.

Sean Ekins

Recognizing that parent-led disease organizations use Twitter and actively blog to promote the study of their children’s diseases, the ODDT app tracks Twitter hashtags and Google Alerts corresponding to certain diseases and aggregates links to articles and other information under topic headings. The app is chemistry aware, enabling scientists to tweet molecules they are making, want to share with others, or need to find. Structure-activity data can also be shared in the app, giving motivated citizen scientists, such as parents and patients, who want to learn about scientific software the opportunity to work with tools similar to those used in larger research organizations. All information aggregated by ODDT is crowd curated; users can endorse or disapprove links to improve both the quantity and quality of the data reported in the app.
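The aggregate-then-curate loop described above can be pictured with a small Python sketch. This is not ODDT's actual implementation: the class names, the scoring rule (endorsements minus disapprovals), and the example URLs are all hypothetical, chosen only to show links grouped under topic headings and ranked by crowd votes.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass
class Link:
    """A link collected under a topic, with crowd votes (hypothetical model)."""
    url: str
    endorsements: int = 0
    disapprovals: int = 0

    @property
    def score(self) -> int:
        # Assumed scoring rule: net votes.
        return self.endorsements - self.disapprovals


class TopicFeed:
    """Groups links under topic headings and ranks them by net votes."""

    def __init__(self) -> None:
        self._topics: dict[str, dict[str, Link]] = defaultdict(dict)

    def add(self, topic: str, url: str) -> Link:
        # Adding the same URL twice under a topic returns the existing entry.
        return self._topics[topic].setdefault(url, Link(url))

    def vote(self, topic: str, url: str, endorse: bool) -> None:
        link = self.add(topic, url)
        if endorse:
            link.endorsements += 1
        else:
            link.disapprovals += 1

    def ranked(self, topic: str, min_score: int = 0) -> list[Link]:
        links = [l for l in self._topics.get(topic, {}).values() if l.score >= min_score]
        return sorted(links, key=lambda l: l.score, reverse=True)


feed = TopicFeed()
feed.vote("tuberculosis", "https://example.org/a", endorse=True)
feed.vote("tuberculosis", "https://example.org/a", endorse=True)
feed.vote("tuberculosis", "https://example.org/b", endorse=False)
print([link.url for link in feed.ranked("tuberculosis")])
```

Filtering on a minimum score is one simple way for disapprovals to push weak links out of view while endorsements move useful ones to the top.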

ODDT helps parent-led organizations highlight their causes and endorse content relevant to their communities, ensuring rapid and more substantive conversations that can lead to more effective collaboration. In the process of developing and communicating ODDT, we have actively raised the profile of these diseases, bringing them to the attention of thousands of people through mentions on blogs, in papers, posters, and oral presentations, and even through an IndieGoGo crowdfunding campaign.  We are only at the beginning of what we could achieve with the rare and neglected disease communities.

ODDT has also demonstrated the need to supplement the academic publication system, which locks most important discoveries behind paywalls. By having access to raw data in a readily usable form, anyone can easily incorporate the data into their own projects and avoid unnecessary duplication of effort. It also brings a benefit to scientists, as a parent or patient may see the research and offer to fund it or assist in its commercialization.

ODDT capitalizes on the shift towards low-cost, consumer-friendly apps and serves as a flagship effort to bring together professional scientists, charitable foundations, and concerned citizens in an open context that breaks down institutional or geographic barriers. ODDT illustrates how the Pistoia Alliance can help inspire the development of a new mobile app and jumpstart connections between communities. It’s the basis of our developing app strategy. The strategy and the store will provide a way for customers and developers to generate additional new scientific apps. In the case of ODDT, exposing the app in the Pistoia Alliance app store would show organizations a pilot of what could be developed into a commercial product that lets organizations mix private documents with external data. This could result in a mobile, cloud-based complement to existing chemical databases that could spur ideas for future discovery programs.

In the interest of full disclosure, Sean donates time to several rare disease groups, including Phoenix Nest and BioGan. These and many other rare disease groups could benefit from your support.

Posted in Collaboration & community, The life science cloud

One of the great things about mobile apps is that they are low-profile, easy-to-adopt tools that theoretically could remove traditional barriers between information sources. In fact, as I discussed in my last entries, they have the potential to create a whole new ecosystem of information and users. Nowhere is this more evident, or more important, than in the area of rare and neglected disease research, where disparate (and often desperate) information seekers need better ways to access and share information. I’ve invited Pistoia Alliance board member Sean Ekins to talk about the Open Drug Discovery Teams (ODDT) app we developed together as a “skunk works project” for the Dragons’ Den session at the Pistoia Alliance F2F meeting in February 2012. The app has come a long way since that initial presentation—but more on that in Part 2.

Sean Ekins

Today we are seeing parents becoming bona fide world experts in rare diseases out of necessity, as they form foundations and companies to fund research to help their sick children. Take Jill Wood, who founded an organization when her son was diagnosed with Sanfilippo Syndrome, or Lori Sames, who founded one to promote study of her daughter’s disease, giant axonal neuropathy. There are many others. Organizations like these don’t have the scale of the NIH, the Bill and Melinda Gates Foundation, or the Michael J. Fox Foundation. But they are virtual pharmas, and they are highly motivated to find cures. What they lack is a way to rapidly access, gather, and share information on disease research that may be occurring anywhere globally. What if we could help these parents and the researchers working on rare diseases work more collaboratively, as Alex Mackenzie and co-authors recently suggested?

At the opposite end of the spectrum are diseases such as tuberculosis and malaria that affect millions rather than the one in a million suffering from a given rare disease. Despite their prevalence and the considerable amount of funding they can receive, these diseases are often classed as neglected because progress is slow, not well coordinated, and rarely utilizes informatics, computational tools, and other technologies to capitalize on knowledge accumulated globally.

In both of these cases, information exists. Research is occurring somewhere, but it’s occurring in a vacuum. The challenge is how to connect the right people to the right information at the right time so that they can share, partner, and collaborate. A few pioneering open notebook scientists have explored ways to make science more open. Mat Todd at the University of Sydney, for instance, uses his blog as a window into his ongoing research, and many scientists have taken to tweeting discoveries to their contacts. But such efforts are still akin to shouting on a street corner. How can we boost the signal to connect those doing the shouting with those who most want to listen, wherever they are in the world?

In our next entry, we will describe the genesis of the ODDT app (developed jointly by Collaborations in Chemistry and Molecular Materials Informatics) and its impact on rare and neglected disease research.

In the interest of full disclosure, Sean donates time to several rare disease groups, including Phoenix Nest and BioGan. These and many other rare disease groups could benefit from your support.

Posted in Collaboration & community, The life science cloud

In my last entry, I posited that technology should NOT be a barrier to “appifying” R&D workflows. So why haven’t apps taken off so far in R&D? I’d argue that it comes down to the paradigm shift that mobile technology has created in computing. Turning an existing software workflow into an app requires distilling the core essence of a task to a minimalistic, simple user interface that focuses on what really matters. Further, because even apps designed for niche use will be compared to the current menagerie of consumer apps that many of us use daily, scientists will expect a quality user experience.

That’s a tall order for the typical software vendor, which is why the Pistoia Alliance is pursuing an “app strategy” aimed at bringing together three necessary components to “appify” R&D:

  • The technology components needed to produce useful R&D apps
  • The scientific software engineers with the skill to assemble apps
  • The life science researchers who want to use the finished products

By encouraging all parties to come together, the Pistoia Alliance seeks to create a community where professionals can engage one another to map out solutions to problems with a high signal-to-noise ratio. And we think this community will be a powerful force for innovation. By offering a quality user experience, focused feature sets, a shallow learning curve, and ease of installation, app versions of scientific workflows will be more likely to be used by large numbers of people, and will be more likely to be used more frequently, wherever there is a way for the tool to be useful. That will mean more productive scientists, and more opportunities to innovate.

Look for more information on our app strategy in the coming weeks. In the meantime, I’d love to hear what apps you would like to use (or are currently using) for R&D.

Posted in The life science cloud, Workflows & business processes

The computing transformation being effected by mobile computing may not be one we fully appreciate while burying our heads and thumbs in the latest cool app or game. Yet this transformation is likely the most important since the introduction of the personal computer. It’s not just about the inherent portability of smartphones and tablets. It’s that the transformation marks a complete change to the underlying platform. This stands in contrast to the last 30 years, where despite changes to the user experience, the fundamental concepts driving the technology would be familiar on the whole to late 1980s power users.

Making software available on a mobile device requires a complete code rewrite based on entirely different underlying concepts. Here are a few to consider:

  • A mobile app is always a modular component rather than a monolith
  • User interaction is done by touchscreen, which has very different properties compared with keyboard and mouse
  • Computational resources must be presumed expensive, because battery life is trump
  • Networking and communication features are core features rather than add-ons

The most popular mobile device platforms—iOS and Android—made a clean break from legacy software when it came to design. They have also set a high bar for user experience, such that users now take ease of use, trivial installation, and automatic versioning for granted. With the introduction of numerous high quality apps for almost every conceivable purpose, the number of users has expanded geometrically, and likewise has the level of satisfaction: mobile devices have become an integral part of the lifestyle of many people who regard a laptop or desktop computer as just a tool.

So why has the mobile device revolution not yet made a significant impact in the R&D branch of life sciences? Some argue that it’s difficult to adapt certain science-specific functionality, such as chemical structure drawing, to the mobile form factor. Others mention that bioinformatics and computer-assisted drug design involve lengthy calculations on big datasets, tasks better suited to traditional computers. And portability and facile network communication are not necessarily positives when it comes to handling sensitive proprietary data.

But none of these issues is insurmountable. For example, my company, Molecular Materials Informatics, has demonstrated that chemical structures can be drawn on a palm-sized device just as effectively as on desktop software. Big data/long calculation workflow scenarios can be handled by installing data and software on servers and, through the creative use of open-source components, protocols, algorithms, and a flexible API (application programming interface), building the connections so that apps can plug into these resources. And while security issues will always be important for enterprise app deployment, methods exist for ensuring data privacy in cloud-based environments, and the option always exists for companies to deploy services in-house.

What other issues have prevented apps from taking off in life science? I’ll give my thoughts in my next entry.

Posted in The life science cloud, Workflows & business processes