Sequence Services

Currently, many R&D functions in life science companies build and maintain their own infrastructure and associated workflows to search gene repositories effectively. Yet this infrastructure confers little competitive advantage. Further, maintaining even a core set of sequence databases and associated software tools, as the Red Queen told Alice, "takes all the running you can do to keep in the same place."

The sequence services working group aims to define and demonstrate shared, hosted services for securely storing and mining both proprietary and public domain gene databases. Suppliers will provide the securely managed services to subscribers, who will benefit by significantly reducing their individual infrastructure management costs while positioning their organizations to meet future challenges in gene data management, most notably the data deluge that will be unleashed by initiatives such as the 1000 Genomes and other NGS projects.

Schematic of sequence services phase 1 concept

In Phase 1, the sequence services working group focused primarily on developing non-functional requirements, including security, performance, scalability, availability, maintenance, and business models, with a secondary emphasis on scientific functionality. Four vendors were selected in 2010 to provide proofs of concept: Cognizant/Eagle Genomics, Constellation Technologies/Microsoft, Infosys, and Thomson Reuters. After testing and "ethical" hacking by AT&T, the proofs of concept were publicly demonstrated and published in April 2011.

Phase 2 will build on the Phase 1 achievements by providing a platform for analyzing and storing next-generation sequencing (NGS) data. Potential benefits of such a platform include:

  • The ability to collaborate securely and easily with other organisations and individuals without any risk to company firewalls
  • The ability to store large amounts of data in an extensible way. This is a problem within Pharma where the internal capacity planning cycles are typically much longer than the time over which demand varies. This applies equally well to compute as to storage.
  • Cost reduction and conversion from capital expense to operational expense.
  • Availability of the latest public data and applications, outsourcing and thereby sharing the cost of managing these sometimes rapidly changing resources.

A detailed RFP was released in July 2011 and 10 responses were received from a wide range of vendors. Three proposals were selected for shared risk funding.

Click below to learn more about the proposals, which will be demonstrated at the Pistoia Alliance Conference and Members' Meeting in April 2012.

Constellation/GeneStack

The partnership of Constellation and GeneStack brings together two innovative companies with world-leading technologies from the best research institutions in their field and large-company levels of service. The strength of the IT and bioinformatics expertise within this partnership will help future-proof a bioinformatics-on-the-cloud service.

Constellation (www.constellationtechnologies.com) is a technology company supplying “bioinformatics solutions on the cloud” to large and small life science companies. Using IT technologies developed within CERN and UK research programmes, the company supplies large Pharma and small biotech companies with secure specialist solutions to their bioinformatics big data problems. Constellation’s part of the consortium also includes Microsoft’s Azure cloud platform and STFC, one of Europe’s leading research institutes. Constellation was one of the companies that provided a proof of concept for Phase 1 of the sequence services project.

GeneStack (www.genestack.com) is a new company building a universal platform for bioinformatics application development. Founded by former European Bioinformatics Institute employees, the company possesses wide expertise in computer science and bioinformatics, ranging from functional genomics data analysis, algorithm design and data visualization to next-generation sequencing data processing and quality control. GeneStack also brings to this consortium their partner JetBrains, a software development company with 11 years of experience building high-quality professional tools and services.

Eagle Genomics/Cycle Computing

Eagle Genomics and Cycle Computing’s participation in Sequence Services Phase 2 builds on the success of Eagle’s participation in Phase 1. The consortium will leverage Cycle’s in-depth knowledge of building highly scalable solutions on top of the Amazon Cloud and Eagle’s domain expertise in bioinformatics. The bioinformatics platform developed by the partnership will reuse existing components from Phase 1. We will provide additional tools from the bioinformatics world, augmented with new and improved versions of tools already in use within both companies.

 The trans-Atlantic partnership between Eagle, based in Cambridge UK, and Cycle, based in the USA, offers both sides an opportunity to validate their business models. The extensive use of open-source tools and open-access data within the platform demonstrates the power of open innovation. Combining best of breed in bioinformatics and cloud computing from numerous sources allows us to meet Pistoia's functional requirements within the time frame. Components being used in the platform include Cycle Computing’s powerful Cycle Server high performance compute interface, SysMO-DB’s SEEK asset catalogue, and the Bio Linux workstation.

 Eagle and Cycle are excited about the prospect of working on this platform together and demonstrating it at the Pistoia Conference in Boston in April 2012. This innovative approach to platform development should allow the end result to be highly flexible and adaptable to each end-user’s needs. It also ensures maximum levels of security and data integrity.

Hewlett-Packard Company

HP and partners will develop a sequence service platform that will encourage collaboration, cut costs, promote open standards, and provide business opportunities for third parties. HP will build a highly scalable platform for multi-tenant collaboration around the analysis of next-generation sequences, supporting the discovery of new targets and new biomarkers and the development of new insights for more targeted healthcare delivery.

HP is strongly aligned with the overall Pistoia Alliance vision to enable more effective networked and externalized life science research via improved knowledge transfer, systems interoperability, and a collaboration environment for working groups. HP appreciates that more effective linkage and leverage of information is at the heart of changing healthcare and delivering more targeted therapies. For the future HP offers the ability to create innovative selective collaboration environments, accommodate new use cases, extend linkages to other data sets and types, and further extend capabilities of search and inference engines.

HP would like to consult extensively with members of the Pistoia Alliance during Phase 2 of this project in order to understand more about user needs, how HP can partner with Alliance members to further the goals of Pistoia, and where to focus investment.

Sequence Services Team

  • Simon Thornber (project lead), GSK
  • Claus Stie Kallesøe, Lundbeck
  • Ralf Wahl, Novartis
  • Cary O’Donnell, AZ

Phase 1 outputs

Cognizant/Eagle

This phase 1 platform delivers Ensembl, Plasmapper, and a Gene Alias service.

View presentation describing service »

Contact Cognizant to evaluate the service.

Constellation/Microsoft

As part of a suite of “bioinformatics on the cloud” services provided by Constellation, this service enables users at all levels to access applications and tools on either Windows or Linux platforms. 

Contact Constellation to evaluate the service.


Infosys

This secure hosted platform provides a trusted, federated workspace that authenticates user logins against corporate identities. Scientists can access post-genomic data management services that scale for performance and provide a controlled view of linked public and private data.

View presentation describing the service »

Contact Infosys to evaluate the service.

Thomson Reuters

More for members

Pistoia Alliance members can log in to Basecamp to get the latest information published by the Sequence Services working group.