21 November 2014 | Opinion | By BioSpectrum Bureau
Cloud Computing in Biomarker Discovery
Mr Aaron Hudson & Mr Jordan Stockton
Singapore: Across industries, business needs are driving institutions to utilize cloud computing for a myriad of reasons, including simplicity of deployment, faster processing times, and reduced operating costs. In the research-intensive life sciences industry, the need for systems that can accommodate massive data sets is driving labs to adopt cloud-based technologies. These solutions also enable researchers to communicate and collaborate with internal and external stakeholders by supporting enhanced sharing, analysis, and reporting capabilities. Additionally, simplified, cloud-based user interfaces are expanding the number and diversity of people who can productively interact with biological data.
The AB SCIEX-Illumina OneOmics Initiative
Recently, AB SCIEX and Illumina embarked on the global OneOmics project, which enables the integration of proteomic and genomic data analysis by bringing together SWATH acquisition-based next-generation proteomics (NGP) and next-generation sequencing (NGS) tools in a cloud environment. Simply put, AB SCIEX's SWATH Proteomics Cloud Tool Kit suite of beta applications is now hosted in BaseSpace, Illumina's cloud-based informatics environment, providing a single location for genomics and proteomics big data.
AB SCIEX's patented SWATH Proteomics software addresses the "missing data problem", making reproducible proteome research feasible across many samples for the first time. The software allows thousands of proteins and peptides to be examined at one time with almost no method development. This reproducibility is missing from the traditional "shotgun" proteomics approach. With these new cloud-based applications, customers can process data up to 25 times faster than on non-cloud-based computer systems.
Until now, Omics researchers around the globe have never had a single, off-the-shelf solution to store, manage, analyse, and compare data generated from different technologies. With the new tools developed in the OneOmics project, they can go beyond simple biomarker identification, and begin to understand the multi-dimensional molecular mechanisms associated with disease and cellular function.
Benefits of Using BaseSpace for Proteomics
In addition to putting genomics and proteomics data in the same place, the OneOmics cloud computing solution delivers six key benefits to the global research community.
1. Lightning-fast processing of big Omics data
Proteomics analyses of complex biological samples can quickly generate hundreds of gigabytes of data. In studies that include technical and biological replicates especially, study sizes can grow rapidly, and processing the data can take days and occupy valuable computing resources. By using highly parallelized computing resources in the cloud, users can process their SWATH data up to 25 times faster, cutting the time spent on processing from days to hours.
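To put the "days to hours" claim in concrete terms, here is a minimal arithmetic sketch in Python. The three-day local processing time is a hypothetical figure chosen for illustration, not one reported by the authors, and the 25-fold speedup is the best case quoted above ("up to"), so real runs may see less.

```python
def cloud_processing_hours(local_hours, speedup=25.0):
    """Estimate cloud processing time from a local time and a speedup factor.

    The default speedup of 25 is the upper bound quoted in the article;
    it is a best case, not a guaranteed figure.
    """
    if speedup <= 0:
        raise ValueError("speedup must be positive")
    return local_hours / speedup


# Hypothetical study: three days of processing on a local workstation.
local_hours = 3 * 24
cloud_hours = cloud_processing_hours(local_hours)

print(f"{local_hours} h locally is about {cloud_hours:.2f} h at a 25x speedup")
```

Under these assumptions, a job that ties up a local machine for three days would finish in under three hours.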
2. Data visualization
With the Protein Expression Browser, a feature in the SWATH Cloud Toolkit, users can conveniently summarize their SWATH proteomics experiments, producing biologist-friendly graphics. And with the Cloud Toolkit's proprietary algorithms, they can convert their raw data into statistically rigorous assignments of protein fold-change. From producing a feature table heat map, to identifying alternatively regulated splice variants, to determining gene ontology (GO), all the results generated can be visualized in the tool and presented simply, making the data easier for biologists to understand.
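For readers unfamiliar with the term, protein fold-change is the ratio of a protein's abundance between two conditions, usually reported on a log2 scale. The sketch below is a generic illustration in Python, not the Cloud Toolkit's proprietary algorithm: it uses a plain ratio of mean intensities with no statistical testing, and the protein names and intensity values are hypothetical.

```python
import math
from statistics import mean


def log2_fold_change(control, treated):
    """Return log2(mean(treated) / mean(control)) for one protein.

    A value of +1 means the protein doubled in the treated condition;
    -1 means it halved.
    """
    control_mean = mean(control)
    treated_mean = mean(treated)
    if control_mean <= 0 or treated_mean <= 0:
        raise ValueError("mean intensities must be positive")
    return math.log2(treated_mean / control_mean)


# Hypothetical replicate intensities for two proteins (arbitrary units).
intensities = {
    "PROT_A": {"control": [100.0, 100.0, 100.0], "treated": [400.0, 400.0, 400.0]},
    "PROT_B": {"control": [200.0, 200.0], "treated": [100.0, 100.0]},
}

fold_changes = {
    name: log2_fold_change(values["control"], values["treated"])
    for name, values in intensities.items()
}

for name, value in fold_changes.items():
    print(f"{name}: log2 fold-change = {value:+.2f}")
```

A statistically rigorous assignment would also account for replicate variance and multiple testing, which this sketch deliberately leaves out.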
3. Data-sharing across the world
Life science research organizations have all experienced the hassle of sending a data file as large as 50 GB (equivalent to 10 samples) on a USB drive by post, with the risk that the parcel is lost or damaged in transit. Using BaseSpace, these organizations can share their data anywhere in the world, securely, with the click of a button. As more research projects become multi-lab or even multi-national collaborations, the cloud ensures that logistics do not limit productivity.
4. Enhanced security
BaseSpace ensures that user data are protected through physical, electronic, and administrative measures. Hosted on Amazon Web Services (AWS), the platform is compliant with a wide variety of industry security standards. When data are being transmitted in BaseSpace, all information is encrypted using the SSL protocol, and all data stored in BaseSpace are encrypted. Users do not have to worry about their research data getting lost or being accessed by outsiders.
5. Simplified IT
With the SWATH Cloud Toolkit on BaseSpace, scientists and biologists can eliminate the tedious IT processes needed to manage, analyse, compare and store their data. The platform can help automate bioinformatics analyses using cloud-based applications and ensure that users are always working with the latest version, while providing a scalable, secure storage environment that meets the rigorous demands of multi-omics research.
6. Ability to take on more projects
Cloud solutions like BaseSpace enable researchers to do more with less. Simple processes, fast analysis, and easy-to-understand data enable core labs to take on more projects. In an era where drug development and disease research are pressed to show increased return on investment, the scientific community needs to invest in new technologies to accelerate the move to personalized medicine.
To gain a competitive edge, life sciences researchers can now capitalize on the benefits of cloud technologies, which give them the flexibility to experiment with new analytical methods while minimizing their investments in technical infrastructure. Cloud-based solutions deliver a slew of advantages, such as fostering closer collaborations, saving time and resources, and eliminating tedious IT processes. The result is that scientists can now draw on a growing ecosystem of online analysis resources in a rapidly evolving industry.