Researchers call for global genomic data repository in the cloud

Researchers from Canada, Europe and the United States are calling on major funding agencies to set up a global genomic data repository in the cloud available to authorized researchers worldwide.

In an article in Nature, they note that cloud services offer massive storage and compute power on a pay-as-you-go basis, with multiple users sharing hardware. They applaud the U.S. National Institutes of Health's decision in March to lift its restriction on the use of cloud computing for storing genomic data sets.

"The NIH turnaround is part of a growing suite of efforts aimed at addressing the fact that in the human genomics research community, the challenges of accessing big data sets are now blocking scientists' ability to do research, and especially to replicate and build on previous work," they state.

The researchers urge NIH and similar agencies elsewhere to create and pay for a public archive of genomic data in the cloud.

In the International Cancer Genome Consortium, groups from 17 countries have amassed two petabytes of data, roughly 500,000 DVDs' worth, in just five years. Downloading that data through a typical university Internet connection would take more than 15 months, the researchers say, and the hardware just to store it would cost more than $1 million.
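As a rough check on those figures, the short Python sketch below works out the per-disc size and the sustained download rate they imply. It assumes a decimal petabyte (10^15 bytes) and a 30-day month; neither is stated in the article, and the function names are illustrative only.

```python
# Back-of-the-envelope check of the figures quoted above.
# Assumptions (not stated in the article): 1 PB = 10**15 bytes, 1 month = 30 days.
PETABYTE = 10**15  # bytes, decimal (SI) definition


def per_disc_gb(total_bytes, discs):
    """Average gigabytes per disc if total_bytes is spread over `discs` discs."""
    return total_bytes / discs / 10**9


def implied_mbps(total_bytes, months, days_per_month=30):
    """Sustained rate in Mbit/s needed to move total_bytes in `months` months."""
    seconds = months * days_per_month * 86_400
    return total_bytes * 8 / seconds / 10**6


total = 2 * PETABYTE
print(f"{per_disc_gb(total, 500_000):.1f} GB per DVD")
print(f"{implied_mbps(total, 15):.0f} Mbit/s sustained")
```

Under these assumptions the numbers work out to about 4 GB per disc, close to the 4.7 GB capacity of a single-layer DVD, and a little over 400 Mbit/s sustained for a 15-month download. Since the researchers say "more than 15 months," the connection speed they have in mind would be at or below that rate.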

They contend that major cloud services, such as those from Google, Microsoft and Amazon, are as secure as most academic data centers, if not more so.

With a cloud repository, data would need to be copied only once, and researchers would pay only for temporary storage while an analysis was in progress. Access would be limited to authorized researchers, an announcement about the article notes.

Google and Amazon have been working to woo genomics businesses with their DNA storage capabilities, a market expected to be worth $1 billion in the next three years.

The Fred Hutchinson Cancer Research Center in Seattle is among the growing list of facilities looking to replace their legacy software and computer systems with cloud services.

To learn more:
- read the Nature article
- find the announcement