Cloud storage to enable massive cancer cell database

Johns Hopkins researchers are relying on cloud storage of data from thousands of cell samples to discern the most effective treatment for cancer patients.

Supported by a five-year, $3.75 million grant from the National Cancer Institute, the project aims to help physicians better predict how cancer will behave, since it can spread rapidly in one patient and glacially in another.

The team of experts in cancer and engineering is creating a database of samples collected through a process called high-throughput cell phenotyping, according to an announcement. The data is stored on computers at the Los Alamos National Laboratory.

Denis Wirtz, a Johns Hopkins professor of chemical and biomolecular engineering and associate director of the university's Institute for NanoBio Technology, explained that the project goes beyond storing pictures of the size and shape of cancer cells.

"We also extract information about what is happening inside the cells and at the genetic level. We make notes of the age and gender of the patient and any treatment received. Looked at as a whole, this information can help us identify a 'signature' for a certain type of cancer. That gives us a better idea of how it spreads and how it responds to certain drugs," Wirtz said.

The data allows researchers to trace the course of the disease from initial testing through treatment and outcome. The project's initial focus will be on pancreatic cancer, with breast and prostate cancer to be addressed in the future. Eventually, the database will also store samples from other major U.S. cancer research centers supported by the National Institutes of Health.

The trend toward using cloud computing to share research is creating significant opportunities for vendors such as Microsoft, Amazon, IBM and Google, according to a recent Reuters story.

The story mentions cancer-drug maker Roche's multimillion-euro investment in technology to determine how cancer cells in petri dishes react to new medications. Bringing external data into the project will require advances in cloud computing, says Bryn Roberts, Roche's head of informatics in drug research and early development.

"The scale of the problem means the solution will be on an international collaborative scale," Roberts told Reuters.

Houston cancer center M.D. Anderson recently pointed to advances in computing capabilities to analyze massive data sets among the "confluence of enabling technologies" behind its "Moon Shots" effort to eradicate eight forms of cancer.

And a report from the Association for Molecular Pathology highlighted the extraordinarily robust bioinformatics infrastructure--both computational hardware and high-level expertise--required to run next-generation sequencing (NGS) of DNA.

"Indeed, the balance of time and effort required for NGS-based research or diagnostics is substantially shifted toward data analysis, as opposed to the technical component required to generate the data," the authors note.
