Bioinformatics called critical to next-generation DNA sequencing

Laboratories that want to run next-generation sequencing (NGS) of DNA will need extraordinarily robust bioinformatics infrastructure to fully leverage research and diagnostic opportunities, the Association for Molecular Pathology reports in the November issue of the Journal of Molecular Diagnostics.

"The power of NGS to generate hundreds of millions to multigigabase levels of sequence in a single instrument run, while having opened a diversity of research and diagnostic avenues, is concomitantly stretching our ability to process data," according to the report, "Opportunities and Challenges Associated with Clinical Diagnostic Genome Sequencing."

"This unprecedented amount of sequencing information poses bottlenecks that vary, depending on application, at the level of data extraction, analysis, and interpretation," the report continues. "These challenges have become part and parcel of the biomedical research community where investigators have increasingly needed to incorporate bioinformatics and biostatistics into their armamentarium."

The infrastructure needs to include both computational hardware and high-level expertise, the report contends. "Indeed, the balance of time and effort required for NGS-based research or diagnostics is substantially shifted toward data analysis, as opposed to the technical component required to generate the data," the authors note.

The tools required must cover data management, storage, analysis and archiving for large data sets, according to the report. The most "computationally intensive" step of the basic sequencing process is base calling, the conversion of image data into sequence reads.

"There is a continuing need to reduce error rates, especially as platforms are pushed to generate longer reads," the report's authors note, adding that "mapping many reads to the reference genome requires highly efficient and accurate algorithms."

The "NGS data deluge" requires programming expertise and specialized servers to handle and store massive amounts of data (several terabytes of raw data for the typical experiment), according to the report. User-friendly informatics tools to analyze the data are critical, the report notes, but it adds that cloud computing could reduce the need to purchase potentially prohibitively expensive servers for storage.

The demand for DNA sequencing could increase even further if the medical community adopts a recent recommendation by the Presidential Commission for the Study of Bioethical Issues. In its report, "Privacy and Progress in Whole Genome Sequencing," the commission recommended including DNA sequencing data in standardized electronic health records to advance medical research and clinical care.

