Software mines literature for clues on cancer protein

IBM and the Baylor College of Medicine are touting software that can mine research papers for clues on the workings of a protein implicated in most cancers.

The software parsed the text of 60,000 research articles for clues to the behavior of kinases, the enzymes that act on the protein, called p53, and regulate its activity. It then listed other proteins mentioned in the literature that were likely to be undiscovered kinases. So far, seven in 10 of its predictions have been correct, reports Technology Review.
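The article does not describe how the software scores candidates. As a rough illustration of the general idea only, the toy Python sketch below ranks candidate proteins by how closely the words around them resemble the words around known kinases. The protein labels, the sentences and the cosine-similarity scoring are all assumptions made for the example, not the IBM and Baylor method.

```python
from collections import Counter
import math
import re


def context_vector(protein, sentences):
    """Count the words that appear in sentences mentioning the protein."""
    name = protein.lower()
    counts = Counter()
    for sentence in sentences:
        words = re.findall(r"[a-z0-9]+", sentence.lower())
        if name in words:
            counts.update(w for w in words if w != name)
    return counts


def cosine(a, b):
    """Cosine similarity between two word-count vectors."""
    dot = sum(a[w] * b[w] for w in a.keys() & b.keys())
    norm = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(
        sum(v * v for v in b.values())
    )
    return dot / norm if norm else 0.0


def rank_candidates(known, candidates, sentences):
    """Rank candidates by similarity to the pooled context of known kinases."""
    profile = Counter()
    for protein in known:
        profile.update(context_vector(protein, sentences))
    scored = [(c, cosine(context_vector(c, sentences), profile)) for c in candidates]
    return sorted(scored, key=lambda pair: pair[1], reverse=True)


# Made-up sentences and placeholder protein labels (hypothetical data).
sentences = [
    "knownA phosphorylates p53 after dna damage",
    "knownB phosphorylates p53 and regulates its stability",
    "candX phosphorylates targets after dna damage",
    "candY is a structural component of the cell membrane",
]
ranking = rank_candidates(["knownA", "knownB"], ["candX", "candY"], sentences)
print(ranking[0][0])  # candX
```

In this toy data, candX ranks first because it shares vocabulary with the known kinases, while candY shares none and scores zero.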

In an earlier study, the software parsed only research literature published before 2003 to test whether it could predict the p53 kinases that have been discovered since then. It identified seven of the nine.
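That kind of retrospective test amounts to a time-split check: make predictions from older literature, then count how many later discoveries were among them. A minimal Python sketch of the arithmetic follows, using placeholder labels rather than real protein names; the function name and the sample lists are hypothetical.

```python
def recall_after_cutoff(predicted, discovered_later):
    """Fraction of later discoveries that appear among the predictions."""
    discovered_later = set(discovered_later)
    if not discovered_later:
        raise ValueError("need at least one later discovery to score against")
    hits = discovered_later & set(predicted)
    return len(hits) / len(discovered_later)


# Nine placeholder kinases "discovered" after the cutoff (hypothetical labels).
later = [f"kinase_{i}" for i in range(1, 10)]
# Predictions that recover seven of them, plus two misses.
predicted = later[:7] + ["candidate_x", "candidate_y"]

print(f"{recall_after_cutoff(predicted, later):.2f}")  # 0.78
```

Seven hits out of nine corresponds to a recall of roughly 78 percent.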

With new kinases being discovered at a rate of about one per year, the software could speed up discovery and potentially herald a faster approach to developing new drugs, according to the article. So far the software is configured to look only for kinases, but it could be used to search for other enzymes as well.

The Baylor collaboration extends text-analysis tools that IBM already offers to pharmaceutical companies for mining publications, patents, and molecular databases.

Research literature is being produced at a rate far outstripping humans' ability to keep up. Last year more than one million new articles were added to the U.S. National Library of Medicine's Medline database of biomedical research papers, which now contains 23 million items, the article says.

Researchers in the U.K. have used text mining to identify Alzheimer's disease biomarkers. In a study published in the Journal of Translational Medicine, the authors were able to identify 25 biomarker candidates by mining publicly available databases, and said the practice could be applied to other disorders.

Meanwhile, a new tool created with IBM's Watson technology, called WatsonPaths, pulls from reference materials, clinical guidelines and medical journals in real time to help doctors diagnose patients and solve medical problems.

To learn more:
- find the Technology Review article