New software tools may improve detection of cancer biomarkers
The growth of genomic and proteomic data has ushered in a new era of molecular medicine in which cancer detection, diagnosis and treatment are tailored to each individual's molecular profile, according to researchers at Georgia Institute of Technology in Atlanta. This personalized medicine approach requires researchers to discover and link biomarkers, such as genes or proteins, to specific disease behaviors, such as the rate of tumor progression and different responses to treatments.

According to the researchers, two new software programs that help address this challenge have recently earned silver-level compatibility certification from the National Cancer Institute's (NCI) cancer Biomedical Informatics Grid (caBIG). The programs seek to improve the process of identifying cancer biomarkers from gene expression data.

Developed by May Dongmei Wang, PhD, and her team in the department of biomedical engineering at Georgia Tech and Emory University in Atlanta, the programs—caCORRECT and omniBioMarker—remove noise and artifacts, and identify and validate biomarkers from microarray data.

caCORRECT, or chip artifact CORRECTion, is a software program that improves the quality of collected microarray data, leading to improved biomarker selection. Widely used Affymetrix microarrays contain thousands of probes, each a 25-base oligonucleotide sequence, that are used to detect mRNA expression levels. Wang and colleagues said that caCORRECT removes noise and artifacts from the data while retaining high-quality genes on the array. The software can also recover information that those artifacts obscured.

"caCORRECT is a quality assurance tool that allows researchers to utilize and trust imperfect experimental microarray data that they spent a tremendous amount of time and money to generate," Wang added. "caCORRECT improves the downstream analysis of microarray data and should be used before conducting biomarker selection, therapeutic target studies, or pathway analysis studies in bioinformatics and systems biology."

Candidate cancer biomarkers are typically genes expressed at different levels in cancer patients compared to healthy subjects. Wang and her team said that omniBioMarker searches these groups of patient data for genes with the highest potential for determining whether a patient has cancer. However, because individual genes are not expressed independently, the software also identifies groups of genes that act in concert.
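
As a rough illustration of the first idea, ranking genes by how differently they are expressed in the two patient groups, the following minimal Python sketch scores each gene with Welch's t-statistic. It is not omniBioMarker's actual method, which the developers do not detail here. The gene names, expression values and function names are made up for the example, and the sketch scores genes one at a time rather than in groups that act in concert.

```python
from math import sqrt
from statistics import mean, variance


def welch_t(cancer, healthy):
    """Welch's t-statistic for two lists of expression values."""
    se = sqrt(variance(cancer) / len(cancer) + variance(healthy) / len(healthy))
    return (mean(cancer) - mean(healthy)) / se if se > 0 else 0.0


def rank_genes(expression, top_k=2):
    """Return the top_k gene names by absolute t-statistic.

    expression maps gene name -> (cancer values, healthy values).
    """
    scores = {g: abs(welch_t(c, h)) for g, (c, h) in expression.items()}
    return sorted(scores, key=scores.get, reverse=True)[:top_k]


# Toy, made-up expression values (not real data).
toy = {
    "GENE_A": ([9.1, 9.4, 8.9, 9.3], [5.0, 5.2, 4.9, 5.1]),
    "GENE_B": ([6.0, 6.2, 5.9, 6.1], [6.1, 5.9, 6.0, 6.2]),
    "GENE_C": ([3.0, 3.2, 2.9, 3.1], [4.5, 4.4, 4.6, 4.7]),
}
print(rank_genes(toy))
```

In this toy data, GENE_A and GENE_C differ clearly between groups while GENE_B does not, so the first two rank highest.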

omniBioMarker software seeks to fine-tune biomarker selection to a particular dataset or clinical problem based on prior biological knowledge. It also applies analysis parameters specific to each clinical problem. The parameters are optimal when the software selects genes that are known to be relevant biomarkers based on clinical observations and laboratory experiments available in the literature and public databases, according to the developers.
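
One way to picture this knowledge-driven tuning is to compare the genes selected under each candidate parameter setting against a list of biomarkers already supported by prior evidence, and keep the setting with the best agreement. The Python sketch below assumes that interpretation. The setting names, gene names, scoring rule and function names are hypothetical and are not taken from omniBioMarker.

```python
def overlap_score(selected, known):
    """Fraction of selected genes that appear in the known-biomarker set."""
    return len(set(selected) & set(known)) / len(selected) if selected else 0.0


def pick_best_setting(rankings, known, top_k):
    """Return the setting whose top_k genes best match known biomarkers.

    rankings maps a setting name -> ordered gene list produced under it.
    """
    return max(rankings, key=lambda s: overlap_score(rankings[s][:top_k], known))


# Hypothetical rankings produced under two hypothetical analysis settings.
rankings = {
    "setting_1": ["GENE_A", "GENE_D", "GENE_C", "GENE_B"],
    "setting_2": ["GENE_A", "GENE_C", "GENE_D", "GENE_B"],
}
known = {"GENE_A", "GENE_C"}  # stand-in for literature-supported biomarkers
print(pick_best_setting(rankings, known, top_k=2))
```

Here the second setting places both known genes in its top two, so it is preferred over the first.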

Wang and her team have been working on getting two more software programs certified—Q-IHC, a tool that analyzes and quantifies multi-spectral images, such as quantum dot-stained histopathological images, and omniVisGrid, a grid-based tool that visualizes data and analysis processes of microarrays, biological pathways and clinical outcomes.

Funding to develop the programs was provided by the National Institutes of Health (NIH), the Georgia Cancer Coalition, Microsoft Research and Hewlett-Packard.