BMC Bioinformatics. 2008 Apr 25;9:214. doi: 10.1186/1471-2105-9-214.

M-BISON: microarray-based integration of data sources using networks.

Author information

Department of Genetics, Stanford University School of Medicine, Stanford, CA 94305, USA. bdaigle@stanford.edu

Abstract

BACKGROUND:

The accurate detection of differentially expressed (DE) genes has become a central task in microarray analysis. Unfortunately, the noise level and experimental variability of microarrays can be limiting. While a number of existing methods partially overcome these limitations by incorporating biological knowledge in the form of gene groups, these methods sacrifice gene-level resolution. This loss of precision can be inappropriate, especially if the desired output is a ranked list of individual genes. To address this shortcoming, we developed M-BISON (Microarray-Based Integration of data SOurces using Networks), a formal probabilistic model that integrates background biological knowledge with microarray data to predict individual DE genes.

RESULTS:

M-BISON improves signal detection on a range of simulated data, particularly when the microarray data are very noisy. We also applied the method to the task of predicting heat shock-related differentially expressed genes in S. cerevisiae, using an hsf1 mutant microarray dataset and conserved yeast DNA sequence motifs. Our results demonstrate that M-BISON improves the analysis quality and makes predictions that are easy to interpret in concert with the incorporated knowledge. Specifically, M-BISON increases the AUC of DE gene prediction from 0.541 to 0.623 when compared to a method using only microarray data, and it also outperforms a related method, GeneRank. Furthermore, by analyzing M-BISON predictions in the context of the background knowledge, we identified YHR124W as a potentially novel player in the yeast heat shock response.
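For readers unfamiliar with the AUC figures quoted above, the following is a minimal sketch of how such a score can be computed for a ranked list of genes. It assumes that AUC here means the area under the ROC curve, and it uses the pairwise (Mann-Whitney) formulation, which is not necessarily the procedure the authors used. The function name `auc_from_scores` and the toy scores and labels are hypothetical and are not taken from the paper.

```python
def auc_from_scores(scores, labels):
    """ROC AUC via pairwise comparison (Mann-Whitney formulation).

    scores: per-gene DE scores, higher meaning "more likely DE".
    labels: 1 for genes treated as truly DE, 0 otherwise.
    Returns the fraction of (DE, non-DE) pairs ranked correctly,
    counting tied scores as half a correct pair.
    """
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative label")

    correct = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                correct += 1.0
            elif p == n:
                correct += 0.5
    return correct / (len(pos) * len(neg))


# Hypothetical toy data: five genes, two of them labeled DE.
toy_scores = [0.9, 0.8, 0.4, 0.3, 0.2]
toy_labels = [1, 0, 1, 0, 0]
toy_auc = auc_from_scores(toy_scores, toy_labels)
```

An AUC of 0.5 corresponds to a ranking no better than chance and 1.0 to a perfect ranking, which gives a sense of scale for the reported change from 0.541 to 0.623.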

CONCLUSION:

This work provides a solid foundation for the principled integration of imperfect biological knowledge with gene expression data and other high-throughput data sources.

PMID: 18439292
PMCID: PMC2396182
DOI: 10.1186/1471-2105-9-214
[Indexed for MEDLINE]
Free PMC Article
