A Bayesian network approach to the prediction of gene functions (2006)

Abstract
Knowledge of gene functions is crucial for understanding many biological mechanisms, such as cell cycles, regulatory pathways, and diseases. Computational prediction of gene functions from multiple genomics and proteomics data sources poses a challenging research problem. In this dissertation, Bayesian network methods were developed for supervised learning and unsupervised joint learning of gene functions. Based on probability theory and graph theory, Bayesian networks provide a powerful language for describing complex entities and their relationships. For the case where function labels are available for the training set, we designed a supervised-learning Bayesian method for predicting the functions of yeast genes from multiple genomic data sets. This is the first method that allows biologists to apply virtually all kinds of experimental and public data to the study of novel gene functions. The test data sources in our experimental study include genomic properties, gene ontology, mRNA expression profiles, and others. We found that ontology-based data sources generally outperform other data sources. For the unsupervised learning of gene functions, we focused on a Bayesian network model that identifies genes with functions related to human prostate cancer by integrating microarray, mass spectrometry, and text-based databases. We report 14 genes (biomarkers) that may contribute to the development of prostate cancer. This was the first study of cancer biomarkers using cross-platform data sets. The issues of representing and computing with heterogeneous biological data sets in Bayesian network models are also studied. This dissertation shows that both supervised and unsupervised learning of gene functions can be studied effectively using Bayesian network technology.
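The abstract does not specify the structure of the supervised network, so the following is only a minimal sketch of the general idea of combining evidence from several data sources. It assumes the simplest possible Bayesian network (naive Bayes, where the function label is the sole parent of each evidence variable). The gene records, the feature names `expression` and `go_neighbor`, and the two function labels are hypothetical toy values, not data from the dissertation.

```python
import math
from collections import Counter, defaultdict


def train_naive_bayes(samples, labels):
    """Count label and per-source evidence frequencies.

    Assumption: a naive Bayes structure, i.e. the function label is the
    only parent of every evidence variable. This is an illustration, not
    the dissertation's model.
    """
    label_counts = Counter(labels)
    # feature_counts[feature][label][value] = number of training genes
    feature_counts = defaultdict(lambda: defaultdict(Counter))
    feature_values = defaultdict(set)
    for sample, label in zip(samples, labels):
        for feature, value in sample.items():
            feature_counts[feature][label][value] += 1
            feature_values[feature].add(value)
    return label_counts, feature_counts, feature_values


def predict_posterior(model, sample, alpha=1.0):
    """Return P(label | evidence) using Laplace smoothing (alpha)."""
    label_counts, feature_counts, feature_values = model
    total = sum(label_counts.values())
    log_scores = {}
    for label, count in label_counts.items():
        log_p = math.log(count / total)  # prior P(label)
        for feature, value in sample.items():
            seen = feature_counts[feature][label]
            n_values = len(feature_values[feature] | {value})
            denom = sum(seen.values()) + alpha * n_values
            log_p += math.log((seen[value] + alpha) / denom)
        log_scores[label] = log_p
    # Normalize in log space for numerical stability.
    top = max(log_scores.values())
    weights = {lab: math.exp(s - top) for lab, s in log_scores.items()}
    z = sum(weights.values())
    return {lab: w / z for lab, w in weights.items()}


# Hypothetical toy training set: one record per gene, one key per data source.
samples = [
    {"expression": "high", "go_neighbor": "cell_cycle"},
    {"expression": "high", "go_neighbor": "cell_cycle"},
    {"expression": "low", "go_neighbor": "metabolism"},
    {"expression": "low", "go_neighbor": "cell_cycle"},
]
labels = ["cell_cycle", "cell_cycle", "metabolism", "metabolism"]

model = train_naive_bayes(samples, labels)
posterior = predict_posterior(
    model, {"expression": "high", "go_neighbor": "cell_cycle"}
)
predicted = max(posterior, key=posterior.get)
```

Each data source contributes one multiplicative likelihood term, which is why adding a further source in this sketch only requires adding another key to each gene record.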
Keywords
multiple genomic data set, Bayesian network approach, public data, cross-platform data set, Bayesian network model, gene function, test data source, experimental data, heterogeneous biological data set, data source, proteomics data source