Gene selection for tumor classification using neighborhood rough sets and entropy measures.

Journal of Biomedical Informatics (2017)

Abstract
We extend the neighborhood rough set model to handle real-valued gene expression data sets. We propose an entropy measure to evaluate neighborhood classes. We propose an efficient entropy-based gene selection algorithm for finding a compact gene subset.

With the development of bioinformatics, tumor classification from gene expression data has become an important technology for cancer diagnosis. Because a gene expression data set often contains thousands of genes but only a small number of samples, gene selection is a key step in tumor classification. Attribute reduction based on rough sets has been applied successfully to gene selection because it is data-driven and requires no additional information. However, the traditional rough set method handles discrete data only. Gene expression data containing real-valued or noisy measurements are therefore usually discretized in a preprocessing step, which may result in poor classification accuracy. In this paper, we propose a novel gene selection method based on the neighborhood rough set model, which can handle real-valued data while preserving the original classification information of the genes. Moreover, this paper introduces an entropy measure within the framework of neighborhood rough sets to tackle the uncertainty and noise in gene expression data. Using this measure leads to the discovery of compact gene subsets. Finally, a gene selection algorithm is designed based on neighborhood granules and the entropy measure. Experiments on two gene expression data sets show that the proposed gene selection method is effective at improving the accuracy of tumor classification.
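To make the idea of neighborhood granules and an entropy-guided search more concrete, the following is a minimal sketch and not the algorithm from the paper. It assumes a Chebyshev distance with a fixed radius `delta`, uses a neighborhood conditional entropy of the form H(D|B) = -(1/n) Σ log2(|n_B(x_i) ∩ [x_i]_D| / |n_B(x_i)|), and applies a simple greedy forward search. The function names, the distance, the stopping tolerance `tol`, and the toy data are all hypothetical choices made for illustration; the paper's actual measure and search strategy may differ.

```python
import math

def neighborhood(data, i, genes, delta):
    """Indices of samples within Chebyshev distance delta of sample i,
    measured only on the chosen genes (assumed distance, for illustration)."""
    return [j for j, row in enumerate(data)
            if max(abs(row[g] - data[i][g]) for g in genes) <= delta]

def neighborhood_conditional_entropy(data, labels, genes, delta):
    """Assumed entropy measure: average of -log2 of the fraction of each
    sample's neighbors that share its class label. Lower means purer granules."""
    n = len(data)
    total = 0.0
    for i in range(n):
        nbrs = neighborhood(data, i, genes, delta)  # always contains i itself
        same = sum(1 for j in nbrs if labels[j] == labels[i])
        total += math.log2(same / len(nbrs))
    return -total / n

def greedy_select(data, labels, delta, tol=1e-9):
    """Greedy forward search: repeatedly add the gene that lowers the entropy
    most, and stop when no gene improves it by more than tol."""
    remaining = set(range(len(data[0])))
    selected = []
    best = float("inf")
    while remaining:
        scores = {g: neighborhood_conditional_entropy(data, labels, selected + [g], delta)
                  for g in sorted(remaining)}
        g_best = min(scores, key=scores.get)
        if best - scores[g_best] <= tol:
            break
        selected.append(g_best)
        remaining.remove(g_best)
        best = scores[g_best]
        if best <= tol:  # granules are already pure; nothing left to gain
            break
    return selected

# Hypothetical toy data: 4 samples x 3 genes. Gene 0 separates the two
# classes, gene 1 is uninformative, gene 2 is constant.
toy_data = [
    [0.1, 0.50, 0.3],
    [0.2, 0.90, 0.3],
    [0.8, 0.55, 0.3],
    [0.9, 0.85, 0.3],
]
toy_labels = [0, 0, 1, 1]
toy_selected = greedy_select(toy_data, toy_labels, delta=0.15)
```

Because each sample's neighborhood is built directly from the real-valued expression levels, no discretization step is needed; the radius `delta` controls how coarse the granules are and would have to be tuned for real data.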
Keywords
Entropy measure, Gene expression data, Gene selection, Neighborhood rough sets, Tumor classification