Unsupervised Variance Based Preprocessing Of Microarray Data

2009 22nd IEEE International Symposium on Computer-Based Medical Systems (2009)

Abstract
Data preprocessing is an important step in preparing DNA microarray data for further analysis. A substantial number of genes do not influence the final classification. One reason to eliminate such genes is the growing computational cost of supervised machine learning methods, especially in modern microarray experiments with hundreds of samples. This empirical study measures the differences in classification performance when different numbers of gene expression measurements are removed in a preprocessing phase. A simple unsupervised gene selection, based on the variance of each gene across all samples, was used to remove genes with extremely low variance. The study shows the importance of combining unsupervised and supervised feature selection techniques with a classification algorithm. The gene expression values removed by the simple unsupervised gene selection method were shown not to be of significant importance to the final results of supervised gene selection followed by classification.
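The variance-based filtering step described in the abstract can be illustrated with a minimal sketch. The abstract does not give the implementation, so everything here is an assumption made for illustration: the function name `variance_filter`, the `drop_fraction` parameter, the use of the sample variance, and the toy expression values.

```python
from statistics import variance


def variance_filter(samples, drop_fraction):
    """Remove the lowest-variance fraction of genes (hypothetical helper).

    samples: list of rows, one per sample; each row holds one expression
             value per gene, so genes correspond to columns.
    drop_fraction: share of genes to discard, in [0, 1).
    Returns (filtered_rows, kept_gene_indices).
    """
    if not 0.0 <= drop_fraction < 1.0:
        raise ValueError("drop_fraction must be in [0, 1)")
    n_genes = len(samples[0])
    # Sample variance of each gene across all samples; no class labels are used.
    gene_var = [variance([row[g] for row in samples]) for g in range(n_genes)]
    n_drop = int(drop_fraction * n_genes)
    # Order genes from lowest to highest variance and drop the first n_drop.
    order = sorted(range(n_genes), key=lambda g: gene_var[g])
    kept = sorted(order[n_drop:])
    filtered = [[row[g] for g in kept] for row in samples]
    return filtered, kept


# Toy data (made up): 3 samples x 4 genes.
samples = [
    [1.0, 0.0, 5.0, 2.0],
    [1.0, 1.0, 5.1, 4.0],
    [1.0, 2.0, 4.9, 9.0],
]
filtered, kept = variance_filter(samples, 0.5)
print(kept)  # indices of the genes that survive the filter
```

Because the filter ignores class labels, it can run once before any supervised gene selection or classifier training, which is where the computational savings mentioned in the abstract would come from.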
Keywords
gene expression, machine learning, bioinformatics, classification algorithms, feature selection, DNA, data preprocessing, empirical study, unsupervised learning, computational complexity, microarray data, support vector machines, data mining, gene selection, genetics