Spark Based Classification Of Microarray Data Using Scalable Artificial Neural Network

International Journal of Data Mining and Bioinformatics (2017)

Abstract
Microarray data suffer from the curse of dimensionality: the number of features is huge compared with the number of samples. Data retrieved from microarrays capture the variety in their nature and the changes observed over time. The vast amount of raw gene expression data often leads to computational and analytical challenges, including classifying the dataset into the correct groups or classes. In this paper, various feature selection techniques based on statistical tests are proposed using the Spark framework. After the relevant features are selected with these statistical tests, an Artificial Neural Network (ANN) based on the Spark framework (sf-ANN) is proposed, which runs on a scalable cluster with multiple nodes. The performance of sf-ANN is tested on microarray datasets of various dimensions. To examine that performance, a detailed comparative analysis of execution time is presented for the sf-ANN classifier on the Spark framework and on a conventional system (where the data is stored on a standalone machine).
Keywords
artificial neural network, big data, feature selection, machine learning, microarray, Spark
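
The abstract does not name the statistical tests used for feature selection. As a rough illustration only, the sketch below assumes a two-class problem and ranks features by the absolute value of Welch's t-statistic, keeping the top k. The function names `welch_t` and `select_top_k` and the toy data are hypothetical, and the code is plain single-machine Python rather than the paper's Spark implementation; in a cluster setting the per-feature scoring is the part that would presumably be distributed.

```python
from math import sqrt
from statistics import mean, variance


def welch_t(a, b):
    """Welch's t-statistic for two independent samples (assumed test, not from the paper)."""
    va, vb = variance(a), variance(b)
    denom = sqrt(va / len(a) + vb / len(b))
    if denom == 0.0:
        # Both groups are constant: treat the feature as uninformative.
        return 0.0
    return (mean(a) - mean(b)) / denom


def select_top_k(samples, labels, k):
    """Return the indices of the k features with the largest |t| between class 0 and class 1."""
    n_features = len(samples[0])
    scores = []
    for j in range(n_features):
        g0 = [row[j] for row, y in zip(samples, labels) if y == 0]
        g1 = [row[j] for row, y in zip(samples, labels) if y == 1]
        scores.append((abs(welch_t(g0, g1)), j))
    # Sort by descending score; break ties by feature index for determinism.
    scores.sort(key=lambda s: (-s[0], s[1]))
    return [j for _, j in scores[:k]]


# Toy expression matrix (hypothetical): 4 samples x 3 genes, two classes.
samples = [
    [1.0, 5.0, 0.2],
    [1.2, 5.1, 0.9],
    [3.0, 5.0, 0.3],
    [3.3, 5.2, 0.8],
]
labels = [0, 0, 1, 1]

selected = select_top_k(samples, labels, 2)
print(selected)
```

Only the selected columns would then be passed to the classifier, which is what reduces the feature count relative to the sample count before training.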