Multi-class Classification for Breast Cancer with High Dimensional Microarray Data Using Machine Learning Classifier.

Lecture Notes on Data Engineering and Communications Technologies (2022)

Abstract
Breast cancer is one of the leading causes of cancer-related deaths among women. Early detection is essential for proper treatment and for reducing the risk of death. Most cancer prediction studies have focused on binary classification of breast cancer. This study focuses on multi-class classification of breast cancer with high-dimensional microarray data. The dataset comprised 38 cancer patients in three categories, normal (9), early tumor (12), and late tumor (17), with 39,426 microarray biomarkers. Boruta's feature selection algorithm selected 28 important microarray biomarkers. The performance of support vector machine, multinomial logistic regression, Naïve Bayes, and random forest classifiers was evaluated using macro- and micro-averaged accuracy, sensitivity, and precision. Results showed that multinomial logistic regression, Naïve Bayes, and random forest exhibited overfitting. The support vector machine, however, performed well in multi-class classification of breast cancer on the test set (macro accuracy = 86.7%, macro sensitivity = 77.8%, macro precision = 62.0%). In future work, bagging and boosting with oversampling techniques can be considered to improve multi-class classification of breast cancer using high-dimensional microarray data.
Keywords
high dimensional microarray data,breast cancer,classification,machine learning,multi-class
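
The abstract reports macro- and micro-averaged accuracy, sensitivity, and precision without stating the formulas used. The sketch below shows one common way to compute them from a multi-class confusion matrix using one-vs-rest counts. The function names and the 3x3 example matrix are hypothetical and are not taken from the study; the definition of macro accuracy as the mean of per-class one-vs-rest accuracies is an assumption, and the authors may have used a different convention.

```python
from typing import Dict, List

Matrix = List[List[int]]


def one_vs_rest_counts(cm: Matrix) -> List[Dict[str, int]]:
    """Per-class TP/FP/FN/TN, where cm[i][j] counts true class i predicted as j."""
    n = len(cm)
    total = sum(sum(row) for row in cm)
    counts = []
    for k in range(n):
        tp = cm[k][k]
        fn = sum(cm[k]) - tp                       # true k, predicted otherwise
        fp = sum(cm[i][k] for i in range(n)) - tp  # predicted k, true otherwise
        tn = total - tp - fn - fp
        counts.append({"tp": tp, "fp": fp, "fn": fn, "tn": tn})
    return counts


def ratio(num: float, den: float) -> float:
    """Division that returns 0.0 for an empty denominator."""
    return num / den if den else 0.0


def macro_micro_metrics(cm: Matrix) -> Dict[str, Dict[str, float]]:
    counts = one_vs_rest_counts(cm)
    n = len(counts)
    # Macro: compute each metric per class, then take the unweighted mean.
    macro = {
        "accuracy": sum(ratio(c["tp"] + c["tn"], sum(c.values())) for c in counts) / n,
        "sensitivity": sum(ratio(c["tp"], c["tp"] + c["fn"]) for c in counts) / n,
        "precision": sum(ratio(c["tp"], c["tp"] + c["fp"]) for c in counts) / n,
    }
    # Micro: pool the counts over all classes, then compute each metric once.
    pooled = {key: sum(c[key] for c in counts) for key in ("tp", "fp", "fn", "tn")}
    micro = {
        "accuracy": ratio(pooled["tp"] + pooled["tn"], sum(pooled.values())),
        "sensitivity": ratio(pooled["tp"], pooled["tp"] + pooled["fn"]),
        "precision": ratio(pooled["tp"], pooled["tp"] + pooled["fp"]),
    }
    return {"macro": macro, "micro": micro}


# Made-up confusion matrix for illustration only (rows: true class,
# columns: predicted class; order: normal, early tumor, late tumor).
example_cm = [
    [2, 0, 0],
    [1, 1, 1],
    [0, 0, 3],
]
result = macro_micro_metrics(example_cm)
```

Macro averaging weights every class equally, so a poorly predicted minority class (here the middle row) pulls the macro sensitivity away from the micro value, which is dominated by the pooled counts. With only 38 samples across three unbalanced categories, that difference can be substantial.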