Machine learning-based prediction model for distant metastasis of breast cancer

Hao Duan, Yu Zhang, Haoye Qiu, Xiuhao Fu, Chunling Liu, Xiaofeng Zang, Anqi Xu, Ziyue Wu, Xingfeng Li, Qingchen Zhang, Zilong Zhang, Feifei Cui

Computers in Biology and Medicine (2024)

Abstract
Background: Breast cancer is the most prevalent malignancy in women. Advanced breast cancer can develop distant metastases, which pose a severe threat to patients' lives. Because the clinical warning signs of distant metastasis appear only in the late stage of the disease, better methods of predicting metastasis are needed. Methods: First, we screened target genes for breast cancer distant metastasis by performing differential analysis and weighted gene co-expression network analysis (WGCNA) on the selected datasets, and carried out GO enrichment and related analyses on these target genes. Second, we screened these target genes for biomarkers of breast cancer distant metastasis by LASSO regression analysis and performed correlation and other analyses on these biomarkers. Finally, we constructed several prediction models for breast cancer distant metastasis based on Logistic Regression (LR), Random Forest (RF), Support Vector Machine (SVM), Gradient Boosting Decision Tree (GBDT) and eXtreme Gradient Boosting (XGBoost), and selected the optimal model among them. Results: Several 21-gene prediction models for breast cancer distant metastasis were constructed, of which the random forest-based model performed best. This model accurately predicted the emergence of distant metastases from breast cancer, with an accuracy of 93.6%, an F1-score of 88.9% and an AUC of 91.3% on the validation set. Conclusion: Our findings have the potential to be translated into a point-of-care prognostic analysis to reduce breast cancer mortality.
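The three validation metrics named in the abstract (accuracy, F1-score and AUC) can be illustrated with a minimal Python sketch. It is not the authors' code. The function names, the 0.5 decision threshold and the toy labels and scores are assumptions made for illustration only, and none of the numbers come from the paper.

```python
from typing import Sequence


def accuracy(y_true: Sequence[int], y_pred: Sequence[int]) -> float:
    """Fraction of samples whose predicted label equals the true label."""
    correct = sum(1 for t, p in zip(y_true, y_pred) if t == p)
    return correct / len(y_true)


def f1_score(y_true: Sequence[int], y_pred: Sequence[int]) -> float:
    """Harmonic mean of precision and recall for the positive class (label 1)."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    if tp == 0:
        return 0.0  # no true positives: precision and recall are both 0
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)


def roc_auc(y_true: Sequence[int], scores: Sequence[float]) -> float:
    """AUC as the probability that a random positive outscores a random
    negative, counting ties as one half (pairwise formulation)."""
    pos = [s for t, s in zip(y_true, scores) if t == 1]
    neg = [s for t, s in zip(y_true, scores) if t == 0]
    if not pos or not neg:
        raise ValueError("AUC needs at least one positive and one negative sample")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


if __name__ == "__main__":
    # Toy data (hypothetical): 1 = distant metastasis, 0 = no metastasis.
    y_true = [1, 0, 1, 0, 0, 1, 0, 0]
    scores = [0.9, 0.2, 0.4, 0.6, 0.1, 0.8, 0.3, 0.05]
    # Assumed decision threshold of 0.5 on the predicted probability.
    y_pred = [1 if s >= 0.5 else 0 for s in scores]
    print("accuracy:", accuracy(y_true, y_pred))
    print("F1-score:", f1_score(y_true, y_pred))
    print("AUC:", roc_auc(y_true, scores))
```

Accuracy and F1-score depend on the chosen threshold, whereas AUC is computed from the ranking of the scores alone, which is one reason the three are commonly reported together when comparing classifiers.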
Keywords
Breast cancer distant metastasis, Predictive model, Biomarkers, Machine learning, Weighted correlation network analysis