Classifying 2-year recurrence in patients with DLBCL using clinical variables with imbalanced data and machine learning methods

Computer Methods and Programs in Biomedicine (2020)

Citations: 11 | Views: 25
•We use the first complete remission (CR) as the reference time point for relapsed/refractory DLBCL, whose incidence continues to rise and which calls for risk-tailored early diagnosis and primary prevention strategies. Our study aims to build classifiers that distinguish patients with relapsed/refractory DLBCL from those who remain stable after first reaching CR, and to build probability models that give clinicians a reference for identifying patients at high risk.
•Relapsed/refractory DLBCL is not only the major cause of high mortality but also produces class imbalance between the recurrence and non-recurrence populations, which may significantly reduce the accuracy of machine learning models. To deal with the class imbalance, SMOTE sampling is applied at the data level, and cost-sensitive methods and ensemble learning methods are applied at the model level.
•We built both classifiers and probability prediction models for the 2-year recurrence risk in DLBCL patients who first reached CR. Because an SVM does not provide a probability for each sample, Platt scaling was applied to meet this need.
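The two techniques named in the highlights, SMOTE oversampling and Platt scaling, can be illustrated with a minimal standard-library sketch. It is not the authors' implementation: the function names (`smote`, `fit_platt`, `platt_probability`), the toy data, and the hyperparameters are all assumptions made for illustration, and plain gradient descent stands in for the more careful optimizers used in production libraries.

```python
import math
import random


def smote(minority, n_new, k=3, seed=0):
    """Create n_new synthetic minority samples (hypothetical helper).

    Each synthetic point lies on the segment between a randomly chosen
    minority sample and one of its k nearest minority neighbours.
    """
    rng = random.Random(seed)
    synthetic = []
    for _ in range(n_new):
        x = rng.choice(minority)
        # k nearest minority neighbours of x (excluding x), Euclidean distance
        neighbours = sorted(
            (p for p in minority if p is not x),
            key=lambda p: math.dist(x, p),
        )[:k]
        nb = rng.choice(neighbours)
        gap = rng.random()  # position along the segment x -> nb
        synthetic.append([xi + gap * (ni - xi) for xi, ni in zip(x, nb)])
    return synthetic


def fit_platt(scores, labels, lr=0.1, n_iter=5000):
    """Fit P(y=1 | s) = 1 / (1 + exp(A*s + B)) on classifier scores.

    Uses Platt's smoothed targets and gradient descent on the log-loss
    (an assumption; standard implementations use a Newton-type solver).
    """
    n_pos = sum(labels)
    n_neg = len(labels) - n_pos
    t_pos = (n_pos + 1.0) / (n_pos + 2.0)
    t_neg = 1.0 / (n_neg + 2.0)
    targets = [t_pos if y == 1 else t_neg for y in labels]
    a, b = 0.0, 0.0
    for _ in range(n_iter):
        grad_a = grad_b = 0.0
        for s, t in zip(scores, targets):
            p = 1.0 / (1.0 + math.exp(a * s + b))
            # Gradient of the log-loss with respect to (A*s + B) is (t - p)
            grad_a += (t - p) * s
            grad_b += t - p
        a -= lr * grad_a / len(scores)
        b -= lr * grad_b / len(scores)
    return a, b


def platt_probability(score, a, b):
    """Map a raw decision score (e.g. an SVM margin) to a probability."""
    return 1.0 / (1.0 + math.exp(a * score + b))


# Toy usage: made-up minority samples and made-up SVM decision scores.
minority = [[0.0, 0.0], [1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
new_points = smote(minority, n_new=6, k=2)

scores = [-2.0, -1.5, -1.0, -0.5, 0.3, 0.5, 1.0, 2.0]
labels = [0, 0, 0, 0, 0, 1, 1, 1]
A, B = fit_platt(scores, labels)
```

In practice, the oversampling step would be applied only to the training split, and the sigmoid would be fitted on held-out decision scores so that the calibrated probabilities are not overly optimistic.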
Relapsed/refractory DLBCL, Imbalanced data, Classification and possibility prediction, Machine learning, Indicators