Predicting recurrence of venous thromboembolism in anticoagulated cancer patients using real-world data and machine learning.

Journal of Clinical Oncology (2022)

Abstract
e18742 Background: The clinical predictors of venous thromboembolism (VTE) recurrence in patients with cancer are not well known. Our objective was to develop a predictive model for the risk of VTE recurrence in anticoagulant-treated cancer patients within the first 6 months following VTE diagnosis.
Methods: Using the EHRead technology, based on Natural Language Processing (NLP) and machine learning (ML), unstructured clinical data in electronic health records (EHRs) from 9 Spanish hospitals between 2014 and 2018 were extracted and analyzed. The study population, comprising all adult anticoagulated cancer patients with VTE, was downsampled to limit bias and class imbalance. A total of 94 patient characteristics were explored, and Random Forest (RF) feature selection was performed to identify the most relevant predictors of VTE recurrence. Multiple algorithms were used to train different prediction models, which were subsequently validated in a hold-out dataset. The model with the best performance metrics (i.e., ROC-AUC) was selected as the final model.
Results: From a source population of 2,893,208 patients, 21,227 anticoagulant-treated patients with VTE and active cancer (53.9% male, median age of 70 years) were identified. Across the study period, the yearly incidence of VTE remained relatively stable, ranging from 2.7% to 3.9%. The most common type of VTE was deep vein thrombosis (68.2% of patients), followed by pulmonary embolism (28.4%). The most frequent primary cancer sites were colorectal (10.1%) and lung (8.5%). Of all trained and validated models, the RF approach yielded the best performance, with an ROC-AUC of 0.72. The following predictors of VTE recurrence were identified: pulmonary embolism, deep vein thrombosis, metastasis, adenocarcinoma, hemoglobin values, serum creatinine values, platelet count, leukocyte count, family history of VTE, and patient age.
Conclusions: Using NLP and ML, we were able to use the real-world data in EHRs to build a predictive model of VTE recurrence in cancer patients based on individual clinical features. These results may improve the clinical management of VTE recurrence in this population.
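
Two steps named in the Methods, downsampling the cohort and selecting the final model by hold-out ROC-AUC, can be illustrated with a minimal sketch. It is not the authors' implementation: the abstract does not describe the downsampling scheme, so random undersampling of the majority class is an assumption, and the record layout, the `recurrence` key, the function names, the model names, and all scores are hypothetical toy values rather than study data.

```python
import random


def downsample(records, label_key="recurrence", seed=0):
    """Randomly undersample the majority class to the minority class size.

    Assumption: the abstract only says the cohort was downsampled; random
    undersampling is one common way to do that.
    """
    rng = random.Random(seed)
    pos = [r for r in records if r[label_key] == 1]
    neg = [r for r in records if r[label_key] == 0]
    minority, majority = (pos, neg) if len(pos) <= len(neg) else (neg, pos)
    balanced = minority + rng.sample(majority, len(minority))
    rng.shuffle(balanced)
    return balanced


def roc_auc(labels, scores):
    """ROC-AUC as the probability that a random positive case is scored
    above a random negative case (ties count as 0.5)."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("ROC-AUC needs at least one case of each class")
    wins = sum(
        1.0 if p > n else 0.5 if p == n else 0.0
        for p in pos
        for n in neg
    )
    return wins / (len(pos) * len(neg))


def select_best_model(labels, scores_by_model):
    """Return the model name with the highest hold-out ROC-AUC, plus all AUCs."""
    aucs = {name: roc_auc(labels, s) for name, s in scores_by_model.items()}
    best = max(aucs, key=aucs.get)
    return best, aucs


# Toy cohort (hypothetical): 10 recurrences among 100 patients.
cohort = [{"id": i, "recurrence": 1 if i % 10 == 0 else 0} for i in range(100)]
balanced = downsample(cohort)

# Toy hold-out labels and predicted risks from two hypothetical models.
holdout_labels = [1, 0, 1, 0, 1, 0]
holdout_scores = {
    "model_a": [0.9, 0.2, 0.7, 0.4, 0.6, 0.5],
    "model_b": [0.6, 0.5, 0.4, 0.7, 0.8, 0.3],
}
best, aucs = select_best_model(holdout_labels, holdout_scores)
print(len(balanced), best, aucs)
```

The pairwise ROC-AUC above is quadratic in the number of cases, which is fine for illustration; a cohort of the size reported here would normally be scored with a rank-based library routine instead.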
Keywords
venous thromboembolism, machine learning, cancer patients, real-world data