Unsupervised clustering using multiple correspondence analysis reveals clinically-relevant demographic variables across multiple gastrointestinal cancers

Annals of Surgical Oncology (2024)

Abstract
Objective: Patients with gastrointestinal malignancies represent a heterogeneous population, even among those with similar stage and treatment pathways. Here, we used dimensionality reduction in the National Cancer Database (NCDB) to inform unsupervised clustering of patients with three gastrointestinal malignancies and examined outcomes among these computationally derived groups.

Methods: The NCDB was queried for three cohorts of patients receiving multimodal therapy: stage II/III esophageal cancer, stage II/III gastric cancer, and stage III colon cancer. Multiple correspondence analysis (MCA), a dimensionality reduction technique well suited to categorical variables such as the demographic data in the NCDB, was performed on each cohort using demographic and tumor characteristics as variables. Principal components were analyzed to derive clusters. Outcomes for each cluster were compared using Kaplan-Meier survival methods.

Results: For esophageal (n = 11,399), gastric (n = 2,033), and colon (n = 72,057) cancer, the same four variables were identified as highly representative: income quartile, education quartile, age quartile, and insurance type. Survival analysis demonstrated significant differences in overall survival between clusters in esophageal (p < 0.0001) and colon (p < 0.0001) cancer, but not in gastric cancer (p = 0.56). Clusters defined by high income, high education, younger age, and private insurance fared better.

Conclusions: Using MCA, we identified combinations of four demographic variables among NCDB patients with stage II/III esophageal cancer, stage II/III gastric cancer, and stage III colon cancer. These groupings had significantly different survival outcomes in colon and esophageal cancer. This work serves as a proof of concept for the utility of unsupervised clustering in outcomes research on surgical malignancies and identifies at-risk populations.
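To make the MCA step in the Methods more concrete, the following is a minimal sketch of MCA computed as a singular value decomposition of the standardized residuals of an indicator (one-hot) matrix. It is an illustration under stated assumptions, not the authors' implementation: the abstract does not say which software or clustering algorithm was used, the function name `mca` is hypothetical, the six toy patients are invented, and the sign-based split on the first dimension is only a stand-in for whatever clustering the study applied.

```python
import numpy as np


def mca(rows):
    """MCA via SVD of the standardized residuals of the indicator matrix.

    rows: list of equal-length tuples of string category labels,
          one tuple per patient (hypothetical input format).
    Returns (row_coords, inertias, column_labels).
    """
    n_vars = len(rows[0])
    # One indicator column per (variable index, category) pair.
    labels = sorted({(j, r[j]) for r in rows for j in range(n_vars)})
    col = {lab: k for k, lab in enumerate(labels)}
    Z = np.zeros((len(rows), len(labels)))
    for i, r in enumerate(rows):
        for j, value in enumerate(r):
            Z[i, col[(j, value)]] = 1.0

    P = Z / Z.sum()                         # correspondence matrix
    r_mass = P.sum(axis=1)                  # row masses
    c_mass = P.sum(axis=0)                  # column masses
    expected = np.outer(r_mass, c_mass)     # independence model
    S = (P - expected) / np.sqrt(expected)  # standardized residuals
    U, sing, _ = np.linalg.svd(S, full_matrices=False)
    # Principal coordinates of the rows (patients) on each dimension.
    row_coords = (U * sing) / np.sqrt(r_mass)[:, None]
    return row_coords, sing ** 2, labels


# Invented toy data: (income, education, age, insurance) per patient.
patients = [
    ("high", "high", "young", "private"),
    ("high", "high", "young", "private"),
    ("high", "high", "old", "private"),
    ("low", "low", "old", "public"),
    ("low", "low", "old", "public"),
    ("low", "low", "young", "public"),
]
coords, inertias, labels = mca(patients)

# Crude two-group split on the first dimension (an assumption, not the
# clustering method used in the study). The sign of a dimension is arbitrary.
cluster = coords[:, 0] > 0
print("share of inertia per dimension:", np.round(inertias / inertias.sum(), 3))
print("cluster assignment:", cluster)
```

In this toy example, income, education, and insurance vary together, so the first dimension separates the first three patients from the last three. In a real analysis, the resulting cluster labels would then serve as the grouping variable for Kaplan-Meier curves.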
Keywords
Gastrointestinal cancers, NCDB, Dimensionality reduction, Disparities, Unsupervised clustering