Cross-Study Replicability in Cluster Analysis

arXiv (2023)

Abstract
In cancer research, clustering techniques are widely used for exploratory analyses, playing a critical role in the identification of novel cancer subtypes and patient management. As data collected by multiple research groups grows, it is increasingly feasible to investigate the replicability of clustering procedures, that is, their ability to consistently recover biologically meaningful clusters across several data sets. In this paper, we review methods for replicability of clustering analyses, and discuss a novel framework for evaluating cross-study clustering replicability, useful when two or more studies are available. Our approach can be applied to any clustering algorithm and can employ different measures of similarity between partitions to quantify replicability, globally (i.e., for the whole sample) as well as locally (i.e., for individual clusters). Using experiments on synthetic and real gene expression data, we illustrate the usefulness of our procedure to evaluate if the same clusters are identified consistently across a collection of data sets.
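The abstract does not name the similarity measures used, so the following is only a minimal sketch of what global and local comparison of two partitions could look like. It assumes the adjusted Rand index as a global measure and a best-match Jaccard overlap as a per-cluster measure. Both are common choices, not necessarily those of the paper. The function names and the toy label vectors are hypothetical.

```python
from collections import Counter
from math import comb


def adjusted_rand_index(labels_a, labels_b):
    """Global agreement between two partitions of the same samples (assumed measure)."""
    if len(labels_a) != len(labels_b):
        raise ValueError("both partitions must label the same samples")
    n = len(labels_a)
    if n < 2:
        return 1.0  # no pairs to disagree on
    # Count co-clustered pairs from the contingency table and its margins.
    cells = Counter(zip(labels_a, labels_b))
    sum_cells = sum(comb(c, 2) for c in cells.values())
    sum_a = sum(comb(c, 2) for c in Counter(labels_a).values())
    sum_b = sum(comb(c, 2) for c in Counter(labels_b).values())
    expected = sum_a * sum_b / comb(n, 2)
    max_index = (sum_a + sum_b) / 2
    if max_index == expected:
        return 1.0  # degenerate case, e.g. both partitions have a single cluster
    return (sum_cells - expected) / (max_index - expected)


def best_jaccard_per_cluster(labels_a, labels_b):
    """Local agreement: for each cluster in A, its best Jaccard overlap with a cluster in B."""
    members_a, members_b = {}, {}
    for i, (a, b) in enumerate(zip(labels_a, labels_b)):
        members_a.setdefault(a, set()).add(i)
        members_b.setdefault(b, set()).add(i)
    return {
        a: max(len(sa & sb) / len(sa | sb) for sb in members_b.values())
        for a, sa in members_a.items()
    }


# Hypothetical example: two labelings of the same six samples, e.g. one obtained
# by clustering a study directly and one induced by a rule learned elsewhere.
labels_direct = [0, 0, 0, 1, 1, 1]
labels_transferred = [0, 0, 1, 1, 1, 1]

global_score = adjusted_rand_index(labels_direct, labels_transferred)
local_scores = best_jaccard_per_cluster(labels_direct, labels_transferred)
```

A global score close to 1 would suggest that the two partitions largely agree, while the per-cluster scores indicate which individual clusters are recovered well and which are not.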
Keywords
cluster analysis,cross-study