Visually comparing multiple partitions of data with applications to clustering

VDA (2009)

Abstract
Tightly coupled visualization and analysis is a powerful approach to data exploration, especially for clustering. We describe one such integration of analysis and visualization for evaluating multiple partitions of a dataset. A partition is a decomposition of a dataset into a family of disjoint subsets. Partitions may result from clustering, from grouping categorical dimensions, from binning numerical dimensions, from predetermined class-label dimensions, or from prior knowledge structured in a mutually exclusive format (each data item associated with one and only one outcome). Partition or cluster stability analysis can be used to identify near-optimal structures, build ensembles, or conduct validation. We extend Parallel Sets into a new visualization tool that supports the mutual comparison and evaluation of multiple partitions of the same dataset. We describe a novel layout algorithm for informatively rearranging the order of records and dimensions. We provide examples of its application to data stability and correlation at the record, cluster, and dimension levels within a single interactive display.
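To make the idea of comparing two partitions of the same records concrete, here is a minimal Python sketch. It is not the paper's layout algorithm or tool. It only cross-tabulates two label assignments (the kind of co-occurrence counts that a Parallel Sets style display draws as ribbons) and computes the Rand index, a standard pairwise agreement measure that is swapped in here purely for illustration. The function names `contingency_table` and `rand_index` and the toy labels are assumptions, not taken from the paper.

```python
from collections import Counter
from itertools import combinations


def contingency_table(labels_a, labels_b):
    """Count how many records fall into each (block of A, block of B) pair.

    Each list assigns exactly one label per record, so each list
    describes a partition of the same set of records.
    """
    if len(labels_a) != len(labels_b):
        raise ValueError("both partitions must label the same records")
    return Counter(zip(labels_a, labels_b))


def rand_index(labels_a, labels_b):
    """Fraction of record pairs on which the two partitions agree.

    A pair counts as an agreement when both partitions put the two
    records together, or both keep them apart. Brute force over all
    pairs, which is fine for small illustrative inputs.
    """
    if len(labels_a) != len(labels_b):
        raise ValueError("both partitions must label the same records")
    pairs = list(combinations(range(len(labels_a)), 2))
    if not pairs:
        raise ValueError("need at least two records")
    agreements = sum(
        (labels_a[i] == labels_a[j]) == (labels_b[i] == labels_b[j])
        for i, j in pairs
    )
    return agreements / len(pairs)


# Hypothetical toy data: six records labeled by two different partitions,
# e.g. a clustering result and a binned numerical dimension.
clustering = ["x", "x", "x", "y", "y", "z"]
binned = [0, 0, 1, 1, 1, 2]

table = contingency_table(clustering, binned)
score = rand_index(clustering, binned)
```

Here `table` maps each pair of blocks to the number of shared records, and `score` summarizes overall agreement as a single number between 0 and 1. A record-level or cluster-level view, as the abstract describes, would need the per-record labels rather than this single summary.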
Keywords
visualization