Relevant Subspace Clustering: Mining the Most Interesting Non-redundant Concepts in High Dimensional Data

ICDM (2009)

Cited by 89
Abstract
Subspace clustering aims at detecting clusters in any subspace projection of a high-dimensional space. As the number of possible subspace projections is exponential in the number of dimensions, the result is often tremendously large. Recent approaches fail to reduce results to relevant subspace clusters. Their results are typically highly redundant, i.e., many clusters are detected multiple times in several projections. In this work, we propose a novel model for relevant subspace clustering (RESCU). We present a global optimization which detects the most interesting non-redundant subspace clusters. We prove that computation of this model is NP-hard. For RESCU, we propose an approximate solution that shows high accuracy with respect to our relevance model. Thorough experiments on synthetic and real-world data show that RESCU successfully reduces the result to manageable sizes. It reliably achieves top clustering quality, while competing approaches show greatly varying performance.
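To make the exponential growth mentioned in the abstract concrete, the following minimal sketch counts and enumerates candidate subspaces. It assumes axis-parallel projections, i.e., each subspace is a non-empty subset of the original dimensions; the function names `count_subspaces` and `enumerate_subspaces` are hypothetical and not taken from the paper, and this is not part of the RESCU algorithm itself.

```python
from itertools import combinations


def count_subspaces(d: int) -> int:
    """Number of non-empty subsets of d dimensions (assumed axis-parallel subspaces)."""
    return 2 ** d - 1


def enumerate_subspaces(d: int):
    """Yield every non-empty subset of the dimension indices 0..d-1 as a tuple."""
    for k in range(1, d + 1):
        yield from combinations(range(d), k)


if __name__ == "__main__":
    # The count roughly doubles with every added dimension.
    for d in (3, 10, 20, 50):
        print(f"{d} dimensions: {count_subspaces(d)} candidate subspaces")
```

Even at a few dozen dimensions, exhaustive enumeration is out of reach, which is consistent with the abstract's observation that results become very large and redundant.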
Keywords
subspace clustering, interesting non-redundant concepts, relevance model, novel model, interesting non-redundant subspace cluster, relevant subspace clustering, top clustering quality, possible subspace projection, relevant subspace cluster, high dimensional data, high accuracy, subspace projection, computational modeling, NP-hard, approximation algorithms, data mining, clustering algorithms, cost function, redundancy, probability density function, global optimization