Large-Scale Subspace Clustering by Independent Distributed and Parallel Coding

IEEE Transactions on Cybernetics(2022)

Abstract
Subspace clustering is a popular method for discovering the underlying low-dimensional structures of high-dimensional multimedia data (e.g., images, videos, and texts). In this article, we consider the large-scale subspace clustering (LS$^{2}$C) problem, that is, partitioning a million data points with a million dimensions. To address it, we explore an independent distributed and parallel framework that divides the big data/variable matrices and the regularization by both columns and rows. Specifically, LS$^{2}$C is decomposed into many independent subproblems by distributing those matrices across different machines by columns, since the regularization of the code matrix equals the sum of the regularization of its submatrices (e.g., squared Frobenius norm or $\ell_{1}$-norm). Consensus optimization is designed to solve these subproblems in parallel, which reduces communication costs. Moreover, we provide theoretical guarantees that LS$^{2}$C can recover consensus subspace representations of high-dimensional data points under broad conditions. Compared with state-of-the-art LS$^{2}$C methods, our approach achieves better clustering results on public datasets, including a million images and videos.
Keywords
Distributed and parallel computing, least-squares regression (LSR), low-rank representation (LRR), over-high dimensional big data, sparse subspace clustering (SSC), subspace clustering
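
The column-wise decomposition in the abstract rests on one property: the squared Frobenius norm and the entrywise $\ell_{1}$-norm of the code matrix equal the sum of the same quantity over its column blocks. The following is a minimal sketch that checks this numerically on a small random matrix. It is not the authors' implementation; the helper names (`sq_frobenius`, `l1_norm`, `split_columns`), the matrix size, and the number of blocks are all assumptions made for illustration.

```python
import random


def sq_frobenius(M):
    """Squared Frobenius norm: sum of squared entries."""
    return sum(x * x for row in M for x in row)


def l1_norm(M):
    """Entrywise l1 norm: sum of absolute values of the entries."""
    return sum(abs(x) for row in M for x in row)


def split_columns(M, num_blocks):
    """Split M into num_blocks contiguous column blocks (hypothetical helper)."""
    n_cols = len(M[0])
    bounds = [k * n_cols // num_blocks for k in range(num_blocks + 1)]
    return [[row[bounds[k]:bounds[k + 1]] for row in M]
            for k in range(num_blocks)]


random.seed(0)
# Toy stand-in for a code matrix: 5 rows, 12 columns (sizes are arbitrary).
C = [[random.uniform(-1.0, 1.0) for _ in range(12)] for _ in range(5)]

# Each block stands in for the part that one machine would hold.
blocks = split_columns(C, 4)

fro_total = sq_frobenius(C)
fro_blocks = sum(sq_frobenius(B) for B in blocks)

l1_total = l1_norm(C)
l1_blocks = sum(l1_norm(B) for B in blocks)

print("squared Frobenius: full vs. sum of blocks agree:",
      abs(fro_total - fro_blocks) < 1e-9)
print("l1: full vs. sum of blocks agree:",
      abs(l1_total - l1_blocks) < 1e-9)
```

Because both regularizers are plain sums over entries, each block's term can be evaluated without access to the other blocks, which is what allows the subproblems to be assigned to separate machines.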