Large-Scale Subspace Clustering by Independent Distributed and Parallel Coding

Jun Li, Zhiqiang Tao, Yue Wu, Bineng Zhong, Yun Fu

IEEE Transactions on Cybernetics (2022)

Abstract

Subspace clustering is a popular method for discovering the underlying low-dimensional structures of high-dimensional multimedia data (e.g., images, videos, and texts). In this article, we consider the large-scale subspace clustering (LS²C) problem, that is, partitioning a million data points with a million dimensions. To address it, we explore an independent distributed and parallel framework that divides the big data/variable matrices and the regularization by both columns and rows. Specifically, LS²C is independently decomposed into many subproblems by distributing those matrices across different machines by columns, since the regularization of the code matrix equals the sum of the regularization of its submatrices (e.g., the squared Frobenius norm or the $\ell_{1}$-norm). Consensus optimization is designed to solve these subproblems in parallel while saving communication costs. Moreover, we provide theoretical guarantees that LS²C can recover consensus subspace representations of high-dimensional data points under broad conditions. Compared with state-of-the-art LS²C methods, our approach achieves better clustering results on public datasets, including a million images and videos.
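
The column-wise decomposition mentioned in the abstract can be illustrated with a small sketch. It assumes the plain least-squares regression (LSR) objective, min_C ||X - XC||_F^2 + lam * ||C||_F^2, with no diagonal constraint, and it runs on one machine. The function names, the value of `lam`, and the random data are hypothetical and chosen only for illustration. The sketch covers the column split alone, not the row split or the consensus optimization of the paper.

```python
import numpy as np


def lsr_codes(X, targets, lam):
    """Closed-form minimizer of ||targets - X C||_F^2 + lam * ||C||_F^2."""
    n = X.shape[1]
    gram = X.T @ X + lam * np.eye(n)
    return np.linalg.solve(gram, X.T @ targets)


def lsr_codes_by_column_blocks(X, lam, num_blocks):
    """Solve the same problem one column block of C at a time.

    Both the fitting term and the squared Frobenius penalty are sums over
    the columns of C, so each block is an independent subproblem that could
    be assigned to a separate worker.
    """
    n = X.shape[1]
    blocks = np.array_split(np.arange(n), num_blocks)
    parts = [lsr_codes(X, X[:, idx], lam) for idx in blocks]
    return np.hstack(parts)


# Toy data: 12 points in 20 dimensions, stored as columns of X.
rng = np.random.default_rng(0)
X = rng.standard_normal((20, 12))

C_full = lsr_codes(X, X, lam=0.1)
C_blocks = lsr_codes_by_column_blocks(X, lam=0.1, num_blocks=3)

# Largest entrywise difference between the joint and block-wise solutions.
max_gap = float(np.max(np.abs(C_full - C_blocks)))
print(max_gap)
```

Under these assumptions the block-wise solution agrees with the joint solution up to floating-point error, which is the property that lets the column blocks be solved without communicating with one another.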

Keywords

Distributed and parallel computing, least-squares regression (LSR), low-rank representation (LRR), over-high dimensional big data, sparse subspace clustering (SSC), subspace clustering
