Sparse PCA via $\ell _{2,p}$2,p-Norm Regularization for Unsupervised Feature Selection

IEEE Transactions on Pattern Analysis and Machine Intelligence(2023)

引用 12|浏览181
In the field of data mining, how to deal with high-dimensional data is an inevitable topic. Since it does not rely on labels, unsupervised feature selection has attracted a lot of attention. The performance of spectral-based unsupervised methods depends on the quality of the constructed similarity matrix, which is used to depict the intrinsic structure of data. However, real-world data often contain plenty of noise features, making the similarity matrix constructed by original data cannot be completely reliable. Worse still, the size of a similarity matrix expands rapidly as the number of samples rises, making the computational cost increase significantly. To solve this problem, a simple and efficient unsupervised model is proposed to perform feature selection. We formulate PCA as a reconstruction error minimization problem, and incorporate a $\ell _{2,p}$ -norm regularization term to make the projection matrix sparse. The learned row-sparse and orthogonal projection matrix is used to select discriminative features. Then, we present an efficient optimization algorithm to solve the proposed unsupervised model, and analyse the convergence and computational complexity of the algorithm theoretically. Finally, experiments on both synthetic and real-world data sets demonstrate the effectiveness of our proposed method.
Unsupervised feature selection,principal component analysis, $\ell _{2,p}$ ℓ 2 , p -norm,sparse learning
AI 理解论文
Chat Paper