Unsupervised Feature Selection for Multi-View Clustering on Text-Image Web News Data.

CIKM '14: 2014 ACM Conference on Information and Knowledge Management Shanghai China November, 2014(2014)

引用 37|浏览81
暂无评分
摘要
Unlabeled high-dimensional text-image web news data are produced every day, presenting new challenges to unsupervised feature selection on multi-view data. State-of-the-art multi-view unsupervised feature selection methods learn pseudo class labels by spectral analysis, which is sensitive to the choice of similarity metric for each view. For text-image data, the raw text itself contains more discriminative information than similarity graph which loses information during construction, and thus the text feature can be directly used for label learning, avoiding information loss as in spectral analysis. We propose a new multi-view unsupervised feature selection method in which image local learning regularized orthogonal nonnegative matrix factorization is used to learn pseudo labels and simultaneously robust joint $l_{2,1}$-norm minimization is performed to select discriminative features. Cross-view consensus on pseudo labels can be obtained as much as possible. We systematically evaluate the proposed method in multi-view text-image web news datasets. Our extensive experiments on web news datasets crawled from two major US media channels: CNN and FOXNews demonstrate the efficacy of the new method over state-of-the-art multi-view and single-view unsupervised feature selection methods.
更多
查看译文
AI 理解论文
溯源树
样例
生成溯源树,研究论文发展脉络
Chat Paper
正在生成论文摘要