Unsupervised Feature Selection: Minimize Information Redundancy of Features

Technologies and Applications of Artificial Intelligence (2010)

Abstract
This paper proposes an unsupervised feature selection method that removes redundant features from datasets. The major contributions are twofold. First, we propose an eigen-decomposition method to rank the hyperplanes (which describe the relations between features) by their linear dependency, and then design an efficient Gaussian-elimination method to sequentially remove the feature that is best represented by the remaining features. Second, we prove that our method is similar to removing the features that contribute the most to the principal components with the smallest eigenvalues, while also accounting for the effect of each feature removal, at a complexity of about max(O(nm), O(n^2)) instead of O(n^3), where n is the number of features and m is the number of observations. We perform experiments on an artificial dataset and on real-world datasets. The results show that our method almost perfectly removes the dependent features without losing any independent dimension in the artificial dataset, and that it outperforms two other competitive algorithms on the real-world datasets.
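To make the comparison in the abstract concrete, the following is a minimal Python sketch of the naive baseline it refers to: repeatedly eigen-decompose the covariance matrix and drop the feature with the largest loading on the principal component that has the smallest eigenvalue. This is not the paper's Gaussian-elimination method, whose details are not given in the abstract; the function name `remove_redundant_features` and the parameter `n_remove` are hypothetical and chosen only for illustration.

```python
import numpy as np


def remove_redundant_features(X, n_remove):
    """Naive baseline (an assumption, not the paper's algorithm).

    X is an (m observations) x (n features) array. At each step the
    covariance matrix of the remaining features is eigen-decomposed,
    which costs roughly O(n^3) per removal, and the feature with the
    largest absolute loading on the smallest-eigenvalue component is
    dropped.
    """
    X = np.asarray(X, dtype=float)
    n_features = X.shape[1]
    if not 0 <= n_remove < n_features - 1:
        raise ValueError("n_remove must leave at least two features")

    remaining = list(range(n_features))  # original column indices kept
    removed = []                         # original column indices dropped

    for _ in range(n_remove):
        cov = np.cov(X[:, remaining], rowvar=False)
        # eigh returns eigenvalues in ascending order, so column 0 of
        # eigvecs is the direction with the least variance, i.e. the
        # hyperplane that best describes a linear dependency.
        _, eigvecs = np.linalg.eigh(cov)
        weakest = eigvecs[:, 0]
        drop = int(np.argmax(np.abs(weakest)))
        removed.append(remaining.pop(drop))

    return remaining, removed


# Illustrative usage on a small artificial dataset: three independent
# columns plus a fourth that is a linear combination of the first two.
rng = np.random.default_rng(0)
independent = rng.normal(size=(200, 3))
dependent = 0.5 * independent[:, 0] + 0.5 * independent[:, 1]
X_demo = np.column_stack([independent, dependent])
kept_demo, removed_demo = remove_redundant_features(X_demo, 1)
```

Because the decomposition is recomputed after every removal, this sketch accounts for the effect of each removal but pays the cubic cost each time, which is the cost the abstract says the proposed method avoids.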
Keywords
competitive algorithm, efficient Gaussian-elimination method, linear dependency characteristic, eigendecomposition, redundant feature, eigendecomposition method, unsupervised feature selection, eigen-decomposition method, eigenvalue, Gaussian-elimination, realworld datasets, unsupervised feature selection method, computational complexity, dependent feature, feature extraction, artificial dataset, information redundancy minimization, Gaussian processes, PCA, redundant features removal, principal components, Gaussian elimination method, minimize information redundancy, eigenvalues and eigenfunctions, principal component analysis, unsupervised learning, real-world datasets, principal component, redundancy, decomposition method, Gaussian elimination, vectors, feature selection, eigenvalues