Using Canonical Correlation Analysis for Parallelized Attribute Reduction.

PRICAI (2016)

Abstract
Attribute reduction in rough sets theory is widely used in classification. Classical attribute reduction algorithms consider only the correlation between condition attributes and decision attributes, and ignore the relationships among the condition attributes themselves. Moreover, on large-scale data the running time of classical attribute reduction algorithms grows considerably. To address these two problems, this paper proposes a parallelized reduction algorithm called P-CCARoughReduction. The algorithm combines a canonical correlation analysis step named CCAFusion with a parallelized attribute reduction algorithm named P-RoughReduction. CCAFusion randomly divides the original attribute set into two subsets, analyzes the correlations between these two subsets, and then fuses the attributes into one collection according to the derived correlations. P-RoughReduction is built on the distributed framework MapReduce and parallelizes the classical attribute reduction algorithm, which is driven by attribute importance in rough sets theory. Experiments on 50000 samples show that P-CCARoughReduction not only performs well in running time but also significantly improves classification accuracy.
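The abstract does not define attribute importance, so the following is only a minimal single-machine sketch under a stated assumption: importance is taken to be the gain in the standard rough-set dependency degree, that is, the fraction of samples whose equivalence class under the chosen attributes carries a single decision value. The function names (partition, dependency, greedy_reduct) and the toy data are hypothetical and are not taken from the paper.

```python
from collections import defaultdict


def partition(rows, attrs):
    """Group row indices into equivalence classes by their values on attrs."""
    blocks = defaultdict(list)
    for i, row in enumerate(rows):
        blocks[tuple(row[a] for a in attrs)].append(i)
    return list(blocks.values())


def dependency(rows, decisions, attrs):
    """Dependency degree: share of rows lying in decision-consistent classes."""
    if not rows:
        return 0.0
    positive = 0
    for block in partition(rows, attrs):
        # A class belongs to the positive region if all its rows agree on the decision.
        if len({decisions[i] for i in block}) == 1:
            positive += len(block)
    return positive / len(rows)


def greedy_reduct(rows, decisions, n_attrs):
    """Forward selection: add the attribute with the largest dependency gain.

    Assumption: importance of attribute a given the chosen set B is
    dependency(B + [a]) - dependency(B). Stops when the full-set dependency
    is reached or when no single attribute improves it.
    """
    target = dependency(rows, decisions, list(range(n_attrs)))
    chosen = []
    current = dependency(rows, decisions, chosen)
    while current < target:
        best_attr, best_gamma = None, current
        for a in range(n_attrs):
            if a in chosen:
                continue
            gamma = dependency(rows, decisions, chosen + [a])
            if gamma > best_gamma:
                best_attr, best_gamma = a, gamma
        if best_attr is None:
            break  # no single attribute helps; the greedy search gives up
        chosen.append(best_attr)
        current = best_gamma
    return chosen


# Toy decision table: the decision is (attribute 0 OR attribute 1);
# attribute 2 duplicates attribute 0 and is therefore redundant.
rows = [(0, 0, 0), (0, 1, 0), (1, 0, 1), (1, 1, 1)]
decisions = [0, 1, 1, 1]
print(greedy_reduct(rows, decisions, 3))  # prints [0, 1]
```

In the inner loop each candidate attribute is scored independently of the others, which is the kind of work that lends itself to a map step followed by a reduce step that picks the best score. How P-RoughReduction actually distributes the computation is not stated in the abstract.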
Keywords
Canonical correlation analysis, Rough sets theory, Attribute reduction, MapReduce