Feature selection using Lebesgue and entropy measures for incomplete neighborhood decision systems.

Knowledge-Based Systems(2019)

Abstract
Feature selection for mixed and incomplete data, that is, data with numerical and categorical features and missing values, has recently gained considerable attention. Neighborhood rough set-based feature selection is an important step toward improving classification performance, especially on incomplete data with mixed continuous numerical and categorical features. In this paper, a novel feature selection method based on neighborhood rough sets using Lebesgue and entropy measures in incomplete neighborhood decision systems is proposed. The method can handle mixed and incomplete datasets while maintaining the original classification information. First, a Lebesgue measure based on the neighborhood tolerance class is developed to study the positive region and dependency degree. To thoroughly analyze the uncertainty, noise and incompleteness of incomplete neighborhood decision systems, several neighborhood tolerance entropy-based uncertainty measures are presented based on Lebesgue and entropy measures. Then, by combining the algebraic view with the information view in neighborhood rough sets, the neighborhood tolerance dependency joint entropy is defined in incomplete neighborhood decision systems. Moreover, the corresponding properties are discussed, and the relationships among these measures are established to convey the essence of the knowledge and investigate the uncertainty of incomplete neighborhood decision systems. Finally, for high-dimensional datasets, the Fisher score method is used to preliminarily eliminate irrelevant features and thereby significantly reduce the computational complexity, and a heuristic feature selection algorithm is designed to improve the classification performance on mixed and incomplete datasets. Experiments on an illustrative instance and fifteen public datasets demonstrate that the proposed feature selection method is effective in selecting the most relevant features and achieves strong classification performance for incomplete neighborhood decision systems.
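To make the notions of neighborhood tolerance class, positive region and dependency degree more concrete, the following is a minimal Python sketch. It is not the paper's algorithm. It assumes that two samples are tolerant on a feature when either value is missing (encoded as NaN), when numerical values differ by at most a radius `delta`, or when categorical codes are equal. It also uses the normalized sample count as a stand-in for the Lebesgue measure. The function names `tolerance_matrix` and `dependency_degree`, the per-feature distance and the toy data are all hypothetical choices made for illustration.

```python
import numpy as np


def tolerance_matrix(X, numeric, delta):
    """Return a boolean matrix T with T[i, j] True when samples i and j
    are tolerant on every column of X (assumed tolerance relation)."""
    n, m = X.shape
    T = np.ones((n, n), dtype=bool)
    for a in range(m):
        col = X[:, a]
        miss = np.isnan(col)
        # A missing value is treated as compatible with any other value.
        either_missing = miss[:, None] | miss[None, :]
        if numeric[a]:
            close = np.abs(col[:, None] - col[None, :]) <= delta
        else:
            close = col[:, None] == col[None, :]
        T &= either_missing | close
    return T


def dependency_degree(X, y, features, numeric, delta):
    """Fraction of samples whose whole tolerance class shares their label
    (normalized count of the positive region; an assumed simplification)."""
    T = tolerance_matrix(X[:, features], [numeric[a] for a in features], delta)
    same_label = y[:, None] == y[None, :]
    # A sample is in the positive region if no tolerant sample disagrees.
    in_positive_region = np.all(~T | same_label, axis=1)
    return float(in_positive_region.mean())


# Toy incomplete, mixed data: column 0 numerical, column 1 categorical codes.
X = np.array([
    [0.10, 0.0],
    [0.15, 0.0],
    [0.80, 1.0],
    [np.nan, 1.0],
    [0.85, np.nan],
])
y = np.array([0, 0, 1, 1, 1])
numeric = [True, False]

gamma_single = dependency_degree(X, y, [0], numeric, delta=0.2)
gamma_both = dependency_degree(X, y, [0, 1], numeric, delta=0.2)
print(gamma_single, gamma_both)
```

On this toy data, each feature alone leaves some samples indistinguishable from samples of the other class because of the missing values, while the two features together separate the classes. A heuristic selector in the spirit of the abstract would add features greedily while such a measure keeps improving.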
Keywords
Neighborhood rough sets,Feature selection,Neighborhood entropy,Lebesgue measure,Incomplete neighborhood decision systems