Information-theoretic partially labeled heterogeneous feature selection based on neighborhood rough sets

Hongying Zhang, Qianqian Sun, Kezhen Dong

Int. J. Approx. Reason.(2023)

引用 8|浏览8
暂无评分
摘要
With the rapid increase of large-scale, real-world datasets, it becomes critical to address the problem of partially labeled heterogeneous feature selection (i.e., some samples, which own numerical and categorical features, have no labels). Existing solutions typically adopt linear correlations between features. In this paper, three different monotonic uncertainty measures are defined on equivalence classes and neighborhood classes to study the partially labeled heterogeneous feature selection by exploring the nonlinear correlations. First, consistent entropy and monotonic neighborhood entropy, based on classical rough set theory and neighborhood rough set theory, are proposed to construct a uniform measure for feature selection in heterogeneous datasets. Furthermore, a maximal neighborhood entropy strategy is developed by considering the inconsistency of neighborhood classes described by the features and partial labels. Finally, two feature selection algorithms are presented by three novel monotonic uncertainty measures. The comparative experiments demonstrate the effectiveness and superiority of the newly proposed feature selection measures.(c) 2022 Elsevier Inc. All rights reserved.
更多
查看译文
关键词
Feature selection,Monotonic entropy,Partially labeled heterogeneous data
AI 理解论文
溯源树
样例
生成溯源树,研究论文发展脉络
Chat Paper
正在生成论文摘要