Incremental approaches for heterogeneous feature selection in dynamic ordered data
Information Sciences (2020)
Abstract
Feature selection can identify essential features and reduce the dimensionality of features, improving the classification ability of a learning model. In this study, we consider data with a preference-order relation, i.e., ordered data. In the big data era, ordered data contain noise and exhibit heterogeneous features (including numerical and categorical features) and dynamic characteristics (i.e., new objects are added and obsolete objects are removed with evolving time). The dominance-based neighborhood rough set (DNRS) considers the preference order relation of heterogeneous features and demonstrates fault tolerance; thus, it can be applied well to heterogeneous feature selection in ordered data. At present, DNRS-based heterogeneous feature selection methods are only used for static ordered data. For dynamic ordered data, existing heterogeneous feature selection approaches are highly time-consuming because they are required to recalculate knowledge from scratch when multiple objects vary. Motivated by this issue, we utilize a matrix-based method in this work to study incremental heterogeneous feature selection based on DNRS in dynamic ordered data. First, we define neighborhood dominance conditional entropy (NDCE) as the uncertainty measure and introduce a non-monotonic feature selection strategy based on this measure. Second, the neighborhood dominance relation matrix and its diagonal matrix are defined to calculate NDCE in matrix form. Third, the updating mechanisms of the diagonal matrix are studied when objects vary and used to update NDCE. Lastly, two incremental feature selection algorithms are proposed when multiple objects are added to or deleted from heterogeneous ordered data. Experiments are performed on public data sets. Experimental results verify that the proposed incremental algorithms are effective and efficient for updating feature subsets in dynamic heterogeneous ordered data.
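The matrix-based incremental idea summarized above can be illustrated with a minimal sketch. The abstract does not give the formal DNRS or NDCE definitions, so everything below is an assumption made for illustration: the `dominates` test (every feature of `x` is at least the matching feature of `y`, minus a tolerance `delta`), the encoding of categorical features as ordinal ranks, and all function names are hypothetical. The sketch shows only the general principle that, when objects are added or deleted, the previously computed block of the relation matrix can be reused rather than recomputed from scratch.

```python
# Illustrative sketch only: the relation and all names here are assumptions,
# not the definitions used in the paper.

def dominates(x, y, delta=0.0):
    """Assumed relation: x dominates y if x[k] >= y[k] - delta on every feature k."""
    return all(a >= b - delta for a, b in zip(x, y))


def relation_matrix(objects, delta=0.0):
    """Build the full 0/1 relation matrix from scratch (quadratic in the number of objects)."""
    n = len(objects)
    return [
        [1 if dominates(objects[i], objects[j], delta) else 0 for j in range(n)]
        for i in range(n)
    ]


def add_objects(matrix, objects, new_objects, delta=0.0):
    """Extend an existing matrix: copy the old block, compute only new rows and columns."""
    all_objects = list(objects) + list(new_objects)
    n_old, n = len(objects), len(all_objects)
    extended = []
    for i in range(n):
        row = []
        for j in range(n):
            if i < n_old and j < n_old:
                row.append(matrix[i][j])  # reuse previously computed entry
            else:
                row.append(1 if dominates(all_objects[i], all_objects[j], delta) else 0)
        extended.append(row)
    return extended


def delete_objects(matrix, removed):
    """Shrink an existing matrix: drop the rows and columns of the removed indices."""
    keep = [i for i in range(len(matrix)) if i not in removed]
    return [[matrix[i][j] for j in keep] for i in keep]


if __name__ == "__main__":
    old = [(1, 2), (2, 3), (0, 5)]   # features assumed already encoded as ordered numbers
    new = [(3, 3)]
    m_old = relation_matrix(old)
    m_inc = add_objects(m_old, old, new)
    # The incremental result should match a full recomputation.
    print(m_inc == relation_matrix(old + new))
```

In this sketch, adding objects evaluates the relation only for pairs that involve a new object, and deleting objects evaluates nothing at all. Any uncertainty measure derived from the matrix (such as NDCE in the paper) would then be updated from the changed entries, a step this sketch does not attempt to reproduce.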
Keywords
Heterogeneous ordered decision system, Dominance-based neighborhood rough set, Feature selection, Matrix-based incremental algorithm