Binary Gravitational Subspace Search for Outlier Detection in High Dimensional Data Streams

Advanced Data Mining and Applications (2022)

Abstract
In recent years, rapid technological progress has led to the generation of high-dimensional data streams. Combining the streaming scenario with high dimensionality makes outlier detection particularly complex. This is due to the unique properties of data streams, such as restricted space and time and concept drift, in addition to the curse of dimensionality in high-dimensional space. Typically, interesting knowledge, including outliers, resides in low-dimensional subspaces of the full feature space. Finding these subspaces is considered an NP-hard problem and requires careful attention, especially in the context of data streams. To address these issues, we propose BGSSA (Binary Gravitational Subspace Search Algorithm), a novel metaheuristic-based subspace search method for outlier detection in high-dimensional data streams. The idea is to adapt the binary gravitational search algorithm (GSA) so that it returns the top best solutions rather than the single solution of the original method, in order to find, for each streaming window, relevant subspaces composed of independent features in which outlier detection is then performed. The relevance of a subspace is evaluated by the contrast measure. Experiments on real and synthetic datasets confirm the feasibility of our solution as well as its performance improvement over the main approaches studied in the literature.
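
The idea of returning several top-ranked subspaces per window, rather than a single best one, can be illustrated with a minimal sketch. Everything here is an assumption made for illustration and not the authors' implementation: `top_k_subspaces` and `toy_relevance` are hypothetical names, the random bit flip is a placeholder for the mass-driven binary GSA update, and the variance-based score is a placeholder for the contrast measure.

```python
import heapq
import random
from typing import Callable, Dict, List, Sequence, Tuple

Mask = Tuple[int, ...]  # 1 = feature selected, 0 = feature ignored
Window = Sequence[Sequence[float]]


def toy_relevance(window: Window, mask: Mask) -> float:
    """Hypothetical stand-in for the contrast measure:
    mean variance of the selected features."""
    cols = [d for d, bit in enumerate(mask) if bit]
    if len(cols) < 2:  # treat masks with fewer than two features as invalid
        return float("-inf")
    total = 0.0
    for d in cols:
        values = [row[d] for row in window]
        mean = sum(values) / len(values)
        total += sum((v - mean) ** 2 for v in values) / len(values)
    return total / len(cols)


def top_k_subspaces(
    window: Window,
    n_features: int,
    k: int,
    score: Callable[[Window, Mask], float] = toy_relevance,
    n_agents: int = 20,
    n_iters: int = 30,
    seed: int = 0,
) -> List[Tuple[Mask, float]]:
    """Return up to k distinct binary feature masks with the highest score."""
    rng = random.Random(seed)
    agents = [
        tuple(rng.randint(0, 1) for _ in range(n_features))
        for _ in range(n_agents)
    ]
    archive: Dict[Mask, float] = {}  # every distinct mask evaluated so far
    for _ in range(n_iters):
        for mask in agents:
            if mask not in archive:
                archive[mask] = score(window, mask)
        # Placeholder move: flip one random bit per agent.
        # A binary GSA would instead derive flip probabilities from
        # fitness-based masses and velocities.
        moved = []
        for mask in agents:
            d = rng.randrange(n_features)
            moved.append(mask[:d] + (1 - mask[d],) + mask[d + 1:])
        agents = moved
    best = heapq.nlargest(k, archive.items(), key=lambda item: item[1])
    return [(m, s) for m, s in best if s != float("-inf")]
```

In a streaming setting, a function of this shape would be called once per window, and an outlier detector would then be run inside each returned subspace; swapping in a real contrast measure only requires passing a different `score` callable.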
Keywords
Outlier detection, Data streams, High dimensional data, Subspace