Fading histograms in detecting distribution and concept changes

I. J. Data Science and Analytics (2017)

Abstract
The growing number of real applications operating under dynamic scenarios is driving a new capacity to generate and gather information. Massive amounts of information are now generated at high speed, in the form of data streams. Moreover, data are collected under evolving environments. Due to memory restrictions, data must be processed promptly and then discarded. Therefore, dealing with evolving data streams raises two main questions: (i) how to remember discarded data? and (ii) how to forget outdated data? To maintain an updated representation of time-evolving data, this paper proposes fading histograms. To account for the dynamic nature of the data, changes are detected through a windowing scheme that compares data distributions computed by the fading histograms: the adaptive cumulative windows model (ACWM). The distance between data distributions is monitored online using a dissimilarity measure based on the asymmetry of the Kullback–Leibler divergence. The experimental results support the ability of fading histograms to provide an updated representation of data. This property favors detecting distribution changes with a shorter detection delay than standard histograms. For the detection of concept changes, the ACWM is compared with three known algorithms from the literature, using artificial data and public data sets, and presents better results. Furthermore, the proposed method was extended to multidimensional data, and the experiments performed show the ability of the ACWM to detect distribution changes in these settings.
Keywords
Data streams, Fading histograms, Data monitoring, Distribution changes, Concept changes
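
The abstract names two ingredients: a histogram that gradually forgets old observations, and a dissimilarity based on the asymmetry of the Kullback–Leibler divergence. The abstract does not give their exact definitions, so the following Python is only a minimal sketch under stated assumptions: counts are decayed by a multiplicative fading factor `alpha` on every update, and the asymmetry is taken as the absolute difference between the two directed divergences. The class and function names (`FadingHistogram`, `kl_asymmetry`), the equal-width binning, and the smoothing constant `eps` are illustrative choices, not taken from the paper.

```python
import math


class FadingHistogram:
    """Equal-width histogram whose counts decay exponentially (assumed form)."""

    def __init__(self, low, high, n_bins, alpha=0.99):
        self.low = low
        self.high = high
        self.n_bins = n_bins
        self.alpha = alpha  # assumed fading factor, 0 < alpha <= 1
        self.counts = [0.0] * n_bins

    def update(self, x):
        # Decay every existing count, then credit the bin of the new value.
        self.counts = [self.alpha * c for c in self.counts]
        width = (self.high - self.low) / self.n_bins
        idx = int((x - self.low) / width)
        idx = min(max(idx, 0), self.n_bins - 1)  # clamp out-of-range values
        self.counts[idx] += 1.0

    def frequencies(self, eps=1e-9):
        # Smoothed relative frequencies; eps keeps every bin strictly positive
        # so that the logarithms in kl() are always defined.
        total = sum(self.counts) + eps * self.n_bins
        return [(c + eps) / total for c in self.counts]


def kl(p, q):
    """Directed Kullback-Leibler divergence KL(p || q) for positive p, q."""
    return sum(pi * math.log(pi / qi) for pi, qi in zip(p, q))


def kl_asymmetry(p, q):
    """Assumed dissimilarity: |KL(p || q) - KL(q || p)|."""
    return abs(kl(p, q) - kl(q, p))
```

With `alpha = 1` the histogram reduces to a standard one that never forgets; smaller values give more weight to recent observations. In a change detector along the lines described in the abstract, `kl_asymmetry` would be evaluated between the frequencies of a reference window and those of a current window, and compared against a threshold; how the ACWM sets and adapts that threshold is not stated in the abstract and is left out here.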