PrefixCDD: Effective Online Concept Drift Detection over Event Streams using Prefix Trees.

COMPSAC (2023)

Abstract
Process mining focuses on applying data mining techniques to business process data. Recently, with improvements in the sensing, collection, and storage of event data, demand has grown for both shorter mining times and adaptive models of streaming process events. This has increased interest in streaming process mining. Some techniques within this field attempt to identify drifts (change points) in evolving process data streams. Existing work on supervised and unsupervised learning approaches over data streams has several limitations with regard to the nature of the drifts, the excessive storage required to store and process the stream, and the performance on real-world datasets. This paper contributes PrefixCDD, a novel and efficient unsupervised learning approach for online concept drift detection (CDD) over event streams. Our proposed approach uses a data structure in which the data stream components are stored in a set of prefix trees. It then transforms the discrete data into continuous data by applying Principal Component Analysis (PCA) over the trees. ADWIN is then used to focus on up-to-date information, which makes it a good fit for the decaying mechanism behind our algorithm. Using six artificial and three real-life datasets, PrefixCDD outperforms state-of-the-art techniques in terms of detecting existing drifts of different natures, discovering them shortly after they appear, and overall execution time.
Keywords
Process Mining, Stream Process Mining, Concept Drift Detection, Unsupervised-Learning, Event Streams
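
The abstract describes storing stream components in prefix trees and relying on a decaying mechanism. The following is a minimal sketch of that idea only, not the authors' implementation. It assumes a single tree (the paper uses a set of trees), treats each trace as a tuple of activity labels, and uses a hypothetical multiplicative `decay` factor. The names `Node`, `PrefixTree`, `insert`, `decay`, and `prefix_counts` are all illustrative. The PCA and ADWIN stages are left out.

```python
from dataclasses import dataclass, field


@dataclass
class Node:
    """One prefix-tree node; `count` is the (decayed) number of traces through it."""
    count: float = 0.0
    children: dict = field(default_factory=dict)


class PrefixTree:
    """Illustrative prefix tree over event traces (assumed design, not PrefixCDD's)."""

    def __init__(self):
        self.root = Node()

    def insert(self, trace):
        """Add one trace, incrementing the count of every prefix it contains."""
        node = self.root
        node.count += 1
        for activity in trace:
            node = node.children.setdefault(activity, Node())
            node.count += 1

    def decay(self, factor):
        """Multiply every count by `factor` so that older traces weigh less."""
        stack = [self.root]
        while stack:
            node = stack.pop()
            node.count *= factor
            stack.extend(node.children.values())

    def prefix_counts(self):
        """Return {prefix tuple: count} for every non-empty prefix in the tree."""
        result = {}
        stack = [((), self.root)]
        while stack:
            prefix, node = stack.pop()
            for activity, child in node.children.items():
                key = prefix + (activity,)
                result[key] = child.count
                stack.append((key, child))
        return result


# Hypothetical traces: each tuple is the ordered activities of one process case.
tree = PrefixTree()
for trace in [("a", "b", "c"), ("a", "b", "d"), ("a", "c")]:
    tree.insert(trace)
counts = tree.prefix_counts()
```

The prefix counts form a discrete summary of the stream. A numeric vector derived from them is the kind of input that a dimensionality reduction step and a windowed change detector such as ADWIN could consume.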