SSS: An Accurate and Fast Algorithm for Finding Top-k Hot Items in Data Streams

2018 IEEE International Conference on Big Data and Smart Computing (BigComp), 2018

Abstract
Finding the top-k hot items in a data stream is a critical problem in big data management. It benefits a wide range of applications, such as data mining, databases, and network traffic measurement. However, as data streams become increasingly fast, it becomes more and more challenging to design an accurate and fast algorithm for this problem. Several algorithms exist, including Space-Saving, Frequent, and Lossy Counting, with Space-Saving being the most widely used. Unfortunately, none of these existing algorithms achieves high memory efficiency and high accuracy at the same time. In this paper, we propose an enhanced algorithm, named Scoreboard Space-Saving (SSS), which not only achieves much higher accuracy but also runs at a fast, constant speed. The key idea of SSS is to score each incoming item in order to predict whether it is a hot item. Experimental results show that the SSS algorithm achieves up to 62.4 times higher accuracy than Space-Saving.
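For context, the Space-Saving baseline that the abstract compares against can be sketched as follows. This is a minimal illustration of the classic Space-Saving idea only, not the SSS algorithm, whose scoring rule is not described in the abstract. The names `space_saving`, `top_k`, and `capacity` are hypothetical, and the linear scan for the minimum counter is a simplification: practical implementations typically keep counters in a structure that finds the minimum in constant time.

```python
def space_saving(stream, capacity):
    """Classic Space-Saving sketch: keep at most `capacity` counters.

    Assumption: a plain dict with a linear-time minimum search, chosen
    for brevity rather than speed.
    """
    counts = {}
    for item in stream:
        if item in counts:
            # Monitored item: increment its counter.
            counts[item] += 1
        elif len(counts) < capacity:
            # Free slot available: start monitoring the item.
            counts[item] = 1
        else:
            # No free slot: evict the item with the smallest counter and
            # let the newcomer inherit that counter plus one.
            victim = min(counts, key=counts.get)
            counts[item] = counts.pop(victim) + 1
    return counts


def top_k(counts, k):
    """Return the k items with the largest (estimated) counts."""
    return sorted(counts.items(), key=lambda kv: (-kv[1], kv[0]))[:k]


if __name__ == "__main__":
    stream = ["a"] * 5 + ["b"] * 3 + ["c", "d"]
    print(top_k(space_saving(stream, capacity=3), k=2))
```

Because an evicted item's count is inherited by the newcomer, a cold item that arrives late can be overestimated. This is the kind of inaccuracy that predicting hotness before admission, as the abstract describes for SSS, is meant to reduce.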
Keywords
Data Structures,Finding Top-k Items,Space-Saving