SpaceSaving±: An Optimal Algorithm for Frequency Estimation and Frequent Items in the Bounded-Deletion Model
arXiv (2021)
Abstract
In this paper, we propose the first deterministic algorithms to solve the frequency estimation and frequent items problems in the bounded-deletion model. We establish the space lower bound for solving the deterministic frequent items problem in the bounded-deletion model, and propose the Lazy SpaceSaving± and SpaceSaving± algorithms, which achieve the optimal space bound. We develop an efficient implementation of the SpaceSaving± algorithm that minimizes the latency of update operations using novel data structures. Experimental evaluations show that SpaceSaving± produces accurate frequency estimates and achieves very high recall and precision across different data distributions while using minimal space. Our experiments demonstrate that, given the same space, SpaceSaving± is more accurate than the state-of-the-art protocols with up to a (log U - 1)/log U fraction of the items deleted, where U is the size of the input universe. Moreover, motivated by prior work, we propose Dyadic SpaceSaving±, the first deterministic quantile approximation sketch in the bounded-deletion model.
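The abstract does not spell out the update rules, so the following is only a minimal illustrative sketch of how a counter-based summary with lazy deletions might look. It assumes the classic SpaceSaving insertion rule (when all k counters are in use, evict the item with the smallest count and let the newcomer inherit that count plus one) and assumes a lazy deletion rule that decrements the counter of a monitored item and ignores deletions of unmonitored items. The class name `LazySpaceSavingSketch`, the parameter `k`, and the linear scan for the minimum are hypothetical choices for illustration; they are not the efficient data structures the abstract refers to.

```python
class LazySpaceSavingSketch:
    """Hypothetical sketch: at most k counters, lazy handling of deletions.

    Assumption: classic SpaceSaving inserts plus "decrement if monitored,
    otherwise ignore" deletes. Not the authors' optimized implementation.
    """

    def __init__(self, k):
        if k <= 0:
            raise ValueError("k must be positive")
        self.k = k
        self.counts = {}  # monitored item -> estimated frequency

    def insert(self, item):
        if item in self.counts:
            # Item is already monitored: bump its counter.
            self.counts[item] += 1
        elif len(self.counts) < self.k:
            # A free counter exists: start monitoring the item.
            self.counts[item] = 1
        else:
            # All counters in use: evict the item with the smallest count.
            # The newcomer inherits that count plus one (an overestimate).
            victim = min(self.counts, key=self.counts.get)
            min_count = self.counts.pop(victim)
            self.counts[item] = min_count + 1

    def delete(self, item):
        # Lazy rule (assumed): only monitored items are decremented.
        if item in self.counts:
            self.counts[item] -= 1

    def estimate(self, item):
        # Unmonitored items are reported with frequency 0.
        return self.counts.get(item, 0)


if __name__ == "__main__":
    sketch = LazySpaceSavingSketch(k=2)
    for x in ["a", "a", "b", "c"]:
        sketch.insert(x)
    sketch.delete("a")
    print(sketch.estimate("a"), sketch.estimate("c"))
```

The minimum is found with an O(k) scan here purely for brevity; a practical implementation would maintain the counters in a structure that supports fast minimum lookup and in-place decrements.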
Keywords
frequency estimation, frequent items, deletion