Staggered HLL: Near-continuous-time cardinality estimation with no overhead

Computer Communications (2022)

Abstract
Most existing cardinality estimation algorithms do not natively support interval queries under a sliding-window model and are therefore insensitive to data recency. We present Staggered-HyperLogLog (ST-HLL), a probabilistic data structure that takes inspiration from HyperLogLog (HLL) and provides nearly continuous-time estimation of cardinality rates rather than absolute counts. Our solution has zero-bit overhead with respect to vanilla HLL and negligible additional computational complexity. It is based on a periodic staggered reset of the HLL registers and a register equalization operation at query time that compensates for staggered counting. We tested ST-HLL on both synthetic and real Internet traffic traces, showing that it tracks variations in flow cardinality and adapts quickly under non-stationary flow arrival processes. We show that, for the same memory footprint, our algorithm improves accuracy by up to a factor of 2 with respect to the state-of-the-art solution, Sliding HLL.
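
The staggered-reset idea can be illustrated with a minimal sketch. It is not the authors' implementation: the class name `StaggeredResetHLL`, the parameters `p` and `window`, and the round-robin schedule (register k cleared once per window, at evenly spaced offsets) are assumptions made for illustration. The register equalization step is not specified in the abstract, so it is deliberately left out; `raw_estimate` is the standard HLL estimator applied to registers that have observed traffic for different durations, and it should not be read as the ST-HLL rate estimate.

```python
import hashlib
import math


class StaggeredResetHLL:
    """Vanilla HLL registers plus an assumed round-robin staggered reset.

    Hypothetical illustration only: the paper's equalization step is omitted.
    """

    def __init__(self, p=8, window=1.0):
        if p < 7:
            raise ValueError("alpha constant below assumes m = 2**p >= 128")
        self.p = p
        self.m = 1 << p              # number of registers
        self.window = window         # time to cycle through all registers
        self.registers = [0] * self.m
        self.resets_done = 0         # single-register resets applied so far

    def _advance(self, now):
        # Resets are spaced window / m apart; count how many are due by `now`.
        due = int(now * self.m / self.window)
        # After an idle gap longer than one window, clearing each register
        # once is enough, so at most m resets are replayed.
        start = max(self.resets_done, due - self.m)
        for k in range(start, due):
            self.registers[k % self.m] = 0
        self.resets_done = max(self.resets_done, due)

    def add(self, item, now):
        self._advance(now)
        digest = hashlib.sha256(str(item).encode()).digest()
        h = int.from_bytes(digest[:8], "big")          # 64-bit hash
        idx = h >> (64 - self.p)                       # first p bits pick a register
        rest = h & ((1 << (64 - self.p)) - 1)          # remaining 64 - p bits
        rank = (64 - self.p) - rest.bit_length() + 1   # position of leftmost 1-bit
        self.registers[idx] = max(self.registers[idx], rank)

    def raw_estimate(self, now):
        """Standard HLL estimate over the current registers (no equalization)."""
        self._advance(now)
        m = self.m
        alpha = 0.7213 / (1 + 1.079 / m)
        est = alpha * m * m / sum(2.0 ** -r for r in self.registers)
        zeros = self.registers.count(0)
        if est <= 2.5 * m and zeros > 0:
            est = m * math.log(m / zeros)              # small-range correction
        return est


sketch = StaggeredResetHLL(p=8, window=1.0)
for i in range(1000):
    sketch.add(f"flow-{i}", now=0.0)
print(sketch.raw_estimate(now=0.0))  # standard HLL estimate of the 1000 flows
print(sketch.raw_estimate(now=2.0))  # 0.0: every register was reset in the idle gap
```

Because the resets reuse the existing registers and need only a counter of resets already applied, the sketch stores no per-register timestamps, which is consistent with the zero-bit overhead claimed above. Registers cleared recently have seen less traffic than those cleared long ago, which is the imbalance the query-time equalization is described as compensating for.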
Keywords
Network monitoring, Probabilistic data structures, HyperLogLog counters, Cardinality estimation