Randomized Multi-Pass Streaming Skyline Algorithms

International Conference on Very Large Data Bases (2009)

Abstract
We consider external algorithms for skyline computation without pre-processing. Our goal is to develop an algorithm with a good worst-case guarantee that also performs well on average. Due to the nature of disks, it is desirable that such algorithms access the input as a stream (even if in multiple passes). Using randomization, which has proved useful in many applications, we present an efficient multi-pass streaming algorithm, RAND, for skyline computation. As far as we are aware, RAND is the first randomized skyline algorithm in the literature. RAND is near-optimal for the streaming model, which we prove via a simple lower bound. Additionally, our algorithm is distributable and can handle partially ordered domains on each attribute. Finally, we demonstrate the robustness of RAND via extensive experiments on both real and synthetic datasets. RAND is comparable to existing algorithms in the average case and is additionally tolerant of simple modifications of the data, whereas other algorithms degrade considerably under such variation.
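To make the setting concrete, the following is a minimal sketch of skyline computation over a re-readable stream using random sampling. It is not the RAND algorithm from the paper, whose details the abstract does not give. The names `dominates` and `multipass_skyline`, the "smaller is better" convention on every attribute, and the choice of one sampled point per round are all assumptions made for illustration.

```python
import random
from typing import Callable, Iterable, List, Tuple

Point = Tuple[float, ...]


def dominates(p: Point, q: Point) -> bool:
    """Assumed convention: smaller is better on every attribute.

    p dominates q if p is no worse everywhere and strictly better somewhere.
    """
    return all(a <= b for a, b in zip(p, q)) and any(a < b for a, b in zip(p, q))


def multipass_skyline(stream: Callable[[], Iterable[Point]], seed: int = 0) -> List[Point]:
    """Illustrative randomized multi-pass skyline (hypothetical, not the paper's RAND).

    `stream` is a callable that returns a fresh iterator over the input,
    modelling one sequential pass over data on disk. Each round uses two
    passes and keeps only the skyline found so far in memory.
    """
    rng = random.Random(seed)
    skyline: List[Point] = []
    found = set()

    while True:
        # Pass 1: reservoir-sample one point that is still unsettled,
        # i.e. not already reported and not dominated by a reported point.
        sample = None
        unsettled = 0
        for p in stream():
            if p in found or any(dominates(s, p) for s in skyline):
                continue
            unsettled += 1
            if rng.randrange(unsettled) == 0:
                sample = p
        if sample is None:
            return skyline  # every point is settled

        # Pass 2: replace the sample by any point that dominates it.
        # Because dominance is transitive, the point held at the end of
        # the pass is dominated by nothing in the stream.
        for p in stream():
            if dominates(p, sample):
                sample = p

        skyline.append(sample)
        found.add(sample)


points = [(1, 5), (2, 2), (5, 1), (3, 3), (4, 4), (2, 6)]
result = multipass_skyline(lambda: iter(points))
print(sorted(result))
```

This sketch needs two passes per skyline point, so it only illustrates the access pattern (sequential passes, small memory, random choice of candidates) and makes no claim about the pass or time bounds proved in the paper.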
Keywords
skyline computation, algorithms access, existing algorithm, external algorithm, randomized skyline algorithm, average case, good worst case guarantee, simple modification, efficient multi-pass, extensive experiment, randomized multi-pass