Range Aggregation With Set Selection

Yufei Tao, Cheng Sheng, Chin-Wan Chung, Jong-Ryul Lee

IEEE Transactions on Knowledge and Data Engineering (2014)

Abstract

In the classic range aggregation problem, we have a set $S$ of objects such that, given an interval $I$, a query counts how many objects of $S$ are covered by $I$. Besides COUNT, the problem can also be defined with other aggregate functions, e.g., SUM, MIN, MAX and AVERAGE. This paper studies a novel variant of range aggregation, where an object can belong to multiple sets. A query (at runtime) picks any two sets, and aggregates on their intersection. More formally, let $S_{1}, \ldots, S_{m}$ be $m$ sets of objects. Given distinct set ids $i$, $j$ and an interval $I$, a query reports how many objects in $S_{i} \cap S_{j}$ are covered by $I$. We call this problem range aggregation with set selection (RASS). Its hardness lies in that the pair $(i, j)$ can have $\binom{m}{2}$ choices, rendering effective indexing a non-trivial task. The RASS problem can also be defined with other aggregate functions, and generalized so that a query chooses more than 2 sets. We develop a system called RASS to power this type of query. Our system has excellent efficiency in both theory and practice. Theoretically, it consumes linear space, and achieves nearly-optimal query time. Practically, it outperforms existing solutions on real datasets by a factor of up to an order of magnitude. The paper also features a rigorous theoretical analysis on the hardness of the RASS problem, which reveals invaluable insight into its characteristics.
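To make the query semantics concrete, here is a minimal brute-force sketch of both the classic COUNT query and the RASS COUNT query. It is only a naive baseline for illustration, not the index structure the paper develops. It assumes objects are one-dimensional numeric values and that the interval $I = [lo, hi]$ is closed; the function names `range_count` and `rass_count` and the sample data are hypothetical.

```python
from bisect import bisect_left, bisect_right
from math import comb


def range_count(sorted_values, lo, hi):
    """Classic range COUNT: number of values in the closed interval [lo, hi].

    Assumes sorted_values is a list sorted in ascending order.
    """
    return bisect_right(sorted_values, hi) - bisect_left(sorted_values, lo)


def rass_count(sets, i, j, lo, hi):
    """Naive RASS COUNT: objects of sets[i] & sets[j] lying in [lo, hi].

    Scans the smaller of the two sets, so the cost is linear in its size.
    This is a baseline sketch, not the paper's method.
    """
    if i == j:
        raise ValueError("set ids must be distinct")
    small, large = sorted((sets[i], sets[j]), key=len)
    return sum(1 for x in small if x in large and lo <= x <= hi)


# Hypothetical data: m = 3 sets; an object may belong to several sets.
sets = {1: {2, 5, 9, 14}, 2: {5, 9, 20}, 3: {1, 9, 14}}

classic = range_count(sorted(sets[1]), 3, 9)  # values 5 and 9
pair_12 = rass_count(sets, 1, 2, 0, 10)       # intersection {5, 9}
pair_13 = rass_count(sets, 1, 3, 10, 20)      # intersection {9, 14}
num_pairs = comb(len(sets), 2)                # number of possible (i, j) pairs
```

Precomputing a separate sorted list for every pair $(i, j)$ would answer each query with two binary searches, but it requires one list for each of the $\binom{m}{2}$ pairs, which illustrates why indexing for this problem is non-trivial.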

Keywords

aggregate functions, RASS problem, set theory, indexing, rendering (computer graphics), range aggregation with set selection, linear space, theory, rendering, query counts, nearly optimal query time, range aggregation, query processing, index, silicon, aging
