Categorical Range Maxima Queries

SIGMOD/PODS '14: International Conference on Management of Data, Snowbird, Utah, USA, June 2014

Abstract
Given an array A[1...n] of n distinct elements from the set {1, 2, ..., n}, a range maximum query RMQ(a, b) returns the highest element in A[a...b] along with its position. In this paper, we study a generalization of this classical problem called the Categorical Range Maxima Query (CRMQ) problem, in which each element A[i] in the array has an associated category (color) given by C[i] ∈ [σ]. A query then asks to report each distinct color c appearing in C[a...b] along with the highest element (and its position) in A[a...b] with color c. Let p(c) denote the position of the highest element in A[a...b] with color c. We investigate two variants of this problem: a threshold version and a top-k version. In the threshold version, we only need to output the colors with A[p(c)] greater than the input threshold T, whereas the top-k variant asks for the k colors with the highest A[p(c)] values. In the word RAM model, we obtain a linear-space structure with O(k) query time that can report colors in sorted order of A[p(c)]. In external memory, we present a data structure that answers queries in optimal O(1 + k/B) I/Os using almost-linear O(n log* n) space, as well as a linear-space data structure with O(log* n + k/B) query I/Os. Here k represents the output size, log* n is the iterated logarithm of n, and B is the block size. CRMQ has applications to document retrieval and categorical range reporting, giving a one-shot framework to obtain improved results in both of these problems. Our results for CRMQ not only improve the best known results for three-sided categorical range reporting but also overcome the hurdle of maintaining color uniqueness in the output set.
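To make the query semantics concrete, the following is a minimal brute-force sketch of CRMQ and its two variants. It is not the data structure from the paper: it scans the range in O(b - a + 1) time per query and serves only as a reference for what a correct answer looks like. The function names (`crmq`, `crmq_threshold`, `crmq_top_k`) and the 1-indexed, inclusive range convention are assumptions chosen for illustration.

```python
from typing import Dict, List, Tuple


def crmq(A: List[int], C: List[int], a: int, b: int) -> Dict[int, Tuple[int, int]]:
    """Naive CRMQ over the 1-indexed inclusive range [a, b].

    Returns {color c: (A[p(c)], p(c))}, where p(c) is the position of the
    highest element of color c inside A[a...b].
    """
    best: Dict[int, Tuple[int, int]] = {}
    for i in range(a, b + 1):
        value, color = A[i - 1], C[i - 1]  # shift to Python's 0-based lists
        if color not in best or value > best[color][0]:
            best[color] = (value, i)
    return best


def crmq_threshold(A: List[int], C: List[int], a: int, b: int, T: int) -> List[Tuple[int, int, int]]:
    """Threshold variant: colors with A[p(c)] > T, as (color, value, position),
    sorted by decreasing value."""
    hits = [(c, v, p) for c, (v, p) in crmq(A, C, a, b).items() if v > T]
    return sorted(hits, key=lambda t: t[1], reverse=True)


def crmq_top_k(A: List[int], C: List[int], a: int, b: int, k: int) -> List[Tuple[int, int, int]]:
    """Top-k variant: the k colors with the highest A[p(c)], sorted by
    decreasing value (fewer than k if the range holds fewer colors)."""
    hits = [(c, v, p) for c, (v, p) in crmq(A, C, a, b).items()]
    return sorted(hits, key=lambda t: t[1], reverse=True)[:k]


if __name__ == "__main__":
    A = [3, 7, 1, 5, 2, 6, 4]  # distinct values from {1, ..., 7}
    C = [1, 2, 1, 3, 2, 1, 3]  # color of each position
    print(crmq(A, C, 2, 6))
    print(crmq_threshold(A, C, 2, 6, T=5))
    print(crmq_top_k(A, C, 2, 6, k=2))
```

Each color appears at most once in the output, which is the color-uniqueness property the abstract refers to; the structures described in the paper aim to produce the same answers in time or I/Os proportional to the output size k rather than the range length.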
Keywords
I/O Efficiency, Categorical Queries