Near-optimal range reporting structures for categorical data

SODA(2013)

引用 34|浏览8
暂无评分
摘要
Range reporting on categorical (or colored) data is a well-studied generalization of the classical range reporting problem in which each of the N input points has an associated color (category). A query then asks to report the set of colors of the points in a given rectangular query range, which may be far smaller than the set of all points in the query range. We study two-dimensional categorical range reporting in both the word-RAM and I/O-model. For the I/O-model, we present two alternative data structures for three-sided queries. The first answers queries in optimal O(lgB N + K/B) I/Os using O(N lg* N) space, where K is the number of distinct colors in the output, B is the disk block size, and lg* N is the iterated logarithm of N. Our second data structure uses linear space and answers queries in O(lgB N + lg(h) N + K/B) I/Os for any constant integer h ≥ 1. Here lg(1) N = lg N and lg(h) N = lg(lg(h−1) N) when h > 1. Both solutions use only comparisons on the coordinates. We also show that the lgB N terms in the query costs can be reduced to optimal lg lgB U when the input points lie on a U x U grid and we allow word-level manipulations of the coordinates. We further reduce the query time to just O(1) if the points are given on an N x N grid. Both solutions also lead to improved data structures for four-sided queries. For the word-RAM, we obtain optimal data structures for three-sided range reporting, as well as improved upper bounds for four-sided range reporting. Finally, we show a tight lower bound on one-dimensional categorical range counting using an elegant reduction from (standard) two-dimensional range counting.
更多
查看译文
关键词
algorithms,design,models of computation,counting problems,theory
AI 理解论文
溯源树
样例
生成溯源树,研究论文发展脉络
Chat Paper
正在生成论文摘要