MulRF: A Multi-dimensional Range Filter for Sublinear Time Range Query Processing

IEEE Transactions on Knowledge and Data Engineering (2024)

Abstract
Range queries are an important operation on large multi-dimensional datasets. This paper studies multi-dimensional range query filtering, which speeds up range query processing by avoiding reads of data that cannot contribute to the result. Existing one-dimensional range filters cannot filter multi-dimensional range queries efficiently, so a novel multi-dimensional range filter is proposed. Based on this filter, an efficient range query processing algorithm is presented. It directly returns the locations of the I/O units that contain data in the query result, without any access to the input dataset. The time complexity of the algorithm is $O(3^{m}h)$, where $h$ is the number of I/O units partially overlapping the range query and $m$ is the number of dimensions. Since $m$ is usually $o(\sqrt{\log n})$, the algorithm runs in sublinear time if $V=O(n)$, where $n$ is the size of the input dataset, $V=\prod_{i=1}^{m}d_{i}$, and $d_{i}$ is the number of distinct values on the $i$-th dimension of the dataset for $1\leq i\leq m$. Experimental results show that the multi-dimensional range filter has a low false positive rate and good filtering efficiency. The proposed range query processing algorithm achieves an improvement of at least $3\sim 7$ times over algorithms based on one-dimensional filters across different datasets.
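
The step from $m=o(\sqrt{\log n})$ to a sublinear bound can be unpacked as follows. This is only a sketch of the $3^{m}$ factor under the abstract's stated assumption. The constants $\varepsilon$ and $\delta$ are introduced here for illustration and do not come from the paper, and how the condition $V=O(n)$ bounds $h$ is left to the paper itself.

$$3^{m}=2^{m\log_{2}3}=2^{o(\sqrt{\log n})}\leq 2^{\varepsilon\log_{2}n}=n^{\varepsilon}\quad\text{for every fixed }\varepsilon>0\text{ and all sufficiently large }n,$$

so $3^{m}=n^{o(1)}$ and

$$O(3^{m}h)=n^{o(1)}\cdot h.$$

Under the additional illustrative assumption that $h=O(n^{1-\delta})$ for some fixed $\delta>0$, choosing $\varepsilon<\delta$ gives $O(3^{m}h)=O(n^{1-\delta+\varepsilon})=o(n)$, which is sublinear in the size of the input dataset.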
Keywords
Multi-dimensional Data,Range Query,Range Filter,Sublinear Time