Efficient query processing for uncertain data (2011)

Abstract
Applications with uncertain data pose many challenges for data management and query processing. This dissertation advances the state of the art for efficient query processing over uncertain data. We study three types of probabilistic queries: nearest-neighbor queries, skyline queries, and general select-project-join (SPJ) queries. All three can leverage a probability threshold for pruning, so that only results that satisfy the query with probability above the given threshold are returned. For nearest-neighbor queries, we design novel indexes and data structures to monitor the pruning status and uncover pruning opportunities. For skyline queries, we propose two filtering schemes to quickly identify interesting instances whose skyline probabilities are above the threshold: (i) bounding an instance's skyline probability, and (ii) comparing the instance with others based on the dominance relationship. For applications of skyline analysis where "thresholding" is not desirable, we pose the problem of computing all skyline probabilities and present, for the first time, two worst-case sub-quadratic algorithms for it. We further give an efficient algorithm for the online version of the problem. Finally, we study general SPJ queries under the Orion uncertainty model and propose optimization rules that leverage the threshold for early pruning of unqualified tuples. We also extend our study to SPJ queries with duplicate elimination, for which we adopt a general tuple uncertainty model and design new techniques for handling duplicate elimination. Our experiments on various data sets show that our techniques are both effective and efficient.
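To make the notion of a thresholded skyline query concrete, the following is a minimal sketch of a naive quadratic baseline, not any of the dissertation's algorithms. It assumes a common discrete model that the abstract does not spell out: each uncertain object is a list of weighted instances, objects are independent of one another, smaller values are better in every dimension, and an instance's skyline probability is its own probability multiplied, over every other object, by the probability that the other object does not dominate it. The names `dominates`, `instance_skyline_probabilities`, and `threshold_filter` are hypothetical.

```python
from typing import Dict, List, Tuple

Point = Tuple[float, ...]
# An uncertain object is a list of (point, probability) instances.
# Assumption: the probabilities of one object's instances sum to at most 1.
UncertainObject = List[Tuple[Point, float]]


def dominates(a: Point, b: Point) -> bool:
    """True if a is no worse than b everywhere and strictly better somewhere.

    Assumption: smaller values are preferred in every dimension.
    """
    return all(x <= y for x, y in zip(a, b)) and any(x < y for x, y in zip(a, b))


def instance_skyline_probabilities(
    objects: List[UncertainObject],
) -> Dict[Tuple[int, int], float]:
    """Naive baseline: skyline probability of every instance.

    Keys are (object index, instance index). Assumes objects are independent.
    """
    result: Dict[Tuple[int, int], float] = {}
    for i, obj in enumerate(objects):
        for k, (point, prob) in enumerate(obj):
            p = prob
            for j, other in enumerate(objects):
                if j == i:
                    continue  # instances of the same object are mutually exclusive
                # Probability mass of the other object that dominates this instance.
                dominating_mass = sum(q for pt, q in other if dominates(pt, point))
                p *= 1.0 - dominating_mass
            result[(i, k)] = p
    return result


def threshold_filter(
    objects: List[UncertainObject], tau: float
) -> Dict[Tuple[int, int], float]:
    """Keep only instances whose skyline probability is above the threshold tau."""
    probs = instance_skyline_probabilities(objects)
    return {key: p for key, p in probs.items() if p > tau}


if __name__ == "__main__":
    objects = [
        [((1.0, 1.0), 0.5), ((3.0, 3.0), 0.5)],  # object 0 with two instances
        [((2.0, 2.0), 1.0)],                     # object 1 with one instance
    ]
    print(threshold_filter(objects, 0.4))
```

This baseline compares every instance against every other instance, which is exactly the cost that bound-based and dominance-based filtering aim to avoid: an instance whose upper bound already falls below the threshold never needs its exact probability computed.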
Keywords
early pruning, uncertain data, duplicate elimination, efficient query processing, various data set, skyline query, skyline analysis, skyline probability, data structure, data management, nearest-neighbor query