Certain Answers Meet Zero-One Laws.
SIGMOD/PODS '18: International Conference on Management of Data, Houston, TX, USA, June 2018
Abstract
Query answering over incomplete data invariably relies on the standard notion of certain answers, which gives a very coarse classification of query answers into those that are certain and those that are not. Here we propose to refine it by measuring how close an answer is to certainty. This measure is defined as the probability that the query is true under a random interpretation of missing information in a database. Since there are infinitely many such interpretations, to pick one at random we adopt the approach used in the study of asymptotic properties and 0-1 laws for logical sentences, and define the measure as the limit of a sequence. We show that in the standard model of missing data, the 0-1 law is observed: this limit always exists and can be only 0 or 1 for a very large class of queries. Thus, query answers are either almost certainly true or almost certainly false. We prove that almost certainly true answers are precisely those returned by the naive evaluation of the query. When databases satisfy constraints, the measure is defined as the conditional probability of the query being true given that the constraints are true. This too is defined as a limit, and we prove that it always exists, can be an arbitrary rational number, and is computable. For some constraints, such as functional dependencies, the 0-1 law continues to hold. As another refinement of the notion of certainty, we introduce a comparison of query answers: an answer with a larger set of interpretations that make it true is better. We identify the precise complexity of such comparisons, and of finding sets of best answers, for first-order queries.
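To make the limit-based measure concrete, the following is a minimal sketch rather than the paper's own construction. It assumes that the k-th term of the sequence is the fraction of valuations sending every null to one of the constants 1..k, each equally likely; the abstract does not fix these details. The relation `R`, the nulls `n1` and `n2`, the two Boolean queries, and the helper names `apply_valuation`, `mu_k`, and `naive` are all hypothetical, introduced only for illustration.

```python
from fractions import Fraction
from itertools import product

# Hypothetical toy database: one binary relation with two marked nulls.
# Constants are ints; nulls are strings (assumption made for this sketch).
R = [(1, "n1"), ("n2", 2)]
NULLS = ["n1", "n2"]


def apply_valuation(rel, v):
    """Replace each null by its value under v; constants are left unchanged."""
    return {tuple(v.get(x, x) for x in row) for row in rel}


def has_loop(rel):
    """Boolean query: exists x. R(x, x)."""
    return any(a == b for a, b in rel)


def has_non_loop(rel):
    """Boolean query: exists x, y. R(x, y) and x != y."""
    return any(a != b for a, b in rel)


def mu_k(query, rel, nulls, k):
    """Fraction of valuations nulls -> {1..k} under which the query is true."""
    hits = 0
    for values in product(range(1, k + 1), repeat=len(nulls)):
        v = dict(zip(nulls, values))
        if query(apply_valuation(rel, v)):
            hits += 1
    return Fraction(hits, k ** len(nulls))


def naive(query, rel):
    """Naive evaluation: treat every null as a fresh, distinct constant."""
    return query(set(rel))


if __name__ == "__main__":
    for k in (2, 10, 100):
        print(k, mu_k(has_loop, R, NULLS, k), mu_k(has_non_loop, R, NULLS, k))
    print("naive:", naive(has_loop, R), naive(has_non_loop, R))
```

Under these assumptions, `mu_k(has_loop, R, NULLS, k)` equals (2k - 1)/k^2 and tends to 0, while `mu_k(has_non_loop, R, NULLS, k)` equals 1 - 1/k^2 for k >= 2 and tends to 1. Neither query is certain in the classical sense, yet the second is almost certainly true, and it is also the one that naive evaluation returns as true, in line with the correspondence stated in the abstract.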