Exploiting Correlations for Expensive Predicate Evaluation
SIGMOD/PODS '15: International Conference on Management of Data, Melbourne, Victoria, Australia, May 2015 (2014)
Abstract
User-defined functions (UDFs) are increasingly used to augment query languages with extra, application-dependent functionality. Selection queries involving UDF predicates tend to be expensive, either in terms of monetary cost or latency. In this paper, we study ways to efficiently evaluate selection queries with UDF predicates. We provide a family of techniques for processing queries at low cost while satisfying user-specified precision and recall constraints. Our techniques are applicable to a variety of scenarios, including when selection probabilities of tuples are available beforehand, when this information is available but noisy, and when no such prior information is available. We also generalize our techniques to more complex queries. Finally, we test our techniques on real datasets and show that they achieve cost savings of up to 80%, while incurring only a small reduction in accuracy.
Keywords
Approximate query processing, User-defined functions