Optimizing Queries Over Partitioned Tables In Mpp Systems

SIGMOD/PODS'14: International Conference on Management of Data Snowbird Utah USA June, 2014(2014)

引用 20|浏览36
暂无评分
摘要
Partitioning of tables based on value ranges provides a powerful mechanism to organize tables in database systems. In the context of data warehousing and large-scale data analysis partitioned tables are of particular interest as the nature of queries favors scanning large swaths of data. In this scenario, eliminating partitions from a query plan that contain data not relevant to answering a given query can represent substantial performance improvements. Dealing with partitioned tables in query optimization has attracted significant attention recently, yet, a number of challenges unique to Massively Parallel Processing (MPP) databases and their distributed nature remain unresolved.In this paper, we present optimization techniques for queries over partitioned tables as implemented in Pivotal Greenplum Database. We present a concise and unified representation for partitioned tables and devise optimization techniques to generate query plans that can defer decisions on accessing certain partitions to query run-time. We demonstrate, the resulting query plans distinctly outperform conventional query plans in a variety of scenarios.
更多
查看译文
关键词
query optimization,MPP systems,partitioning
AI 理解论文
溯源树
样例
生成溯源树,研究论文发展脉络
Chat Paper
正在生成论文摘要