On Spatial Joins in MapReduce

25th ACM SIGSPATIAL International Conference on Advances in Geographic Information Systems (ACM SIGSPATIAL GIS 2017)

Abstract
This paper presents the first attempt at a full-fledged query optimizer for MapReduce-based spatial join algorithms. The optimizer develops its own taxonomy that covers almost all possible ways of performing a spatial join on any two input datasets. The optimizer comes in two flavors: cost-based and rule-based. Given two input datasets, the cost-based query optimizer evaluates the cost of every option in the developed taxonomy and selects the one with the lowest cost. The rule-based query optimizer abstracts the cost models of the cost-based optimizer into a set of simple, easy-to-check heuristic rules, which it then applies to select the lowest-cost option. Both query optimizers are deployed and experimentally evaluated inside a widely used open-source MapReduce-based big spatial data system. Exhaustive experiments show that both query optimizers always make the right decision when spatially joining any two datasets of up to 500 GB each.
Keywords
Hadoop, MapReduce, Spatial Join, Query Optimization