An Efficient Theta-Join Query Processing Algorithm on MapReduce Framework

IS3C '12: Proceedings of the 2012 International Symposium on Computer, Consumer and Control (2012)

Abstract
With the rapid development of hardware and network technology, cloud computing has become an important research topic. For large-scale data processing applications such as data warehousing, MapReduce is the best-known platform for parallel data processing in cloud computing. To support star-join queries in data warehouses, Scatter-Gather-Merge (SGM) provides an efficient algorithm on the MapReduce framework. However, SGM supports only equi-join queries; nonequi-join queries may cause it to fail. In this paper, we propose a method that handles theta-join queries, i.e., both equi-join and nonequi-join queries. The method uses a novel manipulation of keys for partitioning data. This key manipulation fits the MapReduce paradigm and makes theta-join queries workable on the MapReduce platform. Our experimental results show that the proposed method achieves performance similar to SGM while supporting more join-query types, and that it performs even better than SGM for some query types with high data selectivity.
Keywords
mapreduce framework,efficient theta-join query processing,data warehouse,large-scale data processing,nonequi-join query,partitioning data,cloud computing,map reduce,parallel data,theta-join query,high data selectivity,operating systems,algorithm design and analysis,data handling,internet,radio frequency,benchmark testing,data processing,data warehouses,operating system,algorithm design,search engines,search engine