An Efficient Theta-Join Query Processing Algorithm on MapReduce Framework
IS3C '12 Proceedings of the 2012 International Symposium on Computer, Consumer and Control (2012)
Abstract
With the rapid development of hardware and network technology, cloud computing has become an important research topic. For large-scale data processing applications such as data warehouses, MapReduce is the best-known platform for parallel data processing in cloud computing. To support star-join queries in data warehouses, Scatter-Gather-Merge (SGM) provides an efficient algorithm on the MapReduce framework. However, SGM supports only equi-join queries; nonequi-join queries may cause SGM to fail. In this paper, we propose a method to handle theta-join queries, i.e., both equi-join and nonequi-join queries. Our proposed method uses a novel manipulation of keys for partitioning data. This key manipulation fits the MapReduce paradigm and makes theta-join queries workable on the MapReduce platform. Our experimental results show that the proposed method achieves performance similar to SGM while supporting more join-query types. For some query types with high data selectivity, our method performs even better than SGM.
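The abstract does not spell out the key manipulation, so the following is not the paper's algorithm. It is only a minimal sketch of one generic way a nonequi-join such as R.a < S.b could be expressed with MapReduce-style keys, assuming range buckets over the join attribute. All names here (`bucket_of`, `map_phase`, `shuffle`, `reduce_phase`, `theta_join_less_than`, and the `boundaries` parameter) are hypothetical and chosen for illustration. Each S value is sent to the key of its own bucket, and each R value is replicated to its own bucket and every higher one, so every matching pair meets in exactly one reducer.

```python
import bisect
from collections import defaultdict


def bucket_of(value, boundaries):
    """Return the index of the range bucket holding value (boundaries sorted ascending)."""
    return bisect.bisect_right(boundaries, value)


def map_phase(r_values, s_values, boundaries):
    """Emit (key, tagged value) pairs; keys are bucket indices (an assumed scheme)."""
    num_buckets = len(boundaries) + 1
    for a in r_values:
        # An R value can only match S values in its own bucket or a higher one.
        for key in range(bucket_of(a, boundaries), num_buckets):
            yield key, ("R", a)
    for b in s_values:
        # Each S value goes to exactly one key, so no pair is produced twice.
        yield bucket_of(b, boundaries), ("S", b)


def shuffle(pairs):
    """Group values by key, standing in for the framework's shuffle step."""
    groups = defaultdict(list)
    for key, value in pairs:
        groups[key].append(value)
    return groups


def reduce_phase(groups):
    """Within each key group, test the theta predicate a < b on every R/S pair."""
    for values in groups.values():
        r_vals = [v for tag, v in values if tag == "R"]
        s_vals = [v for tag, v in values if tag == "S"]
        for a in r_vals:
            for b in s_vals:
                if a < b:
                    yield (a, b)


def theta_join_less_than(r_values, s_values, boundaries):
    """Run the simulated map, shuffle, and reduce steps; return sorted result pairs."""
    return sorted(reduce_phase(shuffle(map_phase(r_values, s_values, boundaries))))
```

For example, `theta_join_less_than([1, 5, 9], [2, 5, 10], boundaries=[4, 8])` uses three buckets. The replication of R values is the cost of handling an inequality predicate in this sketch; an equi-join would need only one key per value.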
Keywords
mapreduce framework,efficient theta-join query processing,data warehouse,large-scale data processing,nonequi-join query,partitioning data,cloud computing,map reduce,parallel data,theta-join query,high data selectivity,operating systems,algorithm design and analysis,data handling,internet,radio frequency,benchmark testing,data processing,data warehouses,operating system,algorithm design,search engines,search engine