Fine-Grained Data Partitioning Framework For Distributed Database Systems

WWW '14: 23rd International World Wide Web Conference, Seoul, Korea, April 2014

Abstract
With the growing size of web data and the widely adopted parallel cloud computing paradigm, distributed databases and other distributed data processing systems, such as Pregel and GraphLab, use data partitioning to divide large data sets. By default, these systems use hash partitioning to assign data to partitions randomly, which leads to heavy network traffic between partitions. Fine-grained partitioning can allocate data better, minimizing the number of nodes involved in a transaction or job while also balancing the workload across data nodes. In this paper, we propose a novel prototype system, LuTe, that provides a highly efficient fine-grained partitioning scheme for these distributed systems. LuTe maintains a lookup table for each partitioned data set that maps a key to a set of partition IDs. We use a novel lookup table technique that keeps the cost of reading and writing the lookup table low. LuTe also provides transaction support and highly concurrent writes through Multi-Version Concurrency Control (MVCC). We implemented a prototype distributed DBMS on PostgreSQL and used LuTe as middleware to provide fine-grained partitioning support. Extensive experiments conducted on a cluster demonstrate the advantage of the proposed approach. The evaluation results show that, in comparison with other state-of-the-art lookup table solutions, our approach improves throughput by about 20% to 70% on the TPC-C benchmark.
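The contrast the abstract draws between hash partitioning and a lookup table that maps a key to a set of partition IDs can be illustrated with a minimal sketch. It is not LuTe's implementation: the class `LookupTablePartitioner`, its method names, and the CRC32-based fallback hash are all assumptions made for illustration, and the sketch leaves out the storage layout, transaction support, and MVCC that the abstract describes.

```python
import zlib
from typing import Dict, FrozenSet, Hashable, Iterable, Set


class LookupTablePartitioner:
    """Hypothetical key -> partition-ID lookup table with a hash fallback."""

    def __init__(self, num_partitions: int) -> None:
        if num_partitions <= 0:
            raise ValueError("num_partitions must be positive")
        self.num_partitions = num_partitions
        # Explicit fine-grained placements; keys absent here fall back to hashing.
        self._table: Dict[Hashable, FrozenSet[int]] = {}

    def hash_partition(self, key: Hashable) -> int:
        # Default placement: a deterministic hash of the key (assumed CRC32).
        return zlib.crc32(repr(key).encode("utf-8")) % self.num_partitions

    def assign(self, key: Hashable, partition_ids: Iterable[int]) -> None:
        # Record an explicit placement; a key may live on several partitions.
        ids = frozenset(partition_ids)
        if not ids or any(p < 0 or p >= self.num_partitions for p in ids):
            raise ValueError("partition IDs must be non-empty and in range")
        self._table[key] = ids

    def locate(self, key: Hashable) -> FrozenSet[int]:
        # Look up the explicit placement, else use the hash placement.
        return self._table.get(key, frozenset({self.hash_partition(key)}))

    def partitions_touched(self, keys: Iterable[Hashable]) -> Set[int]:
        # Partitions a transaction that writes all of `keys` must contact.
        touched: Set[int] = set()
        for key in keys:
            touched |= self.locate(key)
        return touched


partitioner = LookupTablePartitioner(num_partitions=4)
# Co-locate a customer row and its order row on partition 2.
partitioner.assign("customer:1", [2])
partitioner.assign("order:17", [2])
single_node = partitioner.partitions_touched(["customer:1", "order:17"])
```

Because both keys are placed on partition 2, `single_node` contains only that partition, so a transaction over the two rows stays on one node. Under pure hash placement the same two keys could land on different partitions, which is the cross-partition traffic that fine-grained partitioning aims to reduce.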