VAST-Tree: a vector-advanced and compressed structure for massive data tree traversal

EDBT '12: Proceedings of the 15th International Conference on Extending Database Technology(2012)

引用 16|浏览0
暂无评分
摘要
We propose a compact and efficient index structure for massive data sets. Several indexing techniques are widely-used and well-known such as binary trees and B+trees. Unfortunately, we find that these techniques suffer major two shortcomings when applied to massive sets; first, their indices are so large they could overflow regular main memory, and, second, they suffer from a variety of penalties (e.g., conditional branches, low cache hits, and TLB misses), which restricts the number of instructions executed per processor cycle. Our state-of-the-art index structure, called VAST-Tree, classifies branch nodes into multiple layers. It applies existing techniques such as cache-conscious, aligned, and branch-free structures to the top layers of branch nodes in trees. Next, it applies the adaptive compression technique to save space and harness data parallelism with SIMD instructions to the middle and bottom layers of branch nodes. Moreover, a processor-friendly compression technique is applied to leaf nodes. The end result is that trees are much more compact and traversal efficiency is high. We implement a prototype and show its resulting index size and performance as compared to binary trees, and the hardware-conscious technique called FAST which currently offers the highest performance. Compared to current alternatives, VAST-Tree compacts the branch nodes by more than 95%, and the overall index size by 47-84% given that there are 230 keys. With 228 keys, it has roughly 6.0-times and 1.24-times throughput and saves the memory consumption by more than 94.7% and 40.5% as compared to binary trees and FAST, respectively.
更多
查看译文
关键词
massive data tree traversal,binary tree,adaptive compression technique,state-of-the-art index structure,efficient index structure,hardware-conscious technique,indexing technique,overall index size,branch node,resulting index size,conditional branch,indexation,simd,algorithms,tree traversal,compression
AI 理解论文
溯源树
样例
生成溯源树,研究论文发展脉络
Chat Paper
正在生成论文摘要