A tunable compression framework for bitmap indices

ICDE (2014)

Abstract
Bitmap indices are widely used for large read-only repositories in data warehouses and scientific databases. Their binary representation allows for the use of bitwise operations and specialized run-length compression techniques. Due to a trade-off between compression and query efficiency, bitmap compression schemes are aligned using a fixed encoding length size (typically the word length) to avoid explicit decompression during query time. In general, smaller encoding lengths provide better compression, but require more decoding during query execution. However, when the difference in size is considerable, it is possible for smaller encodings to also provide better execution time. We posit that a tailored encoding length for each bit vector will provide better performance than a one-size-fits-all approach. We present a framework that optimizes compression and query efficiency by allowing bitmaps to be compressed using variable encoding lengths while still maintaining alignment to avoid explicit decompression. Efficient algorithms are introduced to process queries over bitmaps compressed using different encoding lengths. An input parameter controls the aggressiveness of the compression providing the user with the ability to tune the tradeoff between space and query time. Our empirical study shows this approach achieves significant improvements in terms of both query time and compression ratio for synthetic and real data sets. Compared to 32-bit WAH, VAL-WAH produces up to 1.8× smaller bitmaps and achieves query times that are 30% faster.
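The claim that smaller encoding lengths can compress better, and that the best length may differ per bit vector, can be illustrated with a minimal sketch. This is not the paper's VAL-WAH algorithm: it is a simplified WAH-style encoder, assuming each word of `word_len` bits carries either `word_len - 1` literal bits or a fill (a run of identical groups) with a `word_len - 2` bit counter. The names `wah_encode`, `compressed_bits`, and `pick_word_len` are hypothetical, and the selection rule (smallest output wins) is a stand-in for the paper's tunable parameter.

```python
from typing import List, Sequence, Tuple


def wah_encode(bits: List[int], word_len: int) -> List[Tuple]:
    """Simplified WAH-style encoding (illustrative assumption, not VAL-WAH).

    Returns a list of words, each either ('lit', group_bits) or
    ('fill', bit, run), where run counts consecutive uniform groups.
    """
    seg = word_len - 1                      # payload bits per literal word
    max_run = (1 << (word_len - 2)) - 1     # largest run a fill word can hold
    padded = bits + [0] * (-len(bits) % seg)  # pad to a whole number of groups
    words: List[Tuple] = []
    for i in range(0, len(padded), seg):
        group = padded[i:i + seg]
        if all(b == group[0] for b in group):
            bit = group[0]
            last = words[-1] if words else None
            # Extend the previous fill if it has the same bit and room left.
            if last and last[0] == "fill" and last[1] == bit and last[2] < max_run:
                words[-1] = ("fill", bit, last[2] + 1)
            else:
                words.append(("fill", bit, 1))
        else:
            words.append(("lit", tuple(group)))
    return words


def compressed_bits(bits: List[int], word_len: int) -> int:
    """Size of the encoded bitmap in bits: one word_len-bit word per entry."""
    return len(wah_encode(bits, word_len)) * word_len


def pick_word_len(bits: List[int], candidates: Sequence[int] = (8, 16, 32)) -> int:
    """Hypothetical per-vector choice: the candidate with the smallest output."""
    return min(candidates, key=lambda w: compressed_bits(bits, w))


# A sparse bit vector of 217 bits with one set bit every 31 positions.
sparse = [1 if i % 31 == 0 else 0 for i in range(217)]
sizes = {w: compressed_bits(sparse, w) for w in (8, 16, 32)}
best = pick_word_len(sparse)
```

For this input, every 31-bit group contains a set bit, so the 32-bit encoding needs 7 literal words (224 bits), while the 8-bit encoding can collapse the zero runs between set bits and needs 14 words (112 bits). A real implementation would also have to weigh the extra decoding work that shorter words impose at query time, which is the trade-off the abstract describes.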
Keywords
VAL-WAH data set, compression ratio, one-size-fits-all approach, data warehouses, data structures, data compression, query efficiency, word length, query time, bit vector, input parameter, binary representation, tunable compression framework, read-only repositories, bitmap indices, fixed encoding length size, specialized run-length compression techniques, scientific databases, bitwise operations, bitmap compression schemes, variable encoding lengths, query processing, decoding, indexes, encoding, vectors, computer architecture, algorithm design and analysis