Histograms As A Side Effect Of Data Movement For Big Data

MOD(2014)

引用 45|浏览64
暂无评分
摘要
Histograms are a crucial part of database query planning but their computation is resource-intensive. As a consequence, generating histograms on database tables is typically performed as a batch job, separately from query processing. In this paper, we show how to calculate statistics as a side effect of data movement within a DBMS using a hardware accelerator in the data path. This accelerator analyzes tables as they are transmitted from storage to the processing unit, and provides histograms on the data retrieved for queries at virtually no extra performance cost. To evaluate our approach, we implemented this accelerator on an FPGA. This prototype calculates histograms faster and with similar or better accuracy than commercial databases. Moreover, the FPGA can provide various types of histograms such as Equi-depth, Compressed, or Max-diff on the same input data in parallel, without additional overhead.
更多
查看译文
关键词
FPGA,Statistics,Histogram,Query Optimization
AI 理解论文
溯源树
样例
生成溯源树,研究论文发展脉络
Chat Paper
正在生成论文摘要