ATUN-HL: Auto Tuning of Hybrid Layouts Using Workload and Data Characteristics.
ADBIS (2018)
Abstract
Ad-hoc analysis implies processing data in near real-time. Thus, raw data (i.e., neither normalized nor transformed) is typically dumped into a distributed engine, where it is generally stored in a hybrid layout. Hybrid layouts divide data into horizontal partitions, and inside each partition the data is stored vertically, column by column. They keep statistics for each horizontal partition and also support encoding (e.g., dictionary encoding) and compression to reduce the size of the data. Their built-in support for many ad-hoc operations (e.g., selection, projection, and aggregation) makes hybrid layouts the best choice for most operations.
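The structure described in the abstract can be illustrated with a minimal Python sketch. It is not the ATUN-HL method and not the actual Parquet file format; all function names (`build_hybrid_layout`, `dictionary_encode`, `scan`) and the sample data are hypothetical, chosen only to show horizontal partitioning, per-partition column storage, min/max statistics, dictionary encoding, and how statistics let a selection skip whole partitions.

```python
def dictionary_encode(values):
    """Replace each value with a small integer code into a per-column dictionary."""
    dictionary, index, codes = [], {}, []
    for v in values:
        if v not in index:
            index[v] = len(dictionary)
            dictionary.append(v)
        codes.append(index[v])
    return {"dictionary": dictionary, "codes": codes}


def decode(column):
    """Rebuild the original values from a dictionary-encoded column."""
    return [column["dictionary"][c] for c in column["codes"]]


def build_hybrid_layout(rows, row_group_size):
    """Split rows into horizontal partitions and store each one column by column."""
    row_groups = []
    for start in range(0, len(rows), row_group_size):
        chunk = rows[start:start + row_group_size]
        columns, stats = {}, {}
        for name in chunk[0]:
            values = [row[name] for row in chunk]
            columns[name] = dictionary_encode(values)
            stats[name] = (min(values), max(values))  # per-partition statistics
        row_groups.append({"columns": columns, "stats": stats})
    return row_groups


def scan(row_groups, project, filter_column, low, high):
    """Selection plus projection: keep rows with low <= filter_column <= high."""
    results, groups_read = [], 0
    for group in row_groups:
        col_min, col_max = group["stats"][filter_column]
        if col_max < low or col_min > high:
            continue  # statistics prove no row can match, so skip the partition
        groups_read += 1
        filter_values = decode(group["columns"][filter_column])
        projected = {name: decode(group["columns"][name]) for name in project}
        for i, value in enumerate(filter_values):
            if low <= value <= high:
                results.append(tuple(projected[name][i] for name in project))
    return results, groups_read


# Hypothetical sample data: 8 rows, partitioned into row groups of 4.
countries = ["ES", "FR", "DE"]
rows = [{"id": i, "country": countries[i % 3]} for i in range(8)]
layout = build_hybrid_layout(rows, row_group_size=4)
matches, groups_read = scan(layout, ["country"], "id", 5, 6)
```

In this sketch the size of a horizontal partition (`row_group_size`) is a free parameter: smaller partitions give tighter statistics and more skipping, while larger ones give longer column runs for encoding. Only the columns named in `project` and the filter column are decoded, which is the benefit of vertical storage inside each partition.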
Keywords
Big data, Hybrid storage layouts, Auto tuning, Parquet