Agile Query Processing in Statistical Databases: A Process-In-Memory Approach

KSEM (1)(2019)

Abstract
Statistical database systems are designed to answer queries on summarized data (macro data); queries on raw records are not allowed in such systems. Because macro data offers aggregate information about the database, statistical queries are also an effective way to provide analytical results in semantic databases. However, traditional statistical databases were proposed for security protection, i.e., hiding the raw records from user queries, and few studies have addressed the optimization of aggregate queries in statistical databases. In this paper, we propose a new process-in-memory (PIM) based processing scheme, called agile query, for accelerating queries in statistical databases. We present two new designs in the agile query. First, we propose an in-memory index that caches aggregate operators (e.g., sum, min, max, count, and average) in main memory. Aggregate queries that hit the in-memory index can be evaluated in memory without incurring any I/O operation. Second, we propose to incrementally update the in-memory operator index so that the cached data stays consistent with the original data records. We implement the agile query processing framework on top of MySQL and conduct experiments over datasets of various sizes to compare our design with the traditional method in MySQL. The results show that our proposal achieves up to 9 times higher throughput than MySQL under the skewed Zipf query set, and on average about 2 times higher throughput under random and uniformly distributed queries.
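The two designs named in the abstract (an in-memory cache of aggregate operators, kept consistent by incremental updates) can be illustrated with a small sketch. The abstract does not describe the actual index structure, so everything here is an assumption made for illustration: the names `AggregateEntry` and `AggregateIndex`, the use of a plain dictionary keyed by a group label, and the choice to invalidate an entry on delete rather than update it in place.

```python
from dataclasses import dataclass
from typing import Dict, Hashable, Iterable, Optional


@dataclass
class AggregateEntry:
    """Cached aggregates for one group key (hypothetical layout)."""
    count: int = 0
    total: float = 0.0
    minimum: Optional[float] = None
    maximum: Optional[float] = None

    def add(self, value: float) -> None:
        # Incremental update: fold one new record into every aggregate.
        self.count += 1
        self.total += value
        self.minimum = value if self.minimum is None else min(self.minimum, value)
        self.maximum = value if self.maximum is None else max(self.maximum, value)

    @property
    def average(self) -> Optional[float]:
        # Average is derived from sum and count, so it needs no extra state.
        return self.total / self.count if self.count else None


class AggregateIndex:
    """Illustrative in-memory index of aggregate results, not the paper's design."""

    def __init__(self) -> None:
        self._entries: Dict[Hashable, AggregateEntry] = {}
        self.hits = 0
        self.misses = 0

    def lookup(self, key: Hashable) -> Optional[AggregateEntry]:
        # A hit is answered from memory; a miss means the caller must
        # fall back to the underlying database and then call build().
        entry = self._entries.get(key)
        if entry is None:
            self.misses += 1
        else:
            self.hits += 1
        return entry

    def build(self, key: Hashable, values: Iterable[float]) -> AggregateEntry:
        # Populate the cache from values fetched by a full scan on a miss.
        entry = AggregateEntry()
        for value in values:
            entry.add(value)
        self._entries[key] = entry
        return entry

    def on_insert(self, key: Hashable, value: float) -> None:
        # Keep a cached entry consistent when a new raw record arrives.
        entry = self._entries.get(key)
        if entry is not None:
            entry.add(value)

    def on_delete(self, key: Hashable) -> None:
        # Assumption: min and max cannot be repaired without a rescan,
        # so this sketch simply invalidates the entry on delete.
        self._entries.pop(key, None)
```

In this sketch sum, count, and average are maintained in constant time per inserted record, while a delete forces the next query on that key to miss and rebuild. Whether the agile query framework handles deletes this way is not stated in the abstract.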
Keywords
Query processing, Statistical database, Processing in memory