Accelerating Database Queries for Advanced Data Analytics: A New Approach

Hanfeng Chen, Laurie Hendren, Bettina Kemme, February

Semantic Scholar (2021)

Abstract
The rising popularity of data science has led to a diversification of data processing systems. The current ecosystem of data processing software consists of conventional database implementations, traditional numerical computing systems, and more recent efforts that build hybrids of the two. Because many organizations are building complex applications that integrate all three types of system, there is a need for a holistic optimization strategy that works with any of the three, or with their combinations. In this paper, we propose HorsePower, an advanced analytical system based on HorseIR, an array-based intermediate representation (IR). The system is designed to translate conventional database queries, programs in statistical languages, and mixtures of the two into a common IR, which allows query optimization and compiler optimization techniques to be combined at an intermediate level of abstraction. Our experiments compare HorsePower with the column-based database system MonetDB and the array programming language MATLAB. They show that we can achieve significant speedups for standard SQL queries, for analytical functions written in MATLAB, and for advanced data analytics that combine queries and user-defined functions (UDFs). The results point to a promising new direction for integrating advanced data analytics into database systems by using a holistic compilation approach and exploiting a wide range of compiler optimization techniques.