Trinity: In-Database Near-Data Machine Learning Acceleration Platform for Advanced Data Analytics

Ji-Hoon Kim, Seunghee Han, Kwanghyun Park, Soo-Young Ji, Joo-Young Kim

IEEE Access (2024)

Abstract
The ability to perform machine learning (ML) tasks in a database management system (DBMS) is a new paradigm for conventional database systems, as it enables advanced data analytics on top of the well-established capabilities of DBMSs. However, integrating ML into DBMSs introduces new challenges for traditional CPU-based systems because of its higher computational demands and larger data bandwidth requirements. To address this, hardware acceleration has become even more important in database systems, and the computational storage device (CSD), which places an accelerator near storage, is considered an effective solution due to its high processing power with no extra data movement cost. In this paper, we propose Trinity, an end-to-end database system that provides an in-database, in-storage platform for accelerating advanced analytics queries that invoke trained ML models along with complex data operations. By designing a full stack from the DBMS's internal software components to the hardware accelerator, Trinity enables in-database ML pipelines on the CSD. On the software side, we extend the internals of conventional DBMSs to utilize the accelerator in the SmartSSD. Our extended analyzer evaluates the compatibility of the current query with our hardware accelerator and compresses compatible queries into a 24-byte numeric format for efficient hardware processing. Furthermore, the predictor is extended to integrate our performance cost models so that queries are always offloaded to the optimal hardware backend. The proposed SmartSSD cost model mathematically models our hardware, including host operations, data transfers, and FPGA kernel execution time, while the CPU cost model uses polynomial regression ML models to predict complex CPU latency. On the hardware side, we introduce the in-database processing accelerator (i-DPA), a custom FPGA-based accelerator. i-DPA includes a database page decoder to fully exploit the bandwidth benefit of near-storage processing. It also employs dynamic tuple binding to enhance the overall parallelism and hardware utilization. i-DPA's architecture, which combines heterogeneous computing units with a reconfigurable on-chip interconnect, also allows seamless data streaming, enabling a task-level pipeline across different computing units. Finally, our evaluation shows that Trinity improves the end-to-end performance of analytics queries by 15.21x on average and up to 57.18x compared to the conventional CPU-based DBMS platform. We also show that Trinity's performance can scale linearly with multiple SmartSSDs, achieving up to nearly 200x speedup over the baseline with four SmartSSDs.
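The cost-model-driven offloading described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the constants, the function names (`smartssd_latency`, `fit_cpu_model`, `choose_backend`), the choice of row count as the only feature, and the profiling numbers are all hypothetical. It assumes only what the abstract states, namely that the SmartSSD cost is an analytical sum of host operations, data transfer, and FPGA kernel time, and that CPU latency is predicted by polynomial regression.

```python
import numpy as np

# Hypothetical constants for illustration; none of these values come from the paper.
HOST_OVERHEAD_S = 2e-3        # assumed fixed host-side setup time per query (seconds)
TRANSFER_BYTES_PER_S = 3e9    # assumed storage-to-accelerator bandwidth
KERNEL_ROWS_PER_S = 5e7       # assumed accelerator throughput in rows per second


def smartssd_latency(n_rows, row_bytes):
    """Analytical estimate: host operations + data transfer + kernel execution."""
    transfer = n_rows * row_bytes / TRANSFER_BYTES_PER_S
    kernel = n_rows / KERNEL_ROWS_PER_S
    return HOST_OVERHEAD_S + transfer + kernel


def fit_cpu_model(rows, latencies, degree=2):
    """Fit a polynomial regression of measured CPU latency against row count."""
    return np.poly1d(np.polyfit(rows, latencies, degree))


def choose_backend(n_rows, row_bytes, cpu_model):
    """Return the backend with the lower predicted latency, plus that latency."""
    cpu = float(cpu_model(n_rows))
    ssd = smartssd_latency(n_rows, row_bytes)
    return ("smartssd", ssd) if ssd < cpu else ("cpu", cpu)


if __name__ == "__main__":
    # Made-up CPU profiling samples: (row count, latency in seconds).
    profile_rows = [1e3, 1e4, 1e5, 1e6]
    profile_latency = [3e-4, 2.1e-3, 2.01e-2, 2.001e-1]
    cpu_model = fit_cpu_model(profile_rows, profile_latency)
    for n in (1_000, 1_000_000):
        print(n, choose_backend(n, 100, cpu_model))
```

Under these assumed numbers, the fixed host overhead dominates for small inputs, so small queries stay on the CPU, while large scans favor the SmartSSD. This is the kind of trade-off a cost-model-based selector is meant to capture.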
Keywords
Data analysis, Computer architecture, Field programmable gate arrays, Mathematical models, Hardware acceleration, Database systems, Predictive models, Computational storage device, database, data analytics, end-to-end system, hardware accelerator, machine learning, near-data processing, SmartSSD