Modularis: Modular Data Analytics for Hardware, Software, and Platform Heterogeneity

arXiv (2021)

Abstract
Today's data analytics displays an overwhelming diversity along many dimensions: data types, platforms, hardware acceleration, etc. As a result, system design often has to choose between depth and breadth: high efficiency for a narrow set of use cases, or generality at lower performance. In this paper, we pave the way to get the best of both worlds: we present Modularis, an execution layer for data analytics based on fine-grained, composable building blocks that are as generic and simple as possible. These building blocks are similar to traditional database operators, but at a finer granularity, so we call them sub-operators. Sub-operators can be freely and easily combined. As we demonstrate with concrete examples in the context of RDMA-based databases, Modularis' sub-operators can be combined to perform the same task as a complex, monolithic operator. Unlike a monolithic operator, however, sub-operators can be reused, can be offloaded to different layers or accelerators, and can be customized to specialized hardware. In the use cases we have tested so far, sub-operators reduce the amount of code significantly (for example, by a factor of four for a distributed, RDMA-based join) while incurring minimal performance overhead. Modularis is an order of magnitude faster on SQL-style analytics than a commonly used framework for generic data processing (Presto) and on par with a commercial cluster database (MemSQL).
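To illustrate the idea of composing small sub-operators into what would otherwise be one monolithic operator, here is a minimal single-process sketch in Python. It is not the Modularis API: the names `scan`, `partition`, `build`, `probe`, and `partitioned_join` are hypothetical and chosen only for this illustration, and the sketch ignores RDMA, distribution, and hardware offloading entirely.

```python
from collections import defaultdict
from typing import Callable, Dict, Iterable, Iterator, List, Tuple

Row = Tuple[int, str]          # (join key, payload); an assumed toy schema
KeyFn = Callable[[Row], int]


def scan(rows: Iterable[Row]) -> Iterator[Row]:
    """Sub-operator: emit rows from an in-memory source one at a time."""
    yield from rows


def partition(rows: Iterable[Row], num_parts: int, key: KeyFn) -> List[List[Row]]:
    """Sub-operator: split rows into num_parts buckets by hashing the key."""
    parts: List[List[Row]] = [[] for _ in range(num_parts)]
    for row in rows:
        parts[hash(key(row)) % num_parts].append(row)
    return parts


def build(rows: Iterable[Row], key: KeyFn) -> Dict[int, List[Row]]:
    """Sub-operator: build a hash table mapping each key to its rows."""
    table: Dict[int, List[Row]] = defaultdict(list)
    for row in rows:
        table[key(row)].append(row)
    return table


def probe(rows: Iterable[Row], table: Dict[int, List[Row]], key: KeyFn) -> Iterator[Tuple[int, str, str]]:
    """Sub-operator: look up each row in the hash table and emit matches."""
    for row in rows:
        for match in table.get(key(row), []):
            yield (key(row), match[1], row[1])


def partitioned_join(left: Iterable[Row], right: Iterable[Row], num_parts: int = 2) -> List[Tuple[int, str, str]]:
    """A join expressed purely as a composition of the sub-operators above."""
    key: KeyFn = lambda row: row[0]
    left_parts = partition(scan(left), num_parts, key)
    right_parts = partition(scan(right), num_parts, key)
    result: List[Tuple[int, str, str]] = []
    # Rows with equal keys land in the same partition index on both sides,
    # so each pair of partitions can be joined independently.
    for lpart, rpart in zip(left_parts, right_parts):
        result.extend(probe(rpart, build(lpart, key), key))
    return result


left = [(1, "a"), (2, "b"), (3, "c")]
right = [(1, "x"), (3, "y"), (3, "z"), (4, "w")]
joined = sorted(partitioned_join(left, right))
```

Each function does one small job and knows nothing about the others, so, under these assumptions, a stage such as `partition` could be replaced by a different implementation (for instance one that ships buckets over a network) without touching `build` or `probe`.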
Keywords
modularis data analytics,platform heterogeneity,hardware