Towards a Peer-to-Peer Data Distribution Layer for Efficient and Collaborative Resource Optimization of Distributed Dataflow Applications
CoRR (2023)
Abstract
Performance modeling can help to improve the resource efficiency of clusters
and distributed dataflow applications, yet the available modeling data is often
limited. Collaborative approaches to performance modeling, characterized by the
sharing of performance data or models, have been shown to improve resource
efficiency, but there has been little focus on actual data sharing strategies
and implementation in production environments. This missing building block
holds back the realization of proposed collaborative solutions.
In this paper, we envision, design, and evaluate a peer-to-peer performance
data sharing approach for collaborative performance modeling of distributed
dataflow applications. Our proposed data distribution layer enables access to
performance data in a decentralized manner, thereby facilitating collaborative
modeling approaches and allowing for improved prediction capabilities and hence
increased resource efficiency. In our evaluation, we assess our approach with
regard to deployment, data replication, and data validation, through
experiments with a prototype implementation and simulation, demonstrating
feasibility and allowing discussion of potential limitations and next steps.
Keywords
Scalable Data Analytics, Distributed Dataflows, Performance Modeling, Data Sharing, Resource Management