Computing Resource Allocation for Heterogeneous Coded Distributed Computing

2022 31st Wireless and Optical Communications Conference (WOCC) (2022)

Abstract
Multiplying matrices with millions of dimensions places heavy demands on the computing power and storage space of a single node. Coded Distributed Computing (CDC) addresses this problem by dividing large matrices into smaller ones and assigning them to the machines of a computing cluster, which perform the multiplication in parallel. Because real clusters are usually composed of heterogeneous workers with different computing capabilities, Coded Elastic Computing (CEC) has been proposed to overcome the performance limits of CDC schemes that assume homogeneous computing power. However, when an elastic event occurs, existing CEC schemes discard the information already received and start a new round of computation from scratch, which wastes computing time and resources. In this paper, we propose to use the received information to redesign the allocation scheme. We first identify the machine that went offline and the data segments it should have returned; these are the missing part of the decoding input and must be recomputed. We then count the total amount of lost data for each data segment and calculate the task load that each machine should undertake. Finally, the task load actually assigned to each machine is obtained by solving a system of linear equations. Experiments show that, compared with the original scheme, the proposed allocation scheme saves resources and time and speeds up computation.
Keywords
Coded Distributed Computing, Heterogeneous Network, Coded Elastic Computing
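
The reallocation steps described in the abstract can be illustrated with a minimal Python sketch. It is not the paper's algorithm: the abstract does not state the linear system, so the sketch assumes the simplest one, in which every surviving machine i receives a load x_i such that x_i / speed_i is the same finish time T for all machines and the loads sum to the lost work. That system has the closed-form solution x_i = W * speed_i / (sum of speeds). The names `assignment`, `speeds`, `lost_per_segment`, and `proportional_loads` are hypothetical.

```python
from fractions import Fraction


def lost_per_segment(assignment, offline):
    """Count, for each data segment, how many results were lost with the offline machines."""
    lost = {}
    for machine in offline:
        for segment in assignment[machine]:
            lost[segment] = lost.get(segment, 0) + 1
    return lost


def proportional_loads(total_work, speeds):
    """Split total_work so that all machines finish at the same time.

    Assumed model: x_i / speed_i = T for every machine i, and sum(x_i) = total_work.
    Solving this linear system gives x_i = total_work * speed_i / sum(speeds).
    """
    total_speed = sum(speeds.values())
    return {
        machine: Fraction(total_work) * Fraction(speed) / total_speed
        for machine, speed in speeds.items()
    }


# Hypothetical cluster: each machine was assigned two data segments.
assignment = {"m1": [0, 1], "m2": [1, 2], "m3": [2, 3], "m4": [3, 0]}
offline = ["m2"]  # machine lost in the elastic event

lost = lost_per_segment(assignment, offline)
total_lost = sum(lost.values())

# Relative computing speeds of the surviving machines (assumed values).
speeds = {"m1": 1, "m3": 2, "m4": 1}
loads = proportional_loads(total_lost, speeds)

for machine, load in loads.items():
    print(machine, load)
```

In this toy setting only the work lost with `m2` is redistributed, and the faster machine `m3` takes a proportionally larger share, which mirrors the idea of reusing received results instead of restarting the whole computation.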