Modeling Advanced Collective Communication Algorithms On Cell-Based Systems
PPoPP (2010)
Abstract
This paper presents and validates performance models for a variety of high-performance collective communication algorithms on systems with Cell processors. The systems modeled include a single Cell processor, two Cell chips on a Cell Blade, and a cluster of Cell Blades. The models extend PLogP, the well-known point-to-point performance model, by accounting for the unique hardware characteristics of the Cell (e.g., heterogeneous interconnects and DMA engines) and by applying the model to collective communication. This paper also presents a micro-benchmark suite that accurately measures the extended PLogP parameters on the Cell Blade, and then uses these parameters to model different algorithms for the barrier, broadcast, reduce, all-reduce, and all-gather collective operations. Of 425 total performance predictions, 398 have less than 10% error relative to the actual execution time, and all have less than 15% error.
Keywords
Algorithms, Performance, Measurement, Collective communication, Modeling