Predicting and reining in application-level slowdown on spatial multitasking GPUs

Journal of Parallel and Distributed Computing(2020)

引用 15|浏览97
暂无评分
摘要
Predicting performance degradation of a GPU application at co-location on a spatial multitasking GPU without prior application knowledge is essential in public Clouds. Prior work mainly targets CPU co-location, and is inaccurate and/or inefficient for predicting performance of applications at co-location on spatial multitasking GPUs. Our investigation shows that hardware event statistics caused by co-located applications strongly correlate with their slowdowns. Based on this observation, we present Themis with a kernel slowdown model (Themis-KSM), which performs precise and efficient online application slowdown prediction without prior application knowledge. The kernel slowdown model is trained offline. When new applications co-run, Themis-KSM collects event statistics and predicts their slowdowns simultaneously. In addition, we also propose a two-stage slowdown prediction mechanism (Themis-TSP) for real-system GPUs without any hardware modification. Our evaluation shows that Themis has negligible runtime overhead, and both Themis-KSM and Themis-TSP can precisely predict application-level slowdown with prediction error smaller than 9.5% and 12.8%, respectively. Based on Themis, we also implement an SM allocation engine to rein in application slowdown at co-location. Case studies show that the engine successfully enforces fair sharing and QoS.
更多
查看译文
关键词
Spatial Multitasking GPU,Sharing GPU,Co-location,Slowdown prediction,Performance prediction
AI 理解论文
溯源树
样例
生成溯源树,研究论文发展脉络
Chat Paper
正在生成论文摘要