Automatic task mapping and heterogeneity-aware fault tolerance: The benefits for runtime optimization and application development

Journal of Systems Architecture (2015)

Abstract
The best mapping of a task to one or more processing units in a heterogeneous system depends on multiple variables. Several runtime-system-based approaches have been proposed that automatically determine the best mapping under given circumstances. Some of them also consider dynamic events, such as varying problem sizes or resource competition, that may change the best mapping during application runtime, but only a few consider that task execution may fail. While aging and overheating are well-known causes of sudden faults, ongoing miniaturization and the growing complexity of heterogeneous computing are expected to create further threats to successful application execution. However, if properly incorporated, heterogeneous systems also offer the opportunity to recover from different types of faults in hardware as well as in software. In this work, we propose combining the two topics, dynamic performance-oriented task mapping and dependability, to leverage this opportunity. As we will show, this combination not only enables tolerating hardware and software faults with minor assistance from the developer; through a new metric and automatic data management, it also benefits application development itself and application performance when faults occur.
Keywords
Heterogeneous computing, Accelerators, Fault-tolerant systems, Runtime systems