Do Nothing, But Carefully: Fault Tolerance with Timing Guarantees for Multiprocessor Systems Devoid of Online Adaptation

2018 IEEE 23rd Pacific Rim International Symposium on Dependable Computing (PRDC), 2018

Abstract
Many practical real-time systems must withstand reliability threats from their physical environment, such as transient faults, that cause short-term abnormal system behavior. To cope with such changes in system behavior, online adaptations, which may introduce high computational overhead, are often performed to ensure the timeliness of the more important tasks, while no guarantees are provided for the less important tasks. In this work, we propose a system model that requires no online adaptation but, following the concept of dynamic real-time guarantees, provides either full or limited timing guarantees, depending on the system behavior. Under normal system behavior, timeliness is guaranteed for all tasks; otherwise, timeliness is guaranteed only for the more important tasks, while bounded tardiness is ensured for the less important tasks. To provide such dynamic timing guarantees, we propose a suitable system model and discuss how it can be established by means of partitioned as well as semi-partitioned strategies. Moreover, we propose an approach for handling abnormal behavior of longer duration, such as intermittent faults or overheating of processors, by performing task migration to compensate for the affected system component and to increase the system's reliability. We show by comprehensive experiments that good acceptance ratios can be achieved under partitioned scheduling and that they can be further improved under semi-partitioned strategies. In addition, we demonstrate that the proposed migration techniques lead to a reasonable trade-off between the decrease in schedulability and the gain in robustness of the system. The presented approaches can also be applied to mixed-criticality systems with two criticality levels.
Keywords
Task analysis,Timing,Transient analysis,Program processors,Processor scheduling,Real-time systems,Hardware