RetroDMR: Troubleshooting non-deterministic faults with retrospective DMR.

DATE(2017)

Abstract
The faults hardest to diagnose in post-silicon validation are those that manifest non-deterministically under system-level functional tests: errors appear sporadically even when the same workload is applied. In this work, we propose RetroDMR, a novel diagnostic framework that uses dual-modular redundancy (DMR) to troubleshoot non-deterministic faults. Specifically, we log the essential events of the faulty run (e.g., the thread migration sequence) to record the mapping between threads and the execution units on which they ran. In the subsequent diagnosis runs, we apply redundant multithreading (RMT) to reduce error detection latency while following the thread migration sequence of the original run whenever possible. As our experimental results demonstrate, RetroDMR significantly improves the reproduction rate and diagnosis resolution for non-deterministic faults.
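The log-then-replay idea in the abstract can be illustrated with a small toy model. This is only a sketch under stated assumptions, not the paper's implementation: `MigrationEvent`, `core_at`, `diagnose`, and `run_step` are hypothetical names, execution is modeled as discrete steps, and the faulty unit is made to fail deterministically so the example stays reproducible (a real non-deterministic fault would fire only occasionally).

```python
from dataclasses import dataclass
from typing import Callable, List, Optional, Tuple


@dataclass(frozen=True)
class MigrationEvent:
    """One logged event from the faulty run (hypothetical record layout)."""
    step: int    # logical step at which the migration took effect
    thread: int  # thread identifier
    core: int    # execution unit the thread moved to


def core_at(log: List[MigrationEvent], thread: int, step: int,
            default_core: int = 0) -> int:
    """Return the core the thread occupied at `step` according to the log.

    The log is assumed to be sorted by step.
    """
    core = default_core
    for ev in log:
        if ev.thread == thread and ev.step <= step:
            core = ev.core
    return core


def diagnose(log: List[MigrationEvent], thread: int, steps: int,
             run_step: Callable[[int, int], int],
             spare_core: int) -> Optional[Tuple[int, int]]:
    """Replay the thread on its logged cores with a redundant copy.

    The leading copy follows the logged thread-to-core mapping; the trailing
    copy runs on `spare_core`. Outputs are compared after every step, so a
    mismatch is caught at the step where it occurs. Returns (step, core) for
    the first divergence, or None if the two copies always agree.
    """
    for step in range(steps):
        core = core_at(log, thread, step)
        leading = run_step(core, step)
        trailing = run_step(spare_core, step)
        if leading != trailing:
            return step, core
    return None


def run_step(core: int, step: int) -> int:
    """Toy workload: core 2 is the assumed faulty unit and flips one bit."""
    value = step * step
    return value ^ 1 if core == 2 else value


# Logged faulty run: thread 7 starts on core 0, moves to core 2, then core 1.
log = [MigrationEvent(0, 7, 0), MigrationEvent(3, 7, 2), MigrationEvent(6, 7, 1)]
result = diagnose(log, thread=7, steps=8, run_step=run_step, spare_core=3)
print(result)
```

In this toy setting the replay reaches the suspect unit because it follows the logged mapping, and the per-step comparison localizes the mismatch to the step and core where the two copies first disagree.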
Keywords
RetroDMR, nondeterministic faults, retrospective DMR, post-silicon validation, system-level functional tests, diagnostic framework, dual-modular redundancy, nondeterministic fault troubleshooting, thread migration sequence, redundant multithreading, RMT, error detection latency, reproduction rate