SpotSDC: Revealing the Silent Data Corruption Propagation in High-Performance Computing Systems

IEEE Transactions on Visualization and Computer Graphics (2021)

Abstract
The trend of rapid technology scaling is expected to make the hardware of high-performance computing (HPC) systems more susceptible to computational errors due to random bit flips. Some bit flips may cause a program to crash or have a minimal effect on the output, but others may lead to silent data corruption (SDC), i.e., undetected yet significant output errors. Classical fault injection analysis...
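To make the distinction between a negligible bit flip and a significant one concrete, here is a minimal sketch in Python. It is not taken from the paper or from the SpotSDC tool. The helper name `flip_bit` is hypothetical, and the sketch assumes values are stored as IEEE 754 double-precision floats, which is an illustrative choice rather than something the abstract states.

```python
import struct


def flip_bit(value: float, bit: int) -> float:
    """Return `value` with one bit of its 64-bit IEEE 754 encoding flipped.

    Hypothetical helper for illustration only. Bit 0 is the least
    significant mantissa bit and bit 63 is the sign bit.
    """
    if not 0 <= bit < 64:
        raise ValueError("bit must be in the range [0, 63]")
    # Reinterpret the double as an unsigned 64-bit integer.
    (encoded,) = struct.unpack("<Q", struct.pack("<d", value))
    # Flip the chosen bit, then reinterpret the integer as a double again.
    (corrupted,) = struct.unpack("<d", struct.pack("<Q", encoded ^ (1 << bit)))
    return corrupted


original = 1.0
mantissa_flip = flip_bit(original, 0)   # lowest mantissa bit: tiny change
exponent_flip = flip_bit(original, 52)  # lowest exponent bit: value halves
sign_flip = flip_bit(original, 63)      # sign bit: value is negated

print(mantissa_flip - original)  # difference of 2**-52
print(exponent_flip)             # 0.5
print(sign_flip)                 # -1.0
```

In this sketch, the mantissa flip changes the value by about 2.2e-16, which most computations would absorb, while the exponent and sign flips yield plausible-looking but wrong numbers without raising any error. That second kind of outcome is what the abstract calls silent data corruption.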
Keywords
Data visualization, Transient analysis, Resilience, Tools, Analytical models, Hardware, Computer crashes