REMEDIATE: A Scalable Fault-Tolerant Architecture for Low-Power NUCA Cache in Tiled CMPs

Green Computing Conference (2013)

Abstract
Technology scaling and process variation severely degrade the reliability of Chip Multiprocessors (CMPs), especially their large cache blocks. To improve cache reliability, we propose REMEDIATE, a scalable fault-tolerant architecture for low-power design of shared Non-Uniform Cache Access (NUCA) caches in tiled CMPs. REMEDIATE achieves fault tolerance through redundancy drawn from multiple banks, which maximizes the amount of fault remapping and minimizes the cache capacity lost when the failure rate is high. REMEDIATE leverages a scalable fault protection technique that uses two different remapping heuristics in a distributed shared cache architecture with non-uniform latencies. We deploy a graph coloring algorithm to optimize REMEDIATE's remapping configuration. We perform an extensive design space exploration of operating voltage, performance, and power that enables designers to select different operating points and evaluate their design efficacy. Experimental results on a 4x4 tiled CMP system voltage-scaled to below 400 mV show that REMEDIATE saves up to 50% power while recovering more than 80% of the faulty cache area with only modest performance degradation.
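The abstract does not say how the graph coloring problem is formulated, so the following is only a minimal sketch of one plausible reading, not the authors' algorithm. It assumes that each faulty cache block is described by the set of its faulty sub-block indices, that two blocks "conflict" when those sets overlap, and that blocks given the same color can share one remapping group. The names `build_conflict_graph`, `greedy_coloring`, and `remap_groups`, the conflict rule, and the block labels are all hypothetical.

```python
from typing import Dict, List, Set


def build_conflict_graph(fault_maps: Dict[str, Set[int]]) -> Dict[str, Set[str]]:
    """Connect two blocks when they share a faulty sub-block index.

    Assumption: overlapping fault positions mean the blocks cannot cover
    for each other, so they must not land in the same remapping group.
    """
    blocks = sorted(fault_maps)
    graph: Dict[str, Set[str]] = {b: set() for b in blocks}
    for i, a in enumerate(blocks):
        for b in blocks[i + 1:]:
            if fault_maps[a] & fault_maps[b]:
                graph[a].add(b)
                graph[b].add(a)
    return graph


def greedy_coloring(graph: Dict[str, Set[str]]) -> Dict[str, int]:
    """Color nodes greedily, highest degree first (ties broken by name)."""
    colors: Dict[str, int] = {}
    for node in sorted(graph, key=lambda n: (-len(graph[n]), n)):
        used = {colors[m] for m in graph[node] if m in colors}
        color = 0
        while color in used:
            color += 1
        colors[node] = color
    return colors


def remap_groups(fault_maps: Dict[str, Set[int]]) -> List[List[str]]:
    """Group blocks by color; each group has pairwise-disjoint fault sets."""
    colors = greedy_coloring(build_conflict_graph(fault_maps))
    groups: Dict[int, List[str]] = {}
    for block, color in colors.items():
        groups.setdefault(color, []).append(block)
    return [sorted(members) for _, members in sorted(groups.items())]


if __name__ == "__main__":
    # Hypothetical fault maps: block label -> faulty sub-block indices.
    example = {
        "bank0/blk3": {0, 1},
        "bank1/blk7": {1, 2},
        "bank2/blk1": {3},
        "bank3/blk5": {0},
    }
    for group in remap_groups(example):
        print(group)
```

Greedy coloring is a heuristic and does not guarantee the minimum number of groups; it is used here only because it is short and deterministic. Fewer colors would mean fewer groups and, under the stated assumption, less capacity sacrificed to remapping.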
Keywords
Fault-tolerant cache, Remapping, Aggressive voltage scaling