Balanced loop retiming to effectively architect STT-RAM-based hybrid cache for VLIW processors.

SAC 2016: Symposium on Applied Computing, Pisa, Italy, April 2016

Abstract
Loop retiming has been extensively studied as a way to maximize the instruction-level parallelism (ILP) of multiple function units by rearranging the dependence delays in a uniform loop. Recently, loop retiming has also been proposed to mitigate the migration overhead of STT-RAM-based hybrid caches by changing the interleaved read and write memory access pattern. However, the previous ILP-aware loop retiming is unaware of its impact on migration in the hybrid cache, while the migration-aware loop retiming does not fully consider the parallelism of arithmetic and logical units (ALUs) in VLIW processors. This paper models the impact of loop retiming on both the ILP of ALUs and the migration overhead in an STT-RAM-based hybrid cache. A balanced loop retiming solution that considers both the ALU part and the memory part is devised to achieve high performance for VLIW processors. Experimental results across a set of benchmarks show that the proposed balanced retiming approach improves performance by 13.1%, 3.5%, and 18.1% on average over no retiming, pure migration-aware retiming, and pure ILP-aware retiming, respectively.
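
To make "rearranging the dependence delays" concrete, the following is a minimal sketch of the basic retiming transformation on a loop's data-flow graph. It is not the balanced algorithm of the paper. It assumes the convention, common in the loop retiming literature, that an edge u -> v with delay d receives the new delay d + r(u) - r(v). The three-node graph (a load `L`, an ALU operation `A`, a store `S`) and the retiming values are hypothetical.

```python
from typing import Dict, Tuple

Edge = Tuple[str, str]  # (source node, destination node)


def retime(delays: Dict[Edge, int], r: Dict[str, int]) -> Dict[Edge, int]:
    """Apply a retiming r to the edge delays of a loop's data-flow graph.

    Assumed convention: d_r(u -> v) = d(u -> v) + r(u) - r(v).
    A retiming is legal only if no edge ends up with a negative delay.
    """
    retimed = {}
    for (u, v), d in delays.items():
        new_d = d + r.get(u, 0) - r.get(v, 0)
        if new_d < 0:
            raise ValueError(f"illegal retiming: edge {u}->{v} gets delay {new_d}")
        retimed[(u, v)] = new_d
    return retimed


# Hypothetical loop body: load L feeds ALU op A, A feeds store S,
# and S feeds L two iterations later (a loop-carried dependence).
delays = {("L", "A"): 0, ("A", "S"): 0, ("S", "L"): 2}

# Hypothetical retiming: move the load one iteration ahead.
r = {"L": 1, "A": 0, "S": 0}

retimed = retime(delays, r)

# Count the dependences that remain inside a single iteration.
zero_before = sum(1 for d in delays.values() if d == 0)
zero_after = sum(1 for d in retimed.values() if d == 0)
```

In this sketch the total delay around the cycle is unchanged, while the number of zero-delay (intra-iteration) dependences drops from two to one, which is what gives a scheduler more freedom to overlap operations. The same transformation also changes the order in which loads and stores appear in the loop body, which is the lever the migration-aware approach described above relies on.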