Smaller Split L-1 Data Caches for Multi-core Processing Systems

Pervasive Systems, Algorithms, and Networks (2009)

Abstract
As more cores (processing elements) are included in a single chip, it is likely that per-core L-1 caches will become smaller while more cores share L-2 cache resources. It therefore becomes more critical to improve the utilization of L-1 caches and to minimize sharing conflicts in L-2 caches. In our prior work we showed that using smaller but separate L-1 array and L-1 scalar data caches, instead of a single larger L-1 data cache, can lead to significant performance improvements. In this paper we extend those experiments by varying cache design parameters, including block size, associativity, and number of sets, for the L-1 array and L-1 scalar caches. We also present the effect of separate array and scalar caches on the non-uniform accesses to different L-1 cache sets that are exhibited when a single L-1 data cache is used. For this purpose we use the third and fourth standardized central moments (skewness and kurtosis), which characterize the access patterns. Our experiments show that for several embedded benchmarks (from MiBench) split data caches significantly mitigate the problem of non-uniform accesses to cache sets, leading to more uniform utilization of cache resources, fewer conflicts at cache sets, and fewer hot spots in the cache. They also show that neither higher set-associativities nor large block sizes are necessary with split cache organizations.
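As a rough illustration of the moment-based metric described above (a minimal sketch, not the authors' simulator; the per-set access counts used here are hypothetical), the following Python snippet computes the skewness and kurtosis of per-set access counts, so that a unified data cache with a few hot sets can be compared against a flatter, split-cache utilization profile:

```python
# Minimal sketch: compute the third and fourth standardized central moments
# (skewness and kurtosis) of per-set access counts, the statistics the paper
# uses to characterize how uniformly cache sets are accessed.
# The example access counts below are made up for illustration only.

import statistics


def set_access_moments(access_counts):
    """Return (skewness, kurtosis) of per-set access counts.

    Near-uniform access across sets gives skewness close to 0; large
    positive skewness/kurtosis indicate a few heavily used ("hot") sets
    while most sets remain cold.
    """
    n = len(access_counts)
    mean = statistics.fmean(access_counts)
    # Population central moments of order 2, 3, and 4.
    m2 = sum((x - mean) ** 2 for x in access_counts) / n
    m3 = sum((x - mean) ** 3 for x in access_counts) / n
    m4 = sum((x - mean) ** 4 for x in access_counts) / n
    if m2 == 0:  # every set accessed equally often
        return 0.0, 0.0
    skewness = m3 / m2 ** 1.5
    kurtosis = m4 / m2 ** 2
    return skewness, kurtosis


if __name__ == "__main__":
    # Hypothetical counts: a unified data cache with a few hot sets versus
    # a split array/scalar organization with flatter set utilization.
    unified = [900, 850, 40, 30, 25, 20, 20, 15]
    split = [210, 220, 230, 190, 205, 215, 200, 210]
    print("unified skew/kurt:", set_access_moments(unified))
    print("split   skew/kurt:", set_access_moments(split))
```

In this sketch, lower skewness and kurtosis for the split configuration would correspond to the more uniform cache-set utilization the abstract reports.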
Keywords
multi-core processing systems, cache resource, cache design parameter, L-1 data cache, data cache, scalar cache, L-2 cache, L-2 cache resource, L-1 cache, L-1 scalar cache, L-1 scalar data cache, smaller split L-1 data, hot spot, data mining, shape, cache memory, benchmark testing, chip