CaCAO: Complex and Compositional Atomic Operations for NoC-Based Manycore Platforms
Architecture of Computing Systems (2018)
Abstract
Tile-based distributed memory systems have increased the scalability of manycore platforms. However, inter-tile memory accesses, especially those used for thread synchronization, suffer from high remote access latencies. Our thorough investigations of lock-based and lock-free synchronization primitives show that there is a concurrency-dependent crossover point between them, i.e., there is no one-size-fits-all solution. Therefore, we propose to combine the conceptual advantages of both variants (no retries, as with locks, and lock-free operation) by using dedicated hardware support for inter-tile atomic operations. For frequently used and highly concurrent data structures, we show speedup factors of 23.9 and 35.4 over the lock-based and lock-free implementations, respectively, and the speedup increases with higher concurrency.
Keywords
Atomic operations, Remote synchronization, Compare-and-swap, Distributed shared memory, Network-on-Chip
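The contrast the abstract draws between lock-based and lock-free primitives can be illustrated with a minimal C++ sketch. It is not taken from the paper: it uses ordinary shared-memory `std::mutex` and `std::atomic` on a single node rather than inter-tile hardware operations, and the names `LockedCounter`, `CasCounter`, `FetchAddCounter`, and `run` are hypothetical. The third variant is only a loose analogy for an atomic operation that is both lock-free and retry-free from the caller's point of view.

```cpp
#include <atomic>
#include <cstdint>
#include <mutex>
#include <thread>
#include <vector>

// Lock-based variant: an update never retries, but threads serialize on the mutex.
struct LockedCounter {
    std::mutex m;
    std::uint64_t value = 0;
    void add(std::uint64_t delta) {
        std::lock_guard<std::mutex> guard(m);
        value += delta;
    }
};

// Lock-free variant: no lock is held, but a failed compare-and-swap forces a retry.
struct CasCounter {
    std::atomic<std::uint64_t> value{0};
    // Returns how many times the compare-and-swap had to be repeated
    // (this count includes spurious failures of the weak form).
    unsigned add(std::uint64_t delta) {
        unsigned retries = 0;
        std::uint64_t expected = value.load(std::memory_order_relaxed);
        while (!value.compare_exchange_weak(expected, expected + delta,
                                            std::memory_order_acq_rel,
                                            std::memory_order_relaxed)) {
            ++retries;  // 'expected' now holds the value observed on failure
        }
        return retries;
    }
};

// Single atomic read-modify-write: no lock and no software retry loop.
// Assumption: used here only as an analogy, not as the paper's mechanism.
struct FetchAddCounter {
    std::atomic<std::uint64_t> value{0};
    void add(std::uint64_t delta) {
        value.fetch_add(delta, std::memory_order_acq_rel);
    }
};

// Runs 'threads' workers that each call add(1) 'iters' times and
// returns the final counter value after all workers have joined.
template <typename Counter>
std::uint64_t run(unsigned threads, unsigned iters) {
    Counter c;
    std::vector<std::thread> pool;
    for (unsigned t = 0; t < threads; ++t) {
        pool.emplace_back([&c, iters] {
            for (unsigned i = 0; i < iters; ++i) c.add(1);
        });
    }
    for (auto& th : pool) th.join();
    return c.value;
}
```

Calling, for example, `run<CasCounter>(4, 1000)` yields the same total as the other two variants; what differs is the cost per update, since under contention the CAS loop may repeat while the mutex forces waiting. Each repetition or lock handover is the kind of step that becomes expensive when it requires a remote access.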