DynAMO: Improving Parallelism Through Dynamic Placement of Atomic Memory Operations

Proceedings of the 50th Annual International Symposium on Computer Architecture (ISCA 2023), 2023

Abstract
With increasing core counts in modern multi-core designs, the overhead of synchronization jeopardizes the scalability and efficiency of parallel applications. To mitigate this overhead, modern cache-coherent protocols offer support for Atomic Memory Operations (AMOs) that can be executed near-core (near) or remotely in the on-chip memory hierarchy (far). This paper evaluates the currently available static AMO execution policies implemented in multi-core System-on-Chip (SoC) designs, which select an AMO's execution placement (near or far) based on the coherence state of the cache block. We propose three static policies and show that the performance of static policies is application dependent. Moreover, we show that one of our proposed static policies outperforms currently available implementations. Furthermore, we propose DynAMO, a predictor that selects the best location to execute each AMO. DynAMO identifies the different locality patterns to make informed decisions, improving AMO latency and increasing overall throughput. DynAMO outperforms the best-performing static policy and provides geometric mean speed-ups of 1.09x across all workloads and 1.31x on AMO-intensive applications with respect to executing all AMOs near.
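To make the idea of a static, coherence-state-based placement policy concrete, the following is a minimal sketch only. The abstract does not specify the coherence protocol or the rules of any evaluated or proposed policy, so the MESI-style state names, the `static_placement` function, and the rule itself (near when the requesting core already holds the block exclusively, far otherwise) are all assumptions made for illustration, not the paper's policies.

```python
from enum import Enum


class State(Enum):
    """MESI-style coherence states (assumed; the abstract names no protocol)."""
    MODIFIED = "M"
    EXCLUSIVE = "E"
    SHARED = "S"
    INVALID = "I"


class Placement(Enum):
    NEAR = "near"  # execute the AMO at the requesting core
    FAR = "far"    # execute the AMO remotely in the on-chip memory hierarchy


def static_placement(state: State) -> Placement:
    """Hypothetical static policy, not taken from the paper.

    If the requesting core already holds the block with write permission,
    executing near needs no extra coherence traffic; otherwise send it far.
    """
    if state in (State.MODIFIED, State.EXCLUSIVE):
        return Placement.NEAR
    return Placement.FAR


if __name__ == "__main__":
    for s in State:
        print(s.name, "->", static_placement(s).value)
```

A static rule of this kind uses only the block's current coherence state. A predictor such as DynAMO, as the abstract describes it, additionally identifies locality patterns before choosing where each AMO executes.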
Keywords
multi-core architectures, microarchitecture, atomic memory operations, data placement