RON: One-Way Circular Shortest Routing to Achieve Efficient and Bounded-waiting Spinlocks.

Shiwu Lo, Han-Ting Lin, Yao-Hung Hsieh, Chao-Ting Lin, Yu-Hsueh Fang, Ching-Shen Lin, Ching-Chun (Jim) Huang, Kam-Yiu Lam, Yuan-Hao Chang

OSDI (2023)

Abstract
As the number of processor cores increases, the efficiency of accessing shared variables through the lock-unlock method decreases. A NUMA-aware algorithm, which only considers the transmission delay between processors, may not fully utilize the connection network of a multi-core processor. This limits the scalability of a multi-core processor due to the large amount of low- and variable-cost data sharing between cores. The problem is that the reduction in communication cost cannot compensate for the increase in the time complexity of the spinlocks, and the farthest transmission distance becomes longer with more cores. We propose a method called Routing on Network-on-chip (RON) to minimize the communication cost between cores by using a routing table and pre-calculating an optimized locking-unlocking order. RON delivers locks and data in a one-way circular manner among cores to (1) minimize global data movement cost and (2) achieve bounded waiting time. Microbenchmarks provide quantitative analysis, while multicore benchmarks show performance under various workloads. In terms of user space performance, RON improves the performance of Google LevelDB by 22.1% and 24.2% compared to ShflLock and C-BO-MCS, respectively. In the kernel space, RON is 1.8 times faster than using ShflLock for Google LevelDB. RON-plock solves the problem of oversubscription with constant space complexity and achieves 3.7 times and 18.9 times better performance than ShflLock-B and C-BO-MCS-B, respectively.
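The one-way circular handoff described in the abstract can be illustrated with a small single-threaded simulation. This is only a sketch under stated assumptions, not the authors' implementation: the precomputed core order `order`, the `waiting` set, and the helper names `next_holder` and `handoff_sequence` are all hypothetical, and no real atomics or spinning are modeled. It shows why waiting is bounded: the lock always moves forward along a fixed ring, so a waiting core is reached after at most one full pass.

```python
from typing import List, Optional, Set


def next_holder(order: List[int], current: int, waiting: Set[int]) -> Optional[int]:
    """Return the first waiting core after `current` along the one-way ring.

    `order` is an assumed, precomputed circular visiting order of core IDs.
    Returns None when no core is waiting.
    """
    n = len(order)
    start = order.index(current)
    # Walk the ring in one direction only; the last step wraps back to `current`.
    for step in range(1, n + 1):
        candidate = order[(start + step) % n]
        if candidate in waiting:
            return candidate
    return None


def handoff_sequence(order: List[int], first: int, waiting: Set[int]) -> List[int]:
    """Simulate successive lock handoffs, starting with `first` holding the lock."""
    pending = set(waiting)
    pending.discard(first)
    sequence = [first]
    current = first
    while pending:
        current = next_holder(order, current, pending)
        pending.remove(current)
        sequence.append(current)
    return sequence
```

For example, with the assumed order `[0, 2, 3, 1]`, core 3 holding the lock, and all four cores requesting it, the simulated handoff visits the cores in ring order starting from core 3, and each core is served within one pass around the ring.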