How to Manage High-Bandwidth Memory Automatically

SPAA '20: 32nd ACM Symposium on Parallelism in Algorithms and Architectures, Virtual Event, USA, July 2020

Abstract
This paper develops an algorithmic foundation for automated management of the multilevel-memory systems common to new supercomputers. In particular, the High-Bandwidth Memory (HBM) of these systems has latency similar to that of DRAM and a smaller capacity, but much larger bandwidth. Systems equipped with HBM do not fit classic memory-hierarchy models due to HBM's atypical characteristics. Unlike caches, which are generally managed automatically by the hardware, programmers of some current HBM-equipped supercomputers can choose to explicitly manage HBM themselves. This process is problem-specific and resource-intensive. Vendors offer this option because there is no consensus on how to automatically manage HBM to guarantee good performance, or whether this is even possible. In this paper, we give theoretical support for automatic HBM management by developing simple algorithms that can automatically control HBM and deliver good performance on multicore systems. HBM management is starkly different from traditional caching, both in terms of optimization objectives and algorithm development. Since DRAM and HBM have similar latencies, minimizing HBM misses turns out (provably) not to be the right memory-management objective. Instead, we directly focus on minimizing makespan. In addition, while cache-management algorithms need only decide which pages to keep in cache, HBM management requires answering two questions: (1) which pages to keep in HBM and (2) how to use the limited bandwidth between HBM and DRAM. It turns out that the natural approach of using LRU for the first question and FCFS (First-Come-First-Serve) for the second question is provably bad. Instead, we provide a priority-based approach that is simple, efficiently implementable, and $O(1)$-competitive for makespan when all multicore threads are independent.
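The abstract does not spell out the paper's algorithm, but the contrast it draws (LRU + FCFS versus a priority-based policy for both residency and bandwidth) can be illustrated with a minimal sketch. The sketch below is an assumption-laden toy, not the paper's construction: the class name, the fixed per-thread priorities, and the specific eviction rule are all hypothetical, chosen only to show how a single priority order could drive both decisions, which pages stay in HBM and which pending DRAM-to-HBM transfer is served next.

```python
import heapq
from collections import OrderedDict


class PriorityHBMManager:
    """Toy sketch of a priority-based HBM policy (illustrative only).

    Each thread gets a fixed priority (lower number = served first).
    On an HBM miss, the transfer request enters a priority queue for the
    DRAM channel (instead of FCFS), and the eviction victim is a page
    owned by the lowest-priority resident thread (instead of plain LRU).
    """

    def __init__(self, hbm_capacity, thread_priority):
        self.capacity = hbm_capacity
        self.priority = thread_priority      # thread id -> priority value
        self.resident = OrderedDict()        # page -> owning thread (insertion order ~ recency)
        self.dram_queue = []                 # min-heap of (priority, seq, page, thread)
        self.seq = 0

    def access(self, thread, page):
        """Record one page access by a thread; return 'hit' or 'miss'."""
        if page in self.resident:
            self.resident.move_to_end(page)  # HBM hit: refresh recency
            return "hit"
        # HBM miss: queue the DRAM->HBM transfer by thread priority, not arrival order.
        heapq.heappush(self.dram_queue, (self.priority[thread], self.seq, page, thread))
        self.seq += 1
        return "miss"

    def serve_one_transfer(self):
        """Model one unit of DRAM->HBM bandwidth: serve the highest-priority pending miss."""
        if not self.dram_queue:
            return None
        _, _, page, thread = heapq.heappop(self.dram_queue)
        if len(self.resident) >= self.capacity:
            self._evict()
        self.resident[page] = thread
        return page

    def _evict(self):
        # Victim: the oldest resident page whose owning thread has the worst priority.
        victim = max(self.resident, key=lambda p: self.priority[self.resident[p]])
        del self.resident[victim]


# Hypothetical usage: two independent threads, thread 0 has higher priority.
mgr = PriorityHBMManager(hbm_capacity=2, thread_priority={0: 0, 1: 1})
mgr.access(0, "a")            # miss: queued for transfer
mgr.access(1, "b")            # miss: queued for transfer
print(mgr.serve_one_transfer())  # "a" is fetched first, regardless of arrival order
```

The point of the sketch is only the structural difference the abstract highlights: both the residency decision and the bandwidth decision consult one priority order, rather than pairing recency (LRU) with arrival order (FCFS).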
Keywords
Paging, High-bandwidth memory, Scheduling, Multicore paging, Online algorithms, Approximation algorithms