OpenFAM: A Library for Programming Disaggregated Memory

Sharad Singhal, Clarete R. Crasta, Mashood Abdulla, Faizan Barmawer, Dave Emberson, Ramya Ahobala, Gautham Bhat, Rishi Kesh K. Rajak, P. N. Soumya

OPENSHMEM AND RELATED TECHNOLOGIES: OPENSHMEM IN THE ERA OF EXASCALE AND SMART NETWORKS (2022)

Abstract
HPC architectures are increasingly handling workloads such as AI/ML or high-performance data analytics, where the working data set cannot be easily partitioned or does not fit into node-local memory. This poses challenges for programming models such as OpenSHMEM, which require data in the working set to fit in the symmetric heap. Emerging fabric-attached memory (FAM) architectures enable data to be held in external memory accessible to all compute nodes, thus providing a new approach to handling large data sets. Unfortunately, most HPC libraries do not currently support FAM, so programmers use file system or key-value store abstractions to access data that is resident off-node. The deep software stack these abstractions place in the data path lowers application performance. The OpenFAM API treats data in FAM as memory-resident, and provides memory management and data operation APIs patterned after OpenSHMEM. In this paper, we discuss the design of an open-source reference implementation of the API, and demonstrate its efficiency using micro-benchmarks on a 32-node EDR InfiniBand cluster. We conclude with a discussion of future work and relation to OpenSHMEM.
Keywords
Fabric attached memory, Programming API, Disaggregated memory, OpenFAM implementation