Supporting Massive DLRM Inference through Software Defined Memory

Ehsan K. Ardestani, Changkyu Kim, Seung Jae Lee, Luoshang Pan, Valmiki Rampersad, Jens Axboe, Banit Agrawal, Fuxun Yu, Ansha Yu, Trung Le, Hector Yuen, Shishir Juluri, Akshat Nanda, Manoj Wodekar, Dheevatsa Mudigere, Krishnakumar Nair, Maxim Naumov, Chris Peterson, Mikhail Smelyanskiy, Vijay Rao

2022 IEEE 42nd International Conference on Distributed Computing Systems (ICDCS) (2022)

Citations: 7 | Views: 57
Abstract
Deep Learning Recommendation Models (DLRM) are widespread, account for a considerable data center footprint, and grow by more than 1.5x per year. With model sizes soon reaching the terabyte range, leveraging Storage Class Memory (SCM) for inference enables lower power consumption. This paper evaluates the major challenges in extending the memory hierarchy to SCM for DLRM, and presents different techniques to improve performance through a Software Defined Memory. We show how underlying technologies such as NAND Flash and 3DXP differ, relate them to real-world scenarios, and demonstrate power savings of 5% to 29%.
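The core idea the abstract describes, serving terabyte-scale embedding tables by extending the memory hierarchy to SCM, can be illustrated with a toy two-tier lookup. The sketch below is illustrative only and is not taken from the paper: it caches "hot" embedding rows in DRAM (an LRU cache) while the full table resides on a slower SCM tier, simulated here by a plain dict; all class names, sizes, and the access pattern are assumptions.

```python
from collections import OrderedDict
import random

class TieredEmbeddingTable:
    """Toy model of a software-defined memory hierarchy for DLRM
    embeddings: hot rows are cached in DRAM, while the full table
    lives on slower SCM (simulated here by a plain dict).
    Illustrative only; not the paper's implementation."""

    def __init__(self, num_rows, dim, dram_capacity):
        # Simulated SCM backing store: row id -> embedding vector.
        self.scm = {i: [float(i)] * dim for i in range(num_rows)}
        self.dram = OrderedDict()          # LRU cache of hot rows
        self.dram_capacity = dram_capacity
        self.hits = 0
        self.misses = 0

    def lookup(self, row_id):
        if row_id in self.dram:
            self.hits += 1
            self.dram.move_to_end(row_id)  # refresh LRU position
            return self.dram[row_id]
        # DRAM miss: fetch from SCM and promote into the cache.
        self.misses += 1
        vec = self.scm[row_id]
        self.dram[row_id] = vec
        if len(self.dram) > self.dram_capacity:
            self.dram.popitem(last=False)  # evict the coldest row
        return vec

table = TieredEmbeddingTable(num_rows=1000, dim=4, dram_capacity=8)
# A skewed access pattern (a few hot ids) keeps the DRAM hit rate high,
# which is what makes an SCM tier viable for inference latency.
random.seed(0)
for _ in range(100):
    table.lookup(random.choice([1, 2, 3]))
print(table.hits, table.misses)  # misses occur only on first touch of each id
```

Because recommendation traffic is typically highly skewed, almost all lookups hit the small DRAM tier, and only cold rows pay the SCM access cost, which is the property that makes the power savings in the abstract attainable.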
Keywords
DLRM, Hierarchical Memory, Software Defined Memory, Recommendation Models, Inference