Analyzing the I/O Patterns of Deep Learning Applications

JCC-BD&ET(2021)

Abstract
A traditional HPC storage system is designed to manage an I/O workload dominated by bursts of write operations, mainly from applications that run simulations and checkpoint partial results. Today this workload is more diverse because of artificial intelligence applications such as machine learning (ML) and deep learning (DL). As ML/DL applications become more compute-intensive, they require the power of HPC systems. However, the HPC I/O system can become a bottleneck when scaling this kind of application, mainly in the training stage. In this paper, we present a methodology for analyzing the I/O patterns of deep learning applications that allows us to understand their I/O behavior on HPC systems. We have applied our approach to serial and distributed DL codes using the TensorFlow 2 and PyTorch frameworks with the MNIST and CIFAR-10 datasets.
Keywords
Deep learning, I/O HPC, I/O patterns, Distributed DL