AIIO: Using Artificial Intelligence for Job-Level and Automatic I/O Performance Bottleneck Diagnosis

HPDC(2023)

Abstract
Manually diagnosing the I/O performance bottleneck of a single application (hereinafter referred to as the "job level") is a tedious and error-prone procedure that requires domain scientists to have deep knowledge of complex storage systems. However, existing automatic methods for I/O performance bottleneck diagnosis share one major issue: they analyze at the platform or group level, so their diagnosis results cannot be applied to an individual application. To address this issue, we designed and developed a method named "Artificial Intelligence for I/O" (AIIO), which uses AI and its interpretation technology to diagnose I/O performance bottlenecks at the job level automatically. By considering the sparsity of I/O log files, employing multiple AI models for performance prediction, merging diagnosis results across multiple models, and generalizing its performance prediction and diagnosis functions, AIIO can accurately and robustly identify the bottleneck of even an unseen application. Experimental results show that real and unseen applications can use the diagnosis results from AIIO to improve their I/O performance by up to 146x.
Keywords
I/O Bottleneck, Job-Level, Diagnosis, Artificial Intelligence, Machine Learning, AI Interpretation, Prediction, Darshan