A Machine Learning Approach For Productive Data Locality Exploitation In Parallel Computing Systems

2019 19th IEEE/ACM International Symposium on Cluster, Cloud and Grid Computing (CCGRID), 2019

Abstract
Data locality is critical in programming distributed-memory architectures because of its implications for latency and energy consumption. Automated compiler and runtime system optimization studies have attempted to improve data locality exploitation without burdening the programmer. However, the efficacy of automated optimizations is limited by the difficulty of static code analysis, the conservatism compilers adopt to avoid errors, and the cost of dynamic analysis. As a result, programmers must spend significant effort optimizing locality. In this work, we present an automated code optimization framework that trains neural networks using application profiles collected at small data sizes that exhibit access patterns similar to those of larger cases. The application is then modified to use the neural network to improve data locality exploitation. We prototype our framework for the Chapel language and integrate it with the language stack. We experimentally demonstrate that our framework can learn access patterns and create optimized executables in minutes. The resulting executables run more than an order of magnitude faster than unoptimized code and are comparable to manually locality-optimized code, without burdening the programmer or hindering productivity.
Keywords
data locality, distributed memory, machine learning, optimization