Porting and optimization of solidification application for CPU-MIC hybrid platforms

Periodicals(2018)

Abstract
Modern heterogeneous computing platforms have become powerful HPC solutions that can be applied to a wide range of real-life applications. In particular, hybrid platforms equipped with Intel Xeon Phi coprocessors offer the advantages of massively parallel computing while supporting practically the same parallel programming model as conventional homogeneous solutions. However, how scientific applications can efficiently utilize hybrid platforms with Intel MIC coprocessors remains an open issue. In this article, we propose an approach for porting a real-life scientific application to such hybrid platforms without significant modifications of the application code. The approach allows us to take advantage of all the computing components, including two CPUs and two coprocessors, for the parallel execution of computational workloads. In this study, we focus on the parallel implementation of a numerical model of the dendritic solidification process in isothermal conditions. We develop a sequence of steps necessary for porting the solidification application to hybrid platforms with Intel coprocessors and optimizing it there. The main challenges include not only overlapping data movements with computations, but also ensuring adequate utilization of the cores/threads and vector units of both the processors and the coprocessors. To this end, we propose an efficient and flexible method for distributing the workload between heterogeneous computing components. To realize the potential benefits of the proposed approach, we choose a heterogeneous programming model that combines the offload mode for Intel MIC with the OpenMP programming standard. The developed approach allows us to execute the whole application up to 9.33× faster than the original parallel version that uses two CPUs. Furthermore, the CPU-MIC hybrid platforms achieve a speedup of about 1.9× over the CPU platform with 24 cores based on the Ivy Bridge architecture, and about 1.5× over the Haswell-based CPU platform with 36 cores.
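The abstract does not spell out how the workload is distributed, so the following is only a minimal sketch of one common way to do it, not the authors' method: a static split of grid rows between devices in proportion to their relative throughput. The function name `partition_rows` and the throughput values are hypothetical and chosen purely for illustration.

```python
def partition_rows(total_rows, throughputs):
    """Split total_rows into contiguous (start, end) ranges, one per device,
    with sizes proportional to the given relative throughputs.

    Illustrative assumption only: the abstract does not state that the
    paper uses a static proportional split.
    """
    if total_rows < 0 or not throughputs or any(t <= 0 for t in throughputs):
        raise ValueError("need total_rows >= 0 and positive throughputs")

    total = sum(throughputs)
    # Ideal fractional share for each device, then round down.
    shares = [total_rows * t / total for t in throughputs]
    counts = [int(s) for s in shares]

    # Hand the rows lost to rounding to the devices with the largest
    # fractional remainders, so the counts sum to total_rows exactly.
    leftover = total_rows - sum(counts)
    by_remainder = sorted(
        range(len(shares)), key=lambda i: shares[i] - counts[i], reverse=True
    )
    for i in by_remainder[:leftover]:
        counts[i] += 1

    # Convert per-device row counts into contiguous half-open ranges.
    ranges = []
    start = 0
    for count in counts:
        ranges.append((start, start + count))
        start += count
    return ranges


# Hypothetical relative throughputs for two CPUs and two coprocessors.
example = partition_rows(1000, [1.0, 1.0, 1.5, 1.5])
```

In a real hybrid run, each range would be handed to one device, and the throughput ratios would come from measurements on the target platform rather than from fixed constants.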
Keywords
Code optimization, heterogeneous programming model, hybrid architecture, Intel Xeon Phi, load balancing, numerical modeling of solidification, offload, OpenMP, partitioning, vectorization