Optimization techniques for efficient HTA programs

Parallel Computing(2012)

引用 27|浏览0
暂无评分
摘要
Object oriented languages can be easily extended with new data types, which facilitate prototyping new language extensions. A very challenging problem is the development of data types encapsulating data parallel operations, which could improve parallel programming productivity. However, the use of class libraries to implement data types, particularly when they encapsulate parallelism, comes at the expense of performance overhead. This paper describes our experience with the implementation of a C++ data type called hierarchically tiled array (HTA). This object includes data parallel operations and allows the manipulation of tiles to facilitate developing efficient parallel codes and codes with high degree of locality. The initial performance of the HTA programs we wrote was lower than that of their conventional MPI-based counterparts. The overhead was due to factors such as the creation of temporary HTAs and the inability of the compiler to properly inline index computations, among others. We describe the performance problems and the optimizations applied to overcome them as well as their impact on programmability. After the optimization process, our HTA-based implementations run only slightly slower than the MPI-based codes while having much better programmability metrics.
更多
查看译文
关键词
performance problem,optimization technique,new data type,data type,efficient hta program,data parallel operation,parallel programming productivity,mpi-based code,performance overhead,initial performance,hta program,efficient parallel code,optimization,parallel programming
AI 理解论文
溯源树
样例
生成溯源树,研究论文发展脉络
Chat Paper
正在生成论文摘要