Parallel distributed productivity-aware tree-search using Chapel

CONCURRENCY AND COMPUTATION-PRACTICE & EXPERIENCE (2023)

Abstract
With the recent arrival of the exascale era, modern supercomputers have grown increasingly large, which makes programming them much more complex. In addition to performance, software productivity is a major concern when choosing a programming language, such as Chapel, designed for exascale computing. In this paper, we investigate the design of a parallel distributed tree-search algorithm, namely P3D-DFS, and its implementation using Chapel. The design is based on Chapel's DistBag data structure, revisited by (1) redefining the data structure for Depth-First tree-Search (DFS), henceforth renamed DistBag-DFS, and (2) redesigning the underlying load balancing mechanism. In addition, we propose two instantiations of P3D-DFS, for the Branch-and-Bound (B&B) and Unbalanced Tree Search (UTS) algorithms. To evaluate how much performance is traded for productivity, we compare the Chapel-based implementations of B&B and UTS to their best-known counterparts based on traditional OpenMP (intra-node) and MPI+X (inter-node). For experimental validation using 4096 processing cores, we consider the permutation flow-shop scheduling problem for B&B and synthetic literature benchmarks for UTS. The reported results show that P3D-DFS competes with its OpenMP baselines in coarser-grained shared-memory scenarios, and with its MPI+X counterparts in distributed-memory settings, considering both performance and productivity-awareness. In the context of this work, this makes Chapel an alternative to OpenMP/MPI+X for exascale programming.
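The abstract does not give the internals of DistBag-DFS or its load balancing, so the following is only an illustrative sketch of the general idea behind a bag-based parallel DFS, not the paper's implementation. It assumes one double-ended work pool per worker: the owner pushes and pops at the tail (depth-first order), and an idle worker steals from the head of the fullest pool. The function names, the round-robin simulation of workers, and the victim-selection rule are all hypothetical choices made for this example, and the code is written in Python rather than Chapel.

```python
from collections import deque


def count_nodes_work_stealing(root, children, num_workers=4):
    """Count tree nodes using per-worker pools, simulated round-robin.

    Hypothetical sketch: this is NOT the paper's DistBag-DFS, only a
    generic bag-per-worker depth-first search with stealing.
    """
    bags = [deque() for _ in range(num_workers)]
    bags[0].append(root)
    explored = [0] * num_workers
    steals = 0

    while any(bags):  # at least one pool still holds work
        for w in range(num_workers):
            bag = bags[w]
            if not bag:
                # Idle worker: pick the fullest pool as the victim.
                victim = max(range(num_workers), key=lambda v: len(bags[v]))
                if len(bags[victim]) < 2:
                    continue  # nothing worth stealing this round
                # Steal from the head: the oldest, usually shallowest node.
                bag.append(bags[victim].popleft())
                steals += 1
            node = bag.pop()  # owner pops from the tail: depth-first order
            explored[w] += 1
            bag.extend(children(node))

    return sum(explored), explored, steals


def binary_children(node, max_depth=10):
    """Children of a node in a complete binary tree (toy benchmark)."""
    depth, label = node
    if depth >= max_depth:
        return []
    return [(depth + 1, 2 * label), (depth + 1, 2 * label + 1)]


if __name__ == "__main__":
    total, per_worker, steals = count_nodes_work_stealing((0, 1), binary_children)
    print(total, per_worker, steals)
```

Stealing from the opposite end of the pool from the one the owner works on is a common design choice in work-stealing schedulers: nodes near the head tend to root larger subtrees, so a single steal transfers more work. Whether P3D-DFS makes the same choice is not stated in the abstract.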
Keywords
tree-search, Chapel