Communication Optimization on GPU: A Case Study of Sequence Alignment Algorithms

2017 IEEE International Parallel and Distributed Processing Symposium (IPDPS) (2017)

Abstract
Data movement is increasingly the bottleneck for both performance and energy efficiency in modern computation. Until recently, there was limited freedom for communication optimization on GPUs, as conventional GPUs provide only two methods for inter-thread communication: shared memory and global memory. However, the warp shuffle instruction, introduced with the Kepler architecture on Nvidia GPUs, enables threads within the same warp to exchange data directly in registers. This brings new performance optimization opportunities for algorithms with intensive inter-thread communication. In this work, we deploy register shuffle in the application domain of sequence alignment (or, similarly, string matching) and conduct a quantitative analysis of the opportunities and limitations of using register shuffle. We select two sequence alignment algorithms, Smith-Waterman (SW) and Pairwise-Hidden-Markov-Model (PairHMM), from the widely used Genome Analysis Toolkit (GATK) as case studies. Compared to implementations using shared memory, we obtain significant speed-ups of 1.2× and 2.1× for SW and PairHMM, respectively, by using shuffle instructions. Furthermore, we develop a performance model that analyzes kernel performance based on shuffle latency measured with a suite of microbenchmarks. Our model gives CUDA programmers insight into how best to use shuffle instructions for performance optimization.
Keywords
energy efficiency, shared memory, global memory, warp shuffle instruction, Kepler architecture, Nvidia GPU, performance optimization, intensive inter-thread communication, register shuffle, quantitative analysis, Smith-Waterman model, pairwise-hidden Markov-model, Genome Analysis Toolkit, SW, PairHMM, microbenchmarks, CUDA programmers, sequence alignment algorithms, data movement
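
To illustrate the kind of inter-thread communication the abstract refers to, here is a minimal CPU-side sketch in Python, not the paper's GPU kernel. It emulates a shuffle-up between neighbouring lanes and uses it to compute a Smith-Waterman local alignment score along anti-diagonals. Everything named here is an assumption made for illustration: the helpers `shfl_up` and `sw_score_wavefront`, the one-lane-per-column mapping, the linear gap penalty, and the scoring values. The sketch also ignores the 32-lane width of a real warp, and it fills out-of-range lanes with 0, whereas CUDA's shuffle-up leaves such lanes with their own value.

```python
def shfl_up(regs, delta, fill=0):
    """Emulate a shuffle-up: lane k reads the register of lane k - delta.

    Assumption: out-of-range lanes receive `fill` (the DP boundary value)
    instead of keeping their own value as a real GPU shuffle would.
    """
    return [regs[k - delta] if k - delta >= 0 else fill
            for k in range(len(regs))]


def sw_score_wavefront(a, b, match=2, mismatch=-1, gap=-1):
    """Smith-Waterman score with one emulated lane per column of b.

    At step t, lane j computes cell (i, j) with i = t - j, so all cells
    on one anti-diagonal are independent and could run in parallel.
    """
    n, m = len(a), len(b)
    h_up = [0] * m    # per-lane register: H[i-1][j], the lane's last cell
    h_diag = [0] * m  # per-lane register: H[i-1][j-1]
    best = 0
    for t in range(n + m - 1):
        # The only inter-lane exchange: fetch the left neighbour's last
        # cell, H[i][j-1], straight from its register.
        h_left = shfl_up(h_up, 1)
        new_up, new_diag = list(h_up), list(h_diag)
        for j in range(m):
            i = t - j
            if 0 <= i < n:
                s = match if a[i] == b[j] else mismatch
                h = max(0, h_diag[j] + s, h_up[j] + gap, h_left[j] + gap)
                new_up[j] = h
                new_diag[j] = h_left[j]  # becomes the diagonal next step
                best = max(best, h)
        h_up, h_diag = new_up, new_diag
    return best
```

With the assumed default scores, `sw_score_wavefront("GATTACA", "GATTACA")` returns 14 (seven matches at 2 each). Each lane needs only one value from its left neighbour per step, which is the access pattern that a register exchange can serve without a round trip through shared memory.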