Efficient GPU-based implementation for decoding non-binary LDPC codes with layered and flooding schedules.

Concurrency and Computation: Practice and Experience (2018)

Abstract
Nonbinary low-density parity-check (NB-LDPC) codes are excellent error-correcting codes and outperform their binary counterparts at the same code length. NB-LDPC decoders are based on the belief propagation algorithm, which demands intensive message-passing computation. Recently, to achieve both flexibility and good throughput, NB-LDPC decoders have been ported from dedicated hardware solutions to multi/many-core systems. In this paper, we propose an FFT-based q-ary Sum-Product Algorithm (QSPA) decoding architecture for NB-LDPC codes with layered and flooding schedules on a graphics processing unit (GPU). To improve the throughput of the proposed decoder, four optimization methods are presented that both accelerate decoding kernel execution and improve data transfer efficiency. The experiments are mainly conducted on an NVIDIA GTX580 and a GTX Titan X. Throughputs of up to 63 Mbps over GF(16) and 7.65 Mbps over GF(256) are achieved on the GTX580 when executing 5 layered decoding iterations. Throughputs reach up to 139 Mbps over GF(16) and 17 Mbps over GF(256) on the GTX Titan X. Experimental results show decoding throughput speedups ranging from 1.7x to 16.8x compared with existing FFT-based QSPA decoders on GPU.
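The abstract does not describe the decoder's kernels, so the following is only a minimal CPU sketch of the general idea behind FFT-based QSPA, not the paper's GPU implementation. It assumes the field is GF(2^p), where addition is bitwise XOR, so the check-node convolution of probability vectors can be done as a pointwise product in the Walsh-Hadamard domain. The function names (`fwht`, `check_node_update_fft`, `xor_convolve_direct`) are hypothetical, and the permutation of messages by the nonzero parity-check entries, which a full decoder needs, is deliberately left out.

```python
def fwht(vec):
    """Unnormalized fast Walsh-Hadamard transform of a length-2^p sequence."""
    out = list(vec)
    n = len(out)
    assert n > 0 and n & (n - 1) == 0, "length must be a power of two"
    h = 1
    while h < n:
        # Butterfly stage: combine elements that are h apart.
        for start in range(0, n, 2 * h):
            for i in range(start, start + h):
                a, b = out[i], out[i + h]
                out[i], out[i + h] = a + b, a - b
        h *= 2
    return out


def check_node_update_fft(incoming):
    """XOR-convolve all incoming probability vectors via the transform domain.

    Assumption: each vector has length q = 2^p and is indexed by the
    field element's bit pattern; edge permutations are omitted.
    """
    q = len(incoming[0])
    product = [1.0] * q
    for pmf in incoming:
        transformed = fwht(pmf)
        product = [p * t for p, t in zip(product, transformed)]
    # The transform is its own inverse up to a factor of q.
    return [x / q for x in fwht(product)]


def xor_convolve_direct(a, b):
    """Reference O(q^2) convolution over the XOR group, for checking."""
    q = len(a)
    out = [0.0] * q
    for i in range(q):
        for j in range(q):
            out[i ^ j] += a[i] * b[j]
    return out
```

For two messages over GF(4), `check_node_update_fft([a, b])` should agree with `xor_convolve_direct(a, b)` up to floating-point rounding, while costing O(q log q) per message instead of O(q^2). That cost gap grows with the field size and is the usual motivation for the transform-domain approach on large fields such as GF(256).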
Keywords
CUDA, flooding, GPU, layered, nonbinary LDPC, synchronization