Fast Non-Local Adaptive In-Loop Filter Optimization on GPU

IEEE Transactions on Multimedia (2021)

Abstract
The non-local adaptive in-loop filter (NALF) for video coding achieves significant coding gain by exploiting image non-local self-similarity (NSS) to reduce compression artifacts efficiently. However, its intensive computation hinders practical deployment in video coding standards. In this paper, we propose a fast NALF optimization algorithm within a parallel-computing framework that leverages the massively parallel execution resources of the GPU. First, the computational complexity of the original NALF is analyzed in depth; then the pipelines of the computationally intensive modules are redesigned to suit the general-purpose GPU, with parallel-friendly execution as the main consideration. Specifically, we speed up NALF by optimizing thread allocation to maximize the degree of parallelism and by carefully designing the GPU block dimensions to avoid access conflicts. Group-level parallelization is designed for the collaborative filtering module and pixel-level parallelization for the patch matching module. To reduce the cost of data transmission, the whole filtering process is implemented on the GPU, taking advantage of the low data dependency in NALF. Extensive experimental results show that the proposed GPU-based fast NALF optimization achieves high-speed processing while maintaining the significant coding performance of the original NALF, which demonstrates the potential of NALF for future video coding standards.
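The low data dependency that the abstract relies on can be illustrated with a small CPU sketch of patch matching. This is not the paper's GPU kernel: the function names (`ssd`, `match_patch`, `match_all`), the sum-of-squared-differences metric, and the patch size, search radius, and group size are all assumptions chosen for illustration. The sketch only shows that each reference patch can be matched without reading the result of any other, which is the property that lets the work be spread across many parallel threads.

```python
from concurrent.futures import ThreadPoolExecutor


def ssd(img, y0, x0, y1, x1, p):
    """Sum of squared differences between two p x p patches (assumed metric)."""
    total = 0
    for dy in range(p):
        for dx in range(p):
            d = img[y0 + dy][x0 + dx] - img[y1 + dy][x1 + dx]
            total += d * d
    return total


def match_patch(img, ry, rx, p=2, radius=2, k=3):
    """Return the k candidates most similar to the reference patch at (ry, rx).

    Each result is a (distance, y, x) tuple; the search is limited to a
    window of the given radius, clipped to the image borders.
    """
    h, w = len(img), len(img[0])
    candidates = []
    for cy in range(max(0, ry - radius), min(h - p, ry + radius) + 1):
        for cx in range(max(0, rx - radius), min(w - p, rx + radius) + 1):
            candidates.append((ssd(img, ry, rx, cy, cx, p), cy, cx))
    candidates.sort()
    return candidates[:k]


def match_all(img, refs, p=2, radius=2, k=3):
    """Match every reference patch independently.

    No task reads another task's output, so the work can be handed to a
    pool of workers in any order. A thread pool stands in here for the
    parallel execution units of a GPU.
    """
    with ThreadPoolExecutor() as pool:
        return list(pool.map(lambda r: match_patch(img, r[0], r[1], p, radius, k), refs))
```

Each list returned by `match_all` is one group of similar patches. Under the same assumptions, these groups could in turn be filtered independently of one another, which is the kind of independence that group-level parallelization would exploit.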
Keywords
Video coding, NSS, adaptive in-loop filter, NALF, GPU