Enabling Efficient Fast Convolution Algorithms on GPUs via MegaKernels
IEEE Transactions on Computers (2020)
Abstract
Modern Convolutional Neural Networks (CNNs) require a massive number of convolution operations. To address the overwhelming computation problem, the Winograd and FFT fast algorithms have been used as effective approaches to reduce the number of multiplications. Inputs and filters are transformed into special domains and then multiplied element-wise, which can be transformed into batched GEMM o...
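The transform, multiply, inverse-transform pattern mentioned in the abstract can be illustrated with the smallest standard Winograd case, F(2,3), which produces two outputs of a 3-tap filter using 4 multiplications instead of 6. This is a minimal one-dimensional sketch in plain Python, not the paper's GPU implementation; the function names (`filter_transform`, `input_transform`, `output_transform`, `winograd_f23`, `direct_f23`) are hypothetical and chosen only for illustration.

```python
# Minimal 1D Winograd F(2,3) sketch (illustrative only, not the paper's kernel).
# Computes y[i] = sum_k d[i + k] * g[k] for i = 0, 1 from a 4-element input tile.

def filter_transform(g):
    """Map a 3-tap filter into the 4-element Winograd domain."""
    g0, g1, g2 = g
    return [g0, (g0 + g1 + g2) / 2.0, (g0 - g1 + g2) / 2.0, g2]

def input_transform(d):
    """Map a 4-element input tile into the 4-element Winograd domain."""
    d0, d1, d2, d3 = d
    return [d0 - d2, d1 + d2, d2 - d1, d1 - d3]

def output_transform(m):
    """Map 4 element-wise products back to 2 outputs."""
    m0, m1, m2, m3 = m
    return [m0 + m1 + m2, m1 - m2 - m3]

def winograd_f23(d, g):
    u = filter_transform(g)
    v = input_transform(d)
    # The only multiplications: 4 element-wise products.
    m = [ui * vi for ui, vi in zip(u, v)]
    return output_transform(m)

def direct_f23(d, g):
    """Reference: direct sliding-window computation (6 multiplications)."""
    return [sum(d[i + k] * g[k] for k in range(3)) for i in range(2)]

if __name__ == "__main__":
    tile = [1.0, 2.0, 3.0, 4.0]
    taps = [1.0, 0.0, -1.0]
    print(winograd_f23(tile, taps), direct_f23(tile, taps))
```

When many tiles and channels are processed together, the element-wise products at each of the 4 transformed positions can be grouped into matrix multiplications, which is the batched GEMM formulation the abstract refers to.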
Keywords
Kernel, Convolution, Task analysis, Graphics processing units, Tensile stress, Instruction sets, Libraries