An Efficient Multicore CPU Implementation for Convolution-Pooling Computation in CNNs
2020 IEEE International Parallel and Distributed Processing Symposium Workshops (IPDPSW)(2020)
Abstract
The main contribution of this paper is an efficient multicore CPU implementation of convolution-pooling computation in convolutional neural networks (CNNs). Since convolution and pooling operations are performed many times in most CNNs, we propose a method to accelerate them. Our multicore CPU implementation uses convolution interchange to reduce the computational cost. We also implement the convolution-pooling computation efficiently using DNNL, an open-source library for accelerating deep learning frameworks. Experimental results on an Intel Core i9-7980XE CPU show that our CPU implementation of convolution-pooling is 1.42 to 2.82 times faster than performing the multiple convolutions followed by pooling with DNNL. Further, we incorporate the proposed implementation into TensorFlow so that it can be invoked as a TensorFlow operation. The incorporated implementation of convolution-pooling is 1.18 to 2.42 times faster than a straightforward implementation built from TensorFlow primitives.
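To illustrate the kind of saving convolution interchange enables, the sketch below shows how a convolution followed by 2×2 average pooling can be folded into a single stride-2 convolution with a combined kernel, so only the output points that survive pooling are ever computed. This is a minimal numpy illustration of the idea, not the paper's DNNL/AVX-512 implementation; all function names here are illustrative.

```python
import numpy as np

def conv2d_valid(x, k):
    """Plain 'valid' cross-correlation (the usual CNN convolution convention)."""
    kh, kw = k.shape
    oh, ow = x.shape[0] - kh + 1, x.shape[1] - kw + 1
    out = np.empty((oh, ow))
    for i in range(oh):
        for j in range(ow):
            out[i, j] = np.sum(x[i:i + kh, j:j + kw] * k)
    return out

def avgpool_2x2(y):
    """Non-overlapping 2x2 average pooling with stride 2."""
    h2, w2 = y.shape[0] // 2, y.shape[1] // 2
    return y[:2 * h2, :2 * w2].reshape(h2, 2, w2, 2).mean(axis=(1, 3))

def fused_conv_pool(x, k):
    """Convolution + 2x2 average pooling as ONE stride-2 convolution.

    The 2x2 uniform pooling kernel (weight 1/4 per tap) is folded into
    the convolution kernel, so only the output points that survive
    pooling are computed -- roughly a 4x reduction in output points.
    """
    kh, kw = k.shape
    # Combined kernel = full convolution of k with the 2x2 averaging kernel.
    K = np.zeros((kh + 1, kw + 1))
    for dm in range(2):
        for dn in range(2):
            K[dm:dm + kh, dn:dn + kw] += k / 4.0
    # Stride-2 'valid' cross-correlation with the combined kernel.
    oh = (x.shape[0] - (kh + 1)) // 2 + 1
    ow = (x.shape[1] - (kw + 1)) // 2 + 1
    out = np.empty((oh, ow))
    for m in range(oh):
        for n in range(ow):
            out[m, n] = np.sum(x[2 * m:2 * m + kh + 1,
                                 2 * n:2 * n + kw + 1] * K)
    return out
```

Because both operations are linear and shift-based, the two computations agree exactly; the fused version evaluates about one quarter of the convolution output points, which is one source of the speedups reported in the abstract.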
Keywords
deep learning, neural networks, convolution, average pooling, AVX-512, TensorFlow