An Efficient Multicore CPU Implementation for Convolution-Pooling Computation in CNNs

2020 IEEE International Parallel and Distributed Processing Symposium Workshops (IPDPSW)

Cited by 3 | Views 3
Abstract
The main contribution of this paper is an efficient multicore CPU implementation of convolution-pooling computation in convolutional neural networks (CNNs). Since convolution followed by pooling is performed many times in most CNNs, we propose a method to accelerate this combined operation. Our multicore CPU implementation uses convolution interchange to reduce the computational cost, and implements the convolution-pooling efficiently on top of DNNL, an open-source library for accelerating deep learning frameworks. Experimental results on an Intel Core i9-7980XE CPU show that our CPU implementation of the convolution-pooling is 1.42 to 2.82 times faster than performing the convolution and then the pooling separately with DNNL. Further, we incorporate the proposed implementation into TensorFlow so that it can be invoked as a TensorFlow operation. The incorporated implementation of the convolution-pooling is 1.18 to 2.42 times faster than a straightforward implementation using TensorFlow primitives.
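The convolution interchange mentioned in the abstract relies on average pooling being a linear operation, so a convolution followed by pooling can be folded into a single strided convolution with a kernel pre-averaged over the pooling window. The NumPy sketch below illustrates this identity on a single-channel example; it is a minimal reconstruction for clarity, not the authors' multicore DNNL/AVX-512 code, and the 2x2 pooling window, array sizes, and function names are assumptions made for the example.

```python
# Sketch of the convolution-interchange identity:
#   avg_pool(conv(x, w)) == strided_conv(x, w_expanded)
# where w_expanded averages the p x p shifted copies of w.
import numpy as np

def conv2d_valid(x, w, stride=1):
    """Plain 'valid' cross-correlation with a given stride (single channel)."""
    kh, kw = w.shape
    oh = (x.shape[0] - kh) // stride + 1
    ow = (x.shape[1] - kw) // stride + 1
    out = np.empty((oh, ow))
    for i in range(oh):
        for j in range(ow):
            patch = x[i * stride:i * stride + kh, j * stride:j * stride + kw]
            out[i, j] = np.sum(patch * w)
    return out

def avg_pool2d(y, p=2):
    """Non-overlapping p x p average pooling."""
    oh, ow = y.shape[0] // p, y.shape[1] // p
    return y[:oh * p, :ow * p].reshape(oh, p, ow, p).mean(axis=(1, 3))

rng = np.random.default_rng(0)
x = rng.standard_normal((10, 10))   # input feature map (assumed size)
w = rng.standard_normal((3, 3))     # convolution kernel (assumed size)
p = 2                               # pooling window and stride (assumed)

# Baseline: convolution, then average pooling.
baseline = avg_pool2d(conv2d_valid(x, w), p)

# Interchange: fold the averaging into an expanded kernel, then convolve with stride p.
w_expanded = np.zeros((w.shape[0] + p - 1, w.shape[1] + p - 1))
for u in range(p):
    for v in range(p):
        w_expanded[u:u + w.shape[0], v:v + w.shape[1]] += w / (p * p)
fused = conv2d_valid(x, w_expanded, stride=p)

print(np.allclose(baseline, fused))  # True: both orderings give the same result
```

The fused form performs one strided convolution instead of a dense convolution plus a separate pooling pass, which is the source of the reduction in computational cost that the paper's DNNL-based implementation exploits.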
Keywords
deep learning,neural networks,convolution,average pooling,AVX-512,TensorFlow