An Efficient Multicore CPU Implementation for Convolution-Pooling Computation in CNNs

2020 IEEE International Parallel and Distributed Processing Symposium Workshops (IPDPSW)

Cited by 3 | Views 3
Abstract
The main contribution of this paper is an efficient multicore CPU implementation of convolution-pooling computation in convolutional neural networks (CNNs). Since convolution followed by pooling is performed many times in most CNNs, we propose a method to accelerate this combined operation. Our multicore CPU implementation uses convolution interchange to reduce the computational cost, and implements the convolution-pooling efficiently on top of DNNL, an open-source library for accelerating deep learning frameworks. Experimental results on an Intel Core i9-7980XE CPU show that our CPU implementation of the convolution-pooling is 1.42 to 2.82 times faster than performing the convolution and then the pooling separately with DNNL. Further, we incorporate the proposed implementation into TensorFlow so that it can be invoked as a TensorFlow operation. The incorporated implementation of the convolution-pooling is 1.18 to 2.42 times faster than a straightforward implementation using TensorFlow primitives.
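The convolution interchange mentioned in the abstract relies on average pooling being a linear operation, so a convolution followed by pooling can be folded into a single strided convolution with a kernel pre-averaged over the pooling window. The NumPy sketch below illustrates this identity on a single-channel example; it is a minimal reconstruction for clarity, not the authors' multicore DNNL/AVX-512 code, and the 2x2 pooling window, array sizes, and function names are assumptions made for the example.

```python
# Sketch of the convolution-interchange identity:
#   avg_pool(conv(x, w)) == strided_conv(x, w_expanded)
# where w_expanded averages the p x p shifted copies of w.
import numpy as np

def conv2d_valid(x, w, stride=1):
    """Plain 'valid' cross-correlation with a given stride (single channel)."""
    kh, kw = w.shape
    oh = (x.shape[0] - kh) // stride + 1
    ow = (x.shape[1] - kw) // stride + 1
    out = np.empty((oh, ow))
    for i in range(oh):
        for j in range(ow):
            patch = x[i * stride:i * stride + kh, j * stride:j * stride + kw]
            out[i, j] = np.sum(patch * w)
    return out

def avg_pool2d(y, p=2):
    """Non-overlapping p x p average pooling."""
    oh, ow = y.shape[0] // p, y.shape[1] // p
    return y[:oh * p, :ow * p].reshape(oh, p, ow, p).mean(axis=(1, 3))

rng = np.random.default_rng(0)
x = rng.standard_normal((10, 10))   # input feature map (assumed size)
w = rng.standard_normal((3, 3))     # convolution kernel (assumed size)
p = 2                               # pooling window and stride (assumed)

# Baseline: convolution, then average pooling.
baseline = avg_pool2d(conv2d_valid(x, w), p)

# Interchange: fold the averaging into an expanded kernel, then convolve with stride p.
w_expanded = np.zeros((w.shape[0] + p - 1, w.shape[1] + p - 1))
for u in range(p):
    for v in range(p):
        w_expanded[u:u + w.shape[0], v:v + w.shape[1]] += w / (p * p)
fused = conv2d_valid(x, w_expanded, stride=p)

print(np.allclose(baseline, fused))  # True: both orderings give the same result
```

The fused form performs one strided convolution instead of a dense convolution plus a separate pooling pass, which is the source of the reduction in computational cost that the paper's DNNL-based implementation exploits.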
Keywords
deep learning,neural networks,convolution,average pooling,AVX-512,TensorFlow