Efficient Implementation of Convolution and Winograd on ASMP Embedded Multicore Vector Processor

Nicolas Leclaire, Stéphane Mancini, Claude Delnondedieu, Jean-Paul Henriques

2020 IEEE Workshop on Signal Processing Systems (SiPS), 2020

Abstract
Efficient inference of Convolutional Neural Networks (CNNs) is a challenging task, and design choices depend heavily on the target context and the size of the CNN model. Many devices are available, each targeting a specific class of applications. The best-known ones target the server side of cloud applications, while others focus on embedded applications. In this paper, we show how to exploit the low-level hardware features of STxP70 ASMP, an embedded multicore in which each core is equipped with a vector coprocessor. This work shows how to adapt the algorithm to the platform and vice versa, and provides an original algorithmic transform that optimizes the use of internal resources. Experiments study the effect of numerous design parameters and CNN configurations. The results show the benefits of the proposed strategy and outline the low-level hardware features required to further optimize CNN inference.
Keywords
Convolution, Memory management, Optimization, Registers, Standards, Pipelines