Surround the Nonlinearity: Inserting Foldable Convolutional Autoencoders to Reduce Activation Footprint

Baptiste Rossigneux, Inna Kucher, Vincent Lorrain, Emmanuel Casseau

2023 IEEE/CVF International Conference on Computer Vision Workshops (ICCVW), 2023

Abstract
Modern deep learning architectures, while highly successful, impose substantial computational and memory demands because of their large numbers of parameters and the activations they must store. This makes it difficult to adapt a neural network to hardware constraints, especially at the edge. This paper investigates a novel approach to activation compression, which we term 'Projection-based compression on channels' (ProChan). Our method interposes projection layers around the nonlinearities of a pretrained network, reducing the channel dimensionality through a compression operation and then expanding it back. The module is designed to be fully fused afterwards with the surrounding convolutions, guaranteeing zero overhead and maximum FLOPs reduction. We also study how the method absorbs the cost of quantization, so that the two footprint-reduction approaches can be combined. Our findings indicate that the projections likely perform an 'adaptive stretching' of the feature space, preserving essential information under dimensional constraints. Finally, we present an ablation study of training strategies that are stable and fast, and analyse interactions with different quantization paradigms, namely PACT for activations and post-training quantization (PTQ) methods for weights.
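The folding idea in the abstract can be sketched numerically. For 1x1 convolutions, each spatial position reduces to a matrix product, so a compression projection `Pc` inserted before the nonlinearity and an expansion `Pe` after it can be absorbed into the neighbouring convolution weights. The sketch below is an illustrative NumPy reconstruction under that assumption, not the paper's implementation; the names `Pc`, `Pe`, and the dimensions are hypothetical.

```python
import numpy as np

rng = np.random.default_rng(0)

C_in, C, r, C_out = 8, 16, 4, 8      # r < C: compressed channel width
x = rng.normal(size=(C_in,))         # one spatial position (1x1 convs = matmuls)

W1 = rng.normal(size=(C, C_in))      # first conv, per-pixel view
W2 = rng.normal(size=(C_out, C))     # second conv
Pc = rng.normal(size=(r, C))         # compression projection (hypothetical)
Pe = rng.normal(size=(C, r))         # expansion projection (hypothetical)

def unfused(x):
    # Compress to r channels, apply the nonlinearity there, expand back.
    # Only z (r channels instead of C) must be stored as an activation.
    z = np.maximum(Pc @ (W1 @ x), 0.0)
    return W2 @ (Pe @ z)

# Folding: absorb the projections into the surrounding convolutions.
W1f = Pc @ W1     # (r, C_in)  -- fewer output channels, fewer FLOPs
W2f = W2 @ Pe     # (C_out, r) -- fewer input channels, fewer FLOPs

def fused(x):
    return W2f @ np.maximum(W1f @ x, 0.0)
```

Because the projections and convolutions are all linear maps, composing them before the elementwise nonlinearity changes nothing: `unfused(x)` and `fused(x)` agree to floating-point precision, while the fused path carries no extra layers at inference time.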