Dynamic Sparse Training with Structured Sparsity

ICLR 2024

Abstract
Dynamic Sparse Training (DST) methods achieve state-of-the-art results in sparse neural network training, matching the generalization of dense models while enabling sparse training and inference. Although the resulting models are highly sparse and theoretically cheaper to train, achieving speedups with unstructured sparsity on real-world hardware is challenging. In this work we propose a DST method to learn a variant of structured N:M sparsity, whose acceleration is commonly supported by commodity hardware. Furthermore, we motivate, with both theoretical analysis and empirical results, the generalization performance of our specific N:M sparsity (constant fan-in); present a condensed representation with a reduced parameter and memory footprint; and demonstrate reduced inference time compared to dense models with a naive PyTorch CPU implementation of the condensed representation. Our source code is available at https://github.com/calgaryml/condensed-sparsity
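
As an illustration of the constant fan-in condensed representation described in the abstract, below is a minimal PyTorch sketch. It is not the authors' implementation: the class name `CondensedLinear`, the random index initialization, and the `fan_in` argument are assumptions for illustration. The idea it shows is that each output neuron stores exactly `fan_in` weights plus the column indices of the inputs it connects to, so parameter and memory footprint scale with the fan-in rather than with the full input width.

```python
# Minimal sketch (not the authors' implementation) of a constant fan-in
# condensed sparse linear layer. Each output neuron keeps exactly `fan_in`
# nonzero weights; names and initialization here are illustrative assumptions.
import torch
import torch.nn as nn


class CondensedLinear(nn.Module):
    def __init__(self, in_features: int, out_features: int, fan_in: int):
        super().__init__()
        # Dense storage of only the retained weights: (out_features, fan_in).
        self.weight = nn.Parameter(torch.randn(out_features, fan_in) / fan_in**0.5)
        # Column indices of the retained weights in the original dense layer.
        idx = torch.stack(
            [torch.randperm(in_features)[:fan_in] for _ in range(out_features)]
        )
        self.register_buffer("indices", idx)  # (out_features, fan_in)

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        # x: (batch, in_features). Gather only the inputs each neuron uses,
        # then reduce over the fan-in dimension.
        gathered = x[:, self.indices]            # (batch, out_features, fan_in)
        return (gathered * self.weight).sum(-1)  # (batch, out_features)


# Usage: a ~90%-sparse layer over 512 inputs keeps about 51 weights per neuron.
layer = CondensedLinear(in_features=512, out_features=256, fan_in=51)
out = layer(torch.randn(8, 512))  # (8, 256)
```

Because every neuron has the same number of retained weights, the gather-and-reduce above stays a dense, regular computation, which is what makes this N:M variant amenable to acceleration on commodity hardware.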
Keywords
Machine Learning, dynamic sparse training, structured sparsity, N:M sparsity, efficient deep learning, RigL, SRigL, constant fan-in, dynamic neuron ablation, neuron ablation, structured and fine-grained sparsity, online inference, accelerating inference