Initialization and Transfer Learning of Stochastic Binary Networks from Real-Valued Ones

2021 IEEE/CVF Conference on Computer Vision and Pattern Recognition Workshops (CVPRW 2021)

Abstract
We consider the training of binary neural networks (BNNs) using the stochastic relaxation approach, which leads to stochastic binary networks (SBNs). We identify that a severe obstacle to training deep SBNs without skip connections is already the initialization phase. While smaller models can be trained from a random (possibly data-driven) initialization, for deeper models and large datasets it becomes increasingly difficult to obtain non-vanishing and low-variance gradients when initializing randomly. In this work, we initialize SBNs from real-valued networks with ReLU activations. Real-valued networks are well established, easier to train, and benefit from many techniques that improve their generalization properties. We propose that closely approximating their internal features can provide a good initialization for SBNs. We transfer features incrementally, layer by layer, accounting for noise in the SBN, exploiting equivalent reparametrizations of ReLU networks, and using a novel transfer loss formulation. We demonstrate experimentally that with the proposed initialization, binary networks can be trained faster and achieve higher accuracy than when initialized randomly.
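To make the layer-by-layer transfer idea concrete, here is a minimal PyTorch sketch of fitting one SBN layer so that its stochastic binary output matches the corresponding ReLU layer's features. The function names (`stochastic_binarize`, `transfer_layer`), the plain MSE objective standing in for the paper's transfer loss, and the straight-through gradient estimator are all illustrative assumptions, not the authors' exact method.

```python
import torch
import torch.nn as nn

def stochastic_binarize(pre_act: torch.Tensor) -> torch.Tensor:
    """Sample binary activations in {-1, +1}; sigmoid gives P(+1).

    A straight-through estimator passes gradients through the sampling
    step -- a common relaxation for SBNs, assumed here for illustration.
    """
    p = torch.sigmoid(pre_act)
    sample = (2.0 * torch.bernoulli(p) - 1.0).detach()
    soft = 2.0 * p - 1.0
    # forward: the binary sample; backward: gradient of the soft mean
    return sample + soft - soft.detach()

def transfer_layer(sbn_layer: nn.Module,
                   relu_layer: nn.Module,
                   inputs: torch.Tensor,
                   steps: int = 100,
                   lr: float = 1e-3) -> None:
    """Fit one SBN layer to reproduce the ReLU teacher's features
    on the same inputs -- one incremental transfer step."""
    opt = torch.optim.Adam(sbn_layer.parameters(), lr=lr)
    with torch.no_grad():
        target = torch.relu(relu_layer(inputs))  # teacher features
    for _ in range(steps):
        opt.zero_grad()
        out = stochastic_binarize(sbn_layer(inputs))
        # simple MSE stand-in for the paper's transfer loss formulation
        loss = nn.functional.mse_loss(out, target)
        loss.backward()
        opt.step()
```

Applied incrementally, each transferred SBN layer's (stochastic) outputs would serve as the inputs for fitting the next layer, so that noise introduced by earlier binary layers is accounted for downstream, in the spirit of the incremental transfer described above.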
Keywords
transfer learning,stochastic binary networks,binary neural network training,stochastic relaxation approach,real-valued networks,feature transfer,ReLU networks,BNN,deep SBN training,data-driven initialization,nonvanishing gradient,low variance gradient,ReLU activation,generalization properties,transfer loss formulation