Training 1-Bit Networks on a Sphere: A Geometric Approach.

International Conference on Artificial Neural Networks and Machine Learning (ICANN), 2022

Abstract
Weight binarization offers a promising alternative for building highly efficient Deep Neural Networks (DNNs) that can be deployed on low-power, constrained devices. However, given their discrete nature, training 1-bit DNNs is not a straightforward or uniquely defined process, and several strategies have been proposed to address this issue, each time yielding performance closer to that of their full-precision counterparts. In this paper we analyze 1-bit DNNs from a differential geometry perspective. We start from the observation that, for a given model with d binary weights, all possible weight configurations lie on a sphere of radius √d. Alongside the traditional training procedure based on the Straight-Through Estimator (STE), we leverage concepts from Riemannian optimization to constrain the search space to spherical manifolds, a subset of Riemannian manifolds. Our approach offers a principled solution; nevertheless, in practice we found that simply constraining the norm of the underlying auxiliary network works just as effectively. Additionally, we observe that by enforcing a unit norm on the network parameters, our network explores a space of well-conditioned matrices. Complementary to our approach, we define an angle-based regularization that guides the exploration of the auxiliary space. We binarize a ResNet architecture to demonstrate the effectiveness of our approach on image classification with the CIFAR-100 and ImageNet datasets.
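
To make the idea concrete, below is a minimal PyTorch sketch of the norm-constrained variant described in the abstract: binary weights are obtained via a sign function trained with the Straight-Through Estimator (STE), and the auxiliary full-precision weights are projected back onto the sphere of radius √d after each optimizer step. This is an illustrative sketch under assumed conventions, not the authors' implementation; the names BinarizeSTE, SphereBinaryLinear, and renormalize are hypothetical.

import math
import torch
import torch.nn as nn


class BinarizeSTE(torch.autograd.Function):
    """sign() in the forward pass; clipped identity gradient in the backward pass (STE)."""

    @staticmethod
    def forward(ctx, w):
        ctx.save_for_backward(w)
        return torch.sign(w)

    @staticmethod
    def backward(ctx, grad_output):
        (w,) = ctx.saved_tensors
        # Standard STE: pass gradients through only where |w| <= 1.
        return grad_output * (w.abs() <= 1).to(grad_output.dtype)


class SphereBinaryLinear(nn.Module):
    """Linear layer whose binary weights come from auxiliary full-precision weights
    kept on the sphere of radius sqrt(d), where d is the number of weights."""

    def __init__(self, in_features, out_features):
        super().__init__()
        self.weight = nn.Parameter(0.01 * torch.randn(out_features, in_features))

    def renormalize(self):
        # Project the auxiliary weights onto the sphere of radius sqrt(d),
        # the radius on which every configuration in {-1, +1}^d lies.
        with torch.no_grad():
            d = self.weight.numel()
            self.weight.mul_(math.sqrt(d) / self.weight.norm())

    def forward(self, x):
        return nn.functional.linear(x, BinarizeSTE.apply(self.weight))


if __name__ == "__main__":
    layer = SphereBinaryLinear(16, 4)
    layer.renormalize()
    opt = torch.optim.SGD(layer.parameters(), lr=0.1)
    x, target = torch.randn(8, 16), torch.randn(8, 4)
    for _ in range(5):
        loss = nn.functional.mse_loss(layer(x), target)
        opt.zero_grad()
        loss.backward()
        opt.step()
        layer.renormalize()  # keep the auxiliary weights on the sphere after each update
    print("final loss:", loss.item())

In this sketch the projection step plays the role of the spherical-manifold constraint: the unconstrained gradient step may leave the sphere, and renormalize() maps the auxiliary weights back onto it.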
Keywords
1-bit Neural Networks, Geometric optimization, Conditioning