Asymptotic Distribution of Stochastic Mirror Descent Iterates in Average Ensemble Models

ICASSP 2023 - 2023 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP)(2023)

Abstract
Stochastic mirror descent (SMD) is a general class of training algorithms that uses a mirror potential to shape the implicit bias of training and includes stochastic gradient descent (SGD) as a special case. In this paper, we study the performance of SMD on mean-field ensemble models and generalize earlier results obtained for SGD. The evolution of the parameter distribution is mapped to a continuous-time process in the space of probability distributions. Our main result gives a nonlinear partial differential equation (PDE) to which this continuous-time process converges in the asymptotic regime of large networks. The mirror potential enters through a multiplicative term equal to the inverse of its Hessian, which defines a gradient flow over an appropriate Riemannian manifold. We provide numerical simulations that characterize the effect of the mirror potential on the performance of networks trained with SMD on some binary classification problems.
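To make the role of the mirror potential concrete, here is a minimal sketch of a single SMD update. It is an illustration, not the paper's code. It assumes the q-norm potential psi(w) = (1/q) * ||w||_q^q, a common choice that the abstract does not specify. The function names (`mirror_map`, `inverse_mirror_map`, `smd_step`) and the toy regression data are hypothetical. The update takes a gradient step in the mirror domain, grad_psi(w_new) = grad_psi(w) - lr * grad_loss(w), and maps back; with q = 2 the mirror map is the identity and the step reduces to plain SGD.

```python
import numpy as np

def mirror_map(w, q):
    # Gradient of the assumed potential psi(w) = (1/q) * sum(|w_i|^q).
    return np.sign(w) * np.abs(w) ** (q - 1)

def inverse_mirror_map(z, q):
    # Inverse of mirror_map, valid for q > 1.
    return np.sign(z) * np.abs(z) ** (1.0 / (q - 1))

def smd_step(w, grad, lr, q=2.0):
    # Gradient step in the mirror (dual) domain, then map back.
    z = mirror_map(w, q) - lr * grad
    return inverse_mirror_map(z, q)

def squared_loss_grad(w, x, y):
    # Gradient of 0.5 * (x @ w - y)^2 for a single sample (toy loss).
    return (x @ w - y) * x

# Toy data (hypothetical): noiseless linear regression, one sample per step.
rng = np.random.default_rng(0)
w_true = np.array([1.0, -2.0, 0.5])
X = rng.normal(size=(200, 3))
Y = X @ w_true

w_sgd = np.full(3, 0.1)  # q = 2: identical to SGD
w_smd = np.full(3, 0.1)  # q = 3: a different implicit bias
for x, y in zip(X, Y):
    w_sgd = smd_step(w_sgd, squared_loss_grad(w_sgd, x, y), lr=0.05, q=2.0)
    w_smd = smd_step(w_smd, squared_loss_grad(w_smd, x, y), lr=0.05, q=3.0)

print("q=2 (SGD):", w_sgd)
print("q=3 (SMD):", w_smd)
```

In the small-step limit, this update moves the parameters along the negative loss gradient multiplied by the inverse Hessian of the potential, which is the multiplicative term the abstract refers to.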
Keywords
stochastic mirror descent, ensemble models, neural networks, mean field