sqFm: a novel adaptive optimization scheme for deep learning model

Shubhankar Bhakta, Utpal Nandi, Madhab Mondal, Kuheli Ray Mahapatra, Partha Chowdhuri, Pabitra Pal

Evolutionary Intelligence (2024)

Abstract
For deep model training, an optimization method is required that minimizes loss and maximizes accuracy, and the development of effective optimization methods remains an important research area. The diffGrad optimization method uses gradient changes during optimization but does not update the 2nd-order moment based on the 1st-order moment, while the AngularGrad optimization method uses the angular value of the gradient, which requires additional computation. As a result, both approaches follow zigzag trajectories that take a long time, and extra computation, to reach a global minimum. To overcome these limitations, a novel adaptive deep learning optimization method based on the square of the first momentum (sqFm) has been proposed. By updating the 2nd-order moment based on the 1st-order moment and adapting the step size to the current gradient through a non-negative function, the proposed sqFm delivers a smoother trajectory and better image classification accuracy. Empirical comparisons of the proposed sqFm with Adam, diffGrad, and AngularGrad on non-convex functions demonstrate that the proposed method achieves the best convergence and parameter values. On the Rosenbrock function, compared with SGD, Adam, diffGrad, RAdam, and AngularGrad(tan), the proposed sqFm attains the global minimum gradually with less overshoot. It is also demonstrated that the proposed sqFm gives consistently good classification accuracy when training CNNs (ResNet34, ResNet50, VGG16, ResNet18, and DenseNet121) on the CIFAR10, CIFAR100, and MNIST datasets, in contrast to SGDM, diffGrad, Adam, AngularGrad(cos), and AngularGrad(tan). The proposed method also achieves the best classification accuracy among SGD, Adam, AdaBelief, Yogi, RAdam, and AngularGrad on the ImageNet dataset with the ResNet18 network. Source code link: https://github.com/UtpalNandi/sqFm-A-novel-adaptive-optimization-scheme-for-deep-learning-model
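The abstract does not give the update rule itself. As a rough illustration of the stated idea, below is a minimal NumPy sketch assuming an Adam-style framework in which the 2nd-order moment accumulates the square of the 1st moment rather than the raw squared gradient. The function name sqfm_step, the hyperparameter defaults, and the omission of the paper's gradient-dependent step-size adjustment are all assumptions, not the authors' exact algorithm; see the linked repository for the reference implementation.

# Speculative sketch of an sqFm-style update. Assumptions: Adam-like framework;
# the 2nd moment v accumulates m**2 (the square of the first momentum) instead
# of grad**2. Names, defaults, and structure are illustrative only.
import numpy as np

def sqfm_step(theta, grad, m, v, t, lr=1e-3, beta1=0.9, beta2=0.999, eps=1e-8):
    """One hypothetical sqFm update step (names and defaults are assumed)."""
    m = beta1 * m + (1 - beta1) * grad   # 1st-order moment (momentum)
    v = beta2 * v + (1 - beta2) * m**2   # 2nd-order moment from m, not grad**2
    m_hat = m / (1 - beta1**t)           # bias correction, as in Adam
    v_hat = v / (1 - beta2**t)
    theta = theta - lr * m_hat / (np.sqrt(v_hat) + eps)
    return theta, m, v

# Toy usage on the 2-D Rosenbrock function mentioned in the abstract:
# f(x, y) = (1 - x)**2 + 100 * (y - x**2)**2, global minimum at (1, 1).
def rosenbrock_grad(p):
    x, y = p
    return np.array([-2 * (1 - x) - 400 * x * (y - x**2),
                     200 * (y - x**2)])

theta = np.array([-1.5, 2.0])
m = np.zeros_like(theta)
v = np.zeros_like(theta)
for t in range(1, 20001):
    theta, m, v = sqfm_step(theta, rosenbrock_grad(theta), m, v, t, lr=1e-2)
print(theta)  # should move toward the global minimum at (1, 1)

Note that basing v on m**2 rather than grad**2 smooths the denominator, which is consistent with the abstract's claim of a smoother trajectory with less overshoot than Adam-family baselines.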
Keywords
diffGrad, Adam, AngularGrad, ResNet34, VGG16, ResNet18, DenseNet121, ResNet50, Neural network, Optimization, Deep learning