TaylorNet: A Taylor-Driven Generic Neural Architecture

Hongjue Zhao, Yizhuo Chen, Dachun Sun, Yingdong Hu, Kaizhao Liang, Yanbing Mao, Lui Sha, Huajie Shao

ICLR 2023 (2023)

Abstract

Physics-informed machine learning (PIML) aims to incorporate physics knowledge into deep neural networks (DNNs) to improve model generalization. However, existing PIML methods are either designed for specific problems or rely on black-box DNNs whose results are hard to interpret. In this work, we propose the Taylor Neural Network (TaylorNet), a generic neural architecture that parameterizes Taylor polynomials using DNNs without non-linear activation functions. The key challenges in developing TaylorNet are (i) mitigating the curse of dimensionality caused by higher-order terms, and (ii) improving the stability of model training. To overcome these challenges, we first adopt Tucker decomposition to decompose the higher-order derivatives in the Taylor expansion, parameterized by DNNs, into low-rank tensors. We then propose a novel reducible TaylorNet that further reduces computational complexity by removing redundant parameters in the hidden layers. To improve training accuracy and stability, we develop a new Taylor initialization method. Finally, the proposed models are evaluated on a broad spectrum of applications, including image classification, natural language processing (NLP), and dynamical systems. The results demonstrate that our proposed Taylor-Mixer, which replaces the MLP and activation layers in MLP-Mixer with Taylor layers, achieves comparable accuracy on image classification and on sentiment analysis in NLP while significantly reducing the number of model parameters. More importantly, our method can interpret some dynamical systems with Taylor polynomials. The results also show that our Taylor initialization significantly improves classification accuracy compared to Xavier and Kaiming initialization.
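
To make the low-rank idea in the abstract concrete, the following is a minimal NumPy sketch of a second-order Taylor polynomial whose quadratic coefficient tensor is stored in Tucker form. It is an illustration under stated assumptions, not the authors' implementation: the expansion point is taken to be the origin, the same factor matrix is shared across both input modes, and all names (`taylor2_tucker`, `U_in`, `U_out`, `G`) and sizes are hypothetical. The sketch checks that the factored form matches the dense tensor it represents and compares parameter counts for the quadratic term.

```python
import numpy as np

def taylor2_tucker(x, b, W, U_out, G, U_in):
    """Second-order Taylor polynomial about the origin, with no activation function.

    The quadratic coefficient tensor H (out, d, d) is never materialized;
    it is represented by a Tucker core G and factor matrices U_out, U_in.
    Sharing U_in across both input modes is an assumption of this sketch.
    """
    z = U_in.T @ x                             # project input to rank-r space, shape (r,)
    core = np.einsum("kij,i,j->k", G, z, z)    # contract the core with z twice, shape (r_out,)
    return b + W @ x + 0.5 * (U_out @ core)    # constant + linear + quadratic terms

def dense_quadratic_tensor(U_out, G, U_in):
    """Reconstruct the full (out, d, d) tensor from its Tucker factors."""
    return np.einsum("ok,kij,ai,bj->oab", U_out, G, U_in, U_in)

# Hypothetical sizes chosen only for illustration.
rng = np.random.default_rng(0)
d, out, r, r_out = 8, 3, 2, 2
x = rng.normal(size=d)
b = rng.normal(size=out)
W = rng.normal(size=(out, d))
U_in = rng.normal(size=(d, r))
U_out = rng.normal(size=(out, r_out))
G = rng.normal(size=(r_out, r, r))

y_lowrank = taylor2_tucker(x, b, W, U_out, G, U_in)

# Reference computation with the dense tensor.
H = dense_quadratic_tensor(U_out, G, U_in)
y_dense = b + W @ x + 0.5 * np.einsum("oab,a,b->o", H, x, x)

# Parameter counts for the quadratic term only.
dense_params = out * d * d
tucker_params = U_out.size + G.size + U_in.size
```

With these sizes the dense quadratic tensor has `out * d * d` entries, while the Tucker form stores only the two factor matrices and the small core, which is where the parameter saving for higher-order terms would come from.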

Keywords

Taylor Neural Networks, Image Classification, Physics Guided Machine Learning, Dynamical Systems
