Fast Evaluation and Approximation of the Gauss-Newton Hessian Matrix for the Multilayer Perceptron

SIAM Journal on Mathematics of Data Science (SIMODS), 2019

Abstract
We introduce a fast algorithm for entry-wise evaluation of the Gauss-Newton Hessian (GNH) matrix for the multilayer perceptron. The algorithm has a precomputation step and a sampling step. While it generally requires $O(Nn)$ work to compute an entry (and the entire column) of the GNH matrix for a neural network with $N$ parameters and $n$ data points, our fast sampling algorithm reduces the cost to $O(n + d/\epsilon^2)$ work, where $d$ is the output dimension of the network and $\epsilon$ is a prescribed accuracy (independent of $N$). One application of our algorithm is constructing the hierarchical-matrix ($\mathcal{H}$-matrix) approximation of the GNH matrix for solving linear systems and eigenvalue problems. While it generally requires $O(N^2)$ memory to store the GNH matrix and $O(N^3)$ work to factorize it, the $\mathcal{H}$-matrix approximation requires only an $O(N r_o)$ memory footprint and $O(N r_o^2)$ work to factorize, where $r_o \ll N$ is the maximum rank of the off-diagonal blocks of the GNH matrix. We demonstrate the performance of our fast algorithm and the $\mathcal{H}$-matrix approximation on classification and autoencoder neural networks.
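To make the baseline cost concrete: for squared loss the GNH is $G = \frac{1}{n}\sum_i J_i^\top J_i$, where $J_i$ is the Jacobian of the network output at data point $i$ with respect to the parameters, so one column $G e_k$ takes a forward (JVP) and a backward (VJP) pass per data point, i.e. $O(Nn)$ work. The sketch below (our illustration in JAX, not the paper's code; the network sizes and the `mlp`/`gnh_column` names are ours) computes one such exact column for a tiny MLP:

```python
import jax
import jax.numpy as jnp
from jax.flatten_util import ravel_pytree

def mlp(params, x):
    # Two-layer perceptron with tanh activation.
    W1, b1, W2, b2 = params
    h = jnp.tanh(x @ W1 + b1)
    return h @ W2 + b2

key = jax.random.PRNGKey(0)
k1, k2, kx = jax.random.split(key, 3)
n, p, m, d = 8, 5, 7, 3  # data points, input dim, hidden dim, output dim
params = (0.1 * jax.random.normal(k1, (p, m)), jnp.zeros(m),
          0.1 * jax.random.normal(k2, (m, d)), jnp.zeros(d))
X = jax.random.normal(kx, (n, p))

# Flatten parameters so GNH entries can be indexed by k = 1..N.
flat, unravel = ravel_pytree(params)
N = flat.size

def f(theta_flat, x):
    return mlp(unravel(theta_flat), x)

def gnh_column(k):
    # Exact column G e_k for squared loss (H_i = I):
    # average over data points of J_i^T (J_i e_k).
    e_k = jnp.zeros(N).at[k].set(1.0)
    def per_point(x):
        _, tangent = jax.jvp(lambda t: f(t, x), (flat,), (e_k,))  # J_i e_k
        _, vjp_fn = jax.vjp(lambda t: f(t, x), flat)
        return vjp_fn(tangent)[0]                                 # J_i^T (J_i e_k)
    return jnp.mean(jax.vmap(per_point)(X), axis=0)

col0 = gnh_column(0)  # one column of the N x N GNH matrix
```

Each call to `gnh_column` touches all $n$ data points and all $N$ parameters, which is exactly the $O(Nn)$ per-column cost that the paper's sampling algorithm reduces to $O(n + d/\epsilon^2)$.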