Hessian-Free Laplace in Bayesian Deep Learning
arXiv (2024)
Abstract
The Laplace approximation (LA) of the Bayesian posterior is a Gaussian
distribution centered at the maximum a posteriori estimate. Its appeal in
Bayesian deep learning stems from the ability to quantify uncertainty post-hoc
(i.e., after standard network parameter optimization), the ease of sampling
from the approximate posterior, and the analytic form of model evidence.
However, an important computational bottleneck of LA is the necessary step of
calculating and inverting the Hessian matrix of the log posterior. The Hessian
may be approximated in a variety of ways, with quality varying with a number of
factors including the network, dataset, and inference task. In this paper, we
propose an alternative framework that sidesteps Hessian calculation and
inversion. The Hessian-free Laplace (HFL) approximation uses curvature of both
the log posterior and the network prediction to estimate the predictive variance. Only two
point estimates are needed: the standard maximum a posteriori parameter and the
optimal parameter under a loss regularized by the network prediction. We show
that, under standard assumptions of LA in Bayesian deep learning, HFL targets
the same variance as LA, and can be efficiently amortized in a pre-trained
network. Experiments demonstrate comparable performance to that of exact and
approximate Hessians, with excellent coverage for in-between uncertainty.
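
The two-point-estimate idea can be illustrated on a toy problem. The sketch below is not the authors' implementation. It assumes a Bayesian linear regression with unit noise and a Gaussian prior, so the negative log posterior L is exactly quadratic, and it assumes the regularized objective takes the form L(theta) + eps * f(theta), where f is the prediction at a query input. Under that assumption the minimizer shifts by -eps * H^{-1} grad f, so (f(theta_map) - f(theta_eps)) / eps recovers the Laplace predictive variance grad f^T H^{-1} grad f without forming or inverting H. The names `minimize`, `x_query`, `eps`, and `prior_precision`, and the sign and scale of the regularizer, are illustrative choices.

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy data: linear model with unit-variance Gaussian noise (an assumption).
n, d = 50, 3
X = rng.normal(size=(n, d))
y = X @ np.array([1.0, -2.0, 0.5]) + rng.normal(size=n)
prior_precision = 1.0  # hypothetical Gaussian prior precision


def grad_neg_log_posterior(theta):
    # Gradient of 0.5*||X theta - y||^2 + 0.5*prior_precision*||theta||^2.
    return X.T @ (X @ theta - y) + prior_precision * theta


def minimize(grad_fn, theta0, step, iters=5000):
    # Plain gradient descent; only gradients are used, never the Hessian.
    theta = theta0.copy()
    for _ in range(iters):
        theta = theta - step * grad_fn(theta)
    return theta


# Safe step size: ||X||_F^2 upper-bounds the largest eigenvalue of X^T X.
step = 1.0 / (np.sum(X**2) + prior_precision)

x_query = np.array([0.3, -1.0, 2.0])  # hypothetical test input


def predict(theta):
    # Network prediction f(theta); linear here, so grad f = x_query.
    return x_query @ theta


# Point estimate 1: the standard MAP parameter.
theta_map = minimize(grad_neg_log_posterior, np.zeros(d), step)

# Point estimate 2: optimum of the loss regularized by the prediction,
# warm-started from the MAP estimate.
eps = 0.1
theta_eps = minimize(
    lambda th: grad_neg_log_posterior(th) + eps * x_query, theta_map, step
)

# Hessian-free estimate of the predictive variance.
hfl_var = (predict(theta_map) - predict(theta_eps)) / eps

# Reference: Laplace variance using the explicit Hessian (for checking only).
H = X.T @ X + prior_precision * np.eye(d)
la_var = x_query @ np.linalg.solve(H, x_query)

print(f"HFL estimate: {hfl_var:.6f}, explicit-Hessian LA: {la_var:.6f}")
```

Because the objective here is exactly quadratic, the two numbers agree for any `eps`. For a neural network the relation holds only to first order, so `eps` would need to be small relative to the curvature, and the second optimization would typically be a short fine-tuning run from the pre-trained weights.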