Riemannian Laplace approximations for Bayesian neural networks

Research output: Chapter in Book/Report/Conference proceeding › Article in proceedings › Research › peer-review


Abstract

Bayesian neural networks often approximate the weight-posterior with a Gaussian distribution. However, practical posteriors are often, even locally, highly non-Gaussian, and empirical performance deteriorates. We propose a simple parametric approximate posterior that adapts to the shape of the true posterior through a Riemannian metric that is determined by the log-posterior gradient. We develop a Riemannian Laplace approximation where samples naturally fall into weight regions with low negative log-posterior. We show that these samples can be drawn by solving a system of ordinary differential equations, which can be done efficiently by leveraging the structure of the Riemannian metric and automatic differentiation. Empirically, we demonstrate that our approach consistently improves over the conventional Laplace approximation across tasks. We further show that, unlike the conventional Laplace approximation, our method is not overly sensitive to the choice of prior, which alleviates a practical pitfall of current approaches.
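The sampling idea in the abstract can be illustrated with a small sketch. Everything specific here is an assumption made for illustration and is not taken from the abstract: the metric is taken to be of the form M(θ) = I + ∇L(θ)∇L(θ)ᵀ (one metric that is determined by the gradient of the negative log-posterior L), the loss is a hypothetical two-parameter "banana" function with hand-written derivatives in place of automatic differentiation, and the ODE is integrated with a fixed-step RK4 scheme rather than an adaptive solver. Under that metric, the geodesic equation is c'' = -∇L(c) (c'ᵀ H(c) c') / (1 + ‖∇L(c)‖²), so a Gaussian tangent vector drawn at the mode is carried along a curve that bends toward low-loss regions.

```python
import random

# Hypothetical toy negative log-posterior (not from the paper):
#   L(x, y) = 0.5 * x^2 + 0.5 * (y - x^2)^2, with its mode at (0, 0).

def grad(theta):
    """Gradient of the toy loss."""
    x, y = theta
    r = y - x * x
    return (x - 2.0 * x * r, r)

def hess_quad(theta, v):
    """Quadratic form v^T H(theta) v, with H the Hessian of the toy loss."""
    x, y = theta
    hxx = 1.0 - 2.0 * y + 6.0 * x * x
    hxy = -2.0 * x
    hyy = 1.0
    return hxx * v[0] * v[0] + 2.0 * hxy * v[0] * v[1] + hyy * v[1] * v[1]

def geodesic_rhs(state):
    """First-order form of the geodesic ODE for the assumed metric I + g g^T."""
    x, y, vx, vy = state
    g = grad((x, y))
    scale = hess_quad((x, y), (vx, vy)) / (1.0 + g[0] * g[0] + g[1] * g[1])
    return (vx, vy, -g[0] * scale, -g[1] * scale)

def exp_map(theta0, v0, steps=200):
    """Integrate the geodesic from t=0 to t=1 with fixed-step RK4.

    Returns the end point and the end velocity.
    """
    h = 1.0 / steps
    state = (theta0[0], theta0[1], v0[0], v0[1])
    for _ in range(steps):
        k1 = geodesic_rhs(state)
        k2 = geodesic_rhs(tuple(s + 0.5 * h * k for s, k in zip(state, k1)))
        k3 = geodesic_rhs(tuple(s + 0.5 * h * k for s, k in zip(state, k2)))
        k4 = geodesic_rhs(tuple(s + h * k for s, k in zip(state, k3)))
        state = tuple(
            s + (h / 6.0) * (a + 2.0 * b + 2.0 * c + d)
            for s, a, b, c, d in zip(state, k1, k2, k3, k4)
        )
    return state[:2], state[2:]

def sample(rng, theta_map=(0.0, 0.0)):
    """Draw one sample: Gaussian tangent vector at the mode, then exp map.

    The Hessian of the toy loss at the mode is the identity, so the
    Laplace covariance used for the tangent vector is the identity too.
    """
    v0 = (rng.gauss(0.0, 1.0), rng.gauss(0.0, 1.0))
    theta, _ = exp_map(theta_map, v0)
    return theta

rng = random.Random(0)
samples = [sample(rng) for _ in range(3)]
```

In this toy setting, a tangent vector pointing along the x-axis produces an end point with positive y, since the curve bends toward the low-loss valley y = x² instead of following the straight line a Gaussian sample would take. For a real network, the hand-written gradient and Hessian form would be replaced by automatic differentiation (for example Hessian-vector products), which is the part the abstract describes as making the ODE solve efficient.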
Original language: English
Title of host publication: Proceedings of the 37th Conference on Neural Information Processing Systems
Number of pages: 28
Volume: 36
Publisher: Neural Information Processing Systems Foundation
Publication date: 2023
Publication status: Published - 2023
Event: 37th Annual Conference on Neural Information Processing Systems - Ernest N. Morial Convention Center, New Orleans, United States
Duration: 10 Dec 2023 → 16 Dec 2023
Conference number: 37

Conference

Conference: 37th Annual Conference on Neural Information Processing Systems
Number: 37
Location: Ernest N. Morial Convention Center
Country/Territory: United States
City: New Orleans
Period: 10/12/2023 → 16/12/2023
