Adaptive Cholesky Gaussian Processes

Research output: Chapter in Book/Report/Conference proceeding › Article in proceedings › Research › peer-review


Abstract

We present a method to approximate Gaussian process regression models for large datasets by considering only a subset of the data. Our approach is novel in that the size of the subset is selected on the fly during exact inference with little computational overhead. From an empirical observation that the log-marginal likelihood often exhibits a linear trend once a sufficient subset of a dataset has been observed, we conclude that many large datasets contain redundant information that only slightly affects the posterior. Based on this, we provide probabilistic bounds on the full model evidence that can identify such subsets. Remarkably, these bounds are largely composed of terms that appear in intermediate steps of the standard Cholesky decomposition, allowing us to modify the algorithm to adaptively stop the decomposition once enough data have been observed.
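The exact probabilistic bounds and stopping rule are given in the paper; as a rough illustration of the underlying idea only, the following minimal NumPy sketch (not the authors' implementation) runs a blocked Cholesky factorisation, accumulates the exact log-marginal likelihood of the data processed so far from the intermediate Cholesky quantities, and stops once the per-datum contribution has stabilised. The kernel, block size, and tolerance-based stopping criterion are placeholder assumptions standing in for the paper's bounds.

```python
import numpy as np

def rbf_kernel(X1, X2, lengthscale=1.0, variance=1.0):
    """Squared-exponential kernel; any positive-definite kernel could be used."""
    d2 = np.sum(X1**2, 1)[:, None] + np.sum(X2**2, 1)[None, :] - 2.0 * X1 @ X2.T
    return variance * np.exp(-0.5 * d2 / lengthscale**2)

def adaptive_cholesky_lml(X, y, kernel=rbf_kernel, noise=1e-2,
                          block=256, rel_tol=1e-3):
    """Blocked Cholesky of K + noise*I that stops early.

    After each block it updates the log-marginal likelihood of the data seen
    so far (using the Cholesky diagonal for log|K| and alpha = L^{-1} y for
    the quadratic term) and stops when the per-datum contribution changes by
    less than rel_tol -- a crude stand-in for the paper's bounds.
    Returns the subset size, the partial factor, and a naive linear
    extrapolation of the log-marginal likelihood to the full dataset.
    """
    n = X.shape[0]
    L = np.zeros((n, n))
    alpha = np.zeros(n)            # alpha = L^{-1} y, built block by block
    lml, prev_rate, m = 0.0, None, 0
    while m < n:
        b = min(block, n - m)
        rows, done = slice(m, m + b), slice(0, m)
        # Off-diagonal block against already-processed columns: L21 = K21 L11^{-T}
        K_mb = kernel(X[rows], X[done])
        L[rows, done] = np.linalg.solve(L[done, done], K_mb.T).T if m > 0 else 0.0
        # Diagonal block: factorise the Schur complement
        K_bb = kernel(X[rows], X[rows]) + noise * np.eye(b)
        S = K_bb - L[rows, done] @ L[rows, done].T
        L[rows, rows] = np.linalg.cholesky(S)
        # Update alpha and the running log-marginal likelihood of the subset
        r = y[rows] - L[rows, done] @ alpha[done]
        alpha[rows] = np.linalg.solve(L[rows, rows], r)
        lml += (-0.5 * np.sum(alpha[rows]**2)
                - np.sum(np.log(np.diag(L[rows, rows])))
                - 0.5 * b * np.log(2.0 * np.pi))
        m += b
        # Stop once the per-datum contribution has (roughly) flattened out.
        rate = lml / m
        if prev_rate is not None and abs(rate - prev_rate) < rel_tol * abs(prev_rate):
            break
        prev_rate = rate
    return m, L[:m, :m], rate * n
```

The key observation the sketch mirrors is that both ingredients of the log-marginal likelihood, the log-determinant and the quadratic form, fall out of intermediate Cholesky quantities, so monitoring them adds essentially no overhead to the factorisation itself.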
Original language: English
Title of host publication: Proceedings of the 26th International Conference on Artificial Intelligence and Statistics
Volume: 206
Publisher: Proceedings of Machine Learning Research
Publication date: 2023
Pages: 408-452
Publication status: Published - 2023
Event: 26th International Conference on Artificial Intelligence and Statistics - Valencia, Spain
Duration: 25 Apr 2023 - 27 Apr 2023
Conference number: 26

Conference

Conference: 26th International Conference on Artificial Intelligence and Statistics
Number: 26
Country/Territory: Spain
City: Valencia
Period: 25/04/2023 - 27/04/2023
Series: Proceedings of Machine Learning Research

