Abstract
Bayesian predictions are stochastic, just like the predictions of any other inference scheme that generalizes from a finite sample. While a simple variational argument shows that Bayes averaging is generalization optimal when the prior matches the teacher parameter distribution, the situation is less clear if the teacher distribution is unknown. I define a class of averaging procedures, the temperated likelihoods, which includes both Bayes averaging with a uniform prior and maximum likelihood estimation as special cases. I show that Bayes is generalization optimal in this family for any teacher distribution in two analytically tractable learning problems: learning the mean of a Gaussian and the asymptotics of smooth learners.
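As a rough illustration of the temperated-likelihood family (my reading of the abstract, not a formula quoted from the paper), one can take the posterior p_beta(theta|D) ∝ p(D|theta)^beta p(theta): beta = 1 recovers Bayes averaging, while beta → ∞ concentrates the average on the maximum-likelihood estimate. The sketch below is a minimal Monte Carlo check of the Gaussian-mean claim under assumed conditions: a flat prior, known noise variance sigma^2, and a uniform teacher distribution chosen arbitrarily to illustrate the "any teacher distribution" point. All variable names are hypothetical.

```python
# Minimal sketch (illustrative assumptions, not the paper's code): for the
# Gaussian-mean problem with flat prior, the temperated posterior
# p_beta(theta | D) ∝ prod_i N(x_i | theta, sigma^2)^beta
# is N(theta | mean(x), sigma^2 / (beta * N)), so the predictive is
# N(x | mean(x), sigma^2 * (1 + 1 / (beta * N))).
import numpy as np

rng = np.random.default_rng(0)
sigma2 = 1.0          # known observation noise variance
N = 10                # training-sample size
trials = 200_000      # Monte Carlo repetitions
betas = np.linspace(0.2, 3.0, 15)

# Teacher means drawn from an arbitrary non-Gaussian distribution,
# to illustrate the "any teacher distribution" claim.
theta_star = rng.uniform(-5.0, 5.0, size=trials)
xbar = theta_star + rng.normal(0, np.sqrt(sigma2 / N), size=trials)  # sample mean
x_new = theta_star + rng.normal(0, np.sqrt(sigma2), size=trials)     # test point

for beta in betas:
    # Expected negative log predictive density (generalization loss)
    s2 = sigma2 * (1.0 + 1.0 / (beta * N))
    nll = 0.5 * np.log(2 * np.pi * s2) + (x_new - xbar) ** 2 / (2 * s2)
    print(f"beta = {beta:4.2f}  expected NLL ~ {nll.mean():.5f}")
# The minimum lands at beta ~ 1, i.e. plain Bayes averaging.
```

In this toy setup the optimum can also be read off in closed form: the expected loss is minimized when the predictive variance sigma^2 (1 + 1/(beta N)) matches E[(x - xbar)^2] = sigma^2 (1 + 1/N), which holds exactly at beta = 1, independently of the teacher distribution.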
| Original language | English |
|---|---|
| Title of host publication | Advances in Neural Information Processing Systems 1999 |
| Publisher | MIT Press |
| Publication date | 2000 |
| Pages | 265-271 |
| Publication status | Published - 2000 |
| Event | Advances in Neural Information Processing Systems 12, Denver, United States (29 Nov 1999 → 4 Dec 1999) |
Conference
| Conference | Advances in Neural Information Processing Systems 12 |
|---|---|
| Country/Territory | United States |
| City | Denver |
| Period | 29/11/1999 → 04/12/1999 |