A survey and benchmark of high-dimensional Bayesian optimization of discrete sequences

Miguel Gonzalez-Duque*, Richard Michael, Simon Bartels, Yevgen Zainchkovskyy, Søren Hauberg*, Wouter Boomsma*

*Corresponding author for this work

Research output: Chapter in Book/Report/Conference proceeding › Article in proceedings › Research › peer-review

Abstract

Optimizing discrete black box functions is key in several domains, e.g. protein engineering and drug design. Due to the lack of gradient information and the need for sample efficiency, Bayesian optimization is an ideal candidate for these tasks. Several methods for high-dimensional continuous and categorical Bayesian optimization have been proposed recently. However, our survey of the field reveals highly heterogeneous experimental set-ups across methods and technical barriers for the replicability and application of published algorithms to real-world tasks. To address these issues, we develop a unified framework to test a vast array of high-dimensional Bayesian optimization methods and a collection of standardized black box functions representing real-world application domains in chemistry and biology. These two components of the benchmark are each supported by flexible, scalable, and easily extendable software libraries (poli and poli-baselines), allowing practitioners to readily incorporate new optimization objectives or discrete optimizers.

Project website: https://machinelearninglifescience.github.io/hdbo_benchmark

Original language: English
Title of host publication: Proceedings of the 38th Conference on Neural Information Processing Systems (NeurIPS 2024)
Number of pages: 31
Publication date: 2024
Publication status: Published - 2024
Event: 38th Conference on Neural Information Processing Systems - Vancouver, Canada
Duration: 10 Dec 2024 to 15 Dec 2024

Conference

Conference: 38th Conference on Neural Information Processing Systems
Country/Territory: Canada
City: Vancouver
Period: 10/12/2024 to 15/12/2024
