Bayesian Optimization at 1,000 Dimensions: Why It Works, What Breaks, and What’s Next
Abstract:
Bayesian optimization (BO) with Gaussian-process (GP) surrogates is often said to "top out" around ~20 dimensions under realistic evaluation budgets due to the curse of dimensionality. In response, high-dimensional BO (HDBO) has produced a rich toolbox of optimization algorithms that report strong performance on problems with hundreds or even thousands of variables under specific assumptions. Yet recent benchmark studies paint a more puzzling picture: surprisingly vanilla GP-BO configurations, differing only in minor implementation choices, can match or outperform many specialized HDBO methods. So what, exactly, did the sophisticated methods solve, and why do the simple ones work?
In this talk I retrace the trajectory from structured HDBO to this new "simplicity" era. I first present BAxUS and Bounce, which adaptively expand nested subspace embeddings to handle high-dimensional continuous and mixed/combinatorial spaces. Using these as a lens, I then dissect common HDBO benchmarks and show how seemingly state-of-the-art gains can arise from unintended or "too-helpful" structure in the test problems rather than from genuinely scalable modeling.
The second half focuses on a diagnostic explanation for strong vanilla performance. I highlight two mechanisms: (i) vanishing gradients in both GP marginal-likelihood training and acquisition-function optimization, which can silently freeze learning unless length-scale initialization (or priors) is scaled with dimensionality; and (ii) implicit locality induced by widely used acquisition optimizers, especially "sample-around-best" / RAASP-style candidate generation, which effectively turns global BO into a robust local search procedure. The takeaway is that much reported HDBO success is driven by effective locality, and that many benchmarks are easier than their nominal dimension suggests. I close with open challenges around realistic benchmark design, budget-aware model complexity, and principled local-search hybrids.
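To make the two mechanisms concrete, here is a minimal NumPy sketch, not taken from any of the cited methods: a dimension-scaled length-scale initialization (one common choice is to grow the initial length scale with sqrt(d) so pairwise kernel correlations do not collapse toward zero as dimensionality increases), and a sample-around-best candidate generator that perturbs only a small random subset of coordinates of the incumbent. The function names, the sqrt(d) scaling constant, the perturbation probability `20/d`, and the `[0, 1]` box are illustrative assumptions, not the specific settings used in BAxUS, Bounce, or any particular vanilla-BO paper.

```python
import numpy as np

def init_lengthscale(d, base=1.0):
    # Mechanism (i): scale the initial GP length scale with sqrt(d).
    # With a fixed O(1) length scale, squared distances between random
    # points grow linearly in d, kernel correlations vanish, and the
    # gradients of the marginal likelihood and acquisition flatten out.
    return base * np.sqrt(d)

def sample_around_best(best_x, n_cand, scale=0.1, p=None, rng=None):
    # Mechanism (ii): RAASP-style "sample-around-best" generation.
    # Copy the incumbent and perturb each coordinate only with small
    # probability p, so candidates stay local to the best point so far.
    rng = np.random.default_rng(rng)
    d = best_x.shape[0]
    p = min(1.0, 20.0 / d) if p is None else p  # ~20 coords moved on average
    cands = np.tile(best_x, (n_cand, 1))
    mask = rng.random((n_cand, d)) < p
    mask[~mask.any(axis=1), 0] = True           # perturb at least one coord
    cands[mask] += scale * rng.standard_normal(mask.sum())
    return np.clip(cands, 0.0, 1.0)             # assumed [0, 1]^d box
```

Even at d = 1,000, each candidate differs from the incumbent in only a handful of coordinates, which is why maximizing an acquisition function over such candidates behaves like a robust local search rather than a global one.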