We present a general strategy for turning generative models into candidate-solution samplers for batch Bayesian optimization (BO). Using generative models in BO enables large-batch scaling via generative sampling, optimization over non-continuous design spaces, and high-dimensional and combinatorial design. Inspired by the success of direct preference optimization (DPO), we show that a generative model can be trained on noisy, simple utility values computed directly from observations, so as to form proposal distributions whose densities are proportional to the expected utility, i.e., to BO's acquisition function values. Moreover, the approach generalizes beyond preference-based feedback to arbitrary reward signals and loss functions. This perspective avoids constructing surrogate (regression or classification) models, which are common in previous methods that use generative models for black-box optimization. Theoretically, we show that the generative models within the BO process approximately follow a sequence of distributions that, under certain conditions, asymptotically concentrates at the global optima. We also demonstrate this effect empirically on challenging optimization problems involving large batches in high dimensions.
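The core loop described above (sample a batch from a generative model, score it with a simple utility computed from observations, and retrain the model so its density tracks high-utility regions) can be illustrated with a deliberately minimal sketch. This is not the paper's method: the "generative model" here is just a diagonal Gaussian refit each round by utility-weighted maximum likelihood, and the softmax weighting, batch size, and objective are all illustrative choices.

```python
import numpy as np

def utility_weighted_bo(objective, dim, batch_size=64, rounds=20, seed=0):
    """Toy stand-in for generative-model batch BO: a diagonal Gaussian
    sampler is refit each round with utility-weighted MLE, so its density
    shifts toward regions of higher observed utility."""
    rng = np.random.default_rng(seed)
    mean, std = np.zeros(dim), np.ones(dim)
    best_x, best_y = None, -np.inf
    for _ in range(rounds):
        # Propose a batch of candidates from the current sampler.
        X = rng.normal(mean, std, size=(batch_size, dim))
        y = np.array([objective(x) for x in X])
        if y.max() > best_y:
            best_y, best_x = y.max(), X[y.argmax()]
        # Noisy scalar utilities -> normalized weights (softmax).
        w = np.exp(y - y.max())
        w /= w.sum()
        # Utility-weighted MLE update of the sampler's parameters.
        mean = w @ X
        std = np.sqrt(w @ (X - mean) ** 2 + 1e-6)
    return best_x, best_y

# Usage: maximize -||x - 2||^2 in 5 dimensions (optimum at x = 2).
best_x, best_y = utility_weighted_bo(lambda x: -np.sum((x - 2.0) ** 2), dim=5)
```

Replacing the Gaussian with a deep generative model and the weighted-MLE step with a DPO-style objective recovers the spirit of the approach while keeping the same sample/score/retrain structure.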