Bayesian optimization has emerged as a powerful strategy for accelerating scientific discovery through autonomous experimentation. However, accurately estimating materials properties often requires expensive measurements, which can quickly become a bottleneck in exhaustive materials discovery campaigns. Here, we introduce Gemini: a data-driven model that uses inexpensive measurements as proxies for expensive ones by correcting systematic biases between property evaluation methods. We recommend Gemini for regression tasks with sparse data and for autonomous workflows, where its predictions of expensive-to-evaluate objectives can be used to construct a more informative acquisition function, thus reducing the number of expensive evaluations an optimizer needs to reach desired target values. In a regression setting, we showcase the method's ability to accurately predict DFT-calculated bandgaps of hybrid organic-inorganic perovskite materials. We further demonstrate the benefits of Gemini for autonomous workflows by augmenting the Bayesian optimizer Phoenics to yield a scalable optimization framework that leverages multiple sources of measurement. Finally, we simulate an autonomous materials discovery platform for optimizing the activity of electrocatalysts for the oxygen evolution reaction. Using Gemini in this autonomous workflow, we show that the number of measurements needed to reach a target overpotential in a composition space comprising expensive and rare metals is significantly reduced when measurements from a proxy composition system with less expensive metals are available.
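To make the idea of correcting a systematic bias between a cheap and an expensive evaluation method concrete, the following is a minimal sketch and not Gemini's actual model. It assumes the bias is a simple linear map (a scale and an offset), fits that map by least squares on a few paired measurements, and then uses abundant cheap measurements to predict the expensive property. The function names, the synthetic data, and the linear form of the bias are all illustrative assumptions.

```python
import numpy as np


def fit_bias_correction(cheap_paired, expensive_paired):
    """Least-squares fit of expensive ~ slope * cheap + offset (assumed linear bias)."""
    cheap_paired = np.asarray(cheap_paired, dtype=float)
    expensive_paired = np.asarray(expensive_paired, dtype=float)
    # Design matrix with a column for the slope and a column for the offset.
    design = np.column_stack([cheap_paired, np.ones_like(cheap_paired)])
    (slope, offset), *_ = np.linalg.lstsq(design, expensive_paired, rcond=None)
    return slope, offset


def predict_expensive(cheap_values, slope, offset):
    """Map cheap measurements onto the scale of the expensive method."""
    return slope * np.asarray(cheap_values, dtype=float) + offset


# Synthetic illustration (hypothetical data, not from the paper).
rng = np.random.default_rng(0)
x = np.linspace(0.0, 1.0, 50)

# Cheap proxy measured everywhere.
cheap = np.sin(2.0 * np.pi * x)

# Expensive property differs from the proxy by a systematic scale and shift.
true_expensive = 1.3 * cheap + 0.4

# Only five expensive measurements are available (the sparse-data regime).
paired_idx = np.array([0, 12, 25, 37, 49])
expensive_obs = true_expensive[paired_idx] + rng.normal(0.0, 0.01, size=paired_idx.size)

slope, offset = fit_bias_correction(cheap[paired_idx], expensive_obs)
predicted = predict_expensive(cheap, slope, offset)
max_error = float(np.max(np.abs(predicted - true_expensive)))

print(f"slope={slope:.3f}, offset={offset:.3f}, max error={max_error:.4f}")
```

In an autonomous workflow of the kind the abstract describes, predictions such as `predicted` could inform the acquisition function so that expensive evaluations are spent only where they are most useful. A real bias between evaluation methods may well be nonlinear, in which case a more flexible model would be needed.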