Many important tasks in chemistry revolve around molecules undergoing reactions. Modeling them requires predictions far from equilibrium, whereas most recent work in machine learning for molecules has focused on equilibrium or near-equilibrium states. In this paper we aim to extend this scope in three ways. First, we propose the DimeNet++ model, which is 8x faster and 10% more accurate than the original DimeNet on the QM9 benchmark of equilibrium molecules. Second, we validate DimeNet++ on highly reactive molecules by developing the challenging COLL dataset, which contains distorted configurations of small molecules during collisions. Finally, we investigate ensembling and mean-variance estimation for uncertainty quantification, with the goal of accelerating the exploration of the vast space of non-equilibrium structures. Our DimeNet++ implementation and the COLL dataset are available online.
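The two uncertainty quantification approaches named above can be illustrated with a minimal sketch. It is not the DimeNet++ implementation: the function names `gaussian_nll` and `ensemble_uncertainty` are hypothetical, and the sketch assumes a common formulation in which mean-variance estimation trains a model to output a mean and a log-variance under a Gaussian negative log-likelihood, while an ensemble's total variance is taken as the spread of the member means plus the average member variance.

```python
import numpy as np


def gaussian_nll(y_true, mean, log_var):
    """Per-sample Gaussian negative log-likelihood (constant term dropped).

    Assumed loss for mean-variance estimation: the model predicts both a
    mean and a log-variance for each target.
    """
    y_true = np.asarray(y_true, dtype=float)
    mean = np.asarray(mean, dtype=float)
    log_var = np.asarray(log_var, dtype=float)
    return 0.5 * (log_var + (y_true - mean) ** 2 / np.exp(log_var))


def ensemble_uncertainty(member_means, member_vars=None):
    """Combine predictions of M ensemble members for N samples.

    member_means: array of shape (M, N) with each member's predicted mean.
    member_vars:  optional array of shape (M, N) with each member's
                  predicted variance (from mean-variance estimation).
    Returns the ensemble mean and the total predictive variance.
    """
    member_means = np.asarray(member_means, dtype=float)
    mean = member_means.mean(axis=0)
    # Disagreement between members (population variance over members).
    total_var = member_means.var(axis=0)
    if member_vars is not None:
        # Add the average variance that the members predict themselves.
        total_var = total_var + np.asarray(member_vars, dtype=float).mean(axis=0)
    return mean, total_var


# Two hypothetical ensemble members predicting two energies.
means = [[1.0, 2.0], [3.0, 2.0]]
variances = [[0.5, 0.5], [0.5, 0.5]]
pred_mean, pred_var = ensemble_uncertainty(means, variances)
```

In an exploration setting, structures with a large predicted variance would be the natural candidates for additional reference calculations; how the variance is thresholded or ranked is left open here.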