We introduce a novel class of localized atomic environment representations based on the Coulomb matrix. By combining these representations with the Gaussian approximation potential approach, we present LC-GAP, a new system for generating atomic potentials through machine learning (ML). Tests on the QM7, QM7b and GDB9 biomolecular datasets demonstrate that potentials created with LC-GAP can predict atomization energies to chemical accuracy for molecules larger than those used for training, and can (in the case of QM7b) also predict a range of other atomic properties with accuracy in line with the recent literature. Because the dimensionality of the best-performing representation is only linear in the number of atoms in a local atomic environment, LC-GAP improves on similar Coulomb matrix-based methods in both prediction accuracy and computational cost.
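To make the idea of a localized, Coulomb matrix-based representation concrete, the following is a minimal sketch rather than the LC-GAP definition, which the abstract does not spell out. The function `coulomb_matrix` follows the standard Coulomb matrix form (diagonal 0.5 Z_i^2.4, off-diagonal Z_i Z_j / |R_i - R_j|). The function `local_coulomb_row`, its `cutoff` parameter, and its sorting convention are hypothetical choices made only for illustration; they show how a descriptor can have length linear in the number of atoms inside a local environment.

```python
import numpy as np


def coulomb_matrix(Z, R):
    """Standard Coulomb matrix for nuclear charges Z and positions R.

    Diagonal: 0.5 * Z_i**2.4; off-diagonal: Z_i * Z_j / |R_i - R_j|.
    Units are whatever Z and R are given in (atomic units are assumed
    in the usual definition).
    """
    Z = np.asarray(Z, dtype=float)
    R = np.asarray(R, dtype=float)
    # Pairwise distances between all atoms.
    dist = np.linalg.norm(R[:, None, :] - R[None, :, :], axis=-1)
    # Placeholder on the diagonal to avoid dividing by zero;
    # the diagonal is overwritten below.
    np.fill_diagonal(dist, 1.0)
    M = np.outer(Z, Z) / dist
    np.fill_diagonal(M, 0.5 * Z ** 2.4)
    return M


def local_coulomb_row(Z, R, center, cutoff):
    """Hypothetical local descriptor (an assumption, not the paper's).

    Takes the atoms within `cutoff` of atom `center`, builds their
    Coulomb matrix, and returns the central atom's row: the self term
    first, then the neighbour terms sorted in decreasing order. The
    length equals the number of atoms in the environment, so it grows
    linearly with that number.
    """
    Z = np.asarray(Z, dtype=float)
    R = np.asarray(R, dtype=float)
    d = np.linalg.norm(R - R[center], axis=1)
    idx = np.flatnonzero(d <= cutoff)          # atoms in the environment
    M = coulomb_matrix(Z[idx], R[idx])
    c = int(np.flatnonzero(idx == center)[0])  # central atom's local index
    row = M[c]
    neighbours = np.sort(np.delete(row, c))[::-1]
    return np.concatenate(([row[c]], neighbours))
```

A full-matrix descriptor for the same environment would have a number of entries quadratic in the environment size, which is the cost comparison the abstract alludes to; the row-based form above is only one of several ways such a linear-size descriptor could be constructed.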