Applying deep learning concepts from image detection and graph theory has greatly advanced protein-ligand binding affinity prediction, a challenge with enormous ramifications for both drug discovery and protein engineering. We build upon these advances by designing a novel deep learning architecture consisting of a 3-dimensional convolutional neural network utilizing channel-wise attention and two graph convolutional networks utilizing attention-based aggregation of node features. HAC-Net (Hybrid Attention-Based Convolutional Neural Network) obtains state-of-the-art results on the PDBbind v.2016 core set, the most widely recognized benchmark in the field. We extensively assess the generalizability of our model using multiple train-test splits, each of which maximizes training-test differences in protein structures, protein sequences, or ligand extended-connectivity fingerprints. Furthermore, we perform 10-fold cross-validation with a similarity cutoff between SMILES strings of ligands in the training and test sets, and also evaluate the performance of HAC-Net on lower-quality data. We envision that this model can be extended to a broad range of supervised learning problems related to structure-based biomolecular property prediction. All of our software is available as open source at https://github.com/gregory-kyro/HAC-Net/.
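To illustrate the idea of a similarity cutoff between training and test ligands, the following is a minimal sketch rather than the procedure used in HAC-Net. It assumes that each ligand is represented as a set of fingerprint on-bit indices and that similarity is measured with the Tanimoto coefficient; the function names `tanimoto` and `filter_training_set`, the cutoff value, and the toy fingerprints are all hypothetical.

```python
def tanimoto(a, b):
    """Tanimoto (Jaccard) similarity between two sets of on-bit indices."""
    a, b = set(a), set(b)
    union = a | b
    if not union:
        return 0.0  # convention chosen here for two empty fingerprints
    return len(a & b) / len(union)


def filter_training_set(train, test, cutoff):
    """Keep training ligands whose highest similarity to any test ligand
    is strictly below the cutoff."""
    kept = {}
    for name, fingerprint in train.items():
        max_sim = max(
            (tanimoto(fingerprint, t) for t in test.values()), default=0.0
        )
        if max_sim < cutoff:
            kept[name] = fingerprint
    return kept


# Toy fingerprints: hypothetical on-bit indices, not real ligands.
train = {
    "lig_a": {1, 2, 3, 4},   # similarity 4/5 to lig_x, so it is dropped
    "lig_b": {10, 11, 12},   # no shared bits with lig_x, so it is kept
    "lig_c": {1, 2, 8, 9},   # similarity 2/7 to lig_x, so it is kept
}
test = {"lig_x": {1, 2, 3, 4, 5}}

kept = filter_training_set(train, test, cutoff=0.5)
print(sorted(kept))
```

With these toy values, only the training ligand that closely resembles the test ligand is excluded, which is the effect a similarity cutoff is meant to have on a train-test split.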