How can we design protein sequences that fold into desired structures both effectively and efficiently? Structure-based protein design has attracted increasing attention in recent years; however, few methods improve accuracy and efficiency simultaneously, owing to a lack of expressive features and a reliance on autoregressive sequence decoders. To address these issues, we propose ProDesign, which combines a novel residue featurizer with ProGNN layers to generate protein sequences in a one-shot manner with improved recovery. Experiments show that ProDesign achieves 51.66\% recovery on CATH 4.2, while its inference is 70 times faster than that of autoregressive competitors. In addition, ProDesign achieves recovery scores of 58.72\% and 60.42\% on TS50 and TS500, respectively. We conduct comprehensive ablation studies to reveal the roles of different types of protein features and model designs, which may inspire further simplification and improvement.