Thanks to automated cryo-EM and GPU-accelerated processing, single-particle cryo-EM has become a rapid structure determination method that can capture dynamic structures of molecules in solution. This was recently demonstrated by the determination of the COVID-19 spike protein structure in March 2020, shortly after the outbreak in late January 2020. Such speed is critical for vaccine development in response to an emerging pandemic. The demand for speed may also explain why 2D classification based on multi-reference alignment (MRA) is not as popular as the Bayesian-based approach, even though MRA has an advantage in differentiating subtle structural variations at low signal-to-noise ratio (SNR): MRA is a time-consuming process, and a modular GPU-accelerated package for MRA is still lacking. Here, we introduce a library called Cryo-RALib that contains modular GPU-accelerated routines for speeding up MRA-based classification algorithms. In addition, we connect cryo-EM image analysis with the Python data science stack to make data analysis and visualization easier for users. Benchmarking on a Taiwan Computing Cloud (TWCC) container shows that our implementation can accelerate the computation by one order of magnitude. The library is publicly available at https://github.com/phonchi/Cryo-RAlib.
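To make the idea of MRA-based classification concrete, the following is a minimal CPU sketch in NumPy, not the Cryo-RALib API. It assumes translation-only alignment with circular (wrap-around) shifts and omits the rotational search that a real MRA pipeline also performs. The function names `best_shift_score` and `mra_iteration` are hypothetical and chosen only for illustration.

```python
import numpy as np


def best_shift_score(image, reference):
    """Find the circular shift that best matches `image` to `reference`.

    Uses FFT-based cross-correlation. Returns (peak_value, (dy, dx)), where
    (dy, dx) is the shift by which `image` is displaced relative to
    `reference`. Hypothetical helper, translation-only.
    """
    cc = np.fft.ifft2(np.fft.fft2(image) * np.conj(np.fft.fft2(reference))).real
    idx = np.unravel_index(np.argmax(cc), cc.shape)
    return cc[idx], (int(idx[0]), int(idx[1]))


def mra_iteration(images, references):
    """One multi-reference alignment pass (illustrative sketch).

    Each image is compared against every reference, assigned to the
    reference with the highest correlation peak, shifted back into
    register, and the class averages become the updated references.
    """
    images = np.asarray(images, dtype=float)
    references = np.asarray(references, dtype=float)
    labels = np.empty(len(images), dtype=int)
    aligned = np.empty_like(images)

    for i, img in enumerate(images):
        results = [best_shift_score(img, ref) for ref in references]
        k = int(np.argmax([score for score, _ in results]))
        dy, dx = results[k][1]
        labels[i] = k
        # Undo the detected displacement so the image overlays its reference.
        aligned[i] = np.roll(img, (-dy, -dx), axis=(0, 1))

    new_refs = references.copy()
    for k in range(len(references)):
        members = aligned[labels == k]
        if len(members):  # keep the old reference if a class is empty
            new_refs[k] = members.mean(axis=0)
    return labels, new_refs


# Small synthetic demo: two random "references", four shifted noisy copies.
rng = np.random.default_rng(0)
refs = rng.normal(size=(2, 16, 16))
imgs = np.stack([
    np.roll(refs[0], (3, 5), axis=(0, 1)),
    np.roll(refs[0], (7, 1), axis=(0, 1)),
    np.roll(refs[1], (2, 2), axis=(0, 1)),
    np.roll(refs[1], (9, 4), axis=(0, 1)),
]) + 0.1 * rng.normal(size=(4, 16, 16))

labels, updated_refs = mra_iteration(imgs, refs)
```

The cost of each pass grows with the number of images times the number of references, and every comparison is an independent FFT-based correlation. That structure is what makes this kind of algorithm slow on a CPU and a natural fit for GPU batching. One possible route, stated here as an assumption rather than a description of how Cryo-RALib is built, is to run the same array operations through a library with a NumPy-compatible interface such as CuPy.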