Competition-based FDR control has been commonly used for over a decade in the computational mass spectrometry community (Elias and Gygi, 2007). Recently, the approach has gained significant popularity in other fields after Barber and Candes (2015) laid its theoretical foundation in a more general setting that included the feature selection problem. In both cases, the competition is based on a head-to-head comparison between an observed score and a corresponding decoy or knockoff score. Keich and Noble (2017b) recently demonstrated some advantages of using multiple decoys rather than a single decoy when addressing the problem of assigning peptide sequences to observed mass spectra. In this work, we consider a related problem, detecting peptides based on a collection of mass spectra, and we develop a new framework for competition-based FDR control using multiple null scores. Within this framework, we offer several methods, all of which are based on a novel procedure that rigorously controls the FDR in the finite sample setting. Using real data to study the peptide detection problem, we show that, relative to existing single-decoy methods, our approach can increase the number of discovered peptides by up to 50% at small FDR thresholds.
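To make the head-to-head comparison concrete, the following is a minimal Python sketch of the standard single-decoy competition that the abstract uses as its baseline. It is not the multiple-decoy procedure developed in this work. The function name `single_decoy_competition` and its arguments are hypothetical, chosen for illustration. The sketch assumes that higher scores are better, that each hypothesis has exactly one decoy score, and that the winning scores are distinct. The `+1` in the numerator is the finite-sample correction used in knockoff+-style procedures.

```python
import numpy as np

def single_decoy_competition(target_scores, decoy_scores, alpha):
    """Illustrative single-decoy competition; returns indices of discoveries."""
    z = np.asarray(target_scores, dtype=float)
    d = np.asarray(decoy_scores, dtype=float)

    # Head-to-head competition: record who won and the winning score.
    # Ties count as decoy wins here (a simplification; ties are often
    # broken at random instead).
    target_wins = z > d
    winning = np.maximum(z, d)

    # Rank hypotheses by winning score, best first.
    order = np.argsort(-winning, kind="stable")
    wins_sorted = target_wins[order]

    # Running counts of target wins and decoy wins down the ranked list.
    n_target = np.cumsum(wins_sorted)
    n_decoy = np.cumsum(~wins_sorted)

    # Estimated FDR among target wins in the top k, with the +1 correction.
    est_fdr = (1 + n_decoy) / np.maximum(n_target, 1)

    # Largest cutoff whose estimated FDR does not exceed alpha.
    ok = np.nonzero(est_fdr <= alpha)[0]
    if ok.size == 0:
        return np.array([], dtype=int)
    k = ok[-1] + 1

    # Report only the target wins above the cutoff.
    top = order[:k]
    return np.sort(top[wins_sorted[:k]])
```

The decoy wins act as a stand-in for the unobserved false target wins: under the null, a target and its decoy are assumed equally likely to win, so the number of decoy wins above a cutoff estimates the number of false discoveries above it.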