This shows you the differences between two versions of the page.
papers:spivak-2009 [2013/11/24 16:10] leonb created |
papers:spivak-2009 [2013/11/24 16:11] (current) leonb |
||
---|---|---|---|
Line 1: | Line 1: | ||
- | ===== Improvements to the Percolator algorithm for peptide identification from shotgun proteom | + | ===== Improvements to the Percolator algorithm for peptide identification from shotgun proteomics data sets ===== |
- | ics data sets ===== | + | |
+ | // | ||
+ | Shotgun proteomics coupled with database search software allows the identification of a large | ||
+ | number of peptides in a single experiment. However, some existing search algorithms, such as | ||
+ | SEQUEST, use score functions that are designed primarily to identify the best peptide for a given | ||
+ | spectrum. Consequently, | ||
+ | function Xcorr fails to discriminate accurately between correct and incorrect peptide identifications. | ||
+ | Several machine learning methods have been proposed to address the resulting classification task of | ||
+ | distinguishing between correct and incorrect peptide-spectrum matches (PSMs). A recent example | ||
+ | is Percolator, which uses semi-supervised learning and a decoy database search strategy to learn to | ||
+ | distinguish between correct and incorrect PSMs identified by a database search algorithm. The | ||
+ | current work describes three improvements to Percolator. (1) Percolator’s heuristic optimization is | ||
+ | replaced with a clear objective function, with intuitive reasons behind its choice. (2) Tractable | ||
+ | nonlinear models are used instead of linear models, leading to improved accuracy over the original | ||
+ | Percolator. (3) A method, Q-ranker, for directly optimizing the number of identified spectra at a | ||
+ | specified q value is proposed, which achieves further gains. | ||
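
The abstract refers to "the number of identified spectra at a specified q value" without spelling out how that count is obtained. The following is a minimal sketch of one common way to estimate it from a target/decoy search, not the Percolator or Q-ranker implementation. It assumes a higher score means a better PSM, that the target and decoy databases are the same size, and that ties can be ignored. The function names `decoy_q_values` and `count_accepted_targets` are hypothetical.

```python
def decoy_q_values(scores, is_decoy):
    """Estimate a q-value for every PSM from target/decoy labels.

    scores   -- list of PSM scores, higher is better (assumption)
    is_decoy -- list of bools, True if the PSM matched the decoy database
    Returns q-values in the same order as the input.
    """
    # Rank PSMs from best to worst score.
    order = sorted(range(len(scores)), key=lambda i: scores[i], reverse=True)

    # Estimated FDR at each rank: decoys seen so far / targets seen so far.
    fdrs = []
    targets = decoys = 0
    for i in order:
        if is_decoy[i]:
            decoys += 1
        else:
            targets += 1
        fdrs.append(decoys / max(targets, 1))

    # The q-value is the smallest FDR at this threshold or any looser one,
    # so sweep from the worst-scoring PSM back to the best.
    q = [0.0] * len(scores)
    running_min = float("inf")
    for rank in range(len(order) - 1, -1, -1):
        running_min = min(running_min, fdrs[rank])
        q[order[rank]] = min(running_min, 1.0)
    return q


def count_accepted_targets(scores, is_decoy, q_threshold):
    """Count target PSMs whose estimated q-value is at or below the threshold."""
    q = decoy_q_values(scores, is_decoy)
    return sum(1 for qi, d in zip(q, is_decoy) if not d and qi <= q_threshold)


if __name__ == "__main__":
    scores = [10, 9, 8, 7, 6, 5]
    is_decoy = [False, False, True, False, True, True]
    print(count_accepted_targets(scores, is_decoy, 0.01))
```

A rescoring method that ranks correct PSMs above decoys more reliably will raise this count at a fixed threshold, which is the quantity the abstract says Q-ranker optimizes directly.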