Time: 13:00-15:00, Oct 26, 2018
Venue: Room 300, 320 Yueyang Road
Speaker: Dr Hao CHI (Institute of Computing Technology, CAS)
Host: Zefeng WANG
Title: Open-pFind: Comprehensive identification of peptides in tandem mass spectra using an efficient open search engine
Introduction: Shotgun proteomics has grown rapidly in recent decades, especially for peptide and protein identification. However, more than 50% of the MS/MS data acquired in shotgun proteomics have not been successfully identified. As shown in a number of studies, unexpected modifications are a major reason for the low identification rate, and several other factors also hinder precise peptide identification, e.g., semi- and non-specific digestion, in-source fragmentation, and co-eluting peptides in mixed spectra. We have developed a novel database search algorithm, Open-pFind, that efficiently identifies peptides even in an ultra-large search space that accounts for unexpected modifications, amino acid mutations, semi- or non-specific digestion, and co-eluting peptides. We re-analyzed an entire human proteome dataset consisting of ~25 million spectra. Open-pFind took ~5 hours on a 64-core workstation to search all of these spectra. More than one million peptides were identified, 86.7% more than previously reported. The results obtained with Open-pFind demonstrate that the characteristics of MS/MS data vary with the methods used for sample preparation and LC-MS/MS. Open search strategies, as made practical by Open-pFind, will most likely be the preferred tools for large-scale MS/MS data analyses in the future.