Database search engines and target database features impinge upon the identification of post-translationally cis-spliced peptides in HLA class I immunopeptidomes.
posted on 2022-05-11, 09:31authored byMichele Mishto, Yehor Horokhovskyi, John A Cormican, Xiaoping Yang, Steven Lynham, Henning Urlaub, Juliane Liepe
Unconventional epitopes presented by HLA class I complexes are emerging targets for T cell targeted immunotherapies. Their identification by mass spectrometry required development of novel methods to cope with the large number of theoretical candidates. Methods to identify post-translationally spliced peptides led to a broad range of outcomes. We here investigated the impact of three common database search engines - i.e. Mascot, Mascot+Percolator and PEAKS DB - as final identification step, as well as the features of target database on the ability to correctly identify non-spliced and cis-spliced peptides. We used ground truth datasets measured by mass spectrometry to benchmark methods' performance and extended the analysis to HLA class I immunopeptidomes. PEAKS DB showed better precision and recall of cis-spliced peptides and larger number of identified peptides in HLA class I immunopeptidomes than the other search engine strategies. The better performance of PEAKS DB appears to result from better discrimination between target and decoy hits and hence a more robust FDR estimation, and seems independent to peptide and spectrum features here investigated. This article is protected by copyright. All rights reserved.