Home
MHC-validator is machine learning software for rescoring immunopeptidomics data acquired with mass spectrometers. In addition to learning from peptide features, MHC-validator also uses MHC binding affinities from NetMHCpan4.1 and MHCflurry to better assess whether a potential immunopeptide in your mass spectrometry run is really present in the sample. It can be built into commonly used immunopeptidomics pipelines. If you integrate MHC-validator into your immunopeptidomics pipeline, you can significantly boost the number of confidently identified immunopeptides. Depending on sample quality, we report 1.5- to 10-fold more peptide-spectrum matches (PSMs) with MHC-validator than with commonly used rescoring tools (e.g. Percolator, DeepRescore). MHC-validator not only boosts the number of immunopeptides found, it is also highly specific when identifying low-abundance immunopeptides in your samples.
Below is a brief high-level description of how MHC-validator works:
- First, MHC-validator loads the peptide sequences and their features. Based on these features and on whether a peptide comes from a target or a decoy search, MHC-validator learns how likely it is that a peptide-spectrum match is real. MHC-validator uses three types of features: a) the features reported by the search engine (target vs. decoy, mass, peptide length, charge, Xcorr, etc.), b) immunopeptide binding affinities reported by NetMHCpan4.1 and/or MHCflurry, and c) the peptide amino acid sequences themselves. The base algorithm learns from the search engine results only (termed MV); immunopeptide binding affinity assessment (MHC) and peptide sequence encoding (PE) can be added by the user through the available options. In this example, let's assume we intend to use MHC-validator to its full potential and set the options 'sequence_encoding' (PE), 'netmhcpan' and 'mhcflurry' (MHC) all to 'True'.
- Once the sequences have been loaded, MHC-validator first uses NetMHCpan4.1 and MHCflurry to generate MHC binding affinities/elution scores and adds the results to the feature list from the database search results. MHC-validator then trains a neural network on these features and assigns each peptide a probability of being a true hit. If sequence encoding is set to True, as in our example, this first neural network can be connected to a second neural network that takes the amino acid sequences into account.
- Results are reported in the form of q-values. Peptides with a q-value < 0.01 pass a 1% false discovery rate (FDR) cutoff, meaning that fewer than 1% of the peptides accepted at this threshold are expected to be false positives.
- You can now use these peptides for further analysis.
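
To make the peptide sequence encoding (PE) option more concrete, here is a minimal sketch of one common way to turn a peptide into numeric input for a neural network: one-hot encoding with zero padding. This is an illustration only and is not taken from the MHC-validator code base; the function name `one_hot_encode`, the 20-letter alphabet, and the padding length of 15 are all assumptions.

```python
# Hypothetical sketch: one-hot encode a peptide for a sequence-aware model.
# The alphabet, padding length, and function name are assumptions and are
# not taken from MHC-validator itself.
AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"  # the 20 standard residues
AA_INDEX = {aa: i for i, aa in enumerate(AMINO_ACIDS)}


def one_hot_encode(peptide, max_length=15):
    """Return a max_length x 20 matrix (list of lists) for one peptide.

    Rows beyond the peptide's length stay all-zero (padding). Residues
    outside the 20-letter alphabet are also left as all-zero rows.
    """
    if len(peptide) > max_length:
        raise ValueError(f"peptide longer than {max_length}: {peptide}")
    matrix = [[0] * len(AMINO_ACIDS) for _ in range(max_length)]
    for position, residue in enumerate(peptide.upper()):
        column = AA_INDEX.get(residue)
        if column is not None:
            matrix[position][column] = 1
    return matrix


encoded = one_hot_encode("SIINFEKL")
print(len(encoded), len(encoded[0]))  # 15 20
print(sum(map(sum, encoded)))         # 8 (one entry set per residue)
```

A fixed-size matrix like this lets peptides of different lengths share one network input shape; the real tool may use a different encoding.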
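
The idea of learning from target and decoy labels can be sketched with a much simpler model than a neural network. The example below uses plain logistic regression as a stand-in: it is trained to separate target PSMs from decoy PSMs using two features, and its output can then be read as a score for how target-like a PSM is. The feature values are invented, the function names `predict` and `train` are hypothetical, and none of this reflects MHC-validator's actual architecture or API.

```python
import math

# Toy PSM features: [search_engine_score, predicted_binding_score].
# All numbers are invented for illustration only.
features = [
    [3.2, 0.9], [2.8, 0.8], [2.5, 0.7], [1.0, 0.2],  # target search
    [0.9, 0.1], [1.1, 0.2], [0.7, 0.1], [1.2, 0.3],  # decoy search
]
labels = [1, 1, 1, 1, 0, 0, 0, 0]  # 1 = target, 0 = decoy


def predict(weights, bias, x):
    """Logistic model: probability that a PSM looks like a target."""
    z = bias + sum(w * xi for w, xi in zip(weights, x))
    z = max(-30.0, min(30.0, z))  # keep math.exp in a safe range
    return 1.0 / (1.0 + math.exp(-z))


def train(features, labels, epochs=500, learning_rate=0.1):
    """Fit weights and bias by batch gradient descent on the log-loss."""
    n_samples = len(features)
    n_features = len(features[0])
    weights = [0.0] * n_features
    bias = 0.0
    for _ in range(epochs):
        grad_w = [0.0] * n_features
        grad_b = 0.0
        for x, y in zip(features, labels):
            error = predict(weights, bias, x) - y
            for j in range(n_features):
                grad_w[j] += error * x[j]
            grad_b += error
        weights = [w - learning_rate * g / n_samples
                   for w, g in zip(weights, grad_w)]
        bias -= learning_rate * grad_b / n_samples
    return weights, bias


weights, bias = train(features, labels)
strong = predict(weights, bias, [3.0, 0.85])  # high score, good binder
weak = predict(weights, bias, [0.9, 0.15])    # low score, poor binder
print(f"strong PSM: {strong:.2f}, weak PSM: {weak:.2f}")
```

Note that one target in the toy data looks like a decoy. This mirrors the real situation in which some target hits are false, which is why the model's scores are subsequently converted into q-values instead of being trusted directly.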
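
The q-value cutoff can be illustrated with the standard target-decoy estimate. The sketch below is an assumption about how such q-values are typically computed, not a description of MHC-validator's internals: PSMs are sorted by score, the FDR at each rank is estimated as the number of decoys divided by the number of targets seen so far, and the q-value is the lowest FDR at which a PSM would still be accepted. The function name `compute_q_values` and the example scores are hypothetical, and tied scores are not treated specially.

```python
def compute_q_values(scores, is_target):
    """Return one q-value per PSM, in the same order as the input.

    The FDR at a score threshold is estimated as (#decoys / #targets)
    among all PSMs scoring at or above that threshold.
    """
    order = sorted(range(len(scores)), key=lambda i: scores[i], reverse=True)
    fdrs = []
    targets = decoys = 0
    for i in order:
        if is_target[i]:
            targets += 1
        else:
            decoys += 1
        fdrs.append(min(1.0, decoys / max(targets, 1)))
    # A running minimum from the worst score upwards makes q-values monotone.
    for k in range(len(fdrs) - 2, -1, -1):
        fdrs[k] = min(fdrs[k], fdrs[k + 1])
    q_values = [0.0] * len(scores)
    for rank, i in enumerate(order):
        q_values[i] = fdrs[rank]
    return q_values


# Invented example: six PSMs with model scores and target/decoy labels.
scores = [0.99, 0.95, 0.90, 0.80, 0.70, 0.60]
is_target = [True, True, True, False, True, False]

q_values = compute_q_values(scores, is_target)
print(q_values)  # [0.0, 0.0, 0.0, 0.25, 0.25, 0.5]

# Keep only target PSMs that pass the 1% FDR cutoff.
accepted = [i for i, (q, t) in enumerate(zip(q_values, is_target))
            if t and q < 0.01]
print(accepted)  # [0, 1, 2]
```

In this toy example the fifth PSM is a target but is rejected, because accepting it would also mean accepting the decoy ranked above it.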