Commit
add proforma ion notation to the charge state rule
mobiusklein committed Nov 1, 2024
1 parent 9817d9f commit cedf54f
Showing 5 changed files with 483 additions and 476 deletions.
182 changes: 91 additions & 91 deletions examples/chinese_hamster_hcd_selected_head.mzlb.txt
@@ -7,6 +7,8 @@ MS:1003188|library name=examples/chinese_hamster_hcd_selected_head
<Spectrum=1>
MS:1003061|library spectrum name=AAAACALTPGPLADLAAR/2_1(4,C,CAM)_46eV
MS:1003065|spectrum aggregation type=MS:1003066|singleton spectrum
MS:1000041|charge state=2
MS:1000744|selected ion m/z=855.4538
MS:1000044|dissociation method=MS:1000422|beam-type collision-induced dissociation
[1]MS:1000045|collision energy=46
[1]UO:0000000|unit=UO:0000266|electronvolt
@@ -20,28 +22,26 @@ MS:1000028|detector resolution=7500
[2]UO:0000000|unit=MS:1000040|m/z
[3]MS:1000829|isolation window upper offset=0.95
[3]UO:0000000|unit=MS:1000040|m/z
MS:1003085|previous MS1 scan precursor intensity=8799173.32
MS:1003085|previous MSn-1 scan precursor intensity=8799173.32
MS:1003086|precursor apex intensity=25273307.5
MS:1003208|experimental precursor monoisotopic m/z=855.455
MS:1000512|filter string="FTMS + p NSI d Full ms2 855.96@hcd35.00 [140.00-1725.00]"
MS:1003059|number of peaks=87
[4]MS:1003275|other attribute name=Se
[4]MS:1003276|other attribute value=1(^G1:sc=8.13346e-015)
<Analyte=1>
MS:1000888|stripped peptide sequence=AAAACALTPGPLADLAAR
MS:1000224|molecular mass=1710.9076
MS:1000041|charge state=2
MS:1000744|selected ion m/z=855.4538
[1]MS:1001975|delta m/z=1.4
[1]UO:0000000|unit=UO:0000169|parts per million
MS:1003208|experimental precursor monoisotopic m/z=855.455
MS:1003169|proforma peptidoform sequence=AAAAC[Carbamidomethyl]ALTPGPLADLAAR
MS:1003270|proforma peptidoform ion notation=AAAAC[Carbamidomethyl]ALTPGPLADLAAR/2
MS:1001117|theoretical mass=1708.89303961159
[2]MS:1003048|number of enzymatic termini=2
[2]MS:1001045|cleavage agent name=MS:1001251|Trypsin
[2]MS:1001112|n-terminal flanking residue=R
[2]MS:1001113|c-terminal flanking residue=L
[2]MS:1000885|protein accession=tr|G3IJB9|G3IJB9_CRIGR UDP-N-acetylhexosamine pyrophosphorylase-like protein 1 OS=Cricetulus griseus GN=I79_023952 PE=4 SV=1
[1]MS:1003048|number of enzymatic termini=2
[1]MS:1001045|cleavage agent name=MS:1001251|Trypsin
[1]MS:1001112|n-terminal flanking residue=R
[1]MS:1001113|c-terminal flanking residue=L
[1]MS:1000885|protein accession=tr|G3IJB9|G3IJB9_CRIGR UDP-N-acetylhexosamine pyrophosphorylase-like protein 1 OS=Cricetulus griseus GN=I79_023952 PE=4 SV=1
MS:1003243|adduct ion mass=1710.9076
<Interpretation=1>
[1]MS:1001975|delta m/z=1.4
[1]UO:0000000|unit=UO:0000169|parts per million
MS:1003079|total unassigned intensity fraction=0.2848
MS:1003080|top 20 peak unassigned intensity fraction=0.1879
MS:1003289|intensity of highest unassigned peak=0.45
@@ -138,6 +138,8 @@ MS:1003290|number of unassigned peaks among top 20 peaks=4
<Spectrum=2>
MS:1003061|library spectrum name=AAAACALTPGPLADLAAR/2_1(4,C,CAM)_53eV
MS:1003065|spectrum aggregation type=MS:1003066|singleton spectrum
MS:1000041|charge state=2
MS:1000744|selected ion m/z=855.4538
MS:1000044|dissociation method=MS:1000422|beam-type collision-induced dissociation
[1]MS:1000045|collision energy=53
[1]UO:0000000|unit=UO:0000266|electronvolt
@@ -151,28 +153,26 @@ MS:1000028|detector resolution=15000
[2]UO:0000000|unit=MS:1000040|m/z
[3]MS:1000829|isolation window upper offset=0.95
[3]UO:0000000|unit=MS:1000040|m/z
MS:1003085|previous MS1 scan precursor intensity=1776618.56
MS:1003085|previous MSn-1 scan precursor intensity=1776618.56
MS:1003086|precursor apex intensity=12167259.65
MS:1003208|experimental precursor monoisotopic m/z=855.4574
MS:1000512|filter string="FTMS + p NSI d Full ms2 855.95@hcd35.00 [140.00-1725.00]"
MS:1003059|number of peaks=204
[4]MS:1003275|other attribute name=Se
[4]MS:1003276|other attribute value=1(^G1:sc=1.31932e-020)
<Analyte=1>
MS:1000888|stripped peptide sequence=AAAACALTPGPLADLAAR
MS:1000224|molecular mass=1710.9076
MS:1000041|charge state=2
MS:1000744|selected ion m/z=855.4538
[1]MS:1001975|delta m/z=4.2
[1]UO:0000000|unit=UO:0000169|parts per million
MS:1003208|experimental precursor monoisotopic m/z=855.4574
MS:1003169|proforma peptidoform sequence=AAAAC[Carbamidomethyl]ALTPGPLADLAAR
MS:1003270|proforma peptidoform ion notation=AAAAC[Carbamidomethyl]ALTPGPLADLAAR/2
MS:1001117|theoretical mass=1708.89303961159
[2]MS:1003048|number of enzymatic termini=2
[2]MS:1001045|cleavage agent name=MS:1001251|Trypsin
[2]MS:1001112|n-terminal flanking residue=R
[2]MS:1001113|c-terminal flanking residue=L
[2]MS:1000885|protein accession=tr|G3IJB9|G3IJB9_CRIGR UDP-N-acetylhexosamine pyrophosphorylase-like protein 1 OS=Cricetulus griseus GN=I79_023952 PE=4 SV=1
[1]MS:1003048|number of enzymatic termini=2
[1]MS:1001045|cleavage agent name=MS:1001251|Trypsin
[1]MS:1001112|n-terminal flanking residue=R
[1]MS:1001113|c-terminal flanking residue=L
[1]MS:1000885|protein accession=tr|G3IJB9|G3IJB9_CRIGR UDP-N-acetylhexosamine pyrophosphorylase-like protein 1 OS=Cricetulus griseus GN=I79_023952 PE=4 SV=1
MS:1003243|adduct ion mass=1710.9076
<Interpretation=1>
[1]MS:1001975|delta m/z=4.2
[1]UO:0000000|unit=UO:0000169|parts per million
MS:1003079|total unassigned intensity fraction=0.3165
MS:1003080|top 20 peak unassigned intensity fraction=0.142
MS:1003289|intensity of highest unassigned peak=0.16
@@ -386,6 +386,8 @@ MS:1003290|number of unassigned peaks among top 20 peaks=5
<Spectrum=3>
MS:1003061|library spectrum name=AAAAGQTGTVPPGAPGALPLPGMAIVK/2_0_76eV
MS:1003065|spectrum aggregation type=MS:1003066|singleton spectrum
MS:1000041|charge state=2
MS:1000744|selected ion m/z=1207.1672
MS:1000044|dissociation method=MS:1000422|beam-type collision-induced dissociation
[1]MS:1000045|collision energy=76
[1]UO:0000000|unit=UO:0000266|electronvolt
@@ -399,28 +401,26 @@ MS:1000028|detector resolution=15000
[2]UO:0000000|unit=MS:1000040|m/z
[3]MS:1000829|isolation window upper offset=0.95
[3]UO:0000000|unit=MS:1000040|m/z
MS:1003085|previous MS1 scan precursor intensity=6939079.2
MS:1003085|previous MSn-1 scan precursor intensity=6939079.2
MS:1003086|precursor apex intensity=7583304.35
MS:1003208|experimental precursor monoisotopic m/z=1207.1661
MS:1000512|filter string="FTMS + p NSI d Full ms2 1207.67@hcd35.00 [140.00-2000.00]"
MS:1003059|number of peaks=122
[4]MS:1003275|other attribute name=Se
[4]MS:1003276|other attribute value=1(^G1:sc=1.43642e-012)
<Analyte=1>
MS:1000888|stripped peptide sequence=AAAAGQTGTVPPGAPGALPLPGMAIVK
MS:1000224|molecular mass=2414.3344
MS:1000041|charge state=2
MS:1000744|selected ion m/z=1207.1672
[1]MS:1001975|delta m/z=-0.9
[1]UO:0000000|unit=UO:0000169|parts per million
MS:1003208|experimental precursor monoisotopic m/z=1207.1661
MS:1003169|proforma peptidoform sequence=AAAAGQTGTVPPGAPGALPLPGMAIVK
MS:1003270|proforma peptidoform ion notation=AAAAGQTGTVPPGAPGALPLPGMAIVK/2
MS:1001117|theoretical mass=2412.319901150229
[2]MS:1003048|number of enzymatic termini=1
[2]MS:1001045|cleavage agent name=MS:1001251|Trypsin
[2]MS:1001112|n-terminal flanking residue=A
[2]MS:1001113|c-terminal flanking residue=E
[2]MS:1000885|protein accession=tr|G3I2Q7|G3I2Q7_CRIGR Transcription intermediary factor 1-beta OS=Cricetulus griseus GN=I79_017700 PE=4 SV=1
[1]MS:1003048|number of enzymatic termini=1
[1]MS:1001045|cleavage agent name=MS:1001251|Trypsin
[1]MS:1001112|n-terminal flanking residue=A
[1]MS:1001113|c-terminal flanking residue=E
[1]MS:1000885|protein accession=tr|G3I2Q7|G3I2Q7_CRIGR Transcription intermediary factor 1-beta OS=Cricetulus griseus GN=I79_017700 PE=4 SV=1
MS:1003243|adduct ion mass=2414.3344
<Interpretation=1>
[1]MS:1001975|delta m/z=-0.9
[1]UO:0000000|unit=UO:0000169|parts per million
MS:1003079|total unassigned intensity fraction=0.2591
MS:1003080|top 20 peak unassigned intensity fraction=0.0
MS:1003289|intensity of highest unassigned peak=0.12
@@ -552,6 +552,8 @@ MS:1003290|number of unassigned peaks among top 20 peaks=0
<Spectrum=4>
MS:1003061|library spectrum name=AAAAGSTSVKPIFSR/2_0_44eV
MS:1003065|spectrum aggregation type=MS:1003066|singleton spectrum
MS:1000041|charge state=2
MS:1000744|selected ion m/z=731.9043
MS:1000044|dissociation method=MS:1000422|beam-type collision-induced dissociation
[1]MS:1000045|collision energy=44
[1]UO:0000000|unit=UO:0000266|electronvolt
@@ -565,28 +567,26 @@ MS:1000028|detector resolution=15000
[2]UO:0000000|unit=MS:1000040|m/z
[3]MS:1000829|isolation window upper offset=0.95
[3]UO:0000000|unit=MS:1000040|m/z
MS:1003085|previous MS1 scan precursor intensity=324419.29
MS:1003085|previous MSn-1 scan precursor intensity=324419.29
MS:1003086|precursor apex intensity=361702.23
MS:1003208|experimental precursor monoisotopic m/z=731.9023
MS:1000512|filter string="FTMS + p NSI d Full ms2 731.90@hcd34.00 [110.00-1475.00]"
MS:1003059|number of peaks=111
[4]MS:1003275|other attribute name=Se
[4]MS:1003276|other attribute value=1(^G1:sc=6.33525e-018)
<Analyte=1>
MS:1000888|stripped peptide sequence=AAAAGSTSVKPIFSR
MS:1000224|molecular mass=1463.8086
MS:1000041|charge state=2
MS:1000744|selected ion m/z=731.9043
[1]MS:1001975|delta m/z=-2.7
[1]UO:0000000|unit=UO:0000169|parts per million
MS:1003208|experimental precursor monoisotopic m/z=731.9023
MS:1003169|proforma peptidoform sequence=AAAAGSTSVKPIFSR
MS:1003270|proforma peptidoform ion notation=AAAAGSTSVKPIFSR/2
MS:1001117|theoretical mass=1461.7939769138902
[2]MS:1003048|number of enzymatic termini=1
[2]MS:1001045|cleavage agent name=MS:1001251|Trypsin
[2]MS:1001112|n-terminal flanking residue=Q
[2]MS:1001113|c-terminal flanking residue=D
[2]MS:1000885|protein accession=tr|G3I0F4|G3I0F4_CRIGR NADH dehydrogenase [ubiquinone] 1 alpha subcomplex subunit 6 OS=Cricetulus griseus GN=I79_016836 PE=4 SV=1
[1]MS:1003048|number of enzymatic termini=1
[1]MS:1001045|cleavage agent name=MS:1001251|Trypsin
[1]MS:1001112|n-terminal flanking residue=Q
[1]MS:1001113|c-terminal flanking residue=D
[1]MS:1000885|protein accession=tr|G3I0F4|G3I0F4_CRIGR NADH dehydrogenase [ubiquinone] 1 alpha subcomplex subunit 6 OS=Cricetulus griseus GN=I79_016836 PE=4 SV=1
MS:1003243|adduct ion mass=1463.8086
<Interpretation=1>
[1]MS:1001975|delta m/z=-2.7
[1]UO:0000000|unit=UO:0000169|parts per million
MS:1003079|total unassigned intensity fraction=0.1681
MS:1003080|top 20 peak unassigned intensity fraction=0.0217
MS:1003289|intensity of highest unassigned peak=0.17
@@ -707,6 +707,8 @@ MS:1003290|number of unassigned peaks among top 20 peaks=1
<Spectrum=5>
MS:1003061|library spectrum name=AAAAGSTSVKPIFSR/3_0_28eV
MS:1003065|spectrum aggregation type=MS:1003066|singleton spectrum
MS:1000041|charge state=3
MS:1000744|selected ion m/z=488.2719
MS:1000044|dissociation method=MS:1000422|beam-type collision-induced dissociation
[1]MS:1000045|collision energy=28
[1]UO:0000000|unit=UO:0000266|electronvolt
@@ -720,28 +722,26 @@ MS:1000028|detector resolution=15000
[2]UO:0000000|unit=MS:1000040|m/z
[3]MS:1000829|isolation window upper offset=0.95
[3]UO:0000000|unit=MS:1000040|m/z
MS:1003085|previous MS1 scan precursor intensity=3390555.93
MS:1003085|previous MSn-1 scan precursor intensity=3390555.93
MS:1003086|precursor apex intensity=3965011.86
MS:1003208|experimental precursor monoisotopic m/z=488.2738
MS:1000512|filter string="FTMS + p NSI d Full ms2 488.27@hcd34.00 [110.00-1475.00]"
MS:1003059|number of peaks=161
[4]MS:1003275|other attribute name=Se
[4]MS:1003276|other attribute value=1(^G1:sc=9.67069e-018)
<Analyte=1>
MS:1000888|stripped peptide sequence=AAAAGSTSVKPIFSR
MS:1000224|molecular mass=1464.8157
MS:1000041|charge state=3
MS:1000744|selected ion m/z=488.2719
[1]MS:1001975|delta m/z=3.8
[1]UO:0000000|unit=UO:0000169|parts per million
MS:1003208|experimental precursor monoisotopic m/z=488.2738
MS:1003169|proforma peptidoform sequence=AAAAGSTSVKPIFSR
MS:1003270|proforma peptidoform ion notation=AAAAGSTSVKPIFSR/3
MS:1001117|theoretical mass=1461.7939769138902
[2]MS:1003048|number of enzymatic termini=1
[2]MS:1001045|cleavage agent name=MS:1001251|Trypsin
[2]MS:1001112|n-terminal flanking residue=Q
[2]MS:1001113|c-terminal flanking residue=D
[2]MS:1000885|protein accession=tr|G3I0F4|G3I0F4_CRIGR NADH dehydrogenase [ubiquinone] 1 alpha subcomplex subunit 6 OS=Cricetulus griseus GN=I79_016836 PE=4 SV=1
[1]MS:1003048|number of enzymatic termini=1
[1]MS:1001045|cleavage agent name=MS:1001251|Trypsin
[1]MS:1001112|n-terminal flanking residue=Q
[1]MS:1001113|c-terminal flanking residue=D
[1]MS:1000885|protein accession=tr|G3I0F4|G3I0F4_CRIGR NADH dehydrogenase [ubiquinone] 1 alpha subcomplex subunit 6 OS=Cricetulus griseus GN=I79_016836 PE=4 SV=1
MS:1003243|adduct ion mass=1464.8157
<Interpretation=1>
[1]MS:1001975|delta m/z=3.8
[1]UO:0000000|unit=UO:0000169|parts per million
MS:1003079|total unassigned intensity fraction=0.1804
MS:1003080|top 20 peak unassigned intensity fraction=0.0
MS:1003289|intensity of highest unassigned peak=0.09
@@ -912,6 +912,8 @@ MS:1003290|number of unassigned peaks among top 20 peaks=0
<Spectrum=6>
MS:1003061|library spectrum name=AAAALGSHGSCSSEVEK/2_1(10,C,CAM)_50eV
MS:1003065|spectrum aggregation type=MS:1003066|singleton spectrum
MS:1000041|charge state=2
MS:1000744|selected ion m/z=830.8834
MS:1000044|dissociation method=MS:1000422|beam-type collision-induced dissociation
[1]MS:1000045|collision energy=50
[1]UO:0000000|unit=UO:0000266|electronvolt
@@ -925,28 +927,26 @@ MS:1000028|detector resolution=15000
[2]UO:0000000|unit=MS:1000040|m/z
[3]MS:1000829|isolation window upper offset=0.95
[3]UO:0000000|unit=MS:1000040|m/z
MS:1003085|previous MS1 scan precursor intensity=30003.73
MS:1003085|previous MSn-1 scan precursor intensity=30003.73
MS:1003086|precursor apex intensity=28800.11
MS:1003208|experimental precursor monoisotopic m/z=830.8868
MS:1000512|filter string="FTMS + p NSI d Full ms2 831.38@hcd34.00 [110.00-1675.00]"
MS:1003059|number of peaks=68
[4]MS:1003275|other attribute name=Se
[4]MS:1003276|other attribute value=1(^G1:sc=6.94218e-013)
<Analyte=1>
MS:1000888|stripped peptide sequence=AAAALGSHGSCSSEVEK
MS:1000224|molecular mass=1661.7668
MS:1000041|charge state=2
MS:1000744|selected ion m/z=830.8834
[1]MS:1001975|delta m/z=4.1
[1]UO:0000000|unit=UO:0000169|parts per million
MS:1003208|experimental precursor monoisotopic m/z=830.8868
MS:1003169|proforma peptidoform sequence=AAAALGSHGSC[Carbamidomethyl]SSEVEK
MS:1003270|proforma peptidoform ion notation=AAAALGSHGSC[Carbamidomethyl]SSEVEK/2
MS:1001117|theoretical mass=1659.7522486039798
[2]MS:1003048|number of enzymatic termini=2
[2]MS:1001045|cleavage agent name=MS:1001251|Trypsin
[2]MS:1001112|n-terminal flanking residue=K
[2]MS:1001113|c-terminal flanking residue=E
[2]MS:1000885|protein accession=tr|G3HHY9|G3HHY9_CRIGR V-type proton ATPase subunit G 1 OS=Cricetulus griseus GN=I79_010250 PE=4 SV=1
[1]MS:1003048|number of enzymatic termini=2
[1]MS:1001045|cleavage agent name=MS:1001251|Trypsin
[1]MS:1001112|n-terminal flanking residue=K
[1]MS:1001113|c-terminal flanking residue=E
[1]MS:1000885|protein accession=tr|G3HHY9|G3HHY9_CRIGR V-type proton ATPase subunit G 1 OS=Cricetulus griseus GN=I79_010250 PE=4 SV=1
MS:1003243|adduct ion mass=1661.7668
<Interpretation=1>
[1]MS:1001975|delta m/z=4.1
[1]UO:0000000|unit=UO:0000169|parts per million
MS:1003079|total unassigned intensity fraction=0.4368
MS:1003080|top 20 peak unassigned intensity fraction=0.184
MS:1003289|intensity of highest unassigned peak=0.45
@@ -1024,6 +1024,8 @@ MS:1003290|number of unassigned peaks among top 20 peaks=6
<Spectrum=7>
MS:1003061|library spectrum name=AAAALGSHGSCSSEVEK/2_1(10,C,CAM)_52eV
MS:1003065|spectrum aggregation type=MS:1003066|singleton spectrum
MS:1000041|charge state=2
MS:1000744|selected ion m/z=830.8834
MS:1000044|dissociation method=MS:1000422|beam-type collision-induced dissociation
[1]MS:1000045|collision energy=52
[1]UO:0000000|unit=UO:0000266|electronvolt
@@ -1037,28 +1039,26 @@ MS:1000028|detector resolution=15000
[2]UO:0000000|unit=MS:1000040|m/z
[3]MS:1000829|isolation window upper offset=0.95
[3]UO:0000000|unit=MS:1000040|m/z
MS:1003085|previous MS1 scan precursor intensity=9544513.13
MS:1003085|previous MSn-1 scan precursor intensity=9544513.13
MS:1003086|precursor apex intensity=26056925.91
MS:1003208|experimental precursor monoisotopic m/z=830.8817
MS:1000512|filter string="FTMS + p NSI d Full ms2 830.88@hcd35.00 [140.00-1675.00]"
MS:1003059|number of peaks=402
[4]MS:1003275|other attribute name=Se
[4]MS:1003276|other attribute value=1(^G1:sc=6.88234e-022)
<Analyte=1>
MS:1000888|stripped peptide sequence=AAAALGSHGSCSSEVEK
MS:1000224|molecular mass=1661.7668
MS:1000041|charge state=2
MS:1000744|selected ion m/z=830.8834
[1]MS:1001975|delta m/z=-2.0
[1]UO:0000000|unit=UO:0000169|parts per million
MS:1003208|experimental precursor monoisotopic m/z=830.8817
MS:1003169|proforma peptidoform sequence=AAAALGSHGSC[Carbamidomethyl]SSEVEK
MS:1003270|proforma peptidoform ion notation=AAAALGSHGSC[Carbamidomethyl]SSEVEK/2
MS:1001117|theoretical mass=1659.7522486039798
[2]MS:1003048|number of enzymatic termini=2
[2]MS:1001045|cleavage agent name=MS:1001251|Trypsin
[2]MS:1001112|n-terminal flanking residue=K
[2]MS:1001113|c-terminal flanking residue=E
[2]MS:1000885|protein accession=tr|G3HHY9|G3HHY9_CRIGR V-type proton ATPase subunit G 1 OS=Cricetulus griseus GN=I79_010250 PE=4 SV=1
[1]MS:1003048|number of enzymatic termini=2
[1]MS:1001045|cleavage agent name=MS:1001251|Trypsin
[1]MS:1001112|n-terminal flanking residue=K
[1]MS:1001113|c-terminal flanking residue=E
[1]MS:1000885|protein accession=tr|G3HHY9|G3HHY9_CRIGR V-type proton ATPase subunit G 1 OS=Cricetulus griseus GN=I79_010250 PE=4 SV=1
MS:1003243|adduct ion mass=1661.7668
<Interpretation=1>
[1]MS:1001975|delta m/z=-2.0
[1]UO:0000000|unit=UO:0000169|parts per million
MS:1003079|total unassigned intensity fraction=0.339
MS:1003080|top 20 peak unassigned intensity fraction=0.0816
MS:1003289|intensity of highest unassigned peak=0.17
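The example entries above pair each `MS:1003169` ProForma sequence with an `MS:1003270` ion notation that appends `/<charge>`, and they report a `delta m/z` in parts per million. The following is a minimal sketch of both ideas, not code from this repository: `split_ion_notation` and `ppm_error` are hypothetical helper names, the parser only handles a plain integer charge suffix (richer ProForma charge or adduct syntax is not covered), and the assumption that `delta m/z` compares the experimental precursor m/z against the selected ion m/z is inferred from the numbers in the example file.

```python
from typing import Tuple


def split_ion_notation(value: str) -> Tuple[str, int]:
    """Split 'SEQUENCE/charge' into (sequence, charge).

    Hypothetical helper: only a bare integer charge after the last '/'
    is accepted; anything else raises ValueError.
    """
    sequence, sep, charge = value.rpartition("/")
    if not sep or not sequence or not charge.isdigit():
        raise ValueError(f"not a simple peptidoform ion notation: {value!r}")
    return sequence, int(charge)


def ppm_error(observed_mz: float, reference_mz: float) -> float:
    """Relative difference between two m/z values in parts per million."""
    return (observed_mz - reference_mz) / reference_mz * 1e6


# Values copied from Spectrum 1 in the example file.
sequence, charge = split_ion_notation("AAAAC[Carbamidomethyl]ALTPGPLADLAAR/2")
delta_ppm = round(ppm_error(855.455, 855.4538), 1)
print(sequence, charge, delta_ppm)
```

For Spectrum 1 this yields charge 2 and a delta of 1.4 ppm, matching the stored attribute. The m/z values in the file are already rounded, so recomputing other entries this way can differ from the stored value in the last digit.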
6 changes: 6 additions & 0 deletions mzspeclib/validate/rules/base.json
@@ -168,6 +168,12 @@
"allow_children": false,
"name": "possible charge state",
"repeatable": false
},
{
"accession": "MS:1003270",
"allow_children": false,
"name": "proforma peptidoform ion notation",
"repeatable": false
}
],
"combination_logic": "OR",
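The hunk above adds `MS:1003270` to a rule whose entries are combined with `"combination_logic": "OR"`. As a rough illustration of what such a rule means, the sketch below accepts a set of attribute lines when at least one listed accession is present and no non-repeatable accession occurs twice. This is not the mzspeclib validator: the `"attributes"` key, the function names, and the inclusion of `MS:1000041` in the rule are assumptions made for the example, and the "possible charge state" entry is omitted because its accession is not visible in the diff.

```python
from collections import Counter
from typing import Dict, Iterable

# Hypothetical, trimmed-down rule; "attributes" is an assumed key name.
CHARGE_RULE: Dict = {
    "attributes": [
        # Assumed to be part of the charge state rule already.
        {"accession": "MS:1000041", "name": "charge state", "repeatable": False},
        # Entry added by this commit.
        {"accession": "MS:1003270",
         "name": "proforma peptidoform ion notation", "repeatable": False},
    ],
    "combination_logic": "OR",
}


def accession_of(line: str) -> str:
    """Extract the accession from a line like '[1]MS:1001975|delta m/z=1.4'."""
    key = line.split("=", 1)[0]
    if key.startswith("["):
        # Drop an attribute-group prefix such as '[1]'.
        key = key.split("]", 1)[1]
    return key.split("|", 1)[0]


def satisfies_or_rule(rule: Dict, lines: Iterable[str]) -> bool:
    """True if at least one listed accession is present and none is illegally repeated."""
    allowed = {entry["accession"]: entry for entry in rule["attributes"]}
    counts = Counter(a for a in map(accession_of, lines) if a in allowed)
    if not counts:
        return False
    return all(allowed[a]["repeatable"] or n == 1 for a, n in counts.items())


analyte = [
    "MS:1003169|proforma peptidoform sequence=AAAAGSTSVKPIFSR",
    "MS:1003270|proforma peptidoform ion notation=AAAAGSTSVKPIFSR/3",
]
print(satisfies_or_rule(CHARGE_RULE, analyte))
```

Under these assumptions, an analyte that carries only the ion notation passes the check, which appears to be the situation the commit is meant to allow now that `charge state` has moved out of the `<Analyte>` block in the example file.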