Fraction Column is lost and reevaluated by MSStats #174
Comments
I think it would be better if OpenMS just exported the Fraction column correctly, instead of hoping for a correct guess by MSstats.
@timosachsenberg I have no idea why this is not the case. I thought we export everything.
I also did a PR to MSstats once to address this issue. Maybe it did not make it into 3.22? Did you check 3.22.1 or whatever else came before 4?
Seems not to be in the code anymore, after they changed their code structure!
I tested with 3.22.1, which should be the latest version before v4. And yes, v4 is not compatible with the nfcore/proteomicslfq docker in any case... For testing (v4.2.0) I had to build another container.
Yeah, we checked. We export it, and it seems that the issue is on the MSstats side (see Till's comments).
Can you find out why it is not compatible? In theory the openms::openms2.7.0pre package should be built with the latest conda packages. 2.6.0 from bioconda is of course outdated. I think this would be the way forward.
proteomicslfq_docker_build.log Attached is the log of the dockerfile build of nf-core/proteomicslfq with the following environment.yml: name: nf-core-proteomicslfq-1.0.0
So there are conflicts but conda can't figure out where.
I would try "mamba" to find the conflicts. Conda is basically useless for this. And in this case even seems to be bugged.
After some testing I finally managed to include MSstats v4.2, but for this I needed to change the version of python (to v3.9) and ptxqc (to v1.0.12). Unfortunately, this leads to an error in ptxqc when running the test profile. The current environment is: name: nf-core-proteomicslfq-1.0.0
The error of ptxqc is the following: Loading required package: PTXQC
I will ask @cbielow if he knows what the issue is here.
I cannot find anything obviously wrong with the code in PTXQC.
Why does it want an mqpar.xml at all? We input mztab.
It's quite an unusual combination indeed, but the mqpar.xml is used to find some threshold parameters, if available.
The script I'm using is this nextflow script: https://github.com/tillenglert/proteomicslfq/blob/master/main.nf#L1304 with this config (testfiles): As I'm still working on msfragger I tested the ptxqc process with comet. The logs and inputfiles are attached to this comment:
The error is fixed in the current development version of PTXQC. Since this is a regression, the last working version should be
Ah perfect! I hadn't tried this version before, but it works and is compatible with the remaining packages. This is the current environment I'm using, which works for MSstats and PTXQC: name: nf-core-proteomicslfq-1.0.0
Feel free to open a PR with the environment update
I'm currently adding MSFragger as a search engine for ProteomicsLFQ. When running the minimal test profile I ran into an issue with MSstats. The tool could not figure out the fractionation of the samples and stopped the execution with the following message:
Searching for the cause of this issue, I looked into the source code of MSstats and the function OpenMStoMSstatsFormat, which preprocesses the data for MSstats before the dataProcess function is run.
This function keeps only the required columns of the out.csv produced by proteomicslfq, which are the following:
source: https://rdrr.io/bioc/MSstats/src/R/OpenMStoMSstatsFormat.R (MSstats 3.22)
This leads to the loss of the Fraction column. It did not cause an error with the Comet or MSGF+ search engines, because MSstats analyses the features and can detect whether the runs are technical replicates or fractionated samples if the features are clear enough. I guess the problem with MSFragger was that it found too many overlapping features and, at the same time, too many duplicated features across fractions and samples.
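To illustrate the effect, here is a minimal Python sketch (not the MSstats R code) of what happens when a converter keeps only a fixed set of columns. The table contents and the `REQUIRED` list are assumptions made up for this example; they are not the actual column list used by OpenMStoMSstatsFormat.

```python
import csv
import io

# Hypothetical excerpt of an out.csv-like table; names and values are illustrative only.
RAW = """ProteinName,PeptideSequence,PrecursorCharge,Condition,BioReplicate,Run,Fraction,Intensity
P1,PEPTIDEA,2,ctrl,1,run_a,1,1000
P1,PEPTIDEA,2,ctrl,1,run_b,2,1200
"""

# Assumed "required" columns for this sketch (NOT the real MSstats list).
# Fraction is deliberately absent, mirroring the behaviour described above.
REQUIRED = [
    "ProteinName", "PeptideSequence", "PrecursorCharge",
    "Condition", "BioReplicate", "Run", "Intensity",
]


def keep_required(rows, required):
    """Return copies of the rows that contain only the required columns."""
    return [{col: row[col] for col in required} for row in rows]


rows = list(csv.DictReader(io.StringIO(RAW)))
reduced = keep_required(rows, REQUIRED)

print("Fraction in input:  ", "Fraction" in rows[0])
print("Fraction in reduced:", "Fraction" in reduced[0])
```

After the subsetting, the two rows differ only in Run and Intensity, so any downstream step has to guess whether run_a and run_b are fractions of one sample or technical replicates.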
When testing the newest version of MSstats (4.2), it correctly assigned the fractions. The latest version depends on MSstatsConvert, which includes the conversion tools for different MS tools. So updating might make the ProteomicsLFQ pipeline more robust to errors, especially as the fraction information is currently lost.
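Independent of the MSstats version, one possible safeguard would be to check the fraction annotation before the table is handed to MSstats. The following is a minimal Python sketch, not part of ProteomicsLFQ or MSstats; the function name `check_fraction_annotation` and the row layout (dicts with `Run` and `Fraction` keys) are assumptions made for illustration.

```python
from collections import defaultdict


def check_fraction_annotation(rows):
    """Return a list of problems found in the Run/Fraction annotation.

    `rows` is a list of dicts, one per table row (hypothetical layout).
    An empty list means no problem was detected by this simple check.
    """
    if not rows:
        return ["table is empty"]
    if "Fraction" not in rows[0]:
        return ["Fraction column is missing"]

    # Collect every fraction label that is attached to each run.
    fractions_per_run = defaultdict(set)
    for row in rows:
        fractions_per_run[row["Run"]].add(row["Fraction"])

    # A single run should belong to exactly one fraction.
    problems = []
    for run, fractions in sorted(fractions_per_run.items()):
        if len(fractions) > 1:
            problems.append(
                f"run {run} maps to several fractions: {sorted(fractions)}"
            )
    return problems


example = [
    {"Run": "run_a", "Fraction": "1"},
    {"Run": "run_b", "Fraction": "2"},
]
print(check_fraction_annotation(example))
```

A check like this only catches a missing or inconsistent annotation; it does not replace exporting and passing through the Fraction column itself.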