Sfitz add mosdepth quantize #88

Open
wants to merge 47 commits into
base: main

sorelfitzgibbon
Collaborator

@sorelfitzgibbon sorelfitzgibbon commented Oct 30, 2024

Description

  • Add mosdepth quantize and update nftest.
  • Currently written for 4 quantize bins, but the original software allows any number of bins. An issue has been opened to track this.
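
For reference, a minimal sketch of how the label exports could be generated for an arbitrary number of bins rather than a fixed four. mosdepth reads bin labels from `MOSDEPTH_Q0`, `MOSDEPTH_Q1`, and so on; the label values below, and the idea of passing them as a list, are assumptions for illustration and not how this PR is implemented.

```shell
# Hypothetical bin labels; any number of labels can be listed here.
set -- NO_COVERAGE LOW_COVERAGE CALLABLE HIGH_COVERAGE VERY_HIGH_COVERAGE

# Export one MOSDEPTH_Q<n> variable per label, numbering from zero.
i=0
for label in "$@"; do
    export "MOSDEPTH_Q${i}=${label}"
    i=$((i + 1))
done

echo "exported ${i} bin labels"
```

The number of labels would need to match the number of bins implied by the `--quantize` argument.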

Testing Results

  • NFTest
    • log: /hot/software/pipeline/pipeline-generate-SQC-BAM/Nextflow/development/unreleased/sfitz-add-mosdepth-quantize/log-nftest-20241101T231717Z.log
    • cases: default set

Checklist

  • I have read the code review guidelines and the code review best practice on GitHub check-list.

  • I have reviewed the Nextflow pipeline standards.

  • The name of the branch is meaningful and well formatted following the standards, using [AD_username (or 5 letters of AD if AD is too long)]-[brief_description_of_branch].

  • I have set up or verified the branch protection rule following the github standards before opening this pull request.

  • I have added my name to the contributors listings in the manifest block in the nextflow.config as part of this pull request, am listed
    already, or do not wish to be listed. (This acknowledgement is optional.)

  • I have added the changes included in this pull request to the CHANGELOG.md under the next release version or unreleased, and updated the date.

  • I have updated the version number in the metadata.yaml and manifest block of the nextflow.config file following semver, or the version number has already been updated. (Leave it unchecked if you are unsure about new version number and discuss it with the infrastructure team in this PR.)

  • I have tested the pipeline using NFTest, or I have justified why I did not need to run NFTest above.

@sorelfitzgibbon sorelfitzgibbon requested a review from a team as a code owner October 30, 2024 01:43
Contributor

@yashpatel6 yashpatel6 left a comment

Generally looking good! A few minor details to iron out:

Comment on lines +107 to +111
mosdepth_quantize_use_fast_algorithm:
    type: 'Bool'
    required: false
    default: false
    help: 'Use fast algorithm for quantizing coverage values'
Contributor

question (non-blocking): Is there a downside to enabling the use of the fast algorithm by default?

Collaborator Author

@sorelfitzgibbon sorelfitzgibbon Nov 1, 2024

Yeah, this is a difficult decision. Not using fast mode gives more "correct" results. Fast mode ignores paired-read overlaps and CIGAR strings (and therefore indels relative to the reference). Ignoring the paired-read overlap is the bigger issue, especially for samples with small insert sizes relative to read length. This is noted in the README. The time difference isn't clear, as I have benchmarked only a few samples, and not directly on scratch. It's enough that the mosdepth author recommends fast mode for most use cases (I assume cases without small inserts). We currently have fast mode set to true by default for the regular mosdepth coverage calculation, so we should probably change one or the other to make them consistent.
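
To make the small-insert concern concrete, here is a rough arithmetic sketch. The read length and insert size are made-up numbers, not measurements from this pipeline, and "insert size" is assumed to mean the full fragment length; CIGAR handling is ignored entirely.

```shell
# Illustrative values only: 150 bp paired reads from a 200 bp fragment.
read_length=150
insert_size=200

# Bases covered by both mates; zero when the fragment is long enough.
overlap=$(( 2 * read_length - insert_size ))
if [ "$overlap" -lt 0 ]; then overlap=0; fi

# Without overlap correction each mate is counted independently,
# so the overlapping bases contribute to depth twice.
fast_mode_bases=$(( 2 * read_length ))

# With overlap correction the shared bases are counted once.
corrected_bases=$(( fast_mode_bases - overlap ))

echo "fast mode: ${fast_mode_bases} bases, overlap-corrected: ${corrected_bases} bases"
```

With these numbers, 100 of the 300 aligned bases fall in the overlap, so the uncorrected count overstates depth across that region. The effect shrinks to nothing once the insert size reaches twice the read length.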

Comment on lines 47 to 51
process {
    withName: run_validate_PipeVal {
        when = false
    }
}
Contributor

suggestion: This can be removed; the test file should be small enough that validation won't pose a problem.

main.nf Outdated
Comment on lines 56 to 59
include { assess_coverage_mosdepth } from './module/windows_mosdepth' addParams(
    workflow_output_dir: "${params.output_dir_base}/mosdepth-${params.mosdepth_version}",
    workflow_log_output_dir: "${params.log_output_dir}/process-log/mosdepth-${params.mosdepth_version}"
    )
Contributor

suggestion: This import is duplicated below; one of them can be removed

Comment on lines 35 to 38
export MOSDEPTH_Q0=${params.mosdepth_q0_label}
export MOSDEPTH_Q1=${params.mosdepth_q1_label}
export MOSDEPTH_Q2=${params.mosdepth_q2_label}
export MOSDEPTH_Q3=${params.mosdepth_q3_label}
Contributor

suggestion: Quote these values, since the parameters are used directly; quoting makes this a bit safer.

Suggested change
- export MOSDEPTH_Q0=${params.mosdepth_q0_label}
- export MOSDEPTH_Q1=${params.mosdepth_q1_label}
- export MOSDEPTH_Q2=${params.mosdepth_q2_label}
- export MOSDEPTH_Q3=${params.mosdepth_q3_label}
+ export MOSDEPTH_Q0="${params.mosdepth_q0_label}"
+ export MOSDEPTH_Q1="${params.mosdepth_q1_label}"
+ export MOSDEPTH_Q2="${params.mosdepth_q2_label}"
+ export MOSDEPTH_Q3="${params.mosdepth_q3_label}"
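
A small sketch of what the quoting guards against. Nextflow substitutes `${params...}` into the script text before the shell runs it, so the snippet below simulates the rendered lines as strings. The label value `LOW COVERAGE` is a hypothetical example chosen because it contains a space; it is not a value used in this PR.

```shell
# Hypothetical label with a space, standing in for params.mosdepth_q1_label.
label='LOW COVERAGE'

# The script lines as they would look after parameter substitution.
unquoted_line="export MOSDEPTH_Q1=${label}"
quoted_line="export MOSDEPTH_Q1=\"${label}\""

# Run each rendered line in a fresh shell and capture the resulting value.
unquoted_result=$(sh -c "${unquoted_line}; printf '%s' \"\$MOSDEPTH_Q1\"")
quoted_result=$(sh -c "${quoted_line}; printf '%s' \"\$MOSDEPTH_Q1\"")

echo "unquoted: ${unquoted_result}"
echo "quoted:   ${quoted_result}"
```

In the unquoted case the shell splits the value on the space, so only the first word is assigned. Double quotes would still not protect a label that itself contains `"` or `$`, so this is a mitigation rather than full input validation.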

path ".command.*"

script:
output_filename = generate_standard_filename("mosdepth${params.picard_version}",
Contributor

Suggested change
- output_filename = generate_standard_filename("mosdepth${params.picard_version}",
+ output_filename = generate_standard_filename("mosdepth-${params.picard_version}",

@sorelfitzgibbon
Collaborator Author

The test log path in the initial description has been updated to point to the new test results.

2 participants