The current implementation has a few issues:

- Code duplication (`output_bam_basename = sub(sub(unmapped_bam, sub_strip_path, ""), sub_strip_unmapped, "") + ".aligned.unsorted"` is called 3 times)
- Because it relies on the `gs://` prefix, this workflow struggles with every other cloud URL I've tried to use it with (AWS and DNAnexus).
- The path is stripped with `sub(unmapped_bam, "gs://.*/", "")`. This means the file URL is converted directly to a string, and is not properly converted to the location at which this file will actually be downloaded.

I've solved these issues by using `basename()`, which is a much more portable and stable way of converting from a `File` to a `String`, and doesn't require prefix stripping. In addition, I've removed the code duplication, which is no longer needed because of Cromwell fixes.

As a demonstration that this works, I've run the whole GATK3 Best Practice pipeline with my fixed version of the workflow on AWS (with Cromwell 37), and downloaded the Cromwell metadata (including the submitted workflow, and all results). Unfortunately I had to include some other AWS workarounds, but you can see from the logs that we start with this as an input:
And we then end up with:
This is exactly what you want.
Full metadata logs are here. The workflow as a whole doesn't work (it fails for an unrelated reason), but the GenericPreProcessingWorkflow does complete successfully, which is what is of interest here.
GenericPreProcessingWorkflow.log
BestPractice.log
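
For anyone comparing the two approaches side by side, here is a minimal sketch. It is not the workflow from this PR: the workflow name `BasenameSketch`, the `unmapped_bam_suffix` input with its default value, and the output names are assumptions made for illustration. Only `unmapped_bam`, `output_bam_basename`, the `gs://.*/` pattern and the `.aligned.unsorted` suffix are taken from the description above.

```wdl
version 1.0

# Illustrative sketch only; not the workflow changed in this PR.
workflow BasenameSketch {
  input {
    File unmapped_bam
    # Assumed suffix, chosen for the example
    String unmapped_bam_suffix = ".unmapped.bam"
  }

  # Old approach: the directory part is only removed when the URL
  # starts with gs://. For any other scheme the first pattern does
  # not match, so the full URL is kept and only the suffix is stripped.
  String old_basename = sub(sub(unmapped_bam, "gs://.*/", ""), unmapped_bam_suffix + "$", "") + ".aligned.unsorted"

  # New approach: basename() drops the directory part whatever the
  # URL scheme is, and its second argument strips the suffix.
  String output_bam_basename = basename(unmapped_bam, unmapped_bam_suffix) + ".aligned.unsorted"

  output {
    String old_style = old_basename
    String new_style = output_bam_basename
  }
}
```

With a `gs://` input both expressions should give the same result. With an input on another backend, only the `basename()` version is expected to yield a bare file name.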