-
Notifications
You must be signed in to change notification settings - Fork 4
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Update nextclade rules in ingest [#21] #32
Conversation
input: | ||
nextclade="results/nextclade.tsv", | ||
output: | ||
nextclade_metadata=temp("results/nextclade_metadata.tsv"), |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Is there a reason to mark this as temp
? Would it not be useful for debugging along with all the other intermediate files in results/
?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It's marked as temp because the next rule (join_metadata_and_nextclade
) merges this file and the other metadata (data/subset_metadata.tsv
) together, so all the data ends up in that file anyway. Any debugging would most likely happen using that file.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
What if we want to debug augur curate rename
in this rule and join_metadata_and_nextclade
fails or we simply don't want to run it?
My argument is mostly based on consistency, since the same could be said about results/nextclade.tsv
: why not make that temp because it eventually makes its way into results/metadata.tsv
?
Consider this non-blocking since it works fine.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
What if we want to debug
augur curate rename
in this rule andjoin_metadata_and_nextclade
fails or we simply don't want to run it?
In such a situation, Snakemake's --notemp
option could be used.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
My argument is mostly based on consistency, since the same could be said about
results/nextclade.tsv
: why not make that temp because it eventually makes its way intoresults/metadata.tsv
?
results/nextclade.tsv
and results/nextclade_metadata.tsv
are effectively isotopes of the same underlying data; we don't need to retain both of them, because if you have one, you can tell what had to have been in the other one. I marked the latter one as temp because it's the newer of the two, so it was less change.
c1c19ac
to
6a4b959
Compare
Description of proposed changes
Updates usage of
augur
iningest
build to align with pathogen repo guide standards.Closes #21