Ensure that contig names can be used within filenames #57

Open
fedarko opened this issue Oct 13, 2022 · 0 comments
Labels
backburner (low-priority things that are still good to keep track of), bug (something isn't working)

fedarko commented Oct 13, 2022

There are a few commands that include contig names in filenames -- right now it's just phasing commands:

  • strainFlye smooth create (output reads for each contig are named [contig].fasta.gz)
  • strainFlye smooth assemble (output LJA assemblies for each contig are written to a folder named [contig])
  • strainFlye link nt (output pickle files are named [contig]_pos2nt2ct.pickle and [contig]_pospair2ntpair2ct.pickle)
  • strainFlye link graph (output graphs, regardless of format, include [contig] as a prefix)

In most cases, contig names should be restricted to [a-zA-Z0-9_.-], and should thus be fine as filenames. But I'm sure we'll eventually start seeing unusual contig names containing spaces or other characters (path separators, for example) that break these output paths.

I'm not sure it's worth trying to anticipate and address these problems in advance (we could modify the FASTA-loading parts of the code to do some validation on contig names), but I'm making this issue just to catalog what parts of the code this problem touches at the moment.
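
As a rough illustration of what that validation could look like, here is a minimal sketch in Python. The function name `find_unsafe_contig_names` and the allowlist pattern are hypothetical and are not part of strainFlye; the sketch assumes contig names have already been read from the FASTA file into a list of strings.

```python
import re

# Hypothetical allowlist (an assumption, not strainFlye's actual rule):
# letters, digits, underscore, period, and hyphen.
SAFE_CONTIG_NAME = re.compile(r"[A-Za-z0-9_.-]+")


def find_unsafe_contig_names(names):
    """Return the contig names that are not safe to embed in filenames."""
    unsafe = []
    for name in names:
        # "." and ".." match the allowlist but are special path components,
        # so they are rejected explicitly. Empty names fail the "+" match.
        if name in (".", "..") or SAFE_CONTIG_NAME.fullmatch(name) is None:
            unsafe.append(name)
    return unsafe
```

A caller could run this once after loading the FASTA file and raise an error (or warn) if the returned list is non-empty, before any of the commands listed above try to create output files.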

fedarko added the bug and backburner labels on Oct 13, 2022