Enhance seqtk_mergefa: FASTA-only output and improved ambiguity handling #6350

Open
wants to merge 5 commits into
base: main

@dianichj dianichj commented Sep 24, 2024

Bug Fix
The tool merges FASTA/FASTQ files into a FASTA output and applies the quality threshold to FASTQ inputs when merging.

  1. Clarified the -m option to handle ambiguous bases and conflicts (e.g., N and other IUPAC codes).
  2. Improved help documentation with clearer examples and explanations.
  3. Refined input parameter labels for better clarity and consistency.

FOR CONTRIBUTOR:

  • I have read the CONTRIBUTING.md document and this tool is appropriate for the tools-iuc repo.
  • License permits unrestricted use (educational + commercial)
  • This PR adds a new tool or tool collection
  • This PR updates an existing tool or tool collection
  • This PR does something else (explain below)

- Restricted input formats to only FASTA and compressed FASTA files (.fasta, .fasta.gz). Removed support for FASTQ files.
- Updated the tool description and help section to accurately reflect that the tool only merges FASTA files.
- Improved the tool's clarity by ensuring it is used for its intended purpose: merging FASTA files only.

wm75 (Contributor) commented Sep 24, 2024

Sorry @dianichj if our offline discussion of this wasn't clear enough, but that's not what the change was meant to look like.
The tool can merge fastq inputs - it's just that the result will be fasta in all cases. So instead of restricting the input formats, you should fix the possible output formats.

@dianichj (Author) commented

> Sorry @dianichj if our offline discussion of this wasn't clear enough, but that's not what the change was meant to look like. The tool can merge fastq inputs - it's just that the result will be fasta in all cases. So instead of restricting the input formats, you should fix the possible output formats.

Pavan and I are testing whether the tool can in fact generate FASTQ files, as the original tool documentation suggests. So far we have run into some issues. We will confirm later whether the tool handles both FASTA and FASTQ or only FASTA =)!

"Tool merges FASTA/Q files into a FASTA output and considers the quality threshold for FASTQ files when merging."

1. Clarified the -m option to handle ambiguous bases and conflicts (e.g., N and other IUPAC codes).
2. Improved help documentation with clearer examples and explanations.
3. Refined input parameter labels for better clarity and consistency.
@dianichj changed the title from "Seqtk mergefa toolshed edit (FASTA only tool)" to "Enhance seqtk_mergefa: FASTA-only output and improved ambiguity handling" on Sep 26, 2024
edited echo command line back to #echo
@wm75 wm75 added the wip label Sep 26, 2024