Support Metaphlan4 format #107
The number of rows before the profile differs between MetaPhlAn3 and MetaPhlAn4, so taxpasta did not work on MetaPhlAn4 output. Since the header is formatted the same in both versions, a quick scan for the first field of the header can ensure that the table is read from the right line in both formats.
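A minimal sketch of the header scan described above, not the code from this PR. It assumes that the first header field is `#clade_name` in both MetaPhlAn versions; the function name `find_header_index` and the constant `HEADER_PREFIX` are hypothetical.

```python
from typing import Iterable

# Assumption: the profile header starts with this field in both
# MetaPhlAn3 and MetaPhlAn4 output.
HEADER_PREFIX = "#clade_name"


def find_header_index(lines: Iterable[str], prefix: str = HEADER_PREFIX) -> int:
    """Return the zero-based index of the first line that starts with prefix."""
    for index, line in enumerate(lines):
        if line.startswith(prefix):
            return index
    raise ValueError(f"No line starting with {prefix!r} was found.")
```

The returned index equals the number of preamble lines, so it could be used as the number of rows to skip when reading the table, however many comment lines a given MetaPhlAn version writes.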
Thanks @TheOafidian! I'll leave this up to @Midnighter to decide if he's OK with this system, however I can predict already he will ask for example test data to be added from MetaPhlAn4. Would you be able to upload some too? (Can be the one you found the issue from, but you can manually remove any sample-identification info)
Codecov Report
Patch coverage:
@@ Coverage Diff @@
## dev #107 +/- ##
==========================================
+ Coverage 81.99% 82.17% +0.17%
==========================================
Files 106 110 +4
Lines 1594 1677 +83
Branches 281 299 +18
==========================================
+ Hits 1307 1378 +71
- Misses 247 255 +8
- Partials 40 44 +4
Just to be sure, the extra line is a default feature of MetaPhlAn4 and not due to some option that you have set?
Hey @Midnighter, I don't have time to look for an example on my work laptop right now, but indeed I used the default arguments. In the MetaPhlAn4 tutorial I found an example of output where this new line is also present, as it was in my data.
Thank you for the contribution @TheOafidian, and apologies for the wait. I'm afraid that the logic needs to be slightly more complicated, as the `read` method is designed to accept both buffers, like `StringIO`, and path-like objects, such as `str` or `Path`.
If you don't feel like tackling that, I'm happy to take over from here. As mentioned in the comments, adding one or two MetaPhlAn4 files to the test data is also needed.
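One way the buffer-or-path requirement could be handled is sketched below. This is an illustration under stated assumptions, not the implementation that was merged: the names `locate_header` and `_scan` are hypothetical, buffers are assumed to be seekable text-mode objects, and the header is assumed to start with `#clade_name`.

```python
import os
from typing import Iterable

# Assumption: first header field in both MetaPhlAn3 and MetaPhlAn4 output.
HEADER_PREFIX = "#clade_name"


def _scan(lines: Iterable[str], prefix: str) -> int:
    """Return the zero-based index of the first line that starts with prefix."""
    for index, line in enumerate(lines):
        if line.startswith(prefix):
            return index
    raise ValueError(f"No line starting with {prefix!r} was found.")


def locate_header(source, prefix: str = HEADER_PREFIX) -> int:
    """Find the header line index for a path-like object or a text buffer."""
    if isinstance(source, (str, os.PathLike)):
        # Path-like argument: open the file only for the duration of the scan.
        with open(source) as handle:
            return _scan(handle, prefix)
    # Buffer argument: remember the current position and restore it after
    # scanning, so that a later reader sees the buffer unchanged.
    position = source.tell()
    try:
        return _scan(source, prefix)
    finally:
        source.seek(position)
```

Because the buffer is rewound after the scan, the same object can then be handed to the table reader together with the number of rows to skip, for example via the `skiprows` argument of `pandas.read_csv`.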
Some test data thanks to @apcamargo
Handle buffer arguments in addition to path-like arguments.
I did not find any issues raised for the MetaPhlAn4 format problem, so I just added the code I used on my local example to get it to work on the new format.