Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Support Metaphlan4 format #107

Merged
merged 3 commits into from
Jul 2, 2023
Merged

Support Metaphlan4 format #107

merged 3 commits into from
Jul 2, 2023

Conversation

TheOafidian
Copy link
Contributor

@TheOafidian TheOafidian commented Jun 2, 2023

The amount of rows before the profile differs between metaphlan3 and 4. Therefore taxpasta was not working on metaphlan4 output under the current circumstances. Since the header is formatted the same between both versions, a quick scan for the first field of the header could ensure the table is read from the right line in both formats.

I did not find any issues raised for this, so I just added the code I've used on my local example of to get it to work on the new format.

The amount of rows before the profile differs between metaphlan3 and 4. Therefore taxpasta was not working on metaphlan4 output under the current circumstances. Since the header is formatted the same between both versions, a quick scan for the first field of the header could ensure the table is read from the right line in both formats.
@jfy133
Copy link
Contributor

jfy133 commented Jun 2, 2023

Thanks @TheOafidian !

I'll leave this up to @Midnighter to decide if he's OK with this system, however I can predict already he will ask for example test data to be added from MetaPhlAn4. Would you be able to upload some too? (Can be the one you found the issue from, but you can manually remove any sample-identification info)

@codecov-commenter
Copy link

codecov-commenter commented Jun 2, 2023

Codecov Report

Patch coverage: 84.33% and project coverage change: +0.17 🎉

Comparison is base (f60efe6) 81.99% compared to head (06c1191) 82.17%.

❗ Current head 06c1191 differs from pull request most recent head ef280dc. Consider uploading reports for the commit ef280dc to get more accurate results

❗ Your organization is not using the GitHub App Integration. As a result you may experience degraded service beginning May 15th. Please install the Github App Integration for your organization. Read more.

Additional details and impacted files
@@            Coverage Diff             @@
##              dev     #107      +/-   ##
==========================================
+ Coverage   81.99%   82.17%   +0.17%     
==========================================
  Files         106      110       +4     
  Lines        1594     1677      +83     
  Branches      281      299      +18     
==========================================
+ Hits         1307     1378      +71     
- Misses        247      255       +8     
- Partials       40       44       +4     
Impacted Files Coverage Δ
.../application/metaphlan/metaphlan_profile_reader.py 67.56% <50.00%> (-32.44%) ⬇️
...ucture/application/application_service_registry.py 93.16% <83.33%> (+0.26%) ⬆️
...rc/taxpasta/infrastructure/application/__init__.py 100.00% <100.00%> (ø)
...pasta/infrastructure/application/ganon/__init__.py 100.00% <100.00%> (ø)
.../infrastructure/application/ganon/ganon_profile.py 100.00% <100.00%> (ø)
...tructure/application/ganon/ganon_profile_reader.py 100.00% <100.00%> (ø)
...ion/ganon/ganon_profile_standardisation_service.py 100.00% <100.00%> (ø)
...a/infrastructure/application/supported_profiler.py 100.00% <100.00%> (ø)

☔ View full report in Codecov by Sentry.
📢 Do you have feedback about the report comment? Let us know in this issue.

@Midnighter
Copy link
Contributor

Just to be sure, the extra line is a default feature of MetaPhlAn4 and not due to some option that you have set?

@TheOafidian
Copy link
Contributor Author

Hey @Midnighter, don't have time to go look for an example on my work laptop right now, but indeed I've used default args. I've found in the Metaphlan4 tutorial an example of output where this new line is also present like it was in my data.

Copy link
Contributor

@Midnighter Midnighter left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thank you for the contribution @TheOafidian, and apologies for the wait. I'm afraid that the logic needs to be slightly more complicated, as the read method is designed to accept both buffers, like StringIO, and path-like objects, such as str or Path.

If you don't feel like tackling that, I'm happy to take over from here. As mentioned in the comments, adding one or two MetaPhlAn4 files to the test data is also needed.

@jfy133
Copy link
Contributor

jfy133 commented Jun 26, 2023

Some test data thanks to @apcamargo

#111 (comment)

Handle buffer arguments in addition to path-like arguments.
@Midnighter Midnighter merged commit 7349f16 into taxprofiler:dev Jul 2, 2023
13 checks passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

[BUG] CRITICAL error when parsing MetaPhlAn4's output
4 participants