Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Feat: add rank lineage #130

Merged
merged 8 commits into from
Aug 19, 2023
Merged

Feat: add rank lineage #130

merged 8 commits into from
Aug 19, 2023

Conversation

Midnighter
Copy link
Contributor

@Midnighter Midnighter commented Aug 19, 2023

@Midnighter Midnighter self-assigned this Aug 19, 2023
@codecov-commenter
Copy link

codecov-commenter commented Aug 19, 2023

Codecov Report

Patch coverage: 58.82% and project coverage change: +3.08% 🎉

Comparison is base (d130751) 82.85% compared to head (79167a3) 85.93%.

❗ Current head 79167a3 differs from pull request most recent head b217f33. Consider uploading reports for the commit b217f33 to get more accurate results

❗ Your organization is not using the GitHub App Integration. As a result you may experience degraded service beginning May 15th. Please install the Github App Integration for your organization. Read more.

Additional details and impacted files
@@            Coverage Diff             @@
##              dev     #130      +/-   ##
==========================================
+ Coverage   82.85%   85.93%   +3.08%     
==========================================
  Files         114      114              
  Lines        1744     1778      +34     
  Branches      308      316       +8     
==========================================
+ Hits         1445     1528      +83     
+ Misses        255      202      -53     
- Partials       44       48       +4     
Files Changed Coverage Δ
src/taxpasta/infrastructure/cli/merge.py 52.60% <0.00%> (-2.22%) ⬇️
src/taxpasta/infrastructure/cli/standardise.py 32.29% <0.00%> (-2.54%) ⬇️
src/taxpasta/domain/service/taxonomy_service.py 100.00% <100.00%> (+3.70%) ⬆️
...tructure/domain/service/taxopy_taxonomy_service.py 73.78% <100.00%> (+73.78%) ⬆️

... and 2 files with indirect coverage changes

☔ View full report in Codecov by Sentry.
📢 Have feedback on the report? Share it here.

@Midnighter Midnighter requested a review from a team August 19, 2023 16:57
Copy link
Contributor

@jfy133 jfy133 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Missing any docs updates?

Overall looks ok to me, just a few queries (but not blockers IMO)

No tested, will try now but can't guaruntee I'll get to it, I trust you ;)

Comment on lines +148 to +150
"Pseudomonadales;Gammaproteobacteria;Proteobacteria;"
"Bacteria;root",
"Saccharomycetes;Ascomycota;Eukaryota;root",
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Which order is this meant to be in the actual file/ I typically would view the rank to go broad to specific left to right,

e.g. "Saccharomycetes;Ascomycota;Eukaryota;root", would be root;Eukaryota;Ascomycota;Saccaromycetes,

Which also makes me wonder: what happens when there are species that are 'missing' an intermediate taxonmic rank (e.g. has a species, genus, family name, but no order name...., but then continues with classes, phyla etc.)

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

That's a fair point. I simply used the order returned by taxopy (which makes sense from the perspective of a particular taxon). However, you're right that most tools provide the lineage from higher to more specific rank.

@jfy133
Copy link
Contributor

jfy133 commented Aug 19, 2023

Tested and no bugs: still works for me!

I still personally find the reversed the lineages reflexively unintuitive, but not wrong. You're welcome to merge from my POV! (once docs updated, if necessary!)

@Midnighter
Copy link
Contributor Author

We can turn around the order of the lineages. It makes sense to me and hopefully, not too many people have built code based on this yet.

I don't actually know how taxopy reports skipped ranks in the lineage. Would need to add test data for that. Do you happen to know an example from the NCBI taxonomy?

@Midnighter
Copy link
Contributor Author

As discussed, will merge this and make the proposed changes in a new PR.

@Midnighter Midnighter merged commit d3d03bf into dev Aug 19, 2023
@Midnighter Midnighter deleted the feat-rank-lineage branch August 19, 2023 17:53
@jfy133
Copy link
Contributor

jfy133 commented Aug 19, 2023

We can turn around the order of the lineages. It makes sense to me and hopefully, not too many people have built code based on this yet.

👍

I don't actually know how taxopy reports skipped ranks in the lineage. Would need to add test data for that. Do you happen to know an example from the NCBI taxonomy?

Not off the top of my head... I remember seeing it in metaphlan2 or malt results...

Maybe candidatus phyla etc?

@jfy133
Copy link
Contributor

jfy133 commented Aug 19, 2023

Possibly something like this: https://www.ncbi.nlm.nih.gov/Taxonomy/Browser/wwwtax.cgi?mode=Info&id=1387480&lvl=3&lin=f&keep=1&srchmode=1&unlock

(Can't check properly from my phone though...)

@jfy133 jfy133 mentioned this pull request Aug 25, 2023
9 tasks
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

[Feature] Add --add-rank-lineage to taxpasta merge and taxpasta standardise
3 participants