Top hit not listed when -tid and -tcov satisfied, LCA not performed #2

Open
morien opened this issue Nov 6, 2021 · 1 comment

morien commented Nov 6, 2021

I have an issue with the lca.py script that I haven't been able to debug.

Here is my input data:

ASV1146	N/A	QWEAS1184-15	129788	99.042	100	1.15e-158	562	Genbank	Eukaryota / Mollusca / Bivalvia / Venerida / Veneridae / Ruditapes / Ruditapes philippinarum
ASV1184	Mucor hiemalis isolate A26 cytochrome oxidase subunit I (COI) gene, partial cds; mitochondrial	FJ590554	64493	99.569	74	3.69e-114	424	Genbank	Fungi / Mucoromycota / Mucoromycetes / Mucorales / Mucoraceae / Mucor / Mucor hiemalis
ASV1184	Mucor hiemalis isolate 58 cytochrome oxidase subunit I (COI) gene, partial cds; mitochondrial	FJ590552	64493	99.569	74	3.69e-114	424	Genbank	Fungi / Mucoromycota / Mucoromycetes / Mucorales / Mucoraceae / Mucor / Mucor hiemalis
ASV1184	Mucor hiemalis isolate A26 cytochrome oxidase subunit I (COI) gene, partial cds; mitochondrial	FJ590554	64493	99.569	74	3.69e-114	424	Genbank	Fungi / Mucoromycota / Mucoromycetes / Mucorales / Mucoraceae / Mucor / Mucor hiemalis
ASV1184	Mucor hiemalis isolate 58 cytochrome oxidase subunit I (COI) gene, partial cds; mitochondrial	FJ590552	64493	99.569	74	3.69e-114	424	Genbank	Fungi / Mucoromycota / Mucoromycetes / Mucorales / Mucoraceae / Mucor / Mucor hiemalis
ASV1184	Mucor sp. BM-2009-2 cytochrome oxidase subunit I (COI) gene, partial cds; mitochondrial	FJ590553	664300	99.138	74	1.72e-112	418	Genbank	Fungi / Mucoromycota / Mucoromycetes / Mucorales / Mucoraceae / Mucor / Mucor sp. BM-2009-2
ASV1184	Mucor sp. BM-2009-2 cytochrome oxidase subunit I (COI) gene, partial cds; mitochondrial	FJ590553	664300	99.138	74	1.72e-112	418	Genbank	Fungi / Mucoromycota / Mucoromycetes / Mucorales / Mucoraceae / Mucor / Mucor sp. BM-2009-2
ASV1184	N/A	INRMA1956-14	225336	99.042	100	1.15e-158	562	Genbank	Eukaryota / Arthropoda / Insecta / Coleoptera / Elateridae / unknown genus / Alaus
ASV1184	N/A	FCHAR2474-19	430594	99.035	99	1.48e-157	558	Genbank	Eukaryota / Arthropoda / Arachnida / Araneae / Lycosidae / Schizocosa / Schizocosa saltatrix
ASV1184	N/A	INRMA1996-14	1395699	98.403	100	2.48e-155	551	Genbank	Eukaryota / Arthropoda / Insecta / Orthoptera / Tettigoniidae / unknown genus / Steiroxys
ASV1184	Mucor circinelloides isolate R6 cytochrome oxidase subunit I (COI) gene, partial cds; mitochondrial	FJ590555	36080	98.276	74	3.72e-109	407	Genbank	Fungi / Mucoromycota / Mucoromycetes / Mucorales / Mucoraceae / Mucor / Mucor circinelloides
ASV1184	Mucor circinelloides isolate R6 cytochrome oxidase subunit I (COI) gene, partial cds; mitochondrial	FJ590555	36080	98.276	74	3.72e-109	407	Genbank	Fungi / Mucoromycota / Mucoromycetes / Mucorales / Mucoraceae / Mucor / Mucor circinelloides
ASV1184	Mucor circinelloides f. lusitanicus strain CBS 277.49 mitochondrion, complete genome	KR809877	29924	98.083	100	7.56e-151	545	Genbank	Fungi / Mucoromycota / Mucoromycetes / Mucorales / Mucoraceae / Mucor / Mucor lusitanicus
ASV1184	Lichtheimia hongkongensis mitochondrion, complete genome	KJ561171	549293	98.083	100	7.56e-151	545	Genbank	Fungi / Mucoromycota / Mucoromycetes / Mucorales / Lichtheimiaceae / Lichtheimia / Lichtheimia hongkongensis
ASV1184	Lichtheimia sp. SYL-2013 cytochrome oxidase subunit 1 gene, complete cds; mitochondrial	KC522836	1329394	98.083	100	7.56e-151	545	Genbank	Fungi / Mucoromycota / Mucoromycetes / Mucorales / Lichtheimiaceae / Lichtheimia / Lichtheimia sp. SYL-2013
ASV1184	Mucor circinelloides f. lusitanicus strain CBS 277.49 mitochondrion, complete genome	KR809877	29924	98.083	100	7.56e-151	545	Genbank	Fungi / Mucoromycota / Mucoromycetes / Mucorales / Mucoraceae / Mucor / Mucor lusitanicus
ASV1184	Lichtheimia hongkongensis mitochondrion, complete genome	KJ561171	549293	98.083	100	7.56e-151	545	Genbank	Fungi / Mucoromycota / Mucoromycetes / Mucorales / Lichtheimiaceae / Lichtheimia / Lichtheimia hongkongensis
ASV1184	Lichtheimia sp. SYL-2013 cytochrome oxidase subunit 1 gene, complete cds; mitochondrial	KC522836	1329394	98.083	100	7.56e-151	545	Genbank	Fungi / Mucoromycota / Mucoromycetes / Mucorales / Lichtheimiaceae / Lichtheimia / Lichtheimia sp. SYL-2013
ASV1184	N/A	GMAFW261-15	6893	97.444	100	2.5e-150	534	Genbank	Eukaryota / Arthropoda / Arachnida / unknown order / unknown family / unknown genus / Araneae
ASV1184	Mucor fragilis isolate R7 cytochrome oxidase subunit I (COI) gene, partial cds; mitochondrial	FJ590556	64491	97.414	74	8.05e-106	396	Genbank	Fungi / Mucoromycota / Mucoromycetes / Mucorales / Mucoraceae / Mucor / Mucor fragilis
ASV1184	Mucor fragilis isolate R7 cytochrome oxidase subunit I (COI) gene, partial cds; mitochondrial	FJ590556	64491	97.414	74	8.05e-106	396	Genbank	Fungi / Mucoromycota / Mucoromycetes / Mucorales / Mucoraceae / Mucor / Mucor fragilis
ASV1184	Zygorhynchus moelleri isolate Cr27 cytochrome oxidase subunit I (COI) gene, partial cds; mitochondrial	FJ590569	1302847	97.357	73	1.74e-102	385	Genbank	Fungi / Mucoromycota / Mucoromycetes / Mucorales / Mucoraceae / Mucor / Mucor moelleri
ASV1184	Zygorhynchus moelleri isolate Cr27 cytochrome oxidase subunit I (COI) gene, partial cds; mitochondrial	FJ590569	1302847	97.357	73	1.74e-102	385	Genbank	Fungi / Mucoromycota / Mucoromycetes / Mucorales / Mucoraceae / Mucor / Mucor moelleri
ASV1184	N/A	GMAFW286-15	7147	97.125	100	1.16e-148	529	Genbank	Eukaryota / Arthropoda / Insecta / unknown order / unknown family / unknown genus / Diptera
ASV1184	Zygorhynchus moelleri isolate A56 cytochrome oxidase subunit I (COI) gene, partial cds; mitochondrial	FJ590568	1302847	96.889	72	1.05e-99	375	Genbank	Fungi / Mucoromycota / Mucoromycetes / Mucorales / Mucoraceae / Mucor / Mucor moelleri
ASV1184	Zygorhynchus moelleri isolate A56 cytochrome oxidase subunit I (COI) gene, partial cds; mitochondrial	FJ590568	1302847	96.889	72	1.05e-99	375	Genbank	Fungi / Mucoromycota / Mucoromycetes / Mucorales / Mucoraceae / Mucor / Mucor moelleri

Here is my command:
python2 ~/programs/galaxy-tool-lca/lca.py -i tmp.txt -o tmp.taxonomy_table.txt -b 5 -id 80 -cov 50 -t best_hit -tid 98 -tcov 80 -minbit 0

Here is my output:
ASV1184 no identification no identification no identification no identification no identification no identification no identification no identification no identification no lca

According to the documentation, I believe I should be getting a top hit, since the top hit sorted by bitscore has >98% identity and >80% coverage. I'm not applying any text filtering on the hits, either.
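
To make that expectation concrete, here is a minimal Python sketch of the check being described: take the hit with the highest bitscore and test it against the -tid and -tcov thresholds. It is not how lca.py is implemented; the `Hit` type, `parse_hit`, and `expected_top_hit` are hypothetical names, the column order is inferred from the input above, and treating the thresholds as inclusive (`>=`) is an assumption.

```python
from typing import List, NamedTuple, Optional


class Hit(NamedTuple):
    query: str
    accession: str
    identity: float  # percent identity
    coverage: float  # percent query coverage
    bitscore: float
    taxonomy: str


def parse_hit(line: str) -> Hit:
    """Parse one tab-separated hit line.

    Assumed column order (inferred from the input in this issue):
    query, description, accession, taxid, identity, coverage,
    evalue, bitscore, source, taxonomy.
    """
    fields = line.rstrip("\n").split("\t")
    return Hit(
        query=fields[0],
        accession=fields[2],
        identity=float(fields[4]),
        coverage=float(fields[5]),
        bitscore=float(fields[7]),
        taxonomy=fields[9],
    )


def expected_top_hit(
    hits: List[Hit], min_identity: float, min_coverage: float
) -> Optional[Hit]:
    """Return the highest-bitscore hit if it meets both thresholds, else None.

    Assumption: thresholds are inclusive. This mirrors the expectation in
    the issue text, not the actual logic of lca.py.
    """
    if not hits:
        return None
    best = max(hits, key=lambda h: h.bitscore)
    if best.identity >= min_identity and best.coverage >= min_coverage:
        return best
    return None


# Three of the ASV1184 rows from the input above, rebuilt with explicit tabs.
rows = [
    ["ASV1184", "Mucor hiemalis isolate A26 cytochrome oxidase subunit I (COI) gene, partial cds; mitochondrial",
     "FJ590554", "64493", "99.569", "74", "3.69e-114", "424", "Genbank",
     "Fungi / Mucoromycota / Mucoromycetes / Mucorales / Mucoraceae / Mucor / Mucor hiemalis"],
    ["ASV1184", "N/A", "INRMA1956-14", "225336", "99.042", "100", "1.15e-158", "562", "Genbank",
     "Eukaryota / Arthropoda / Insecta / Coleoptera / Elateridae / unknown genus / Alaus"],
    ["ASV1184", "N/A", "FCHAR2474-19", "430594", "99.035", "99", "1.48e-157", "558", "Genbank",
     "Eukaryota / Arthropoda / Arachnida / Araneae / Lycosidae / Schizocosa / Schizocosa saltatrix"],
]
hits = [parse_hit("\t".join(row)) for row in rows]

# Thresholds taken from the command in this issue: -tid 98 -tcov 80.
top = expected_top_hit(hits, min_identity=98.0, min_coverage=80.0)
print(top.accession if top else "no top hit")
```

Under these assumptions the sketch selects INRMA1956-14 (bitscore 562, 99.042% identity, 100% coverage), which is the hit the expectation above refers to.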

If I apply the 'best_hits_range' flag I get several assignments for best hit. Could the devs explain this behaviour?

morien commented Dec 11, 2021

Hi @dickgroenenberg @gbbio, sorry to bug you about this, but could you check whether you can reproduce this behaviour? If it's not actually an error, could you explain why it happens? Thank you in advance.
