Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

ENH Update drug categorization logic #54

Merged
Merged
Show file tree
Hide file tree
Changes from 1 commit
Commits
File filter

Filter by extension

Filter by extension

Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
3 changes: 3 additions & 0 deletions CHANGELOG.md
Original file line number Diff line number Diff line change
Expand Up @@ -4,6 +4,9 @@

### Update drug categorization
- confers_resistance_to() now gets drugs for the whole AMR gene family. For example, OXA-19 previously only returned cephalosporin and penam, but now will also return oxacillin (from AMR gene family).
- Implementation of drugs_to_drug_classes() has also been fixed. Previously, the drug class was obtained from the superclasses of the drugs list passed without a thorough check if the drug class was the immediate child of 'antibiotic molecule'. These checks have now been put in place.
- drugs_to_drug_classes() also uses the 'has_part' ARO relationship now to get drug classes for antibiotic mixtures. In case of antibiotic mixtures, the drug classes of the drugs associated with 'has_part' are returned rather than 'antibiotic mixture' (ARO:3000707).
- 'antibiotic mixture' will not be reported as a drug class, rather the individual antibiotic classes making up the antibiotic mixture will be reported.

### Manual curation
- argannot_curation: (Tet)tetH:EF460464:6286-7839:1554 was incorrectly annotated as ARO:3004797 which is a beta-lactamase due to a loose RGI hit. This was manually curated to ARO:3000175.
Expand Down
40 changes: 35 additions & 5 deletions argnorm/drug_categorization.py
Original file line number Diff line number Diff line change
Expand Up @@ -4,6 +4,26 @@
ARO = lib.get_aro_ontology()
confers_resistance_to_drug_class_rel = ARO.get_relationship('confers_resistance_to_drug_class')
confers_resistance_to_antibiotic_rel = ARO.get_relationship('confers_resistance_to_antibiotic')
has_part_rel = ARO.get_relationship('has_part')

def _get_drug_classes(super_classes_list: List[str]) -> List[str]:
"""
- Helper function to traverse up and record immediate child of 'antibiotic molecule' in ARO
- Traverses up ARO until immediate child of 'antibiotic molecule' class reached and 'antibiotic mixture' class not reached
- antibiotic molecule -> ARO:1000003
- antibiotic mixture -> ARO:3000707
"""
output = []

for super_class in super_classes_list:
super_class_classes = list(super_class.superclasses(1))
antibiotic_molecule_node = [ARO['ARO:1000003']]

# checking if immediate child of 'antibiotic molecule' is reached & it is not 'antibiotic mixture'
if super_class_classes[1:] == antibiotic_molecule_node and super_class.id != 'ARO:3000707':
output.append(super_class.id)

return output

def confers_resistance_to(aro_num: str) -> List[str]:
'''
Expand Down Expand Up @@ -55,15 +75,25 @@ def drugs_to_drug_classes(drugs_list: List[str]) -> List[str]:
to the function in the drugs_list.
'''
drug_classes = []
temp_drug_classes = []

for drug in drugs_list:
drug_instance = ARO[drug]
drug_instance_superclasses = list(drug_instance.superclasses())
superclasses_len = len(drug_instance_superclasses)
temp_drug_classes += _get_drug_classes(drug_instance_superclasses)

has_part_nodes = drug_instance.relationships.get(has_part_rel, [])
for has_part_node in has_part_nodes:
has_part_node_superclasses = list(has_part_node.superclasses())[1:]

for super_class in has_part_node_superclasses:
super_class_categories = list(super_class.superclasses())
temp_drug_classes += _get_drug_classes(super_class_categories)

if temp_drug_classes == []:
temp_drug_classes.append(drug_instance.id)

if superclasses_len >= 3:
drug_classes.append(drug_instance_superclasses[superclasses_len - 3].id)
else:
drug_classes.append(drug_instance_superclasses[0].id)
drug_classes += list(set(temp_drug_classes))
temp_drug_classes = []

return sorted(drug_classes)
220 changes: 110 additions & 110 deletions outputs/hamronized/abricate.argannot.tsv

Large diffs are not rendered by default.

220 changes: 110 additions & 110 deletions outputs/hamronized/abricate.megares.tsv

Large diffs are not rendered by default.

184 changes: 92 additions & 92 deletions outputs/hamronized/abricate.ncbi.tsv

Large diffs are not rendered by default.

228 changes: 114 additions & 114 deletions outputs/hamronized/abricate.resfinder.tsv

Large diffs are not rendered by default.

66 changes: 33 additions & 33 deletions outputs/hamronized/abricate.resfinderfg.tsv

Large diffs are not rendered by default.

2 changes: 1 addition & 1 deletion outputs/hamronized/amrfinderplus.ncbi.orfs.tsv
Original file line number Diff line number Diff line change
Expand Up @@ -5,7 +5,7 @@ amrfinderplus.ncbi.orfs.tsv tet(Q) tetracycline resistance ribosomal protection
amrfinderplus.ncbi.orfs.tsv bexA multidrug efflux MATE transporter BexA NCBI Reference Gene Database 2023-Nov-01 BAB64566.1 amrfinderplus 3.10.30 gene_presence_detected EFFLUX 80.36 EFFLUX 18 1085 356 k119_41685 443 - 92.98 ARO:3003953 ARO:0000045,ARO:3000662 ARO:0000001,ARO:3005386
amrfinderplus.ncbi.orfs.tsv lnu(C) lincosamide nucleotidyltransferase Lnu(C) NCBI Reference Gene Database 2023-Nov-01 WP_063851341.1 amrfinderplus 3.10.30 gene_presence_detected LINCOSAMIDE 100.0 LINCOSAMIDE 234 725 164 k119_46979 164 - 97.56 ARO:3002837 ARO:0000046 ARO:0000017
amrfinderplus.ncbi.orfs.tsv sat4 streptothricin N-acetyltransferase Sat4 NCBI Reference Gene Database 2023-Nov-01 WP_000627290.1 amrfinderplus 3.10.30 gene_presence_detected STREPTOTHRICIN 86.11 STREPTOTHRICIN 8 472 155 k119_47732 180 - 100.0 ARO:3002897 ARO:0000012 ARO:3000034
amrfinderplus.ncbi.orfs.tsv aph(3')-IIIa aminoglycoside O-phosphotransferase APH(3')-IIIa NCBI Reference Gene Database 2023-Nov-01 WP_001096887.1 amrfinderplus 3.10.30 gene_presence_detected AMIKACIN/KANAMYCIN 100.0 AMINOGLYCOSIDE 207 998 264 k119_48139 264 - 100.0 ARO:3002647 ARO:0000005,ARO:0000013,ARO:0000021,ARO:0000024,ARO:0000049,ARO:3000652,ARO:3000655,ARO:3000657,ARO:3000658 ARO:0000016,ARO:0000016,ARO:0000016,ARO:0000016,ARO:0000016,ARO:0000016,ARO:0000016,ARO:0000016,ARO:3000707
amrfinderplus.ncbi.orfs.tsv aph(3')-IIIa aminoglycoside O-phosphotransferase APH(3')-IIIa NCBI Reference Gene Database 2023-Nov-01 WP_001096887.1 amrfinderplus 3.10.30 gene_presence_detected AMIKACIN/KANAMYCIN 100.0 AMINOGLYCOSIDE 207 998 264 k119_48139 264 - 100.0 ARO:3002647 ARO:0000005,ARO:0000013,ARO:0000021,ARO:0000024,ARO:0000049,ARO:3000652,ARO:3000655,ARO:3000657,ARO:3000658 ARO:0000016,ARO:0000016,ARO:0000016,ARO:0000016,ARO:0000016,ARO:0000016,ARO:0000016,ARO:0000016,ARO:0000016
amrfinderplus.ncbi.orfs.tsv aadS aminoglycoside 6-adenylyltransferase AadS NCBI Reference Gene Database 2023-Nov-01 WP_003013318.1 amrfinderplus 3.10.30 gene_presence_detected STREPTOMYCIN 100.0 AMINOGLYCOSIDE 34628 35488 287 k119_48233 287 + 100.0 ARO:3004683 ARO:0000040 ARO:0000016
amrfinderplus.ncbi.orfs.tsv tet(X2) tetracycline-inactivating monooxygenase Tet(X2) NCBI Reference Gene Database 2023-Nov-01 WP_008651082.1 amrfinderplus 3.10.30 gene_presence_detected TETRACYCLINE 100.0 TETRACYCLINE 12370 13533 388 k119_48273 388 + 99.74 ARO:3000205 ARO:0000030,ARO:0000051,ARO:0000069,ARO:3000152,ARO:3000528,ARO:3000667,ARO:3000668 ARO:3000050,ARO:3000050,ARO:3000050,ARO:3000050,ARO:3000050,ARO:3000050,ARO:3000050
amrfinderplus.ncbi.orfs.tsv tet(O) tetracycline resistance ribosomal protection protein Tet(O) NCBI Reference Gene Database 2023-Nov-01 WP_014636291.1 amrfinderplus 3.10.30 gene_presence_detected TETRACYCLINE 100.0 TETRACYCLINE 978 2894 639 k119_60190 639 + 99.22 ARO:3000190 ARO:0000051,ARO:0000069,ARO:3000152,ARO:3000528,ARO:3000667,ARO:3000668 ARO:3000050,ARO:3000050,ARO:3000050,ARO:3000050,ARO:3000050,ARO:3000050
Expand Down
Loading
Loading