Drug Categorization Code Added #8

Merged
merged 27 commits into from
Sep 27, 2023
Changes from 23 commits
Commits
07b7e74
Added the ontology files
Vedanth-Ramji Aug 22, 2023
79a08fe
Added drug categorization code
Vedanth-Ramji Aug 22, 2023
fd0a887
Update normalizers.py
Vedanth-Ramji Aug 22, 2023
0502681
Added smoke tests for drug categories functions
Vedanth-Ramji Aug 22, 2023
7290f88
Changed 'get_data' back to 'get_data_path'
Vedanth-Ramji Aug 25, 2023
e40d2be
Simplified asserts
Vedanth-Ramji Aug 25, 2023
154ba25
Updated Drug Categorization to Implement Suggestions
Vedanth-Ramji Aug 25, 2023
00d2f59
Used tuples instead of lists wherever required
Vedanth-Ramji Aug 25, 2023
a5df660
Used zip, instead of range(len()). Updated test cases and output to h…
Vedanth-Ramji Aug 25, 2023
c987a35
Changed the aro_mapping_table function to reflect code in main, stopp…
Vedanth-Ramji Aug 25, 2023
8047c74
Fixed indentation
Vedanth-Ramji Aug 29, 2023
60a87b6
Improved code readability
Vedanth-Ramji Aug 29, 2023
cf56c04
Implemented relative imports patch
Vedanth-Ramji Sep 2, 2023
3b305ed
added pronto in requirements
Vedanth-Ramji Sep 4, 2023
8ebd894
added pronto in environments
Vedanth-Ramji Sep 4, 2023
6580a69
fixed typing error thrown up in checks
Vedanth-Ramji Sep 4, 2023
dc31c53
imported pronto to test_arg_category
Vedanth-Ramji Sep 4, 2023
b301535
added pronto to test_smoke
Vedanth-Ramji Sep 4, 2023
e5a5163
added basic check for float aro numbers
Vedanth-Ramji Sep 10, 2023
3297a7a
added in-line documentation for type conversion of aro strings
Vedanth-Ramji Sep 10, 2023
28db1a7
Returning empty list with aro:nan
Vedanth-Ramji Sep 10, 2023
ce08054
sorted edge case for when aro number not in ontology
Vedanth-Ramji Sep 10, 2023
e2d7657
Deleting unnecesarry ontology files.
Vedanth-Ramji Sep 23, 2023
0ba170b
changed function get_data's name to get_data_path
Vedanth-Ramji Sep 26, 2023
d28d784
Cleaned up discrepancy with single and double quotes + changes by bla…
Vedanth-Ramji Sep 26, 2023
1cbf56b
fixed discrepancies with spacing
Vedanth-Ramji Sep 26, 2023
c21f100
fixed unnecessary spacing
Vedanth-Ramji Sep 26, 2023
83 changes: 83 additions & 0 deletions argnorm/drug_categorization.py
@@ -0,0 +1,83 @@
# These functions are called from the `run` method of the BaseNormalizer.
# They only need ARO numbers as input, so they should be independent of which db is used.

import pronto
from typing import List, Tuple

# Load the Antibiotic Resistance Ontology (ARO) from the OBO library ('aro.obo')
ARO = pronto.Ontology.from_obo_library('aro.obo')

def get_immediate_drug_classes(aro_num: str) -> List[Tuple[str, str]]:
    '''
    Description: Gets the drug classes to which a gene confers resistance.
    Only lists the drug class column in the CARD db.

    Parameters:
        aro_num (str): ARO number. Needs to be in the form 'ARO:number'.

    Returns:
        drug_classes_list (list[tuple]):
            A list of tuples, where each tuple represents a drug class.
            Each tuple contains the ARO number and name of the drug class, in that order: (ARO:number, name).
    '''

    # Some databases don't provide ARO numbers as strings.
    # Convert those ARO numbers to the 'ARO:number' format that pronto expects.
    if isinstance(aro_num, (float, int)):
        aro_num = 'ARO:' + str(aro_num)

    # A missing ARO number (NaN) has no drug class category,
    # so the immediate drug classes are also an empty list.
    if aro_num == 'ARO:nan':
        return []

    # Return an empty list if the ARO number is not in the ARO ontology.
    if aro_num not in ARO.terms():
        return []

    gene = ARO[aro_num]

    confers_resistance_to_drug_class = any(r.name == 'confers_resistance_to_drug_class' for r in gene.relationships)
    confers_resistance_to_antibiotic = any(r.name == 'confers_resistance_to_antibiotic' for r in gene.relationships)

    drug_classes = []

    if confers_resistance_to_drug_class:
        for drug_class in gene.relationships[ARO.get_relationship('confers_resistance_to_drug_class')]:
            drug_classes.append((drug_class.id, drug_class.name))

    if confers_resistance_to_antibiotic:
        for drug_class in gene.relationships[ARO.get_relationship('confers_resistance_to_antibiotic')]:
            drug_classes.append((drug_class.id, drug_class.name))

    return drug_classes

def get_drug_class_category(drug_classes_list: List[Tuple[str, str]]) -> List[str]:
    '''
    Description: Gives a list of categories of drug classes, e.g. cephem and penam are categorized as beta_lactam antibiotics.

    Parameters:
        drug_classes_list (list[tuple]):
            A list of tuples, where each tuple represents a drug class.
            Each tuple contains the ARO number and name of the drug class, in that order: (ARO:number, name).
            Designed to use the return value of the function 'get_immediate_drug_classes'.

    Returns:
        drug_class_categories (list[str]):
            A list containing the name of the drug class category for each drug class given as input.
            The order of the category names corresponds to the order of the drug classes
            in drug_classes_list.
    '''
    drug_class_categories = []

    for drug_class in drug_classes_list:
        drug_class_instance = ARO[drug_class[0]]
        drug_class_instance_superclasses = list(drug_class_instance.superclasses())
        superclasses_len = len(drug_class_instance_superclasses)

        # Pick the superclass third from the end of the list when there are enough levels;
        # otherwise fall back to the first entry.
        if superclasses_len >= 3:
            drug_class_categories.append(drug_class_instance_superclasses[superclasses_len - 3].name)
        else:
            drug_class_categories.append(drug_class_instance_superclasses[0].name)

    return drug_class_categories
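
The type conversion at the top of `get_immediate_drug_classes` can be looked at in isolation. The following is a minimal sketch, not code from this PR: `normalize_aro_id` is a hypothetical helper, and it assumes that float inputs are integer-valued accessions (e.g. `3000001.0`) that should become `ARO:3000001` rather than `ARO:3000001.0`. It also returns `None` for missing values instead of using the `'ARO:nan'` sentinel string.

```python
import math
from typing import Optional, Union


def normalize_aro_id(aro_num: Union[str, int, float, None]) -> Optional[str]:
    """Return an 'ARO:<number>' string, or None if the value is missing.

    Hypothetical helper for illustration only; not part of argnorm.
    """
    if aro_num is None:
        return None

    if isinstance(aro_num, float):
        if math.isnan(aro_num):
            return None
        # Assumption: floats are integer-valued accessions such as 3000001.0,
        # so the fractional part can be dropped.
        aro_num = int(aro_num)

    if isinstance(aro_num, int):
        return f'ARO:{aro_num}'

    # Strings: accept either '3000001' or 'ARO:3000001'.
    aro_num = aro_num.strip()
    if aro_num.lower() in ('', 'nan', 'aro:nan'):
        return None
    return aro_num if aro_num.startswith('ARO:') else f'ARO:{aro_num}'
```

A caller could then return an empty list whenever the helper yields `None`, which keeps the missing-value check in one place.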
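
The indexing in `get_drug_class_category` (taking the entry third from the end of the superclass list) can be illustrated without pronto. The sketch below is an assumption-laden toy: the `PARENT` table, `lineage`, and `category` are hypothetical names, the hierarchy is single-parent and only loosely modelled on ARO, and it assumes the superclass list is ordered from the term itself up to the root, which is how the indexing in the diff appears to be used.

```python
from typing import Dict, List, Optional

# Toy single-parent hierarchy (child -> parent). Illustrative names only;
# the real ontology is a graph in which terms can have several parents.
PARENT: Dict[str, Optional[str]] = {
    'root': None,
    'antibiotic molecule': 'root',
    'beta-lactam antibiotic': 'antibiotic molecule',
    'cephem': 'beta-lactam antibiotic',
    'penam': 'beta-lactam antibiotic',
}


def lineage(term: str) -> List[str]:
    """Return the term followed by its ancestors, ending at the root."""
    chain = []
    current: Optional[str] = term
    while current is not None:
        chain.append(current)
        current = PARENT[current]
    return chain


def category(term: str) -> str:
    """Pick the ancestor two levels below the root, else the term itself."""
    chain = lineage(term)
    if len(chain) >= 3:
        return chain[-3]  # same position as chain[len(chain) - 3]
    return chain[0]


for name in ('cephem', 'penam', 'antibiotic molecule'):
    print(name, '->', category(name))
```

In this toy table, `cephem` and `penam` both map to `beta-lactam antibiotic`, while a term with fewer than three entries in its lineage maps to itself.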