diff --git a/CHANGELOG.md b/CHANGELOG.md
index 8f39254..8425301 100644
--- a/CHANGELOG.md
+++ b/CHANGELOG.md
@@ -35,11 +35,21 @@
 ## Unreleased
-### Using amino acid file for argannot rather than nucleotide file
-- ARG-ANNOT is comprised of coding sequences. The data wasn't being handled properly before as contig mode was used when passing coding sequences to RGI. Now, the amino acid version of ARG-ANNOT is used with protein mode when running the database in RGI.
-- One to many ARO mapping such as NG_047831:101-955 to Erm(K) and almG eliminated as protein mode used
-- A total of 10 ARO mappings changed
-### argnorm.lib: Making argNorm more usable as a library
+### Handling gene clusters & reverse complements in resfinder
+- Resfinder has gene clusters which can't be passed through RGI using 'contig' mode.
+- Gene clusters were identified and were manually assigned ARO numbers.
+- A separate file with manual curation for gene clusters and RCs was created, and their AROs were updated after concatenating RGI results and genes not in RGI results.
+- 40 gene clusters present.
+- 9 genes in reverse complement form also present.
+- RC genes were manually curated.
+
+### Using amino acid file for argannot & resfinder rather than nucleotide file
+- ARG-ANNOT and Resfinder are comprised of coding sequences. The data wasn't being handled properly before as contig mode was used when passing coding sequences to RGI. Now, the amino acid versions of ARG-ANNOT & Resfinder are used with protein mode when running the database in RGI.
+- ARG-ANNOT AA file is available online. Resfinder AA file is generated using biopython.
+- One to many ARO mapping such as NG_047831:101-955 to Erm(K) and almG in ARG-ANNOT eliminated as protein mode used
+- A total of 10 ARO mappings changed in ARG-ANNOT
+
+### argnorm.lib: Making argNorm more usable as a library
 - A file called `lib.py` will be introduced so that users can use argNorm as a library more easily.
 - Users can import the `map_to_aro` function using `from argnorm.lib import map_to_aro`. The function takes a gene name as input, maps the gene to the ARO and returns a pronto term object with the ARO mapping.
 - The `get_aro_mapping_table` function, previously within the BaseNormalizer class, has also been moved to `lib.py` to give users the ability to access the mapping tables being used for normalization.
diff --git a/argnorm/data/manual_curation/README.md b/argnorm/data/manual_curation/README.md
new file mode 100644
index 0000000..cc7f096
--- /dev/null
+++ b/argnorm/data/manual_curation/README.md
@@ -0,0 +1,21 @@
+# Resfinder Notes
+
+## Gene Clusters
+
+- Resfinder has gene clusters (nucleotide sequence with multiple CDSs present) which can't be passed through RGI using 'contig' mode.
+- Gene clusters were identified and were manually assigned ARO numbers.
+- 40 gene clusters present.
+
+## Reverse Complement
+1) blaBIM-1_1_CP016446
+2) blaSPG-1_1_KP109680
+3) grdA_1_QJX10702
+4) tet(43)_1_GQ244501
+5) aac(3)-Xa_1_AB028210
+6) blaBKC-1_1_KP689347
+7) mph(A)_1_D16251
+8) qepA1_1_AB263754
+9) aac(3)-I_1_AJ877225
+
+- 9 genes in reverse complement form also present.
+- RC genes were manually curated
\ No newline at end of file
diff --git a/argnorm/data/manual_curation/resfinder_curation.tsv b/argnorm/data/manual_curation/resfinder_curation.tsv
index c42294c..ad9c9cd 100644
--- a/argnorm/data/manual_curation/resfinder_curation.tsv
+++ b/argnorm/data/manual_curation/resfinder_curation.tsv
@@ -1,3 +1,50 @@
-Original ID ARO
-EstDL136_1_JN242251
-aac(3)-I_1_AJ877225 3007384
+Original ID Gene Name in CARD ARO Origin Position in Cluster Description
+VanHAX_1_FJ866609 glycopeptide resistance gene cluster VanA 3000236 https://www.ncbi.nlm.nih.gov/nuccore/M97297.1?report=fasta 6018-8624 "Part of VanA cluster (ARO:3000236). Contains: ARO:3002942, ARO:3000010, and ARO:3002949"
+VanHAX_2_M97297 glycopeptide resistance gene cluster VanA 3000236 https://www.ncbi.nlm.nih.gov/nuccore/FJ866609?report=fasta 3762-6368 "Part of VanA cluster (ARO:3000236). Contains: ARO:3002942 and ARO:3002949. The vanA gene (ARO:3002949) is modified, 1 G substituted with 1 T"
+VanHMX_1_FJ349556 glycopeptide resistance gene cluster VanM 3000256 https://www.ncbi.nlm.nih.gov/nuccore/FJ349556.1?report=fasta 3884-6502 "Part of VanM cluster (ARO:3000256). Contains: ARO:3002947, ARO:3002911, and ARO:3002953"
+vanM_1_FJ349556 glycopeptide resistance gene cluster VanM 3000256 https://www.ncbi.nlm.nih.gov/nuccore/FJ349556.1?report=fasta Whole Cluster "full vanM cluster, ARO:3000256"
+VanC1XY_1_AF162694 glycopeptide resistance gene cluster VanC 3000246 https://www.ncbi.nlm.nih.gov/nuccore/AF162694.1?report=fasta 1411-3011 "Part of VanC cluster (ARO:3000246). Contains: ARO:3000368, ARO:3002966"
+VanC1XY_2_DQ022190 glycopeptide resistance gene cluster VanC 3000246 https://www.ncbi.nlm.nih.gov/nuccore/DQ022190.1?report=fasta 205-1805 Part of VanC cluster (ARO:3000246).
+VanC2XY_1_EU151754 glycopeptide resistance gene cluster VanC 3000246 https://www.ncbi.nlm.nih.gov/nuccore/EU151754.1?report=fasta 29-1650 Part of VanC cluster (ARO:3000246)
+VanHAX_PT_1_DQ018710 glycopeptide resistance gene cluster VanA 3000236 https://www.ncbi.nlm.nih.gov/nuccore/DQ018710.1 5109-7715 Part of VanA cluster (ARO:3000236)
+VanHAX_PA_1_DQ018711 glycopeptide resistance gene cluster VanA 3000236 https://www.ncbi.nlm.nih.gov/nuccore/DQ018711.1?report=fasta 3168-5750 Part of VanA cluster (ARO:3000236)
+VanHAX_PT_2_AY926880 glycopeptide resistance gene cluster VanA 3000236 https://www.ncbi.nlm.nih.gov/nuccore/AY926880.2?report=fasta 2771-5377 Part of VanA cluster (ARO:3000236)
+dldHA2X_1_AL939117 https://www.ncbi.nlm.nih.gov/nuccore/AL939117.1 53343-56013 Gene not in CARD
+VanHBX_1_AF192329 glycopeptide resistance gene cluster VanB 3000238 https://www.ncbi.nlm.nih.gov/nuccore/AF192329 27871-30477 Part of VanB cluster (ARO:3000238)
+VanHBX_2_U35369 glycopeptide resistance gene cluster VanB 3000238 https://www.ncbi.nlm.nih.gov/nuccore/U35369.1?report=fasta 4007-6613 "Part of VanB cluster (ARO:3000238). Contains ARO:3002943, ARO:3002950"
+VanC4XY_1_EU151752 glycopeptide resistance gene cluster VanC 3000246 https://www.ncbi.nlm.nih.gov/nuccore/EU151752.1?report=fasta 29-1650 Part of VanC cluster (ARO:3000246)
+VanC4XY_2_EU151753 glycopeptide resistance gene cluster VanC 3000246 https://www.ncbi.nlm.nih.gov/nuccore/EU151753.1?report=fasta 29-1650 Part of VanC cluster (ARO:3000246)
+VanC2XY_2_EU151755 glycopeptide resistance gene cluster VanC 3000246 https://www.ncbi.nlm.nih.gov/nuccore/EU151755.1?report=fasta 29-1650 Part of VanC cluster (ARO:3000246)
+VanC4XY_3_EU151756 glycopeptide resistance gene cluster VanC 3000246 https://www.ncbi.nlm.nih.gov/nuccore/EU151756.1?report=fasta 29-1650 Part of VanC cluster (ARO:3000246)
+VanC2XY_3_EU151757 glycopeptide resistance gene cluster VanC 3000246 https://www.ncbi.nlm.nih.gov/nuccore/EU151757.1?report=fasta 29-1650 Part of VanC cluster (ARO:3000246)
+VanC2XY_4_EU151758 glycopeptide resistance gene cluster VanC 3000246 https://www.ncbi.nlm.nih.gov/nuccore/EU151758.1?report=fasta 29-1650 Part of VanC cluster (ARO:3000246)
+VanC3XY_2_EU151759 glycopeptide resistance gene cluster VanC 3000246 https://www.ncbi.nlm.nih.gov/nuccore/EU151759.1?report=fasta 29-1650 Part of VanC cluster (ARO:3000246)
+VanC2XY_5_EU151760 glycopeptide resistance gene cluster VanC 3000246 https://www.ncbi.nlm.nih.gov/nuccore/EU151760.1?report=fasta 29-1650 Part of VanC cluster (ARO:3000246)
+VanHDX_6_DQ172830 glycopeptide resistance gene cluster VanD 3000253 https://www.ncbi.nlm.nih.gov/nuccore/DQ172830.1?report=fasta 3019-5628 Part of VanD cluster (ARO:3000253)
+VanHDX_7_AB242319 glycopeptide resistance gene cluster VanD 3000253 https://www.ncbi.nlm.nih.gov/nuccore/AB242319.1?report=fasta 3045-5654 Part of VanD cluster (ARO:3000253)
+VanHDX_3_AF175293 glycopeptide resistance gene cluster VanD 3000253 https://www.ncbi.nlm.nih.gov/nuccore/AF175293.1?report=fasta 3115-5724 Part of VanD cluster (ARO:3000253)
+VanHDX_4_AY082011 glycopeptide resistance gene cluster VanD 3000253 https://www.ncbi.nlm.nih.gov/nuccore/AY082011.1?report=fasta 4937-7546 "Part of VanD cluster (ARO:3000253). Contains ARO:3002944, ARO:3000005, ARO:3003070"
+VanHDX_5_AY489045 glycopeptide resistance gene cluster VanD 3000253 https://www.ncbi.nlm.nih.gov/nuccore/AY489045.1?report=fasta 3046-5655 Part of VanD cluster (ARO:3000253)
+VanHDX_1_AF130997 glycopeptide resistance gene cluster VanD 3000253 https://www.ncbi.nlm.nih.gov/nuccore/AF130997.1?report=fasta 3122-5728 Part of VanD cluster (ARO:3000253)
+VanHDX_2_EU999036 glycopeptide resistance gene cluster VanD 3000253 https://www.ncbi.nlm.nih.gov/nuccore/EU999036.1?report=fasta 3044-5653 Part of VanD cluster (ARO:3000253)
+VanHFX_1_AF155139 glycopeptide resistance gene cluster VanF 3000255 https://www.ncbi.nlm.nih.gov/nuccore/AF155139.2?report=fasta 4979-7648 "Part of VanF cluster (ARO:3000255). Contains ARO:3002945, ARO:3002908, ARO:3002952"
+VanEXY_1_FJ872411 glycopeptide resistance gene cluster VanE 3000259 https://www.ncbi.nlm.nih.gov/nuccore/FJ872411.1?report=fasta 39736-41347 "Part of VanE cluster (ARO:3000259). Contains ARO:3002907, ARO:3002967"
+VanGXY_1_AY271782 glycopeptide resistance gene cluster VanG 3000257 https://www.ncbi.nlm.nih.gov/nuccore/AY271782.1?report=fasta 21049-22859 "Part of VanG cluster (ARO:3000257). Contains ARO:3002909, ARO:3003069"
+VanG2XY_1_FJ872410 glycopeptide resistance gene cluster VanG 3000257 https://www.ncbi.nlm.nih.gov/nuccore/FJ872410 39328-41138 Part of VanG cluster (ARO:3000257)
+VanLXY_1_EU250284 glycopeptide resistance gene cluster VanL 3000260 https://www.ncbi.nlm.nih.gov/nuccore/EU250284.1?report=fasta 955-2578 "Part of VanL cluster (ARO:3000260). Contains ARO:3002910, ARO:3002968"
+VanNXY_1_JF802084 glycopeptide resistance gene cluster VanN 3002917 https://www.ncbi.nlm.nih.gov/nuccore/JF802084.2?report=fasta 560-2165 "Part of VanN cluster (ARO:3002917). Contains ARO:3002912, ARO:3002969"
+VanHOX_1_KF478993 glycopeptide resistance gene cluster VanO 3002918 https://www.ncbi.nlm.nih.gov/nuccore/KF478993.1?report=fasta 491-3185 "Part of VanO cluster (ARO:3002918). Contains ARO:3002948, ARO:3002954"
+vanXmurFvanWI_1_CP001336 glycopeptide resistance gene cluster VanI 3003722 https://www.ncbi.nlm.nih.gov/nuccore/CP001336.1?report=fasta 1776504-1780580 Part of VanI cluster (ARO:3003722). Origin has both VanI cluster and VanB cluster. Contains ARO:3003725
+vanXmurFvanKWI_1_NZAGAF01000127 glycopeptide resistance gene cluster VanI 3003722 https://www.ncbi.nlm.nih.gov/nuccore/NZ_AGAF01000127.1?report=fasta 3324-8562 Part of VanI cluster (ARO:3003722). CDS in origin is in reverse complement form.
+vanXmurFvanKWI_2_AP008230 glycopeptide resistance gene cluster VanI 3003722 https://www.ncbi.nlm.nih.gov/nuccore/AP008230.1?report=fasta 4202889-4208186 Part of VanI cluster (ARO:3003722). Contains ARO:3003727
+mph(D)_1_AB048591 macrolide phosphotransferase (MPH) 3000333 https://www.ncbi.nlm.nih.gov/nuccore/AB048591.1?report=fasta 4-840 Gene not in CARD. Parent ARO used.
+aac(3)-Xa_1_AB028210 AAC(3)-Xa 3002544 Reverse complement in resfinder db.
+blaBKC-1_1_KP689347 BKC-1 3004757 Reverse complement in resfinder db.
+mph(A)_1_D16251 mphA 3000316 Reverse complement in resfinder db.
+qepA1_1_AB263754 QepA2 3004103 Reverse complement in resfinder db.
+tet(43)_1_GQ244501 tet(43) 3000573 Reverse complement in resfinder db.
+blaSPG-1_1_KP109680 SPG-1 3003720 +blaBIM-1_1_CP016446 BlaB 3004201 +grdA_1_QJX10702 +aac(3)-I_1_AJ877225 AAC(3)-I 3007384 +EstDL136_1_JN242251 diff --git a/argnorm/data/resfinder_ARO_mapping.tsv b/argnorm/data/resfinder_ARO_mapping.tsv index bb23577..6066624 100644 --- a/argnorm/data/resfinder_ARO_mapping.tsv +++ b/argnorm/data/resfinder_ARO_mapping.tsv @@ -22,94 +22,43 @@ TOprJ2_1_MN175502 OprJ 3000802 resfinder TOprJ3_1_LC633285 OprJ 3000802 resfinder TOprJ4_1_CP091084 OprJ 3000802 resfinder VanA_bc_1_Y15704 vanA 3000010 resfinder -VanC1XY_1_AF162694 vanXY gene in vanC cluster 3002966 resfinder VanC1XY_1_AF162694 vanC 3000368 resfinder VanC1XY_2_DQ022190 vanC 3000368 resfinder -VanC1XY_2_DQ022190 vanXY gene in vanC cluster 3002966 resfinder -VanC2XY_1_EU151754 vanXY gene in vanC cluster 3002966 resfinder VanC2XY_1_EU151754 vanC 3000368 resfinder -VanC2XY_2_EU151755 vanXY gene in vanC cluster 3002966 resfinder VanC2XY_2_EU151755 vanC 3000368 resfinder -VanC2XY_3_EU151757 vanXY gene in vanC cluster 3002966 resfinder VanC2XY_3_EU151757 vanC 3000368 resfinder VanC2XY_4_EU151758 vanC 3000368 resfinder -VanC2XY_4_EU151758 vanXY gene in vanC cluster 3002966 resfinder VanC2XY_5_EU151760 vanC 3000368 resfinder -VanC2XY_5_EU151760 vanXY gene in vanC cluster 3002966 resfinder VanC2_1_L29638 vanC 3000368 resfinder -VanC3XY_1_AY033764 vanXY gene in vanC cluster 3002966 resfinder VanC3XY_1_AY033764 vanC 3000368 resfinder -VanC3XY_2_EU151759 vanXY gene in vanC cluster 3002966 resfinder VanC3XY_2_EU151759 vanC 3000368 resfinder -VanC4XY_1_EU151752 vanXY gene in vanC cluster 3002966 resfinder VanC4XY_1_EU151752 vanC 3000368 resfinder -VanC4XY_2_EU151753 vanXY gene in vanC cluster 3002966 resfinder VanC4XY_2_EU151753 vanC 3000368 resfinder VanC4XY_3_EU151756 vanC 3000368 resfinder -VanC4XY_3_EU151756 vanXY gene in vanC cluster 3002966 resfinder VanEXY_1_FJ872411 vanE 3002907 resfinder -VanEXY_1_FJ872411 vanXY gene in vanE cluster 3002967 resfinder VanE_1_AF136925 vanE 3002907 
resfinder VanG2XY_1_FJ872410 vanG 3002909 resfinder -VanG2XY_1_FJ872410 vanXY gene in vanG cluster 3003069 resfinder VanGXY_1_AY271782 vanG 3002909 resfinder -VanGXY_1_AY271782 vanXY gene in vanG cluster 3003069 resfinder VanHAX_1_FJ866609 vanH gene in vanA cluster 3002942 resfinder -VanHAX_1_FJ866609 vanA 3000010 resfinder -VanHAX_1_FJ866609 vanX gene in vanA cluster 3002949 resfinder VanHAX_2_M97297 vanH gene in vanA cluster 3002942 resfinder -VanHAX_2_M97297 vanA 3000010 resfinder -VanHAX_2_M97297 vanX gene in vanA cluster 3002949 resfinder -VanHAX_PA_1_DQ018711 vanX gene in vanA cluster 3002949 resfinder -VanHAX_PA_1_DQ018711 vanA 3000010 resfinder VanHAX_PA_1_DQ018711 vanH gene in vanF cluster 3002945 resfinder -VanHAX_PT_1_DQ018710 vanA 3000010 resfinder -VanHAX_PT_1_DQ018710 vanX gene in vanA cluster 3002949 resfinder VanHAX_PT_1_DQ018710 vanH gene in vanA cluster 3002942 resfinder VanHAX_PT_2_AY926880 vanH gene in vanA cluster 3002942 resfinder -VanHAX_PT_2_AY926880 vanA 3000010 resfinder -VanHAX_PT_2_AY926880 vanX gene in vanA cluster 3002949 resfinder -VanHBX_1_AF192329 vanX gene in vanB cluster 3002950 resfinder VanHBX_1_AF192329 vanH gene in vanB cluster 3002943 resfinder -VanHBX_1_AF192329 vanB 3000013 resfinder -VanHBX_2_U35369 vanX gene in vanB cluster 3002950 resfinder -VanHBX_2_U35369 vanB 3000013 resfinder VanHBX_2_U35369 vanH gene in vanB cluster 3002943 resfinder -VanHDX_1_AF130997 vanD 3000005 resfinder VanHDX_1_AF130997 vanH gene in vanD cluster 3002944 resfinder -VanHDX_1_AF130997 vanX gene in vanD cluster 3003070 resfinder -VanHDX_2_EU999036 vanX gene in vanD cluster 3003070 resfinder -VanHDX_2_EU999036 vanD 3000005 resfinder VanHDX_2_EU999036 vanH gene in vanD cluster 3002944 resfinder VanHDX_3_AF175293 vanH gene in vanD cluster 3002944 resfinder -VanHDX_3_AF175293 vanD 3000005 resfinder -VanHDX_3_AF175293 vanX gene in vanD cluster 3003070 resfinder -VanHDX_4_AY082011 vanX gene in vanD cluster 3003070 resfinder -VanHDX_4_AY082011 vanD 
3000005 resfinder VanHDX_4_AY082011 vanH gene in vanD cluster 3002944 resfinder -VanHDX_5_AY489045 vanX gene in vanD cluster 3003070 resfinder -VanHDX_5_AY489045 vanD 3000005 resfinder VanHDX_5_AY489045 vanH gene in vanD cluster 3002944 resfinder VanHDX_6_DQ172830 vanH gene in vanD cluster 3002944 resfinder -VanHDX_6_DQ172830 vanD 3000005 resfinder -VanHDX_6_DQ172830 vanX gene in vanD cluster 3003070 resfinder -VanHDX_7_AB242319 vanX gene in vanD cluster 3003070 resfinder -VanHDX_7_AB242319 vanD 3000005 resfinder VanHDX_7_AB242319 vanH gene in vanD cluster 3002944 resfinder -VanHFX_1_AF155139 vanF 3002908 resfinder -VanHFX_1_AF155139 vanX gene in vanF cluster 3002952 resfinder VanHFX_1_AF155139 vanH gene in vanF cluster 3002945 resfinder -VanHMX_1_FJ349556 vanM 3002911 resfinder -VanHMX_1_FJ349556 vanX gene in vanM cluster 3002953 resfinder VanHMX_1_FJ349556 vanH gene in vanM cluster 3002947 resfinder VanHOX_1_KF478993 vanH gene in vanO cluster 3002948 resfinder -VanHOX_1_KF478993 vanO 3002913 resfinder -VanHOX_1_KF478993 vanX gene in vanO cluster 3002954 resfinder VanH_bc_1_Y15705 vanH gene in vanA cluster 3002942 resfinder VanLXY_1_EU250284 vanL 3002910 resfinder -VanLXY_1_EU250284 vanXY gene in vanL cluster 3002968 resfinder VanNXY_1_JF802084 vanN 3002912 resfinder -VanNXY_1_JF802084 vanXY gene in vanN cluster 3002969 resfinder VanXY_C2_1_AY033089 vanXY gene in vanC cluster 3002966 resfinder VanX_bc_1_Y15708 vanX gene in vanA cluster 3002949 resfinder aac(2')-IIa_1_AB669090 AAC(2')-IIa 3004628 resfinder @@ -152,9 +101,9 @@ aac(3)-VIIa_1_M22999 AAC(3)-VIIa 3002541 resfinder aac(3)-VIa_1_M88012 AAC(3)-VIa 3002540 resfinder aac(3)-VIa_2_NC_009838 AAC(3)-VIa 3002540 resfinder aac(3)-XI_1_CTEG01000046 AAC(6')-Iap 3007204 resfinder -aac(3)-Xa_1_AB028210 AAC(3)-Xa 3002544 resfinder -aac(6')-29a_1_AF263519 mdtN 3003548 resfinder -aac(6')-29b_1_AF263519 mdtN 3003548 resfinder +aac(3)-Xa_1_AB028210 gimA 3000463 resfinder +aac(6')-29a_1_AF263519 AAC(6')-29a 3002583 
resfinder +aac(6')-29b_1_AF263519 AAC(6')-29a 3002583 resfinder aac(6')-30-aac(6')-Ib'_1_AJ584652 AAC(6')-30/AAC(6')-Ib' bifunctional protein 3002599 resfinder aac(6')-31_1_AM283489 AAC(6')-31 3002585 resfinder aac(6')-32_1_EF614235 AAC(6')-32 3002586 resfinder @@ -268,7 +217,7 @@ ant(2'')-Ia_6_AJ871915 ANT(2'')-Ia 3000230 resfinder ant(2'')-Ia_7_DQ018384 ANT(2'')-Ia 3000230 resfinder ant(2'')-Ia_8_AY920928 ANT(2'')-Ia 3000230 resfinder ant(2'')-Ia_9_HM367610 ANT(2'')-Ia 3000230 resfinder -ant(3'')-Ia_1_X02340 aadA 3002601 resfinder +ant(3'')-Ia_1_X02340 ANT(3'')-IIa 3004089 resfinder ant(3'')-Ii-aac(6')-IId_1_AF453998 ANT(3'')-II-AAC(6')-IId bifunctional protein 3002598 resfinder ant(4')-IIa_1_M98270 ANT(4')-IIa 3002624 resfinder ant(4')-IIb_1_AY114142 ANT(4')-IIb 3002625 resfinder @@ -355,7 +304,7 @@ aph(9)-Ia_2_CR628337 APH(9)-Ia 3002662 resfinder aph(9)-Ib_1_U70376 APH(9)-Ib 3002663 resfinder armA_1_AY220558 armA 3000858 resfinder blaACC-1_1_AJ133121 ACC-1 3001815 resfinder -blaACC-1_2_HG530658 ACC-1 3001815 resfinder +blaACC-1_2_HG530658 ACC-1a 3006228 resfinder blaACC-1a_1_AF180953 ACC-1a 3006228 resfinder blaACC-1b_1_AF180955 ACC-1b 3006229 resfinder blaACC-1c_1_AF180959 ACC-1c 3006230 resfinder @@ -363,7 +312,7 @@ blaACC-1d_1_ADCU02000001 ACC-1d 3006231 resfinder blaACC-2_1_AF180952 ACC-2 3001816 resfinder blaACC-3_1_AF180958 ACC-3 3001817 resfinder blaACC-4_1_GU256641 ACC-4 3001818 resfinder -blaACC-4_1_KM087831 ACC-4 3001818 resfinder +blaACC-4_1_KM087831 ACC-1a 3006228 resfinder blaACC-5_1_HE819401 ACC-5 3001819 resfinder blaACC-7_1_MG028657 ACC-7 3006232 resfinder blaACI-1_1_AJ007350 ACI-1 3004359 resfinder @@ -406,8 +355,7 @@ blaBES-1_1_AF234999 BES-1 3004751 resfinder blaBIC-1_1_GQ260093 BIC-1 3004753 resfinder blaBIC-2_1_OR143113 BIC-1 3004753 resfinder blaBIL-1_1_X74512 BIL-1 3004755 resfinder -blaBIM-1_1_CP016446 SIM-2 3005494 resfinder -blaBKC-1_1_KP689347 BKC-1 3004757 resfinder +blaBKC-1_1_KP689347 APH(4)-Ia 3002655 resfinder blaBKC_1_KP689347 
BKC-1 3004757 resfinder blaBRO-1_1_Z54180 BRO-1 3004761 resfinder blaBRO-2_1_Z54181 BRO-2 3004762 resfinder @@ -1074,8 +1022,8 @@ blaLMB-1_1_MH475146 LMB-1 3005018 resfinder blaLUT-1_1_AY695112 LUT-1 3006943 resfinder blaL_1_NG050597 Bla1 3000090 resfinder blaL_2_NG050596 Bla1 3000090 resfinder -blaMAL-1_1_AJ277209 CKO-1 3004773 resfinder -blaMAL-1_2_AJ609506 CKO-1 3004773 resfinder +blaMAL-1_1_AJ277209 MAL-1 3006949 resfinder +blaMAL-1_2_AJ609506 MAL-1 3006949 resfinder blaMIR-1_1_M37839 MIR-1 3002166 resfinder blaMIR-2_1_AY227752 MIR-2 3002168 resfinder blaMIR-3_1_AY743435 MIR-3 3002169 resfinder @@ -1904,7 +1852,6 @@ blaSMB-1_1_AB636283 SMB-1 3000854 resfinder blaSME-1_1_Z28968 SME-1 3002379 resfinder blaSME-2_1_AF275256 SME-2 3002380 resfinder blaSME-3_1_AY584237 SME-3 3002381 resfinder -blaSPG-1_1_KP109680 SPG-1 3003720 resfinder blaSPM-1_1_AY341249 SPM-1 3003793 resfinder blaSPU-1_1_GQ919044 SPU-1 3006998 resfinder blaSRT-1_1_AB008454 SRT-1 3002493 resfinder @@ -2308,7 +2255,7 @@ blaZ_99_LTNH01000008 PC1 beta-lactamase (blaZ) 3000621 resfinder blaZ_9_DQ269019 PC1 beta-lactamase (blaZ) 3000621 resfinder bleO_1_AF051917 BLMT 3005036 resfinder car(A)_1_M80346 carA 3002817 resfinder -cat(pC194)_1_NC_002013 Streptococcus suis chloramphenicol acetyltransferase 3004455 resfinder +cat(pC194)_1_NC_002013 Limosilactobacillus reuteri cat-TC 3002671 resfinder cat(pC221)_1_X02529 Staphylococcus intermedius chloramphenicol acetyltransferase 3004457 resfinder cat(pC233)_1_AY355285 Enterococcus faecium chloramphenicol acetyltransferase 3004456 resfinder cat86_1_K00544 Bacillus pumilus cat86 3002672 resfinder @@ -2492,8 +2439,6 @@ dfrK_1_FN377602 dfrK 3002869 resfinder dfrK_2_FN677369 dfrK 3002869 resfinder dfrK_3_FN812951 dfrK 3002869 resfinder dldHA2X_1_AL939117 vanH gene in vanO cluster 3002948 resfinder -dldHA2X_1_AL939117 vanI 3003723 resfinder -dldHA2X_1_AL939117 vanX gene in vanI cluster 3003725 resfinder ere(A)_1_AY183453 EreA 3000361 resfinder ere(A)_2_AF099140 EreA2 
3002826 resfinder ere(A)_3_AF326209 EreA2 3002826 resfinder @@ -2579,8 +2524,7 @@ erm(T)_4_AJ488494 ErmT 3000595 resfinder erm(U)_1_NG_047843 ErmU 3001305 resfinder erm(V)_1_U59450 ErmV 3002824 resfinder erm(W)_1_D14532 ErmW 3001306 resfinder -erm(X)_1_M36726 DES-1 3004780 resfinder -erm(X)_1_M36726 tetA(58) 3003980 resfinder +erm(X)_1_M36726 ErmX 3000596 resfinder erm(X)_2_X51472 ErmX 3000596 resfinder erm(X)_3_U21300 ErmX 3000596 resfinder erm(X)_4_NC_005206 ErmX 3000596 resfinder @@ -2637,7 +2581,6 @@ fos_2_FN543093 FosA8 3007371 resfinder fusB_1_AY373761 fusB 3003552 resfinder fusB_2_JF777505 fusB 3003552 resfinder fusC_1_KF527883 fusC 3003733 resfinder -grdA_1_QJX10702 efrB 3003949 resfinder grmA_1_M55520 sgm 3000862 resfinder grmB_1_M55521 sgm 3000862 resfinder grmO_1_AY524043 sgm 3000862 resfinder @@ -2749,7 +2692,7 @@ mef(A)_3_AF227520 mel 3000616 resfinder mef(A)_4_HG423652 mel 3000616 resfinder mef(B)_1_FJ196385 mef(B) 3003107 resfinder mef(C)_1_AB571865 mefC 3003745 resfinder -mph(A)_1_D16251 mphA 3000316 resfinder +mph(A)_1_D16251 ADC-132 3006307 resfinder mph(A)_2_U36578 mphA 3000316 resfinder mph(B)_1_D85892 mphB 3000318 resfinder mph(C)_1_AB013298 mphC 3000319 resfinder @@ -2821,7 +2764,7 @@ penA_1_AF515059 Neisseria meningititis PBP2 conferring resistance to beta-lactam pexA_1_HM537013 pexA 3004666 resfinder poxtA-Ef_1_WP094899500.1 poxtA 3004470 resfinder poxtA_1_MF095097 poxtA 3004470 resfinder -qepA1_1_AB263754 QepA2 3004103 resfinder +qepA1_1_AB263754 AAC(2')-Id 3002526 resfinder qepA2_1_EU847537 QepA2 3004103 resfinder qepA3_1_JQ064560 QepA1 3000448 resfinder qepA4_1_KX580704 QepA4 3004379 resfinder @@ -3031,7 +2974,7 @@ tet(40)_1_FJ158002 tet(40) 3000567 resfinder tet(40)_2_AM419751 tet(40) 3000567 resfinder tet(41)_1_AY264780 tet(41) 3000569 resfinder tet(42)_1_EU523697 tet(42) 3000572 resfinder -tet(43)_1_GQ244501 tet(43) 3000573 resfinder +tet(43)_1_GQ244501 acrB 3000216 resfinder tet(44)_1_NZ_ABDU01000081 tet(44) 3000556 resfinder 
 tet(44)_2_FN594949 tet(44) 3000556 resfinder
 tet(45)_1_JF837331 tet(45) 3003196 resfinder
@@ -3170,25 +3113,10 @@ tmexD2_1_MN175502 MexD 3000801 resfinder
 tmexD3_1_LC633285 MexD 3000801 resfinder
 tmexD4_1_CP091084 MexD 3000801 resfinder
 tva(A)_1_ENA_SOX29786 tva(A) 3004730 resfinder
-vanM_1_FJ349556 vanS gene in vanM cluster 3002939 resfinder
 vanM_1_FJ349556 vanR gene in vanM cluster 3002928 resfinder
-vanM_1_FJ349556 rphA 3000444 resfinder
-vanM_1_FJ349556 vanH gene in vanM cluster 3002947 resfinder
-vanM_1_FJ349556 vanM 3002911 resfinder
-vanM_1_FJ349556 vanY gene in vanM cluster 3002961 resfinder
-vanM_1_FJ349556 vanX gene in vanM cluster 3002953 resfinder
 vanXmurFvanKWI_1_NZAGAF01000127 vanX gene in vanI cluster 3003725 resfinder
-vanXmurFvanKWI_1_NZAGAF01000127 vanK gene in vanI cluster 3003727 resfinder
-vanXmurFvanKWI_1_NZAGAF01000127 vanW gene in vanG cluster 3002965 resfinder
-vanXmurFvanKWI_1_NZAGAF01000127 vanI 3003723 resfinder
-vanXmurFvanKWI_2_AP008230 vanW gene in vanG cluster 3002965 resfinder
 vanXmurFvanKWI_2_AP008230 vanK gene in vanI cluster 3003727 resfinder
-vanXmurFvanKWI_2_AP008230 vanX gene in vanI cluster 3003725 resfinder
-vanXmurFvanKWI_2_AP008230 vanI 3003723 resfinder
 vanXmurFvanWI_1_CP001336 vanX gene in vanI cluster 3003725 resfinder
-vanXmurFvanWI_1_CP001336 vanW gene in vanB cluster 3002964 resfinder
-vanXmurFvanWI_1_CP001336 vanI 3003723 resfinder
-vanXmurFvanWI_1_CP001336 cfr(D) 3005021 resfinder
 vat(A)_1_L07778 vatA 3002840 resfinder
 vat(B)_1_U19459 vatB 3002841 resfinder
 vat(C)_1_AF015628 vatC 3002842 resfinder
diff --git a/argnorm/lib.py b/argnorm/lib.py
index 734f80f..c9024c4 100644
--- a/argnorm/lib.py
+++ b/argnorm/lib.py
@@ -16,27 +16,27 @@ def is_number(num):
     return True
-def get_data_path(path, getting_manual_curation):
-    if getting_manual_curation:
-        return os.path.join(_ROOT, 'data/manual_curation', path)
-
-    return os.path.join(_ROOT, 'data', path)
-
 def get_aro_mapping_table(database):
-    df = pd.read_csv(get_data_path(f'{database}_ARO_mapping.tsv', False), sep='\t')
+    aro_mapping_table = pd.read_csv(os.path.join(_ROOT, 'data', f'{database}_ARO_mapping.tsv'), sep='\t')
-    manual_curation = pd.read_csv(get_data_path(f'{database}_curation.tsv', True), sep='\t')
-    manual_curation['Database'] = df['Database']
+    manual_curation = pd.read_csv(os.path.join(_ROOT, 'data/manual_curation', f'{database}_curation.tsv'), sep='\t')
+    manual_curation['Database'] = aro_mapping_table['Database']
+
+    aro_mapping_table = aro_mapping_table.drop_duplicates(subset=['Original ID'], ignore_index=True).set_index('Original ID')
+    for i in manual_curation['Original ID']:
+        if i in aro_mapping_table.index:
+            aro_mapping_table.loc[i, 'ARO'] = manual_curation.set_index('Original ID').loc[i, 'ARO']
+            aro_mapping_table.loc[i, 'Gene Name in CARD'] = manual_curation.set_index('Original ID').loc[i, 'Gene Name in CARD']
+        else:
+            aro_mapping_table.loc[i] = manual_curation.set_index('Original ID').loc[i]
-    aro_mapping_table = pd.concat([df, manual_curation])
     aro_mapping_table[TARGET_ARO_COL] = aro_mapping_table[TARGET_ARO_COL].map(lambda a: f'ARO:{int(a)}' if is_number(a) else a)
-
-    return aro_mapping_table
+    return aro_mapping_table.reset_index()
 def map_to_aro(gene, database):
     if database not in ['ncbi', 'deeparg', 'resfinder', 'sarg', 'megares', 'argannot']:
         raise Exception(f'{database} is not a supported database.')
-
+
     mapping_table = get_aro_mapping_table(database).set_index('Original ID')
     try:
diff --git a/db_harmonisation/crude_db_harmonisation.py b/db_harmonisation/crude_db_harmonisation.py
index e2010f2..18f3f81 100644
--- a/db_harmonisation/crude_db_harmonisation.py
+++ b/db_harmonisation/crude_db_harmonisation.py
@@ -5,6 +5,8 @@
 import os
 from os import path
 import tempfile
+from Bio import SeqIO
+from Bio.Seq import translate, Seq
 @TaskGenerator
 def create_out_dirs():
@@ -58,9 +60,6 @@ def get_megares_db():
 @TaskGenerator
 def fix_ncbi(ncbi_amr_faa):
-    from Bio import SeqIO
-    from Bio.Seq import Seq
-
     ofile = './dbs/ncbi.faa'
     with open(ncbi_amr_faa) as original, \
          open(ofile, 'w') as corrected:
@@ -70,6 +69,16 @@ def fix_ncbi(ncbi_amr_faa):
     return ofile
+@TaskGenerator
+def fna_to_faa(ifile):
+    ofile = ifile.replace('.fna', '.faa')
+    with open(ifile) as original, open(ofile, 'w') as output:
+        for record in SeqIO.parse(original, 'fasta'):
+            record.seq = Seq(str(translate(record.seq)).replace('*', ''))
+            SeqIO.write(record, output, 'fasta')
+
+    return ofile
+
 @TaskGenerator
 def run_rgi(fa):
     from get_mapping_table import get_aro_for_hits
@@ -88,7 +97,7 @@ def run_rgi(fa):
     subprocess.check_call(
         [
-            'rgi',
+            'rgi',
             'main',
             '-i', fa,
             '-o', rgi_ofile,
@@ -109,7 +118,7 @@ def move_mappings_to_argnorm(aro_mapping):
 create_out_dirs()
 barrier()
 for db in [
-    get_resfinder_db(),
+    fna_to_faa(get_resfinder_db()),
     fix_ncbi(get_ncbi_db()),
     get_sarg_db(),
     get_resfinderfg_db(),
diff --git a/db_harmonisation/get_mapping_table.py b/db_harmonisation/get_mapping_table.py
index 1835481..c8f3868 100644
--- a/db_harmonisation/get_mapping_table.py
+++ b/db_harmonisation/get_mapping_table.py
@@ -13,7 +13,7 @@ def check_file(path):
         return path
     else:
         raise argparse.ArgumentTypeError(f"{path} can't be read")
-
+
 def get_aro_for_hits(fa, rgi_output, database):
     database_entries = []
     for record in SeqIO.parse(str(fa), 'fasta'):
@@ -26,7 +26,7 @@ def get_aro_for_hits(fa, rgi_output, database):
     rgi_hits = pd.read_csv(rgi_output, sep='\t')
     if database == 'resfinder':
-        rgi_hits['Original ID'] = rgi_hits['Contig'].apply(lambda x: "_".join(x.split('_')[:-1]))
+        rgi_hits['Original ID'] = rgi_hits['ORF_ID']
     elif database == 'ncbi':
         rgi_hits['Original ID'] = rgi_hits['ORF_ID']
     elif database == 'sarg':
diff --git a/outputs/hamronized/abricate.megares.tsv b/outputs/hamronized/abricate.megares.tsv
index 2ba3494..16a0fba 100644
--- a/outputs/hamronized/abricate.megares.tsv
+++ b/outputs/hamronized/abricate.megares.tsv
@@ -198,7 +198,7 @@ GMGC10.027_121_620.RBPA
GMGC10.95nr_block_0005 RBPA Drugs:Rifampin:RNA-polymeras
 GMGC10.027_126_791.RBPA GMGC10.95nr_block_0005 RBPA Drugs:Rifampin:RNA-polymerase_binding_protein:RBPA megares 2021-Mar-27 MEG_6047 abricate abricate 1.0.1 91.39 1 302 + 87.54 ARO:3000245 ARO:3000169,ARO:3000517,ARO:3000530,ARO:3000534 ARO:3000157,ARO:3000157,ARO:3000157,ARO:3000157
 GMGC10.027_135_808.MMR GMGC10.95nr_block_0005 MMR Multi-compound:Drug_and_biocide_resistance:Drug_and_biocide_SMR_efflux_pumps:MMR megares 2021-Mar-27 MEG_3996 abricate abricate 1.0.1 80.33 20 324 + 94.14 ARO:3005009 ARO:3005386 ARO:3005386
 GMGC10.027_272_655.SOXS GMGC10.95nr_block_0005 SOXS Multi-compound:Drug_and_biocide_and_metal_resistance:Drug_and_biocide_and_metal_resistance_regulator:SOXS megares 2021-Mar-27 MEG_6551 abricate abricate 1.0.1 80.29 36 314 + 86.11 ARO:3003511 ARO:0000036,ARO:3000385,ARO:3007045 ARO:0000001,ARO:0000001,ARO:3000387
-GMGC10.027_903_362.EMRE GMGC10.95nr_block_0005 QACH Multi-compound:Drug_and_biocide_resistance:Drug_and_biocide_SMR_efflux_pumps:QACH megares 2021-Mar-27 MEG_5847 abricate abricate 1.0.1 87.65 1 324 + 100.0 ARO:3006954 ARO:0000020 ARO:3000007
+GMGC10.027_903_362.EMRE GMGC10.95nr_block_0005 QACH Multi-compound:Drug_and_biocide_resistance:Drug_and_biocide_SMR_efflux_pumps:QACH megares 2021-Mar-27 MEG_5847 abricate abricate 1.0.1 87.65 1 324 + 100.0 ARO:3003836 ARO:3005386 ARO:3005386
 GMGC10.028_155_496.SOXS GMGC10.95nr_block_0005 SOXS Multi-compound:Drug_and_biocide_and_metal_resistance:Drug_and_biocide_and_metal_resistance_regulator:SOXS megares 2021-Mar-27 MEG_6551 abricate abricate 1.0.1 91.8 1 317 + 97.84 ARO:3003511 ARO:0000036,ARO:3000385,ARO:3007045 ARO:0000001,ARO:0000001,ARO:3000387
 GMGC10.028_171_025.RBPA GMGC10.95nr_block_0005 RBPA Drugs:Rifampin:RNA-polymerase_binding_protein:RBPA megares 2021-Mar-27 MEG_6047 abricate abricate 1.0.1 85.54 1 332 + 96.23 ARO:3000245 ARO:3000169,ARO:3000517,ARO:3000530,ARO:3000534 ARO:3000157,ARO:3000157,ARO:3000157,ARO:3000157
 GMGC10.030_070_454.RBPA GMGC10.95nr_block_0005 RBPA Drugs:Rifampin:RNA-polymerase_binding_protein:RBPA megares 2021-Mar-27 MEG_6047 abricate abricate 1.0.1 92.77 1 332 + 96.23 ARO:3000245 ARO:3000169,ARO:3000517,ARO:3000530,ARO:3000534 ARO:3000157,ARO:3000157,ARO:3000157,ARO:3000157
diff --git a/outputs/hamronized/abricate.resfinder.tsv b/outputs/hamronized/abricate.resfinder.tsv
index 000d157..723ab2a 100644
--- a/outputs/hamronized/abricate.resfinder.tsv
+++ b/outputs/hamronized/abricate.resfinder.tsv
@@ -5,7 +5,7 @@ GMGC10.016_690_720.UNKNOWN GMGC10.95nr_block_0002 dfrB3_1 dfrB3 resfinder 2021-M
 GMGC10.026_804_036.UNKNOWN GMGC10.95nr_block_0002 dfrB5_1 dfrB5 resfinder 2021-Mar-27 AY943084 abricate abricate 1.0.1 100.0 1 237 + 100.0 Trimethoprim ARO:3004549 ARO:3000188 ARO:3000171
 GMGC10.044_316_970.UNKNOWN GMGC10.95nr_block_0002 dfrB3_1 dfrB3 resfinder 2021-Mar-27 X72585 abricate abricate 1.0.1 85.78 15 232 + 91.98 Trimethoprim ARO:3003022 ARO:3000188 ARO:3000171
 GMGC10.046_578_884.UNKNOWN GMGC10.95nr_block_0002 dfrB4_1 dfrB4 resfinder 2021-Mar-27 AJ429132 abricate abricate 1.0.1 100.0 1 237 + 100.0 Trimethoprim ARO:3004498 ARO:3000188 ARO:3000171
-GMGC10.038_883_974.AACA7 GMGC10.95nr_block_0006 aac(6')-29a_1 aac(6')-29a resfinder 2021-Mar-27 AF263519 abricate abricate 1.0.1 100.0 1 381 + 96.21 Amikacin;Tobramycin ARO:3003548
+GMGC10.038_883_974.AACA7 GMGC10.95nr_block_0006 aac(6')-29a_1 aac(6')-29a resfinder 2021-Mar-27 AF263519 abricate abricate 1.0.1 100.0 1 381 + 96.21 Amikacin;Tobramycin ARO:3002583 ARO:0000007,ARO:0000013,ARO:0000049,ARO:0000052,ARO:3000652 ARO:0000016,ARO:0000016,ARO:0000016,ARO:0000016,ARO:0000016
 GMGC10.002_430_984.FOSB GMGC10.95nr_block_0007 fosB1_1 fosB1 resfinder 2021-Mar-27 CP001903 abricate abricate 1.0.1 99.28 1 417 + 100.0 Fosfomycin ARO:3007372 ARO:0000025 ARO:3007149
 GMGC10.002_455_792.FOSB GMGC10.95nr_block_0007 fos_1 fos resfinder 2021-Mar-27 ACCV01000052 abricate abricate 1.0.1 87.22 1 399 + 100.0 Fosfomycin ARO:3007371 ARO:0000025 ARO:3007149
 GMGC10.003_097_321.FOSB GMGC10.95nr_block_0007 fos_1 fos resfinder 2021-Mar-27 ACCV01000052 abricate abricate 1.0.1 93.73 1 399 + 100.0 Fosfomycin ARO:3007371 ARO:0000025 ARO:3007149
@@ -696,7 +696,7 @@ GMGC10.035_775_951.UNKNOWN GMGC10.95nr_block_0020 blaOXA-157_1 blaOXA-157 resfin
 GMGC10.035_923_519.AACC GMGC10.95nr_block_0020 aac(3)-VIIa_1 aac(3)-VIIa resfinder 2021-Mar-27 M22999 abricate abricate 1.0.1 82.55 8 837 + 95.62 ARO:3002541 ARO:3000657 ARO:0000016
 GMGC10.036_776_447.ERMX GMGC10.95nr_block_0020 erm(U)_1 erm(U) resfinder 2021-Mar-27 NG_047843 abricate abricate 1.0.1 97.5 1 840 + 100.0
 GMGC10.037_145_599.ERMX GMGC10.95nr_block_0020 erm(X)_4 erm(X) resfinder 2021-Mar-27 NC_005206 abricate abricate 1.0.1 94.5 1 855 + 100.0 Erythromycin;Lincomycin;Clindamycin;Quinupristin;Pristinamycin_IA;Virginiamycin_S
-GMGC10.037_185_560.ERMX GMGC10.95nr_block_0020 erm(X)_1 erm(X) resfinder 2021-Mar-27 M36726 abricate abricate 1.0.1 96.26 1 855 + 99.88 Erythromycin;Lincomycin;Clindamycin;Quinupristin;Pristinamycin_IA;Virginiamycin_S ARO:3003980 ARO:0000051 ARO:3000050
+GMGC10.037_185_560.ERMX GMGC10.95nr_block_0020 erm(X)_1 erm(X) resfinder 2021-Mar-27 M36726 abricate abricate 1.0.1 96.26 1 855 + 99.88 Erythromycin;Lincomycin;Clindamycin;Quinupristin;Pristinamycin_IA;Virginiamycin_S ARO:3000596 ARO:0000006,ARO:0000027,ARO:0000046,ARO:0000057,ARO:0000065,ARO:0000066,ARO:3000145,ARO:3000156,ARO:3000158,ARO:3000176,ARO:3000583,ARO:3000584,ARO:3000669,ARO:3000672,ARO:3000673,ARO:3000674,ARO:3000675,ARO:3000677,ARO:3000678,ARO:3000679,ARO:3000680,ARO:3000681,ARO:3000682,ARO:3000867 ARO:0000000,ARO:0000000,ARO:0000000,ARO:0000000,ARO:0000000,ARO:0000000,ARO:0000000,ARO:0000000,ARO:0000000,ARO:0000017,ARO:0000017,ARO:0000026,ARO:0000026,ARO:0000026,ARO:0000026,ARO:0000026,ARO:0000026,ARO:0000026,ARO:0000026,ARO:0000026,ARO:0000026,ARO:0000026,ARO:0000026,ARO:0000026
 GMGC10.037_639_937.AACC GMGC10.95nr_block_0020 aac(3)-Xa_1 aac(3)-Xa resfinder 2021-Mar-27 AB028210 abricate abricate 1.0.1 85.31 1 843 - 98.48 Amikacin ARO:3002544 ARO:0000049 ARO:0000016
 GMGC10.038_878_517.YBXI GMGC10.95nr_block_0020 blaOXA-296_1 blaOXA-296 resfinder 2021-Mar-27 APOH01000009 abricate abricate 1.0.1 90.88 1 855 + 100.0 ARO:3001751 ARO:0000056 ARO:3000007
 GMGC10.039_883_918.BLA GMGC10.95nr_block_0020 blaVHH-1_1 blaVHH-1 resfinder 2021-Mar-27 NG050334 abricate abricate 1.0.1 86.03 1 851 + 99.77 ARO:3007003 ARO:3000637 ARO:3000007
@@ -859,7 +859,7 @@ GMGC10.006_574_647.BLA GMGC10.95nr_block_0023 blaSGM-1_1 blaSGM-1 resfinder 2021
 GMGC10.007_442_608.UNKNOWN GMGC10.95nr_block_0023 blaZ_12 blaZ resfinder 2021-Mar-27 M15195 abricate abricate 1.0.1 97.9 1 951 + 100.0 Amoxicillin;Ampicillin;Penicillin;Piperacillin ARO:3007101 ARO:0000032,ARO:3003706 ARO:3000007,ARO:3000007
 GMGC10.007_663_890.BLA GMGC10.95nr_block_0023 blaSGM-2_1 blaSGM-2 resfinder 2021-Mar-27 AP010803 abricate abricate 1.0.1 91.71 1 929 + 99.25 ARO:3006988 ARO:0000020 ARO:3000007
 GMGC10.010_077_879.UNKNOWN GMGC10.95nr_block_0023 blaCME-1_1 blaCME-1 resfinder 2021-Mar-27 AJ006275 abricate abricate 1.0.1 99.5 92 889 + 89.86 ARO:3004775 ARO:3000008 ARO:3000007
-GMGC10.010_980_812.AADA GMGC10.95nr_block_0023 ant(3'')-Ia_1 ant(3'')-Ia resfinder 2021-Mar-27 X02340 abricate abricate 1.0.1 97.55 5 942 + 96.5 Streptomycin ARO:3002601 ARO:0000039,ARO:0000040 ARO:0000016,ARO:0000016
+GMGC10.010_980_812.AADA GMGC10.95nr_block_0023 ant(3'')-Ia_1 ant(3'')-Ia resfinder 2021-Mar-27 X02340 abricate abricate 1.0.1 97.55 5 942 + 96.5 Streptomycin ARO:3004089 ARO:0000039,ARO:0000040 ARO:0000016,ARO:0000016
 GMGC10.011_733_785.BLA GMGC10.95nr_block_0023 blaSGM-1_1 blaSGM-1 resfinder 2021-Mar-27 AAQG01000013 abricate abricate 1.0.1 100.0 1 966 + 100.0 ARO:3006987 ARO:0000020 ARO:3000007
 GMGC10.013_566_266.BLA GMGC10.95nr_block_0023 blaSGM-1_1 blaSGM-1 resfinder 2021-Mar-27 AAQG01000013 abricate abricate 1.0.1 84.89 1 966 + 100.0 ARO:3006987 ARO:0000020 ARO:3000007
 GMGC10.013_931_986.BLA GMGC10.95nr_block_0023 blaSGM-1_1 blaSGM-1 resfinder 2021-Mar-27 AAQG01000013 abricate abricate 1.0.1 81.11 14 963 + 98.34 ARO:3006987 ARO:0000020 ARO:3000007
diff --git a/setup.py b/setup.py
index 77d69cc..2ee35fa 100644
--- a/setup.py
+++ b/setup.py
@@ -24,7 +24,7 @@
     packages=['argnorm', 'argnorm.data'],
     include_package_data=True,
     package_dir={'argnorm': 'argnorm' },
-    package_data={'argnorm': ['data/*.tsv', 'data/manual_curation/*.tsv']},
+    package_data={'argnorm': ['data/*.tsv', 'data/manual_curation/*.tsv', 'data/cluster_rc_correction/*.tsv']},
     install_requires=open("./requirements.txt", "r").read().splitlines(),
     long_description=open("./README.md", "r").read(),
     long_description_content_type='text/markdown',