CrisprCustomDB

CrisprCustomDB, inspired by CrisprOpenDB (https://github.com/edzuf/CrisprOpenDB), predicts virus-host pairs from custom local spacer databases according to the three criteria established in Dion et al. (2021) (https://doi.org/10.1093/nar/gkab133). The user must provide a .gff file generated by CRISPRDetect and the viral genomes (nucleotide sequences) in multi-FASTA format. First, it extracts the spacers from the .gff file and builds a BLAST database. The viral genomes are then searched against the spacer database. Finally, it takes the BLAST table as input and outputs a list of virus-host pairs according to criterion 1 (host whose spacers match with at most 2 mismatches), criterion 2 (host matching more regions) and criterion 3 (host with the spacer closest to the 5' end).
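The following Python sketch illustrates one way the three criteria could be applied as a cascade. It is not the logic of get_host_id.pl, which is written in Perl and may differ in its details. The `Hit` record, its field names, the `predict_hosts` function and the example data are all hypothetical. The sketch assumes each BLAST hit has already been reduced to a virus, a host, a mismatch count, the start of the matched region on the viral genome, and the spacer's rank within its CRISPR array (1 = closest to the 5' end).

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass(frozen=True)
class Hit:
    """One spacer-to-virus BLAST hit (hypothetical, simplified record)."""
    virus: str
    host: str
    mismatches: int
    region_start: int      # start of the matched region on the viral genome
    spacer_position: int   # rank of the spacer in its array, 1 = 5' end


def predict_hosts(hits, max_mismatches=2):
    """Return {virus: (host, criterion)} using a three-step cascade."""
    # Criterion 1: keep only hits with at most `max_mismatches` mismatches.
    by_virus = defaultdict(list)
    for hit in hits:
        if hit.mismatches <= max_mismatches:
            by_virus[hit.virus].append(hit)

    predictions = {}
    for virus, virus_hits in by_virus.items():
        hosts = {h.host for h in virus_hits}
        if len(hosts) == 1:
            predictions[virus] = (next(iter(hosts)), 1)
            continue

        # Criterion 2: prefer the host that matches the most distinct regions.
        regions = defaultdict(set)
        for h in virus_hits:
            regions[h.host].add(h.region_start)
        best = max(len(r) for r in regions.values())
        tied = [host for host, r in regions.items() if len(r) == best]
        if len(tied) == 1:
            predictions[virus] = (tied[0], 2)
            continue

        # Criterion 3: among tied hosts, prefer the spacer closest to the 5' end.
        # The host name is a secondary key so that remaining ties resolve
        # deterministically (an assumption of this sketch).
        closest = min(
            (h for h in virus_hits if h.host in tied),
            key=lambda h: (h.spacer_position, h.host),
        )
        predictions[virus] = (closest.host, 3)
    return predictions


# Made-up example data, for illustration only.
EXAMPLE_HITS = [
    Hit("phiA", "Ecoli", 0, 100, 3),
    Hit("phiA", "Salmonella", 5, 900, 1),   # dropped: more than 2 mismatches
    Hit("phiB", "Ecoli", 1, 100, 4),
    Hit("phiB", "Ecoli", 2, 500, 2),
    Hit("phiB", "Salmonella", 0, 100, 1),
    Hit("phiC", "Ecoli", 0, 100, 5),
    Hit("phiC", "Salmonella", 1, 300, 2),
]

if __name__ == "__main__":
    for virus, (host, criterion) in sorted(predict_hosts(EXAMPLE_HITS).items()):
        print(virus, host, f"criterion {criterion}", sep="\t")
```

In this sketch each criterion is consulted only when the previous one leaves more than one candidate host, and a virus with no hit passing the mismatch filter receives no prediction.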

Usage: bash get_blast_tables.sh arrays.gff viruses.fasta

then

Usage: perl get_host_id.pl bacteria_viruses_blast.txt > bacteria_viruses_blast_host_id.txt