This repository contains a selection of Bash shell scripts used during my MSc project for data management, merging and QC. They are pertinent to AntiSMASH GenBank and Biosynthetic MetaSpades outputs.
-AntiSMASH
https://antismash.secondarymetabolites.org/#!/start
https://doi.org/10.1093/nar/gkab335
-Biosynthetic MetaSpades
https://github.com/ablab/spades
https://dx.doi.org/10.1101%2Fgr.213959.116
Merges GenBank files in respective AntiSMASH output folder into a single file.
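A minimal sketch of this kind of merge, not the repository's own script. The function name `merge_gbk` and its arguments are hypothetical, and it assumes the GenBank files end in `.gbk`.

```bash
#!/usr/bin/env bash
# Sketch: concatenate every GenBank file in one output folder into a single file.
# merge_gbk is a hypothetical name; the .gbk extension is an assumption.
merge_gbk() {
    local dir="$1" out="$2" f
    : > "$out"                           # start from an empty output file
    for f in "$dir"/*.gbk; do
        [ -e "$f" ] || continue          # no matches: the glob stays literal
        [ "$f" -ef "$out" ] && continue  # never append the output to itself
        # GenBank records end with "//", so plain concatenation keeps them separable.
        cat "$f" >> "$out"
    done
}
```

It could be called as `merge_gbk sample_output/ sample_merged.gbk`. If the folder also holds a whole-input GenBank file next to per-region files, the glob may need narrowing to avoid duplicated records.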
Runs AntiSMASH analysis tool on Biosynthetic MetaSpades processed fasta files; will run on each 'genecluster' fasta iteratively.
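The iteration could look roughly like the loop below. This is a sketch under assumptions: the `.fasta` extension, the function name `run_antismash_each`, the `ANTISMASH_BIN` override and the choice of `prodigal` for gene finding are all illustrative and not taken from the repository. Only the antiSMASH options `--output-dir` and `--genefinding-tool` are used.

```bash
#!/usr/bin/env bash
# Sketch: run antiSMASH once per fasta file, one output folder per input.
# ANTISMASH_BIN is a hypothetical override so the loop can be dry-run with echo.
run_antismash_each() {
    local in_dir="$1" out_root="$2" fa name
    mkdir -p "$out_root"
    for fa in "$in_dir"/*.fasta; do
        [ -e "$fa" ] || continue              # skip when nothing matches
        name=$(basename "$fa" .fasta)         # e.g. cluster1.fasta -> cluster1
        # Gene finding with prodigal is an assumption; pick what suits the data.
        "${ANTISMASH_BIN:-antismash}" \
            --output-dir "$out_root/$name" \
            --genefinding-tool prodigal \
            "$fa"
    done
}
```

Setting `ANTISMASH_BIN=echo` prints the commands instead of running them, which is a cheap way to check paths before a long run.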
Merges Biosynthetic MetaSpades output fasta files from their subdirectories into one file and also parses out fasta sequence length data.
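For the length data, one possible approach is a small awk pass over the merged file. The sketch below is illustrative only; `fasta_lengths` is a hypothetical name and the tab-separated output layout is an assumption.

```bash
#!/usr/bin/env bash
# Sketch: print "<header><TAB><sequence length>" for each record in a fasta file.
# fasta_lengths is a hypothetical name; multi-line sequences are summed.
fasta_lengths() {
    awk '
        /^>/ {
            if (seen) print name "\t" len   # flush the previous record
            name = substr($0, 2)            # header without the leading ">"
            len = 0
            seen = 1
            next
        }
        seen { len += length($0) }          # add each sequence line
        END  { if (seen) print name "\t" len }
    ' "$1"
}
```

For example, `fasta_lengths merged.fasta > lengths.tsv` would write one line per sequence.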
Compiles accessions, taken from the base file names, into one txt file when run in a directory containing several result folders, and shows the sample names used.
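A rough sketch of the idea, assuming each result folder is named after its sample accession. The function name `list_samples` is hypothetical.

```bash
#!/usr/bin/env bash
# Sketch: write the name of every result folder under a directory to a text
# file, then print the list. Assumes folder names are the sample accessions.
list_samples() {
    local root="$1" out="$2" d
    : > "$out"
    for d in "$root"/*/; do
        [ -d "$d" ] || continue        # no subfolders: the glob stays literal
        basename "${d%/}" >> "$out"    # strip the trailing slash, keep the name
    done
    cat "$out"                         # show the sample names that were found
}
```

Keeping the output file outside the scanned directory, e.g. `list_samples results/ samples.txt`, avoids mixing it with the result folders.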
Useful during GenBank file management; counts the number of GenBank files in the current directory.
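Counting can be done with a plain glob loop, as in this sketch. `count_gbk` is a hypothetical name and the `.gbk` extension is an assumption.

```bash
#!/usr/bin/env bash
# Sketch: count GenBank files directly inside a directory (default: current one).
count_gbk() {
    local dir="${1:-.}" n=0 f
    for f in "$dir"/*.gbk; do
        if [ -f "$f" ]; then           # ignores the literal glob when nothing matches
            n=$((n + 1))
        fi
    done
    echo "$n"
}
```

Running `count_gbk` with no argument reports on the current directory.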
Collects GenBank count data outputs from AntiSMASH into one file, breaks by common parsable marker and includes sample origin for use in analysis.
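As an illustration only: the sketch below assumes each sample folder holds a count file named `gbk_count.txt` (a hypothetical name) and uses a tab as the parsable separator between sample and count. Neither detail comes from the repository.

```bash
#!/usr/bin/env bash
# Sketch: gather per-sample count files into one table, "<sample><TAB><count>".
# gbk_count.txt and collect_counts are hypothetical names.
collect_counts() {
    local root="$1" out="$2" f sample
    : > "$out"
    for f in "$root"/*/gbk_count.txt; do
        [ -f "$f" ] || continue
        sample=$(basename "$(dirname "$f")")     # folder name as sample origin
        # Strip whitespace so each count sits on one clean line.
        printf '%s\t%s\n' "$sample" "$(tr -d '[:space:]' < "$f")" >> "$out"
    done
}
```

A tab-separated table like this loads directly into R or a spreadsheet for analysis.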
Merges scaffold fasta outputs from Biosynthetic MetaSpades into one unified file. Can be adapted to other fasta files with minor tweaking.
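One way to sketch this merge, assuming one assembly folder per sample directly under a common root. The function name `merge_scaffolds` is hypothetical, and the default file name `scaffolds.fasta` is an assumption that can be overridden with a third argument.

```bash
#!/usr/bin/env bash
# Sketch: concatenate the same-named fasta from every sample folder into one file.
# The third argument (default scaffolds.fasta, an assumption) is the "tweak"
# for pointing the merge at a different fasta output.
merge_scaffolds() {
    local root="$1" out="$2" fname="${3:-scaffolds.fasta}" f
    : > "$out"
    for f in "$root"/*/"$fname"; do
        [ -f "$f" ] || continue        # skip folders without the file
        cat "$f" >> "$out"
    done
}
```

For instance, `merge_scaffolds assemblies/ all_scaffolds.fasta` gathers the default file, while a third argument selects another fasta name.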
Collects fastas together into one file, breaks by common parsable marker and includes the fasta origin filename.
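The marker approach might be sketched as follows. The `### ` marker and the function name `collect_fastas` are hypothetical choices for illustration; the marker actually used by the script is not stated here.

```bash
#!/usr/bin/env bash
# Sketch: write each fasta into one file, preceded by a marker line that
# carries the origin file name. "### " is a hypothetical marker.
collect_fastas() {
    local out="$1" f
    shift                               # remaining arguments are input files
    : > "$out"
    for f in "$@"; do
        [ -f "$f" ] || continue
        printf '### %s\n' "$(basename "$f")" >> "$out"
        cat "$f" >> "$out"
    done
}
```

A call such as `collect_fastas combined.txt sampleA.fasta sampleB.fasta` gives a file that can be split again on lines starting with `### `. Marker lines are not valid FASTA, so they would need stripping before passing the file to sequence tools.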
Removes empty files in all subdirectories; identifies and clears away failed AntiSMASH/Biosynthetic MetaSpades analyses.
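A minimal sketch of this cleanup using `find`, assuming a `find` that supports `-empty` and `-delete` (GNU and BSD versions do). The function name `remove_empty_files` is hypothetical.

```bash
#!/usr/bin/env bash
# Sketch: list and delete zero-byte files under a directory tree.
remove_empty_files() {
    local root="${1:-.}"
    # -print names each empty file first, so failed analyses can be identified;
    # -delete then removes that file.
    find "$root" -type f -empty -print -delete
}
```

Redirecting the output, e.g. `remove_empty_files results/ > removed.txt`, keeps a record of which analyses produced empty files.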