medvir/DiaMet

DiaMet

This repository analyzes the metagenomic Illumina reads that VirMet could not classify (undetermined_reads.fastq.gz) by aligning them at the protein level with DIAMOND/BLASTx.

diamet.py analyzes all reads from undetermined_reads.fastq.gz in two ways:

  • as single reads, and

  • as contigs created by de novo assembly using megahit.
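The two passes can be sketched as three external commands: DIAMOND on the raw reads, megahit to assemble contigs, and DIAMOND again on the contigs. This is a minimal sketch and not the actual content of diamet.py. The database name `swissprot`, the directory `megahit_out`, and the function name `build_commands` are assumptions made for illustration. The sketch also assumes both tools are callable by name and that the DIAMOND database was built with taxonomy information, which the `sscinames` and `sskingdoms` columns require.

```python
from pathlib import Path


def build_commands(reads="undetermined_reads.fastq.gz",
                   db="swissprot",               # assumed database name/path
                   assembly_dir="megahit_out"):  # assumed megahit output directory
    """Return the command lines for the read-level and contig-level passes."""
    fields = ["qseqid", "qlen", "length", "sscinames", "sskingdoms"]
    # megahit writes its assembled contigs to final.contigs.fa in its output directory.
    contigs = str(Path(assembly_dir) / "final.contigs.fa")

    def blastx(query, out):
        # Tabular output (format 6) restricted to the columns listed under Output.
        return ["diamond", "blastx", "--db", db, "--query", query,
                "--out", out, "--outfmt", "6", *fields]

    return [
        blastx(reads, "undetermined_reads_diamet.tsv"),   # pass 1: single reads
        ["megahit", "-r", reads, "-o", assembly_dir],     # de novo assembly
        blastx(contigs, "undetermined_contigs_diamet.tsv"),  # pass 2: contigs
    ]


if __name__ == "__main__":
    for cmd in build_commands():
        print(" ".join(cmd))
```

Each list can be passed to `subprocess.run(cmd, check=True)` in order; printing them first makes it easy to inspect the exact invocation before spending compute time.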

How to run

  1. Log in to timavo.

ssh timavo

  2. Move into the directory of the sample whose undetermined reads you want to analyze.

cd /analysis/VirMet/<run>/<sample>/

  3. Run the Python script.

python <path to script>/diamet.py

Input

To run diamet.py, you need:

  • the diamond unix executable file which can be found here;

  • megahit installed on the server;

  • a protein database (defined in the code; we are using swissprot);

  • undetermined_reads.fastq.gz, which should be in the current working directory.
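Before starting a long run, it can help to verify these prerequisites. The following is a minimal sketch, not part of diamet.py. It assumes both executables are reachable through `PATH` (the diamond binary may instead sit at a fixed location) and it does not check the protein database, because its path is defined in the code. The function name `missing_prerequisites` is hypothetical.

```python
import shutil
from pathlib import Path


def missing_prerequisites(workdir=".", tools=("diamond", "megahit"),
                          reads="undetermined_reads.fastq.gz"):
    """Return a list of human-readable problems; an empty list means ready to run."""
    problems = []
    for tool in tools:
        # shutil.which returns None when the executable is not found on PATH.
        if shutil.which(tool) is None:
            problems.append(f"executable not found on PATH: {tool}")
    if not (Path(workdir) / reads).is_file():
        problems.append(f"input file not found in {workdir}: {reads}")
    return problems


if __name__ == "__main__":
    for problem in missing_prerequisites():
        print(problem)
```

Run from the sample directory, it prints nothing when everything is in place.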

Output

diamet.py will output the following files:

  • undetermined_reads_diamet.pdf which plots the taxonomic classification distribution of all hits;

  • undetermined_reads_diamet.tsv which lists all hits and their Query Seq-id (qseqid), Query sequence length (qlen), Alignment length (length), Unique Subject Scientific Name (sscinames), and Unique Subject Super Kingdom (sskingdoms);

  • undetermined_reads_diamet_viral.csv which lists only the viral hits and their counts;

  • undetermined_contigs_diamet.tsv which lists all hits of the contigs and their Query Seq-id (qseqid), Query sequence length (qlen), Alignment length (length), Unique Subject Scientific Name (sscinames), and Unique Subject Super Kingdom (sskingdoms).
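The hit tables can be post-processed with the standard library, for example to recompute per-species viral counts similar to those in undetermined_reads_diamet_viral.csv. This is a sketch under stated assumptions and not the script's own logic: it assumes the TSV has no header row, that its columns appear in the order listed above, and that viral hits carry the superkingdom label `Viruses`. The function name `count_viral_hits` is hypothetical.

```python
import csv
from collections import Counter

# Assumed column order, taken from the list of fields above.
COLUMNS = ["qseqid", "qlen", "length", "sscinames", "sskingdoms"]


def count_viral_hits(tsv_path):
    """Count hits per scientific name among rows whose superkingdom is Viruses."""
    counts = Counter()
    with open(tsv_path, newline="") as handle:
        reader = csv.DictReader(handle, fieldnames=COLUMNS, delimiter="\t")
        for row in reader:
            if row["sskingdoms"] == "Viruses":  # assumed label for viral hits
                counts[row["sscinames"]] += 1
    return counts


if __name__ == "__main__":
    hits = count_viral_hits("undetermined_reads_diamet.tsv")
    for name, n in hits.most_common():
        print(f"{name}\t{n}")
```

The same function works on undetermined_contigs_diamet.tsv, since both tables share the column layout.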
