mkarlik93/tiara_mitogenomics

Tiara_mitogenomics

This is a Snakemake workflow for the detection and preliminary annotation of mitochondrial genomes of metazoans (animals), such as nematodes, in assembled metagenomic data.

Note!

Before running the analysis, you need to set up a conda environment with Snakemake installed.

Usage

git clone https://github.com/mkarlik93/tiara_mitogenomics

cd tiara_mitogenomics

conda activate snakemake

The input file, the database, and the minimum contig length must be specified in the config file; the input file and the database must be given as full paths. A template config file is provided in the config subfolder and looks like this:

run_name: test_run # Name of your run
assembly_path: "metagenomic_assembly.fasta" # A full path to metagenomic assembly
outputdir: 'test_run/' # An output directory

Pipeline_settings :

  mito_pipline : True


General_parameters:

  min_length: 3000 # A minimum length of contigs analyzed by tiara
  temporary_directory: "~/." # Path to temporary directory
  database: "Nematoda_CDS_protein.fa" # A full path to the database
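Because the workflow expects full paths and a sensible minimum length, it can help to sanity-check the config before launching a run. The following is a minimal sketch, not part of the workflow itself: `check_config` is a hypothetical helper, and it assumes the config has already been parsed into a Python mapping (for example with PyYAML's `yaml.safe_load`). It only checks the form of the values, not whether the files exist.

```python
import os


def check_config(cfg):
    """Return a list of problems found in a parsed config mapping.

    Hypothetical helper: the key names follow the template above,
    the checks themselves are an assumption about what is useful.
    """
    problems = []

    # The README asks for a full path to the metagenomic assembly.
    assembly = str(cfg.get("assembly_path", ""))
    if not os.path.isabs(os.path.expanduser(assembly)):
        problems.append(f"assembly_path is not a full path: {assembly!r}")

    general = cfg.get("General_parameters") or {}

    # The README asks for a full path to the protein database as well.
    database = str(general.get("database", ""))
    if not os.path.isabs(os.path.expanduser(database)):
        problems.append(f"database is not a full path: {database!r}")

    # min_length should be a positive integer (bool is excluded explicitly,
    # because bool is a subclass of int in Python).
    min_length = general.get("min_length")
    if (
        not isinstance(min_length, int)
        or isinstance(min_length, bool)
        or min_length <= 0
    ):
        problems.append(f"min_length is not a positive integer: {min_length!r}")

    return problems


# The template values, written out as the mapping a YAML parser would produce.
template = {
    "run_name": "test_run",
    "assembly_path": "metagenomic_assembly.fasta",
    "outputdir": "test_run/",
    "Pipeline_settings": {"mito_pipline": True},
    "General_parameters": {
        "min_length": 3000,
        "temporary_directory": "~/.",
        "database": "Nematoda_CDS_protein.fa",
    },
}

for problem in check_config(template):
    print(problem)
```

Run against the unmodified template, this reports the two placeholder file names, since neither is a full path yet.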

Once you have modified the config file (which must be located in the config subfolder), you can start the analysis.

snakemake --use-conda --cores n

Here n is the number of cores you want to use. Remember that you must run the command from the tiara_mitogenomics folder.

MT_database, which can be downloaded from figshare, contains representative proteomes specific to 12 groups of Metazoa:

  • Annelida-segmented-worms
  • Arthropoda
  • Bryozoa
  • Chaetognatha
  • Chordata
  • Cnidaria
  • Echinodermata
  • Mollusca
  • Nematoda
  • Nemertea-ribbon-worms
  • Platyhelminthes-flatworms
  • Porifera-sponges

Please unzip the data pack and specify its path in the config file.

Concatenated proteomes from all groups are also available in Animal_CDS_protein.fa.

This database was created by the developers of mitoZ; please credit their work as well.

Results

The main result is a tabular file with the suffix "_tabulated_result.tsv". It lists the contig names together with the counts of mitochondrial genes found on each contig.

Name Length GC_content ND2 ND5 COX3 ATP8 ND6 ND4L ND4 ND3 ND1 COX1 ATP6 CYTB COX2
Dummy_contig1 3072 23.14 0 0 1 0 0 0 1 0 0 7 0 1 0
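As a rough illustration of how this table can be post-processed, the sketch below reads it with Python's standard `csv` module and reports, per contig, which genes have a non-zero count. It assumes the file is tab-delimited (as the ".tsv" suffix suggests) and that every column after Name, Length and GC_content holds an integer gene count. The function name `summarize` and the idea of treating a count above zero as "detected" are illustrative assumptions, not part of the workflow.

```python
import csv
import io

# Columns that describe the contig itself rather than a gene count.
META_COLUMNS = ("Name", "Length", "GC_content")


def summarize(handle):
    """Summarize gene counts per contig from a tab-delimited result table."""
    summary = []
    for row in csv.DictReader(handle, delimiter="\t"):
        # Every non-metadata column is assumed to be an integer gene count.
        counts = {k: int(v) for k, v in row.items() if k not in META_COLUMNS}
        summary.append(
            {
                "name": row["Name"],
                "length": int(row["Length"]),
                "genes_detected": sorted(g for g, n in counts.items() if n > 0),
                "total_hits": sum(counts.values()),
            }
        )
    return summary


# Rebuild the example table from above as tab-delimited text.
header = (
    "Name Length GC_content ND2 ND5 COX3 ATP8 ND6 ND4L ND4 ND3 ND1 "
    "COX1 ATP6 CYTB COX2"
).split()
row = "Dummy_contig1 3072 23.14 0 0 1 0 0 0 1 0 0 7 0 1 0".split()
example = "\t".join(header) + "\n" + "\t".join(row) + "\n"

result = summarize(io.StringIO(example))
for contig in result:
    print(contig["name"], contig["genes_detected"], contig["total_hits"])
```

For a real run, the same function can be applied to an open file handle on the "_tabulated_result.tsv" file in the output directory instead of the in-memory example.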

You will also find two subdirectories named "tiara_results" and "invidual_contigs_res". The first contains the native results of the tiara analysis. The second contains the DNA and amino acid sequences derived from the mitochondrial contigs, together with the raw results of the mmseqs2 reciprocal analysis.

Citation

If you use this package, please cite tiara.
