Merge pull request #7 from Russel88/dev
0.1.18
trinezac authored Aug 28, 2023
2 parents 96add80 + 4c3706c commit fed31ec
Showing 11 changed files with 70 additions and 41 deletions.
24 changes: 17 additions & 7 deletions README.md
@@ -58,17 +58,18 @@ maginator ... --cluster qsub --cluster_info "-l nodes=1:ppn={cores}:thinnode,mem

## Test data

A test set can be found in the test_data directory.
A test set can be found in the maginator/test_data directory.
1. Download the 3 samples used for the test from SRA: https://www.ncbi.nlm.nih.gov/sra?LinkName=bioproject_sra_all&from_uid=715601 with the IDs dfc99c_A, f9d84e_A and 221641_A
2. Change the paths to the read-files in reads.csv
3. Unzip the contigs.fasta.gz
4. Run MAGinator
2. Clone repo: git clone https://github.com/Russel88/MAGinator.git
3. Change the paths to the read-files in reads.csv
4. Unzip the contigs.fasta.gz
5. Run MAGinator
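
Step 3 can be scripted. The sketch below is only an illustration: it assumes reads.csv is a headerless, comma-separated file with three columns (sample name, forward-read path, reverse-read path). The helper `repoint_reads`, the directory `/data/maginator_test` and the read-file names are hypothetical and not part of MAGinator.

```python
import posixpath


def repoint_reads(lines, new_dir):
    """Return reads.csv lines with each read path moved into new_dir.

    Assumes each line is 'sample,forward_path,reverse_path' (an assumption
    about the reads.csv layout; check it against your own file).
    """
    out = []
    for line in lines:
        line = line.strip()
        if not line:
            continue  # ignore blank lines
        sample, forward, reverse = line.split(",")
        # Keep the file name, swap the directory part.
        forward = posixpath.join(new_dir, posixpath.basename(forward))
        reverse = posixpath.join(new_dir, posixpath.basename(reverse))
        out.append(",".join([sample, forward, reverse]))
    return out


# Placeholder entry; the file names are invented for the example.
example = ["dfc99c_A,/old/path/dfc99c_A_1.fastq.gz,/old/path/dfc99c_A_2.fastq.gz"]
print(repoint_reads(example, "/data/maginator_test"))
```

Writing the returned lines back to reads.csv completes the step; keeping a copy of the original file first is a sensible precaution.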

MAGinator has been run on the test data on a slurm server with the following command:
```
```sh
maginator --vamb_clusters clusters.tsv --reads reads.csv --contigs contigs.fasta --gtdb_db data/release207_v2/ --output test_out --cluster slurm --cluster_info "-n {cores} --mem {mem_gb}gb -t {runtime}" --max_mem 180
```
The expected output can be found in test_data/test_out (excluding the GTDB-tk folders, phylogeny alignments and BAM-files due to size limitations)
The expected output can be found as a zipped file on Zenodo: https://doi.org/10.5281/zenodo.8279036
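
The `--cluster_info` string in the command above contains the placeholders `{cores}`, `{mem_gb}` and `{runtime}`. As a rough illustration of what such a template might look like once values are substituted, here is a Python sketch using `str.format`. The values are invented for the example, and this is not a description of how MAGinator itself fills the placeholders.

```python
# Template copied from the command above.
cluster_info = "-n {cores} --mem {mem_gb}gb -t {runtime}"

# Hypothetical per-job values, chosen only for illustration.
job = {"cores": 8, "mem_gb": 32, "runtime": "02:00:00"}

expanded = cluster_info.format(**job)
print(expanded)  # -n 8 --mem 32gb -t 02:00:00
```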

## Recommended workflow

@@ -88,14 +89,23 @@ sed 's/@/_/g' vamb/clusters.tsv > clusters.tsv

Now you are ready to run MAGinator.

## Functional Annotation

To generate the functional annotation of the genes, we recommend using eggNOG-mapper (https://github.com/eggnogdb/eggnog-mapper).

You can download it and try running it on the test data:
```
```sh
mkdir test_out/functional_annotation
emapper.py -i test/genes/all_genes_rep_seq.fasta --output test_out/functional_annotation -m diamond --cpu 38
```

The eggNOG output can be merged with clusters.tsv and further processed to obtain functional annotations at the MAG, cluster or sample level with the following command:
```sh
(echo -e '#sample\tMAG_cluster\tMAG\tfunction'; join -1 1 -2 1 <(awk '{print $2 "\t" $1}' clusters.tsv | sort) <(tail -n +6 annotations.tsv | head -n -3 | cut -f1,15 | grep -v '\-$' | sed 's/_[[:digit:]]\+\t/\t/' | sed 's/,/\n/g' | perl -lane '{$q = $F[0] if $#F > 0; unshift(@F, $q) if $#F == 0}; print "$F[0]\t$F[1]"' | sed 's/\tko:/\t/' | sort) | awk '{print $2 "\t" $2 "\t" $3}' | sed 's/_/\t/' | sort -k1,1 -k2,2n) > MAGfunctions.tsv
```
Here, column 15 of the eggNOG-mapper output (KEGG orthologs) was selected. Cutting a different column, e.g. column 13, would yield GO terms instead. Refer to the header of the eggNOG-mapper output for the other available functional annotations, e.g. KEGG pathways, Pfam, CAZy and COGs.
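
As a rough illustration of the column selection described above, the following Python sketch picks an annotation column by header name rather than by position. It assumes the eggNOG-mapper table is tab-separated, marks comment lines with `##`, and has a header line whose first field is `#query`. The column names `KEGG_ko` and `GOs`, the toy data and the function `extract_terms` are assumptions for the example and should be checked against the header of your own output.

```python
def extract_terms(lines, column="KEGG_ko", prefix="ko:"):
    """Yield (query, term) pairs from an eggNOG-mapper-style annotation table."""
    idx = None
    for line in lines:
        line = line.rstrip("\n")
        if not line or line.startswith("##"):
            continue  # skip blank lines and '##' comment lines
        fields = line.split("\t")
        if idx is None:
            # First non-comment line is taken to be the header (assumption).
            header = [name.lstrip("#") for name in fields]
            idx = header.index(column)
            continue
        value = fields[idx]
        if value in ("", "-"):
            continue  # '-' marks a missing annotation
        for term in value.split(","):
            # Strip the prefix (e.g. 'ko:') when present.
            if prefix and term.startswith(prefix):
                term = term[len(prefix):]
            yield fields[0], term


# Toy table, invented for the example.
example = [
    "## toy comment line",
    "#query\tGOs\tKEGG_ko",
    "contigA_1\tGO:0008150\tko:K00001,ko:K00002",
    "contigA_2\t-\t-",
]
print(list(extract_terms(example)))
print(list(extract_terms(example, column="GOs", prefix="")))
```

Selecting by name avoids depending on a fixed column number, which may differ between eggNOG-mapper versions.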


## MAGinator workflow

This is what MAGinator does with your input (if you want to see all parameters run maginator --help):
31 changes: 0 additions & 31 deletions conda_build/meta.yaml

This file was deleted.

5 changes: 5 additions & 0 deletions maginator/recommended_workflow/envs/checkm.yaml
@@ -0,0 +1,5 @@
name: checkm-genome
channels:
- bioconda
dependencies:
- checkm-genome
4 changes: 4 additions & 0 deletions maginator/recommended_workflow/envs/import_hg_19.R
@@ -0,0 +1,4 @@
library(BSgenome.Hsapiens.UCSC.hg19.masked)
genome <- BSgenome.Hsapiens.UCSC.hg19
out_file <- file.path(snakemake@output[["hg19"]])
export(genome, out_file)
5 changes: 5 additions & 0 deletions maginator/recommended_workflow/envs/metabat2.yaml
@@ -0,0 +1,5 @@
name: metabat2
channels:
- bioconda/label/cf201901
dependencies:
- metabat2
13 changes: 13 additions & 0 deletions maginator/recommended_workflow/envs/preprocess.yaml
@@ -0,0 +1,13 @@
channels:
- bioconda
- conda-forge
- r
dependencies:
- biopython=1.79
- pandas=1.4
- bbmap=38.96
- sickle-trim=1.33
- spades=3.15.5
- samtools=1.10
- bwa-mem2=2.2.1
- bioconductor-bsgenome.hsapiens.ucsc.hg19.masked=1.3.993
5 changes: 5 additions & 0 deletions maginator/recommended_workflow/envs/samtools.yaml
@@ -0,0 +1,5 @@
name: samtools
channels:
- bioconda
dependencies:
- samtools
14 changes: 14 additions & 0 deletions maginator/recommended_workflow/envs/vamb.yaml
@@ -0,0 +1,14 @@
name: vamb
channels:
- pytorch
- conda-forge
- bioconda
dependencies:
- pytorch
- pip
- torchvision
- cudatoolkit=10.2
- pysam
- numpy=1.20
- pip:
- git+https://github.com/RasmussenLab/vamb@v3.0.8
2 changes: 1 addition & 1 deletion maginator/workflow/envs/phylo.yaml
@@ -1,6 +1,6 @@
channels:
- bioconda
- conda-forge
- bioconda
- biobuilds
dependencies:
- biopython=1.79
6 changes: 5 additions & 1 deletion package.sh
@@ -1,4 +1,8 @@
# New version
## 1) Update version in setup.py and commit and push
## 2) Pull request of dev into main
## 3) Make release on GitHub
## 4) Run this code:
rm -r maginator.egg-info/ dist/ build/
python setup.py sdist
python setup.py install
twine upload dist/*
2 changes: 1 addition & 1 deletion setup.py
@@ -5,7 +5,7 @@

setuptools.setup(
name="maginator",
version="0.1.17",
version="0.1.18",
author="Jakob Russel & Trine Zachariasen",
author_email="russel2620@gmail.com,trine_zachariasen@hotmail.com",
description="MAGinator: Abundance, strain, and functional profiling of MAGs",