Update README.md
ursky authored Aug 2, 2017
1 parent c1fdf63 commit 2498a14
Showing 1 changed file with 47 additions and 45 deletions.
92 changes: 47 additions & 45 deletions README.md
@@ -1,4 +1,4 @@
## Introducing metaWRAP v0.1 - Comprehensive Metagenome Analysis for Beginners
## Introducing metaWRAP v0.2 - Comprehensive Metagenome Analysis for Beginners


metaWRAP aims to be an easy-to-use inclusive wrapper program that accomplishes the most basic tasks in metagenomic analysis: QC, assembly, binning, visualization, and taxonomic profiling. While there is no single best approach for processing metagenomic data, metaWRAP is meant to be a fast and simple first pass program before you delve deeper into parameterization of your approach.
@@ -21,57 +21,15 @@

## INSTALLATION

Clone or download the metaWRAP directory into a semi-permanent location, then go into the ./bin folder and edit the ./bin/contig-metawrap file. Make sure that all the paths are correct, especially the paths pointing to the "scripts" and "pipelines" folders in the main metaWRAP directory. Once that is configured, either copy the contents of ./bin into your local bin folder or add that folder to your PATH. If you're unsure how to do this, here are the commands:
Clone or download the metaWRAP directory into a semi-permanent location, then go into the metaWRAP/bin folder and edit the metaWRAP/bin/contig-metawrap file. Make sure that all the paths are correct, especially the paths pointing to the "scripts" and "pipelines" folders in the main metaWRAP directory. Once that is configured, either copy the contents of metaWRAP/bin into your local bin folder or add that folder to your PATH. If you're unsure how to do the latter, here are the commands:

```
echo 'export PATH="/full/path/to/metaWRAP/bin:$PATH"' >> ~/.bash_profile_
echo 'export PATH="/full/path/to/metaWRAP/bin:$PATH"' >> ~/.bash_profile
source ~/.bash_profile
```

Now try running metaWRAP -h to see if everything works!
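
If the command is not found, it can help to confirm that the bin folder really ended up in your PATH. The snippet below is a minimal sketch, not part of metaWRAP: `on_path` is a hypothetical helper name, and /full/path/to/metaWRAP/bin is the same placeholder used above.

```shell
# Hypothetical helper (not part of metaWRAP): succeeds if directory $1
# is an entry in the colon-separated list $2 (defaults to $PATH).
on_path() {
    case ":${2-$PATH}:" in
        *":$1:"*) return 0 ;;
        *) return 1 ;;
    esac
}

# Placeholder path; replace it with your actual metaWRAP location.
if on_path "/full/path/to/metaWRAP/bin"; then
    echo "metaWRAP/bin is on your PATH"
else
    echo "metaWRAP/bin is NOT on your PATH; re-check ~/.bash_profile"
fi
```

Wrapping the list in colons before matching avoids false positives from entries that merely start with the same prefix.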

## DEPENDENCIES

Since this is a wrapper program, the biggest challenge in installing metaWRAP will likely be configuring all the dependencies correctly. Firstly, the path to the folder "meta-scripts", containing numerous scripts required for running this pipeline, needs to be configured in config.sh. Next, the following programs need to be installed in your PATH. NOTE: the versions of the programs may or may not be important.

NOTE: You do not need to install all of these; which ones you need depends on the modules you plan to use. For example, if you don't want to sort out human reads, you don't have to install bmtagger and its database. Just make sure to use the --skip-bmtagger flag when running the read_qc module.

| Software | Tested version | Used in module |
|:---------------:|:---------------:|:---------------------:|
| BLAST | v=2.6.0 | blobology |
| bmtagger | v=3.101 | read_qc |
| Bowtie2 | v=2.3.2 | blobology |
| bwa | v=0.7.15 | binning |
| Checkm | v=1.0.7 | binning |
| FastQC | v=0.11.5 | read_qc |
| kraken | v=0.10.6 | kraken |
| kronatools | v=2.7 | kraken |
| megahit | v=1.1.1-2 | assembly |
| metabat2 | v=2.9.1 | binning |
| perl | v=5.22.0 | blobology |
| python | v=2.7.1 | all modules |
| quast | v=4.5 | assembly |
| R | v=3.3.2 | blobology |
| samtools | v=1.3.1 | assembly, blobology |
| SPAdes | v=3.10.1 | assembly |
| trim_galore | v=0.4.3 | read_qc |

The installation of most of these dependencies should not be difficult even in non-sudo environments with the use of conda. Future versions of metaWRAP are expected to have more detailed installation instructions, but for now you are on your own.


## DATABASES

Finally, you will need to download several databases and configure their paths in the config.sh file. This may be the longest step of the installation. Here is a full list of the databases:

| Database | Size | Source |
|:---------------:|:---------------:|:-----:|
|Checkm_DB |1.4GB| CheckM should prompt you to download this during first use |
|KRAKEN standard database|161GB | look at the official KRAKEN support website for download instructions |
|RefSeq NCBI_nt |71GB | look at the config.sh for download instructions |
|RefSeq NCBI_tax |283MB | look at the config.sh for download instructions |
|Indexed hg38 | 20GB | look at the bmtagger manual for instructions |


## USAGE

Once all the dependencies are in place, running metaWRAP is relatively simple. The main metaWRAP script wraps around all of its individual modules, which you can call independently.
@@ -144,6 +102,50 @@ metaWRAP binning -t 48 -m 500 --checkm-best-bins --checkm-good-bins -a coassembl
```



## DEPENDENCIES

Since this is a wrapper program, the biggest challenge in installing metaWRAP will likely be configuring all the dependencies correctly. Firstly, the path to the folder "meta-scripts", containing numerous scripts required for running this pipeline, needs to be configured in config.sh. Next, the following programs need to be installed in your PATH. NOTE: the versions of the programs may or may not be important.

NOTE: You do not need to install all of these; which ones you need depends on the modules you plan to use. For example, if you don't want to sort out human reads, you don't have to install bmtagger and its database. Just make sure to use the --skip-bmtagger flag when running the read_qc module.

| Software | Tested version | Used in module |
|:---------------:|:---------------:|:---------------------:|
| BLAST | v=2.6.0 | blobology |
| bmtagger | v=3.101 | read_qc |
| Bowtie2 | v=2.3.2 | blobology |
| bwa | v=0.7.15 | binning |
| Checkm | v=1.0.7 | binning |
| FastQC | v=0.11.5 | read_qc |
| kraken | v=0.10.6 | kraken |
| kronatools | v=2.7 | kraken |
| megahit | v=1.1.1-2 | assembly |
| metabat2 | v=2.9.1 | binning |
| perl | v=5.22.0 | blobology |
| python | v=2.7.1 | all modules |
| quast | v=4.5 | assembly |
| R | v=3.3.2 | blobology |
| samtools | v=1.3.1 | assembly, blobology |
| SPAdes | v=3.10.1 | assembly |
| trim_galore | v=0.4.3 | read_qc |

The installation of most of these dependencies should not be difficult even in non-sudo environments with the use of conda. Future versions of metaWRAP are expected to have more detailed installation instructions, but for now you are on your own.
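
Until those instructions exist, a quick way to see which dependencies are still missing is to ask the shell whether each executable can be found. This is only a sketch: `missing_tools` is a hypothetical helper, and the executable names passed to it are assumptions about how these packages are typically installed (for example `blastn` for BLAST and `ktImportText` for kronatools), so adjust them to your setup.

```shell
# Hypothetical helper (not part of metaWRAP): prints each command in "$@"
# that cannot be found on PATH, and returns non-zero if any are missing.
missing_tools() {
    status=0
    for tool in "$@"; do
        if ! command -v "$tool" >/dev/null 2>&1; then
            echo "$tool"
            status=1
        fi
    done
    return "$status"
}

# Assumed executable names for the software in the table above.
missing_tools blastn bmtagger.sh bowtie2 bwa checkm fastqc kraken \
    ktImportText megahit metabat2 perl python quast.py Rscript \
    samtools spades.py trim_galore \
    || echo "The tools listed above were not found on your PATH"
```

Tools that belong to modules you do not plan to run can be ignored. This only checks that a command exists, not that its version matches the tested one.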


## DATABASES

Finally, you will need to download several databases and configure their paths in the config.sh file. This may be the longest step of the installation. However, you may skip databases that are not required for the modules/options you want to use. Here is a full list of the databases:

| Database | Size | Source |
|:---------------:|:---------------:|:-----:|
|Checkm_DB |1.4GB| CheckM should prompt you to download this during first use |
|KRAKEN standard database|161GB | look at the official KRAKEN support website for download instructions |
|RefSeq NCBI_nt |71GB | look at the config.sh for download instructions |
|RefSeq NCBI_tax |283MB | look at the config.sh for download instructions |
|Indexed hg38 | 20GB | look at the bmtagger manual for instructions |
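
The sizes in the table add up to roughly 254GB (1.4 + 161 + 71 + 0.283 + 20 = 253.683), so it is worth checking free disk space before downloading everything. The following is a rough sketch that assumes all databases go on one filesystem; `free_gb` and the `DB_DIR` variable are hypothetical names used only for illustration, not metaWRAP settings.

```shell
# Hypothetical helper: print the free space, in whole GB, on the
# filesystem that holds directory $1.
free_gb() {
    # -P forces one output line per filesystem; -k reports 1024-byte blocks.
    df -Pk "$1" | awk 'NR == 2 { print int($4 / 1024 / 1024) }'
}

# Sum of the database sizes listed above, rounded up to whole GB.
required_gb=254

# DB_DIR is an illustrative variable; it defaults to the current directory.
db_dir="${DB_DIR:-.}"
available_gb=$(free_gb "$db_dir")

if [ "$available_gb" -ge "$required_gb" ]; then
    echo "Enough space for all databases: ${available_gb}GB free"
else
    echo "Only ${available_gb}GB free; all databases need about ${required_gb}GB"
fi
```

If you skip some databases, lower `required_gb` accordingly.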



### System requirements
The resource requirements for this pipeline will vary greatly based on the number of reads you are processing, but I would advise against attempting to run it on anything less than 10 cores and 100GB RAM. With the help of conda, installing metaWRAP and its dependencies on a cluster (even without sudo privileges) should be relatively easy even for beginners.
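
To compare a machine against the suggested minimum of 10 cores and 100GB RAM, something like the following can be used. It is a sketch that assumes a Linux system (memory is read from /proc/meminfo), and `meets_minimum` is a hypothetical helper, not a metaWRAP command.

```shell
# Hypothetical helper: succeeds if the core count ($1) and the memory in
# GB ($2) both meet the suggested minimum of 10 cores and 100GB RAM.
meets_minimum() {
    [ "$1" -ge 10 ] && [ "$2" -ge 100 ]
}

# Number of online cores; falls back to 1 if getconf cannot report it.
cores=$(getconf _NPROCESSORS_ONLN 2>/dev/null || echo 1)

# MemTotal is reported in kB on Linux; convert to whole GB.
# Falls back to 0 if /proc/meminfo is not available.
mem_gb=$(awk '/^MemTotal:/ { print int($2 / 1024 / 1024) }' /proc/meminfo 2>/dev/null || echo 0)

if meets_minimum "$cores" "$mem_gb"; then
    echo "OK: $cores cores, ${mem_gb}GB RAM"
else
    echo "Below the suggested minimum: $cores cores, ${mem_gb}GB RAM"
fi
```

The reported memory is usually a little below the nominal installed amount, so a machine sold as 100GB may show up slightly under the threshold.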
