An end-to-end script to convert Illumina shotgun sequences and metadata into full-blown diversity tables and visualizations. Of course, it's focused on the rumen and dam/calf relationships, but is widely applicable to other systems.
Written entirely during Spring Semester 2019 for work done in Dr. Hannah Cunningham-Hollinger's lab at the University of Wyoming, computed on UW's ARCC High-performance servers and presented as a poster at the Western Section American Association of Animal Science annual meeting.
You will need access to the following commands/programs:
- metaxa2, metaxa2_ttt, metaxa2_dc (Metaxa2)
- Rscript (R)
- source activate (Miniconda)
- qiime, biom (install within a conda environment named qiime2)
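A quick way to confirm the requirements above is to ask the shell which of the listed commands it cannot find. This is only a minimal sketch: the function name missing_commands is made up for illustration, and qiime and biom will be reported as missing unless the qiime2 conda environment is already active.

```shell
#!/usr/bin/env bash
# Print every command name passed as an argument that is NOT found on PATH.
# (missing_commands is a hypothetical helper, not part of this repository.)
missing_commands() {
  local cmd
  for cmd in "$@"; do
    command -v "$cmd" >/dev/null 2>&1 || printf '%s\n' "$cmd"
  done
}

# Commands named in the requirements list. "source" is a shell builtin,
# so it is not checked here.
missing_commands metaxa2 metaxa2_ttt metaxa2_dc Rscript qiime biom
```

No output means everything was found. On a cluster that uses Lmod, the relevant modules may need to be loaded first.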
If working on an HPC, contact your department to find out how to get access to these commands.
Clone the script files
git clone https://github.com/MillironX/cowcalf-rumen-metagenomic-pipeline.git
Create a directory with all forward- and reverse-read files in it, named as <SAMPLEID>_R1_001.fastq.gz for forward reads and <SAMPLEID>_R2_001.fastq.gz for reverse reads. Add a QIIME2-compatible metadata file named metadata.tsv, and copy all of the code files into it. It should look like
.
├── sample1_R1_001.fastq.gz
├── sample1_R2_001.fastq.gz
├── sample2_R1_001.fastq.gz
├── sample2_R2_001.fastq.gz
├── ...
├── sampleN_R1_001.fastq.gz
├── sampleN_R2_001.fastq.gz
├── metadata.tsv
├── main.sh
├── fastq-to-taxonomy.sh
├── manipulatefeaturetable.R
├── fetchmetadata.R
├── sample-classifier.sh
└── sample-regression.sh
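Before submitting a job, it can help to verify that the directory matches the layout above. The following is a sketch, not part of the pipeline: check_layout is a hypothetical helper that only looks for metadata.tsv and for reads whose R1/R2 mate is absent, using the file-naming pattern described above.

```shell
#!/usr/bin/env bash
# Report problems in a sample directory: a missing metadata.tsv, or a read
# file whose mate is absent. Prints one line per problem and nothing if the
# layout looks right. (check_layout is a hypothetical helper.)
check_layout() {
  local dir="${1:-.}" r1 r2
  [ -f "$dir/metadata.tsv" ] || echo "missing: metadata.tsv"
  # Every forward-read file needs a matching reverse-read file...
  for r1 in "$dir"/*_R1_001.fastq.gz; do
    [ -e "$r1" ] || continue   # the glob matched nothing
    r2="${r1%_R1_001.fastq.gz}_R2_001.fastq.gz"
    [ -f "$r2" ] || echo "missing: ${r2##*/}"
  done
  # ...and every reverse-read file needs a matching forward-read file.
  for r2 in "$dir"/*_R2_001.fastq.gz; do
    [ -e "$r2" ] || continue
    r1="${r2%_R2_001.fastq.gz}_R1_001.fastq.gz"
    [ -f "$r1" ] || echo "missing: ${r1##*/}"
  done
}

# Check the directory given as the first argument, or the current one.
check_layout "${1:-.}"
```

It does not validate the contents of metadata.tsv or check that the script files are present.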
These scripts are preconfigured for use with Slurm and Lmod. Everything is very basic, and should work on any Slurm configuration. Before use, be sure to replace the provided credentials with your own in main.sh, fastq-to-taxonomy.sh, sample-classifier.sh, and sample-regression.sh, then run
sbatch main.sh
To run without Slurm, edit main.sh and remove every call to srun (including its CLI options), replace every instance of $SLURM_NTASKS with the number of parallel threads you wish to run, and comment out every line that starts with module load. Then run
./main.sh
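The manual edits described above could also be approximated with sed. This is a rough sketch under stated assumptions, not a tested conversion of main.sh: deslurm is a hypothetical helper, and it assumes every srun option is a single token beginning with a dash (such as --ntasks=1 or -n1). Options written with a separate value (such as -n 1) would leave the value behind, so review the output by hand.

```shell
#!/usr/bin/env bash
# Filter a Slurm-oriented script from stdin to stdout:
#   1. comment out lines that start with "module load"
#   2. strip "srun" plus any dash-prefixed single-token options after it
#   3. replace $SLURM_NTASKS (or ${SLURM_NTASKS}) with a fixed thread count
# (deslurm is a hypothetical helper; the option format is an assumption.)
deslurm() {
  local threads="$1"
  sed -E \
    -e 's/^([[:space:]]*)module load/\1# module load/' \
    -e 's/(^|[[:space:]])srun([[:space:]]+-[^[:space:]]+)*[[:space:]]+/\1/g' \
    -e 's/\$\{?SLURM_NTASKS\}?/'"$threads"'/g'
}

# Example: write a 4-thread copy and leave the original untouched.
#   deslurm 4 < main.sh > main-local.sh
```

Writing to a separate file keeps the Slurm version available for comparison with diff.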
This project is finished. It is meant to be a reference and an inspiration, but nothing more. I do not intend to update the code now (as embarrassing as it might be).
- Miniconda now uses the conda activate command instead of source activate
Distributed under the MIT License. See LICENSE for more information.
Thomas A. Christensen II - @MillironX
Project Link: https://github.com/MillironX/cowcalf-rumen-metagenomic-pipeline