Developed by Nat Forsdick, 2021. This project is led and funded by Manaaki Whenua - Landcare Research.
This repo contains scripts used to analyse single- and paired-end genotyping-by-sequencing (GBS) data from giant wētā species, including Deinacrida heteracantha, D. fallai, and D. mahoenui.
This work is associated with Forsdick et al. (in progress), "Population genomic analysis of Mahoenui giant wētā (Deinacrida mahoenui) reveals no reduction in genomic diversity following translocation", which focusses on D. mahoenui and uses a reference genome from D. fallai.
Scripts were originally run on the NeSI platform via the SLURM workload manager, except for the R scripts, which were run locally.
The workflow moves through demultiplexing, quality control, and mapping, then processes the data through the Stacks ref_map and populations pipelines. Data are then output in formats suitable for analysis with STRUCTURE and with R genetics packages such as adegenet and SNPRelate.
- Stacks v2.41
- TrimGalore v0.6.4
- SAMtools v1.9
- VCFtools v
- PLINK v
- Structure v2.3.4
- R v4.3.1
- stacks_process_radtags.sl - Demultiplex raw paired-end GBS data with Stacks process_radtags
- run_trimgalore_B2.sl - Trim reads and remove adapters
- run_bowtie2_index.sl - Index reference genome
- 02_bowtie_B2.sl - Map reads for each individual to the reference genome and collect mapping statistics
- 03_ref_map.sl - Run Stacks ref_map.pl
- 04_stacks_populations_B2.sl - Call and filter variants, allowing either 30% or 0% missing data, collect preliminary statistics, and output as VCF and PLINK
- 05_vcf2adegenet.sl - Convert VCF to PLINK format, as an intermediate for conversion to the other formats used in downstream processing
- Analysis of final SNP sets in R
- 06_structure.sl - Analysis of final SNP sets with STRUCTURE
- Visualisation of combined STRUCTURE outputs in R with
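
The SLURM scripts above are listed in the order they are run. As a rough sketch only (not part of the original workflow, which may well have been submitted one step at a time), the steps could be chained with SLURM job dependencies so that each script starts only once the previous one has finished successfully. The `submit_chain` function and the `DRY_RUN` variable are hypothetical names introduced for illustration; the script is assumed to be run from the directory containing the `.sl` files. With `DRY_RUN=1` (the default here) the `sbatch` commands are printed rather than submitted.

```shell
#!/usr/bin/env bash
# Sketch only: chain the pipeline scripts with SLURM job dependencies.
# Assumption: each script can run unattended from the current directory.

# DRY_RUN=1 (default) prints the sbatch commands instead of submitting them.
DRY_RUN="${DRY_RUN:-1}"

# SLURM scripts in the order listed in this README.
PIPELINE=(
  stacks_process_radtags.sl
  run_trimgalore_B2.sl
  run_bowtie2_index.sl
  02_bowtie_B2.sl
  03_ref_map.sl
  04_stacks_populations_B2.sl
  05_vcf2adegenet.sl
  06_structure.sl
)

# Hypothetical helper: submit each script so that it waits for the previous one.
submit_chain() {
  local prev="" script jobid n=0
  for script in "$@"; do
    local cmd=(sbatch --parsable)
    # Start this step only after the previous job completed successfully.
    if [ -n "$prev" ]; then
      cmd+=(--dependency="afterok:${prev}")
    fi
    cmd+=("$script")
    if [ "$DRY_RUN" = "1" ]; then
      n=$((n + 1))
      jobid="DRYRUN${n}"       # placeholder ID, not a real SLURM job
      echo "${cmd[*]}"
    else
      jobid="$("${cmd[@]}")" || return 1
      jobid="${jobid%%;*}"     # --parsable may append ";cluster" to the job ID
    fi
    prev="$jobid"
  done
}

submit_chain "${PIPELINE[@]}"
```

Setting `DRY_RUN=0` would submit the jobs for real. The chain is strictly linear for simplicity, even though indexing the reference genome does not depend on read trimming; the R analysis and visualisation steps are not included because they were run locally.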