GitHub - maxfarrell/eDNAcamtrap: Data & Code Supplement for "Environmental DNA as a management tool for tracking artificial waterhole use in savanna ecosystems"

Data & Code Supplement for "Environmental DNA as a management tool for tracking artificial waterhole use in savanna ecosystems"

Reference Libraries (refLib folder)

Gathering barcoding reference sequences for the Kruger National Park (also available as a stand-alone git repository).

Scripts

Two scripts gather sequences from GenBank and BOLD; they can be modified to download reference sequences for other marker genes. "GB_BOLD_seq_download.R" takes a list of Latin binomials and downloads sequences using the rentrez and bold R packages. "CO1_from_GB_mito_genomes.R" follows the same process, but uses rentrez and modified scripts from the PrimerMiner R package to download whole mitochondrial genomes and extract COI sequences.

After downloading the sequences, "format_GB_BOLD_refLib.sh", "generate_taxonomy.R", and "format_refLib_dada_2.R" are used to clean up the downloaded FASTA files, generate the taxonomy file, and format the library for use with dada2's built-in RDP classifier.
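As a rough illustration of the final formatting step, the sketch below rewrites FASTA headers into the semicolon-delimited lineage form that dada2's assignTaxonomy reads (for example ">Kingdom;Phylum;Class;Order;Family;Genus;"). It is not the repository's "format_refLib_dada_2.R": the function name reformat_headers, the two-column taxonomy table, and the demo records are all hypothetical.

```shell
#!/bin/sh
# Sketch only: rewrite FASTA headers as semicolon-delimited lineages.
# reformat_headers, the table layout, and the demo records are hypothetical
# and are not taken from the repository's scripts.

# Usage: reformat_headers TAXONOMY_TSV FASTA
#   TAXONOMY_TSV: sequence ID <TAB> lineage (ranks separated by ";")
#   FASTA:        headers of the form ">ID optional description"
reformat_headers() {
  awk -F '\t' '
    NR == FNR { lineage[$1] = $2; next }   # first file: store lineage by ID
    /^>/ {
      split(substr($0, 2), parts, " ")     # ID is the first word of the header
      if (parts[1] in lineage) print ">" lineage[parts[1]] ";"
      else print                           # leave unmatched headers unchanged
      next
    }
    { print }                              # sequence lines pass through
  ' "$1" "$2"
}

# Demo with made-up records
demo_dir=$(mktemp -d)
printf 'seq1\tAnimalia;Chordata;Mammalia;Artiodactyla;Bovidae;Syncerus\n' > "$demo_dir/taxonomy.tsv"
printf '>seq1 example record\nACGTACGT\n>seq2 no lineage\nTTGA\n' > "$demo_dir/seqs.fasta"
reformat_headers "$demo_dir/taxonomy.tsv" "$demo_dir/seqs.fasta"
```

Headers without a matching lineage are passed through unchanged here so they are easy to spot; a real pipeline might instead drop or log them.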

Beyond the custom library, additional scripts are included to format the MIDORI and terrimporter COI reference databases for use with dada2 (these require downloading the source databases).

Data

Contains downloaded FASTA files, whole mitochondrial genomes, and species lists generated by the Kruger National Park.

Output

The "output" folder contains intermediate files, plus the final dada2-formatted reference sequences:

Kingdom to Genus: "Kruger_Vertebrates_refLib_dada2.fasta"
Species: "Kruger_Vertebrates_refLib_dada2_species.fasta"
Phylum to Species: "Kruger_Vertebrates_refLib_dada2_phy2species.fasta"

Data processing, merger, and analyses

Data

Analyses can be reproduced with the data files included here. To reproduce the DADA2 pipelines, the raw sequence reads are archived in the NCBI Sequence Read Archive:

BioProject PRJNA490450, accession numbers SRR7822814 to SRR7822901
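For fetching the reads, the accession range above can be expanded into an explicit list. The sketch below assumes the range is contiguous, which the text implies but does not state. The function name sra_accessions is hypothetical, and the commented-out download loop assumes the SRA Toolkit's fasterq-dump is installed.

```shell
#!/bin/sh
# Sketch only: expand SRR7822814..SRR7822901 into one accession per line.
# Assumes the range is contiguous; sra_accessions is a hypothetical name.

sra_accessions() {
  i=7822814
  while [ "$i" -le 7822901 ]; do
    echo "SRR$i"
    i=$((i + 1))
  done
}

# Count the accessions in the range
sra_accessions | wc -l

# To download (assumes the SRA Toolkit is installed and on PATH):
# sra_accessions | while read -r acc; do fasterq-dump "$acc"; done
```

A plain counter loop is used instead of seq because some seq implementations print large integers in exponent notation.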

The "data" folder contains the raw camera trap annotations, field notes, mammal phylogeny, mammal trait data, and the final merged data file ("merged_eDNA_camtrap_data_nov20_2019.RData", created by scripts/merging_data.R).

Scripts

Raw sequences are first separated by primer set ("separate_coi_by_primer.sh"). Each primer set is then run through its own DADA2 pipeline, which performs quality filtering, denoising, chimera removal, ASV calling, and taxonomy assignment (dada2_*.R files). The sequence tables from the separate pipelines are merged with "tax_assign_dada2.R".

"merging_data.R" merges the eDNA sequence tables, camera trap, and water sample data into an RData object for subsequent analyses ("merged_eDNA_camtrap_data_nov20_2019.RData").

Statistical analyses, figures, and tables for most analyses are produced with "analyses.R". The exception is the hierarchical models, which are fit with "stan_models.R".
