This repository contains the functionality to create the Global Register of Introduced and Invasive Species - Belgium and standardize it to a Darwin Core checklist that can be harvested by GBIF.
This unified checklist is the result of two open and reproducible data pipelines developed for the TrIAS project (http://trias-project.be). In the data publication pipeline, we use the Checklist recipe to standardize and publish a selection of authoritative species checklists as Darwin Core Archives to GBIF. Most of these checklists record the presence of alien species in Belgium for a specific taxon group or habitat and are maintained by their respective authors. In the data processing pipeline, we extract all Belgian non-native taxa from these checklists and unify their taxonomy using the GBIF Backbone Taxonomy. This automated process is implemented and documented at https://trias-project.github.io/unified-checklist/.
The sources used for the unified checklist are:
- Manual of the Alien Plants of Belgium (Verloove et al. 2018)
- Checklist of alien birds of Belgium (Preda et al. 2019)
- Checklist of non-native freshwater fishes in Flanders, Belgium (Verreycken et al. 2018)
- Checklist of alien herpetofauna of Belgium (van Doorn et al. 2021)
- Inventory of alien macroinvertebrates in Flanders, Belgium (Boets et al. 2018)
- Registry of introduced terrestrial molluscs in Belgium (Backeljau et al. 2019)
- Checklist of alien species in the Scheldt estuary in Flanders, Belgium (Soors et al. 2021)
- Catalogue of the Rust Fungi of Belgium (Vanderweyen et al. 2018)
- WRiMS: World Register of Introduced Marine Species (Rius et al. 2023)
- Waarnemingen.be / observations.be - List of species observed in Belgium (Swinnen et al. 2022)
- Ad hoc checklist of alien species in Belgium (Reyserhove et al. 2018)
- RINSE - Pathways and vectors of biological invasions in Northwest Europe (Zieritz et al. 2018)
See https://trias-project.github.io/unified-checklist/
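As an illustration of the unification step, the sketch below groups taxa from different checklists by a shared backbone key, so that names used by different sources for the same taxon collapse into one record. It is a toy example in base R, not the code of the pipeline: the column names (`checklist`, `scientificName`, `backboneKey`), the taxon names and the key values are all hypothetical placeholders.

```r
# Toy input: taxa as they might appear in two source checklists.
# Column names, names and keys are made up for illustration only.
source_taxa <- data.frame(
  checklist = c("plants", "ad hoc", "plants"),
  scientificName = c("Name A", "Synonym of name A", "Name B"),
  backboneKey = c(101L, 101L, 202L),
  stringsAsFactors = FALSE
)

unify_taxa <- function(taxa) {
  # One row per backbone key, listing every checklist that mentions the taxon
  unified <- aggregate(
    checklist ~ backboneKey,
    data = taxa,
    FUN = function(x) paste(sort(unique(x)), collapse = " | ")
  )
  names(unified)[names(unified) == "checklist"] <- "sourceChecklists"
  unified
}

unified <- unify_taxa(source_taxa)
print(unified)
```

In this sketch the two records sharing key 101 end up as a single row that lists both checklists as sources.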
The repository structure is based on Cookiecutter Data Science and the Checklist recipe. Files and directories indicated with GENERATED should not be edited manually.
├── README.md                   : Description of this repository
├── LICENSE                     : Repository license
├── unified-checklist.Rproj     : RStudio project file
├── .gitignore                  : Files and directories to be ignored by git
│
├── data
│   ├── raw                     : Source data as downloaded from GBIF GENERATED
│   ├── interim                 : Unified data GENERATED
│   └── processed               : Darwin Core output of mapping script GENERATED
│
├── references
│   └── verification.tsv        : Verification file (for synonyms). Generated by
│                                 3_verify_taxa.Rmd and then manually annotated
│
├── docs                        : Repository website GENERATED
│
├── index.Rmd                   : Website homepage
├── _bookdown.yml               : Settings to build website in docs/
│
└── src
    ├── 1_get_taxa.Rmd          : Script to get taxa from checklists
    ├── 2_get_information.Rmd   : Script to get related information
    ├── 3_verify_taxa.Rmd       : Script to verify taxa
    ├── 4_unify_taxa.Rmd        : Script to unify taxa
    ├── 5_unify_information.Rmd : Script to unify related information
    ├── 6_dwc_mapping.Rmd       : Script to map to Darwin Core
    └── 7_griis_mapping.Rmd     : Script to map to create Excel file for GRIIS
- Clone this repository to your computer
- Open the RStudio project file
- Open the index.Rmd R Markdown file in RStudio
- Install any required packages
- Click Build > Build Book to generate the processed data and build the website in docs/
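For a build from the R console instead of the RStudio menu, the following sketch should be roughly equivalent to Build > Build Book. It assumes the `bookdown` package is installed and that `repo_dir` points at the repository root; `build_site()` is a hypothetical helper name, not part of this repository.

```r
# Hypothetical helper: render the book defined by index.Rmd and _bookdown.yml
build_site <- function(repo_dir = ".") {
  index_file <- file.path(repo_dir, "index.Rmd")
  if (!file.exists(index_file)) {
    stop("index.Rmd not found in ", repo_dir)
  }
  if (!requireNamespace("bookdown", quietly = TRUE)) {
    stop("Package 'bookdown' is required: install.packages(\"bookdown\")")
  }
  # render_book() reads _bookdown.yml from the working directory
  old_wd <- setwd(repo_dir)
  on.exit(setwd(old_wd), add = TRUE)
  bookdown::render_book("index.Rmd")
}

# Usage, from the repository root:
# build_site()
```

The file check comes first so that a wrong working directory fails fast, before any script in the pipeline is run.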
To publish an update of the dataset:
- Open the resource in the IPT (login required)
- Source data: upload the newly generated data files from data/processed
- Darwin Core mappings: does not require updates, unless terms were added/removed in the pipeline
- Metadata: does not require updates, except for:
  - Basic metadata: in description, check if number of taxa (2,500+) still applies
  - Taxonomic coverage: in description, update numbers per kingdom based on new data
  - Temporal coverage: update End date if need be
- Publish: click Publish, add a short description and publish
- Check if dataset is updated at GBIF (can take a couple of hours)
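To check the numbers mentioned under Basic metadata and Taxonomic coverage, a count like the one below may help. It is a minimal sketch that assumes the processed taxon file has one row per taxon and a Darwin Core `kingdom` column; the file name `data/processed/taxon.csv` and the helper `count_taxa()` are assumptions, not confirmed by this README.

```r
# Hypothetical helper: total number of taxa and number of taxa per kingdom
count_taxa <- function(taxa) {
  per_kingdom <- as.data.frame(
    table(kingdom = taxa$kingdom),
    responseName = "n_taxa",
    stringsAsFactors = FALSE
  )
  list(total = nrow(taxa), per_kingdom = per_kingdom)
}

# Toy data standing in for the processed taxon file
toy_taxa <- data.frame(
  taxonID = c("t1", "t2", "t3"),
  kingdom = c("Plantae", "Animalia", "Plantae"),
  stringsAsFactors = FALSE
)
counts <- count_taxa(toy_taxa)
print(counts$per_kingdom)

# With real data (assumed file name):
# taxa <- read.csv("data/processed/taxon.csv", stringsAsFactors = FALSE)
# count_taxa(taxa)
```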
MIT License for the code and documentation in this repository. The included data is released under another license.