Release v5.0.0 (#787)
This is the first AusTraits release to be compiled using the `{traits.build}` R package, available at https://github.com/traitecoevo/traits.build. The workflow is a refined version of the R-scripted pipeline previously used to compile AusTraits and the output structure has not changed, other than on-going minor fine-tuning, as detailed below.

Taxonomy: New AusTraits-specific functions relating to taxonomy have been written that utilise the package `{APCalign}`. The first, `build_align_taxon_names`, uses `APCalign::align_taxa()` to standardise syntax, correct typos, and ensure taxon names are aligned with a name on the APC or APNI lists. The second, `build_update_taxon_list`, uses `APCalign::update_taxonomy()` to build a new `taxon_list.csv` file for the `config/` folder. This file is then used by `traits.build::dataset_update_taxonomy` to update names to their currently accepted taxon name, when possible.
Edits were made to many dataset metadata files to align with these changes.
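
The two-step flow described above (align the submitted names, then update them to accepted names through a taxon list) can be sketched in base R. This is a toy illustration only, not the `{APCalign}` or `{traits.build}` implementation: the helper `standardise_syntax()`, the column names `aligned_name` and `taxon_name`, and the example rows are all assumptions made for the sketch.

```r
# Toy sketch only: not the APCalign/traits.build implementation.
# Hypothetical helper: trim whitespace, collapse repeated spaces, and
# capitalise the first letter while lower-casing the rest of the name.
standardise_syntax <- function(x) {
  x <- gsub("\\s+", " ", trimws(x))
  paste0(toupper(substr(x, 1, 1)), tolower(substring(x, 2)))
}

# Hypothetical stand-in for config/taxon_list.csv (column names assumed).
taxon_list <- data.frame(
  aligned_name = c("Acacia aneura", "Eucalyptus maculata"),
  taxon_name   = c("Acacia aneura", "Corymbia maculata"),
  stringsAsFactors = FALSE
)

original_name <- c("  acacia   aneura", "Eucalyptus maculata", "Unknown sp.")

# Step 1: align the syntax of the submitted names.
aligned_name <- standardise_syntax(original_name)

# Step 2: look up the currently accepted name; NA where no match exists.
taxon_name <- taxon_list$taxon_name[match(aligned_name, taxon_list$aligned_name)]

result <- data.frame(original_name, aligned_name, taxon_name,
                     stringsAsFactors = FALSE)
print(result)
```

Names that return `NA` in the last step are the ones that would still need attention, for example through a manual taxonomic update.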

Changes to table structure:

- `method_id` was added, so that measurements of the same trait made using different methods can be distinguished
- the context identifiers were renamed to `method_context_id`, `temporal_context_id`, `entity_context_id`, `plot_context_id`, `treatment_context_id` to be more explicit
- `austraits_curators` became `dataset_curators`
- added `repeat_measurements_id` for trait measurements that are response curves, both to unify the repeated measurements that comprise a single "measurement" and also to capture the order of the measurements
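
A small sketch of how the new identifiers might be used. The column names `method_id` and `repeat_measurements_id` come from the list above; the other columns, the dataset name and all values are invented, and treating row order within a group as the measurement order is an assumption of this sketch rather than a documented rule.

```r
# Toy traits table; only method_id and repeat_measurements_id are taken
# from the release notes, everything else is invented for illustration.
traits <- data.frame(
  dataset_id             = "Example_2020",
  trait_name             = c("leaf_area", "leaf_area",
                             "photosynthetic_rate", "photosynthetic_rate",
                             "photosynthetic_rate"),
  value                  = c(12.1, 11.8, 4.2, 9.7, 12.3),
  method_id              = c("01", "02", "01", "01", "01"),
  repeat_measurements_id = c(NA, NA, "01", "01", "01"),
  stringsAsFactors = FALSE
)

# Traits that were measured with more than one method.
methods_per_trait <- tapply(traits$method_id, traits$trait_name,
                            function(x) length(unique(x)))
multi_method_traits <- names(methods_per_trait)[methods_per_trait > 1]

# Reassemble response curves: one vector of values per
# repeat_measurements_id (rows with NA are not part of any curve).
curves <- split(traits$value, traits$repeat_measurements_id)

print(multi_method_traits)
print(curves)
```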

No datasets have been added for this release. However, some metadata files have changed, in particular `metadata[["taxonomic_updates"]]`: duplicate or unneeded taxonomic updates were removed and standardisation of taxonomic updates continued. All `original_name` values in the `taxonomic_updates` tibble are now aligned to a specified `taxon_name`: either an informal name assigned through `metadata[["taxonomic_updates"]]` or a match to a name on the taxon_list.
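
The kind of clean-up described above can be illustrated with a toy check that removes duplicate updates and confirms every original name is accounted for. This is a sketch under assumptions: the field names `find` and `replace`, the informal-name format and all rows are hypothetical and are not copied from an AusTraits metadata file.

```r
# Hypothetical taxonomic_updates entries (field names assumed).
taxonomic_updates <- data.frame(
  find    = c("Acacia aneura var. X", "Acacia aneura var. X", "Banksia sp."),
  replace = c("Acacia aneura", "Acacia aneura", "Banksia sp. [Example_2020]"),
  stringsAsFactors = FALSE
)

# Remove exact duplicate updates.
taxonomic_updates <- unique(taxonomic_updates)

# Hypothetical names on the taxon list and in the data.
taxon_list_names <- c("Acacia aneura", "Corymbia maculata")
original_names   <- c("Acacia aneura var. X", "Banksia sp.", "Corymbia maculata")

# Apply an update where one exists, otherwise keep the name as given.
idx      <- match(original_names, taxonomic_updates$find)
resolved <- ifelse(is.na(idx), original_names, taxonomic_updates$replace[idx])

# A name is accounted for if it has an explicit update (which may assign
# an informal name) or if it matches a name on the taxon list.
accounted_for <- !is.na(idx) | resolved %in% taxon_list_names
unresolved    <- original_names[!accounted_for]

print(resolved)
print(unresolved)
```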
dfalster authored Nov 19, 2023
2 parents 4beb3d9 + 9d82348 commit 21bd155
Showing 657 changed files with 65,087 additions and 155,350 deletions.
80 changes: 75 additions & 5 deletions .github/CONTRIBUTING.md
@@ -1,3 +1,5 @@


# Contributing to austraits.build

We envision AusTraits as an on-going collaborative community resource that:
@@ -7,10 +9,78 @@ We envision AusTraits as an on-going collaborative community resource that:
3. Aspires to fully transparent and reproducible research of highest standard, and
4. Builds a sense of community among contributors and users.

We'd love for you to contribute. You can read more about the ways you can contribute on our website.
We'd love for you to contribute. You can read more about the ways you can contribute below.

- [Contributing new data](#contributing-new-data)
- [Improving data quality and reporting errors](#improving-data-quality-and-reporting-error)
- [Improving documentation](#improving-documentation)
- [Development of `traits.build` package workflow](#development-of-traitsbuild-workflow)


Please note that the AusTraits project has adopted a [Contributor Code of Conduct](CODE_OF_CONDUCT.md). By contributing to this project you agree to abide by its terms.

## Improving data quality and reporting error

All users can contribute to continual improvement by reporting issues they encounter.

If you notice a possible error in AusTraits, please [post an issue on GitHub](https://github.com/traitecoevo/austraits.build/issues). If you can, please provide code illustrating the problem.

## Improving documentation

All users can contribute to continual improvement of AusTraits documentation by letting us know which parts of the documentation were unclear.

If you have a suggestion, please [post an issue on GitHub](https://github.com/traitecoevo/austraits.build/issues).

## Development of `traits.build` workflow

AusTraits uses the `traits.build` package to harmonise different sources. Interested users can help us develop this package at the package website <https://github.com/traitecoevo/traits.build/>.

## Contributing new data {#data}

We gladly accept new data contributions to AusTraits, including recently collected trait data, legacy trait data from your file archives, transcribed reference works, and transcribed datasets from the literature.

If you would like to contribute data, the requirements are:

- Data was collected for Australian plant species growing in Australia
- You collected data on one of the traits listed in the [trait definitions table](http://traitecoevo.github.io/austraits.build/articles/trait_definitions.html)
- You are willing to release the data under an open license for reuse by the scientific community
- You make it as easy as possible for us to incorporate your data by following the instructions.

### What do I need to do?

The AusTraits curators will merge each dataset into AusTraits. For each study we carefully check to ensure units are accurate, continuous trait values map in the expected range, categorical trait values map onto sensible terms, location data are accurate, taxon names are aligned to current standards, and all metadata are recorded.



As a first step, all we really require is a **Data Spreadsheet** and a copy of your **Manuscript**.

After completing a series of quality checks, we will send you a report to review that summarises the data and metadata. The reports include plots for each continuous trait, comparing values in your submission to those already in AusTraits. It plots your study locations (sites) on a map. It summarises your metadata and indicates the taxonomic alignments made. The report includes both targeted questions (sometimes) and automated questions, acting as prompts to review aspects of the report. Reviewing your report should not take long, and confirms the transparent, thorough process used to build AusTraits.

### Data

**Your dataset, preferably in a spreadsheet format.**

* **Traits:** Make sure the trait names used in your dataset are easy to interpret or, alternatively, provide a brief definition
* **Units:** Please make sure the units for each trait are provided as part of the trait name or in a separate spreadsheet/worksheet
* **Value type:** We prefer to incorporate raw values (or individual means) in AusTraits, but can use population or multi-site means if that is what is available. For mean values, please provide sample size.
* **Location:** For field studies, please provide location details (see more below).
* **Context:** Optional, but AusTraits can read in one (or more) column(s) with contextual information, such as canopy position, experimental manipulation, dry vs. wet season, etc.
* **Collection date:** Optional, but AusTraits can read in a column with sampling date (in any format)
* **Species/taxa:** Please provide complete species names or a look-up table to match species codes. Out-dated taxonomy is fine – we have name-matching algorithms.
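
As a rough illustration of the layout described in the list above, a minimal data spreadsheet might look like this once read into R. Every column name and value here is invented for the sketch; AusTraits does not require these particular names, only that traits, units, locations and taxa can be interpreted.

```r
# Invented example of a contributed data spreadsheet; none of these
# column names are mandated by AusTraits.
trait_data <- data.frame(
  species                = c("Acacia aneura", "Acacia aneura", "Corymbia maculata"),
  site                   = c("Site A", "Site B", "Site A"),
  collection_date        = c("2020-10-14", "2020-11-02", "2020-10-15"),
  canopy_position        = c("sun", "shade", "sun"),    # optional context column
  leaf_area_mm2          = c(152.3, 171.8, 2310.0),     # units given in the name
  wood_density_g_per_cm3 = c(0.85, 0.83, 0.76),
  stringsAsFactors = FALSE
)

# Quick self-checks before submitting: no missing or empty species names,
# and every trait column is numeric.
trait_cols <- c("leaf_area_mm2", "wood_density_g_per_cm3")
stopifnot(!anyNA(trait_data$species), all(trait_data$species != ""))
stopifnot(all(vapply(trait_data[trait_cols], is.numeric, logical(1))))

print(trait_data)
```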

### Metadata

The AusTraits structure has fields to input all metadata associated with your study, including methods, location details, and context. In detail:
* **Methods:** For published studies the necessary methods and study information can be extracted from a publication; just attach a copy of the manuscript or the DOI.
  - The only commonly missing information is the general sampling period, such as ‘October-December 2020’; this is only required if your data file doesn't have a date column.
  - For unpublished studies, provide brief methods for how each trait was measured; you can simply refer to a standard published protocol
* **Study locations:** Whenever possible, AusTraits includes location names, location coordinates (latitude/longitude), and any other location properties you have measured/recorded (vegetation description, soil chemistry, climate data, etc.). This information can be provided as a second spreadsheet or as additional columns in the main data spreadsheet. Just make sure the location name is the same in both spreadsheets.
* **Context:** If your study includes contextual variables, make sure the context values are included as columns in the data spreadsheet. Also, please make sure the contextual values are self-explanatory or provide the necessary explanation.
* **Authors:** Authorship is extended to anyone who played a key intellectual role in the experimental design and data collection. Most studies have 1-3 authors. For each author, please provide a **name**, **institutional affiliation**, **email address**, and their **ORCID** (if available). Please nominate a single contributor to be the dataset's point of contact; this person's email will not be listed in the metadata file, but is the person future AusTraits users are likely to seek out if they have questions. Additional field assistants can be listed.
* **Source:** The published manuscript is generally the source. If different traits or observations from a single dataset were published separately, please provide both references. If the dataset you are submitting is a compilation from many sources, please provide a complete list of sources and indicate which rows of data are attributable to which source.
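
The request that location names be the same in both spreadsheets can be checked before submitting. A minimal sketch, assuming the data sheet and the locations sheet share a column called `site`; the column name, coordinates and rows are hypothetical.

```r
# Hypothetical data sheet and locations sheet sharing a `site` column.
data_sheet <- data.frame(
  species = c("Acacia aneura", "Acacia aneura",
              "Corymbia maculata", "Corymbia maculata"),
  site    = c("Site A", "Site B", "Site A", "site C"),
  stringsAsFactors = FALSE
)

locations <- data.frame(
  site      = c("Site A", "Site B", "Site C"),
  latitude  = c(-33.70, -33.85, -34.10),
  longitude = c(150.30, 150.95, 150.60),
  stringsAsFactors = FALSE
)

# Location names used in the data but missing from the locations sheet
# (here a capitalisation mismatch).
not_in_locations <- setdiff(unique(data_sheet$site), locations$site)

# Locations described but never used in the data.
not_in_data <- setdiff(locations$site, unique(data_sheet$site))

print(not_in_locations)
print(not_in_data)
```

Both vectors should be empty when the two spreadsheets agree.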


### Common hang-ups

### Code of Conduct
Some common issues with contributions include:

Please note that the austraits project is released with a
[Contributor Code of Conduct](CODE_OF_CONDUCT.md). By contributing to this
project you agree to abide by its terms.
* **Categorical trait values:** If you have categorical traits, please define any trait values (i.e. entries for that trait) that are not self-explanatory. A copy of our definitions file, including allowable values for each trait is available [here](http://traitecoevo.github.io/austraits.build/articles/trait_definitions.html). The definitions file is a work-in-progress and additional trait values can be added if needed to capture the exact meaning you intended.
* **Data sourced from others:** For numerical traits, AusTraits strives to only include data collected by you for this project, to avoid having multiple entries of the same measurement/observation. If you have certain trait values that were sourced from the literature, an online database, or colleagues, please indicate that clearly. If trait values for some species were collected by you and others were sourced, it is very helpful if you could add a column to your spreadsheet that indicates the source for different rows of data.
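
Both points can be checked with a few lines of R before sending a dataset. This is a sketch under assumptions: the allowable values, the column names `growth_form` and `data_source`, and the rows are invented, and the real allowable values are those in the trait definitions file linked above.

```r
# Hypothetical allowable values for one categorical trait.
allowed_values <- c("tree", "shrub", "herb")

# Hypothetical submission with a column recording where each row came from.
submitted <- data.frame(
  species     = c("Acacia aneura", "Banksia serrata", "Hardenbergia violacea"),
  growth_form = c("tree", "Shrub", "climber"),
  data_source = c("this study", "this study", "literature"),
  stringsAsFactors = FALSE
)

# Values that need a definition or a fix (such as inconsistent capitalisation).
undefined_values <- setdiff(unique(submitted$growth_form), allowed_values)

# Rows whose values were not collected for this study.
sourced_rows <- submitted[submitted$data_source != "this study", ]

print(undefined_values)
print(sourced_rows)
```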
83 changes: 0 additions & 83 deletions .github/workflows/R-CMD-check.yaml

This file was deleted.

8 changes: 3 additions & 5 deletions .github/workflows/check-build.yml
@@ -32,15 +32,13 @@ jobs:

 - name: check sources
   run: |
-    library(austraits.build)
-    source("scripts/custom.R")
+    library(traits.build)
+    source("R/custom_R_code.R")
     dataset_test(dir("data"))
   shell: Rscript {0}

 - name: build austraits
   run: |
-    library(austraits.build)
-    source("scripts/custom.R")
-    remake::make()
+    source("build.R")
   shell: Rscript {0}
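
The updated steps (loading `traits.build`, sourcing `R/custom_R_code.R`, running `dataset_test()`, then sourcing `build.R`) can also be tried outside CI. A minimal sketch, assuming the working directory is the root of a local clone of austraits.build and that `{traits.build}` is installed; the guard and the `can_run_checks` flag are additions for illustration and are not part of the workflow file.

```r
# Assumption: run from the root of a local clone of austraits.build.
can_run_checks <- requireNamespace("traits.build", quietly = TRUE) &&
  file.exists("R/custom_R_code.R") &&
  file.exists("build.R") &&
  dir.exists("data")

if (can_run_checks) {
  library(traits.build)
  source("R/custom_R_code.R")   # the repository's custom R code
  dataset_test(dir("data"))     # mirrors the "check sources" step
  source("build.R")             # mirrors the "build austraits" step
} else {
  message("Skipping: traits.build or the repository files were not found.")
}
```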

4 changes: 3 additions & 1 deletion .gitignore
@@ -1,4 +1,5 @@
.remake
remake.yml
export
.Rproj.user
.DS_Store
@@ -13,7 +14,7 @@ temp
.local/
.config/
.vs/
*.Rproj
man/*
tmp*
reports
data_*.csv
@@ -27,6 +28,7 @@ waiting_to_build
ignore
inst/doc
doc
docs
Meta
config/APC/*
config/NSL/*
45 changes: 7 additions & 38 deletions DESCRIPTION
@@ -1,61 +1,30 @@
Type: Package
Type: Compendium
Package: austraits.build
Title: Package used to build an AusTraits data resource
Title: Repository used to build an AusTraits data resource
Version: 0.9.0
Maintainer: Daniel Falster <daniel.falster@unsw.edu.au>
Authors@R: c(
person(given = "Daniel", family = "Falster", role = c("cre", "aut"), email = "daniel.falster@unsw.edu.au", comment = c(ORCID = "0000-0002-9814-092X")),
person(given = "Elizabeth", family = "Wenk", role = c("cur", "aut"), comment = c(ORCID = "0000-0001-5640-5910")),
person(given = "Rachael", family = "Gallagher", role = c("aut", "cur"), comment = c(ORCID = "0000-0002-4680-8115")),
person(given = "Gary", family = "Truong", role = c("ctb")),
person(given = "Stuart", family = "Allen", role = c("ctb")),
person("ARDC", role = c("fnd")),
person("ARC", role = c("fnd"))
)
Description: This package enables harmonising of data from diverse sources. The code was originally built to support AusTraits, an open-source compilation of data on the traits of Australian plant species. For more information on AusTraits go to https://austraits.org.
Description: This compendium compiles the AusTraits database, an open-source compilation of data on the traits of Australian plant species (see Falster et al 2021, <doi:10.1038/s41597-021-01006-6>). For more information on AusTraits go to https://austraits.org.
BugReports: https://github.com/traitecoevo/austraits.build/issues
URL: http://traitecoevo.github.io/austraits.build/
License: BSD_2_clause + file LICENCE
Depends:
R (>= 3.6.0),
R (>= 4.2.0),
base,
traits.build (>= 1.0.1),
dplyr,
lubridate,
readr,
stringr,
tidyr
Imports:
crayon,
git2r,
kableExtra,
magrittr,
purrr,
RefManageR,
remake,
rlang,
rmarkdown,
stringi,
styler,
testthat,
tibble,
whisker,
yaml
Suggests:
austraits,
leaflet,
bibtex,
knitr,
bench,
devtools,
markdown,
rprojroot,
pkgdown,
rcrossref,
zip,
covr
furrr
Remotes:
traitecoevo/austraits@develop,
richfitz/remake
traitecoevo/traits.build@v1.0.1
Encoding: UTF-8
VignetteBuilder: knitr
RoxygenNote: 7.2.3
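
The new minimum versions in this DESCRIPTION (`R (>= 4.2.0)` and `traits.build (>= 1.0.1)`) can be checked in a local session before attempting a build. A small sketch; the variable names and messages are illustrative, and the install hint simply repeats the `Remotes:` entry rather than prescribing an installation method.

```r
# Minimum versions taken from the DESCRIPTION diff above.
r_ok <- getRversion() >= "4.2.0"

traits_build_ok <- requireNamespace("traits.build", quietly = TRUE) &&
  utils::packageVersion("traits.build") >= "1.0.1"

if (!r_ok) {
  message("R >= 4.2.0 is required; found ", as.character(getRversion()))
}
if (!traits_build_ok) {
  message("traits.build >= 1.0.1 was not found; the Remotes field points to ",
          "traitecoevo/traits.build@v1.0.1")
}
```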
44 changes: 0 additions & 44 deletions Dockerfile

This file was deleted.
