Adding university identifiers to traits.build - discussion #169

Open
ehwenk opened this issue Aug 8, 2024 · 0 comments

Collaborator

ehwenk commented Aug 8, 2024

One of our immediate aims is to add one or more columns to the traits.build database structure that allow trait observations to be linked to herbarium records, or to each other in cases where a dataset collector has assigned a unique record number that links trait observations across multiple datasets.

We want to be fully compliant with the Darwin Core (DwC) standard, but also minimise the number of additional fields we add to traits.build, especially as these fields will be blank for the majority of datasets.

Looking through DwC, it seems there are two distinct types of "identifiers" that probably need to be added:

  1.  A record number for casual links between observations. These are record numbers that link across datasets, but aren’t GUIDs. We’d most likely use [dwc:recordNumber](http://rs.tdwg.org/dwc/terms/recordNumber), defined as, “An identifier given to the dwc:Occurrence at the time it was recorded. Often serves as a link between field notes and a dwc:Occurrence record, such as a specimen collector's number.”
    
  2.  An identifier that links to *ALL* herbarium vouchers, GBIF, etc. This will be either [dwc:occurrenceID](https://dwc.tdwg.org/list/#dwc_occurrenceID) or [dwc:catalogNumber](https://dwc.tdwg.org/list/#dwc_catalogNumber). I think occurrenceID should already incorporate codes for the specific herbarium/collection, whereas catalogNumber would require that we also add columns for herbarium/institution ([dwc:institutionCode](https://dwc.tdwg.org/list/#dwc_institutionCode)) and maybe other details. On the other hand, within ALA, while the occurrenceID is part of the URL, it isn’t actually reported on the page.
    
  • dwc:occurrenceID (An identifier for the dwc:Occurrence (as opposed to a particular digital record of the dwc:Occurrence). In the absence of a persistent global unique identifier, construct one from a combination of identifiers in the record that will most closely make the dwc:occurrenceID globally unique.)

  • dwc:catalogNumber (An identifier (preferably unique) for the record within the data set or collection.)

I don't think the two identifier categories can be merged, or we'd be diverging from the DwC meaning of each.

As an example, see this record in ALA and GBIF:

https://biocache.ala.org.au/occurrences/60455440-c777-43d9-9cc0-19354cbc8403

https://www.gbif.org/occurrence/2430993462
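To make the distinction between the two identifier categories concrete, here is a minimal Python sketch of how they could sit side by side as optional fields, with a fallback that follows the dwc:occurrenceID definition quoted above (build an identifier from other identifiers in the record when no persistent GUID exists). Everything named here is an assumption for illustration: the `TraitObservation` class, its field names, the `resolve_occurrence_id` helper and the colon-separated format are hypothetical and are not part of traits.build or DwC.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class TraitObservation:
    """Hypothetical row; field names are illustrative, not the traits.build schema."""
    observation_id: str
    record_number: Optional[str] = None     # dwc:recordNumber, e.g. a collector's number
    occurrence_id: Optional[str] = None     # dwc:occurrenceID, ideally a persistent GUID
    institution_code: Optional[str] = None  # dwc:institutionCode, needed alongside catalogNumber
    catalog_number: Optional[str] = None    # dwc:catalogNumber


def resolve_occurrence_id(obs: TraitObservation) -> Optional[str]:
    """Return the supplied occurrenceID, else build one from institutionCode and catalogNumber."""
    if obs.occurrence_id:
        return obs.occurrence_id
    if obs.institution_code and obs.catalog_number:
        # Assumed convention: "institutionCode:catalogNumber"
        return f"{obs.institution_code}:{obs.catalog_number}"
    # recordNumber is deliberately not used here: it is a casual link, not a GUID
    return None
```

Note that `record_number` never feeds into the occurrence identifier, which keeps the two categories separate in the way argued above.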

The AusTraits team set out with the goal of changing traits.build as little as possible, but before we do this I think we should contemplate whether there are any other “occurrence” metadata fields we should add as part of this. At the moment we don’t explicitly include the concept of an “occurrence” in the traits.build structure. It is implicit via observationID and an observation’s geographic location (latitude/longitude): to observe an organism in a location, on a date, it must have occurred there.
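As a rough illustration of what "implicit" means here, the sketch below derives an occurrence from fields an observation already carries. It is only a sketch under stated assumptions: the `ImpliedOccurrence` type, the `implied_occurrence` helper and the dictionary keys (`taxon_name`, `latitude`, `longitude`, `event_date`) are hypothetical names chosen for the example, not traits.build columns.

```python
from typing import NamedTuple, Optional


class ImpliedOccurrence(NamedTuple):
    """Hypothetical record of an organism at a place on a date."""
    taxon_name: str
    latitude: float
    longitude: float
    event_date: str  # date kept as the string supplied by the dataset


def implied_occurrence(obs: dict) -> Optional[ImpliedOccurrence]:
    """Return the occurrence implied by an observation, or None if any part is missing."""
    required = ("taxon_name", "latitude", "longitude", "event_date")
    if any(obs.get(key) in (None, "") for key in required):
        return None
    return ImpliedOccurrence(
        taxon_name=str(obs["taxon_name"]),
        latitude=float(obs["latitude"]),
        longitude=float(obs["longitude"]),
        event_date=str(obs["event_date"]),
    )
```

An observation lacking a location or a date yields `None`, which is one way to see how many existing records could support an explicit occurrence concept without any new fields.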

A few relevant references:

Nelson G, Sweeney P, Gilbert E (2018) Use of globally unique identifiers (GUIDs) to link herbarium specimen records to physical specimens. Applications in Plant Sciences 6, e1027. doi:10.1002/aps3.1027.

Folk RA, Siniscalchi CM (2021) Biodiversity at the global scale: the synthesis continues. American Journal of Botany 108, 912–924. doi:10.1002/ajb2.1694.

ehwenk added this to AusTraits Aug 8, 2024
ehwenk converted this from a draft issue Aug 8, 2024