
Improve docs on CSV fields #33

Open
slint opened this issue Jan 31, 2024 · 0 comments

We can integrate the following bullets into the main docs of the CSV fields:


  • Each line represents a record that will be created on Zenodo
  • Required fields are marked in bold in the header. Fields without a value are skipped.
  • For the doi field:
    • It should be filled in if there is a DOI already registered for an entry
    • If not filled, we’ll register a Zenodo DOI for the record
  • You’ll notice that the fields are a somewhat “de-normalized” version of the JSON representation we use on Zenodo. Because we often deal with “complex” fields, such as arrays of objects nested several levels deep, we have taken some liberties with the data formatting so that these values can be represented in a single cell. Some examples of such fields:
    • Keywords (subjects.subject): the cell value contains “new-line” separated keywords
    • Creators/authors (creators.*): following the “new-line” separated convention, these have been “tabularized”. In the example there are two authors: Nils Schlüter (affiliation: Museum für Naturkunde, ORCID: 0000-0002-5699-3684) and John Smith (affiliation: CERN, ORCID: none)
  • Some of the fields rely on controlled vocabularies (e.g. the resource types, contributor types, licenses, related identifier relation types, etc.). The values for these types can be found under the following endpoints (to which you can add a ?q=<search term> query string parameter to narrow down results)
  • For custom fields we have a reference sheet at https://docs.google.com/spreadsheets/d/1TUyDT6yOypX2DBuM_PNUZucFTC93uFlEa7PoAMYvnDI/edit#gid=314238332, but the basic premise is that they correspond to known vocabularies such as DarwinCore, AudubonCore, etc. All of them accept multiple terms.
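
To make the “new-line” separated convention concrete for the docs, here is a minimal Python sketch of how one CSV row could be expanded back into a nested record. It rests on several assumptions: the column names `creators.name`, `creators.affiliation` and `creators.orcid` are hypothetical (only `doi`, `subjects.subject` and `creators.*` are named above), the creator columns are assumed to line up line by line, the title and keywords are made-up sample data, and the output dictionary is only illustrative, not the exact Zenodo JSON.

```python
import csv
import io

# Sample CSV with one record. Column names other than "doi" and
# "subjects.subject" are assumptions made for this sketch.
SAMPLE = (
    "title,doi,subjects.subject,creators.name,creators.affiliation,creators.orcid\r\n"
    '"Example record",,"fossils\nechinoids","Nils Schlüter\nJohn Smith",'
    '"Museum für Naturkunde\nCERN","0000-0002-5699-3684\n"\r\n'
)


def split_cell(value):
    """Split a multi-line cell into its lines; an empty cell yields no items."""
    return value.split("\n") if value else []


def parse_row(row):
    """Turn one flat CSV row into a nested, illustrative record."""
    record = {"title": row["title"]}

    # Fields without a value are skipped, so an empty doi cell is left out.
    if row["doi"]:
        record["doi"] = row["doi"]

    # Keywords: one keyword per line inside the cell.
    keywords = [k for k in split_cell(row["subjects.subject"]) if k]
    if keywords:
        record["subjects"] = [{"subject": k} for k in keywords]

    # Creators: line i of each creators.* column describes creator i
    # (an assumption about how the columns are "tabularized").
    names = split_cell(row["creators.name"])
    affiliations = split_cell(row["creators.affiliation"])
    orcids = split_cell(row["creators.orcid"])
    creators = []
    for i, name in enumerate(names):
        creator = {"name": name}
        if i < len(affiliations) and affiliations[i]:
            creator["affiliation"] = affiliations[i]
        if i < len(orcids) and orcids[i]:
            creator["orcid"] = orcids[i]
        creators.append(creator)
    record["creators"] = creators
    return record


# newline="" keeps the embedded line breaks inside quoted cells intact.
rows = list(csv.DictReader(io.StringIO(SAMPLE, newline="")))
records = [parse_row(r) for r in rows]
```

In this sketch the second creator ends up without an `orcid` key, because the second line of that cell is empty, and the record has no `doi` key, which corresponds to the case where a Zenodo DOI would be registered for it.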