Define `@id` for a `Dataset` (`Version`) #76
Comments
One observation with regard to the HCLS Version Level Description (HCLS VLD), which I came across when working on the JSON-LD structure for the I am not sure that the HCLS VLD maps well onto commit-based versioning. The HCLS VLD concept seems to describe a selected state that was released. It seems more related to tags in datasets. Just wanted to mention that, I am not proposing any change, because I think it can be justified to apply HCLS VLD to commit-based versions (although the cardinality-1
|
Can you give a concrete example where it would not map well?
|
I thought about merge-commits. But that might not be relevant, because we can choose one of the parents.
I would actually also propose the latter, i.e. |
re The strongest argument that I can find for going for gitsha-only is:
Not having a UUID component in the version-level ID avoids this complication, with no loss of functionality or precision. |
This establishes the absolute minimum necessary to distinguish summary-level and version-level dataset descriptors. The main metadata record is always considered to be a version-level description. In addition, a summary-level description is linked via `dcterms:isVersionOf`. The summary-level descriptor itself is bare-bones and only declares `hasVersion` with a backlink to the version-level description. Ping #76
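The linkage described above could look roughly like the following. This is only a sketch: the `example.org` IRIs, the all-zero UUID and commit SHA, and the function name `build_descriptors` are placeholders made up for illustration and are not the identifiers used by the actual implementation. The summary-level node is deliberately left bare and carries nothing but `dcterms:hasVersion`.

```python
import json

# Placeholder identifiers, invented for this sketch only.
SUMMARY_ID = "https://example.org/dataset/00000000-0000-0000-0000-000000000000"
VERSION_ID = "https://example.org/commit/" + "0" * 40


def build_descriptors(summary_id, version_id):
    """Return a JSON-LD document with a version-level and a summary-level node."""
    version_level = {
        "@id": version_id,
        "@type": "dcat:Dataset",
        # The main record is the version-level description; it points
        # at the summary-level descriptor.
        "dcterms:isVersionOf": {"@id": summary_id},
    }
    summary_level = {
        "@id": summary_id,
        # Bare descriptor: only the backlink to the version-level node.
        "dcterms:hasVersion": {"@id": version_id},
    }
    return {
        "@context": {
            "dcterms": "http://purl.org/dc/terms/",
            "dcat": "http://www.w3.org/ns/dcat#",
        },
        "@graph": [version_level, summary_level],
    }


doc = build_descriptors(SUMMARY_ID, VERSION_ID)
print(json.dumps(doc, indent=2))
```

Because the two nodes have distinct `@id` values, merging records from several sources cannot collapse the two description levels into one node.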
This is very much in the context of describing datasets that are DataLad datasets. The main forum for that should be datalad/datalad-metalad#389. #101 also contains an example of an approach that does not require DataLad identifiers. |
Replaces: datalad/datalad-metalad#380
Ping: datalad/datalad-registry#217
A tabby record will typically be a version-level description (in HCLS terms). However, this is not necessarily the case: without a `version` label, we would be missing an essential component, and it would instantly be a summary-level description.

Such a difference would not necessarily impact the type annotation. Both could be `dcat:Dataset` or https://schema.org/Dataset. It would, however, matter for crafting a valid `@id`.

We need to have a common approach for `@id` choice within datalad's metadata ecosystem to simplify homogenization and merges across metadata sources (see datalad/datalad-metalad#30 for other thoughts). Any approach to `@id` must not confuse the different description levels.

I posted some ideas in datalad/datalad-registry#217 (comment)
Concrete issues:
`@id`. However, in general we will not be able to infer the nature of such a DOI (concept DOI covering all versions vs. version-specific DOI). Moreover, such a DOI may be specific to a particular download (distribution-level identifier). One and the same dataset (version) could be hosted in more than one data portal and receive different DOIs that all point to the exact same information at different locations.
`tabby` metadata extractor would need to report at least two metadata records: the version-level description, and a concept-level description (the former linking the latter via `isVersionOf`
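
A minimal sketch of the two-record reporting idea, assuming a flat dict as input. The function name `extract_records`, the input keys `name` and `version`, and the `<concept_id>/<version>` scheme for the version-level `@id` are all hypothetical and only serve to show that the two levels get distinct identifiers; none of this is an existing extractor API.

```python
def extract_records(record, concept_id):
    """Return JSON-LD-style nodes for a tabby-like record (hypothetical helper).

    With a version label: a version-level node plus a concept-level node.
    Without one: a single summary-level node.
    """
    title = record.get("name")
    version = record.get("version")

    if not version:
        # No version label, so this can only be a summary-level description.
        return [{"@id": concept_id, "@type": "dcat:Dataset", "dcterms:title": title}]

    # Assumed ID scheme, for illustration only.
    version_id = f"{concept_id}/{version}"

    version_level = {
        "@id": version_id,
        "@type": "dcat:Dataset",
        "dcterms:title": title,
        # The version-level record links to the concept-level one.
        "dcterms:isVersionOf": {"@id": concept_id},
    }
    concept_level = {
        "@id": concept_id,
        "dcterms:hasVersion": {"@id": version_id},
    }
    return [version_level, concept_level]


versioned = extract_records(
    {"name": "demo", "version": "1.0"}, "https://example.org/ds/demo"
)
unversioned = extract_records({"name": "demo"}, "https://example.org/ds/demo")
```

Note that the sketch sidesteps the DOI question entirely: `concept_id` is passed in by the caller, because the nature of a DOI found in a record cannot be inferred in general.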