-
Agree with all the above
An example of an extension would be one for users of LinkML. LinkML allows the specification of enums, whose values may be associated with CURIEs within a LinkML schema. To expose these as SSSOM, there would be a deterministic mapping that creates a CURIE with the schema CURIE/URL as its base, suffixed with the URL-encoded enum value. See linkml/linkml#272.
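To make the deterministic mapping described above concrete, here is a minimal sketch using only the Python standard library. The function name `enum_value_to_curie` and the prefix `myschema` are hypothetical and are not part of LinkML or SSSOM; the only assumption is that the enum value is percent-encoded and appended to the schema prefix.

```python
from urllib.parse import quote


def enum_value_to_curie(schema_prefix: str, value: str) -> str:
    """Build a CURIE from a schema prefix and a percent-encoded enum value.

    Hypothetical helper for illustration; not a LinkML or SSSOM API.
    """
    # safe="" makes quote() also encode "/" so the local part stays a single token.
    return f"{schema_prefix}:{quote(value, safe='')}"


if __name__ == "__main__":
    # "myschema" is a made-up prefix standing in for the schema CURIE/URL base.
    print(enum_value_to_curie("myschema", "heart attack"))
```

Because the encoding is deterministic, the same enum value always yields the same CURIE, which is what would make the generated identifiers usable as stable mapping subjects.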
-
Here is the concrete non-normative proposal for now:
Motivation: Much like the entity-entity mappings that are the primary concern of SSSOM, literals are often mapped to entities during curation activities such as Named Entity Recognition or manual curation of scientific publications. Usually, a string that "represents" an entity is linked to an ontology term or a database entity. We usually refer to these associations as synonyms and curate them as part of our ontologies. While this is not entirely obvious from the SSSOM conceptual model, we want to support this use case by offering a syntactic convention: A literal
For example, the literal
Note that there is currently no support in SSSOM tools for these kinds of mappings, but feel free to request it.
-
I am irked by this example. URL encoding gets out of hand pretty fast, and using dummy prefixes sort of defeats the purpose of using validatable, controlled vocabularies. Perhaps we could alternatively consider creating a standard similar to SSSOM specifically for properties, where the subject always has to be a CURIE, the predicate is always a "property" (in the OBO file format sense), and the object is a literal (e.g., synonyms, SMILES strings, numbers). It could share most of the same stuff as SSSOM, with the addition of an optional object data type column (which would default to XSD's string, as most people expect).
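As a rough illustration of the alternative described above, the following sketch models one row of such a property table. Everything here is an assumption for discussion purposes: the class name `LiteralMapping`, the column names, and the example identifier `EX:0000001` are hypothetical and do not come from the SSSOM specification. The only behavior taken from the comment is that the object data type is optional and defaults to XSD's string.

```python
import csv
import io
from dataclasses import astuple, dataclass, fields


@dataclass(frozen=True)
class LiteralMapping:
    """One row of a hypothetical literal-mapping table (column names are assumptions)."""

    subject_id: str                       # always a CURIE
    predicate_id: str                     # always a "property"
    object_literal: str                   # the literal itself, stored unencoded
    object_datatype: str = "xsd:string"   # optional column, defaults to XSD string


def to_tsv(rows: list[LiteralMapping]) -> str:
    """Serialize rows as tab-separated text with a header line."""
    buffer = io.StringIO()
    writer = csv.writer(buffer, delimiter="\t", lineterminator="\n")
    writer.writerow([f.name for f in fields(LiteralMapping)])
    for row in rows:
        writer.writerow(astuple(row))
    return buffer.getvalue()


if __name__ == "__main__":
    # EX:0000001 is a placeholder identifier, not a real ontology term.
    rows = [
        LiteralMapping("EX:0000001", "oboInOwl:hasExactSynonym", "ethanol"),
        LiteralMapping("EX:0000001", "EX:has_smiles", "CCO"),
    ]
    print(to_tsv(rows), end="")
```

Keeping the literal in its own column avoids URL encoding and dummy prefixes entirely, which is the main point of the suggestion.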
-
Just had a longer discussion about this with @udp, and we are now convinced that we should not overload the current SSSOM profile, but instead provide another "class/profile" specifically for literal mappings. I will take a stab at this.
-
This is not about named entity recognition but about entity linking. There is a working group to standardize a Reconciliation API for linking literal values to identifiers from terminologies.
-
While SSSOM is clearly focused on mappings between controlled vocabulary terms (ids which correspond to some entity in the world), there is a huge amount of work going on that is concerned with "literal" mappings, i.e., mapping a string to a concept in an ontology or controlled vocabulary. These are byproducts of, for example, named entity recognition pipelines, or of manual mapping efforts by curators who read papers and map the strings they read to ontology concepts. I believe that the same concerns we have for normal terminological mappings (provenance, curation rules metadata) also apply here, and I therefore propose to develop a scheme by which we can represent literals in the subject_id field. One example would be to use URL encoded strings in a standard namespace:
I believe this would take care of quite a few problems, and users can easily handle or ignore these kinds of subjects. Note that, strictly speaking, this is already permitted by the SSSOM spec, so we are not discussing here whether this should be allowed or not. The question is rather whether we should, as a community, recommend a "standard" way to handle literals.
I invite everyone to voice their support, ideas or concerns here.
see related: mapping-commons/sssom-py#28 (@cmungall also has some big plans to use these for mapping reconciliation).
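The claim that users can easily handle or ignore literal subjects can be sketched as a round trip: encode a string into a subject_id, then either decode it or skip it based on its prefix. The prefix `literal` below is a hypothetical placeholder, since the namespace from the original example is not preserved here, and the helper names are made up for illustration.

```python
from typing import Optional
from urllib.parse import quote, unquote

# Hypothetical prefix; the actual "standard namespace" is left open in the proposal.
LITERAL_PREFIX = "literal"


def encode_literal(text: str) -> str:
    """Turn a free-text string into a CURIE-shaped subject_id."""
    # safe="" also encodes ":" and "/", so the local part contains no separators.
    return f"{LITERAL_PREFIX}:{quote(text, safe='')}"


def decode_subject(subject_id: str) -> Optional[str]:
    """Return the decoded literal, or None if the subject is a regular entity CURIE."""
    prefix, _, local = subject_id.partition(":")
    if prefix != LITERAL_PREFIX:
        return None
    return unquote(local)


if __name__ == "__main__":
    subjects = [encode_literal("heart attack"), "MONDO:0005068"]
    for subject in subjects:
        literal = decode_subject(subject)
        kind = "entity" if literal is None else f"literal {literal!r}"
        print(subject, "->", kind)
```

A consumer that only cares about entity-entity mappings can drop every row where `decode_subject` returns a string, while a reconciliation tool can do the opposite.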