Matcher pipeline to handle the single label/multiple subconcepts use-case.
Description
As discussed in #58, we would certainly benefit from having EDS-NLP handle the nitty-gritty detail of matching a terminology with automatic concept normalisation.
For now, it is reasonably easy to match a terminology wherein the label is the normalisation. However, we could use the kb_id_ attribute (see spaCy documentation) to include a more hierarchical structure.
For instance, paracetamol/tylenol should probably get the label drug and a kb_id_ like ATC=N02BE01.
Proposition
We could modify the eds.matcher component to handle this case natively, or create a new component.
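As a rough illustration of the idea (not the actual eds.matcher implementation), the sketch below uses spaCy's PhraseMatcher with the concept identifier as the match key, then builds entities that share the single label drug while carrying the concept in kb_id_. The terminology dictionary and the match_drugs helper are hypothetical names introduced for this example only.

```python
import spacy
from spacy.matcher import PhraseMatcher
from spacy.tokens import Span
from spacy.util import filter_spans

nlp = spacy.blank("en")

# Hypothetical terminology: one shared label, several concept identifiers,
# each with its list of synonyms.
terminology = {
    "ATC=N02BE01": ["paracetamol", "tylenol"],
}

# Match on the lowercased token text so that "Tylenol" and "tylenol" both hit.
matcher = PhraseMatcher(nlp.vocab, attr="LOWER")
for concept_id, synonyms in terminology.items():
    matcher.add(concept_id, [nlp.make_doc(term) for term in synonyms])


def match_drugs(doc):
    """Label every match as `drug` and store the concept id in `kb_id_`."""
    spans = []
    for match_id, start, end in matcher(doc):
        # The match key is returned as a hash; look up the original string.
        concept_id = nlp.vocab.strings[match_id]
        spans.append(Span(doc, start, end, label="drug", kb_id=concept_id))
    # Drop overlapping spans before assigning entities.
    doc.ents = filter_spans(spans)
    return doc


doc = match_drugs(nlp("The patient was given Tylenol twice a day."))
ents = [(ent.text, ent.label_, ent.kb_id_) for ent in doc.ents]
print(ents)
```

With this layout, users who only care about the coarse category can filter on ent.label_, while those who need the normalised concept can read ent.kb_id_.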
In the spirit of spaCy, I wonder whether such information should be put in custom attributes or handled by the EntityLinker, which relates a Span to a KnowledgeBase (as far as I understand; tell me if I missed something).
For example, paracétamol (an ingredient in the ROMEDI nomenclature) has several ATC codes: https://www.romedi.fr/romedi/IN7310nlprjlh2sb3t0apdjfvtk6u0ifp3 . The EntityLinker may or may not be able to resolve a precise ATC code using information from the rest of the Doc. In addition, resolving this entity may or may not be of interest to the user, who might care about the ingredient and prefer mapping drugs to their ingredients.
In short, instead of thinking in terms of terminology, perhaps one could think of entities in terms of a graph, and try to understand to what extent graph properties can be imported into the spaCy machinery.