Update the legends in Stave to show Disease, Medical, etc. #53
Comments
Besides Disease and Medical, what other annotations can we add? BioBERT: https://github.com/dmis-lab/biobert |
should be based on the |
Yeah. Here is the problem. In the config of this example, ner_type is set to Disease, so all model outputs are labelled Disease. If I remove this configuration and run the pipeline, all outputs are labelled "BioEntity"; see the default configuration here (Line 235). I could not find any instructions on how to change the entity type so that different kinds of types are shown, instead of every entity being labelled "BioEntity". |
@hunterhector I think I found the problem. In the following file: I checked the source code for BioBERTProcessor and noticed that the relationship between Line 235 and Line 228 does not seem to make sense. It simply labels every entity as "BioEntity"; if I change the configuration to "DISEASE", every entity is labelled "DISEASE" instead, so I can actually set it to whatever I want. Here I changed the configuration to "APPLE", like this: |
|
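The behaviour described above can be illustrated with a minimal sketch. It does not reproduce the actual BioBERTProcessor code; `Entity`, `DEFAULT_CONFIG`, and `label_spans` are hypothetical names, and the only assumption taken from the thread is that a single configured `ner_type` string (defaulting to "BioEntity") is stamped onto every detected span.

```python
from dataclasses import dataclass
from typing import Dict, List, Optional, Tuple


@dataclass
class Entity:
    """Hypothetical stand-in for an entity annotation."""
    text: str
    begin: int
    end: int
    ner_type: str


# Assumed default, mirroring the "BioEntity" default mentioned in the thread.
DEFAULT_CONFIG: Dict[str, str] = {"ner_type": "BioEntity"}


def label_spans(
    text: str,
    spans: List[Tuple[int, int]],
    config: Optional[Dict[str, str]] = None,
) -> List[Entity]:
    """Attach the single configured ner_type to every span.

    The model only decides where entities are; the label comes
    from the config, so every entity ends up with the same type.
    """
    merged = {**DEFAULT_CONFIG, **(config or {})}
    return [Entity(text[b:e], b, e, merged["ner_type"]) for b, e in spans]


if __name__ == "__main__":
    doc = "Patient has diabetes and takes metformin."
    found = []
    for word in ("diabetes", "metformin"):
        start = doc.find(word)
        found.append((start, start + len(word)))

    for ent in label_spans(doc, found):
        print(ent.text, ent.ner_type)
    for ent in label_spans(doc, found, {"ner_type": "APPLE"}):
        print(ent.text, ent.ner_type)
```

Under this assumption, showing several entity types in the legend requires either a model that predicts a type per span or separate annotations per type, not a different config value.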
Got it. I remember I previously worked on an issue to support bio NER using stanza; I will try that. |
I tried stanza, and the ner_type values of the outputs are as follows:
We may change Disease and Medical to Test, Problem and Treatment. |
Yeah, double-check with @Piyush13y, since I am sure we also have more spaCy models. |
Yes, we have more scispaCy models that we can use, and they produce different kinds of NER labels. Ref: https://allenai.github.io/scispacy/ @Leolty I feel we can't just change the label type, for the reason I mentioned to you guys on the call. We want users to see terms they understand in the legend, not NLP jargon. They wouldn't know what EntityMentions/MedicalEntityMentions mean. Also, adding more attributes (ner_type) to the same annotation will still require changes to the ontology file. We might as well create new annotations for each of the NER types for a smoother demo. At least, that's what I think, especially since it might not take much more time than the adjustable label type approach. |
@hunterhector @Piyush13y We have the JSON file here: https://github.com/asyml/ForteHealth/blob/50_streamlit_to_stave/examples/search_engine_to_stave/default_onto_project.json In the code, we usually use this to create a new project: It creates the project successfully, but I cannot open the documents in the project; the page keeps loading. So I go over the
|
Hi, @Leolty. Thanks for exploring this. It seems you have found an interesting bug, and I believe it is related to this function. Would you mind creating an issue on Stave to discuss the bug? The fix could be simple (fixing the quotation marks and case before storing the value to the database), but I am still wondering about the cause and the best solution:
|
Hi, @hunterhector. After checking the function you sent me, I think I know where the bug is. As you mentioned, in Python we usually use these functions to load a JSON file:

import json
file_obj = open(file_path)
project_json = json.load(file_obj)
create_project(project_json)

And I just made Actually, I just need to use the dumps function to solve this bug, for example:

import json
file_obj = open(file_path)
project_json = json.load(file_obj)
create_project(json.dumps(project_json))

So I think there is no need to modify the source code. We just need to make sure the parameter of the function |
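The difference between the two calls can be seen in a small sketch. It assumes, in line with the "quotation marks and case" remark earlier in the thread, that the project value ends up stored as text and is later parsed as JSON; the `project` dictionary and the `is_valid_json` helper are hypothetical and are not taken from Stave or ForteHealth.

```python
import json

# Hypothetical project content; the real ontology project file is larger.
project = {"name": "demo", "multi_ontology": False, "config": None}


def is_valid_json(text: str) -> bool:
    """Return True if the text parses as JSON."""
    try:
        json.loads(text)
    except json.JSONDecodeError:
        return False
    return True


# What you get if a dict is implicitly turned into text:
# single quotes, and Python's False/None instead of false/null.
as_repr = str(project)

# What json.dumps produces: double quotes and JSON literals.
as_json = json.dumps(project)

if __name__ == "__main__":
    print(as_repr, is_valid_json(as_repr))
    print(as_json, is_valid_json(as_json))
```

Passing `json.dumps(project_json)` therefore hands over a string that any JSON parser can read back, whereas the Python text form of a dict cannot be parsed as JSON.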
Sounds good, thanks! |
As mentioned in the meeting.