
A Transformer-based Machine Learning Framework using Conditional Random Fields as Decoder for Clinical Text Mining

TransformerCRF (from UoM, MCR)


Placeholder for our project/poster presentation at the HealTAC 2022 conference and follow-up work: code and data sharing.

Conference program: https://healtac2022.github.io/programmes/

Our poster and presentation slides are shared in this repository: https://github.com/poethan/TransformerCRF/blob/main/Healtac22_poster_transformerCRF.pdf

News: TransformerCRF-v2 is hosted in a new repository.

On clinical text mining: another project/poster presentation from HealTAC 2022 is also uploaded in this repository: 'Diagnosis Certainty and Progression: A Natural Language Processing Approach to Enable Characterisation of the Evolution of Diagnoses in Clinical Notes' (poster).

Full Paper Coming
"On Cross-Domain Pre-Trained Language Models for Clinical Text Mining: How Do They Perform on Data-Constrained Fine-Tuning?" Yuping Wu, Lifeng Han, Valerio Antonini, and Goran Nenadic. ArXiv pre-print https://doi.org/10.48550/arXiv.2210.12770 Link. (https://github.com/poethan/TransformerCRF/blob/main/view_TransformerCRF.pdf)
Direct code download from this repository: https://github.com/poethan/TransformerCRF/blob/main/TransformerCRF_dev-main.zip
Experimental trials from a pilot study using the n2c2-2018 challenge set in data-constrained fine-tuning/learning

TransformerCRF

  • Learning from scratch using 303 EHR letters from n2c2-2018

BioformerApt

  • An adaptation layer on top of Bioformer, tested on the n2c2-2018 test set of 200 EHR letters

BioformerApt, ClinicalBERT-CRF, BioformerCRF

  • Continuous learning using a data-constrained setting of 303 EHR letters
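In all of these variants the CRF decoder's job is the same: given per-token tag scores (emissions) from the encoder, pick the globally best tag sequence under learned tag-transition scores. A minimal sketch of linear-chain CRF Viterbi decoding in plain Python (toy scores; in TransformerCRF the emissions would come from the Transformer encoder, and the tag set and score values here are illustrative assumptions):

```python
def viterbi_decode(emissions, transitions):
    """Return the highest-scoring tag sequence under a linear-chain CRF.

    emissions:   one [num_tags] score list per token
    transitions: [num_tags][num_tags] matrix; transitions[i][j] is the
                 score of moving from tag i to tag j.
    """
    num_tags = len(emissions[0])
    # score[j] = best score of any path ending in tag j at the current token
    score = list(emissions[0])
    backpointers = []
    for emission in emissions[1:]:
        new_score, step_back = [], []
        for j in range(num_tags):
            best_prev, best_val = max(
                ((i, score[i] + transitions[i][j]) for i in range(num_tags)),
                key=lambda t: t[1],
            )
            new_score.append(best_val + emission[j])
            step_back.append(best_prev)
        score = new_score
        backpointers.append(step_back)
    # Trace the best path backwards from the best final tag
    best_tag = max(range(num_tags), key=lambda j: score[j])
    path = [best_tag]
    for step_back in reversed(backpointers):
        best_tag = step_back[best_tag]
        path.append(best_tag)
    path.reverse()
    return path, max(score)


# Toy example: tags 0=O, 1=B, 2=I; the transition matrix penalises O -> I.
emissions = [[2.0, 1.0, 0.0], [0.0, 0.5, 2.0], [1.5, 0.0, 1.0]]
transitions = [
    [0.0, 0.0, -10.0],  # from O
    [0.0, -1.0, 1.0],   # from B
    [0.0, -1.0, 1.0],   # from I
]
path, best = viterbi_decode(emissions, transitions)
# path == [1, 2, 2], i.e. B I I: the CRF overrides the token-level
# argmax at position 0 because O -> I is heavily penalised.
```

This is why a CRF decoder helps with span labels such as BIO tags: per-token argmax can emit invalid sequences (an I with no preceding B), while Viterbi decoding scores the whole sequence jointly.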
Citation

BibTeX

@misc{https://doi.org/10.48550/arxiv.2210.12770,
  doi = {10.48550/ARXIV.2210.12770},
  url = {https://arxiv.org/abs/2210.12770},
  author = {Wu, Yuping and Han, Lifeng and Antonini, Valerio and Nenadic, Goran},
  keywords = {Computation and Language (cs.CL), Artificial Intelligence (cs.AI), Machine Learning (cs.LG), FOS: Computer and information sciences},
  title = {On Cross-Domain Pre-Trained Language Models for Clinical Text Mining: How Do They Perform on Data-Constrained Fine-Tuning?},
  publisher = {arXiv},
  year = {2022},
  copyright = {Creative Commons Attribution Non Commercial Share Alike 4.0 International}
}

Plain

  • Yuping Wu, Lifeng Han, Valerio Antonini, and Goran Nenadic. 2022. "On Cross-Domain Pre-Trained Language Models for Clinical Text Mining: How Do They Perform on Data-Constrained Fine-Tuning?". Pre-print. arXiv:2210.12770 [cs.CL] https://doi.org/10.48550/arXiv.2210.12770

  • Lifeng Han, Valerio Antonini, Ghada Alfattni, Alfredo Madrid, Warren Del-Pinto, Judith Andrew, William G. Dixon, Meghna Jani, Ana María Aldana, Robyn Hamilton, Karim Webb, Goran Nenadic. 2022. A Transformer-based Machine Learning Framework using Conditional Random Fields as Decoder for Clinical Text Mining. Posters in HealTAC 2022: the 5th Healthcare Text Analytics Conference. June 15-16th. Virtual and Local Hubs in UK.

  • Alfredo Madrid, Caitlin Bullin, Lifeng Han, Judith Andrew, Warren Del-Pinto, Ghada Alfattni, Oswaldo S. Pabón, Ernestina M. Ruiz, Luis Rodríguez, Ana María Aldana, Robyn Hamilton, Karim Webb, Meghna Jani, Goran Nenadic, William G. Dixon. 2022. Diagnosis Certainty and Progression: A Natural Language Processing Approach to Enable Characterisation of the Evolution of Diagnoses in Clinical Notes. Posters in HealTAC 2022: the 5th Healthcare Text Analytics Conference. June 15-16th. Virtual and Local Hubs in UK.

Acknowledgement
  • We thank Viktor Schlegel for help with debugging and Hao Li for discussions during the development stage. We thank Alfredo Madrid Garcia for the collaborative work on clinical annotation and computational model guidelines. We are grateful to the Manchester Open Research Fund (OR) for supporting this ongoing open-source project.
  • NeuroNER-project
Contact: you are welcome to reach out.
Read More about Our Work

Yuping Wu Lifeng Han Goran Nenadic
