forked from aphp/eds-pseudo

EDS-Pseudo is a hybrid model for detecting personally identifying entities in clinical reports


arkhn/eds-pseudo

 
 


EDS-Pseudonymisation

This project aims to detect identifying entities in clinical reports from AP-HP's Clinical Data Warehouse. The following entity types are covered:

| Label | Description |
|---|---|
| ADRESSE | Street address, e.g. 33 boulevard de Picpus |
| DATE | Any absolute date other than a birthdate |
| DATE_NAISSANCE | Birthdate |
| HOPITAL | Hospital name, e.g. Hôpital Rothschild |
| IPP | Internal AP-HP identifier for patients, displayed as a number |
| MAIL | Email address |
| NDA | Internal AP-HP identifier for visits, displayed as a number |
| NOM | Any last name (patients, doctors, third parties) |
| PRENOM | Any first name (patients, doctors, etc.) |
| SECU | Social security number |
| TEL | Any phone number |
| VILLE | Any city |
| ZIP | Any zip code |
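
To make the label set above more concrete, here is a minimal sketch of how detected entities could be turned into a pseudonymised text by swapping each span for a `<LABEL>` placeholder. It does not use the EDS-Pseudo API: the `redact` and `find_span` helpers, the span format and the sample sentence are all hypothetical, and the spans are located by plain substring search rather than by the model.

```python
from typing import List, Tuple

# Labels copied from the table above.
LABELS = {
    "ADRESSE", "DATE", "DATE_NAISSANCE", "HOPITAL", "IPP", "MAIL", "NDA",
    "NOM", "PRENOM", "SECU", "TEL", "VILLE", "ZIP",
}

# Hypothetical span format: (start, end, label), with `end` exclusive.
Span = Tuple[int, int, str]


def find_span(text: str, value: str, label: str) -> Span:
    """Stand-in for a real detector: locate `value` by substring search."""
    start = text.index(value)
    return (start, start + len(value), label)


def redact(text: str, spans: List[Span]) -> str:
    """Replace each labelled span with a <LABEL> placeholder."""
    pieces = []
    cursor = 0
    for start, end, label in sorted(spans):
        if label not in LABELS:
            raise ValueError(f"unknown label: {label}")
        if start < cursor or start >= end or end > len(text):
            raise ValueError("spans must be non-empty, in bounds and non-overlapping")
        pieces.append(text[cursor:start])  # untouched text before the span
        pieces.append(f"<{label}>")        # placeholder for the entity
        cursor = end
    pieces.append(text[cursor:])           # remaining text after the last span
    return "".join(pieces)


# Invented example sentence reusing the sample values from the table.
text = "Mme Dupont, admise à l'Hôpital Rothschild, 33 boulevard de Picpus."
spans = [
    find_span(text, "Dupont", "NOM"),
    find_span(text, "Hôpital Rothschild", "HOPITAL"),
    find_span(text, "33 boulevard de Picpus", "ADRESSE"),
]
redacted = redact(text, spans)
print(redacted)
```

In a real pipeline the spans would come from the model's predictions; the replacement step itself stays the same whatever produces them.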

Publication

Our arXiv preprint is available at https://arxiv.org/pdf/2303.13451.pdf.

If you use EDS-Pseudo, please cite us as below:

@article{tannier2023development,
  title={Development and validation of a natural language processing algorithm to pseudonymize documents in the context of a clinical data warehouse},
  author={Tannier, Xavier and Wajsb{\"u}rt, Perceval and Calliger, Alice and Dura, Basile and Mouchet, Alexandre and Hilka, Martin and Bey, Romain},
  journal={arXiv preprint arXiv:2303.13451},
  year={2023}
}

Documentation

Visit the documentation for more information!

Acknowledgement

We would like to thank Assistance Publique – Hôpitaux de Paris and the AP-HP Foundation for funding this project.
