External Validation of Predictive Models for Diagnosis, Management and Severity of Pediatric Appendicitis
This repository accompanies the paper "External Validation of Predictive Models for Diagnosis, Management and Severity of Pediatric Appendicitis".
Abstract
Background. Appendicitis is a common condition among children and adolescents. Machine learning models can offer much-needed tools for improved diagnosis, severity assessment and management guidance for pediatric appendicitis. However, to be adopted in practice, such systems must be reliable, safe and robust across various medical contexts, e.g., hospitals with distinct clinical practices and patient populations.
Methods. We performed external validation of models predicting the diagnosis, management and severity of pediatric appendicitis. Trained on a cohort of 430 patients admitted to the Children's Hospital St. Hedwig (Regensburg, Germany), the models were validated on an independent cohort of 301 patients from the Florence Nightingale Hospital (Düsseldorf, Germany). The data included demographic, clinical, scoring, laboratory and ultrasound parameters. In addition, we explored the benefits of model retraining and inspected variable importance.
Results. The distributions of most parameters differed between the datasets. Consequently, we saw a decrease in predictive performance for diagnosis, management and severity across most metrics. After retraining with a portion of external data, we observed gains in performance, which, nonetheless, remained lower than in the original study. Notably, the most important variables were consistent across the datasets.
Conclusions. While the performance of transferred models was satisfactory, it remained lower than on the original data. This study demonstrates challenges in transferring models between hospitals, especially when clinical practice and demographics differ or in the presence of externalities such as pandemics. We also highlight the limitations of retraining as a potential remedy since it could not restore predictive performance to the initial level.
All the libraries required to run this code are listed in the conda environment file environment.yml. To install it, follow the instructions below:
conda env create -f environment.yml # install dependencies
conda activate app-ext-val # activate environment
The data for both the Regensburg and Düsseldorf cohorts is available in an anonymized format in the data folder as CSV files.
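As a quick way to inspect the data, the following minimal sketch reads every CSV file found in the data folder using only the Python standard library. The helper name load_csv_tables is hypothetical and not part of this repository; the sketch also assumes comma-separated, UTF-8 encoded files with a header row, which may not hold for the actual files. No file names are assumed, since the sketch simply lists whatever CSV files are present.

```python
import csv
from pathlib import Path


def load_csv_tables(data_dir):
    """Read every CSV file in data_dir into a list of row dictionaries.

    Hypothetical helper for illustration; assumes comma-separated,
    UTF-8 encoded files whose first line is a header.
    """
    data_dir = Path(data_dir)
    tables = {}
    if not data_dir.is_dir():
        return tables  # nothing to load if the folder is missing
    for csv_path in sorted(data_dir.glob("*.csv")):
        with csv_path.open(newline="", encoding="utf-8") as handle:
            # Key each table by file name without the .csv extension.
            tables[csv_path.stem] = list(csv.DictReader(handle))
    return tables


if __name__ == "__main__":
    # "data" is the folder named above; file names inside it are not assumed.
    for name, rows in load_csv_tables("data").items():
        n_columns = len(rows[0]) if rows else 0
        print(f"{name}: {len(rows)} rows, {n_columns} columns")
```

All values are read as strings, so numeric columns would need to be converted before any analysis. The notebooks in this repository may load the data differently.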
The utils folder contains utility functions and scripts.
The notebooks folder contains Jupyter notebooks for the analysis:
  summary_statistics.ipynb performs exploratory data analysis
  external_validation.ipynb performs external validation of the predictive models
  retraining.ipynb explores model retraining
  variable_importance.ipynb explores variable importance
This repository is maintained by Ričards Marcinkevičs (richard.martsinkevich@gmail.com).
To cite the paper, please use:
@article{MarcinkevicsSokol2024,
title = {External Validation of Predictive Models for Diagnosis, Management and Severity of Pediatric Appendicitis},
url = {http://dx.doi.org/10.1101/2024.10.28.24316300},
DOI = {10.1101/2024.10.28.24316300},
publisher = {Cold Spring Harbor Laboratory},
author = {Marcinkevi\v{c}s, Ri\v{c}ards and Sokol, Kacper and Paulraj, Akhil and Hilbert, Melinda A. and Rimili, Vivien and Wellmann, Sven and Knorr, Christian and Reingruber, Bertram and Vogt, Julia E. and Reis Wolfertstetter, Patricia},
year = {2024}
}
This repository is additionally licensed under CC-BY-NC-4.0.