Measure agreement between chart reviewers.
Whether your chart annotations come from humans, machine learning, or coded data like ICD-10, chart-review can compare them to reveal interesting statistics like the following (a short sketch after these lists shows how each is computed):
Accuracy
- F1-score (agreement)
- Cohen's Kappa (agreement)
- Sensitivity and Specificity
- Positive (PPV) or Negative Predictive Value (NPV)
- False Negative Rate (FNR)
Confusion Matrix
- TP = True Positive
- TN = True Negative
- FP = False Positive (type I error)
- FN = False Negative (type II error)
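
To make these definitions concrete, here is a minimal Python sketch (not chart-review's own code) showing how each statistic above derives from the four confusion-matrix counts:

```python
def metrics(tp: int, fp: int, tn: int, fn: int) -> dict:
    """Derive agreement statistics from confusion-matrix counts.

    Zero-division guards are omitted for brevity; real code should
    handle labels with no positive (or negative) examples.
    """
    total = tp + fp + tn + fn
    sens = tp / (tp + fn)              # sensitivity (recall)
    spec = tn / (tn + fp)              # specificity
    ppv = tp / (tp + fp)               # positive predictive value (precision)
    npv = tn / (tn + fn)               # negative predictive value
    f1 = 2 * ppv * sens / (ppv + sens)
    fnr = fn / (fn + tp)               # false negative rate
    accuracy = (tp + tn) / total
    # Cohen's kappa: observed agreement corrected for chance agreement
    p_observed = accuracy
    p_expected = ((tp + fp) * (tp + fn) + (fn + tn) * (fp + tn)) / total**2
    kappa = (p_observed - p_expected) / (1 - p_expected)
    return {
        "F1": f1, "Sens": sens, "Spec": spec, "PPV": ppv, "NPV": npv,
        "Kappa": kappa, "FNR": fnr, "Accuracy": accuracy,
    }
```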
For guides on installing & using Chart Review, read our documentation.
```shell
$ ls
config.yaml  labelstudio-export.json
$ chart-review accuracy jill jane
Comparing 3 charts (1, 3–4)
Truth: jill
Annotator: jane

F1     Sens  Spec  PPV  NPV   Kappa  TP  FN  TN  FP  Label
0.667  0.75  0.6   0.6  0.75  0.341  3   1   3   2   *
0.667  0.5   1.0   1.0  0.5   0.4    1   1   1   0   Cough
1.0    1.0   1.0   1.0  1.0   1.0    2   0   1   0   Fatigue
0      0     0     0    0     0      0   0   1   2   Headache
```
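
Plugging the totals row of that output (TP=3, FN=1, TN=3, FP=2) into the sketch above reproduces the printed figures, which is a handy sanity check when interpreting the table:

```python
row = metrics(tp=3, fp=2, tn=3, fn=1)
print({k: round(v, 3) for k, v in row.items()})
# {'F1': 0.667, 'Sens': 0.75, 'Spec': 0.6, 'PPV': 0.6, 'NPV': 0.75,
#  'Kappa': 0.341, 'FNR': 0.25, 'Accuracy': 0.667}
```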
We love 💖 contributions!
If you have a good suggestion 💡 or have found a bug 🐛, read our brief contributors guide for pointers to filing issues and what to expect.