This repository allows you to experiment with the R
code used to
generate the conference presentation “scorecal - Empirical score
calibration under the microscope”.
To run this code without installing anything, click on the launch binder
badge above. This will launch Rstudio on a free, small, cloud
instance. It may take two or three minutes to launch the server and
initialise Rstudio.
You will eventually get a web browser tab with the Rstudio IDE running
in it, and with this scorecal_CSCC_2019
project open in it. Find the
file scorecal_CSCC_2019.Rmd
in the Files
tab of the bottom, right
pane, and click on its name. This will open the file for editing in the
top, left pane. Then click the Knit
button just above the code. This
will execute all the code in the notebook and render the results to
PDF. You will see progress messages in the R Markdown
tab of the bottom, left pane. The rendered PDF file will open
in a pop-up window.
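If you prefer the R console to the Knit button, the same rendering step can be sketched as follows. This is a minimal sketch, assuming the working directory is the project root and that the notebook's YAML header already declares the PDF output format; the existence check is only there so the snippet fails gently elsewhere.

```r
# Sketch: knit the notebook from the R console instead of the Knit button.
# Assumption: the working directory is the project root holding the notebook.
notebook <- "scorecal_CSCC_2019.Rmd"

if (file.exists(notebook) && requireNamespace("rmarkdown", quietly = TRUE)) {
  # render() uses the output format declared in the notebook's YAML header
  # and returns the path of the rendered file.
  output_file <- rmarkdown::render(notebook)
  message("Rendered: ", output_file)
} else {
  message("Notebook or rmarkdown package not found; nothing rendered.")
}
```

Progress messages appear in the console rather than in the R Markdown tab when rendering this way.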
The notebook took a little under three minutes to knit when I tried it.
There will be many red messages in the R Markdown
tab while the PDF
file is being rendered. These can be ignored (provided you get
sensible-looking output). The only difference I could spot between the
rendered slides and the archived presentation is that the font used in
the archived presentation is missing from the cloud instance, and the
default font used in its place has slightly wider spacing, resulting
in some text on the slides being slightly too long for the page.
You can edit the code in the notebook and re-knit it to see what happens. More likely, you will want to edit the code and just execute it in the notebook without knitting it to PDF. The results of each code chunk will be displayed in the notebook immediately after the code chunk. To do that, you will have to find out how to drive Rstudio.
The cloud instance is small, and has various constraints imposed on it. The limitations at the time of writing this are:
- The server has limited memory, so you cannot load large datasets or run big computations.
- It is meant for interactive, ephemeral coding, so an instance will die after 10 minutes of inactivity.
- An instance cannot be kept alive for more than 12 hours.
The holepunch package was used to convert this repository to a docker image for cloud execution.
This repository contains an executable R
notebook that generates
the presentation “scorecal - Empirical score calibration under the
microscope” given by Ross W. Gayler on
2019-08-30 at the conference Credit Scoring & Credit Control
XVI in Edinburgh,
UK. The presentation, as given, is archived at
The notebook contains all the R
code used to simulate and analyse the
data and generate the plots. This is in the form of a script rather than
a package. You are free to modify the script to see what happens. If you
are using the Rstudio IDE, edit the notebook file
and execute the relevant code chunks. The
results will be displayed in the notebook immediately after the code
chunks. Click the Knit
button to render a new copy of the presentation
slides with the results of your altered code.
This notebook requires the binb
package to enable rendering the output as a PDF presentation. It uses
the metropolis
template to set the style of the presentation. These
require a variety of LaTeX tools and fonts to be installed, in addition
to the rmarkdown
and knitr
infrastructure. See the binb documentation for
installation advice.
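Before knitting on your own machine, it may help to check which of these prerequisites are already present. The following is a rough sketch rather than an official installation procedure: the package names come from the paragraph above, while checking for `pdflatex` and `xelatex` on the PATH is an assumption about which LaTeX engines might be needed.

```r
# Packages named in the text above as requirements for rendering.
required <- c("binb", "rmarkdown", "knitr")

# requireNamespace() returns TRUE/FALSE without attaching the package.
installed <- vapply(required, requireNamespace, logical(1), quietly = TRUE)
missing_pkgs <- required[!installed]

if (length(missing_pkgs) > 0) {
  message("Missing packages: ", paste(missing_pkgs, collapse = ", "))
  # Uncomment to install the missing packages from CRAN:
  # install.packages(missing_pkgs)
} else {
  message("All required R packages are installed.")
}

# Sys.which() returns "" for executables that are not found on the PATH.
# Assumption: one of these LaTeX engines is what the template will call.
latex_engines <- Sys.which(c("pdflatex", "xelatex"))
print(latex_engines)
```

An empty string next to an engine name means that engine was not found, which points to a missing or incomplete LaTeX installation.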
Conference presentation abstract
Score calibration is the process of empirically determining the relationship between a score and an outcome on some population of interest, and scaling is the process of expressing that relationship in agreed units. Calibration is often treated as a simple matter and attacked with simple tools – typically, either assuming the relationship between score and log-odds is linear and fitting a logistic regression with the score as the only covariate, or dividing the score range into bands and plotting the empirical log-odds as a function of score band.
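The two simple approaches described above can be sketched in base R. This is an illustration on simulated data, not code taken from the notebook or from the scorecal package; the sample size, score distribution, coefficients, and the choice of ten quantile bands are all assumptions made for the example.

```r
set.seed(1)

# Simulated data (illustrative only): the true log-odds of the outcome are
# exactly linear in the score.
n <- 5000
score <- rnorm(n, mean = 600, sd = 50)
true_log_odds <- -3 + 0.02 * (score - 600)
outcome <- rbinom(n, size = 1, prob = plogis(true_log_odds))

# Approach 1: logistic regression with the score as the only covariate.
fit <- glm(outcome ~ score, family = binomial)
print(coef(fit))

# Approach 2: divide the score range into bands (here, deciles) and compute
# the empirical log-odds in each band.
breaks <- quantile(score, probs = seq(0, 1, by = 0.1))
band <- cut(score, breaks = breaks, include.lowest = TRUE)
n_pos <- tapply(outcome, band, sum)
n_tot <- tapply(outcome, band, length)
# Adding 0.5 to each count avoids infinite log-odds when a band has no
# positive (or no negative) outcomes.
band_log_odds <- log((n_pos + 0.5) / (n_tot - n_pos + 0.5))
band_mean <- tapply(score, band, mean)

# Plot the banded estimates with the fitted logistic line over them.
plot(band_mean, band_log_odds, xlab = "score", ylab = "log-odds")
abline(a = coef(fit)[1], b = coef(fit)[2])
```

Because the simulated relationship is linear, the banded points should scatter around the fitted line; with real scores the interesting cases are where they do not.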
Both approaches ignore some information in the data. The assumption of a linear score to log-odds relationship is too restrictive and score banding ignores the continuity of the scores. While a linear score to log-odds relationship is often an adequate approximation, the reality can be much more interesting, with noticeable deviations from the linear trend. These deviations include large-scale non-linearity, small-scale non-monotonicity, discrete discontinuities, and complete breakdown of the linear trend at extreme scores.
Detecting these effects requires a more sophisticated approach to empirically determining the score to outcome relationship. Taking a more sophisticated approach can be surprisingly tricky: the typically strong linear trend can obscure smaller deviations from linearity; detecting subtle trends requires exploiting the continuity of the scores, which can obscure discrete deviations; trends at extreme scores (out in the data-sparse tails of the distribution of scores) can be obscured by trends at less extreme scores (where there is more data); score distributions with some specific values that are relatively common can disrupt methods relying on continuity; and any modelling technique can introduce its own biases.
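One common way to relax the linearity assumption while still exploiting the continuity of the scores is a smooth term in a generalised additive model. The sketch below uses `mgcv::gam()` on simulated data with a deliberately curved relationship. It is a generic illustration and should not be read as the method implemented in the scorecal package; the data-generating values are assumptions.

```r
library(mgcv)
set.seed(2)

# Simulated data (illustrative only): the true log-odds are curved in score.
n <- 5000
score <- rnorm(n, mean = 600, sd = 50)
z <- (score - 600) / 50
true_log_odds <- -3 + 1.0 * z - 0.3 * z^2
outcome <- rbinom(n, size = 1, prob = plogis(true_log_odds))

# Linear calibration versus a penalised smooth of the score.
linear_fit <- glm(outcome ~ score, family = binomial)
smooth_fit <- gam(outcome ~ s(score), family = binomial, method = "REML")

# Compare the two fits on the log-odds (link) scale over a grid of scores.
grid <- data.frame(score = seq(450, 750, by = 5))
pred <- predict(smooth_fit, newdata = grid, type = "link", se.fit = TRUE)
grid$linear <- as.numeric(predict(linear_fit, newdata = grid, type = "link"))
grid$smooth <- as.numeric(pred$fit)
# Approximate pointwise band: estimate plus or minus two standard errors.
grid$lower <- grid$smooth - 2 * as.numeric(pred$se.fit)
grid$upper <- grid$smooth + 2 * as.numeric(pred$se.fit)

plot(grid$score, grid$smooth, type = "l", xlab = "score", ylab = "log-odds",
     ylim = range(grid$lower, grid$upper))
lines(grid$score, grid$lower, lty = 3)
lines(grid$score, grid$upper, lty = 3)
lines(grid$score, grid$linear, lty = 2)
```

The band widens towards the extreme scores, where data are sparse, which is one visible symptom of the tail problem described above. A single global smooth like this one would also blur any discrete discontinuity in the relationship.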
Over the years I have developed a personal approach to these issues in score calibration and implemented it as an open-source, publicly accessible R package for score calibration. I discuss these technical issues in empirical score calibration and show how they are addressed in the scorecal package.
Thanks to:
Jonathan Crook and the Credit Research Centre, University of Edinburgh Business School, for running a great conference.
Mathew Ling and the fine folk of
ANZORN for helping me use
holepunch to make this
repository remotely executable.
All the maintainers of the R
packages used in this repository, and the
maintainers of all their dependencies:

| maintainer | packages |
|---|---|
| Hadley Wickham | dplyr, forcats, stringr, tidyverse |
| Jim Hester | fs, glue |
| Kirill Müller | here, tibble |
| Yihui Xie | knitr, rmarkdown |
| Dirk Eddelbuettel | binb |
| Sundar Dorai-Raj | binom |
| Claus O. Wilke | cowplot |
| Jacob Kaplan | fastDummies |
| Trevor Hastie | glmnet |
| Hong Ooi | glmnetUtils |
| R Core Team | grid |
| Karthik Ram | holepunch |
| Stefan Milton Bache | magrittr |
| Simon Wood | mgcv |
| Thomas Lin Pedersen | patchwork |
| Adelchi Azzalini | sn |
| Gábor Csárdi | sessioninfo |
| Dirk Schumacher | thankr |
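
A maintainer listing of this kind can be approximated with base R alone. The sketch below is not necessarily how the table above was produced; it simply reads the Maintainer field from each package's DESCRIPTION file. The three package names are placeholders chosen because they ship with every R installation.

```r
# Placeholder package names; substitute the packages you actually use.
pkgs <- c("stats", "utils", "grid")

# packageDescription() reads a package's DESCRIPTION file; asking for a
# single field returns it as a character string (NA if it is not found).
get_maintainer <- function(pkg) {
  as.character(
    suppressWarnings(utils::packageDescription(pkg, fields = "Maintainer"))
  )
}

maintainers <- vapply(pkgs, get_maintainer, character(1))
result <- data.frame(package = pkgs, maintainer = maintainers,
                     row.names = NULL)
print(result)
```

Grouping the result by maintainer gives a layout like the table above.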