Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Incorporate CRPS explanation into documentation #220

Open
lboeman opened this issue Mar 1, 2022 · 0 comments
Open

Incorporate CRPS explanation into documentation #220

lboeman opened this issue Mar 1, 2022 · 0 comments

Comments

@lboeman
Copy link
Member

lboeman commented Mar 1, 2022

During the American Made Challenges Solar Forecasting Competition, users had questions about how we handle CRPS using our descrete quantiles. @dplarson provided the following helpful explanation which should be incorporated into the documentation:


The Solar Forecast Arbiter (SFA) uses numerical integration to approximate the integral term in the CRPS definition but does not interpolate the forecast CDF (since the SFA makes no assumptions regarding how the forecasts were generated). There are tradeoffs with this approach, but that is true of any CRPS formulation where you don’t assume the forecast is defined by a specific continuous parametric function (and therefore must approximate the CRPS numerically). The SFA documentation (https://solarforecastarbiter-core.readthedocs.io/en/latest/generated/solarforecastarbiter.metrics.probabilistic.continuous_ranked_probability_score.html) and source code (https://solarforecastarbiter-core.readthedocs.io/en/latest/_modules/solarforecastarbiter/metrics/probabilistic.html#continuous_ranked_probability_score) have more details.

Working through a CRPS calculation by hand is tedious, but an example to illustrate: Let’s assume the target variable is GHI [W/m^2] and the probabilistic forecast is given as:

Prob(Power <= 42 W/m^2) = 0%
Prob(Power <= 408 W/m^2) = 10%
Prob(Power <= 474 W/m^2) = 20%
Prob(Power <= 521 W/m^2) = 30%
Prob(Power <= 562 W/m^2) = 40%
Prob(Power <= 600 W/m^2) = 50%
Prob(Power <= 638 W/m^2) = 60%
Prob(Power <= 679 W/m^2) = 70%
Prob(Power <= 726 W/m^2) = 80%
Prob(Power <= 792 W/m^2) = 90%
Prob(Power <= 1158 W/m^2) = 100%

which is approximately the CDF of a Gaussian (aka normal) distribution with mean = 600 W/m^2 and standard deviation = 150 W/m^2.

The steps to calculate the CRPS for a single forecast timestamp is:

calculate the indicator function term: 1{x >= y} = 1 if x >= y, else 0 (where y is the observation)
calculate the integrand term: ( F(x) – 1{x >= y} )^2
integrate along x

If the observation is 710 W/m^2, then in vector form we have:

x = [42, 408, 474, 521, 562, 600, 638, 679, 726, 792, 1158] # the forecasted GHI [W/m^2]
F(x) = [0, 0.1, 0.2, 0.3, 0.4, 0.5, 0.6, 0.7, 0.8, 0.9, 1.0] # the forecasted probability [-]
y = 710 # the observed GHI [W/m^2]

Then the calculation is:

indicator function term: 1{x >= y} = [0, 0, 0, 0, 0, 0, 0, 0, 1, 1, 1] # only {726, 792, 1158} are >= 710 W/m^2 (the {80%, 90%, 100%} forecasts)
the difference between the forecast probability and the indicator function: F(x) – 1{x >= y} = [0.0, 0.1, 0.2, 0.3, 0.4, 0.5, 0.6, 0.7, -0.2, -0.1, 0.0]
the integrand term: ( F(x) – 1{x >= y} )^2 = [0.0, 0.01, 0.04, 0.09, 0.16, 0.25, 0.36, 0.49, 0.04, 0.01, 0.0]
integrate along x (using trapezoid rule): CRPS = 64.4 W/m^2

You can follow the same approach for other observations:

same forecasts, but the observation is 450 W/m^2: CRPS = 98.5 W/m^2
same forecasts, but the observation is 1000 W/m^2: CRPS = 271.1 W/m^2

Note: in this example, we know the forecast is from a Gaussian distribution and the CRPS of a Gaussian can be calculated analytically [1]. If the observation is 710 W/m^2, then the analytical calculation of the CRPS is 65.9 W/m^2 vs the SFA CRPS of 64.4 W/m^2. The difference here is due to the limits of numerical integration. For example, increasing the probability resolution results in the SFA CRPS converging to the analytically calculated CRPS (65.9 W/m^2):

forecast every 10% (i.e., 0%, 10%, …, 90%, 100%): CRPS = 64.4 W/m^2
forecast every 1% (i.e., 0%, 1%, …, 99%, 100%): CRPS = 65.0 W/m^2
forecast every 0.1% (i.e., 0%, 0.1%, …, 99.9%, 100%): CRPS = 65.9 W/m^2

[1] Gneiting et al. (2005) “Calibrated Probabilistic Forecasting Using Ensemble Model Output Statistics and Minimum CRPS Estimation”, doi: https://doi.org/10.1175/MWR2904.1

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

1 participant