Skip to content

Commit

Permalink
add doc for gratia enhancements
Browse files Browse the repository at this point in the history
  • Loading branch information
Nicholas Clark committed Jul 25, 2024
1 parent b7ee4ec commit 0a3ae5e
Show file tree
Hide file tree
Showing 80 changed files with 957 additions and 398 deletions.
1 change: 1 addition & 0 deletions NEWS.md
Original file line number Diff line number Diff line change
Expand Up @@ -2,6 +2,7 @@
## New functionalities
* Allow intercepts to be included in process models when `trend_formula` is supplied. This breaks the assumption that the process has to be zero-centred, adding flexibility but also potentially inducing nonidentifiabilities with respect to any observation model intercepts. Thoughtful priors are a must for these models
* Added `stancode.mvgam` and `stancode.mvgam_prefit` methods
* Added 'gratia' to *Enhancements* to allow popular methods such as `draw()` to be used for 'mvgam' models if 'gratia' is already installed

## Deprecations
* The `drift` argument has been deprecated. It is now recommended for users to include parametric fixed effects of "time" in their respective GAM formulae to capture any expected drift effects
Expand Down
2 changes: 1 addition & 1 deletion R/dynamic.R
Original file line number Diff line number Diff line change
Expand Up @@ -41,7 +41,7 @@
#'@examples
#'\donttest{
#'# Simulate a time-varying coefficient
#'#(as a Gaussian Process with length scale = 10)
#'# (as a Gaussian Process with length scale = 10)
#'set.seed(1111)
#'N <- 200
#'
Expand Down
147 changes: 54 additions & 93 deletions R/gratia_methods.R
Original file line number Diff line number Diff line change
Expand Up @@ -24,14 +24,17 @@
}
}

#' Plot estimated smooths from a fitted mvgam model
#'
#' Plots estimated smooths from a fitted mvgam model in a similar way to
#' `mgcv::plot.gam()` but instead of using base graphics, [ggplot2::ggplot()]
#' is used instead.
#' Enhance mvgam post-processing using gratia functionality
#'
#' These evaluation and plotting functions exist to allow some popular `gratia`
#' methods to work with `mvgam` models
#' @name gratia_mvgam_enhancements
#' @param object a fitted mvgam, the result of a call to [mvgam()].
#' @param data a optional data frame that may or may not be used? FIXME!
#' @param data a data frame of covariate values at which to evaluate the
#' model's smooth functions.
#' @param smooth a smooth object of class `"gp.smooth"` (returned from a model using either the
#' `dynamic()` function or the `gp()` function) or of class `"moi.smooth"` or `"mod.smooth"`
#' (returned from a model using the 'moi' or 'mod' basis).
#' @param trend_effects logical specifying whether smooth terms from the `trend_formula` should
#' be drawn. If `FALSE`, only terms from the observation formula are drawn. If `TRUE`, only
#' terms from the `trend_formula` are drawn.
Expand Down Expand Up @@ -65,15 +68,16 @@
#' @param ci_level numeric between 0 and 1; the coverage of credible interval.
#' @param n numeric; the number of points over the range of the covariate at
#' which to evaluate the smooth.
#' @param n_3d numeric; the number of new observations to generate for the third
#' dimension of a 3D smooth.
#' @param n_4d numeric; the number of new observations to generate for the
#' dimensions higher than 2 (!) of a *k*D smooth (*k* >= 4). For example, if
#' the smooth is a 4D smooth, each of dimensions 3 and 4 will get `n_4d`
#' new observations.
#' @param unconditional logical; should confidence intervals include the
#' uncertainty due to smoothness selection? If `TRUE`, the corrected Bayesian
#' covariance matrix will be used.
#' @param n_3d,n_4d numeric; the number of points over the range of last
#' covariate in a 3D or 4D smooth. The default is `NULL` which achieves the
#' standard behaviour of using `n` points over the range of all covariate,
#' resulting in `n^d` evaluation points, where `d` is the dimension of the
#' smooth. For `d > 2` this can result in very many evaluation points and slow
#' performance. For smooths of `d > 4`, the value of `n_4d` will be used for
#' all dimensions `> 4`, unless this is `NULL`, in which case the default
#' behaviour (using `n` for all dimensions) will be observed.
#' @param unconditional ignored for `mvgam` models as all appropriate
#' uncertainties are already included in the posterior estimates.
#' @param overall_uncertainty ignored for `mvgam` models as all appropriate
#' uncertainties are already included in the posterior estimates.
#' @param dist numeric; if greater than 0, this is used to determine when
Expand All @@ -93,7 +97,6 @@
#' @param ci_col colour specification for the confidence/credible intervals
#' band. Affects the fill of the interval.
#' @param smooth_col colour specification for the smooth line.
#' @param resid_col colour specification for the partial residuals.
#' @param contour_col colour specification for contour lines.
#' @param n_contour numeric; the number of contour bins. Will result in
#' `n_contour - 1` contour lines being drawn. See [ggplot2::geom_contour()].
Expand Down Expand Up @@ -140,20 +143,32 @@
#' @param wrap logical; wrap plots as a patchwork? If \code{FALSE}, a list of
#' ggplot objects is returned, 1 per term plotted.
#' @param envir an environment to look up the data within.
#' @param ... additional arguments passed to [patchwork::wrap_plots()].
#' @param ... additional arguments passed to other methods.
#'
#' @note Internally, plots of each smooth are created using [ggplot2::ggplot()]
#' and composed into a single plot using [patchwork::wrap_plots()]. As a
#' result, it is not possible to use `+` to add to the plots in the way one
#' might typically work with `ggplot()` plots. Instead, use the `&` operator;
#' see the examples.
#'
#' @return The object returned is created by [patchwork::wrap_plots()].
#'
#' @author Nicholas Clark
#' @details These methods allow `mvgam` models to be *Enhanced* if users have the `gratia`
#' package installed, making available the popular `draw()` function to plot partial effects
#' of `mvgam` smooth functions using [ggplot2::ggplot()] utilities
#' @author Nicholas J Clark
#' @examples
#' \dontrun{
#' # Fit a simple GAM and draw partial effects of smooths using gratia
#' set.seed(0)
#' library(ggplot2); theme_set(theme_bw())
#' library(gratia)
#' dat <- mgcv::gamSim(1, n = 200, scale = 2)
#' mod <- mvgam(y ~ s(x1, bs = 'moi') +
#' te(x0, x2), data = dat,
#' family = gaussian())
#'
#' draw(mod)
#'}

NULL



#' @rdname gratia_mvgam_enhancements
#' @aliases draw.mvgam
#'
#' @export
#'
`drawDotmvgam` <- function(
Expand Down Expand Up @@ -217,7 +232,7 @@
n = n,
n_3d = n_3d,
n_4d = n_4d,
unconditional = unconditional,
unconditional = FALSE,
overall_uncertainty = FALSE,
constant = constant,
fun = fun,
Expand Down Expand Up @@ -263,7 +278,7 @@
n = n,
n_3d = n_3d,
n_4d = n_4d,
unconditional = unconditional,
unconditional = FALSE,
overall_uncertainty = FALSE,
constant = constant,
fun = fun,
Expand Down Expand Up @@ -299,38 +314,7 @@
sm_plots
}

#' Evaluation of a Hilbert space approximate GP functions in mvgam
#' These evaluation functions are needed so that gratia::draw methods work with mvgam
#' dynamic smooths
#' @rdname GP
#' @param model an object of class `"gam"`
#' @param smooth a smooth object of class `"gp.smooth"`, returned from a model using either the
#' `dynamic()` function or the `gp()` function, which both make use of with a Hilbert
#' space approximate GPs
#' @param n numeric; the number of points over the range of the covariate at
#' which to evaluate the smooth.
#' @param n_3d,n_4d numeric; the number of points over the range of last
#' covariate in a 3D or 4D smooth. The default is `NULL` which achieves the
#' standard behaviour of using `n` points over the range of all covariate,
#' resulting in `n^d` evaluation points, where `d` is the dimension of the
#' smooth. For `d > 2` this can result in very many evaluation points and slow
#' performance. For smooths of `d > 4`, the value of `n_4d` will be used for
#' all dimensions `> 4`, unless this is `NULL`, in which case the default
#' behaviour (using `n` for all dimensions) will be observed.
#' @param data a data frame of covariate values at which to evaluate the
#' smooth.
#' @param unconditional logical; should confidence intervals include the
#' uncertainty due to smoothness selection? If `TRUE`, the corrected Bayesian
#' covariance matrix will be used.
#' @param overall_uncertainty logical; should the uncertainty in the model
#' constant term be included in the standard error of the evaluate values of
#' the smooth?
#' @param dist numeric; if greater than 0, this is used to determine when
#' a location is too far from data to be plotted when plotting 2-D smooths.
#' The data are scaled into the unit square before deciding what to exclude,
#' and `dist` is a distance within the unit square. See
#' [mgcv::exclude.too.far()] for further details.
#' @param ... ignored.
#' @rdname gratia_mvgam_enhancements
#' @aliases eval_smooth.hilbert.smooth
#' @export
eval_smoothDothilbertDotsmooth = function(smooth,
Expand Down Expand Up @@ -476,36 +460,7 @@ eval_smoothDothilbertDotsmooth = function(smooth,
return(eval_sm)
}

#' Evaluation of a monotonic functions in mvgam
#' These evaluation functions are needed so that gratia::draw methods work with mvgam
#' monotonic smooths
#' @rdname eval_smooth_monotonic
#' @param model an object of class `"gam"`
#' @param smooth a smooth object of class `"moi.smooth"` or `"mod.smooth"`
#' @param n numeric; the number of points over the range of the covariate at
#' which to evaluate the smooth.
#' @param n_3d,n_4d numeric; the number of points over the range of last
#' covariate in a 3D or 4D smooth. The default is `NULL` which achieves the
#' standard behaviour of using `n` points over the range of all covariate,
#' resulting in `n^d` evaluation points, where `d` is the dimension of the
#' smooth. For `d > 2` this can result in very many evaluation points and slow
#' performance. For smooths of `d > 4`, the value of `n_4d` will be used for
#' all dimensions `> 4`, unless this is `NULL`, in which case the default
#' behaviour (using `n` for all dimensions) will be observed.
#' @param data a data frame of covariate values at which to evaluate the
#' smooth.
#' @param unconditional logical; should confidence intervals include the
#' uncertainty due to smoothness selection? If `TRUE`, the corrected Bayesian
#' covariance matrix will be used.
#' @param overall_uncertainty logical; should the uncertainty in the model
#' constant term be included in the standard error of the evaluate values of
#' the smooth?
#' @param dist numeric; if greater than 0, this is used to determine when
#' a location is too far from data to be plotted when plotting 2-D smooths.
#' The data are scaled into the unit square before deciding what to exclude,
#' and `dist` is a distance within the unit square. See
#' [mgcv::exclude.too.far()] for further details.
#' @param ... ignored.
#' @rdname gratia_mvgam_enhancements
#' @aliases eval_smooth.mod.smooth
#' @export
eval_smoothDotmodDotsmooth = function(smooth,
Expand Down Expand Up @@ -562,7 +517,7 @@ eval_smoothDotmodDotsmooth = function(smooth,
eval_sm
}

#' @rdname eval_smooth_monotonic
#' @rdname gratia_mvgam_enhancements
#' @aliases eval_smooth.moi.smooth
#' @export
eval_smoothDotmoiDotsmooth = function(smooth,
Expand Down Expand Up @@ -666,7 +621,7 @@ eval_smoothDotmoiDotsmooth = function(smooth,
)
} else {
smooth <- gratia::get_smooths_by_id(model, id)[[1L]]
vars <- gratia::smooth_variable(smooth)
vars <- smooth_variable(smooth)
by_var <- gratia::by_variable(smooth)
if (!identical(by_var, "NA")) {
vars <- append(vars, by_var)
Expand Down Expand Up @@ -710,3 +665,9 @@ lss_eta_index <- function(object){
lpi
}
}

#' @noRd
smooth_variable <- function (smooth) {
gratia::check_is_mgcv_smooth(smooth)
smooth[["term"]]
}
6 changes: 6 additions & 0 deletions README.Rmd
Original file line number Diff line number Diff line change
Expand Up @@ -180,6 +180,12 @@ plot_mvgam_smooth(lynx_mvgam, series = 1,
derivatives = TRUE)
```

If you have the `gratia` package installed, it can also be used to plot partial effects of smooths on the link scale
```{r, fig.alt = "Plotting GAM smooth functions in mvgam using gratia"}
require(gratia)
draw(lynx_mvgam)
```

As for many types of regression models, it is often more useful to plot model effects on the outcome scale. `mvgam` has support for the wonderful `marginaleffects` package, allowing a wide variety of posterior contrasts, averages, conditional and marginal predictions to be calculated and plotted. Below is the conditional effect of season plotted on the outcome scale, for example:
```{r, fig.alt = "Using marginaleffects and mvgam to plot GAM smooth functions in R"}
require(ggplot2); require(marginaleffects)
Expand Down
65 changes: 42 additions & 23 deletions README.md
Original file line number Diff line number Diff line change
Expand Up @@ -234,29 +234,29 @@ summary(lynx_mvgam)
#>
#>
#> GAM coefficient (beta) estimates:
#> 2.5% 50% 97.5% Rhat n_eff
#> (Intercept) 6.400 6.60 6.900 1 709
#> s(season).1 -0.680 -0.13 0.360 1 1111
#> s(season).2 0.730 1.30 1.900 1 1091
#> s(season).3 1.200 1.90 2.500 1 733
#> s(season).4 -0.085 0.54 1.100 1 900
#> s(season).5 -1.300 -0.68 -0.089 1 850
#> s(season).6 -1.200 -0.54 0.130 1 1139
#> s(season).7 0.074 0.71 1.400 1 1063
#> s(season).8 0.620 1.30 2.100 1 715
#> s(season).9 -0.380 0.21 0.830 1 839
#> s(season).10 -1.400 -0.85 -0.350 1 871
#> 2.5% 50% 97.5% Rhat n_eff
#> (Intercept) 6.4000 6.60 6.900 1 718
#> s(season).1 -0.6500 -0.13 0.420 1 987
#> s(season).2 0.7600 1.30 1.900 1 865
#> s(season).3 1.3000 1.90 2.600 1 816
#> s(season).4 -0.0360 0.53 1.100 1 909
#> s(season).5 -1.4000 -0.69 -0.096 1 759
#> s(season).6 -1.3000 -0.56 0.150 1 869
#> s(season).7 0.0092 0.71 1.400 1 970
#> s(season).8 0.6100 1.40 2.000 1 802
#> s(season).9 -0.3500 0.21 0.840 1 897
#> s(season).10 -1.4000 -0.86 -0.370 1 1366
#>
#> Approximate significance of GAM smooths:
#> edf Ref.df Chi.sq p-value
#> s(season) 9.97 10 48.3 <2e-16 ***
#> s(season) 9.96 10 49.6 <2e-16 ***
#> ---
#> Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
#>
#> Latent trend parameter AR estimates:
#> 2.5% 50% 97.5% Rhat n_eff
#> ar1[1] 0.60 0.83 0.97 1 656
#> sigma[1] 0.39 0.47 0.62 1 715
#> ar1[1] 0.57 0.83 0.98 1.01 659
#> sigma[1] 0.38 0.48 0.60 1.01 745
#>
#> Stan MCMC diagnostics:
#> n_eff / iter looks reasonable for all parameters
Expand All @@ -265,7 +265,7 @@ summary(lynx_mvgam)
#> 0 of 2000 iterations saturated the maximum tree depth of 12 (0%)
#> E-FMI indicated no pathological behavior
#>
#> Samples were drawn using NUTS(diag_e) at Mon Jul 01 8:32:43 AM 2024.
#> Samples were drawn using NUTS(diag_e) at Thu Jul 25 3:12:50 PM 2024.
#> For each parameter, n_eff is a crude measure of effective sample size,
#> and Rhat is the potential scale reduction factor on split MCMC chains
#> (at convergence, Rhat = 1)
Expand Down Expand Up @@ -364,6 +364,25 @@ plot_mvgam_smooth(lynx_mvgam, series = 1,

<img src="man/figures/README-unnamed-chunk-18-1.png" alt="Plotting GAM smooth functions in mvgam and R" width="60%" style="display: block; margin: auto;" />

If you have the `gratia` package installed, it can also be used to plot
partial effects of smooths on the link scale

``` r
require(gratia)
#> Loading required package: gratia
#>
#> Attaching package: 'gratia'
#> The following object is masked from 'package:mvgam':
#>
#> add_residuals
#> The following object is masked from 'package:brms':
#>
#> posterior_samples
draw(lynx_mvgam)
```

<img src="man/figures/README-unnamed-chunk-19-1.png" alt="Plotting GAM smooth functions in mvgam using gratia" width="60%" style="display: block; margin: auto;" />

As for many types of regression models, it is often more useful to plot
model effects on the outcome scale. `mvgam` has support for the
wonderful `marginaleffects` package, allowing a wide variety of
Expand All @@ -378,18 +397,18 @@ plot_predictions(lynx_mvgam, condition = 'season', points = 0.5) +
theme_classic()
```

<img src="man/figures/README-unnamed-chunk-19-1.png" alt="Using marginaleffects and mvgam to plot GAM smooth functions in R" width="60%" style="display: block; margin: auto;" />
<img src="man/figures/README-unnamed-chunk-20-1.png" alt="Using marginaleffects and mvgam to plot GAM smooth functions in R" width="60%" style="display: block; margin: auto;" />

We can also view the `mvgam`’s posterior predictions for the entire
series (testing and training)

``` r
plot(lynx_mvgam, type = 'forecast', newdata = lynx_test)
#> Out of sample DRPS:
#> 2420.7128115
#> 2380.85453525
```

<img src="man/figures/README-unnamed-chunk-20-1.png" alt="Plotting forecast distributions using mvgam in R" width="60%" style="display: block; margin: auto;" />
<img src="man/figures/README-unnamed-chunk-21-1.png" alt="Plotting forecast distributions using mvgam in R" width="60%" style="display: block; margin: auto;" />

And the estimated latent trend component, again using the more flexible
`plot_mvgam_...()` option to show first derivatives of the estimated
Expand All @@ -399,7 +418,7 @@ trend
plot_mvgam_trend(lynx_mvgam, newdata = lynx_test, derivatives = TRUE)
```

<img src="man/figures/README-unnamed-chunk-21-1.png" alt="Plotting dynamic trend components using mvgam in R" width="60%" style="display: block; margin: auto;" />
<img src="man/figures/README-unnamed-chunk-22-1.png" alt="Plotting dynamic trend components using mvgam in R" width="60%" style="display: block; margin: auto;" />

A key aspect of ecological forecasting is to understand <a
href="https://esajournals.onlinelibrary.wiley.com/doi/full/10.1002/eap.1589"
Expand All @@ -416,7 +435,7 @@ text(1, 0.8, cex = 1.5, label="Trend component",
pos = 4, col="#7C0000", family = 'serif')
```

<img src="man/figures/README-unnamed-chunk-22-1.png" alt="Decomposing uncertainty contributions to forecasts in mvgam in R" width="60%" style="display: block; margin: auto;" />
<img src="man/figures/README-unnamed-chunk-23-1.png" alt="Decomposing uncertainty contributions to forecasts in mvgam in R" width="60%" style="display: block; margin: auto;" />

Both components contribute to forecast uncertainty. Diagnostics of the
model can also be performed using `mvgam`. Have a look at the model’s
Expand All @@ -429,7 +448,7 @@ our AR1 model is appropriate for the latent trend
plot(lynx_mvgam, type = 'residuals')
```

<img src="man/figures/README-unnamed-chunk-23-1.png" alt="Plotting Dunn-Smyth residuals for time series analysis in mvgam and R" width="60%" style="display: block; margin: auto;" />
<img src="man/figures/README-unnamed-chunk-24-1.png" alt="Plotting Dunn-Smyth residuals for time series analysis in mvgam and R" width="60%" style="display: block; margin: auto;" />

## Extended observation families

Expand Down Expand Up @@ -550,7 +569,7 @@ summary(mod, include_betas = FALSE)
#> 0 of 2000 iterations saturated the maximum tree depth of 12 (0%)
#> E-FMI indicated no pathological behavior
#>
#> Samples were drawn using NUTS(diag_e) at Mon Jul 01 8:34:07 AM 2024.
#> Samples were drawn using NUTS(diag_e) at Thu Jul 25 3:13:40 PM 2024.
#> For each parameter, n_eff is a crude measure of effective sample size,
#> and Rhat is the potential scale reduction factor on split MCMC chains
#> (at convergence, Rhat = 1)
Expand Down
Loading

0 comments on commit 0a3ae5e

Please sign in to comment.