rerun vignettes
paul-buerkner committed Aug 18, 2021
1 parent 06e40e6 commit 5fd9aed
Showing 23 changed files with 1,158 additions and 2,316 deletions.
5 changes: 4 additions & 1 deletion doc/brms_customfamilies.Rmd
@@ -42,6 +42,9 @@ list of distributions, which are not natively supported. The present vignette
will explain how to specify such *custom families* in **brms**. By doing that,
users can benefit from the modeling flexibility and post-processing options of
**brms** even when using self-defined response distributions.
+If you have built a custom family that you want to make available to other
+users, you can submit a pull request to this
+[GitHub repository](https://github.com/paul-buerkner/custom-brms-families).

## A Case Study

@@ -223,7 +226,7 @@ to worry too much about how `prep` is created (if you are interested, check
out the `prepare_predictions` function). Instead, all you need to know is
that parameters are stored in slot `dpars` and data are stored in slot `data`.
Generally, parameters take on the form of a $S \times N$ matrix (with $S =$
-number of posterior samples and $N =$ number of observations) if they are
+number of posterior draws and $N =$ number of observations) if they are
predicted (as is `mu` in our example) and a vector of size $N$ if the are not
predicted (as is `phi`).

75 changes: 38 additions & 37 deletions doc/brms_customfamilies.html

Large diffs are not rendered by default.

10 changes: 5 additions & 5 deletions doc/brms_distreg.Rmd
@@ -40,7 +40,7 @@ model, in which we can specify predictor terms for all parameters of the assumed
response distribution. In the vast majority of regression model implementations,
only the location parameter (usually the mean) of the response distribution
depends on the predictors and corresponding regression parameters. Other
-parameters (e.g., scale or shape parameters) are estimated as auxilliary
+parameters (e.g., scale or shape parameters) are estimated as auxiliary
parameters assuming them to be constant across observations. This assumption is
so common that most researchers applying regression models are often (in my
experience) not aware of the possibility of relaxing it. This is understandable
@@ -74,8 +74,8 @@ linear predictor can be any real number.

Unequal variance models are possibly the most simple, but nevertheless very
important application of distributional models. Suppose we have two groups of
-patients: One group recieves a treatment (e.g., an antidepressive drug) and
-another group recieves placebo. Since the treatment may not work equally well
+patients: One group receives a treatment (e.g., an antidepressive drug) and
+another group receives placebo. Since the treatment may not work equally well
for all patients, the symptom variance of the treatment group may be larger than
the symptom variance of the placebo group after some weeks of treatment. For
simplicity, assume that we only investigate the post-treatment values.
@@ -176,8 +176,8 @@ According to the parameter estimates, larger groups catch more fish, campers
catch more fish than non-campers, and groups with more children catch less fish.
The zero-inflation probability `zi` is pretty large with a mean of 41%. Please
note that the probability of catching no fish is actually higher than 41%, but
-parts of this probability are already modeled by the poisson distribution itself
-(hence the name zero-*inflation*). If you want to treat all zeros as origniating
+parts of this probability are already modeled by the Poisson distribution itself
+(hence the name zero-*inflation*). If you want to treat all zeros as originating
from a separate process, you can use hurdle models instead (not shown here).
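
The remark that the probability of catching no fish exceeds `zi` can be checked with a small calculation. This is only a sketch in base R: the value `zi = 0.41` is taken from the text, while the Poisson mean `lambda` is a made-up number for illustration, not an estimate from the fitted model.

```r
# Zero-inflation probability reported in the text
zi <- 0.41
# Hypothetical Poisson mean for one group (an assumption, not a model estimate)
lambda <- 1.5

# Zeros arise either from the inflation process or from the Poisson part
p_zero_poisson <- dpois(0, lambda)
p_zero_total <- zi + (1 - zi) * p_zero_poisson

# The total zero probability is larger than zi whenever the Poisson part
# assigns positive probability to zero
p_zero_total > zi
```

A hurdle model would instead attribute all zeros to the separate process, so its hurdle probability would correspond to the total share of zeros rather than to the excess over the count distribution.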

Now, we try to additionally predict the zero-inflation probability by the number
428 changes: 133 additions & 295 deletions doc/brms_distreg.html

Large diffs are not rendered by default.

32 changes: 20 additions & 12 deletions doc/brms_families.Rmd
@@ -94,13 +94,13 @@ $$
with location parameter $\mu \in [0, 1]$ and positive shape parameter $\alpha$.
-->

-## Survival models
+## Time-to-event models

-With survival models we mean all models that are defined on the positive reals
-only, that is $y \in \mathbb{R}^+$. The density of the **lognormal** family is
-given by
+With time-to-event models we mean all models that are defined on the positive
+reals only, that is $y \in \mathbb{R}^+$. The density of the **lognormal**
+family is given by
$$
-f(y) = \frac{1}{\sqrt{2\pi}\sigma x} \exp\left(-\frac{1}{2}\left(\frac{\log(y) - \mu}{\sigma}\right)^2\right)
+f(y) = \frac{1}{\sqrt{2\pi}\sigma y} \exp\left(-\frac{1}{2}\left(\frac{\log(y) - \mu}{\sigma}\right)^2\right)
$$
where $\sigma$ is the residual standard deviation on the log-scale.
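
As a quick sanity check of the corrected formula (with $y$, not $x$, in the denominator), the density can be written out term by term and compared with base R's `dlnorm()`. The helper name `lognormal_density` and the parameter values are illustrative assumptions only.

```r
# Lognormal density written out term by term, following the formula above
# (hypothetical helper, not part of brms)
lognormal_density <- function(y, mu, sigma) {
  1 / (sqrt(2 * pi) * sigma * y) *
    exp(-0.5 * ((log(y) - mu) / sigma)^2)
}

# Arbitrary illustrative values
y <- c(0.5, 1, 2.5)
mu <- 0.3
sigma <- 0.8

# Should agree with the built-in density up to numerical tolerance
all.equal(lognormal_density(y, mu, sigma),
          dlnorm(y, meanlog = mu, sdlog = sigma))
```

The factor $1 / y$ is the Jacobian of the log transformation, which is why the denominator must use the response $y$ itself.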
The density of the **Gamma** family is given by
@@ -122,7 +122,15 @@ is set to $1$ for either the gamma or Weibull distribution. The density of the
$$
f(y) = \left(\frac{\alpha}{2 \pi y^3}\right)^{1/2} \exp \left(\frac{-\alpha (y - \mu)^2}{2 \mu^2 y} \right)
$$
-where $\alpha$ is a positive shape parameter.
+where $\alpha$ is a positive shape parameter. The **cox** family implements Cox
+proportional hazards model which assumes a hazard function of the form $h(y) =
+h_0(y) \mu$ with baseline hazard $h_0(y)$ expressed via M-splines (which
+integrate to I-splines) in order to ensure monotonicity. The density of the cox
+model is then given by
+$$
+f(y) = h(y) S(y)
+$$
+where $S(y)$ is the survival function implied by $h(y)$.
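
A short worked step may make the last formula more concrete. It assumes the standard survival-analysis identities, which the vignette text does not spell out, and introduces the symbol $H_0$ for the cumulative baseline hazard, $H_0(y) = \int_0^y h_0(u) \, du$. Under these assumptions,
$$
S(y) = \exp\left(-\int_0^y h(u) \, du\right) = \exp\left(-\mu H_0(y)\right)
$$
so that
$$
f(y) = h_0(y) \, \mu \, \exp\left(-\mu H_0(y)\right)
$$
Read this way, the M-spline basis represents $h_0(y)$ and its integral, the I-spline basis, represents $H_0(y)$, which is non-decreasing when the spline coefficients are non-negative.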

## Extreme value models

@@ -154,20 +162,20 @@ $$
f(y) = \frac{1}{2 \beta} \exp\left(\frac{1}{2 \beta} \left(2\xi + \sigma^2 / \beta - 2 y \right) \right) \text{erfc}\left(\frac{\xi + \sigma^2 / \beta - y}{\sqrt{2} \sigma} \right)
$$
where $\beta$ is the scale (inverse rate) of the exponential component, $\xi$ is
-the mean of the Gaussian componenent, $\sigma$ is the standard deviation of the
+the mean of the Gaussian component, $\sigma$ is the standard deviation of the
Gaussian component, and $\text{erfc}$ is the complementary error function. We
parameterize $\mu = \xi + \beta$ so that the main predictor term equals the
mean of the distribution.
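
The parameterization $\mu = \xi + \beta$ can be illustrated by simulation, using the fact that an exponentially modified Gaussian variable is the sum of a Gaussian and an independent exponential variable. This is a minimal base R sketch; the parameter values and the seed are arbitrary assumptions.

```r
set.seed(2021)

# Arbitrary illustrative values for the three components
xi <- 0.5     # mean of the Gaussian component
sigma <- 0.1  # standard deviation of the Gaussian component
beta <- 0.3   # scale (inverse rate) of the exponential component

# Draw from the ex-Gaussian as Gaussian + exponential
n <- 1e5
y <- rnorm(n, mean = xi, sd = sigma) + rexp(n, rate = 1 / beta)

# The main parameter mu is the mean of the whole distribution
mu <- xi + beta
c(simulated_mean = mean(y), mu = mu)
```

With this many draws the simulated mean should lie very close to `mu`, which is what makes regression coefficients on $\mu$ interpretable as effects on the mean response time.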

-Another family well suited for modelling response times is the
+Another family well suited for modeling response times is the
**shifted_lognormal** distribution. It's density equals that of the
**lognormal** distribution except that the whole distribution is shifted to the
right by a positive parameter called *ndt* (for consistency with the **wiener**
diffusion model explained below).
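
The shift can be sketched in base R by evaluating the ordinary lognormal density at $y - ndt$. The function name `shifted_lognormal_density` and all parameter values are hypothetical and chosen only for illustration; this is not the brms implementation.

```r
# Density of a lognormal distribution shifted to the right by ndt
# (hypothetical helper, not part of brms)
shifted_lognormal_density <- function(y, mu, sigma, ndt) {
  # dlnorm() returns 0 for non-positive arguments, so values y <= ndt
  # receive zero density
  dlnorm(y - ndt, meanlog = mu, sdlog = sigma)
}

# Arbitrary illustrative values: response times in seconds
mu <- -1
sigma <- 0.5
ndt <- 0.2

shifted_lognormal_density(c(0.1, 0.2, 0.6, 1.2), mu, sigma, ndt)
```

Responses at or below `ndt` have zero density, which is the sense in which `ndt` acts as a non-decision time.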

-A family concerned with the combined modelling of reaction times and
+A family concerned with the combined modeling of reaction times and
corresponding binary responses is the **wiener** diffusion model. It has four
-model parameters each with a natural interpreation. The parameter $\alpha > 0$
+model parameters each with a natural interpretation. The parameter $\alpha > 0$
describes the separation between two boundaries of the diffusion process,
$\tau > 0$ describes the non-decision time (e.g., due to image or motor processing),
$\beta \in [0, 1]$ describes the initial bias in favor of the upper alternative,
@@ -273,7 +281,7 @@ $k$ for a subset of predictors. This leads to category specific
effects (for details on how to specify them see `help(brm)`). Note that
**cumulative** and **sratio** models use $\tau - \eta$, whereas **cratio** and
**acat** use $\eta - \tau$. This is done to ensure that larger values of $\eta$
-increase the probability of *higher* reponse categories.
+increase the probability of *higher* response categories.

The **categorical** family is currently only implemented with the multivariate
logit link function and has density
@@ -294,7 +302,7 @@ function shown above.
## Zero-inflated and hurdle models

**Zero-inflated** and **hurdle** families extend existing families by adding
-special processes for responses that are zero. The densitiy of a
+special processes for responses that are zero. The density of a
**zero-inflated** family is given by
$$
f_z(y) = z + (1 - z) f(0) \quad \text{if } y = 0 \\