Added references to vignettes/articles
melff committed Dec 27, 2023
1 parent 3a02de9 commit 215d695
Showing 6 changed files with 196 additions and 30 deletions.
26 changes: 14 additions & 12 deletions pkg/vignettes/approximations.Rmd
@@ -5,6 +5,7 @@ vignette: >
% \VignetteIndexEntry{Approximate Inference for Multinomial Logit Models with Random Effects}
% \VignetteEngine{knitr::rmarkdown}
%\VignetteEncoding{UTF-8}
bibliography: mclogit.bib
---

# The problem
@@ -111,7 +112,7 @@ Since this quadratic expansion---let us call it
$\ell^*_{\text{Lapl}}(\boldsymbol{y},\boldsymbol{b})$---is a
(multivariate) quadratic function of $\boldsymbol{b}$, the integral of
its exponential does have a closed-form solution (the relevant formula
can be found in `harville:matrix.algebra`).
can be found in @harville:matrix.algebra).
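
For reference, the closed form in question is presumably the standard multivariate Gaussian integral. The symbols $\boldsymbol{A}$ (a symmetric positive definite $q\times q$ matrix), $\boldsymbol{c}$ (a $q$-vector) and $q$ (the length of $\boldsymbol{b}$) are notation assumed here for illustration, not taken from the vignette:

$$
\int_{\mathbb{R}^q}
\exp\left(-\tfrac{1}{2}\boldsymbol{b}'\boldsymbol{A}\boldsymbol{b}+\boldsymbol{c}'\boldsymbol{b}\right)
\,d\boldsymbol{b}
=
(2\pi)^{q/2}\,\det(\boldsymbol{A})^{-1/2}
\exp\left(\tfrac{1}{2}\boldsymbol{c}'\boldsymbol{A}^{-1}\boldsymbol{c}\right)
$$

How $\boldsymbol{A}$ and $\boldsymbol{c}$ map onto the terms of $\ell^*_{\text{Lapl}}$ depends on the expansion given earlier in the vignette.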

For purposes of estimation, the resulting approximate log-likelihood is
more useful:
@@ -139,7 +140,7 @@ $\ell_{\text{cpl}}(\boldsymbol{y},\boldsymbol{b})$ but also
$\ell^*_{\text{Lapl}}$. This motivates the following IWLS/Fisher
scoring equations for $\hat{\boldsymbol{\alpha}}$ and
$\tilde{\boldsymbol{b}}$ (see
`breslow.clayton:approximate.inference.glmm` and [this
@breslow.clayton:approximate.inference.glmm and [this
page](fitting-mclogit.html)):

$$
@@ -187,7 +188,7 @@ $$
\boldsymbol{Z}'\boldsymbol{W}(\boldsymbol{y}^*-\boldsymbol{X}\boldsymbol{\alpha})
$$

which can be solved to compute $hat{\boldsymbol{\alpha}}$ and
which can be solved to compute $\hat{\boldsymbol{\alpha}}$ and
$\tilde{\boldsymbol{b}}$ (for given $\boldsymbol{\Sigma}$).

Here
@@ -204,8 +205,8 @@ $$
\boldsymbol{W}\boldsymbol{Z}'\left(\boldsymbol{Z}'\boldsymbol{W}\boldsymbol{Z}+\boldsymbol{\Sigma}^{-1}\right)^{-1}\boldsymbol{Z}\boldsymbol{W}
$$

Following `breslow.clayton:approximate.inference.glmm` the variance
parameters in $\boldsymbol{Sigma}$ are estimated by minimizing
Following @breslow.clayton:approximate.inference.glmm the variance
parameters in $\boldsymbol{\Sigma}$ are estimated by minimizing

$$
q_1 =
@@ -221,7 +222,7 @@ $$

This motivates the following algorithm, which is strongly inspired by
the `glmmPQL()` function in Brian Ripley's *R* package
[MASS](https://cran.r-project.org/package=MASS):
[MASS](https://cran.r-project.org/package=MASS) [@MASS]:

1. Create some suitable starting values for $\boldsymbol{\pi}$,
$\boldsymbol{W}$, and $\boldsymbol{y}^*$
@@ -242,16 +243,16 @@ algorithm used to fit conditional logit models without random effects.
Instead of just solving a linear equation in step 3, it estimates a
weighted linear mixed-effects model. In contrast to `glmmPQL()` it does
not use the `lme()` function from package
[nlme](https://cran.r-project.org/package=nlme) for this, because the
[nlme](https://cran.r-project.org/package=nlme) [@nlme-book] for this, because the
weighting matrix $\boldsymbol{W}$ is non-diagonal. Instead, $q_1$ or
$q_2$ are minimized using the function `nlminb` from the standard *R*
package "stats".
package "stats" or some other optimizer chosen by the user.

# The Solomon-Cox approximation and MQL

## The Solomon-Cox approximation

The (first-order) Solomon approximation is based on the quadratic
The (first-order) Solomon approximation [@Solomon.Cox:1992] is based on the quadratic
expansion of the integrand

$$
@@ -299,13 +300,13 @@ $$

## Marginal quasi-likelihood (MQL)

The resulting estimation technique is very similar to PQL (again, see
`breslow.clayton:approximate.inference.glmm` for a discussion). The only
The resulting estimation technique is very similar to PQL [again, see
@breslow.clayton:approximate.inference.glmm for a discussion]. The only
difference is the construction of the "working dependent" variable
$\boldsymbol{y}^*$. With PQL it is constructed as
$$\boldsymbol{y}^* =
\boldsymbol{X}\boldsymbol{\alpha} + \boldsymbol{Z}\boldsymbol{b} +
\boldsymbol{W}^{-}(\boldsymbol{y}-\boldsymbol{pi})$$
\boldsymbol{W}^{-}(\boldsymbol{y}-\boldsymbol{\pi})$$
while the MQL working
dependent variable is just

@@ -330,3 +331,4 @@ so that the algorithm has the following steps:
Otherwise go back to step 2 with the updated values of
$\hat{\boldsymbol{\alpha}}$.

# References
13 changes: 7 additions & 6 deletions pkg/vignettes/baseline-logit.Rmd
@@ -5,6 +5,7 @@ vignette: >
% \VignetteIndexEntry{Baseline-category logit models}
% \VignetteEngine{knitr::rmarkdown}
%\VignetteEncoding{UTF-8}
bibliography: mclogit.bib
---


@@ -13,12 +14,11 @@ logistic regression, that allow modelling not only binary or dichotomous
responses, but also polychotomous responses. In addition, they allow
modelling responses in the form of counts that have a pre-determined sum.
These models are described in
`agresti:categorical.data.analysis.2002`{.interpreted-text
role="citet"}. Estimating these models is also supported by the function
`multinom()` in the *R* package \"nnet\" `MASS`{.interpreted-text
role="cite"}. In the package \"mclogit\", the function to estimate these
models is called `mblogit()` (see the relevant [manual
page](reference/mblogit.html)), which uses the infrastructure for estimating
@agresti:categorical.data.analysis.2002.
Estimating these models is also supported by the function
`multinom()` in the *R* package "nnet" [@MASS].
In the package "mclogit", the function to estimate these
models is called `mblogit()`, which uses the infrastructure for estimating
conditional logit models, exploiting the fact that baseline-category
logit models can be re-expressed as conditional logit models.

@@ -61,3 +61,4 @@ taking a value $j$ versus taking the value $1$. Note that there is
one coefficient for each independent variable and *each response* other
than the baseline category.
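
As a rough illustration of this parameterisation, the following Python sketch (not part of the package, which is written in R) computes response probabilities from one coefficient vector per non-baseline category. The function name `baseline_logit_probs` and the example numbers are hypothetical; the baseline is taken to be category $1$, with its coefficients fixed at zero.

```python
import math

def baseline_logit_probs(x, betas):
    """Response probabilities of a baseline-category logit model.

    x     : covariate values for one observation
    betas : one coefficient vector per non-baseline category
            (the baseline category has all coefficients fixed at zero)
    """
    # Linear predictors; the baseline category contributes eta = 0.
    etas = [0.0] + [sum(b * v for b, v in zip(beta, x)) for beta in betas]
    # Subtract the maximum before exponentiating for numerical stability.
    m = max(etas)
    weights = [math.exp(eta - m) for eta in etas]
    total = sum(weights)
    return [w / total for w in weights]

# Hypothetical example: two covariates, three response categories.
x = [1.0, 0.5]
betas = [[0.2, -1.0], [-0.3, 0.8]]
probs = baseline_logit_probs(x, betas)
```

Here `math.log(probs[k] / probs[0])` equals the linear predictor of the `k`-th non-baseline category, that is, the log-odds relative to the baseline.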

# References
13 changes: 7 additions & 6 deletions pkg/vignettes/conditional-logit.Rmd
@@ -5,14 +5,15 @@ vignette: >
% \VignetteIndexEntry{Conditional logit models}
% \VignetteEngine{knitr::rmarkdown}
%\VignetteEncoding{UTF-8}
bibliography: mclogit.bib
---

Conditional logit models are motivated by a variety of considerations,
notably as a way to model binary panel data or responses in
case-control studies. The variant supported by the package "mclogit"
is motivated by the analysis of discrete choices and goes back to
`mcfadden:conditional.logit`{.interpreted-text role="citet"}. Here, a
series of individuals $i=1,ldots,n$ is observed to have made a choice
@mcfadden:conditional.logit. Here, a
series of individuals $i=1,\ldots,n$ is observed to have made a choice
(represented by a number $j$) from a choice set $\mathcal{S}_i$, the
set of alternatives at the individual's disposal. Each alternative
$j$ in the choice set can be described by the values
@@ -46,14 +47,14 @@ $$
Conditional logit models appear more parsimonious than baseline-category
logit models in so far as they have only one coefficient for each
independent variable.[^1] In the "mclogit" package, these models can
be estimated using the function `mclogit()` (see the relevant [manual
page](reference/mclogit.html)).
be estimated using the function `mclogit()`.
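
To make the "one coefficient per independent variable" point concrete, here is a minimal Python sketch (illustrative only; it is not the package's R implementation). It assumes the standard conditional logit form, in which the probability of alternative $j$ is proportional to the exponential of its linear predictor within the choice set. The function name `conditional_logit_probs`, the alternative labels and the numbers are hypothetical.

```python
import math

def conditional_logit_probs(attributes, alpha):
    """Choice probabilities within one individual's choice set.

    attributes : dict mapping each alternative j in the choice set to its
                 attribute vector x_ij
    alpha      : coefficient vector, shared by all alternatives
    """
    # One linear predictor per alternative, all using the same alpha.
    etas = {j: sum(a * v for a, v in zip(alpha, x_ij))
            for j, x_ij in attributes.items()}
    # Subtract the maximum before exponentiating for numerical stability.
    m = max(etas.values())
    weights = {j: math.exp(eta - m) for j, eta in etas.items()}
    total = sum(weights.values())
    return {j: w / total for j, w in weights.items()}

# Two hypothetical individuals with different choice sets, same coefficients.
alpha = [0.5, -1.0]
set_1 = {"A": [1.0, 0.2], "B": [0.0, 0.2], "C": [1.0, 1.2]}
set_2 = {"A": [1.0, 0.2], "B": [0.0, 0.2]}
probs_1 = conditional_logit_probs(set_1, alpha)
probs_2 = conditional_logit_probs(set_2, alpha)
```

Because the coefficients are shared, the log-odds of "A" versus "B" depend only on the difference in their attributes and are the same in both choice sets.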

My interest in conditional logit models derives from my research into
the influence of parties' political positions on the patterns of
voting. Here, the political positions are the attributes of the
alternatives and the choice sets are the sets of parties that run
candidates in a country at various points in time. For the application
of the conditional logit models, see my doctoral thesis
`elff:politische.ideologien`{.interpreted-text role="cite"}.
of the conditional logit models, see
@elff:divisions.positions.voting.

# References
7 changes: 5 additions & 2 deletions pkg/vignettes/fitting-mclogit.Rmd
@@ -5,13 +5,14 @@ vignette: >
% \VignetteIndexEntry{The IWLS algorithm used to fit conditional logit models}
% \VignetteEngine{knitr::rmarkdown}
%\VignetteEncoding{UTF-8}
bibliography: mclogit.bib
---

The package "mclogit" fits conditional logit models using a maximum
likelihood estimator. It does this by maximizing the log-likelihood
function using an *iterative weighted least-squares* (IWLS) algorithm,
which follows the algorithm used by the `glm.fit()` function from the
"stats" package of *R*.
"stats" package of *R* [@nelder.wedderburn:glm;@mccullagh.nelder:glm.2ed;@Rcore].

If $\pi_{ij}$ is the probability that individual $i$ chooses
alternative $j$ from his/her choice set $\mathcal{S}_i$, where
@@ -94,7 +95,7 @@ n_{i+}
$$

Here $y_{ij}=n_{ij}/n_{i+}$, while
$boldsymbol{N}$ is a diagonal matrix with diagonal elements
$\boldsymbol{N}$ is a diagonal matrix with diagonal elements
$n_{i+}$.

Newton-Raphson iterations then take the form
@@ -200,3 +201,5 @@ constructed as follows:
=
\eta_{ij}^{(0)}+\frac{y_{ij}-\pi_{ij}^{(0)}}{\pi_{ij}^{(0)}}
$$
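
The right-hand side of the expression above can be evaluated element-wise. The Python sketch below is only an illustration, not the package's R code: the function name `starting_working_response` is hypothetical, and because this hunk does not show how $\eta_{ij}^{(0)}$ and $\pi_{ij}^{(0)}$ are chosen, they are simply passed in as inputs (uniform values are assumed in the example).

```python
def starting_working_response(eta0, y, pi0):
    """Evaluate eta0 + (y - pi0) / pi0 element-wise for one choice set."""
    return [e + (y_ij - p) / p for e, y_ij, p in zip(eta0, y, pi0)]

# Hypothetical choice set with three alternatives; the first one was chosen.
eta0 = [0.0, 0.0, 0.0]          # assumed starting linear predictors
pi0 = [1 / 3, 1 / 3, 1 / 3]     # assumed starting probabilities
y = [1.0, 0.0, 0.0]             # observed choice proportions
y_star = starting_working_response(eta0, y, pi0)
```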

# References
158 changes: 158 additions & 0 deletions pkg/vignettes/mclogit.bib
@@ -0,0 +1,158 @@

@book{agresti:categorical.data.analysis.2002,
title = {Categorical Data Analysis},
author = {Agresti, Alan},
year = {2002},
edition = {Second},
publisher = {Wiley},
address = {New York},
}



@incollection{mcfadden:conditional.logit,
title = {Conditional Logit Analysis of Qualitative Choice Behaviour},
booktitle = {Frontiers in Econometrics},
author = {McFadden, Daniel},
editor = {Zarembka, Paul},
year = {1974},
pages = {105--142},
publisher = {Academic Press},
address = {New York},
}


@article{breslow.clayton:approximate.inference.glmm,
title = {Approximate Inference in Generalized Linear Mixed Models},
author = {Breslow, Norman E. and Clayton, David G.},
year = {1993},
volume = {88},
pages = {9--25},
journal = {Journal of the American Statistical Association},
number = {421}
}


@article{nelder.wedderburn:glm,
title = {Generalized Linear Models},
author = {Nelder, J. A. and Wedderburn, R. W. M.},
year = {1972},
month = jan,
volume = {135},
pages = {370--384},
issn = {0035-9238},
doi = {10.2307/2344614},
abstract = {The technique of iterative weighted linear regression can be used
to obtain maximum likelihood estimates of the parameters with
observations distributed according to some exponential family
and systematic effects that can be made linear by a suitable
transformation. A generalization of the analysis of variance
is given for these models using log-likelihoods. These
generalized linear models are illustrated by examples relating
to four distributions; the Normal, Binomial (probit analysis,
etc.), Poisson (contingency tables) and gamma (variance
components). The implications of the approach in designing
statistics courses are discussed.},
journal = {Journal of the Royal Statistical Society. Series A (General)},
number = {3}
}


@book{mccullagh.nelder:glm.2ed,
title = {Generalized Linear Models},
author = {McCullagh, P. and Nelder, J.A.},
year = {1989},
publisher = {Chapman \& Hall/CRC},
address = {Boca Raton et al.},
series = {Monographs on Statistics \& Applied Probability}
}



@article{mcfadden.train:mixed.mlogit,
title = {Mixed {{MNL}} Models for Discrete Response},
author = {McFadden, Daniel and Train, Kenneth},
year = {2000},
volume = {15},
pages = {447--470},
journal = {Journal of Applied Econometrics},
number = {5}
}

@book{harville:matrix.algebra,
title = {Matrix Algebra From a Statistician's Perspective},
author = {Harville, David A.},
year = {1997},
publisher = {Springer},
address = {New York},
}

@article{elff:divisions.positions.voting,
author = {Martin Elff},
title = {Social Divisions, Party Positions, and Electoral Behaviour},
journal = {Electoral Studies},
year = {2009},
volume = {28},
number = {2},
pages = {297--308},
doi = {10.1016/j.electstud.2009.02.002}
}

@Manual{Rcore,
title = {R: A Language and Environment for Statistical Computing},
author = {{R Core Team}},
organization = {R Foundation for Statistical Computing},
address = {Vienna, Austria},
year = {2023},
url = {https://www.R-project.org/},
}

@Book{MASS,
title = {Modern Applied Statistics with S},
author = {W. N. Venables and B. D. Ripley},
publisher = {Springer},
edition = {Fourth},
address = {New York},
year = {2002},
note = {ISBN 0-387-95457-0},
url = {https://www.stats.ox.ac.uk/pub/MASS4/},
}

@Book{nlme-book,
title = {Mixed-Effects Models in S and S-PLUS},
author = {José C. Pinheiro and Douglas M. Bates},
year = {2000},
publisher = {Springer},
address = {New York},
doi = {10.1007/b98882},
}


@article{Solomon.Cox:1992,
title = {Nonlinear component of variance models},
volume = {79},
issn = {0006-3444, 1464-3510},
doi = {10.1093/biomet/79.1.1},
number = {1},
journal = {Biometrika},
author = {Solomon, P. J. and Cox, D. R.},
year = {1992},
pages = {1--11},
}





9 changes: 5 additions & 4 deletions pkg/vignettes/random-effects.Rmd
@@ -5,6 +5,7 @@ vignette: >
% \VignetteIndexEntry{Random effects in baseline logit models and conditional logit models}
% \VignetteEngine{knitr::rmarkdown}
%\VignetteEncoding{UTF-8}
bibliography: mclogit.bib
---

The "mclogit" package allows for the presence of random effects in
@@ -22,8 +23,7 @@ motivation for working on conditional logit models with random effects
was to make it possible to assess the impact of parties' political
positions on the patterns of voting behaviour in various European
countries. The results of this research are published in an article in
*Electoral Studies* `elff:divisions.positions.voting`{.interpreted-text
role="cite"}.
@elff:divisions.positions.voting.

In its earliest incarnation, the package supported only a very simple
random-intercept extension of conditional logit models (or "mixed
@@ -69,5 +69,6 @@ PQL-technique based on a (first-order) Laplace approximation was
supported; since release 0.8, "mclogit" also supports the MQL technique,
which is based on a (first-order) Solomon-Cox approximation. The ideas
behind the PQL and MQL techniques are described e.g. in
`breslow.clayton:approximate.inference.glmm`{.interpreted-text
role="citet"}.
@breslow.clayton:approximate.inference.glmm.

# References
