Computational Statistics ("Estatística Computacional")
Course materials for Computational Statistics, a PhD-level course at EMAp.
- We will be using the excellent materials from Professor Patrick Rebeschini (Oxford University) as a general guide for our course.

As complementary material:

- These lecture notes by stellar statistician Professor Susan Holmes are also well worth a look.
- Monte Carlo theory, methods and examples, by Professor Art Owen, gives a nice and complete treatment of all the topics on simulation, including a whole chapter on variance reduction.
Other materials, including lecture notes and slides, may be posted here as the course progresses.
Here you can find a nascent annotated bibliography with landmark papers in the field. This review paper by Professor Hedibert Lopes is far better than anything I could conjure, however.
Books marked with [a] are advanced material.
Main
- Gamerman, D., & Lopes, H. F. (2006). Markov chain Monte Carlo: stochastic simulation for Bayesian inference. Chapman and Hall/CRC.
- Robert, C. P., & Casella, G. (2004). Monte Carlo Statistical Methods. Springer.
Supplementary
- Givens, G. H., & Hoeting, J. A. (2012). Computational Statistics (Vol. 710). John Wiley & Sons.
- [a] Meyn, S. P., & Tweedie, R. L. (2012). Markov chains and stochastic stability. Springer Science & Business Media. [PDF].
- [a] Nummelin, E. (2004). General irreducible Markov chains and non-negative operators (Vol. 83). Cambridge University Press.
An assignment on Gibbs samplers for linear regression with heteroskedasticity under conjugate priors is now available.
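The assignment's exact model is not reproduced here, but as a generic illustration of how a Gibbs sampler alternates between closed-form full conditionals, here is a minimal sketch for a bivariate normal target with correlation `rho` (the function name and target are my own toy choices, not the assignment's model):

```python
import random

def gibbs_bivariate_normal(n, rho, rng=random):
    """Gibbs sampler for (X, Y) standard bivariate normal with correlation rho.
    Full conditionals: X | Y = y ~ N(rho * y, 1 - rho^2), and symmetrically."""
    x, y = 0.0, 0.0
    s = (1.0 - rho * rho) ** 0.5  # conditional standard deviation
    out = []
    for _ in range(n):
        x = rng.gauss(rho * y, s)  # update X given the current Y
        y = rng.gauss(rho * x, s)  # update Y given the fresh X
        out.append((x, y))
    return out
```

Each sweep updates one coordinate at a time from its full conditional; the pair `(x, y)` converges in distribution to the bivariate normal target.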
- Random Number Generation by Pierre L'Ecuyer;
- Non-Uniform Random Variate Generation by the great Luc Devroye;
- Walker's Alias method is a fast way to generate discrete random variables;
- Rejection Control and Sequential Importance Sampling (1998), by Liu et al., discusses how to improve importance sampling by controlling rejections.
- This is a nice general comment about the role of simulation in numerical integration.
- These notes from David Levin and Yuval Peres are excellent and cover a lot of material one might find interesting on Markov processes.
- Charlie Geyer's website is a treasure trove of material on Statistics in general and MCMC methods in particular. See, for instance, On the Bogosity of MCMC Diagnostics.
- Efficient construction of reversible jump Markov chain Monte Carlo proposal distributions is a nice paper on the construction of efficient proposals for reversible jump/transdimensional MCMC.
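To make Walker's alias method concrete, here is a minimal Python sketch of the usual O(n) table construction (Vose's variant) followed by the O(1) draw; function names and the tolerance handling are my own choices, not taken from any of the references above:

```python
import random

def build_alias(probs):
    """Vose's O(n) construction of the probability and alias tables."""
    n = len(probs)
    scaled = [p * n for p in probs]
    prob, alias = [0.0] * n, [0] * n
    small = [i for i, s in enumerate(scaled) if s < 1.0]
    large = [i for i, s in enumerate(scaled) if s >= 1.0]
    while small and large:
        s, l = small.pop(), large.pop()
        prob[s], alias[s] = scaled[s], l
        scaled[l] -= (1.0 - scaled[s])      # donate mass to the small cell
        (small if scaled[l] < 1.0 else large).append(l)
    for i in large + small:                  # numerical leftovers get prob 1
        prob[i] = 1.0
    return prob, alias

def alias_draw(prob, alias, rng=random):
    """O(1) draw: pick a column uniformly, then flip its biased coin."""
    i = rng.randrange(len(prob))
    return i if rng.random() < prob[i] else alias[i]
```

After the one-off table construction, every draw costs one uniform index, one uniform coin flip, and one comparison, regardless of the number of categories.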
The two definitive texts on HMC are Neal (2011) and Betancourt (2017). A nice set of notes is Vishnoi (2021). Moreover, Hoffman & Gelman (2014) describe the No-U-Turn sampler.
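To make the leapfrog mechanics concrete, here is a deliberately minimal one-dimensional HMC step (a toy sketch, not Neal's or Betancourt's reference implementation); the step size and number of leapfrog steps are arbitrary tuning choices:

```python
import math, random

def hmc_step(x, log_prob, grad_log_prob, step=0.2, n_leap=10, rng=random):
    """One HMC update for a 1-D target with unit-mass kinetic energy."""
    p = rng.gauss(0.0, 1.0)                      # resample momentum
    x_new, p_new = x, p
    p_new += 0.5 * step * grad_log_prob(x_new)   # leapfrog: half momentum step
    for _ in range(n_leap - 1):
        x_new += step * p_new                    # full position step
        p_new += step * grad_log_prob(x_new)     # full momentum step
    x_new += step * p_new
    p_new += 0.5 * step * grad_log_prob(x_new)   # final half momentum step
    # Metropolis correction using the Hamiltonian H = -log pi(x) + p^2 / 2
    h_cur = -log_prob(x) + 0.5 * p * p
    h_new = -log_prob(x_new) + 0.5 * p_new * p_new
    if rng.random() < math.exp(min(0.0, h_cur - h_new)):
        return x_new
    return x
```

For a standard normal target (`log_prob = lambda x: -0.5 * x * x`, `grad_log_prob = lambda x: -x`) the leapfrog error is tiny and nearly every proposal is accepted.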
This post by Radford Neal explains why the Harmonic Mean Estimator (HME) is a terrible estimator of the evidence.
- This book by Nicolas Chopin and Omiros Papaspiliopoulos is a great introduction (as it says in the title) to SMC. SMC finds application in many areas, but dynamic (linear) models deserve a special mention. The seminal 1997 book by West and Harrison remains the de facto text on the subject.
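As a taste of SMC, here is a minimal bootstrap particle filter for a toy linear Gaussian state-space model; the model, particle count and function name are illustrative assumptions of mine, not taken from Chopin & Papaspiliopoulos:

```python
import math, random

def bootstrap_filter(ys, n_part=1000, rng=random):
    """Bootstrap particle filter for the toy model
        x_t = 0.9 * x_{t-1} + N(0, 1),   y_t = x_t + N(0, 1),
    returning the filtering means E[x_t | y_{1:t}]."""
    parts = [rng.gauss(0.0, 1.0) for _ in range(n_part)]
    means = []
    for y in ys:
        # propagate each particle through the state equation
        parts = [0.9 * x + rng.gauss(0.0, 1.0) for x in parts]
        # weight by the Gaussian observation density (up to a constant)
        w = [math.exp(-0.5 * (y - x) ** 2) for x in parts]
        tot = sum(w)
        means.append(sum(x * wi for x, wi in zip(parts, w)) / tot)
        # multinomial resampling to fight weight degeneracy
        parts = rng.choices(parts, weights=w, k=n_part)
    return means
```

For this linear Gaussian model the exact answer is given by the Kalman filter, which makes the toy a convenient correctness check.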
- This elementary tutorial is simple but effective.
- The book The EM algorithm and Extensions is a well-cited resource.
- Monte Carlo EM by Bob Carpenter (Columbia).
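A minimal worked example often helps here: EM for an equal-weight mixture of two unit-variance Gaussians, estimating only the two means. This stripped-down setting (and all names in it) is my own choice for illustration, not from the books above:

```python
import math

def em_two_gaussians(data, mu1, mu2, n_iter=50):
    """EM for 0.5 * N(mu1, 1) + 0.5 * N(mu2, 1), updating only the means."""
    for _ in range(n_iter):
        # E-step: responsibility of component 1 for each point
        r = []
        for x in data:
            a = math.exp(-0.5 * (x - mu1) ** 2)
            b = math.exp(-0.5 * (x - mu2) ** 2)
            r.append(a / (a + b))
        # M-step: responsibility-weighted means
        s1 = sum(r)
        s2 = len(data) - s1
        mu1 = sum(ri * x for ri, x in zip(r, data)) / s1
        mu2 = sum((1 - ri) * x for ri, x in zip(r, data)) / s2
    return mu1, mu2
```

Each iteration provably does not decrease the observed-data likelihood, which is the defining property of EM.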
- The original 1983 Science paper (open link) by Kirkpatrick et al. is a great read.
- These visualisations of the traveling salesman problem might prove useful.
- These notes have a little bit of theory on the cooling scheme.
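A minimal simulated annealing sketch, using a geometric cooling schedule and Gaussian proposals (one common, simple choice among many discussed in the notes above; all names and tuning constants here are mine):

```python
import math, random

def simulated_annealing(f, x0, steps=10000, t0=1.0, cooling=0.999, rng=random):
    """Minimise f over the reals by simulated annealing."""
    x, fx = x0, f(x0)
    best, fbest = x, fx
    t = t0
    for _ in range(steps):
        y = x + rng.gauss(0.0, 1.0)          # random-walk proposal
        fy = f(y)
        # accept downhill moves always, uphill moves with Boltzmann probability
        if fy <= fx or rng.random() < math.exp((fx - fy) / t):
            x, fx = y, fy
            if fx < fbest:
                best, fbest = x, fx
        t *= cooling                          # geometric cooling
    return best, fbest
```

The cooling rate trades exploration against convergence: cool too fast and the chain freezes in a local minimum, too slowly and it wanders.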
- Efron (1979) is a great resource and a seminal paper.
- A good introductory book is An introduction to the bootstrap by Efron and Tibshirani (1993). PDF.
- The technical justification of the bootstrap relies on the Glivenko-Cantelli theorem. The proof given in class is taken from here.
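The basic percentile bootstrap fits in a few lines; this sketch (names and defaults are my own) computes a confidence interval for any plug-in statistic:

```python
import random

def bootstrap_ci(data, stat, n_boot=2000, alpha=0.05, rng=random):
    """Percentile bootstrap (1 - alpha) confidence interval for stat(data)."""
    n = len(data)
    reps = []
    for _ in range(n_boot):
        # resample n points with replacement from the empirical distribution
        resample = [data[rng.randrange(n)] for _ in range(n)]
        reps.append(stat(resample))
    reps.sort()
    lo = reps[int((alpha / 2) * n_boot)]
    hi = reps[int((1 - alpha / 2) * n_boot) - 1]
    return lo, hi
```

Resampling from the empirical distribution is exactly where the Glivenko-Cantelli theorem enters: the empirical CDF converges uniformly to the true CDF, so the bootstrap world mimics the sampling world.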
- In these notes, Terence Tao gives insights into concentration of measure, which is the reason why integrating with respect to a probability measure in high-dimensional spaces is hard.
- A Primer for the Monte Carlo Method, by the great Ilya Sobol, is one of the first texts on the Monte Carlo method.
- The Harris inequality, E[fg] >= E[f]E[g] for f and g increasing, is a special case of the FKG inequality.
- In Markov Chain Monte Carlo Maximum Likelihood, Charlie Geyer shows how one can use MCMC to do maximum likelihood estimation when the likelihood cannot be written in closed form. This paper is an example of MCMC methods being used outside of Bayesian statistics.
- This paper discusses the solution of Problem A in assignment 0 (2021).
Sometimes a clever way to make a target distribution easier to compute expectations with respect to is to reparametrise it. Here are some resources:
- A YouTube video introducing the concepts with a simple example;
- Hamiltonian Monte Carlo for Hierarchical Models from M. J. Betancourt and Mark Girolami;
- A General Framework for the Parametrization of Hierarchical Models from Omiros Papaspiliopoulos, Gareth O. Roberts, and Martin Sköld;
- Efficient parametrisations for normal linear mixed models from Alan E. Gelfand, Sujit K. Sahu and Bradley P. Carlin.
See #4. Contributed by @lucasmoschen.
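The canonical example of the idea is the centered versus non-centered parametrisation of a Gaussian. Both sketches below draw from the same N(mu, tau^2) distribution, but the non-centered form moves the dependence on (mu, tau) out of the random draw, which is often easier for samplers in hierarchical models (all names here are my own illustration):

```python
import random

def centered_draw(mu, tau, rng=random):
    """Centered parametrisation: sample theta ~ N(mu, tau^2) directly."""
    return rng.gauss(mu, tau)

def noncentered_draw(mu, tau, rng=random):
    """Non-centered parametrisation: theta = mu + tau * z with z ~ N(0, 1)."""
    z = rng.gauss(0.0, 1.0)
    return mu + tau * z
```

In a hierarchical model, sampling z instead of theta decorrelates the latent variable from its hyperparameters, which is exactly the trick the papers above study in generality.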
- Rao-Blackwellisation is a popular technique for obtaining estimators with lower variance. I recommend the recent International Statistical Review article by Christian Robert and Gareth Roberts on the topic.
- A Visualisation of MCMC for various algorithms and targets.
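A tiny numerical illustration of Rao-Blackwellisation (the model and names are my own toy choices, not from the Robert & Roberts article): to estimate E[X] where lambda ~ Exp(1) and X | lambda ~ Poisson(lambda), averaging the conditional expectations E[X | lambda] = lambda beats averaging the X draws themselves:

```python
import math, random

def poisson_draw(lam, rng=random):
    """Knuth's simple Poisson sampler (fine for small lam)."""
    L = math.exp(-lam)
    k, p = 0, 1.0
    while p > L:
        k += 1
        p *= rng.random()
    return k - 1

def naive_vs_rao_blackwell(n, rng=random):
    """Return paired samples for the naive and Rao-Blackwellised estimators
    of E[X] = E[lambda] = 1 in the Exp(1)-Poisson toy model."""
    naive, rb = [], []
    for _ in range(n):
        lam = rng.expovariate(1.0)
        naive.append(poisson_draw(lam, rng))  # per-sample variance 2
        rb.append(lam)                        # per-sample variance 1
    return naive, rb
```

Both estimators are unbiased; conditioning removes the Poisson layer of noise, halving the variance here, and the law of total variance guarantees the Rao-Blackwellised version never does worse.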
In these blogs and websites you will often find interesting discussions on computational, numerical and statistical aspects of applied Statistics and Mathematics.
- Christian Robert's blog;
- John Cook's website;
- Statisfaction blog.