Computational Statistics ("Estatística Computacional")
Course materials for Computational Statistics, a PhD-level course at EMAp.
- We will be using the excellent materials from Professor Patrick Rebeschini (Oxford University) as a general guide for our course.

As complementary material:

- These lecture notes by stellar statistician Professor Susan Holmes are also well worth a look.
- Monte Carlo theory, methods and examples, by Professor Art Owen, gives a nice and complete treatment of all the topics on simulation, including a whole chapter on variance reduction.
Other materials, including lecture notes and slides, may be posted here as the course progresses.
Here you can find a nascent annotated bibliography with landmark papers in the field. This review paper by Professor Hedibert Lopes is far better than anything I could conjure, however.
Books marked with [a] are advanced material.
Main
- Gamerman, D., & Lopes, H. F. (2006). Markov chain Monte Carlo: stochastic simulation for Bayesian inference. Chapman and Hall/CRC.
- Robert, C. P., & Casella, G. (2004). Monte Carlo Statistical Methods. Springer.
Supplementary
- Givens, G. H., & Hoeting, J. A. (2012). Computational Statistics (Vol. 710). John Wiley & Sons.
- [a] Meyn, S. P., & Tweedie, R. L. (2012). Markov chains and stochastic stability. Springer Science & Business Media. [PDF].
- [a] Nummelin, E. (2004). General irreducible Markov chains and non-negative operators (Vol. 83). Cambridge University Press.
An assignment on Gibbs samplers for linear regression with heteroskedasticity under conjugate priors is now available.
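The assignment's exact model is not reproduced here, but as a generic illustration of how a Gibbs sampler alternates between closed-form full conditionals, here is a minimal sketch for a bivariate normal target with correlation `rho` (the function name and target are my own toy choices, not the assignment's model):

```python
import random

def gibbs_bivariate_normal(n, rho, rng=random):
    """Gibbs sampler for (X, Y) standard bivariate normal with correlation rho.
    Full conditionals: X | Y = y ~ N(rho * y, 1 - rho^2), and symmetrically."""
    x, y = 0.0, 0.0
    s = (1.0 - rho * rho) ** 0.5  # conditional standard deviation
    out = []
    for _ in range(n):
        x = rng.gauss(rho * y, s)  # update X given the current Y
        y = rng.gauss(rho * x, s)  # update Y given the fresh X
        out.append((x, y))
    return out
```

Each sweep updates one coordinate at a time from its full conditional; the pair `(x, y)` converges in distribution to the bivariate normal target.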
- Random Number Generation by Pierre L'Ecuyer;
- Non-Uniform Random Variate Generation by the great Luc Devroye;
- Walker's Alias method is a fast way to generate discrete random variables;
- Rejection Control and Sequential Importance Sampling (1998), by Liu et al., discusses how to improve importance sampling by controlling rejections.
- This is a nice general comment about the role of simulation in numerical integration.
- These notes from David Levin and Yuval Peres are excellent and cover a lot of material one might find interesting on Markov processes.
- Charlie Geyer's website is a treasure trove of material on Statistics in general and MCMC methods in particular. See, for instance, On the Bogosity of MCMC Diagnostics.
- Efficient construction of reversible jump Markov chain Monte Carlo proposal distributions is a nice paper on the construction of efficient proposals for reversible jump/transdimensional MCMC.
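To make Walker's alias method concrete, here is a minimal Python sketch of the usual O(n) table construction (Vose's variant) followed by the O(1) draw; function names and the tolerance handling are my own choices, not taken from any of the references above:

```python
import random

def build_alias(probs):
    """Vose's O(n) construction of the probability and alias tables."""
    n = len(probs)
    scaled = [p * n for p in probs]
    prob, alias = [0.0] * n, [0] * n
    small = [i for i, s in enumerate(scaled) if s < 1.0]
    large = [i for i, s in enumerate(scaled) if s >= 1.0]
    while small and large:
        s, l = small.pop(), large.pop()
        prob[s], alias[s] = scaled[s], l
        scaled[l] -= (1.0 - scaled[s])      # donate mass to the small cell
        (small if scaled[l] < 1.0 else large).append(l)
    for i in large + small:                  # numerical leftovers get prob 1
        prob[i] = 1.0
    return prob, alias

def alias_draw(prob, alias, rng=random):
    """O(1) draw: pick a column uniformly, then flip its biased coin."""
    i = rng.randrange(len(prob))
    return i if rng.random() < prob[i] else alias[i]
```

After the one-off table construction, every draw costs one uniform index, one uniform coin flip, and one comparison, regardless of the number of categories.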
The two definitive texts on HMC are Neal (2011) and Betancourt (2017). A nice set of notes is Vishnoi (2021). Moreover, Hoffman & Gelman (2014) describe the No-U-Turn sampler.
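To make the leapfrog mechanics concrete, here is a deliberately minimal one-dimensional HMC step (a toy sketch, not Neal's or Betancourt's reference implementation); the step size and number of leapfrog steps are arbitrary tuning choices:

```python
import math, random

def hmc_step(x, log_prob, grad_log_prob, step=0.2, n_leap=10, rng=random):
    """One HMC update for a 1-D target with unit-mass kinetic energy."""
    p = rng.gauss(0.0, 1.0)                      # resample momentum
    x_new, p_new = x, p
    p_new += 0.5 * step * grad_log_prob(x_new)   # leapfrog: half momentum step
    for _ in range(n_leap - 1):
        x_new += step * p_new                    # full position step
        p_new += step * grad_log_prob(x_new)     # full momentum step
    x_new += step * p_new
    p_new += 0.5 * step * grad_log_prob(x_new)   # final half momentum step
    # Metropolis correction using the Hamiltonian H = -log pi(x) + p^2 / 2
    h_cur = -log_prob(x) + 0.5 * p * p
    h_new = -log_prob(x_new) + 0.5 * p_new * p_new
    if rng.random() < math.exp(min(0.0, h_cur - h_new)):
        return x_new
    return x
```

For a standard normal target (`log_prob = lambda x: -0.5 * x * x`, `grad_log_prob = lambda x: -x`) the leapfrog error is tiny and nearly every proposal is accepted.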
This post by Radford Neal explains why the Harmonic Mean Estimator (HME) is a terrible estimator of the evidence.
- This book by Nicolas Chopin and Omiros Papaspiliopoulos is a great introduction (as it says in the title) to SMC. SMC finds application in many areas, but dynamic (linear) models deserve a special mention. The seminal 1997 book by West and Harrison remains the de facto text on the subject.
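As a taste of SMC, here is a minimal bootstrap particle filter for a toy linear Gaussian state-space model; the model, particle count and function name are illustrative assumptions of mine, not taken from Chopin & Papaspiliopoulos:

```python
import math, random

def bootstrap_filter(ys, n_part=1000, rng=random):
    """Bootstrap particle filter for the toy model
        x_t = 0.9 * x_{t-1} + N(0, 1),   y_t = x_t + N(0, 1),
    returning the filtering means E[x_t | y_{1:t}]."""
    parts = [rng.gauss(0.0, 1.0) for _ in range(n_part)]
    means = []
    for y in ys:
        # propagate each particle through the state equation
        parts = [0.9 * x + rng.gauss(0.0, 1.0) for x in parts]
        # weight by the Gaussian observation density (up to a constant)
        w = [math.exp(-0.5 * (y - x) ** 2) for x in parts]
        tot = sum(w)
        means.append(sum(x * wi for x, wi in zip(parts, w)) / tot)
        # multinomial resampling to fight weight degeneracy
        parts = rng.choices(parts, weights=w, k=n_part)
    return means
```

For this linear Gaussian model the exact answer is given by the Kalman filter, which makes the toy a convenient correctness check.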
- This elementary tutorial is simple but effective.
- The book The EM algorithm and Extensions is a well-cited resource.
- Monte Carlo EM by Bob Carpenter (Columbia).
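A minimal worked example often helps here: EM for an equal-weight mixture of two unit-variance Gaussians, estimating only the two means. This stripped-down setting (and all names in it) is my own choice for illustration, not from the books above:

```python
import math

def em_two_gaussians(data, mu1, mu2, n_iter=50):
    """EM for 0.5 * N(mu1, 1) + 0.5 * N(mu2, 1), updating only the means."""
    for _ in range(n_iter):
        # E-step: responsibility of component 1 for each point
        r = []
        for x in data:
            a = math.exp(-0.5 * (x - mu1) ** 2)
            b = math.exp(-0.5 * (x - mu2) ** 2)
            r.append(a / (a + b))
        # M-step: responsibility-weighted means
        s1 = sum(r)
        s2 = len(data) - s1
        mu1 = sum(ri * x for ri, x in zip(r, data)) / s1
        mu2 = sum((1 - ri) * x for ri, x in zip(r, data)) / s2
    return mu1, mu2
```

Each iteration provably does not decrease the observed-data likelihood, which is the defining property of EM.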
- The original 1983 Science paper (open link) by Kirkpatrick et al. is a great read.
- These visualisations of the traveling salesman problem might prove useful.
- These notes have a little bit of theory on the cooling scheme.
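A minimal simulated annealing sketch, using a geometric cooling schedule and Gaussian proposals (one common, simple choice among many discussed in the notes above; all names and tuning constants here are mine):

```python
import math, random

def simulated_annealing(f, x0, steps=10000, t0=1.0, cooling=0.999, rng=random):
    """Minimise f over the reals by simulated annealing."""
    x, fx = x0, f(x0)
    best, fbest = x, fx
    t = t0
    for _ in range(steps):
        y = x + rng.gauss(0.0, 1.0)          # random-walk proposal
        fy = f(y)
        # accept downhill moves always, uphill moves with Boltzmann probability
        if fy <= fx or rng.random() < math.exp((fx - fy) / t):
            x, fx = y, fy
            if fx < fbest:
                best, fbest = x, fx
        t *= cooling                          # geometric cooling
    return best, fbest
```

The cooling rate trades exploration against convergence: cool too fast and the chain freezes in a local minimum, too slowly and it wanders.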
- Efron (1979) is a great resource and a seminal paper.
- A good introductory book is An introduction to the bootstrap by Efron and Tibshirani (1993). PDF.
- The technical justification of the bootstrap relies on the Glivenko-Cantelli theorem. The proof given in class is taken from here.
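The basic percentile bootstrap fits in a few lines; this sketch (names and defaults are my own) computes a confidence interval for any plug-in statistic:

```python
import random

def bootstrap_ci(data, stat, n_boot=2000, alpha=0.05, rng=random):
    """Percentile bootstrap (1 - alpha) confidence interval for stat(data)."""
    n = len(data)
    reps = []
    for _ in range(n_boot):
        # resample n points with replacement from the empirical distribution
        resample = [data[rng.randrange(n)] for _ in range(n)]
        reps.append(stat(resample))
    reps.sort()
    lo = reps[int((alpha / 2) * n_boot)]
    hi = reps[int((1 - alpha / 2) * n_boot) - 1]
    return lo, hi
```

Resampling from the empirical distribution is exactly where the Glivenko-Cantelli theorem enters: the empirical CDF converges uniformly to the true CDF, so the bootstrap world mimics the sampling world.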
- In these notes, Terence Tao gives insights into concentration of measure, which is the reason why integrating with respect to a probability measure in high-dimensional spaces is hard.
- A Primer for the Monte Carlo Method, by the great Ilya Sobol, is one of the first texts on the Monte Carlo method.
- The Harris inequality, E[fg] >= E[f]E[g] for f and g increasing, is a special case of the FKG inequality.
- In Markov Chain Monte Carlo Maximum Likelihood, Charlie Geyer shows how one can use MCMC to do maximum likelihood estimation when the likelihood cannot be written in closed form. This paper is an example of MCMC methods being used outside of Bayesian statistics.
- This paper discusses the solution of Problem A in assignment 0 (2021).
Sometimes a clever way to make a target distribution easier to compute expectations with respect to is to reparametrise it. Here are some resources:
- A YouTube video introducing the concepts with a simple example;
- Hamiltonian Monte Carlo for Hierarchical Models from M. J. Betancourt and Mark Girolami;
- A General Framework for the Parametrization of Hierarchical Models from Omiros Papaspiliopoulos, Gareth O. Roberts, and Martin Sköld;
- Efficient parametrisations for normal linear mixed models from Alan E. Gelfand, Sujit K. Sahu and Bradley P. Carlin.
See #4. Contributed by @lucasmoschen.
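The canonical example of the idea is the centered versus non-centered parametrisation of a Gaussian. Both sketches below draw from the same N(mu, tau^2) distribution, but the non-centered form moves the dependence on (mu, tau) out of the random draw, which is often easier for samplers in hierarchical models (all names here are my own illustration):

```python
import random

def centered_draw(mu, tau, rng=random):
    """Centered parametrisation: sample theta ~ N(mu, tau^2) directly."""
    return rng.gauss(mu, tau)

def noncentered_draw(mu, tau, rng=random):
    """Non-centered parametrisation: theta = mu + tau * z with z ~ N(0, 1)."""
    z = rng.gauss(0.0, 1.0)
    return mu + tau * z
```

In a hierarchical model, sampling z instead of theta decorrelates the latent variable from its hyperparameters, which is exactly the trick the papers above study in generality.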
- Rao-Blackwellisation is a popular technique for obtaining estimators with lower variance. I recommend the recent International Statistical Review article by Christian Robert and Gareth Roberts on the topic.
- A Visualisation of MCMC for various algorithms and targets.
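A tiny numerical illustration of Rao-Blackwellisation (the model and names are my own toy choices, not from the Robert & Roberts article): to estimate E[X] where lambda ~ Exp(1) and X | lambda ~ Poisson(lambda), averaging the conditional expectations E[X | lambda] = lambda beats averaging the X draws themselves:

```python
import math, random

def poisson_draw(lam, rng=random):
    """Knuth's simple Poisson sampler (fine for small lam)."""
    L = math.exp(-lam)
    k, p = 0, 1.0
    while p > L:
        k += 1
        p *= rng.random()
    return k - 1

def naive_vs_rao_blackwell(n, rng=random):
    """Return paired samples for the naive and Rao-Blackwellised estimators
    of E[X] = E[lambda] = 1 in the Exp(1)-Poisson toy model."""
    naive, rb = [], []
    for _ in range(n):
        lam = rng.expovariate(1.0)
        naive.append(poisson_draw(lam, rng))  # per-sample variance 2
        rb.append(lam)                        # per-sample variance 1
    return naive, rb
```

Both estimators are unbiased; conditioning removes the Poisson layer of noise, halving the variance here, and the law of total variance guarantees the Rao-Blackwellised version never does worse.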
In these blogs and websites you will often find interesting discussions on computational, numerical and statistical aspects of applied Statistics and Mathematics.
- Christian Robert's blog;
- John Cook's website;
- Statisfaction blog.