This workshop covers the foundational elements of modern causal inference, such as the potential outcomes model, and discusses in detail the most common designs: regression discontinuity, instrumental variables, difference-in-differences, comparative case studies using synthetic control and, time permitting, matching. It also introduces students to basic programming practices and to good research practices more generally.
Hidden Curriculum
About
While causal inference is a design- and model-based approach to estimating causal effects, it ultimately relies on large data sources, computers and programming languages to do that estimation. Thus, while you can teach causal inference separately from empirical workflow, you shouldn't. Here I discuss my own beliefs about empirical workflow, covering such things as missingness in data, directory hierarchy, version control and more.
Slides
Potential Outcomes
About
The modern theory of causality is based on a seemingly simple idea called the "counterfactual". The counterfactual is an unusual feature of the arsenal of modern statistics because it is more or less storytelling about alternative worlds that may or may not exist, but could have existed had one single decision gone a different way. Out of this idea grew a model, complete with its own language, on which the field of causal inference is based, and the purpose of this lecture is to learn that language. The language is called potential outcomes, and it forms the basis for many of the causal objects we tend to be interested in, such as the average treatment effect. I also cover randomization, selection bias and randomization inference.
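As a minimal sketch of these ideas, the following Python simulation assumes a hypothetical data-generating process (the distributions, sample size and selection rule are illustrative assumptions, not taken from the workshop materials). Because it is a simulation, both potential outcomes are visible for every unit, which is never true in real data. It compares the simple difference in mean outcomes under self-selection with the same comparison under random assignment.

```python
import random
import statistics

random.seed(0)
n = 10_000

# Hypothetical potential outcomes: y0 without treatment, y1 with treatment.
y0 = [random.gauss(0, 1) for _ in range(n)]
effect = [random.gauss(1, 1) for _ in range(n)]   # unit-level treatment effects
y1 = [a + b for a, b in zip(y0, effect)]

# Average treatment effect, computable only because both outcomes are simulated.
ate = statistics.mean(effect)


def simple_difference(d):
    """Mean observed outcome of the treated minus that of the untreated."""
    treated = [y1[i] for i in range(n) if d[i] == 1]
    control = [y0[i] for i in range(n) if d[i] == 0]
    return statistics.mean(treated) - statistics.mean(control)


# Assumed selection rule: units with a high untreated outcome choose treatment.
d_selected = [1 if y > 0 else 0 for y in y0]
# Random assignment: a fair coin flip for each unit.
d_random = [random.randint(0, 1) for _ in range(n)]

sdo_selected = simple_difference(d_selected)
sdo_random = simple_difference(d_random)

# Decompose the self-selected comparison into ATT plus selection bias.
att = statistics.mean([effect[i] for i in range(n) if d_selected[i] == 1])
selection_bias = (
    statistics.mean([y0[i] for i in range(n) if d_selected[i] == 1])
    - statistics.mean([y0[i] for i in range(n) if d_selected[i] == 0])
)

print(f"ATE:                        {ate:.3f}")
print(f"Simple difference (select): {sdo_selected:.3f}")
print(f"  = ATT {att:.3f} + selection bias {selection_bias:.3f}")
print(f"Simple difference (random): {sdo_random:.3f}")
```

Under self-selection the simple difference equals the ATT plus the selection bias term, so it can sit far from the average treatment effect. Under random assignment the selection bias term is zero in expectation, and the simple difference lands close to the ATE.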
Slides
Code
- Stata: ri.do, tea.do, thornton_ri.do
- R: Potential outcomes
- python: Potential outcomes
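Randomization inference, mentioned in the About text for this section, can also be sketched in a few lines. The example below is not one of the listed files; it uses hypothetical outcomes for eight units and tests Fisher's sharp null of no effect for any unit by enumerating every possible assignment of four units to treatment.

```python
import itertools
import statistics

# Hypothetical observed outcomes for 8 units; units 0-3 were actually treated.
outcomes = [12.0, 15.0, 11.0, 14.0, 9.0, 10.0, 8.0, 11.0]
treated = (0, 1, 2, 3)


def diff_in_means(treated_idx):
    """Difference in mean outcomes between treated and untreated units."""
    t = [outcomes[i] for i in treated_idx]
    c = [outcomes[i] for i in range(len(outcomes)) if i not in treated_idx]
    return statistics.mean(t) - statistics.mean(c)


observed = diff_in_means(treated)

# Under the sharp null, each unit's outcome is the same whatever its assignment,
# so the test statistic can be recomputed for every possible assignment.
null_stats = [
    diff_in_means(assignment)
    for assignment in itertools.combinations(range(len(outcomes)), len(treated))
]

# Two-sided exact p-value: share of assignments at least as extreme as observed.
p_value = sum(abs(s) >= abs(observed) for s in null_stats) / len(null_stats)

print(f"Observed difference in means: {observed}")
print(f"Assignments enumerated: {len(null_stats)}")
print(f"Exact p-value: {p_value:.3f}")
```

With larger samples, full enumeration becomes infeasible, and a common alternative is to draw a large random sample of assignments instead.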
Readings
Mixtape chapter 4: Potential Outcomes Causal Model
Software: DAGitty
Directed Acyclic Graphs
About
Model-based approaches to identification can sometimes be seen more clearly using causal graphs called directed acyclic graphs (DAGs). These modeling approaches are compatible with the design-based approach, but tend to emphasize a priori domain knowledge rather than treatment manipulation alone. Here we discuss the backdoor criterion, the frontdoor criterion, and collider bias.
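Collider bias can be illustrated with a small simulation. The sketch below assumes a hypothetical setup in which two traits are independent in the population and selection into the sample depends on their sum, making selection a collider. The variable names and the 85th percentile cutoff are illustrative assumptions.

```python
import random

random.seed(1)
n = 5_000

# Two traits that are independent by construction.
trait_a = [random.gauss(0, 1) for _ in range(n)]
trait_b = [random.gauss(0, 1) for _ in range(n)]

# Selection is caused by both traits: trait_a -> selected <- trait_b.
score = [a + b for a, b in zip(trait_a, trait_b)]
cutoff = sorted(score)[int(0.85 * n)]          # assumed 85th percentile threshold
selected = [s > cutoff for s in score]


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length lists."""
    mx = sum(xs) / len(xs)
    my = sum(ys) / len(ys)
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    var_x = sum((x - mx) ** 2 for x in xs)
    var_y = sum((y - my) ** 2 for y in ys)
    return cov / (var_x * var_y) ** 0.5


corr_all = pearson(trait_a, trait_b)
corr_selected = pearson(
    [a for a, s in zip(trait_a, selected) if s],
    [b for b, s in zip(trait_b, selected) if s],
)

print(f"Correlation in the full population: {corr_all:.3f}")
print(f"Correlation among the selected:     {corr_selected:.3f}")
```

Conditioning on the collider, here by keeping only the selected units, induces a negative correlation between two traits that are unrelated in the full population.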
Slides
Code
- Stata: moviestar.do, collider_discrimination.do
- R: DAGs
- python: DAGs
Readings
Mixtape chapter 3: Directed Acyclic Graphs
Sharp Regression Discontinuity
About
One of the most desired quasi-experimental designs -- desired because it is viewed as highly credible despite being based on observational data -- is the regression discontinuity design. Here I will discuss the sharp RDD in great detail, going through identification, estimation, specification tests and tips, as well as a replication.
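To make the estimation step concrete, here is a minimal sketch of a sharp RDD on simulated data. The data-generating process, the cutoff of 50, the true jump of 10 and the bandwidth of 10 are all hypothetical assumptions chosen for illustration. The sketch fits a separate line on each side of the cutoff within the bandwidth and takes the difference of the two fitted values at the cutoff.

```python
import random

random.seed(2)
n = 20_000
cutoff = 50.0
true_jump = 10.0      # hypothetical treatment effect at the cutoff
bandwidth = 10.0      # chosen arbitrarily for this sketch

# Running variable, deterministic (sharp) treatment assignment, and outcome.
x = [random.uniform(0, 100) for _ in range(n)]
d = [1 if xi >= cutoff else 0 for xi in x]
y = [20 + 0.5 * xi + true_jump * di + random.gauss(0, 5) for xi, di in zip(x, d)]


def fit_line(xs, ys):
    """Ordinary least squares for y = intercept + slope * x."""
    mx = sum(xs) / len(xs)
    my = sum(ys) / len(ys)
    slope = sum((a - mx) * (b - my) for a, b in zip(xs, ys)) / sum(
        (a - mx) ** 2 for a in xs
    )
    return my - slope * mx, slope


# Center the running variable at the cutoff so each intercept is the
# fitted outcome at the cutoff itself.
left = [(xi - cutoff, yi) for xi, yi in zip(x, y) if cutoff - bandwidth <= xi < cutoff]
right = [(xi - cutoff, yi) for xi, yi in zip(x, y) if cutoff <= xi <= cutoff + bandwidth]

left_x, left_y = zip(*left)
right_x, right_y = zip(*right)
intercept_left, _ = fit_line(left_x, left_y)
intercept_right, _ = fit_line(right_x, right_y)

rdd_estimate = intercept_right - intercept_left
print(f"Estimated jump at the cutoff: {rdd_estimate:.2f} (true value {true_jump})")
```

This uses a rectangular kernel and a fixed bandwidth for simplicity. Applied work typically relies on data-driven bandwidth selection and accompanies the estimate with the specification tests discussed in this section.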
Slides
RDD slides
Code
Readings
Mixtape chapter 6: Regression discontinuity
Instrumental Variables
About ...
Slides ...
Code ...
Readings ...
Difference-in-Differences
About ...
Slides ...
Code ...
Readings ...
Synthetic Control
About ...
Slides ...
Code ...
Readings ...
Matching
About ...
Slides ...
Code ...
Readings ...