-
Notifications
You must be signed in to change notification settings - Fork 1
/
README.Rmd
69 lines (51 loc) · 2.67 KB
/
README.Rmd
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
56
57
58
59
60
61
62
63
64
65
66
67
68
---
output: github_document
---
<!-- README.md is generated from README.Rmd. Please edit that file -->
```{r, echo = FALSE}
knitr::opts_chunk$set(
collapse = TRUE,
comment = "#>",
fig.path = "README-"
)
```
# distreg
The `distreg` package is a collection of functions built for distribution regression. That is, one can estimate the distribution of $Y$ conditional on $X$ using a model for a binary outcome. For example,
$$
F_{Y|X}(y|x) = \Lambda(x'\beta)
$$
where $\Lambda$ is some known link function, such as logit.
This package was formerly called `TempleMetrics`. For the time being, we are just hosting the package on Github, but it should work just fine for you. The main methods are:
* `distreg` -- a function to implement distribution regression as in Foresi and Peracchi (1995) <doi:10.2307/2291056> and Chernozhukov, Fernandez-Val, and Melly (2013) <doi:10.3982/ECTA10582>.
* `lldistreg` -- a function to implement a local linear regression (it's local in just one particular variable -- often a ``treatment'')
In either case, it is often useful to convert these estimates into estimates of conditional distributions themselves. This can be done with the function `Fycondx`. We give some examples below.
## Installation
You can install `distreg` from github with:
```{r gh-installation}
# install.packages("devtools")
devtools::install_github("bcallaway11/distreg")
```
## Example 1
The first example is how to run distribution regression for a single value of $y$ of $Y$. This example uses the `igm` dataset which is a collection of 500 parent-child pairs of income along with the parent's education level which comes from the Panel Study of Income Dynamics (PSID).
```{r}
library(distreg)
data(igm)
head(igm)
```
```{r}
y0 <- median(igm$lcfincome)
dreg <- distreg(lcfincome ~ lfincome + HEDUC, igm, y0)
dreg
```
## Example 2
In many cases, of primary interest with distribution regression is obtaining $\hat{F}_{Y|X}(y|x)$ for some particular values of $y$ and $x$. That's what we do in this example.
```{r}
yvals <- seq(quantile(igm$lcfincome,.05,type=1),
quantile(igm$lcfincome,.95, type=1), length.out=100)
dres <- distreg(lcfincome ~ lfincome + HEDUC, igm, yvals)
xdf <- data.frame(lfincome=10, HEDUC="LessHS")
y0 <- yvals[50]
ecdf(igm$lcfincome)(y0)
Fycondx(dres, y0, xdf)
```
This example says that: (1) the fraction of "children" in the dataset with income below `r format(exp(y0),digits=4)` is `r round(ecdf(igm$lcfincome)(y0),2)`, but (2) we estimate that the fraction of children whose parent's income is `r format(exp(10),digits=4)` and have parent's with less than a HS education with income below `r format(exp(y0),digits=4)` is `r round(Fycondx(dres,y0,xdf)[[1]](y0),2)`.