Commit
Showing 4 changed files with 122 additions and 29 deletions.
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -1,36 +1,70 @@ | ||
|
||
<!-- README.md is generated from README.Rmd. Please edit that file --> | ||
|
||
# saeczi | ||
|
||
<!-- badges: start --> | ||
[![R-CMD-check](https://github.com/harvard-ufds/saeczi/actions/workflows/R-CMD-check.yaml/badge.svg)](https://github.com/harvard-ufds/saeczi/actions/workflows/R-CMD-check.yaml) | ||
<!-- badges: end --> | ||
|
||
# Development Mode | ||
## saeczi | ||
|
||
#### (Small Area Estimation for Continuous Zero Inflated data) | ||
|
||
`saeczi` is an R package that implements a small area estimator that | ||
uses a two-stage modeling approach for zero-inflated response variables. | ||
In particular, we are working with variables that follow a | ||
semi-continuous distribution with a mixture of zeroes and positive | ||
continuously distributed values. An example can be seen below: | ||
|
||
saeczi is still under development. Please use at your own risk! | ||
![](README-zi-plot-1.png)<!-- --> | ||
|
||
# saeczi | ||
`saeczi` first fits a linear mixed model to the non-zero portion of the | ||
response and then a generalized linear mixed model with binomial | ||
response to classify the probability of zero for a given data point. In | ||
estimation these models are each applied to new data points and combined | ||
to compute a final prediction. | ||
|
||
saeczi is an R package that allows for the fitting of a zero-inflation | ||
estimator onto a sample dataset. Please note that, in order for a | ||
dataset to be compatible with the zero-inflation estimator, the dataset | ||
must be of a sample dataset, where the means at the domain level of what | ||
would be considered a “population” dataset for the auxiliary variables | ||
must be available. To fit the zero-inflation estimator, first analyze the | ||
sample data and find a set of auxiliary variables that create a good | ||
model for both the linear regression model, as well as the logistic | ||
regression model. Lastly, assess the number of repetitions desired when | ||
fitting the bootstrap sample to estimate the variance. Once all of those | ||
things are decided, the unit_zi function can be used to predict domain | ||
level estimates of a sample dataset. | ||
The package can also generate MSE estimates using a parametric bootstrap | ||
approach described in Chandra and Sud (2012) either in parallel or | ||
sequentially. | ||
|
||
## Installation | ||
|
||
You can install saezi from github with: | ||
You can install the developmental version of `saeczi` from GitHub with: | ||
|
||
``` r | ||
# install.packages("pak") | ||
pak::pkg_install("harvard-ufds/saeczi") | ||
``` | ||
|
||
## Example | ||
|
||
We’ll use the internal package datasets to show an example of how to use | ||
`saeczi`. | ||
|
||
``` r | ||
install.packages("devtools") | ||
devtools::install_github("harvard-ufds/saeczi") | ||
library(saeczi) | ||
data(pop) | ||
data(samp) | ||
|
||
lin_formula <- DRYBIO_AG_TPA_live_ADJ ~ tcc16 + elev | ||
|
||
set.seed(5) | ||
result <- unit_zi(samp_dat = samp, | ||
pop_dat = pop, | ||
lin_formula = DRYBIO_AG_TPA_live_ADJ ~ tcc16 + elev, | ||
log_formula = DRYBIO_AG_TPA_live_ADJ ~ tcc16 + elev, | ||
domain_level = "COUNTYFIPS", | ||
mse_est = TRUE, | ||
B = 100, | ||
parallel = FALSE) | ||
|
||
|
||
result$res |> head() | ||
#> domain mse est | ||
#> 1 41001 61.01495 14.85495 | ||
#> 2 41003 87.99835 97.74967 | ||
#> 3 41005 176.88206 86.02207 | ||
#> 4 41007 344.48027 76.24752 | ||
#> 5 41009 76.81402 70.28624 | ||
#> 6 41011 80.75565 87.65072 | ||
``` |
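
The two-stage combination described in the new README text ("these models are each applied to new data points and combined to compute a final prediction") can be sketched in symbols. This is an illustrative reading, not something the diff itself states. The notation below is assumed, and so is the product form of the combination, which is the usual form for zero-inflated estimators of this kind. Here $i$ indexes units, $j$ indexes domains, $x_{ij}$ holds the auxiliary variables (for example `tcc16` and `elev`), and $U_j$ is the set of $N_j$ population units in domain $j$.

```latex
% Stage 1 (assumed notation): linear mixed model fitted to the non-zero responses
\hat{y}^{+}_{ij} = x_{ij}^{\top}\hat{\beta} + \hat{u}_{j}

% Stage 2 (assumed notation): binomial GLMM for the probability of a non-zero response
\hat{p}_{ij} = \operatorname{logit}^{-1}\!\left( x_{ij}^{\top}\hat{\gamma} + \hat{v}_{j} \right)

% Combined unit-level prediction (assumed product form)
\hat{y}_{ij} = \hat{p}_{ij}\,\hat{y}^{+}_{ij}

% Domain-level estimate: average of the unit-level predictions within domain j
\hat{\mu}_{j} = \frac{1}{N_j} \sum_{i \in U_j} \hat{y}_{ij}
```

If the second-stage model is parameterised as the probability of a zero, as the README wording suggests, then $\hat{p}_{ij}$ above is its complement. Under this reading, the `est` column in the example output would correspond to $\hat{\mu}_j$ for each `COUNTYFIPS` domain.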