diff --git a/README.Rmd b/README.Rmd
index c20f6aa..97bb33b 100644
--- a/README.Rmd
+++ b/README.Rmd
@@ -22,7 +22,7 @@ knitr::opts_chunk$set(
 
 `saeczi` is an R package that implements a small area estimator that uses a two-stage modeling approach for zero-inflated response variables. In particular, we are working with variables that follow a semi-continuous distribution with a mixture of zeroes and positive continuously distributed values. An example can be seen below.
 
-![](figs/README-zi-plot-1.png)
+![](figs/README-zi-plot-1.png){width = 70%}
 
 `saeczi` first fits a linear mixed model to the non-zero portion of the response and then a generalized linear mixed model with binomial response to classify the probability of zero for a given data point. In estimation these models are each applied to new data points and combined to compute a final prediction.
 
@@ -66,6 +66,9 @@ result <- saeczi(samp_dat = samp,
                  B = 1000L)
 ```
 
+
+#### Return
+
 The function returns the following objects:
 
 | Name | Description |
@@ -84,5 +87,20 @@ result$res |> head()
 
 ### Parallelization
 
+`saeczi` supports parallelization through the `future` package to speed up the bootstrapping process, but requires a small amount of additional work on the part of the user. It is not enough just to specify `parallel = TRUE` in the function signature as a `future::plan` must also be specified.
+
+Below is an example that uses `multisession` future resolution with 6 workers:
+
+```{r, eval = FALSE}
+future::plan("multisession", workers = 6)
+result_par <- saeczi(samp_dat = samp,
+                     pop_dat = pop,
+                     lin_formula = DRYBIO_AG_TPA_live_ADJ ~ tcc16 + elev,
+                     log_formula = DRYBIO_AG_TPA_live_ADJ ~ tcc16,
+                     domain_level = "COUNTYFIPS",
+                     mse_est = TRUE,
+                     parallel = TRUE,
+                     B = 1000L)
+```
diff --git a/README.md b/README.md
index ae64f87..b08043f 100644
--- a/README.md
+++ b/README.md
@@ -19,7 +19,7 @@ In particular, we are working with variables that follow a
 semi-continuous distribution with a mixture of zeroes and positive
 continuously distributed values. An example can be seen below.
 
-![](figs/README-zi-plot-1.png)
+![](figs/README-zi-plot-1.png){width = 70%}
 
 `saeczi` first fits a linear mixed model to the non-zero portion of the
 response and then a generalized linear mixed model with binomial
@@ -80,6 +80,8 @@ result <- saeczi(samp_dat = samp,
                  B = 1000L)
 ```
 
+#### Return
+
 The function returns the following objects:
 
 | Name | Description |
@@ -95,13 +97,34 @@ few rows of the results:
 
 ``` r
 result$res |> head()
-#>   COUNTYFIPS      mse       est
-#> 1      41001 216.3487  14.57288
-#> 2      41003 144.6466 103.33016
-#> 3      41005 276.4164  86.08616
-#> 4      41007 584.4503  78.79615
-#> 5      41009 169.2617  73.98920
-#> 6      41011 656.6422  90.44174
+#>   COUNTYFIPS         mse       est
+#> 1      41001   524.33803  14.57288
+#> 2      41003  1176.47914 103.33016
+#> 3      41005 18891.79642  86.08616
+#> 4      41007    27.43582  78.79615
+#> 5      41009  4674.77845  73.98920
+#> 6      41011    14.29977  90.44174
 ```
 
 ### Parallelization
+
+`saeczi` supports parallelization through the `future` package to speed
+up the bootstrapping process, but requires a small amount of additional
+work on the part of the user. It is not enough just to specify
+`parallel = TRUE` in the function signature as a `future::plan` must
+also be specified.
+
+Below is an example that uses `multisession` future resolution with 6
+workers:
+
+``` r
+future::plan("multisession", workers = 6)
+result_par <- saeczi(samp_dat = samp,
+                     pop_dat = pop,
+                     lin_formula = DRYBIO_AG_TPA_live_ADJ ~ tcc16 + elev,
+                     log_formula = DRYBIO_AG_TPA_live_ADJ ~ tcc16,
+                     domain_level = "COUNTYFIPS",
+                     mse_est = TRUE,
+                     parallel = TRUE,
+                     B = 1000L)
+```
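
As a rough illustration of the `future::plan` step that the Parallelization text describes, the sketch below sets a `multisession` plan, runs a few toy tasks on the workers, and then restores whatever plan was active before. It uses only the `future` package. `boot_mean()` is a hypothetical stand-in for a single bootstrap replicate and says nothing about how `saeczi` is implemented internally.

```r
library(future)

# Switch to background R sessions. plan() invisibly returns the plan that
# was active before, so it can be restored afterwards.
old_plan <- plan(multisession, workers = 2)

# Number of workers the active plan provides.
n_workers <- nbrOfWorkers()

# Hypothetical stand-in for one bootstrap replicate (not saeczi code):
# resample the integers 1..100 with replacement and return the mean.
boot_mean <- function(n) {
  mean(sample(seq_len(100), size = n, replace = TRUE))
}

# Launch four replicates as futures. seed = TRUE requests parallel-safe
# random number streams for each future.
futures <- lapply(1:4, function(i) future(boot_mean(50), seed = TRUE))

# Block until each future resolves and collect the results.
boot_means <- vapply(futures, value, numeric(1))

# Restore the previous plan so later code is unaffected.
plan(old_plan)
```

Restoring the earlier plan is optional, but it keeps the parallel set-up from leaking into unrelated code that runs later in the same session.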