README.Rmd

---
output: github_document
---

<!-- README.md is generated from README.Rmd. Please edit that file -->

```{r setup, include = FALSE}
options(width = 100)
Sys.setlocale("LC_COLLATE", "en_US.UTF-8") # ensure common sorting envir
knitr::opts_chunk$set(
  collapse = TRUE,
  comment = "#>",
  fig.path = "man/figures/README-",
  out.width = "100%"
)
ver <- desc::desc_get_version(".")
ver <- paste0("https://img.shields.io/badge/Version-", ver,
              "-success.svg?style=flat&logo=github")
```

# The power package

<!-- badges: start -->
![GitHub version](`r ver`)
[![CRAN status](http://www.r-pkg.org/badges/version/power)](https://cran.r-project.org/package=power)
[![R-CMD-check](https://github.com/stufield/power/workflows/R-CMD-check/badge.svg)](https://github.com/stufield/power/actions)
[![](https://cranlogs.r-pkg.org/badges/grand-total/power)](https://cran.r-project.org/package=power)
[![Codecov test coverage](https://codecov.io/gh/stufield/power/branch/main/graph/badge.svg)](https://app.codecov.io/gh/stufield/power?branch=main)
[![License: MIT](https://img.shields.io/badge/License-MIT-blue.svg)](https://choosealicense.com/licenses/mit/)
[![Lifecycle: experimental](https://img.shields.io/badge/lifecycle-experimental-orange.svg)](https://lifecycle.r-lib.org/articles/stages.html#experimental)
<!-- badges: end -->


The `power` package contains some simple functions to empirically
simulate and estimate statistical power under various statistical
test conditions. In general, simulations are performed with *known*
effect sizes or differences, and the proportion of detected significant
*p*-values represents the empirical power, i.e. $1 - \beta$ or 
`1 - TypeII` error.

The goal is typically get an idea of the required sample size
given an experimental design, statistical test, effect size, and 
desired power.


---------------


## Installation

The `power` package is not currently on [CRAN](https://CRAN.R-project.org)
but you can install the latest version from 
[github](https://github.com/stufield/power) via:

```{r install-github, eval = FALSE}
remotes::install_github("stufield/power")
```


## Loading `power`

Loading the `power` package is as simple as:

```{r load}
library(power)
```


## Plot power curves

The simplest way to generally plot power curves is via `plot_power_curves()`
which uses `power.t.test()` under the hood:

```{r plot-power-curves, fig.height = 6, fig.width = 11}
plot_power_curves(
  delta_vec = seq(0.5, 2, 0.1),
  power_vec = seq(0.5, 0.9, 0.1)
)
```

## Power curves and KS-distance

There is a loose "rule-of-thumb" relationship between Sensitivity/Specificity
and KS-distance, and of course, KS is related to effect size. So we
can visualize this relationship also via a standard power curve:

```{r ks-curves, fig.height = 6, fig.width = 11}
ks_tables <- ks_power_table()
ks_tables

# `power` as y-axis
plot(ks_tables)

# `n` as y-axis
plot(ks_tables, plot_power = FALSE)
```


-------------


## Two-Groups
### Empirical Power via Simulation

A more robust (?) empirical calculation of power can be
generated via simulation:

```{r power-curve-n, fig.height = 6, fig.width = 11}
# constant effect size (delta)
size_tbl <- withr::with_seed(1,
  t_power_curve(seq(10, 50, 2), delta = 0.75, nsim = 25L)
)
size_tbl

plot(size_tbl)
```

```{r power-curve-d, fig.height = 6, fig.width = 11}
# constant sample size (n)
delta_tbl <- withr::with_seed(2,
  t_power_curve(seq(0.5, 2.5, 0.1), n = 10, nsim = 25L)
)
delta_tbl

plot(delta_tbl)
```

### Solve for Sample Size

To solve for the sample size given a corresponding power value
you must have simulate power keeping "delta" (effect size) constant
and varying `n`. For example, the `size_tbl` object created above:

```{r solve-n}
solve_n(size_tbl, 0.8)
```


## Fisher's Exact for Count Data

For count data, Fisher's Exact tests assume a 2x2 contingency 
matrix and equal proportions across the margins (rows x cols):

```{r fisher-power}
fisher_power(0.85, 0.75, 200, 200, nsim = 200L)
```

### Fisher's Power Curve

```{r fisher-curve}
f_tbl <- fisher_power_curve(seq(50, 400, 5), p = 0.85, p_diff = -0.1,
                            nsim = 200L)
f_tbl
```


### Plot the Power Curve
```{r plot-fisher-curve, fig.height = 6, fig.width = 11}
gg_pwr <- plot(f_tbl)
gg_pwr
```

### Solve for `n`

```{r solve-n-fisher}
pwr_n <- solve_n(f_tbl, 0.8)
pwr_n
```


Visually check the curve and add the solution to the `ggplot`.


```{r plot-fisher-curve2, fig.height = 6, fig.width = 11}
gg_pwr +
  ggplot2::annotate("segment",
    x        = c(pwr_n[["n"]], min(f_tbl$n)),
    xend     = c(pwr_n[["n"]],pwr_n[["n"]]),
    y        = c(min(f_tbl$power), pwr_n[["power"]]),
    yend     = c(pwr_n[["power"]], pwr_n[["power"]]),
    linetype = "dashed", colour = "#00A499")
```