demotool

  • Description: Repository for demoing how to make a usable tool in an R package
  • Author: Tim Fraser, PhD

Want to make a small, compact R package? It's not as hard as you'd think! Here's a brief tutorial that will walk you through the process of building your first publicly available tool as an R package!

1. Necessary Files

Your R package will need the following documents.

  • DESCRIPTION: the package metadata file, including e.g. the package name, description, maintainers, etc. Always work from a template, because the format is persnickety.
  • /R: code folder, with scripts whose names should match your functions; usually 1 function per script.
  • NAMESPACE: a list of all functions for exporting. You don't edit this yourself; devtools::document() does it for you.
  • /man: manual folder, with manuals summarizing your functions. You don't make this yourself; devtools::document() generates it automatically from the roxygen comments on your functions.
  • /data: data folder, with .rda files for any objects your package relies on to run functions, e.g. data.frames, vectors, etc. Name of .rda file should match name of object when loaded.
  • /z: extra folder, for any other scripts or files you may need. I put scripts for building and testing the package here, as well as raw data.
  • .Rbuildignore: a list of regular expressions matching the files and folders to ignore when building the package. E.g. /z should go in here.
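
If you'd like to set up the empty skeleton from R rather than by hand, here's a minimal sketch using only base R. The temporary folder and the name demotool are placeholders for illustration; note that R looks for the ignore file under the exact name .Rbuildignore.

```r
# Sketch: create an empty package skeleton in a temporary folder.
# "demotool" and the temp location are placeholders; swap in your own.
pkg = file.path(tempdir(), "demotool")

# Folders you create yourself (/man is generated later by devtools::document())
for(folder in c("R", "data", "z")){
  dir.create(file.path(pkg, folder), recursive = TRUE, showWarnings = FALSE)
}

# Empty files to fill in by hand
file.create(file.path(pkg, c("DESCRIPTION", ".Rbuildignore")))

# Check what we made (all.files = TRUE so the dotfile shows up)
created = list.files(pkg, all.files = TRUE, no.. = TRUE)
print(created)
```

NAMESPACE and /man are left out on purpose, since devtools::document() writes them for you.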

1.1 DESCRIPTION file template

For example, my DESCRIPTION file looks like this (see below)! Update the Package: name, Title:, Authors@R:, Maintainer:, and Description: fields.

Package: demotool
Version: 0.1.0
Date: 2023-11-01
Title: Demo package for Publicly Available R Package Tool
Authors@R: c(
  person("Timothy", "Fraser", email = "tmf77@cornell.edu", role = c("aut", "cre")),
  person("Julia", "Fakeperson", email = "myemail@cornell.edu", role = "aut"),
  person("Javier", "Fakeperson", email = "myemail2@cornell.edu", role = "ctb") )
Author: Timothy Fraser [aut, cre],
  Julia Fakeperson [aut],
  Javier Fakeperson [ctb]
Maintainer: Timothy Fraser <tmf77@cornell.edu>
Description: One-line description for package goes here. Make sure this file ends with an extra blank line after the last field.
License: MIT + file LICENSE
Encoding: UTF-8
LazyData: true
Roxygen: list(markdown = TRUE)
RoxygenNote: 7.2.3
Depends: R (>= 3.5.0)
Imports:
    dplyr,
    tidyr
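
Because DESCRIPTION is persnickety, it can help to check that it parses before you build. Here's a rough sketch using base R's read.dcf(), which reads the same "Field: value" format. The example file and the list of fields I check are just for illustration, not R's full set of requirements.

```r
# Sketch: check that a DESCRIPTION file parses and has the key fields.
# A tiny example file is written to a temp folder here; point `path` at your own.
path = file.path(tempdir(), "DESCRIPTION")
writeLines(c(
  "Package: demotool",
  "Version: 0.1.0",
  "Title: Demo package for Publicly Available R Package Tool",
  "Description: One-line description for package goes here.",
  "License: MIT + file LICENSE"
), path)

# read.dcf() returns a character matrix with one column per field
desc = read.dcf(path)

# Fields to look for (an illustrative list only)
needed = c("Package", "Version", "Title", "Description", "License")
missing_fields = setdiff(needed, colnames(desc))
print(missing_fields)

# Pull out a single field
print(desc[1, "Package"])
```

If missing_fields comes back empty, the fields you asked about are all present.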

1.2 NAMESPACE file template

Your NAMESPACE file is automatically generated when you devtools::document() the package. Don't edit it yourself. It tells the package what functions to export for use. Here's a short sample:

# Generated by roxygen2: do not edit by hand

export(get_prob)
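
You never edit NAMESPACE, but you can read it to confirm what got exported. Here's a small sketch with base R's parseNamespaceFile(); the temp folder stands in for the folder that contains your package, and is only an assumption for the demo.

```r
# Sketch: read a NAMESPACE file to see which functions are exported.
# A one-export example is written to a temp folder; use your own package path instead.
lib = tempdir()
dir.create(file.path(lib, "demotool"), showWarnings = FALSE)
writeLines(c(
  "# Generated by roxygen2: do not edit by hand",
  "",
  "export(get_prob)"
), file.path(lib, "demotool", "NAMESPACE"))

# First argument: the package folder name. package.lib: the folder that contains it.
ns = parseNamespaceFile("demotool", package.lib = lib)
print(ns$exports)
```

If a function you tagged with @export is missing here, re-run devtools::document().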

1.3 .Rbuildignore file template

Your .Rbuildignore file tells R which files and folders to ignore when building the package. Important: the build will get confused by anything other than the package's core files (DESCRIPTION, NAMESPACE, .Rbuildignore, /R folder, /data folder, /man folder). Each line is a regular expression, matched against file and folder paths relative to the package's top-level folder.

  • To say, "ignore this file", write ^filename$ (escape any dots, e.g. ^\.Rhistory$).
  • To say, "ignore this type of file", write ^.*\.filetype$.
  • To say, "ignore this folder and its contents", write ^z$.

Here's my template .Rbuildignore file.

^.*\.Rproj$
^\.Rproj\.user$
^\.Renviron$
^\.Rhistory$
^z$
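
Since each line is a regular expression, you can try your patterns out with grepl() before building. This is only an approximation of what the build does (Perl-style matching, ignoring case), and the example paths are made up.

```r
# Sketch: test ignore patterns against some example paths.
# Backslashes are doubled here because these are R strings; in the file itself use one.
patterns = c("^.*\\.Rproj$", "^\\.Rproj\\.user$", "^z$")
paths = c("demotool.Rproj", ".Rproj.user", "z", "R", "DESCRIPTION", "zebra.R")

# A path is ignored if it matches at least one pattern
is_ignored = sapply(paths, function(p){
  any(sapply(patterns, function(pat){
    grepl(pat, p, perl = TRUE, ignore.case = TRUE)
  }))
})
print(is_ignored)
```

Notice that ^z$ matches the z folder but not zebra.R, because ^ and $ anchor the pattern to the whole name.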

1.4 function template

Your /R folder contains scripts, one for each function. Each script will need a special header, written as roxygen2 comments. Instead of #, we write #', followed by a tag like @name, which carries special meaning and helps the package auto-generate its own documentation. I've written up a short function called plus_one(), as well as a long function called get_prob(), that you can use as templates when building your functions.

Let's look at them!

1.4.1 Short function template: plus_one()

#' @name plus_one
#' @title Plus One
#' @description Function to add 1 to a numeric input vector.
#' @author Tim Fraser, PhD
#' @param x (numeric) vector of 1 or more numeric values.
#' @note Adding `@export` below means this function will become accessible by package users, rather than being an internal-only function.
#' @export
plus_one = function(x){
  output = x + 1
  return(output)
} 
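
Before moving on, it's worth a quick sanity check of any new function. Here's a small sketch; plus_one() is copied in so the snippet runs on its own, and the checks are just examples of what you might test.

```r
# plus_one() copied from the template so this snippet is self-contained
plus_one = function(x){
  output = x + 1
  return(output)
}

# Works on a whole vector at once
result = plus_one(x = c(1, 2, 3))
print(result)

# Non-numeric input fails; try() lets us capture the error instead of stopping
bad = try(plus_one(x = "a"), silent = TRUE)
print(inherits(bad, "try-error"))
```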

1.4.2 Long function template: get_prob()

Here's a template for a longer, more complex function. Check out my extra commenting, which will help you edit the code to suit your needs.

#' @name get_prob
#' @title `get_prob()`
#' @description
#' A short description of the function!
#' Can be multiple lines
#' Notice how roxygen commenting always starts with #'
#' For example:
#' This function calculates system reliability over time t given a set of lambdas supplied.
#' @note
#' - `@name` should have no spaces and match the function name.
#' - `@title` is the official title and can be anything.
#' - `@description` is a description that can span multiple lines
#' - `@param` marks each input parameter for your function.
#' - `@export` exports your function to be available to a user of your package. (E.g. not an internal function)
#' - `@importFrom` imports 1 or more specific functions from another package into your package's namespace, so your code can always find them. List those packages under `Imports:` in DESCRIPTION too.
#' @param t [integer] time passed. Can be a single integer or a vector of integers.
#' @param lambdas [vector] a vector of failure rates for components, named $\lambda$
#' @param type [character] a single value describing whether these probabilities should be combined using the rules of series or parallel systems.
#'
#' @note You can specify default inputs for an input parameter like with `type = "series"` below.
#' @examples 
#' 
#' # Get series system probability at each time t
#' get_prob(t = c(2,4,5,6), lambdas = c(0.001, 0.02), type = "series")
#' 
#' # Get parallel system probability at each time t
#' get_prob(t = c(2,4,5,6), lambdas = c(0.001, 0.02), type = "parallel")
#' 
#' @importFrom tidyr expand_grid
#' @importFrom dplyr `%>%` mutate summarize group_by
#' 
#' @export

get_prob = function(t = 100, lambdas, type = "series"){
  
  # Testing values #########################################
  #    I often store my testing values at the top of the function, 
  #    **commented out**, so I can run them then test the function easily.
  # t = c(100, 200, 300, 400)
  # lambdas = c(0.001, 0.002, 0.01)
  # type = "series"
  
  # Testing packages #######################################
  #    I often list my packages for testing here. 
  #    But be sure to add specific functions to @importFrom
  # library(dplyr)
  # library(tidyr)
  
  # Error handling #####################################
  # If type is not in series or parallel
  if(!type %in% c("series", "parallel")){
    # Stop and provide this error message.
    stop("type must equal 'series' or 'parallel'.")
  }
  
  
  # Get a grid of t and lambdas
  # Numbering your objects can help show progression when developing functions
  grid1 = tidyr::expand_grid(t = t, lambda = lambdas)
  
  grid2 = grid1 %>%
    # For each row,
    dplyr::mutate(
      # Calculate cumulative probability of failure by time t 
      # using exponential distribution, with rate from lambda column
      p_fail = pexp(t, rate = lambda),
      # Then get probability of survival/reliability by time t
      p_reliability = 1 - p_fail)
  
  # If type parameter equals 'series'
  if(type == "series"){
    # Get stats...
    stat = grid2 %>%
      # For each time t
      group_by(t) %>%
      # calculating the product of each component's reliability
      summarize(prob = prod(p_reliability) )
    
    # Or, instead, if type parameter equals 'parallel'
  }else if(type == "parallel"){
    # Get stats...
    stat = grid2 %>%
      # For each time t
      group_by(t) %>%
      # calculating 1 minus the product of each component's chance of failure
      summarize(prob = 1 - prod(p_fail))  
    # Any other value was already rejected by the error handling above.
  }
  
  # I like to always use return(). It's very clear.
  return(stat)
 
}
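
To double-check the logic, the same two formulas can be written in base R. In a series system every component must survive, so reliability is the product of the component reliabilities; in a parallel system the system fails only if every component fails. Here's a sketch; check_prob() is a hypothetical helper for illustration only, not part of demotool.

```r
# Base R cross-check of the two formulas (no dplyr/tidyr needed).
# check_prob() is a hypothetical helper, not part of the package.
check_prob = function(t, lambdas, type = "series"){
  sapply(t, function(ti){
    # Each component's chance of failure by time ti
    p_fail = pexp(ti, rate = lambdas)
    if(type == "series"){
      # All components must survive
      prod(1 - p_fail)
    }else{
      # System fails only if every component fails
      1 - prod(p_fail)
    }
  })
}

times = c(2, 4, 5, 6)
rates = c(0.001, 0.02)
series = check_prob(t = times, lambdas = rates, type = "series")
parallel = check_prob(t = times, lambdas = rates, type = "parallel")

# For exponential lifetimes, series reliability simplifies to exp(-t * sum(lambdas))
print(all.equal(series, exp(-times * sum(rates))))
```

A parallel system should never be less reliable than the same components in series, which makes a handy second check.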

1.5 Adding data to the package (optional)

If you need to add data to your package, you can put a make_data.R script in your /z folder (which is ignored when building the package, according to the .Rbuildignore file). You would run this script to save your data as a .rda file in the /data folder. It might look like my z/make_data.R template below.

#' @name make_data.R
#' @description Script for data generation
#' @note These roxygen2 comments are just for kicks. Not needed anywhere except the /R folder.

# Make the data, and give it an easy to use name (that's what users will call it when coding with it)
helper = tibble::tibble(x = 1, y = 2, z = 3)
# Alternatively, read it in from somewhere, eg.
# helper = readr::read_csv("z/helper.csv") # a hypothetical file

# Save it as an .rda file in the `/data` folder
save(helper, file = "data/helper.rda")
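
To confirm that the object name survives the round trip, you can save and re-load it. Here's a sketch using a temporary folder in place of /data and a plain data.frame in place of the tibble. Note that save() needs the path passed as file =, since unnamed arguments are treated as objects to save.

```r
# Sketch: save an object as .rda, then load it back and check its name.
# A temp folder stands in for the package's /data folder.
data_dir = file.path(tempdir(), "data")
dir.create(data_dir, showWarnings = FALSE)

helper = data.frame(x = 1, y = 2, z = 3)
save(helper, file = file.path(data_dir, "helper.rda"))

# Remove it, then restore it from file
rm(helper)
loaded = load(file.path(data_dir, "helper.rda"))

# load() returns the name(s) of the restored object(s)
print(loaded)
print(helper)
```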

2. devtools

Once you have your prerequisite files, you can build your R package! To build an R package, you will need to install devtools, the package for R package development!

There are 3 main steps: document() your functions, build() your package, and then install it with install.packages("packagename.tar.gz", type = "source").

I encourage you to create 2 helper files in the /z folder for this purpose, and to add the /z folder to the .Rbuildignore file. These files include...

  • z/workflow.R: a script where you design functions. A workspace.
  • z/dev.R: a script where you document() the package, test functions, build() the package when satisfied, and test install it.

Here are examples of each.

2.1 z/workflow.R:

#' @name workflow.R
#' @title Example Workflow for `demotool` package functions
#' @author Tim Fraser, your names here...
#' @description Script for test workflow of `demotool` package functions.


# Load functions straight from file
source("R/plus_one.R")
source("R/get_prob.R")

# Or use load_all() from devtools to load them as if they were a package
# devtools::load_all(".")

plus_one(x = 1)
get_prob(t = c(2,4,5,6), lambdas = c(0.001, 0.02), type = "series")


# Test out making new functions
plus_n = function(x, n){ x + n }

plus_n(x = c(1,2,3), n = 2)

# When you're happy with a function, go put it in an R script in the /R folder.

# Always a good idea to clear your environment and cache
rm(list = ls()); gc()

2.2 z/dev.R:

# Unload your package and uninstall it first (if it is installed).
unloadNamespace("demotool")
if("demotool" %in% rownames(installed.packages())){ remove.packages("demotool") }
# Auto-document your package, turning roxygen comments into manuals in the `/man` folder
devtools::document(".")
# Load your package temporarily!
devtools::load_all(".")

# Test out our functions

# Add 1 to x
demotool::plus_one(x = 1)
# Add 1 to a vector of x values
demotool::plus_one(x = c(1,2,3,4))

# Get series system probability at each time t
demotool::get_prob(t = c(2,4,5,6), lambdas = c(0.001, 0.02), type = "series")
 
# Get parallel system probability at each time t
demotool::get_prob(t = c(2,4,5,6), lambdas = c(0.001, 0.02), type = "parallel")

# Just think: you could make many functions,
# outputting vectors, data.frames, ggplot visuals, etc.
# So many options!

# When finished, remember to unload the package.
# (load_all() only loads it temporarily, so there is nothing installed to remove yet.)
unloadNamespace("demotool")

# Then, when ready, document, unload, build, and install the package!
# For speedy build, use binary = FALSE and vignettes = FALSE
devtools::document(".");
unloadNamespace("demotool");
devtools::build(pkg = ".", path = getwd(), binary = FALSE, vignettes = FALSE)


# Install your package from a local build file
# such as 
# install.packages("nameofyourpackagefile.tar.gz", type = "source")
# or in our case:
install.packages("demotool_0.1.0.tar.gz", type = "source")

# Load your package!
library("demotool")


# When finished, remember to unload the package
unloadNamespace("demotool"); remove.packages("demotool")

# Always a good idea to clear your environment and cache
rm(list = ls()); gc()

2.3 Download Your Package

Finally, once you're happy with your package, put the whole thing in a public GitHub repository (be sure to commit and push it). The repository's name should match the package name, just like this one (demotool). Then, you can use devtools to install your package straight from GitHub! Here's our z/install_from_github.R helper script:

#' @name install_from_github.R
#' @title Download Package from Github!
#' @author Tim Fraser
#' @description This script shows you how to install your package straight from a public github repository!
#' Note that for your own package, you'll need to put your finalized package on github first.

# Install your package straight from github!
# Try installing mine:
devtools::install_github("timothyfraser/demotool")
library(demotool) # load package 
plus_one(x = 2) # test function

# Now do yours!
# devtools::install_github("yourgithubname/your_repo_name_which_should_be_the_same_as_your_package_name")