From 1bfcf63692ff723cbbc204412cf0c8e97465024c Mon Sep 17 00:00:00 2001 From: Marie Vendettuoli Date: Tue, 5 Apr 2022 11:07:06 -0700 Subject: [PATCH] Fix compare digests (#95) * Author changes in description file * Dev version bump * update compare for same number of data objects * Update CONTRIBUTING.md with Git workflow guidelines and CI reminder. (#87) Co-authored-by: Jason Taylor jmtaylor@fredhutch.org * Vignette edits (#92) * text edits to vignettes * match casing for vignette names * rm non-rmd files in vignettes, add .gitignore to vignettes * add style pointers for vignettes to contributing * alter valid name expression to allow for different names in each set * add tests for .compare_digests * Usethis patch (#74) * Patch to address #73 * bump version, update travis, test locally, rerun codemeta. #73 * imports for withr for #73 * update news. #73 * refactor logic of .compare_digests, fix typo in package_build help * add test for .compare_digests refactor * Add maintainer email and role. 
* improve test coverage * new_digest zero length check in .check_digests not needed, removed * specify serialization for backward compatibility (#94) * specify serialization for backward compatibility * removed whitespace edits Co-authored-by: Jason Taylor * add condition for checking for removed data in digest * fix list subsetting, update tests * add back deleted dataversion.R Co-authored-by: William Fulp Co-authored-by: jmtaylor-fhcrc <42221209+jmtaylor-fhcrc@users.noreply.github.com> Co-authored-by: Jason Taylor Co-authored-by: Greg Finak Co-authored-by: Jimmy Fulp --- .github/CONTRIBUTING.md | 25 +- DESCRIPTION | 11 +- NAMESPACE | 1 + NEWS.md | 3 + R/build.R | 2 +- R/{dataversion.r => dataversion.R} | 0 R/digests.R | 56 +- R/processData.R | 3 +- R/yamlR.R | 1 + codemeta.json | 12 + tests/testthat/test-data-name-change.R | 61 ++ tests/testthat/test-edge-cases.R | 61 +- tests/testthat/test-ignore.R | 5 +- vignettes/.gitignore | 3 + ...ataPackageR.Rmd => Using_DataPackageR.Rmd} | 238 +++-- vignettes/YAML_CONFIG.R | 69 -- vignettes/YAML_CONFIG.html | 522 ----------- vignettes/YAML_CONFIG.md | 305 ------- ...FIG.Rmd => YAML_Configuration_Details.Rmd} | 60 +- vignettes/usingDataPackageR.R | 69 -- vignettes/usingDataPackageR.html | 819 ------------------ vignettes/usingDataPackageR.md | 585 ------------- 22 files changed, 338 insertions(+), 2573 deletions(-) rename R/{dataversion.r => dataversion.R} (100%) create mode 100644 tests/testthat/test-data-name-change.R create mode 100644 vignettes/.gitignore rename vignettes/{usingDataPackageR.Rmd => Using_DataPackageR.Rmd} (64%) delete mode 100644 vignettes/YAML_CONFIG.R delete mode 100644 vignettes/YAML_CONFIG.html delete mode 100644 vignettes/YAML_CONFIG.md rename vignettes/{YAML_CONFIG.Rmd => YAML_Configuration_Details.Rmd} (65%) delete mode 100644 vignettes/usingDataPackageR.R delete mode 100644 vignettes/usingDataPackageR.html delete mode 100644 vignettes/usingDataPackageR.md diff --git a/.github/CONTRIBUTING.md 
b/.github/CONTRIBUTING.md index 0bd8a17..cae5a97 100644 --- a/.github/CONTRIBUTING.md +++ b/.github/CONTRIBUTING.md @@ -6,7 +6,7 @@ Small typos or grammatical errors in documentation may be edited directly using the GitHub web interface, so long as the changes are made in the _source_ file. * YES: you edit a roxygen comment in a `.R` file below `R/`. -* NO: you edit an `.Rd` file below `man/`. +* NO: do not edit an `.Rd` file below `man/`. ### Prerequisites @@ -17,8 +17,11 @@ bug, create an associated issue and illustrate the bug with a minimal ### Pull request process -* We recommend that you create a Git branch for each pull request (PR). -* Look at the Travis and AppVeyor build status before and after making changes. +* We use the Gitflow workflow described here: +https://www.atlassian.com/git/tutorials/comparing-workflows/gitflow-workflow +* We recommend that you create a Git branch for each pull request (PR) and +address only one issue per branch. +* Look at the Git workflow checks before and after making changes. The `README` should contain badges for any continuous integration services used by the package. * We recommend the tidyverse [style guide](https://style.tidyverse.org). @@ -41,6 +44,22 @@ project you agree to abide by its terms. ### See rOpenSci [contributing guide](https://ropensci.github.io/dev_guide/contributingguide.html) for further details. +### Style pointers for vignettes + +* Headings are in sentence case and contain a period at the end of the heading when appropriate. +* Conjunctions are excluded from text. 
+* Multiline function calls use the following whitespace schema: + +``` R +myFunction <- aFunctionCall( + input1 = "this", + input2 = "that" +) +``` + +* Vignette file names are snake cased with capital first letters: Like_This.Rmd +* Vignette `VignetteIndexEntry` are in sentence case: My new vignette + ### Discussion forum Check out our [discussion forum](https://discuss.ropensci.org) if you think your issue requires a longer form discussion. diff --git a/DESCRIPTION b/DESCRIPTION index 7ff6060..e4280b9 100644 --- a/DESCRIPTION +++ b/DESCRIPTION @@ -7,15 +7,15 @@ Authors@R: email="greg.finak@gmail.com", comment="Original author and creator of DataPackageR"), person(given = "Paul Obrecht", role=c("ctb")), - person(given = "Ellis Hughes", role=c("ctb","cre"), email = "ehhughes@scharp.org"), + person(given = "Ellis Hughes", role=c("ctb"), email = "ellishughes@live.com"), person(given = "Jimmy Fulp", role=c("ctb"), email = "wfulp@scharp.org"), - person(given = "Marie Vendettuoli", role=c("ctb"), email = "mvendett@scharp.org"), - person(given = "Jason Taylor", role=c("ctb")), + person(given = "Marie Vendettuoli", role=c("ctb", "cre"), email = "sc.mvendett.viscdevelopers@scharp.org"), + person(given = "Jason Taylor", role=c("ctb"), email = "jmtaylor@fredhutch.org"), person("Kara", "Woo", role = "rev", comment = "Kara reviewed the package for ropensci, see "), person("William", "Landau", role = "rev", comment = "William reviewed the package for ropensci, see ")) -Version: 0.15.8 +Version: 0.15.8.9000 Description: A framework to help construct R data packages in a reproducible manner. 
Potentially time consuming processing of raw data sets into analysis ready data sets is done in a reproducible @@ -45,7 +45,8 @@ Imports: futile.logger, rprojroot, usethis, - crayon + crayon, + withr VignetteBuilder: knitr RoxygenNote: 7.1.1 Encoding: UTF-8 diff --git a/NAMESPACE b/NAMESPACE index a2e6532..e63f3a9 100644 --- a/NAMESPACE +++ b/NAMESPACE @@ -75,6 +75,7 @@ importFrom(utils,modifyList) importFrom(utils,package.skeleton) importFrom(utils,packageDescription) importFrom(utils,packageVersion) +importFrom(withr,with_options) importFrom(yaml,as.yaml) importFrom(yaml,read_yaml) importFrom(yaml,write_yaml) diff --git a/NEWS.md b/NEWS.md index 2af6c51..70aa7e4 100644 --- a/NEWS.md +++ b/NEWS.md @@ -1,3 +1,6 @@ +# DataPackageR 0.15.8.9000 +* Fix tests for compatibility with usethis >1.5.2 + # DataPackageR 0.15.8 * Fix to datapackager_object_read that was causing a test to break. `get` needs to have `inherits=FALSE`. * Other fixes for `usethis` 1.6.0 diff --git a/R/build.R b/R/build.R index c7f5952..6ca66d9 100644 --- a/R/build.R +++ b/R/build.R @@ -8,7 +8,7 @@ #' @param vignettes \code{logical} specify whether to build vignettes. Default FALSE. #' @param log log level \code{INFO,WARN,DEBUG,FATAL} #' @param deps \code{logical} should we pass data objects into subsequent scripts? Default TRUE -#' @param install \code{logical} automatically install and load the package after building. (default TRUE) +#' @param install \code{logical} automatically install and load the package after building. Default FALSE #' @param ... additional arguments passed to \code{install.packages} when \code{install=TRUE}. 
#' @importFrom roxygen2 roxygenise roxygenize #' @importFrom devtools build_vignettes build parse_deps reload diff --git a/R/dataversion.r b/R/dataversion.R similarity index 100% rename from R/dataversion.r rename to R/dataversion.R diff --git a/R/digests.R b/R/digests.R index ebdbd44..a59191c 100644 --- a/R/digests.R +++ b/R/digests.R @@ -30,42 +30,38 @@ } .compare_digests <- function(old_digest, new_digest) { - valid <- ifelse(length(old_digest) != length(new_digest), FALSE, TRUE) - if (valid) { - for (i in names(new_digest)[-1L]) { - if (new_digest[[i]] != old_digest[[i]]) { - .multilog_warn(paste0(i, " has changed.")) - valid <- FALSE + # Returns FALSE when any existing data has changed, new data is added, or data is removed; otherwise returns TRUE. + # Use .multilog_warn when existing data has changed and .multilog_debug when data is added or removed. + + existed <- names(new_digest)[names(new_digest) %in% names(old_digest)] + added <- setdiff(names(new_digest), existed) + removed <- names(old_digest)[!names(old_digest) %in% names(new_digest)] + existed <- existed[existed != "DataVersion"] + + if(length(existed) > 0){ + changed <- existed[ unlist(new_digest[existed]) != unlist(old_digest[existed]) ] + if(length(changed) > 0){ + for(name in changed){ + .multilog_warn(paste(name, "has changed.")) } - } - if (!valid) { warning() - } - } else { - difference <- setdiff(names(new_digest), names(old_digest)) - intersection <- intersect(names(new_digest), names(old_digest)) - # No existing or new objects are changed - if (length(difference) == 0) { - valid <- TRUE - } else { - # some new elements exist - valid <- FALSE - for (i in difference) { - .multilog_debug(paste0(i, " added.")) - } - } - for (i in intersection) { - if (new_digest[[i]] != old_digest[[i]]) { - .multilog_debug(paste0(i, " changed")) - # some new elements are not the same - valid <- FALSE - } + warning() + return(FALSE) } } - return(valid) -} + for(name in removed){ + .multilog_debug(paste(name, "was removed.")) + 
return(FALSE) + } + + for(name in added){ + .multilog_debug(paste(name, "was added.")) + return(FALSE) + } + return(TRUE) +} + .combine_digests <- function(new, old) { intersection <- intersect(names(new), names(old)) difference <- setdiff(names(new), names(old)) diff --git a/R/processData.R b/R/processData.R index ad174c3..972d177 100644 --- a/R/processData.R +++ b/R/processData.R @@ -360,7 +360,8 @@ DataPackageR <- function(arg = NULL, deps = TRUE) { assign(o, get(o, dataenv), ENVS) # write the object to render_root o_instance <- get(o,dataenv) - saveRDS(o_instance, file = paste0(file.path(render_root,o),".rds")) + saveRDS(o_instance, file = paste0(file.path(render_root,o),".rds"), + version = 2) } } } diff --git a/R/yamlR.R b/R/yamlR.R index b02b748..12a36fb 100644 --- a/R/yamlR.R +++ b/R/yamlR.R @@ -7,6 +7,7 @@ #' @details Add, remove files and objects, enable or disable parsing of specific files, list objects or files in a yaml config, or write a config back to a package. #' @importFrom yaml yaml.load_file as.yaml write_yaml #' @importFrom stats runif +#' @importFrom withr with_options #' @export #' #' @examples diff --git a/codemeta.json b/codemeta.json index b60faaf..8ea22bd 100644 --- a/codemeta.json +++ b/codemeta.json @@ -295,6 +295,18 @@ }, "sameAs": "https://CRAN.R-project.org/package=crayon" }, + { + "@type": "SoftwareApplication", + "identifier": "withr", + "name": "withr", + "provider": { + "@id": "https://cran.r-project.org", + "@type": "Organization", + "name": "Comprehensive R Archive Network (CRAN)", + "url": "https://cran.r-project.org" + }, + "sameAs": "https://CRAN.R-project.org/package=withr" + }, { "@type": "SoftwareApplication", "identifier": "https://sysreqs.r-hub.io/get/pandoc" diff --git a/tests/testthat/test-data-name-change.R b/tests/testthat/test-data-name-change.R new file mode 100644 index 0000000..68fea06 --- /dev/null +++ b/tests/testthat/test-data-name-change.R @@ -0,0 +1,61 @@ +test_that("data object can be renamed", { + 
addData <- function(dataset, pname){ + fil <- sprintf("data(%s, envir=environment())", dataset) + writeLines(fil, file.path(tempdir(), sprintf("%s/data-raw/%s.R", pname, dataset))) + + yml <- yml_add_files(file.path(tempdir(), pname), c(sprintf("%s.R", dataset))) + yml <- yml_add_objects(yml, dataset) + yml_write(yml) + + package_build(file.path(tempdir(), pname)) + } + + changeName <- function(old_dataset_name, new_dataset_name, pname){ + process_path <- file.path(tempdir(), sprintf("%s/data-raw/%s.R", pname, old_dataset_name)) + fil <- c(readLines(process_path), sprintf("%s <- %s", new_dataset_name, old_dataset_name)) + writeLines(fil, process_path) + + yml <- yml_remove_objects(file.path(tempdir(), pname), old_dataset_name) + yml <- yml_add_objects(yml, new_dataset_name) + yml_write(yml) + + package_build(file.path(tempdir(), pname)) + } + + removeName <- function(dataset_name, script, pname){ + process_path <- file.path(tempdir(), sprintf("%s/data-raw/%s", pname, script)) + fil <- gsub(paste0("^", dataset_name, ".+$"), "", readLines(process_path)) + writeLines(fil, process_path) + + yml <- yml_remove_objects(file.path(tempdir(), pname), dataset_name) + yml_write(yml) + + package_build(file.path(tempdir(), pname)) + } + + ## test change when one object is present + pname <- "nameChangeTest1" + datapackage_skeleton(pname, tempdir(), force = TRUE) + addData("mtcars", pname) + expect_error(changeName("mtcars", "mtcars2", pname), NA) + expect_error(removeName("mtcars2", "mtcars.R", pname), "exiting") + + ## test change when two objects are present + pname <- "nameChangeTest2" + datapackage_skeleton(pname, tempdir(), force = TRUE) + addData("mtcars", pname) + addData("iris", pname) + expect_error(changeName("mtcars", "mtcars2", pname), NA) + expect_error(removeName("mtcars2", "mtcars.R", pname), NA) + + ## test change when more than 2 objects are present + pname <- "nameChangeTest3" + datapackage_skeleton(pname, tempdir(), force = TRUE) + addData("mtcars", pname) + 
addData("iris", pname) + addData("ToothGrowth", pname) + expect_error(changeName("mtcars", "mtcars2", pname), NA) + expect_error(removeName("mtcars2", "mtcars.R", pname), NA) + +}) diff --git a/tests/testthat/test-edge-cases.R b/tests/testthat/test-edge-cases.R index 30e6be0..4a22a23 100644 --- a/tests/testthat/test-edge-cases.R +++ b/tests/testthat/test-edge-cases.R @@ -171,7 +171,8 @@ test_that("package built in different edge cases", { package.skeleton("foo", path = tempdir()) DataPackageR:::.multilog_setup(normalizePath(file.path(tempdir(),"test.log"), winslash = "/")) DataPackageR:::.multilog_thresold(INFO, TRACE) - + + # data in digest changes while names do not suppressWarnings(expect_false({ DataPackageR:::.compare_digests( list( @@ -185,6 +186,64 @@ ) })) + # names in digest change while data do not + suppressWarnings(expect_false({ + DataPackageR:::.compare_digests( + list( + DataVersion = "1.1.1", + a = paste0(letters[1:10], collapse = "") + ), + list( + DataVersion = "1.1.2", + b = paste0(letters[1:10], collapse = "") + ) + ) + })) + + # neither names nor data in digest change + suppressWarnings(expect_true({ + DataPackageR:::.compare_digests( + list( + DataVersion = "1.1.1", + a = paste0(letters[1:10], collapse = "") + ), + list( + DataVersion = "1.1.1", + a = paste0(letters[1:10], collapse = "") + ) + ) + })) + + # names in old digest have one more than new + suppressWarnings(expect_false({ + DataPackageR:::.compare_digests( + list( + DataVersion = "1.1.1", + a = paste0(letters[1:10], collapse = ""), + b = paste0(LETTERS[1:10], collapse = "") + ), + list( + DataVersion = "1.1.2", + a = paste0(letters[1:10], collapse = "") + ) + ) + })) + + # names in new digest have one more than old + suppressWarnings(expect_false({ + DataPackageR:::.compare_digests( + list( + DataVersion = "1.1.1", + a = paste0(letters[1:10], collapse = "") + ), + list( + DataVersion = "1.1.2", + a = paste0(letters[1:10], collapse = 
""), + b = paste0(LETTERS[1:10], collapse = "") + ) + ) + })) + unlink(file.path(tempdir(), "foo"), force = TRUE, recursive = TRUE) diff --git a/tests/testthat/test-ignore.R b/tests/testthat/test-ignore.R index c79f1b9..6666860 100644 --- a/tests/testthat/test-ignore.R +++ b/tests/testthat/test-ignore.R @@ -13,6 +13,9 @@ test_that("use_ignore works", { r_object_names = c("cars_over_20") ) ) - expect_invisible(use_ignore(file = "mydata.csv",path = "inst/extdata"))#,"Adding 'mydata.csv' to 'inst/extdata/\\.gitignore'|Adding '\\^inst/extdata/mydata\\\\.csv\\$' to '\\.Rbuildignore'") + #expect_output(use_ignore(file = "mydata.csv",path = "inst/extdata"),"Adding 'mydata.csv' to 'inst/extdata/\\.gitignore'|Adding '\\^inst/extdata/mydata\\\\.csv\\$' to '\\.Rbuildignore'") + withr::with_options(c(crayon.enabled = FALSE), { + expect_message(use_ignore(file = "mydata.csv", path = "inst/extdata"), "Adding 'mydata.csv' to 'inst/extdata/\\.gitignore'|Adding '\\^inst/extdata/mydata\\.csv\\$' to '\\.Rbuildignore'") + }) expect_message(use_ignore(),"No file name provided to ignore.") }) diff --git a/vignettes/.gitignore b/vignettes/.gitignore new file mode 100644 index 0000000..3c98fed --- /dev/null +++ b/vignettes/.gitignore @@ -0,0 +1,3 @@ +*.R +*.md +*.html diff --git a/vignettes/usingDataPackageR.Rmd b/vignettes/Using_DataPackageR.Rmd similarity index 64% rename from vignettes/usingDataPackageR.Rmd rename to vignettes/Using_DataPackageR.Rmd index 7cf06fe..563f438 100644 --- a/vignettes/usingDataPackageR.Rmd +++ b/vignettes/Using_DataPackageR.Rmd @@ -7,7 +7,7 @@ output: keep_md: TRUE toc: yes vignette: > - %\VignetteIndexEntry{A Guide to using DataPackageR} + %\VignetteIndexEntry{A guide to using DataPackageR} %\VignetteEngine{knitr::rmarkdown} %\usepackage[utf8]{inputenc} %\usepackage{graphicx} @@ -23,72 +23,62 @@ knitr::opts_chunk$set( ## Purpose -This vignette demonstrates how to use DataPackageR to build a data package. 
+This vignette demonstrates how to use DataPackageR to build a data package. DataPackageR aims to simplify data package construction. It provides mechanisms for reproducibly preprocessing and tidying raw data into documented, versioned, and packaged analysis-ready data sets. Long-running or computationally intensive data processing can be decoupled from the usual `R CMD build` process while maintaining [data lineage](https://en.wikipedia.org/wiki/Data_lineage). -DataPackageR aims to simplify data package construction. - -It provides mechanisms for reproducibly preprocessing and tidying raw data into into documented, versioned, and packaged analysis-ready data sets. - -Long-running or computationally intensive data processing can be decoupled from the usual `R CMD build` process while maintinaing [data lineage](https://en.wikipedia.org/wiki/Data_lineage). - -In this vignette we will subset and package the `mtcars` data set. +For demonstration purposes, in this vignette we will subset and package the `mtcars` data set. ## Set up a new data package. -We'll set up a new data package based on `mtcars` example in the [README](https://github.com/ropensci/DataPackageR/blob/master/README.md). +We will set up a new data package based on the `mtcars` example in the [README](https://github.com/ropensci/DataPackageR/blob/master/README.md). The `datapackage_skeleton()` API is used to set up a new package. The user needs to provide: -- R or Rmd code files that do data processing. -- A list of R object names created by those code files. -- Optionally a path to a directory of raw data (will be copied into the package). -- Optionally a list of additional code files that may be dependencies of your R scripts. - +- R or Rmd files that process data. +- A list of R object names created by those files. +- Optionally, a path to a directory of raw data (will be copied into the package). 
+- Optionally, a list of additional files that may be dependencies of your R or Rmd data processing files. ```{r minimal_example, results='hide', eval = rmarkdown::pandoc_available()} library(DataPackageR) -# Let's reproducibly package up -# the cars in the mtcars dataset -# with speed > 20. -# Our dataset will be called cars_over_20. +# Let's reproducibly package the cars in the mtcars dataset with speed +# > 20. Our dataset will be called `cars_over_20`. -# Get the code file that turns the raw data -# to our packaged and processed analysis-ready dataset. +# Get the code file that turns the raw data to our packaged and +# processed analysis-ready dataset. processing_code <- - system.file("extdata", - "tests", - "subsetCars.Rmd", - package = "DataPackageR") + system.file("extdata", "tests", "subsetCars.Rmd", package = "DataPackageR") # Create the package framework. DataPackageR::datapackage_skeleton(name = "mtcars20", - force = TRUE, - code_files = processing_code, - r_object_names = "cars_over_20", - path = tempdir() - #dependencies argument is empty - #raw_data_dir argument is empty. - ) + force = TRUE, + code_files = processing_code, + r_object_names = "cars_over_20", + path = tempdir() + #dependencies argument is empty + #raw_data_dir argument is empty. + ) + ``` ### What's in the package skeleton structure? -This has created a datapackage source tree named "mtcars20" (in a temporary directory). -For a real use case you would pick a `path` on your filesystem where you could then initialize a new github repository for the package. +The process above has created a DataPackageR source tree named "mtcars20" in a temporary directory. For a real use case, you would pick a path on your filesystem where you could then initialize a new github repository for the package. 
The contents of `mtcars20` are: ```{r dirstructure,echo=FALSE, eval = rmarkdown::pandoc_available()} library(data.tree) -df <- data.frame(pathString = file.path( - "mtcars20", - list.files( - file.path(tempdir(), "mtcars20"), - include.dirs = TRUE, - recursive = TRUE +df <- data.frame( + pathString = file.path( + "mtcars20", + list.files( + file.path(tempdir(), "mtcars20"), + include.dirs = TRUE, + recursive = TRUE + ) ) - )) +) as.Node(df) ``` @@ -97,7 +87,7 @@ It contains a new `DataVersion` string that will be automatically incremented wh The user-provided code files reside in `data-raw`. They are executed during the data package build process. -### A few words about the YAML config file +### A note about the YAML config file. A `datapackager.yml` file is used to configure and control the build process. @@ -109,15 +99,13 @@ cat(yaml::as.yaml(yaml::yaml.load_file(file.path(tempdir(),"mtcars20","datapacka The two main pieces of information in the configuration are a list of the files to be processed and the data sets the package will store. -This example packages an R data set named `cars_over_20` (the name was passed in to `datapackage_skeleton()`). -It is created by the `subsetCars.Rmd` file. - +This example packages an R data set named `cars_over_20` (the name was passed to `datapackage_skeleton()`), which is created by the `subsetCars.Rmd` file. -The objects must be listed in the yaml configuration file. `datapackage_skeleton()` ensures this is done for you automatically. +The objects must be listed in the yaml configuration file. `datapackage_skeleton()` ensures this is done for you automatically. DataPackageR provides an API for modifying this file, so it does not need to be done by hand. 
-Further information on the contents of the YAML configuration file, and the API are in the [YAML Configuration Details](YAML_CONFIG.html) +Further information on the contents of the YAML configuration file, and the API are in the [YAML Configuration Details](YAML_Configuration_Details.html) vignette. ### Where do I put my raw datasets? @@ -129,11 +117,9 @@ In this example we are reading the `mtcars` data set that is already in memory, ### An API to read raw data sets from within an R or Rmd processing script. -As stated in the README, in order for your processing scripts to be portable, you should not use absolute paths to files. -DataPackageR provides an API to point to the data package root directory and the `inst/extdata` and `data` subdirectories. -These are useful for constructing portable paths in your code to read files from these locations. +As stated in the README, in order for your processing scripts to be portable, you should not use absolute paths to files. DataPackageR provides an API to point to the data package root directory and the `inst/extdata` and `data` subdirectories. These are useful for constructing portable paths in your code to read files from these locations. -For example: to construct a path to a file named "mydata.csv" located in `inst/extdata` in your data package source tree: +For example, to construct a path to a file named "mydata.csv" located in `inst/extdata` in your data package source tree: - use `DataPackageR::project_extdata_path("mydata.csv")` in your `R` or `Rmd` file. This would return: e.g., `r file.path(tempdir(),"mtcars20","inst","extdata","mydata.csv")` @@ -148,11 +134,11 @@ Raw data sets that are stored externally (outside the data package source tree) If your processing scripts are Rmd files, the usual yaml header for rmarkdown documents should be present. -If you have Rmd files, you can still include a yaml header, but it should be commented with `#'` and it should be at the top of your R file. 
For example, a test R file in the DataPackageR package looks as follows: +If your processing scripts are R files, you can still include a yaml header, but it should be commented with `#'` and it should be at the top of your R file. For example, a test R file in the DataPackageR package looks as follows: ``` #'--- -#'title: Sample report from R script +#\'title: Sample report from R script #'author: Greg Finak #'date: August 1, 2018 #'--- @@ -164,19 +150,21 @@ This will be converted to an Rmd file with a proper yaml header, which will then ## Build the data package. -Once the skeleton framework is set up, +Once the skeleton framework is set up, run the preprocessing code to build `cars_over_20`, and reproducibly enclose it in a package. ```{r , eval = rmarkdown::pandoc_available()} -# Run the preprocessing code to build cars_over_20 -# and reproducibly enclose it in a package. dir.create(file.path(tempdir(),"lib")) -DataPackageR:::package_build(file.path(tempdir(),"mtcars20"), install = TRUE, lib = file.path(tempdir(),"lib")) + +DataPackageR:::package_build( + file.path(tempdir(),"mtcars20"), + install = TRUE, + lib = file.path(tempdir(),"lib") +) ``` -### Documenting your data set changes in NEWS.md +### Documenting your data set changes in NEWS. -When you build a package in interactive mode, you will be -prompted to input text describing the changes to your data package (one line). +When you build a package in interactive mode, you will be prompted to input text describing the changes to your data package (one line). These will appear in the NEWS.md file in the following format: @@ -188,12 +176,7 @@ A description of your changes to the package [The rest of the file] ``` - -### Why not just use R CMD build? - -If the processing script is time consuming or the data set is particularly large, then `R CMD build` would run the code each time the package is installed. 
In such cases, raw data may not be available, or the environment to do the data processing may not be set up for each user of the data. DataPackageR decouples data processing from package building/installation for data consumers. - -### A log of the build process +### Logging the build process. DataPackageR uses the `futile.logger` package to log progress. @@ -201,29 +184,30 @@ If there are errors in the processing, the script will notify you via logging to If everything goes smoothly, you will have a new package built in the parent directory. -In this case we have a new package -`mtcars20_1.0.tar.gz`. - +In this case we have a new package: `mtcars20_1.0.tar.gz`. ### A note about the package source directory after building. -The pacakge source directory changes after the first build. +The package source directory changes after the first build. ```{r, echo=FALSE, eval = rmarkdown::pandoc_available()} -df <- data.frame(pathString = file.path( - "mtcars20", - list.files( - file.path(tempdir(), "mtcars20"), - include.dirs = TRUE, - recursive = TRUE +df <- data.frame( + pathString = file.path( + "mtcars20", + list.files( + file.path(tempdir(), "mtcars20"), + include.dirs = TRUE, + recursive = TRUE + ) ) - )) - as.Node(df) +) + +as.Node(df) ``` ### Update the autogenerated documentation. -After the first build, the `R` directory contains `mtcars.R` that has autogenerated `roxygen2` markup documentation for the data package and for the packaged data `cars_over20`. +After the first build, the `R` directory contains `mtcars.R` that has autogenerated `roxygen2` markup documentation for the data package and for the `cars_over20` packaged data. The processed `Rd` files can be found in `man`. @@ -236,15 +220,15 @@ dir.create(file.path(tempdir(),"lib")) # a temporary library directory document(file.path(tempdir(),"mtcars20"), lib = file.path(tempdir(),"lib")) ``` -This is done without reprocessing the data. - -#### Dont' forget to rebuild the package. 
+Updating documentation does not reprocess the data. -You should update the documentation in `R/mtcars.R`, then call `package_build()` again. +Once the documentation is updated in `R/mtcars.R`, run `package_build()` again. +### Why not just use R CMD build? -## Installing and using the new data package +If the processing script is time consuming or the data set is particularly large, then `R CMD build` would run the code each time the package is installed. In such cases, raw data may not be available, or the environment to do the data processing may not be set up for each user of the data. DataPackageR decouples data processing from package building/installation for data consumers. +## Installing and using the new data package. ### Accessing vignettes, data sets, and data set documentation. @@ -257,11 +241,12 @@ The vignette will detail the processing performed by the `subsetCars.Rmd` proces The data set documentation will be accessible via `?cars_over_20`, and the data sets via `data()`. ```{r, eval = rmarkdown::pandoc_available()} -# create a temporary library to install into. +# Create a temporary library to install into. dir.create(file.path(tempdir(),"lib")) + # Let's use the package we just created. install.packages(file.path(tempdir(),"mtcars20_1.0.tar.gz"), type = "source", repos = NULL, lib = file.path(tempdir(),"lib")) -lns <- loadNamespace +lns <- loadNamespace if (!"package:mtcars20"%in%search()) attachNamespace(lns('mtcars20',lib.loc = file.path(tempdir(),"lib"))) #use library() in your code data("cars_over_20") # load the data @@ -273,63 +258,59 @@ vignettes <- vignette(package = "mtcars20", lib.loc = file.path(tempdir(),"lib") vignettes$results ``` - -### Using the DataVersion +### Using the DataVersion. Your downstream data analysis can depend on a specific version of the data in your data package by testing the DataVersion string in the DESCRIPTION file. 
We provide an API for this: ```{r, eval = rmarkdown::pandoc_available()} -# We can easily check the version of the data +# We can easily check the version of the data. DataPackageR::data_version("mtcars20", lib.loc = file.path(tempdir(),"lib")) # You can use an assert to check the data version in reports and # analyses that use the packaged data. assert_data_version(data_package_name = "mtcars20", - version_string = "0.1.0", - acceptable = "equal", + version_string = "0.1.0", acceptable = "equal", lib.loc = file.path(tempdir(),"lib")) #If this fails, execution stops #and provides an informative error. ``` - # Migrating old data packages. Version 1.12.0 has moved away from controlling the build process using `datasets.R` and an additional `masterfile` argument. -The build process is now controlled via a `datapackager.yml` configuration file located in the package root directory. (see [YAML Configuration Details](YAML_CONFIG.html)) +The build process is now controlled via a `datapackager.yml` configuration file located in the package root directory. See [YAML Configuration Details](YAML_Configuration_Details.html). -### Create a datapackager.yml file +## Create a datapackager.yml file. You can migrate an old package by constructing such a config file using the `construct_yml_config()` API. ```{r construct_config, eval = rmarkdown::pandoc_available()} -# assume I have file1.Rmd and file2.R located in /data-raw, -# and these create 'object1' and 'object2' respectively. +# Assume I have file1.Rmd and file2.R located in /data-raw, and these +# create 'object1' and 'object2' respectively. config <- construct_yml_config(code = c("file1.Rmd", "file2.R"), - data = c("object1", "object2")) + data = c("object1", "object2")) cat(yaml::as.yaml(config)) ``` `config` is a newly constructed yaml configuration object. It can be written to the package directory: ```{r, eval = rmarkdown::pandoc_available()} -path_to_package <- tempdir() #e.g., if tempdir() was the root of our package. 
+path_to_package <- tempdir() # e.g., if tempdir() was the root of our package. yml_write(config, path = path_to_package) ``` Now the package at `path_to_package` will build with version 1.12.0 or greater. -### Reading data sets from Rmd files +### Reading data sets from Rmd files. -In versions prior to 1.12.1 we would read data sets from `inst/extdata` in an `Rmd` script using paths relative to -`data-raw` in the data package source tree. +In versions prior to 1.12.1 we would read data sets from `inst/extdata` in an `Rmd` script using paths relative to `data-raw` in the data package source tree. For example: -#### The old way +#### The old way. ```r # read 'myfile.csv' from inst/extdata relative to data-raw where the Rmd is rendered. @@ -340,21 +321,18 @@ Now `Rmd` and `R` scripts are processed in `render_root` defined in the yaml con To read a raw data set we can get the path to the package source directory using an API call: - -#### The new way +#### The new way. ```r # DataPackageR::project_extdata_path() returns the path to the data package inst/extdata subdirectory. # DataPackageR::project_path() returns the path to the data package root directory. # DataPackageR::project_data_path() returns the path to the data package data subdirectory. -read.csv( - DataPackageR::project_extdata_path("myfile.csv") - ) +read.csv(DataPackageR::project_extdata_path("myfile.csv")) ``` -# Partial builds +# Partial builds. -We can also perform partial builds of a subset of files in a package by toggling the `enabled` key in the config file. +We can also perform partial builds of a subset of files in a package by toggling the `enabled` key in the yaml config file. This can be done with the following API: @@ -370,37 +348,38 @@ changes to take effect. The consequence of toggling a file to `enabled: no` is that it will be skipped when the package is rebuilt, but the data will still be retained in the package, and the documentation will not be altered. 
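As a concrete sketch (assuming the `mtcars20` example package lives under `tempdir()`, as elsewhere in this vignette), a partial build that skips `subsetCars.Rmd` on the next rebuild might look like:

```r
library(DataPackageR)

# Read the yaml config from the package root into an R object.
config <- yml_find(file.path(tempdir(), "mtcars20"))

# Toggle 'enabled: no' for the long-running script we want to skip.
config <- yml_disable_compile(config, filenames = "subsetCars.Rmd")

# Write the modified config back. The object returned by yml_find()
# carries the package path as an attribute, so no explicit path is needed.
yml_write(config)
```

On the next `package_build()`, `subsetCars.Rmd` is skipped while its previously built data and documentation are retained.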
-This is useful in situations where we have multiple data sets, and want to re-run one script to update a specific data set, but -not the other scripts because they may be too time consuming, for example. +This is useful in situations where we have multiple data sets, and we want to re-run one script to update a specific data set, but not the other scripts because they may be too time consuming. # Multi-script pipelines. We may have situations where we have multi-script pipelines. There are two ways to share data among scripts. 1. filesystem artifacts -2. data objects passed to subsequent scripts. +2. data objects passed to subsequent scripts -### File system artifacts +## File system artifacts. The yaml configuration property `render_root` specifies the working directory where scripts will be rendered. If a script writes files to the working directory, that is where files will appear. These can be read by subsequent scripts. -### Passing data objects to subsequent scripts. +## Passing data objects to subsequent scripts. + +A script can use `datapackager_object_read()` to access a data object designated for packaging that was created by a previously run script. -A script (e.g., `script2.Rmd`) running after `script1.Rmd` can access a stored data object named `script1_dataset` created by `script1.Rmd` by calling +For example, suppose `script2.Rmd` runs after `script1.Rmd` and needs to access a data object named `dataset1` that `script1.Rmd` created and designated for packaging. `script2.Rmd` can access this data set using the following expression: -`script1_dataset <- DataPackageR::datapackager_object_read("script1_dataset")`. +`dataset1 <- DataPackageR::datapackager_object_read("dataset1")`. Passing of data objects amongst scripts can be turned off via: `package_build(deps = FALSE)` -# Next steps +# Next steps. We recommend the following once your package is created. 
-## Place your package under source control +## Place your package under source control. You now have a data package source tree. @@ -410,26 +389,22 @@ You now have a data package source tree. 3. Push your local package repository to `github`. [see step 7](https://help.github.com/articles/adding-an-existing-project-to-github-using-the-command-line/) -This will let you version control your data processing code, and provide a mechanism for sharing your package with others. - +This will let you version control your data processing code, and will provide a mechanism for sharing your package with others. For more details on using git and github with R, there is an excellent guide provided by Jenny Bryan: [Happy Git and GitHub for the useR](https://happygitwithr.com/) and Hadley Wickham's [book on R packages]( https://r-pkgs.org/). -# Additional Details - -We provide some additional details for the interested. +# Additional Details. -### Fingerprints of stored data objects +## Fingerprints of stored data objects. -DataPackageR calculates an md5 checksum of each data object it stores, and keeps track of them in a file +DataPackageR calculates an md5 checksum of each data object it stores, and keeps track of them in a file called `DATADIGEST`. -- Each time the package is rebuilt, the md5 sums of the new data objects are compared against the DATADIGEST. -- If they don't match, the build process checks that the `DataVersion` string has been incremented in the `DESCRIPTION` file. -- If it has not the build process will exit and produce an error message. - -#### DATADIGEST +- Each time the package is rebuilt, the md5 sums of the new data objects are compared against `DATADIGEST`. +- If they do not match, the build process checks that the `DataVersion` string has been incremented in the `DESCRIPTION` file. +- If it has not, the build process will exit and produce an error message. 
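To give a feel for what such a fingerprint looks like, here is an illustrative sketch only. DataPackageR computes and tracks digests internally, and its exact serialization options may differ from this example, which uses the `digest` package directly:

```r
library(digest)  # md5/sha1 hashing of arbitrary R objects

# Stand-in for a packaged data object.
cars_over_20 <- subset(mtcars, mpg > 20)

# An md5 checksum of the serialized object. Any change to the data
# yields a different checksum, which is how a rebuild detects edits.
checksum <- digest(cars_over_20, algo = "md5")
checksum
```

A changed checksum without a corresponding `DataVersion` bump in `DESCRIPTION` is what triggers the build error described above.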
+### DATADIGEST The `DATADIGEST` file contains the following: @@ -437,8 +412,7 @@ The `DATADIGEST` file contains the following: cat(readLines(file.path(tempdir(),"mtcars20","DATADIGEST")),sep="\n") ``` - -#### DESCRIPTION +### DESCRIPTION The description file has the new `DataVersion` string. diff --git a/vignettes/YAML_CONFIG.R b/vignettes/YAML_CONFIG.R deleted file mode 100644 index a77c385..0000000 --- a/vignettes/YAML_CONFIG.R +++ /dev/null @@ -1,69 +0,0 @@ -## ---- echo = FALSE, results = 'hide', eval = rmarkdown::pandoc_available()---- -library(DataPackageR) -library(yaml) -yml <- DataPackageR::construct_yml_config(code = "subsetCars.Rmd", data = "cars_over_20") - -## ---- echo = FALSE, comment="", eval = rmarkdown::pandoc_available()----- -cat(yaml::as.yaml(yml)) - -## ---- eval = rmarkdown::pandoc_available()------------------------------- -# Note this is done by the datapackage_skeleton. -# The user doesn't usually need to call -# construct_yml_config() -yml <- DataPackageR::construct_yml_config( - code = "subsetCars.Rmd", - data = "cars_over_20" - ) - -## ----eval=FALSE---------------------------------------------------------- -# # returns an r object representation of -# # the config file. 
-# mtcars20_config <- yml_find( -# file.path(tempdir(),"mtcars20") -# ) - -## ---- comment="", eval = rmarkdown::pandoc_available()------------------- - yml_list_objects(yml) - -## ---- comment="", eval = rmarkdown::pandoc_available()------------------- - yml_list_files(yml) - -## ---- comment="", echo = 1, eval = rmarkdown::pandoc_available()--------- -yml_disabled <- yml_disable_compile( - yml, - filenames = "subsetCars.Rmd") -cat(as.yaml(yml_disabled)) - -## ---- comment="", echo = 1, eval = rmarkdown::pandoc_available()--------- -yml_enabled <- yml_enable_compile( - yml, - filenames = "subsetCars.Rmd") -cat(as.yaml(yml_enabled)) - -## ---- comment="", echo = 1, eval = rmarkdown::pandoc_available()--------- -yml_twofiles <- yml_add_files( - yml, - filenames = "anotherFile.Rmd") -# cat(as.yaml(yml_twofiles)) - -## ---- comment="", echo = 1, eval = rmarkdown::pandoc_available()--------- -yml_twoobj <- yml_add_objects( - yml_twofiles, - objects = "another_object") -# cat(as.yaml(yml_twoobj)) - -## ---- comment="", echo = 1, eval = rmarkdown::pandoc_available()--------- -yml_twoobj <- yml_remove_files( - yml_twoobj, - filenames = "anotherFile.Rmd") -# cat(as.yaml(yml_twoobj)) - -## ---- comment="", echo = 1, eval = rmarkdown::pandoc_available()--------- -yml_oneobj <- yml_remove_objects( - yml_twoobj, - objects = "another_object") -# cat(as.yaml(yml_oneobj)) - -## ---- eval = FALSE------------------------------------------------------- -# yml_write(yml_oneobj, path = "path_to_package") - diff --git a/vignettes/YAML_CONFIG.html b/vignettes/YAML_CONFIG.html deleted file mode 100644 index fe7e590..0000000 --- a/vignettes/YAML_CONFIG.html +++ /dev/null @@ -1,522 +0,0 @@ - - - - - - - - - - - - - - - - -The DataPackageR YAML configuration file. - - - - - - - - - - - - - - - - - -

[522 deleted lines of rendered HTML omitted: this generated vignette (Greg Finak, 2018-10-24) duplicates the deleted vignettes/YAML_CONFIG.md below.]
- - - - - - - - diff --git a/vignettes/YAML_CONFIG.md b/vignettes/YAML_CONFIG.md deleted file mode 100644 index 97dff9f..0000000 --- a/vignettes/YAML_CONFIG.md +++ /dev/null @@ -1,305 +0,0 @@ ---- -title: "The DataPackageR YAML configuration file." -author: "Greg Finak " -date: "2018-10-24" -output: - rmarkdown::html_vignette: - keep_md: TRUE - toc: yes - bibliography: bibliography.bib -vignette: > - %\VignetteIndexEntry{DataPackageR YAML configuration.} - %\VignetteEngine{knitr::rmarkdown} - %\usepackage[utf8]{inputenc} - %\usepackage{graphicx} -editor_options: - chunk_output_type: inline ---- - -# Configuring and controlling DataPackageR builds. - -Data package builds are controlled using the `datapackager.yml` file. - -This file is created in the package source tree when the user creates a package using `datapackage_skeleton()`. - -It is automatically populated with the names of the `code_files` and `data_objects` the passed in to datapackage_skeleton. - -## The `datapackager.yml` file. - -The structure of a correctly formatted `datapackager.yml` file is shown below: - - - - -``` -configuration: - files: - subsetCars.Rmd: - enabled: yes - objects: cars_over_20 - render_root: - tmp: '450393' -``` - -## YAML config file properties. - -The main section of the file is the `configuration:` section. - -It has three properties: - -- `files:` - - The files (`R` or `Rmd`) to be processed by DataPackageR. They are processed in the order shown. Users running multi-script workflows with dependencies between the scripts need to ensure the files are processed in the correct order. - - Here `subsetCars.Rmd` is the only file to process. The name is transformed to an absolute path within the package. - - Each file itself has just one property: - - - - - - `enabled:` - A logical `yes`, `no` flag indicating whether the file should be rendered during the build, or whether it should be skipped. 
- This is useful for 'turning off' long running processing tasks if they have not changed. Disabling processing of a file will not overwrite existing documentation or data objecs created during previous builds. - -- `objects:` - - The names of the data objects created by the processing files, to be stored in the package. These names are compared against the objects created in the render environment by each file. They names must match. -- `render_root:` - - The directory where the `Rmd` or `R` files will be rendered. Defaults to a randomly named subdirectory of `tempdir()`. Allows workflows that use multiple scripts and create file system artifacts to function correctly by simply writing to and reading from the working directory. - -## Editing the YAML config file. - -The structure of the YAML is simple enough to understand but complex enough that it can be a pain to edit by hand. - -DataPackageR provides a number of API calls to construct, read, modify, and write the yaml config file. - -### API calls - -#### `construct_yml_config` - - Make an r object representing a YAML config file. - -##### Example - The YAML config shown above was created by: - -```r -# Note this is done by the datapackage_skeleton. -# The user doesn't usually need to call -# construct_yml_config() -yml <- DataPackageR::construct_yml_config( - code = "subsetCars.Rmd", - data = "cars_over_20" - ) -``` - - -#### `yml_find` - - Read a yaml config file from a package path into an r object. - -##### Example - Read the YAML config file from the `mtcars20` example. - - -```r -# returns an r object representation of -# the config file. -mtcars20_config <- yml_find( - file.path(tempdir(),"mtcars20") - ) -``` - -#### `yml_list_objects` - - List the `objects` in a config read by `yml_find`. - - -##### Example - - -```r - yml_list_objects(yml) -``` - -``` - -cars_over_20 -``` - -#### `yml_list_files` - - List the `files` in a config read by `yml_find`. 
- -##### Example - - -```r - yml_list_files(yml) -``` - -``` - -subsetCars.Rmd -``` - -#### `yml_disable_compile` - - Disable compilation of named files in a config read by `yml_find`. - -##### Example - - -```r -yml_disabled <- yml_disable_compile( - yml, - filenames = "subsetCars.Rmd") -``` - -``` -configuration: - files: - subsetCars.Rmd: - enabled: no - objects: cars_over_20 - render_root: - tmp: '912178' -``` - -#### `yml_enable_compile` - - Enable compilation of named files in a config read by `yml_find`. - -##### Example - - -```r -yml_enabled <- yml_enable_compile( - yml, - filenames = "subsetCars.Rmd") -``` - -``` -configuration: - files: - subsetCars.Rmd: - enabled: yes - objects: cars_over_20 - render_root: - tmp: '912178' -``` - -#### `yml_add_files` - - Add named files to a config read by `yml_find`. - -##### Example - - -```r -yml_twofiles <- yml_add_files( - yml, - filenames = "anotherFile.Rmd") -``` - -``` -configuration: - files: - subsetCars.Rmd: - enabled: yes - anotherFile.Rmd: - enabled: yes - objects: cars_over_20 - render_root: - tmp: '912178' -``` - -#### `yml_add_objects` - - Add named objects to a config read by `yml_find`. - -##### Example - - -```r -yml_twoobj <- yml_add_objects( - yml_twofiles, - objects = "another_object") -``` - -``` -configuration: - files: - subsetCars.Rmd: - enabled: yes - anotherFile.Rmd: - enabled: yes - objects: - - cars_over_20 - - another_object - render_root: - tmp: '912178' -``` - -#### `yml_remove_files` - - Remove named files from a config read by `yml_find`. - -##### Example - - -```r -yml_twoobj <- yml_remove_files( - yml_twoobj, - filenames = "anotherFile.Rmd") -``` - -``` -configuration: - files: - subsetCars.Rmd: - enabled: yes - objects: - - cars_over_20 - - another_object - render_root: - tmp: '912178' -``` - -#### `yml_remove_objects` - - Remove named objects from a config read by `yml_find`. 
- -##### Example - - -```r -yml_oneobj <- yml_remove_objects( - yml_twoobj, - objects = "another_object") -``` - -``` -configuration: - files: - subsetCars.Rmd: - enabled: yes - objects: cars_over_20 - render_root: - tmp: '912178' -``` - -#### `yml_write` - - Write a modified config to its package path. - -##### Example - - -```r -yml_write(yml_oneobj, path = "path_to_package") -``` - -The `yml_oneobj` read by `yml_find()` carries an attribute -that is the path to the package. The user doesn't need to pass a `path` to `yml_write` if the config has been read by `yml_find`. diff --git a/vignettes/YAML_CONFIG.Rmd b/vignettes/YAML_Configuration_Details.Rmd similarity index 65% rename from vignettes/YAML_CONFIG.Rmd rename to vignettes/YAML_Configuration_Details.Rmd index 45f156f..b3d30a3 100644 --- a/vignettes/YAML_CONFIG.Rmd +++ b/vignettes/YAML_Configuration_Details.Rmd @@ -1,5 +1,5 @@ --- -title: "The DataPackageR YAML configuration file." +title: "YAML Configuration Details." author: "Greg Finak " date: "`r Sys.Date()`" output: @@ -8,7 +8,7 @@ output: toc: yes bibliography: bibliography.bib vignette: > - %\VignetteIndexEntry{The DataPackageR YAML configuration file.} + %\VignetteIndexEntry{A guide to using the DataPackageR YAML configuration file} %\VignetteEngine{knitr::rmarkdown} %\usepackage[utf8]{inputenc} %\usepackage{graphicx} @@ -18,11 +18,7 @@ editor_options: # Configuring and controlling DataPackageR builds. -Data package builds are controlled using the `datapackager.yml` file. - -This file is created in the package source tree when the user creates a package using `datapackage_skeleton()`. - -It is automatically populated with the names of the `code_files` and `data_objects` the passed in to datapackage_skeleton. +Data package builds are controlled using the `datapackager.yml` file. This file is created in the package source tree when the user creates a package using `datapackage_skeleton()`. 
It is automatically populated with the names of the `code_files` and `data_objects` that were passed in to `datapackage_skeleton()`. ## The `datapackager.yml` file. @@ -46,33 +42,31 @@ It has three properties: - `files:` - The files (`R` or `Rmd`) to be processed by DataPackageR. They are processed in the order shown. Users running multi-script workflows with dependencies between the scripts need to ensure the files are processed in the correct order. - - Here `subsetCars.Rmd` is the only file to process. The name is transformed to an absolute path within the package. +The files (`R` or `Rmd`) to be processed by DataPackageR. They are processed in the order shown. Users running multi-script workflows with dependencies between the scripts need to ensure the files are processed in the correct order. Here `subsetCars.Rmd` is the only file to process. The name is transformed to an absolute path within the package. - Each file itself has just one property: +Each file itself has just one property: - - - - `enabled:` - A logical `yes`, `no` flag indicating whether the file should be rendered during the build, or whether it should be skipped. - This is useful for 'turning off' long running processing tasks if they have not changed. Disabling processing of a file will not overwrite existing documentation or data objecs created during previous builds. + + A logical `yes`, `no` flag indicating whether the file should be rendered during the build, or whether it should be skipped. + + This is useful for 'turning off' long-running processing tasks if they have not changed. Disabling processing of a file will not overwrite existing documentation or data objects created during previous builds. - `objects:` - The names of the data objects created by the processing files, to be stored in the package. These names are compared against the objects created in the render environment by each file. They names must match. 
+The names of the data objects created by the processing files, to be stored in the package. These names are compared against the objects created in the render environment by each file. The names must match. + - `render_root:` - The directory where the `Rmd` or `R` files will be rendered. Defaults to a randomly named subdirectory of `tempdir()`. Allows workflows that use multiple scripts and create file system artifacts to function correctly by simply writing to and reading from the working directory. +The directory where the `Rmd` or `R` files will be rendered, which defaults to a randomly named subdirectory given by `tempdir()`. This render location allows workflows that use multiple scripts and create file system artifacts to function correctly by simply writing to and reading from the working directory. ## Editing the YAML config file. -The structure of the YAML is simple enough to understand but complex enough that it can be a pain to edit by hand. +The structure of the YAML is simple enough to understand but complex enough that it can be challenging to edit manually. DataPackageR provides a number of API calls to construct, read, modify, and write the yaml config file. -### API calls +### YAML config API calls. @@ -82,7 +76,7 @@ DataPackageR provides a number of API calls to construct, read, modify, and writ The YAML config shown above was created by: ```{r, eval = rmarkdown::pandoc_available()} # Note this is done by the datapackage_skeleton. -# The user doesn't usually need to call +# The user does not usually need to call # construct_yml_config() yml <- DataPackageR::construct_yml_config( code = "subsetCars.Rmd", @@ -96,6 +90,7 @@ yml <- DataPackageR::construct_yml_config( Read a yaml config file from a package path into an r object. ##### Example + Read the YAML config file from the `mtcars20` example. ```{r eval=FALSE} @@ -103,14 +98,13 @@ yml <- DataPackageR::construct_yml_config( # the config file. 
mtcars20_config <- yml_find( file.path(tempdir(),"mtcars20") - ) +) ``` #### `yml_list_objects` List the `objects` in a config read by `yml_find`. - ##### Example ```{r, comment="", eval = rmarkdown::pandoc_available()} @@ -136,7 +130,8 @@ mtcars20_config <- yml_find( ```{r, comment="", echo = 1, eval = rmarkdown::pandoc_available()} yml_disabled <- yml_disable_compile( yml, - filenames = "subsetCars.Rmd") + filenames = "subsetCars.Rmd" +) cat(as.yaml(yml_disabled)) ``` @@ -149,7 +144,8 @@ cat(as.yaml(yml_disabled)) ```{r, comment="", echo = 1, eval = rmarkdown::pandoc_available()} yml_enabled <- yml_enable_compile( yml, - filenames = "subsetCars.Rmd") + filenames = "subsetCars.Rmd" +) cat(as.yaml(yml_enabled)) ``` @@ -162,7 +158,8 @@ cat(as.yaml(yml_enabled)) ```{r, comment="", echo = 1, eval = rmarkdown::pandoc_available()} yml_twofiles <- yml_add_files( yml, - filenames = "anotherFile.Rmd") + filenames = "anotherFile.Rmd" +) # cat(as.yaml(yml_twofiles)) ``` @@ -175,7 +172,8 @@ yml_twofiles <- yml_add_files( ```{r, comment="", echo = 1, eval = rmarkdown::pandoc_available()} yml_twoobj <- yml_add_objects( yml_twofiles, - objects = "another_object") + objects = "another_object" +) # cat(as.yaml(yml_twoobj)) ``` @@ -188,7 +186,8 @@ yml_twoobj <- yml_add_objects( ```{r, comment="", echo = 1, eval = rmarkdown::pandoc_available()} yml_twoobj <- yml_remove_files( yml_twoobj, - filenames = "anotherFile.Rmd") + filenames = "anotherFile.Rmd" +) # cat(as.yaml(yml_twoobj)) ``` @@ -201,7 +200,8 @@ yml_twoobj <- yml_remove_files( ```{r, comment="", echo = 1, eval = rmarkdown::pandoc_available()} yml_oneobj <- yml_remove_objects( yml_twoobj, - objects = "another_object") + objects = "another_object" +) # cat(as.yaml(yml_oneobj)) ``` @@ -216,4 +216,4 @@ yml_write(yml_oneobj, path = "path_to_package") ``` The `yml_oneobj` read by `yml_find()` carries an attribute -that is the path to the package. 
The user doesn't need to pass a `path` to `yml_write` if the config has been read by `yml_find`. +that is the path to the package. The user does not need to pass a `path` to `yml_write` if the config has been read by `yml_find`. diff --git a/vignettes/usingDataPackageR.R b/vignettes/usingDataPackageR.R deleted file mode 100644 index 3e79d1a..0000000 --- a/vignettes/usingDataPackageR.R +++ /dev/null @@ -1,69 +0,0 @@ -## ---- echo = FALSE------------------------------------------------------- -knitr::opts_chunk$set( - collapse = TRUE, - comment = "", - eval = TRUE -) - -## ----minimal_example, results='hide', eval = rmarkdown::pandoc_available()---- -library(DataPackageR) - -# Let's reproducibly package up -# the cars in the mtcars dataset -# with speed > 20. -# Our dataset will be called cars_over_20. - -# Get the code file that turns the raw data -# to our packaged and processed analysis-ready dataset. -processing_code <- - system.file("extdata", - "tests", - "subsetCars.Rmd", - package = "DataPackageR") - -# Create the package framework. -DataPackageR::datapackage_skeleton(name = "mtcars20", - force = TRUE, - code_files = processing_code, - r_object_names = "cars_over_20", - path = tempdir() - #dependencies argument is empty - #raw_data_dir argument is empty. - ) - -## ----dirstructure,echo=FALSE, eval = rmarkdown::pandoc_available()------- -library(data.tree) -df <- data.frame(pathString = file.path( - "mtcars20", - list.files( - file.path(tempdir(), "mtcars20"), - include.dirs = TRUE, - recursive = TRUE - ) - )) -as.Node(df) - -## ---- echo=FALSE, eval = rmarkdown::pandoc_available()------------------- -cat(yaml::as.yaml(yaml::yaml.load_file(file.path(tempdir(),"mtcars20","datapackager.yml")))) - -## ---- eval = rmarkdown::pandoc_available()------------------------------- -# Run the preprocessing code to build cars_over_20 -# and reproducibly enclose it in a package. 
-dir.create(file.path(tempdir(),"lib")) -DataPackageR:::package_build(file.path(tempdir(),"mtcars20"), install = TRUE, lib = file.path(tempdir(),"lib")) - -## ---- echo=FALSE, eval = rmarkdown::pandoc_available()------------------- -df <- data.frame(pathString = file.path( - "mtcars20", - list.files( - file.path(tempdir(), "mtcars20"), - include.dirs = TRUE, - recursive = TRUE - ) - )) - as.Node(df) - -## ----rebuild_docs, eval = rmarkdown::pandoc_available()------------------ -dir.create(file.path(tempdir(),"lib")) # a temporary library directory -document(file.path(tempdir(),"mtcars20"), lib = file.path(tempdir(),"lib")) - diff --git a/vignettes/usingDataPackageR.html b/vignettes/usingDataPackageR.html deleted file mode 100644 index c0fb20f..0000000 --- a/vignettes/usingDataPackageR.html +++ /dev/null @@ -1,819 +0,0 @@ - - - - - - - - - - - - - - - - -Using DataPackageR - - - - - - - - - - - - - - - - - - -

Using DataPackageR

-

Greg Finak

-

2019-03-11

- - - - -
-

Purpose

-

This vignette demonstrates how to use DataPackageR to build a data package.

-

DataPackageR aims to simplify data package construction.

-

It provides mechanisms for reproducibly preprocessing and tidying raw data into into documented, versioned, and packaged analysis-ready data sets.

-

Long-running or computationally intensive data processing can be decoupled from the usual R CMD build process while maintinaing data lineage.

-

In this vignette we will subset and package the mtcars data set.

-
-
-

Set up a new data package.

-

We’ll set up a new data package based on mtcars example in the README. The datapackage_skeleton() API is used to set up a new package. The user needs to provide:

-
    -
  • R or Rmd code files that do data processing.
  • -
  • A list of R object names created by those code files.
  • -
  • Optionally a path to a directory of raw data (will be copied into the package).
  • -
  • Optionally a list of additional code files that may be dependencies of your R scripts.
  • -
- -
-

What’s in the package skeleton structure?

-

This has created a datapackage source tree named “mtcars20” (in a temporary directory). For a real use case you would pick a path on your filesystem where you could then initialize a new github repository for the package.

-

The contents of mtcars20 are:

-
Registered S3 methods overwritten by 'ggplot2':
-  method         from 
-  [.quosures     rlang
-  c.quosures     rlang
-  print.quosures rlang
-                levelName
-1  mtcars20              
-2   ¦--DESCRIPTION       
-3   ¦--R                 
-4   ¦--Read-and-delete-me
-5   ¦--data              
-6   ¦--data-raw          
-7   ¦   °--subsetCars.Rmd
-8   ¦--datapackager.yml  
-9   ¦--inst              
-10  ¦   °--extdata       
-11  °--man               
-

You should fill out the DESCRIPTION file to describe your data package. It contains a new DataVersion string that will be automatically incremented when the data package is built if the packaged data has changed.

-

The user-provided code files reside in data-raw. They are executed during the data package build process.


A few words about the YAML config file


A datapackager.yml file is used to configure and control the build process.


The contents are:

```
configuration:
  files:
    subsetCars.Rmd:
      enabled: yes
  objects: cars_over_20
  render_root:
    tmp: '69649'
```

The two main pieces of information in the configuration are a list of the files to be processed and the data sets the package will store.


This example packages an R data set named cars_over_20 (the name was passed in to datapackage_skeleton()). It is created by the subsetCars.Rmd file.


The objects must be listed in the yaml configuration file. datapackage_skeleton() ensures this is done for you automatically.


DataPackageR provides an API for modifying this file, so it does not need to be done by hand.


Further information on the contents of the YAML configuration file and its API can be found in the YAML Configuration Details vignette.


Where do I put my raw datasets?


Raw data (provided the size is not prohibitive) can be placed in inst/extdata.


The datapackage_skeleton() API has the raw_data_dir argument, which will copy the contents of raw_data_dir (and its subdirectories) into inst/extdata automatically.


In this example we are reading the mtcars data set that is already in memory, rather than from the file system.


An API to read raw data sets from within an R or Rmd processing script.


As stated in the README, in order for your processing scripts to be portable, you should not use absolute paths to files. DataPackageR provides an API to point to the data package root directory and the inst/extdata and data subdirectories. These are useful for constructing portable paths in your code to read files from these locations.


For example: to construct a path to a file named “mydata.csv” located in inst/extdata in your data package source tree:

  • Use DataPackageR::project_extdata_path("mydata.csv") in your R or Rmd file. This would return, e.g., /var/folders/jh/x0h3v3pd4dd497g3gtzsm8500000gn/T//Rtmp765KFQ/mtcars20/inst/extdata/mydata.csv

Similarly:

  • DataPackageR::project_path() constructs a path to the data package root directory (e.g., /var/folders/jh/x0h3v3pd4dd497g3gtzsm8500000gn/T//Rtmp765KFQ/mtcars20).
  • DataPackageR::project_data_path() constructs a path to the data package data subdirectory (e.g., /var/folders/jh/x0h3v3pd4dd497g3gtzsm8500000gn/T//Rtmp765KFQ/mtcars20/data).

Raw data sets that are stored externally (outside the data package source tree) can be constructed relative to the project_path().


YAML header metadata for R files and Rmd files.


If your processing scripts are Rmd files, the usual yaml header for rmarkdown documents should be present.


If you have R files, you can still include a yaml header, but it should be commented with #' and placed at the top of your R file. For example, a test R file in the DataPackageR package looks as follows:

```
#'---
#'title: Sample report from R script
#'author: Greg Finak
#'date: August 1, 2018
#'---
data <- runif(100)
```

This will be converted to an Rmd file with a proper yaml header, which will then be turned into a vignette and indexed in the built package.


Build the data package.


Once the skeleton framework is set up, build the package:

```
# Run the preprocessing code to build cars_over_20
# and reproducibly enclose it in a package.
DataPackageR:::package_build(file.path(tempdir(), "mtcars20"))

✔ Setting active project to '/private/var/folders/jh/x0h3v3pd4dd497g3gtzsm8500000gn/T/Rtmp765KFQ/mtcars20'
✔ 1 data set(s) created by subsetCars.Rmd
• cars_over_20
☘ Built all datasets!
Non-interactive NEWS.md file update.
✔ Creating 'vignettes/'
✔ Creating 'inst/doc/'
First time using roxygen2. Upgrading automatically...
Updating roxygen version in /private/var/folders/jh/x0h3v3pd4dd497g3gtzsm8500000gn/T/Rtmp765KFQ/mtcars20/DESCRIPTION
Writing NAMESPACE
Loading mtcars20
Writing mtcars20.Rd
Writing cars_over_20.Rd
✔  checking for file ‘/private/var/folders/jh/x0h3v3pd4dd497g3gtzsm8500000gn/T/Rtmp765KFQ/mtcars20/DESCRIPTION’
─  preparing ‘mtcars20’:
✔  checking DESCRIPTION meta-information
─  checking for LF line-endings in source and make files and shell scripts
─  checking for empty or unneeded directories
─  looking to see if a ‘data/datalist’ file should be added
   NB: this package now depends on R (>= 3.5.0)
   WARNING: Added dependency on R >= 3.5.0 because serialized objects in
   serialize/load version 3 cannot be read in older versions of R.
   File(s) containing such objects: 'mtcars20/data/cars_over_20.rda'
─  building ‘mtcars20_1.0.tar.gz’

Next Steps
1. Update your package documentation.
   - Edit the documentation.R file in the package source data-raw subdirectory and update the roxygen markup.
   - Rebuild the package documentation with document().
2. Add your package to source control.
   - Call git init in the package source root directory.
   - git add the package files.
   - git commit your new package.
   - Set up a GitHub repository for your package.
   - Add the GitHub repository as a remote of your local package repository.
   - git push your local repository to GitHub.
[1] "/private/var/folders/jh/x0h3v3pd4dd497g3gtzsm8500000gn/T/Rtmp765KFQ/mtcars20_1.0.tar.gz"
```

Documenting your data set changes in NEWS.md


When you build a package in interactive mode, you will be prompted to input text describing the changes to your data package (one line).


These will appear in the NEWS.md file in the following format:

```
DataVersion: xx.yy.zz
========
A description of your changes to the package

[The rest of the file]
```

Why not just use R CMD build?


If the processing script is time consuming or the data set is particularly large, then R CMD build would run the code each time the package is installed. In such cases, raw data may not be available, or the environment to do the data processing may not be set up for each user of the data. DataPackageR decouples data processing from package building/installation for data consumers.


A log of the build process


DataPackageR uses the futile.logger package to log progress.


If there are errors in the processing, the script will notify you via logging to console and to /private/tmp/Test/inst/extdata/Logfiles/processing.log. Errors should be corrected and the build repeated.


If everything goes smoothly, you will have a new package built in the parent directory.


In this case we have a new package mtcars20_1.0.tar.gz.


A note about the package source directory after building.


The package source directory changes after the first build.

```
                         levelName
1  mtcars20
2   ¦--DATADIGEST
3   ¦--DESCRIPTION
4   ¦--NAMESPACE
5   ¦--NEWS.md
6   ¦--R
7   ¦   °--mtcars20.R
8   ¦--Read-and-delete-me
9   ¦--data
10  ¦   °--cars_over_20.rda
11  ¦--data-raw
12  ¦   ¦--documentation.R
13  ¦   ¦--subsetCars.R
14  ¦   °--subsetCars.Rmd
15  ¦--datapackager.yml
16  ¦--inst
17  ¦   ¦--doc
18  ¦   ¦   ¦--subsetCars.Rmd
19  ¦   ¦   °--subsetCars.html
20  ¦   °--extdata
21  ¦       °--Logfiles
22  ¦           ¦--processing.log
23  ¦           °--subsetCars.html
24  ¦--man
25  ¦   ¦--cars_over_20.Rd
26  ¦   °--mtcars20.Rd
27  °--vignettes
28      °--subsetCars.Rmd
```

Update the autogenerated documentation.


After the first build, the R directory contains mtcars20.R, which holds autogenerated roxygen2 markup documentation for the data package and for the packaged data set cars_over_20.


The processed Rd files can be found in man.


The autogenerated documentation source is in the documentation.R file in data-raw.


You should update this file to properly document your objects. Then rebuild the documentation:

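DataPackageR exports a document() function for this; for the example package the call is:

```r
# Rebuild the package documentation without reprocessing the data.
DataPackageR::document(file.path(tempdir(), "mtcars20"))
```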

This is done without reprocessing the data.


Don’t forget to rebuild the package.


You should update the documentation in R/mtcars20.R, then call package_build() again.


Installing and using the new data package

Accessing vignettes, data sets, and data set documentation.

The package source also contains files in the vignettes and inst/doc directories that provide a log of the data processing. When the package is installed, these are accessible via the vignette() API, and the vignette details the processing performed by the subsetCars.Rmd processing script.

The data set documentation is accessible via ?cars_over_20, and the data sets via data().
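For example, installing into a throwaway library and checking the packaged data version (data_version() and assert_data_version() are DataPackageR helpers for pinning analyses to a specific DataVersion):

```r
# Create a temporary library to install into.
dir.create(file.path(tempdir(), "lib"))
# Install the package we just built.
install.packages(file.path(tempdir(), "mtcars20_1.0.tar.gz"),
                 type = "source", repos = NULL,
                 lib = file.path(tempdir(), "lib"))
if (!"package:mtcars20" %in% search())
  attachNamespace("mtcars20")  # use library() in your own code
data("cars_over_20")  # load the data
cars_over_20          # now we can use it
?cars_over_20         # the documentation you wrote in data-raw/documentation.R

# Downstream analyses can depend on a specific version of the data.
DataPackageR::data_version("mtcars20")  # e.g., '0.1.0'
assert_data_version(data_package_name = "mtcars20",
                    version_string = "0.1.0",
                    acceptable = "equal")
# If the assertion fails, execution stops with an informative error.
```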

Migrating old data packages.


Version 1.12.0 has moved away from controlling the build process using datasets.R and an additional masterfile argument.


The build process is now controlled via a datapackager.yml configuration file located in the package root directory. (see YAML Configuration Details)


Create a datapackager.yml file


You can migrate an old package by constructing such a config file using the construct_yml_config() API.

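A sketch, assuming the old package has file1.Rmd and file2.R in data-raw, creating 'object1' and 'object2' respectively:

```r
config <- construct_yml_config(code = c("file1.Rmd", "file2.R"),
                               data = c("object1", "object2"))
cat(yaml::as.yaml(config))
```

This prints a configuration block listing each file under files: with enabled: yes, the two objects under objects:, and a render_root entry.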

config is a newly constructed yaml configuration object. It can be written to the package directory:

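For example:

```r
path_to_package <- tempdir()  # e.g., if tempdir() were the root of our package
yml_write(config, path = path_to_package)
```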

Now the package at path_to_package will build with version 1.12.0 or greater.


Reading data sets from Rmd files


In versions prior to 1.12.1 we would read data sets from inst/extdata in an Rmd script using paths relative to data-raw in the data package source tree.


For example:


The old way

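For an illustrative file named myfile.csv:

```r
# Read 'myfile.csv' from inst/extdata, relative to data-raw
# where the Rmd used to be rendered.
read.csv(file.path("../inst/extdata", "myfile.csv"))
```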

Now Rmd and R scripts are processed in render_root defined in the yaml config.


To read a raw data set we can get the path to the package source directory using an API call:

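The new way uses the path helpers, so the script no longer cares where it is rendered:

```r
# project_extdata_path() resolves a file under the data package's
# inst/extdata subdirectory; project_path() and project_data_path()
# resolve the package root and data subdirectory analogously.
read.csv(DataPackageR::project_extdata_path("myfile.csv"))
```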

Partial builds


We can also perform partial builds of a subset of files in a package by toggling the enabled key in the config file.


This can be done with the following API:

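Continuing the two-file example above, file2.R can be disabled like so:

```r
config <- yml_disable_compile(config, filenames = "file2.R")
# Write the modified yml back to the package.
yml_write(config, path = path_to_package)
```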

Note that the modified configuration needs to be written back to the package source directory in order for the changes to take effect.


The consequence of toggling a file to enabled: no is that it will be skipped when the package is rebuilt, but its data will still be retained in the package, and its documentation will not be altered.


This is useful when a package has multiple data sets and we want to re-run one script to update a specific data set, without re-running other scripts that may be too time consuming.


Multi-script pipelines.


We may have situations with multi-script pipelines. There are two ways to share data among scripts:

  1. Filesystem artifacts.
  2. Data objects passed to subsequent scripts.

File system artifacts


The yaml configuration property render_root specifies the working directory where scripts will be rendered.


If a script writes files to the working directory, that is where files will appear. These can be read by subsequent scripts.


Passing data objects to subsequent scripts.


A script (e.g., script2.Rmd) running after script1.Rmd can access a stored data object named script1_dataset created by script1.Rmd by calling


script1_dataset <- DataPackageR::datapackager_object_read("script1_dataset").


Passing of data objects amongst scripts can be turned off via:


package_build(deps = FALSE)


Next steps


We recommend the following once your package is created.


Place your package under source control


You now have a data package source tree.

  1. Call git init in the package source root to initialize a new git repository.
  2. Create a new repository for your data package on GitHub.
  3. Push your local package repository to GitHub.

This will let you version control your data processing code, and provide a mechanism for sharing your package with others.


For more details on using git and GitHub with R, see Jenny Bryan’s excellent guide Happy Git and GitHub for the useR and Hadley Wickham’s book R Packages.


Additional Details


We provide some additional details for the interested.


Fingerprints of stored data objects


DataPackageR calculates an md5 checksum of each data object it stores, and keeps track of them in a file called DATADIGEST.

  • Each time the package is rebuilt, the md5 sums of the new data objects are compared against the DATADIGEST.
  • If they don’t match, the build process checks that the DataVersion string has been incremented in the DESCRIPTION file.
  • If it has not been incremented, the build process will exit and produce an error message.

DATADIGEST


The DATADIGEST file contains the following:

```
DataVersion: 0.1.0
cars_over_20: 3ccb5b0aaa74fe7cfc0d3ca6ab0b5cf3
```

DESCRIPTION


The description file has the new DataVersion string.

```
Package: mtcars20
Title: What the Package Does (One Line, Title Case)
Version: 1.0
Authors@R: 
    person(given = "First",
           family = "Last",
           role = c("aut", "cre"),
           email = "first.last@example.com")
Description: What the package does (one paragraph).
License: What license it uses
Encoding: UTF-8
LazyData: true
DataVersion: 0.1.0
Roxygen: list(markdown = TRUE)
Date: 2019-03-11
Suggests: 
    knitr,
    rmarkdown
VignetteBuilder: knitr
RoxygenNote: 6.1.1
```