diff --git a/_freeze/archive/CaseStudy01/execute-results/html.json b/_freeze/archive/CaseStudy01/execute-results/html.json
new file mode 100644
index 0000000..df84273
--- /dev/null
+++ b/_freeze/archive/CaseStudy01/execute-results/html.json
@@ -0,0 +1,18 @@
+{
+  "hash": "c061862e27098cb02ccfc71ba1c8ea30",
+  "result": {
+    "markdown": "---\ntitle: \"Algorithmic Thinking Case Study 1\"\nsubtitle: \"SISMID 2024 -- Introduction to R\"\nformat:\n  revealjs:\n    toc: false\nexecute: \n  echo: false\n---\n\n\n## Learning goals\n\n* Use logical operators, subsetting functions, and math calculations in R\n* Translate human-understandable problem descriptions into instructions that\nR can understand.\n\n# Remember, R always does EXACTLY what you tell it to do!\n\n## Instructions\n\n* Make a new R script for this case study, and save it to your code folder.\n* We'll use the diphtheria serosample data from Exercise 1 for this case study.\nLoad it into R and use the functions we've learned to look at it.\n\n## Instructions\n\n* Make a new R script for this case study, and save it to your code folder.\n* We'll use the diphtheria serosample data from Exercise 1 for this case study.\nLoad it into R and use the functions we've learned to look at it.\n* The `str()` of your dataset should look like this.\n\n\n\n\n::: {.cell}\n::: {.cell-output .cell-output-stdout}\n```\ntibble [250 × 5] (S3: tbl_df/tbl/data.frame)\n $ age_months  : num [1:250] 15 44 103 88 88 118 85 19 78 112 ...\n $ group       : chr [1:250] \"urban\" \"rural\" \"urban\" \"urban\" ...\n $ DP_antibody : num [1:250] 0.481 0.657 1.368 1.218 0.333 ...\n $ DP_infection: num [1:250] 1 1 1 1 1 1 1 1 1 1 ...\n $ DP_vacc     : num [1:250] 0 1 1 1 1 1 1 1 1 1 ...\n```\n:::\n:::\n\n\n## Q1: Was the overall prevalence higher in urban or rural areas?\n\n::: {.incremental}\n\n1. How do we calculate the prevalence from the data?\n1. How do we calculate the prevalence separately for urban and rural areas?\n1. 
How do we determine which prevalence is higher and if the difference is\nmeaningful?\n\n:::\n\n## Q1: How do we calculate the prevalence from the data?\n\n::: {.incremental}\n\n* The variable `DP_infection` in our dataset is binary / dichotomous.\n* The prevalence is the number or percent of people who had the disease over\nsome duration.\n* The average of a binary variable gives the prevalence!\n\n:::\n\n. . .\n\n\n::: {.cell}\n\n```{.r .cell-code}\nmean(diph$DP_infection)\n```\n\n::: {.cell-output .cell-output-stdout}\n```\n[1] 0.8\n```\n:::\n:::\n\n\n## Q1: How do we calculate the prevalence separately for urban and rural areas?\n\n. . .\n\n\n::: {.cell}\n\n```{.r .cell-code}\nmean(diph[diph$group == \"urban\", ]$DP_infection)\n```\n\n::: {.cell-output .cell-output-stdout}\n```\n[1] 0.8235294\n```\n:::\n\n```{.r .cell-code}\nmean(diph[diph$group == \"rural\", ]$DP_infection)\n```\n\n::: {.cell-output .cell-output-stdout}\n```\n[1] 0.778626\n```\n:::\n:::\n\n\n. . .\n\n* There are many ways you could write this code! 
You can use `subset()` or you\ncan write the indices many ways.\n* Using `tbl_df` objects from `haven` uses different `[[` rules than a base R\ndata frame.\n\n## Q1: How do we calculate the prevalence separately for urban and rural areas?\n\n* One easy way is to use the `aggregate()` function.\n\n\n::: {.cell}\n\n```{.r .cell-code}\naggregate(DP_infection ~ group, data = diph, FUN = mean)\n```\n\n::: {.cell-output .cell-output-stdout}\n```\n  group DP_infection\n1 rural    0.7786260\n2 urban    0.8235294\n```\n:::\n:::\n\n\n## Q1: How do we determine which prevalence is higher and if the difference is meaningful?\n\n::: {.incremental}\n\n* We probably need to include a confidence interval in our calculation.\n* This is actually not so easy without more advanced tools that we will learn\nin upcoming modules.\n* Right now the best options are to do it by hand or google a function.\n\n:::\n\n## Q1: By hand\n\n\n::: {.cell}\n\n```{.r .cell-code}\np_urban <- mean(diph[diph$group == \"urban\", ]$DP_infection)\np_rural <- mean(diph[diph$group == \"rural\", ]$DP_infection)\nse_urban <- sqrt(p_urban * (1 - p_urban) / nrow(diph[diph$group == \"urban\", ]))\nse_rural <- sqrt(p_rural * (1 - p_rural) / nrow(diph[diph$group == \"rural\", ])) \n\nresult_urban <- paste0(\n\t\"Urban: \", round(p_urban, 2), \"; 95% CI: (\",\n\tround(p_urban - 1.96 * se_urban, 2), \", \",\n\tround(p_urban + 1.96 * se_urban, 2), \")\"\n)\n\nresult_rural <- paste0(\n\t\"Rural: \", round(p_rural, 2), \"; 95% CI: (\",\n\tround(p_rural - 1.96 * se_rural, 2), \", \",\n\tround(p_rural + 1.96 * se_rural, 2), \")\"\n)\n\ncat(result_urban, result_rural, sep = \"\\n\")\n```\n\n::: {.cell-output .cell-output-stdout}\n```\nUrban: 0.82; 95% CI: (0.76, 0.89)\nRural: 0.78; 95% CI: (0.71, 0.85)\n```\n:::\n:::\n\n\n## Q1: By hand\n\n* We can see that the 95% CI's overlap, so the groups are probably not that\ndifferent. **To be sure, we need to do a 2-sample test! 
But this is not a\nstatistics class.**\n* Some people will tell you that coding like this is \"bad\". **But 'bad' code\nthat gives you answers is better than broken code!** We will learn techniques for writing this with less work and less repetition\nin upcoming modules.\n\n## Q1: Googling a package\n\n. . .\n\n\n::: {.cell}\n\n```{.r .cell-code}\n# install.packages(\"DescTools\")\nlibrary(DescTools)\n\naggregate(DP_infection ~ group, data = diph, FUN = DescTools::MeanCI)\n```\n\n::: {.cell-output .cell-output-stdout}\n```\n  group DP_infection.mean DP_infection.lwr.ci DP_infection.upr.ci\n1 rural         0.7786260           0.7065872           0.8506647\n2 urban         0.8235294           0.7540334           0.8930254\n```\n:::\n:::\n\n\n## You try it!\n\n* Using any of the approaches you can think of, answer this question!\n* **How many children under 5 were vaccinated? In children under 5, did\nvaccination lower the prevalence of infection?**\n\n## You try it!\n\n\n::: {.cell}\n\n```{.r .cell-code}\n# How many children under 5 were vaccinated\nsum(diph$DP_vacc[diph$age_months < 60])\n```\n\n::: {.cell-output .cell-output-stdout}\n```\n[1] 91\n```\n:::\n\n```{.r .cell-code}\n# Prevalence in both vaccine groups for children under 5\naggregate(\n\tDP_infection ~ DP_vacc,\n\tdata = subset(diph, age_months < 60),\n\tFUN = DescTools::MeanCI\n)\n```\n\n::: {.cell-output .cell-output-stdout}\n```\n  DP_vacc DP_infection.mean DP_infection.lwr.ci DP_infection.upr.ci\n1       0         0.4285714           0.1977457           0.6593972\n2       1         0.6373626           0.5366845           0.7380407\n```\n:::\n:::\n\n\nIt appears that prevalence was HIGHER in the vaccine group? 
That is\ncounterintuitive, but the sample size for the unvaccinated group is too small\nto be sure.\n\n## Congratulations on finishing the first case study!\n\n* What R functions and skills did you practice?\n* What other questions could you answer about the same dataset with the skills\nyou know now?\n",
+    "supporting": [],
+    "filters": [
+      "rmarkdown/pagebreak.lua"
+    ],
+    "includes": {
+      "include-after-body": [
+        "\n<script>\n  // htmlwidgets need to know to resize themselves when slides are shown/hidden.\n  // Fire the \"slideenter\" event (handled by htmlwidgets.js) when the current\n  // slide changes (different for each slide format).\n  (function () {\n    // dispatch for htmlwidgets\n    function fireSlideEnter() {\n      const event = window.document.createEvent(\"Event\");\n      event.initEvent(\"slideenter\", true, true);\n      window.document.dispatchEvent(event);\n    }\n\n    function fireSlideChanged(previousSlide, currentSlide) {\n      fireSlideEnter();\n\n      // dispatch for shiny\n      if (window.jQuery) {\n        if (previousSlide) {\n          window.jQuery(previousSlide).trigger(\"hidden\");\n        }\n        if (currentSlide) {\n          window.jQuery(currentSlide).trigger(\"shown\");\n        }\n      }\n    }\n\n    // hookup for slidy\n    if (window.w3c_slidy) {\n      window.w3c_slidy.add_observer(function (slide_num) {\n        // slide_num starts at position 1\n        fireSlideChanged(null, w3c_slidy.slides[slide_num - 1]);\n      });\n    }\n\n  })();\n</script>\n\n"
+      ]
+    },
+    "engineDependencies": {},
+    "preserve": {},
+    "postProcess": true
+  }
+}
\ No newline at end of file
diff --git a/_freeze/modules/Module00-Welcome/execute-results/html.json b/_freeze/modules/Module00-Welcome/execute-results/html.json
index 01be4ab..8a8349c 100644
--- a/_freeze/modules/Module00-Welcome/execute-results/html.json
+++ b/_freeze/modules/Module00-Welcome/execute-results/html.json
@@ -1,8 +1,7 @@
 {
-  "hash": "c70ee3c3328bbebb542de6ef3986aef7",
+  "hash": "8bfd8f2bc8586d363e99a6e7f763f712",
   "result": {
-    "engine": "knitr",
-    "markdown": "---\ntitle: \"Welcome to SISMID Workshop: Introduction to R\"\nformat: \n  revealjs:\n    scrollable: true\n    smaller: true\n---\n\n\n\n## Welcome to SISMID Workshop: Introduction to R!\n\n**Amy Winter (she/her)** \nAssistant Professor, Department of Epidemiology and Biostatistics\nEmail: awinter@uga.edu\n\n**Zane Billings (he/him)** \nPhD Candidate, Department of Epidemiology and Biostatistics\nEmail: Wesley.Billings@uga.edu\n\n\n## Introductions\n\n* Name?\n* Current position / institution?\n* Past experience with other statistical programs, including R?\n* Why do you want to learn R?\n* Favorite useful app\n* Favorite guilty pleasure app\n\n\n## What is R?\n\n- R is a language and environment for statistical computing and graphics developed in 1991\n\n- R is the open source implementation of the [S language](https://en.wikipedia.org/wiki/S_(programming_language)), which was developed by [Bell laboratories](https://ca.slack-edge.com/T023TPZA8LF-U024EN26Q0L-113294823b2c-512) in the 70s.\n\n- The aim of the S language, as expressed by John Chambers, is \"to turn ideas into software, quickly and faithfully\"\n\n## What is R?\n\n- **R**oss Ihaka and **R**obert Gentleman at the University of Auckland, New Zealand developed R\n\n\n- R is both [open source](https://en.wikipedia.org/wiki/Open_source) and [open development](https://en.wikipedia.org/wiki/Open-source_software_development)\n\n\n\n::: {.cell layout-align=\"center\"}\n::: {.cell-output-display}\n![](https://www.r-project.org/logo/Rlogo.png){fig-align='center' fig-alt='R logo' width=20%}\n:::\n:::\n\n\n\n## What is R?\n\n* R possesses an extensive catalog of statistical and graphical methods \n    * includes machine learning algorithm, linear regression, time series, statistical inference to name a few. 
\n\n* Data analysis with R is done in a series of steps; programming, transforming, discovering, modeling and communicate the results\n\n\n## What is R?\n\n- Program: R is a clear and accessible programming tool\n- Transform: R is made up of a collection of libraries designed specifically for data science\n- Discover: Investigate the data, refine your hypothesis and analyze them\n- Model: R provides a wide array of tools to capture the right model for your data\n- Communicate: Integrate codes, graphs, and outputs to a report with R Markdown or build Shiny apps to share with the world\n\n\n## Why R?\n\n* Free (open source)\n\n* High level language designed for statistical computing\n\n* Powerful and flexible - especially for data wrangling and visualization\n\n* Extensive add-on software (packages)\n\n* Strong community \n\n\n## Why not R?\n\n    \n* Little centralized support, relies on online community and package developers\n\n* Annoying to update\n\n* Slower, and more memory intensive, than the more traditional programming languages (C, Perl, Python)\n\n\n## Is R DIfficult?\n\n* Short answer – It has a steep learning curve. \n* Years ago, R was a difficult language to master. The language was confusing and not as structured as the other programming tools. \n* Hadley Wickham developed a collection of packages called tidyverse. Data manipulation became trivial and intuitive. Creating a graph was not so difficult anymore.\n\n\n\n## Overall Workshop Objectives\n\nBy the end of this workshop, you should be able to \n\n1. start a new project, read in data, and conduct basic data manipulation, analysis, and visualization\n2. know how to use and find packages/functions that we did not specifically learn in class\n3. troubleshoot errors (xxzane? 
-- not included right now)\n\n\n## This workshop differs from \"Introduction to Tidyverse\"\n\nWe will focus this class on using **Base R** functions and packages, i.e., pre-installed into R and the basis for most other functions and packages! If you know Base R then are will be more equipped to use all the other useful/pretty packages that exit.\n\nthe Tidyverse is one set of useful/pretty packages, designed to can make your code more **intuitive** as compared to the original older Base R. **Tidyverse advantages**:  \n\n-\t**consistent structure** - making it easier to learn how to use different packages\n-\tparticularly good for **wrangling** (manipulating, cleaning, joining) data  \n-\tmore flexible for **visualizing** data  \n\n\n\n::: {.cell layout-align=\"center\"}\n::: {.cell-output-display}\n![](https://tidyverse.tidyverse.org/logo.png){fig-align='center' fig-alt='Tidyverse hex sticker' width=10%}\n:::\n:::\n\n\n\n\n## Workshop Overview\n\n14 lecture blocks that will each:\n- Start with learning objectives\n- End with summary slides\n- Include mini-exercise(s) or a full exercise\n\nThemes that will show up throughout the workshop:\n- Reproducibility\n- Good coding techniques\n- Thinking algorithmically\n- [Basic terms / R jargon](https://link.springer.com/content/pdf/bbm%3A978-1-4419-1318-0%2F1.pdf)\n\n\n## Reproducibility\n\nxxzane slides\n\n## Good coding techniques\n\n\n## Thinking algorithmically \n\n\n## Useful (+ Free) Resources\n\n**Want more?**  \n\n- R for Data Science: http://r4ds.had.co.nz/  \n(great general information)\n\n- Fundamentals of Data Visualization: https://clauswilke.com/dataviz/ \n\n- R for Epidemiology: https://www.r4epi.com/\n\n- The Epidemiologist R Handbook: https://epirhandbook.com/en/\n\n- R basics by Rafael A. 
Irizarry: https://rafalab.github.io/dsbook/r-basics.html\n(great general information)\n \n- Open Case Studies: https://www.opencasestudies.org/  \n(resource for specific public health cases with statistical implementation and interpretation)\n\n## Useful (+Free) Resources\n\n**Need help?** \n\n- Various \"Cheat Sheets\": https://github.com/rstudio/cheatsheets/\n\n- R reference card: http://cran.r-project.org/doc/contrib/Short-refcard.pdf  \n\n- R jargon: https://link.springer.com/content/pdf/bbm%3A978-1-4419-1318-0%2F1.pdf  \n\n- R vs Stata: https://link.springer.com/content/pdf/bbm%3A978-1-4419-1318-0%2F1.pdf  \n\n- R terminology: https://cran.r-project.org/doc/manuals/r-release/R-lang.pdf\n\n\n## Installing R\n\n\nHopefully everyone has pre-installed R and RStudio.  We will take a moment to go around and make sure everyone is ready to go. Please open up your RStudio and leave it open as we check everyone's laptops.\n\n- Install the latest version from: [http://cran.r-project.org/](http://cran.r-project.org/ )\n- [Install RStudio](https://www.rstudio.com/products/rstudio/download/)\n\n\n",
+    "markdown": "---\ntitle: \"Welcome to SISMID Workshop: Introduction to R\"\nformat: \n  revealjs:\n    scrollable: true\n    smaller: true\n    toc: false\n---\n\n\n## Welcome to SISMID Workshop: Introduction to R!\n\n**Amy Winter (she/her)** \n\nAssistant Professor, Department of Epidemiology and Biostatistics\n\nEmail: awinter@uga.edu\n\n<br>\n\n**Zane Billings (he/him)** \n\nPhD Candidate, Department of Epidemiology and Biostatistics\n\nEmail: Wesley.Billings@uga.edu\n\n\n## Introductions\n\n* Name?\n* Current position / institution?\n* Past experience with other statistical programs, including R?\n* Why do you want to learn R?\n* Favorite useful app\n* Favorite guilty pleasure app\n\n\n## What is R?\n\n- R is a language and environment for statistical computing and graphics developed in 1991\n\n- R is the open source implementation of the [S language](https://en.wikipedia.org/wiki/S_(programming_language)), which was developed by [Bell Laboratories](https://ca.slack-edge.com/T023TPZA8LF-U024EN26Q0L-113294823b2c-512) in the 70s.\n\n- The aim of the S language, as expressed by John Chambers, is \"to turn ideas into software, quickly and faithfully\"\n\n## What is R?\n\n- **R**oss Ihaka and **R**obert Gentleman at the University of Auckland, New Zealand developed R\n\n\n- R is both [open source](https://en.wikipedia.org/wiki/Open_source) and [open development](https://en.wikipedia.org/wiki/Open-source_software_development)\n\n\n::: {.cell layout-align=\"center\"}\n::: {.cell-output-display}\n![](https://www.r-project.org/logo/Rlogo.png){fig-align='center' fig-alt='R logo' width=20%}\n:::\n:::\n\n\n## What is R?\n\n* R possesses an extensive catalog of statistical and graphical methods \n    * includes machine learning algorithms, linear regression, time series, and statistical inference, to name a few. 
\n\n* Data analysis with R is done in a series of steps: programming, transforming, discovering, modeling, and communicating the results\n\n\n## What is R?\n\n- Program: R is a clear and accessible programming tool\n- Transform: R is made up of a collection of packages/libraries designed specifically for statistical computing\n- Discover: Investigate the data, refine your hypotheses, and analyze them\n- Model: R provides a wide array of tools to capture the right model for your data\n- Communicate: Integrate code, graphs, and outputs into a report with R Markdown or build Shiny apps to share with the world\n\n\n## Why R?\n\n* Free (open source)\n\n* High level language designed for statistical computing\n\n* Powerful and flexible - especially for data wrangling and visualization\n\n* Extensive add-on software (packages)\n\n* Strong community \n\n\n## Why not R?\n\n    \n* Little centralized support, relies on online community and package developers\n\n* Annoying to update\n\n* Slower, and more memory intensive, than the more traditional programming languages (C, Perl, Python)\n\n\n## Is R Difficult?\n\n* Short answer – It has a steep learning curve, like all programming languages\n* Years ago, R was a difficult language to master. \n* Hadley Wickham developed a collection of packages called tidyverse. Data manipulation became trivial and intuitive. Creating a graph was not so difficult anymore.\n\n\n## Overall Workshop Objectives\n\nBy the end of this workshop, you should be able to \n\n1. start a new project, read in data, and conduct basic data manipulation, analysis, and visualization\n2. know how to use and find packages/functions that we did not specifically learn in class\n3. troubleshoot errors\n\n\n## This workshop differs from \"Introduction to Tidyverse\"\n\nWe will focus this class on using **Base R** functions and packages, i.e., pre-installed into R and the basis for most other functions and packages! 
If you know Base R, then you will be better equipped to use all the other useful/pretty packages that exist.\n\nThe Tidyverse is one set of useful/pretty packages, designed to make your code more **intuitive** compared to the original, older Base R. **Tidyverse advantages**:  \n\n-\t**consistent structure** - making it easier to learn how to use different packages\n-\tparticularly good for **wrangling** (manipulating, cleaning, joining) data  \n-\tmore flexible for **visualizing** data  \n\n\n::: {.cell layout-align=\"center\"}\n::: {.cell-output-display}\n![](https://tidyverse.tidyverse.org/logo.png){fig-align='center' fig-alt='Tidyverse hex sticker' width=10%}\n:::\n:::\n\n\n\n## Workshop Overview\n\n14 lecture blocks that will each:\n\n- Start with learning objectives\n- End with summary slides\n- Include mini-exercise(s) or a full exercise\n\nThemes that will show up throughout the workshop:\n\n- Reproducibility\n- Good coding techniques\n- Thinking algorithmically\n- [Basic terms / R jargon](https://link.springer.com/content/pdf/bbm%3A978-1-4419-1318-0%2F1.pdf)\n\n\n## Reproducibility\n\nxxzane slides\n\n\n## Useful (+ Free) Resources\n\n**Want more?**  \n\n- R for Data Science: http://r4ds.had.co.nz/  \n(great general information)\n\n- Fundamentals of Data Visualization: https://clauswilke.com/dataviz/ \n\n- R for Epidemiology: https://www.r4epi.com/\n\n- The Epidemiologist R Handbook: https://epirhandbook.com/en/\n\n- R basics by Rafael A. 
Irizarry: https://rafalab.github.io/dsbook/r-basics.html\n(great general information)\n \n- Open Case Studies: https://www.opencasestudies.org/  \n(resource for specific public health cases with statistical implementation and interpretation)\n\n## Useful (+Free) Resources\n\n**Need help?** \n\n- Various \"Cheat Sheets\": https://github.com/rstudio/cheatsheets/\n\n- R reference card: http://cran.r-project.org/doc/contrib/Short-refcard.pdf  \n\n- R jargon: https://link.springer.com/content/pdf/bbm%3A978-1-4419-1318-0%2F1.pdf  \n\n- R vs Stata: https://link.springer.com/content/pdf/bbm%3A978-1-4419-1318-0%2F1.pdf  \n\n- R terminology: https://cran.r-project.org/doc/manuals/r-release/R-lang.pdf\n\n\n## Installing R\n\n\nHopefully everyone has pre-installed R and RStudio.  We will take a moment to go around and make sure everyone is ready to go. Please open up your RStudio and leave it open as we check everyone's laptops.\n\n- Install the latest version from: [http://cran.r-project.org/](http://cran.r-project.org/ )\n- [Install RStudio](https://www.rstudio.com/products/rstudio/download/)\n\n\n",
     "supporting": [],
     "filters": [
       "rmarkdown/pagebreak.lua"
diff --git a/_freeze/modules/Module01-Intro/execute-results/html.json b/_freeze/modules/Module01-Intro/execute-results/html.json
index f5f7304..0484237 100644
--- a/_freeze/modules/Module01-Intro/execute-results/html.json
+++ b/_freeze/modules/Module01-Intro/execute-results/html.json
@@ -1,8 +1,7 @@
 {
-  "hash": "f445c448019a47d959fea49d68987f67",
+  "hash": "f7be0bcf0c004397e5a35535a3dd9a72",
   "result": {
-    "engine": "knitr",
-    "markdown": "---\ntitle: \"Module 1: Introduction to RStudio and R Basics\"\nformat: \n  revealjs:\n    scrollable: true\n    smaller: true\n---\n\n\n\n\n## Learning Objectives\n\nAfter module 1, you should be able to...\n\n-   Create and save an R script\n-   Describe the utility and differences b/w the console and an R script\n-   Modify R Studio windows\n-   Create objects\n-   Describe the difference b/w character, numeric, list, and matrix objects\n-   Reference objects in the RStudio Global Environment\n-   Use basic arithmetic operators in R\n-   Use comments within an R script to create header, sections, and make notes\n\n## Working with R -- RStudio\n\nRStudio is an Integrated Development Environment (IDE) for R\n\n-   It helps the user effectively use R\n-   Makes things easier\n-   Is NOT a dropdown statistical tool (such as Stata)\n    -   See [Rcmdr](https://cran.r-project.org/web/packages/Rcmdr/index.html) or [Radiant](http://vnijs.github.io/radiant/)\n\n\n\n\n::: {.cell layout-align=\"center\"}\n::: {.cell-output-display}\n![](https://d33wubrfki0l68.cloudfront.net/62bcc8535a06077094ca3c29c383e37ad7334311/a263f/assets/img/logo.svg){fig-align='center' fig-alt='RStudio logo' width=30%}\n:::\n:::\n\n\n\n\n## RStudio\n\nEasier working with R\n\n-   Syntax highlighting, code completion, and smart indentation\n-   Easily manage multiple working directories and projects\n\nMore information\n\n-   Workspace browser and data viewer\n-   Plot history, zooming, and flexible image and file export\n-   Integrated R help and documentation\n-   Searchable command history\n\n## RStudio\n\n\n\n\n::: {.cell layout-align=\"center\"}\n::: {.cell-output-display}\n![](https://ayeimanolr.files.wordpress.com/2013/04/r-rstudio-1-1.png?w=640&h=382){fig-align='center' fig-alt='RStudio' width=80%}\n:::\n:::\n\n\n\n\n## Getting the editor\n\n\n\n\n::: {.cell}\n::: {.cell-output-display}\n![](images/both.png){width=90%}\n:::\n:::\n\n\n\n\n## Working with R in RStudio - 2 
major panes:\n\n1)  The **Source/Editor**: \"Analysis\" Script + Interactive Exploration\n    -   Static copy of what you did (reproducibility)\n    -   Top by default\n2)  The **R Console**: \"interprets\" whatever you type\n    -   Calculator\n    -   Try things out interactively, then add to your editor\n    -   Bottom by default\n\n## Source / Editor\n\n-   Where files open to\n-   Have R code and comments in them\n-   Where code is saved\n\n\n\n\n::: {.cell}\n::: {.cell-output-display}\n![](images/rstudio_script.png){width=200%}\n:::\n:::\n\n\n\n\n## R Console\n\n-   Where code is executed (where things happen)\n-   You can type here for things interactively\n-   Code is **not saved**\n\n\n\n\n::: {.cell layout-align=\"center\"}\n::: {.cell-output-display}\n![](images/rstudio_console.png){fig-align='center' width=60%}\n:::\n:::\n\n\n\n\n\n## RStudio\n\nUseful RStudio \"cheat sheet\": <https://github.com/rstudio/cheatsheets/blob/main/rstudio-ide.pdf>\n\n\n\n\n::: {.cell layout-align=\"center\"}\n::: {.cell-output-display}\n![](images/rstudio_sheet.png){fig-align='center' fig-alt='RStudio' width=65%}\n:::\n:::\n\n\n\n\n\n## RStudio Layout\n\nIf RStudio doesn't look the way you want (or like our RStudio), then do:\n\nRStudio --\\> View --\\> Panes --\\> Pane Layout\n\n\n\n\n::: {.cell layout-align=\"center\"}\n::: {.cell-output-display}\n![](images/pane_layout.png){fig-align='center' width=500px}\n:::\n:::\n\n\n\n\n## Workspace/Environment\n\n-   Tells you what **objects** are in R\n-   What exists in memory/what is loaded?/what did I read in?\n\n\n\n\n::: {.cell}\n::: {.cell-output-display}\n![](images/rstudio_environment.png){width=90%}\n:::\n:::\n\n\n\n\n## Workspace/History\n\n-   Shows previous commands. 
Good to look at for debugging, but **don't rely** on it.\n-   Also type the \"up\" key in the Console to scroll through previous commands\n\n## Workspace/Other Panes\n\n-   **Files** - shows the files on your computer of the directory you are working in\n-   **Viewer** - can view data or R objects\n-   **Help** - shows help of R commands\n-   **Plots** - pictures and figures\n-   **Packages** - list of R packages that are loaded in memory\n\n## Getting Started\n\n-   File --\\> New File --\\> R Script\n-   Save the blank R script as Module1.R\n\n## Explaining output on slides\n\nIn slides, a command (we'll also call them code or a code chunk) will look like this\n\n\n\n\n::: {.cell}\n\n```{.r .cell-code}\nprint(\"I'm code\")\n```\n\n::: {.cell-output .cell-output-stdout}\n\n```\n[1] \"I'm code\"\n```\n\n\n:::\n:::\n\n\n\n\nAnd then directly after it, will be the output of the code.  \nSo `print(\"I'm code\")` is the code chunk and `[1] \"I'm code\"` is the output.\n\n## R as a calculator\n\nYou can do basic arithmetic in R, which I surprisingly use all the time.\n\n\n\n\n::: {.cell}\n\n```{.r .cell-code}\n2 + 2\n```\n\n::: {.cell-output .cell-output-stdout}\n\n```\n[1] 4\n```\n\n\n:::\n\n```{.r .cell-code}\n2 * 4\n```\n\n::: {.cell-output .cell-output-stdout}\n\n```\n[1] 8\n```\n\n\n:::\n\n```{.r .cell-code}\n2^3\n```\n\n::: {.cell-output .cell-output-stdout}\n\n```\n[1] 8\n```\n\n\n:::\n:::\n\n\n\n\n## R as a calculator\n\n- The R console is a full calculator\n- Try to play around with it:\n    - +, -, /, * are add, subtract, divide and multiply\n    - ^ or ** is power\n    - parentheses -- ( and ) -- work with order of operations \n    - %% finds the remainder\n    \n\n## Execute / Run Code\n\nTo execute or run a line of code, you just put your cursor on line of code and then:\n\n  1. Press Run (which you will find at the top of your window)\n\n  OR\n\n  2. 
Press `Cmd + Return` (iOS) OR `Ctrl + Enter` (Windows).\n\nTo execute or run multiple lines of code, you just need to highlight the code you want to run and then follow option 1 or 2.\n\n## Mini exercise \n\nExecute `5+4` from your .R file, and then find the answer 9 in the Console.\n\n## Commenting in Scripts\n\nThe syntax `#` creates a comment, which means anything to the right of `#` will not be executed / run\n\nCommenting is useful to:\n\n1. Create headers for R Scripts\n2. Create sections within an R Script\n3. Explain what is happening in your code \n\n## Commenting an R Script header\n\nAdd a comment header to Module1.R.  This is the one I typically use, but you may have your own preference.  The goal is that you are consistent so that future you / collaborators can make sense of your code.\n\n\n\n\n::: {.cell}\n\n```{.r .cell-code}\n### Title: Module 1\n### Author: Amy Winter \n### Objective: Mini Exercise - Developing first R Script\n### Date: 15 July 2024\n```\n:::\n\n\n\n\n## Commenting to create sections\n\nYou can also create sections within your code by ending a comment with 4 hash marks. **This is very useful for creating an outline of your R Script.** The \"Outline\" can be found in the top right of the your source window.\n\n\n\n\n::: {.cell}\n\n```{.r .cell-code}\n# Section 1 Header ####\n## Section 2 Sub-header ####\n### Section 3 Sub-sub-header ####\n#### Section 4 Sub-sub-sub-header ####\n```\n:::\n\n::: {.cell}\n::: {.cell-output-display}\n![](images/outline.png){width=90%}\n:::\n:::\n\n\n\n\n\n## Commenting to explain code\n\n\n\n\n::: {.cell}\n\n```{.r .cell-code}\n## this # is still a comment\n### you can use many #'s as you want\n\n# sometimes you have a really long comment,\n#    like explaining what you are doing\n#    for a step in analysis. 
\n# Take it to another line\n```\n:::\n\n\n\n\n## Commenting to explain code\n\nI tend to use:\n\n-   One hash tag with a space to describe what is happening in the following few lines of code\n-   One hastag with no space after a command to list specifics \n\n\n\n\n::: {.cell}\n\n```{.r .cell-code}\n# Practicing my arithmetic\n5+2\n3*5\n9/8\n\n5+2 #5 plus 2 \n```\n:::\n\n\n\n\n## Object - Basic terms\n\n**Object** - an object is something that can be worked with in R - can be lots of different things!\n\n-   a scalar / number\n-   a vector\n-   a matrix of numbers\n-   a list\n-   a plot\n-   a function\n\n... many more\n\n## Objects\n\n- You can create objects from within the R environment and from files on your computer\n- R uses `<-` to assign values to an object name \n- Note: Object names are case-sensitive, i.e. X and x are different\n- Here are examples of creating five different objects:\n\n\n\n::: {.cell}\n\n```{.r .cell-code}\nnumber.object <- 3\ncharacter.object <- \"blue\"\nvector.object1 <- c(2,3,4,5)\nvector.object2 <- c(\"blue\", \"red\", \"yellow\")\nmatrix.object <- matrix(data=vector.object1, nrow=2, ncol=2, byrow=TRUE)\n```\n:::\n\n\n\n\nNote, `c()` and `matrix()` are functions, which we will talk more about in module 2.\n\n\n## Mini Exercise\n\nTry creating one or two of these objects in your R script\n\n\n\n\n::: {.cell}\n\n```{.r .cell-code}\nnumber.object <- 3\ncharacter.object <- \"blue\"\nvector.object1 <- c(2,3,4,5)\nvector.object2 <- c(\"blue\", \"red\", \"yellow\")\nmatrix.object <- matrix(data=vector.object1, nrow=2, ncol=2, byrow=TRUE)\n```\n:::\n\n\n\n\n## Objects \n\nNote, you can find these objects now in the Global Environment.\n\n\n\n\n::: {.cell}\n::: {.cell-output-display}\n![](images/global_env.png){width=90%}\n:::\n:::\n\n\n\n\n\nAlso, you can call them anytime (i.e, see them in the Console) by executing (running) the object.  
For example,\n\n\n\n\n::: {.cell}\n\n```{.r .cell-code}\ncharacter.object\n```\n\n::: {.cell-output .cell-output-stdout}\n\n```\n[1] \"blue\"\n```\n\n\n:::\n:::\n\n::: {.cell}\n\n```{.r .cell-code}\nmatrix.object\n```\n\n::: {.cell-output .cell-output-stdout}\n\n```\n     [,1] [,2]\n[1,]    2    3\n[2,]    4    5\n```\n\n\n:::\n:::\n\n\n\n\n\n## Assignment - Good coding\n\n`=` and `<-` can both be used for assignment, but `<-` is better coding practice, because `==` is a logical operator. We will talk about this more, later.\n\n## Lists\n\nList is a special data class, that can hold vectors, strings, matrices, models, list of other lists.\n\n\n\n\n::: {.cell}\n\n```{.r .cell-code}\nlist.object <- list(number.object, vector.object2, matrix.object)\nlist.object\n```\n\n::: {.cell-output .cell-output-stdout}\n\n```\n[[1]]\n[1] 3\n\n[[2]]\n[1] \"blue\"   \"red\"    \"yellow\"\n\n[[3]]\n     [,1] [,2]\n[1,]    2    3\n[2,]    4    5\n```\n\n\n:::\n:::\n\n\n\n\n\n## Useful R Studio Shortcuts\n\nWill certainly save you time\n\n- `Cmd + Return` (iOS) OR `Ctrl + Enter` (Windows) in your script evaluates current line/selection\n    -   It's like copying and pasting the code into the console for it to run.\n- pressing Up/Down in the Console allows you to navigate command history\n\nSee <http://www.rstudio.com/ide/docs/using/keyboard_shortcuts> for many more\n\n\n## RStudio helps with \"tab completion\"\n\nIf you start typing a object, RStudio will show you options that you can choose without typing out the whole object.\n\n\n\n\n\n::: {.cell}\n::: {.cell-output-display}\n![](images/tab.completion.png){width=90%}\n:::\n:::\n\n\n\n\n\n\n\n## Summary\n\n-   RStudio makes working in R easier\n-   The Editor is for static code like R Scripts\n-   The Console is for testing code that can't be saved\n-   Commenting is your new best friend\n-   In R we create objects that can be viewed in the Environment panel and called anytime\n-   An object is something that can be worked with in 
R\n-   Use `<-` syntax to create objects\n\n\n## Mini Exercise\n\n1. Create a new number object and name it `my.object`\n2. Create a vector of 4 numbers and name it `my.vector` using the `c()` function\n3. Add `my.object` and `my.vector` together use arithmatic operator\n\n## Acknowledgements\n\nThese are the materials I looked through, modified, or extracted to complete this module's lecture.\n\n-   [\"Introduction to R for Public Health Researchers\" Johns Hopkins University](https://jhudatascience.org/intro_to_r/)\n-   Some RStudio snapshots were pulled from <http://ayeimanol-r.net/2013/04/21/289/>\n",
+    "markdown": "---\ntitle: \"Module 1: Introduction to RStudio and R Basics\"\nformat: \n  revealjs:\n    scrollable: true\n    smaller: true\n    toc: false\n---\n\n\n## Learning Objectives\n\nAfter module 1, you should be able to...\n\n-   Create and save an R script\n-   Describe the utility and differences b/w the Console and the Source panes\n-   Modify R Studio panes\n-   Create objects\n-   Describe the difference b/w character, numeric, list, and matrix objects\n-   Reference objects in the RStudio Environment pane\n-   Use basic arithmetic operators in R\n-   Use comments within an R script to create header, sections, and make notes\n\n## Working with R -- RStudio\n\nRStudio is an Integrated Development Environment (IDE) for R\n\n-   It helps the user effectively use R\n-   Makes things easier\n-   Is NOT a dropdown statistical tool (such as Stata)\n    -   See [jamovi](https://www.jamovi.org/) or also [Rcmdr](https://cran.r-project.org/web/packages/Rcmdr/index.html), [Radiant](http://vnijs.github.io/radiant/)\n\n\n::: {.cell layout-align=\"center\"}\n::: {.cell-output-display}\n![](https://d33wubrfki0l68.cloudfront.net/62bcc8535a06077094ca3c29c383e37ad7334311/a263f/assets/img/logo.svg){fig-align='center' fig-alt='RStudio logo' width=30%}\n:::\n:::\n\n\n## RStudio\n\nEasier working with R\n\n-   Syntax highlighting, code completion, and smart indentation\n-   Easily manage multiple working directories and projects\n\nMore information\n\n-   Workspace browser and data viewer\n-   Plot history, zooming, and flexible image and file export\n-   Integrated R help and documentation\n-   Searchable command history\n\n## RStudio\n\n\n::: {.cell layout-align=\"center\"}\n::: {.cell-output-display}\n![](https://ayeimanolr.files.wordpress.com/2013/04/r-rstudio-1-1.png?w=640&h=382){fig-align='center' fig-alt='RStudio' width=80%}\n:::\n:::\n\n\n## Getting the editor\n\n\n::: {.cell}\n::: {.cell-output-display}\n![](images/both.png){width=90%}\n:::\n:::\n\n\n## 
Working with R in RStudio - 2 major panes:\n\n1) The **Source/Editor**:\n\n- \"Analysis\" Script\n- Static copy of what you did (reproducibility)\n- Top by default\n    \n2)  The **R Console**: \"interprets\" whatever you type:\n\n    -   Calculator\n    -   Try things out interactively, then add to your editor\n    -   Bottom by default\n\n## Source / Editor\n\n-   Where files open to\n-   Have R code and comments in them\n-   Where code is saved\n\n\n::: {.cell}\n::: {.cell-output-display}\n![](images/rstudio_script.png){width=200%}\n:::\n:::\n\n\n## R Console\n\n-   Where code is executed (where things happen)\n-   You can type code here to try things out interactively\n-   Code is **not saved**\n\n\n::: {.cell layout-align=\"center\"}\n::: {.cell-output-display}\n![](images/rstudio_console.png){fig-align='center' width=60%}\n:::\n:::\n\n\n\n## RStudio\n\nUseful RStudio \"cheat sheet\": <https://github.com/rstudio/cheatsheets/blob/main/rstudio-ide.pdf>\n\n\n::: {.cell layout-align=\"center\"}\n::: {.cell-output-display}\n![](images/rstudio_sheet.png){fig-align='center' fig-alt='RStudio' width=65%}\n:::\n:::\n\n\n\n## RStudio Layout\n\nIf RStudio doesn't look the way you want (or like our RStudio), then do:\n\nIn R Studio Menu Bar go to View Menu --\\> Panes --\\> Pane Layout\n\n\n::: {.cell layout-align=\"center\"}\n::: {.cell-output-display}\n![](images/pane_layout.png){fig-align='center' width=500px}\n:::\n:::\n\n\n## Workspace/Environment\n\n-   Tells you what **objects** are in R\n-   What exists in memory/what is loaded?/what did I read in?\n\n\n::: {.cell}\n::: {.cell-output-display}\n![](images/rstudio_environment.png){width=90%}\n:::\n:::\n\n\n## Workspace/History\n\n-   Shows previous commands. 
Good to look at for debugging, but **don't rely** on it.\n-   You can also press the \"up\" and \"down\" keys in the Console to scroll through previous commands\n\n## Workspace/Other Panes\n\n-   **Files** - shows the files in the directory you are working in\n-   **Viewer** - can view data or R objects\n-   **Help** - shows help for R commands\n-   **Plots** - pictures and figures\n-   **Packages** - list of installed R packages (a check mark means the package is loaded in memory)\n\n## Getting Started\n\n-   In R Studio Menu Bar go to File Menu --\\> New File --\\> R Script\n-   Save the blank R script as Module1.R\n\n## Explaining output on slides\n\nIn the slides, the R command/code appears in a box, followed directly by the output of the code, which starts with `[1]`\n\n\n::: {.cell}\n\n```{.r .cell-code}\nprint(\"I'm code\")\n```\n\n::: {.cell-output .cell-output-stdout}\n```\n[1] \"I'm code\"\n```\n:::\n:::\n\n\nSo `print(\"I'm code\")` is the command and `[1] \"I'm code\"` is the output.\n\n</br>\n\nCommands/code and output written as inline text will be in blue typewriter font. For example `code`\n\n## R as a calculator\n\nYou can do basic arithmetic in R, which I use surprisingly often.\n\n\n::: {.cell}\n\n```{.r .cell-code}\n2 + 2\n```\n\n::: {.cell-output .cell-output-stdout}\n```\n[1] 4\n```\n:::\n\n```{.r .cell-code}\n2 * 4\n```\n\n::: {.cell-output .cell-output-stdout}\n```\n[1] 8\n```\n:::\n\n```{.r .cell-code}\n2^3\n```\n\n::: {.cell-output .cell-output-stdout}\n```\n[1] 8\n```\n:::\n:::\n\n\n## R as a calculator\n\n- The R console is a full calculator\n- Arithmetic operators:\n    - `+`, `-`, `/`, `*` are add, subtract, divide and multiply\n    - `^` or `**` is power\n    - parentheses -- `(` and `)` -- work with order of operations \n    - `%%` finds the remainder\n    \n\n## Execute / Run Code\n\nTo execute or run a line of code (i.e., command), you just put your cursor on the command and then:\n\n  1. 
Press Run (which you will find at the top of your window)\n\n  OR\n\n  2. Press `Cmd + Return` (macOS) OR `Ctrl + Enter` (Windows).\n\nTo execute or run multiple lines of code, you need to highlight the code you want to run and then follow option 1 or 2.\n\n## Mini exercise \n\nExecute `5+4` from your .R file, and then find the answer 9 in the Console.\n\n## Commenting in Scripts\n\nThe syntax `#` creates a comment, which means anything to the right of `#` will not be executed / run\n\nCommenting is useful to:\n\n1. Create headers for R Scripts\n2. Create sections within an R Script\n3. Explain what is happening in your code \n\n## Commenting an R Script header\n\nAdd a comment header to Module1.R.  This is the one I typically use, but you may have your own preference.  The goal is that you are consistent so that future you / collaborators can make sense of your code.\n\n\n::: {.cell}\n\n```{.r .cell-code}\n### Title: Module 1\n### Author: Amy Winter \n### Objective: Mini Exercise - Developing first R Script\n### Date: 15 July 2024\n```\n:::\n\n\n## Commenting to create sections\n\nYou can also create sections within your code by ending a comment with 4 hash marks. **This is very useful for creating an outline of your R Script.** The \"Outline\" can be found in the top right of your Source pane\n\n\n::: {.cell}\n\n```{.r .cell-code}\n# Section 1 Header ####\n## Section 2 Sub-header ####\n### Section 3 Sub-sub-header ####\n#### Section 4 Sub-sub-sub-header ####\n```\n:::\n\n::: {.cell}\n::: {.cell-output-display}\n![](images/outline.png){width=90%}\n:::\n:::\n\n\n\n## Commenting to explain code\n\n\n::: {.cell}\n\n```{.r .cell-code}\n## this # is still a comment\n### you can use as many #'s as you want\n\n# sometimes you have a really long comment,\n#    like explaining what you are doing\n#    for a step in analysis. 
\n# Take it to another line\n```\n:::\n\n\nI tend to use:\n\n-   One hash mark with a space to describe what is happening in the following few lines of code\n-   One hash mark with no space after a command to list specifics \n\n\n::: {.cell}\n\n```{.r .cell-code}\n# Practicing my arithmetic\n5+2\n3*5\n9/8\n\n5+2 #5 plus 2 \n```\n:::\n\n\n## Object - Basic terms\n\n**Object** - an object is something that can be worked with in R - can be lots of different things!\n\n-   a scalar / number\n-   a vector\n-   a matrix of numbers\n-   a list\n-   a plot\n-   a function\n\n... many more\n\n## Objects\n\n- You can create objects from within the R environment and from files on your computer\n- R uses `<-` to assign values to an object name \n- Note: Object names are case-sensitive, i.e. `X` and `x` are different\n- Here are examples of creating five different objects:\n\n::: {.cell}\n\n```{.r .cell-code}\nnumber.object <- 3\ncharacter.object <- \"blue\"\nvector.object1 <- c(2,3,4,5)\nvector.object2 <- c(\"blue\", \"red\", \"yellow\")\nmatrix.object <- matrix(data=vector.object1, nrow=2, ncol=2, byrow=TRUE)\n```\n:::\n\n\nNote, `c()` and `matrix()` are functions, which we will talk more about in module 2.\n\n\n## Mini Exercise\n\nTry creating one or two of these objects in your R script\n\n\n::: {.cell}\n\n```{.r .cell-code}\nnumber.object <- 3\ncharacter.object <- \"blue\"\nvector.object1 <- c(2,3,4,5)\nvector.object2 <- c(\"blue\", \"red\", \"yellow\")\nmatrix.object <- matrix(data=vector.object1, nrow=2, ncol=2, byrow=TRUE)\n```\n:::\n\n\n## Objects \n\nNote, you can find these objects now in the Global Environment.\n\n\n::: {.cell}\n::: {.cell-output-display}\n![](images/global_env.png){width=90%}\n:::\n:::\n\n\n\nAlso, you can print them anytime (i.e, see them in the Console) by executing (running) the object.  
For example,\n\n\n::: {.cell}\n\n```{.r .cell-code}\ncharacter.object\n```\n\n::: {.cell-output .cell-output-stdout}\n```\n[1] \"blue\"\n```\n:::\n:::\n\n::: {.cell}\n\n```{.r .cell-code}\nmatrix.object\n```\n\n::: {.cell-output .cell-output-stdout}\n```\n     [,1] [,2]\n[1,]    2    3\n[2,]    4    5\n```\n:::\n:::\n\n\n\n## Object names and assignment - Good coding\n\n`=` and `<-` can both be used for assignment, but `<-` is better coding practice, because `=` does not work in every context and `<-` is easier to distinguish from the logical operator `==`. We will talk more about this later.\n\n## Lists\n\nA list is a special data class that can hold vectors, strings, matrices, models, and even other lists.\n\n\n::: {.cell}\n\n```{.r .cell-code}\nlist.object <- list(number.object, vector.object2, matrix.object)\nlist.object\n```\n\n::: {.cell-output .cell-output-stdout}\n```\n[[1]]\n[1] 3\n\n[[2]]\n[1] \"blue\"   \"red\"    \"yellow\"\n\n[[3]]\n     [,1] [,2]\n[1,]    2    3\n[2,]    4    5\n```\n:::\n:::\n\n\n\n## Useful R Studio Shortcuts\n\nWill certainly save you time\n\n- `Cmd + Return` (macOS) OR `Ctrl + Enter` (Windows) in your script evaluates the current line/selection\n    -   It's like copying and pasting the code into the console for it to run.\n- pressing Up/Down in the Console allows you to navigate command history\n\nSee <http://www.rstudio.com/ide/docs/using/keyboard_shortcuts> for many more\n\n\n## RStudio helps with \"tab completion\"\n\nIf you start typing an object name, RStudio will show you options that you can choose without typing out the whole name.\n\n\n\n::: {.cell}\n::: {.cell-output-display}\n![](images/tab.completion.png){width=90%}\n:::\n:::\n\n\n\n\n\n## Summary\n\n-   RStudio makes working in R easier\n-   The Editor is for static code like R Scripts\n-   The Console is for testing code interactively; code typed there is not saved\n-   Commenting is your new best friend\n-   In R we create objects that can be viewed in the Environment pane and used anytime\n-   An 
object is something that can be worked with in R\n-   Use `<-` syntax to create objects\n\n\n## Mini Exercise\n\n1. Create a new number object and name it `my.object`\n2. Create a vector of 4 numbers and name it `my.vector` using the `c()` function\n3. Add `my.object` and `my.vector` together using an arithmetic operator\n\n## Acknowledgements\n\nThese are the materials we looked through, modified, or extracted to complete this module's lecture.\n\n-   [\"Introduction to R for Public Health Researchers\" Johns Hopkins University](https://jhudatascience.org/intro_to_r/)\n-   Some RStudio snapshots were pulled from <http://ayeimanol-r.net/2013/04/21/289/>\n",
     "supporting": [],
     "filters": [
       "rmarkdown/pagebreak.lua"
diff --git a/_freeze/modules/Module02-Functions/execute-results/html.json b/_freeze/modules/Module02-Functions/execute-results/html.json
index fe5023e..f66418b 100644
--- a/_freeze/modules/Module02-Functions/execute-results/html.json
+++ b/_freeze/modules/Module02-Functions/execute-results/html.json
@@ -1,8 +1,7 @@
 {
-  "hash": "0531e7ec69b41ee43083c73f617056fc",
+  "hash": "147c719ed518b56df6eee7d8e94ffde0",
   "result": {
-    "engine": "knitr",
-    "markdown": "---\ntitle: \"Module 2: Functions\"\nformat: \n  revealjs:\n    scrollable: true\n    smaller: true\n---\n\n\n\n## Learning Objectives\n\nAfter module 2, you should be able to...\n\n-   Describe and execute functions in R\n-   Modify default behavior of functions using arguments in R\n-   Use R-specific sources of help to get more information about functions and packages \n-   Differentiate between Base R functions and functions that come from other packages\n\n\n## Function - Basic term\n\n**Function** - Functions are \"self contained\" modules of code that accomplish specific tasks. Functions usually take in some sort of object (e.g., vector, list), process it, and return a result. You can write your own, use functions that come directly from installing R (i.e., Base R functions), or use functions from external packages.\n\nA function might help you add numbers together, create a plot, or organize your data. In fact, we have already used three functions in the Module 1, including `c()`, `matrix()`, `list()`. Here is another one, `sum()`\n\n\n\n::: {.cell}\n\n```{.r .cell-code}\nsum(1, 20234)\n```\n\n::: {.cell-output .cell-output-stdout}\n\n```\n[1] 20235\n```\n\n\n:::\n:::\n\n\n\n\n## Function\n\nThe general usage for a function is the name of the function followed by parentheses. Within the parentheses are **arguments**.\n\n\n\n::: {.cell}\n\n```{.r .cell-code}\nfunction_name(argument1, argument2, ...)\n```\n:::\n\n\n\n\n## Arguments - Basic term\n\n**Arguments** are what you pass to the function and can include:\n\n1.  the physical object on which the function carries out a task (e.g., can be data such as a number 1 or 20234)\n\n\n\n::: {.cell}\n\n```{.r .cell-code}\nsum(1, 20234)\n```\n\n::: {.cell-output .cell-output-stdout}\n\n```\n[1] 20235\n```\n\n\n:::\n:::\n\n\n\n2.  
options that alter the way the function operates (e.g., such as the `base` argument in the function `log()`)\n\n\n\n::: {.cell}\n\n```{.r .cell-code}\nlog(10, base = 10)\n```\n\n::: {.cell-output .cell-output-stdout}\n\n```\n[1] 1\n```\n\n\n:::\n\n```{.r .cell-code}\nlog(10, base = 2)\n```\n\n::: {.cell-output .cell-output-stdout}\n\n```\n[1] 3.321928\n```\n\n\n:::\n\n```{.r .cell-code}\nlog(10, base=exp(1))\n```\n\n::: {.cell-output .cell-output-stdout}\n\n```\n[1] 2.302585\n```\n\n\n:::\n:::\n\n\n\n## Arguments\n\nMost functions are created with **default argument options**. The defaults represent standard values that the author of the function specified as being \"good enough in standard cases\". This means if you don't specify an argument when calling the function, it will use a default.\n\n-   If you want something specific, simply change the argument yourself with a value of your choice.\n-   If an argument is required but you did not specify it and there is no default argument specified when the function was created, you will receive an error.\n\n## Example\n\nWhat is the default in the `base` argument of the `log()` function?\n\n\n\n::: {.cell}\n\n```{.r .cell-code}\nlog(10)\n```\n\n::: {.cell-output .cell-output-stdout}\n\n```\n[1] 2.302585\n```\n\n\n:::\n:::\n\n\n\n## Sure that is easy enough, but how do you know\n\n- the purpose of a function? \n- what arguments a function includes? \n- how to specify the arguments?\n\n## Seeking help for using functions\n\nThe best way of finding out this information is to use the `?` followed by the name of the function. Doing this will open up the help manual in the bottom RStudio Help panel. It provides a description of the function, usage, arguments, details, and examples. 
Lets look at the help file for the function `round()`\n\n\n\n::: {.cell}\n\n```{.r .cell-code}\n?log\n```\n:::\n\n\n```\nRegistered S3 method overwritten by 'printr':\n  method                from     \n  knit_print.data.frame rmarkdown\n```\n\nLogarithms and Exponentials\n\nDescription:\n\n     'log' computes logarithms, by default natural logarithms, 'log10'\n     computes common (i.e., base 10) logarithms, and 'log2' computes\n     binary (i.e., base 2) logarithms.  The general form 'log(x, base)'\n     computes logarithms with base 'base'.\n\n     'log1p(x)' computes log(1+x) accurately also for |x| << 1.\n\n     'exp' computes the exponential function.\n\n     'expm1(x)' computes exp(x) - 1 accurately also for |x| << 1.\n\nUsage:\n\n     log(x, base = exp(1))\n     logb(x, base = exp(1))\n     log10(x)\n     log2(x)\n     \n     log1p(x)\n     \n     exp(x)\n     expm1(x)\n     \nArguments:\n\n       x: a numeric or complex vector.\n\n    base: a positive or complex number: the base with respect to which\n          logarithms are computed.  Defaults to e='exp(1)'.\n\nDetails:\n\n     All except 'logb' are generic functions: methods can be defined\n     for them individually or via the 'Math' group generic.\n\n     'log10' and 'log2' are only convenience wrappers, but logs to\n     bases 10 and 2 (whether computed _via_ 'log' or the wrappers) will\n     be computed more efficiently and accurately where supported by the\n     OS.  Methods can be set for them individually (and otherwise\n     methods for 'log' will be used).\n\n     'logb' is a wrapper for 'log' for compatibility with S.  If (S3 or\n     S4) methods are set for 'log' they will be dispatched.  Do not set\n     S4 methods on 'logb' itself.\n\n     All except 'log' are primitive functions.\n\nValue:\n\n     A vector of the same length as 'x' containing the transformed\n     values.  'log(0)' gives '-Inf', and 'log(x)' for negative values\n     of 'x' is 'NaN'.  
'exp(-Inf)' is '0'.\n\n     For complex inputs to the log functions, the value is a complex\n     number with imaginary part in the range [-pi, pi]: which end of\n     the range is used might be platform-specific.\n\nS4 methods:\n\n     'exp', 'expm1', 'log', 'log10', 'log2' and 'log1p' are S4 generic\n     and are members of the 'Math' group generic.\n\n     Note that this means that the S4 generic for 'log' has a signature\n     with only one argument, 'x', but that 'base' can be passed to\n     methods (but will not be used for method selection).  On the other\n     hand, if you only set a method for the 'Math' group generic then\n     'base' argument of 'log' will be ignored for your class.\n\nSource:\n\n     'log1p' and 'expm1' may be taken from the operating system, but if\n     not available there then they are based on the Fortran subroutine\n     'dlnrel' by W. Fullerton of Los Alamos Scientific Laboratory (see\n     <https://netlib.org/slatec/fnlib/dlnrel.f>) and (for small x) a\n     single Newton step for the solution of 'log1p(y) = x'\n     respectively.\n\nReferences:\n\n     Becker, R. A., Chambers, J. M. and Wilks, A. R. (1988) _The New S\n     Language_.  Wadsworth & Brooks/Cole.  (for 'log', 'log10' and\n     'exp'.)\n\n     Chambers, J. M. (1998) _Programming with Data.  A Guide to the S\n     Language_.  Springer. (for 'logb'.)\n\nSee Also:\n\n     'Trig', 'sqrt', 'Arithmetic'.\n\nExamples:\n\n     log(exp(3))\n     log10(1e7) # = 7\n     \n     x <- 10^-(1+2*1:9)\n     cbind(deparse.level=2, # to get nice column names\n           x, log(1+x), log1p(x), exp(x)-1, expm1(x))\n\n\n\n## How to specify arguments\n\n1.  Arguments are separated with a comma\n2.  
You can specify arguments by either including them in the correct order OR by assigning the argument within the function parentheses.\n\n\n\n::: {.cell}\n::: {.cell-output-display}\n![](images/log_args.png){width=70%}\n:::\n:::\n\n::: {.cell}\n\n```{.r .cell-code}\nlog(10, 2)\n```\n\n::: {.cell-output .cell-output-stdout}\n\n```\n[1] 3.321928\n```\n\n\n:::\n\n```{.r .cell-code}\nlog(base=2, x=10)\n```\n\n::: {.cell-output .cell-output-stdout}\n\n```\n[1] 3.321928\n```\n\n\n:::\n\n```{.r .cell-code}\nlog(x=10, 2)\n```\n\n::: {.cell-output .cell-output-stdout}\n\n```\n[1] 3.321928\n```\n\n\n:::\n\n```{.r .cell-code}\nlog(10, base=2)\n```\n\n::: {.cell-output .cell-output-stdout}\n\n```\n[1] 3.321928\n```\n\n\n:::\n:::\n\n\n\n## Package - Basic term\n\nWhen you download R, it has a \"base\" set of functions, that are associated with a \"base\" set of packages including: 'base', 'datasets', 'graphics', 'grDevices', 'methods', 'stats', 'methods' (typically just referred to as **Base R**).\n\n-   e.g., the `log()` function comes from the 'base' package\n\n**Package** - a package in R is a bundle or \"package\" of code (and or possibly data) that can be loaded together for easy repeated use or for **sharing** with others.\n\nPackages are analogous to software applications like Microsoft Word. After installation, your operating system allows you to use it, just like having Word installed allows you to use it.\n\n## Packages\n\nThe Packages window in RStudio can help you identify what have been installed (listed), and which one have been called (check mark).\n\nLets go look at the Packages window, find the `base` package and find the `log()` function. It automatically loads the help file that we looked at earlier using `?log`.\n\n\n## Additional Packages\n\nYou can install additional packages for your uses from [CRAN](https://cran.r-project.org/) or [GitHub](https://github.com/). 
These additional packages are written by RStudio or R users/developers (like us)\n\n-   Not all packages available on CRAN or GitHub are trustworthy\n-   RStudio (the company) makes a lot of great packages\n-   Who wrote it? **Hadley Wickham** is a major authority on R (Employee and Developer at RStudio)\n-   How to [trust](https://simplystatistics.org/posts/2015-11-06-how-i-decide-when-to-trust-an-r-package/#:~:text=The%20first%20thing%20I%20do,I%20immediately%20trust%20the%20package.) an R package\n\n## **Installing** and calling packages\n\nTo use the bundle or \"package\" of code (and or possibly data) from a package, you need to install and also call the package.\n\nTo install a package you can \n\n1. go to Tools ---\\> Install Packages in the RStudio header\n\nOR\n\n2. use the following code:\n\n\n::: {.cell}\n\n```{.r .cell-code}\ninstall.packages(package_name)\n```\n:::\n\n\n\n\n## Installing and **calling** packages\n\nTo call (i.e., be able to use the package) you can use the following code:\n\n\n\n::: {.cell}\n\n```{.r .cell-code}\nlibrary(package_name)\n```\n:::\n\n\n\nMore on installing and calling packages later...\n\n\n## Mini Exercise\n\nFind and execute a **Base R** function that will round the number 0.86424 to two digits.\n\n\n## Functions from Module 1\n\nThe combine function `c()` collects/combines/joins single R objects into a vector of R objects. It is mostly used for creating vectors of numbers, character strings, and other data types. \n\n\n\n::: {.cell}\n\n```{.r .cell-code}\n?c\n```\n:::\n\n::: {.cell}\n::: {.cell-output .cell-output-stdout}\n\n```\nCombine Values into a Vector or List\n\nDescription:\n\n     This is a generic function which combines its arguments.\n\n     The default method combines its arguments to form a vector.  
All\n     arguments are coerced to a common type which is the type of the\n     returned value, and all attributes except names are removed.\n\nUsage:\n\n     ## S3 Generic function\n     c(...)\n     \n     ## Default S3 method:\n     c(..., recursive = FALSE, use.names = TRUE)\n     \nArguments:\n\n     ...: objects to be concatenated.  All 'NULL' entries are dropped\n          before method dispatch unless at the very beginning of the\n          argument list.\n\nrecursive: logical.  If 'recursive = TRUE', the function recursively\n          descends through lists (and pairlists) combining all their\n          elements into a vector.\n\nuse.names: logical indicating if 'names' should be preserved.\n\nDetails:\n\n     The output type is determined from the highest type of the\n     components in the hierarchy NULL < raw < logical < integer <\n     double < complex < character < list < expression.  Pairlists are\n     treated as lists, whereas non-vector components (such as 'name's /\n     'symbol's and 'call's) are treated as one-element 'list's which\n     cannot be unlisted even if 'recursive = TRUE'.\n\n     There is a 'c.factor' method which combines factors into a factor.\n\n     'c' is sometimes used for its side effect of removing attributes\n     except names, for example to turn an 'array' into a vector.\n     'as.vector' is a more intuitive way to do this, but also drops\n     names.  Note that methods other than the default are not required\n     to do this (and they will almost certainly preserve a class\n     attribute).\n\n     This is a primitive function.\n\nValue:\n\n     'NULL' or an expression or a vector of an appropriate mode.  (With\n     no arguments the value is 'NULL'.)\n\nS4 methods:\n\n     This function is S4 generic, but with argument list '(x, ...)'.\n\nReferences:\n\n     Becker, R. A., Chambers, J. M. and Wilks, A. R. (1988) _The New S\n     Language_.  
Wadsworth & Brooks/Cole.\n\nSee Also:\n\n     'unlist' and 'as.vector' to produce attribute-free vectors.\n\nExamples:\n\n     c(1,7:9)\n     c(1:5, 10.5, \"next\")\n     \n     ## uses with a single argument to drop attributes\n     x <- 1:4\n     names(x) <- letters[1:4]\n     x\n     c(x)          # has names\n     as.vector(x)  # no names\n     dim(x) <- c(2,2)\n     x\n     c(x)\n     as.vector(x)\n     \n     ## append to a list:\n     ll <- list(A = 1, c = \"C\")\n     ## do *not* use\n     c(ll, d = 1:3) # which is == c(ll, as.list(c(d = 1:3)))\n     ## but rather\n     c(ll, d = list(1:3))  # c() combining two lists\n     \n     c(list(A = c(B = 1)), recursive = TRUE)\n     \n     c(options(), recursive = TRUE)\n     c(list(A = c(B = 1, C = 2), B = c(E = 7)), recursive = TRUE)\n```\n\n\n:::\n:::\n\n\n\n## Functions from Module 1\n\nThe `matrix()` function creates a matrix from the given set of values.\n\n\n\n::: {.cell}\n\n```{.r .cell-code}\n?matrix\n```\n:::\n\n\n\nxxamy - doesn't seem to work - may need to paste in a screen shot figure\n\n\n::: {.cell}\n::: {.cell-output .cell-output-stdout}\n\n```\nNo documentation for 'matix' in specified packages and libraries\n```\n\n\n:::\n:::\n\n\n\n\n## Summary\n\n- Functions are \"self contained\" modules of code that accomplish specific tasks.\n- Arguments are what you pass to functions (e.g., objects on which you carry out the task or options for how to carry out the task)\n- Arguments may include defaults that the author of the function specified as being \"good enough in standard cases\", but that can be changed.\n- An R Package is a bundle or \"package\" of code (and or possibly data) that can be used by installing it once and calling it (using `library()`) each time R/Rstudio is opened\n- The Help window in RStudio is useful for to get more information about functions and packages \n\n\n## Acknowledgements\n\nThese are the materials I looked through, modified, or extracted to complete this module's 
lecture.\n\n- [\"Introduction to R - ARCHIVED\" from  Harvard Chan Bioinformatics Core (HBC)](https://hbctraining.github.io/Intro-to-R/lessons/03_introR-functions-and-arguments.html#:\\~:text=A%20key%20feature%20of%20R,it%2C%20and%20return%20a%20result.)\n\n\n",
+    "markdown": "---\ntitle: \"Module 2: Functions\"\nformat: \n  revealjs:\n    scrollable: true\n    smaller: true\n    toc: false\n---\n\n\n## Learning Objectives\n\nAfter module 2, you should be able to...\n\n-   Describe and execute functions in R\n-   Modify default behavior of functions using arguments in R\n-   Use R-specific sources of help to get more information about functions and packages \n-   Differentiate between Base R functions and functions that come from other packages\n\n\n## Function - Basic term\n\n**Function** - Functions are \"self contained\" modules of code that **accomplish specific tasks**. Functions usually take in some sort of object (e.g., vector, list), process it, and return a result. You can write your own, use functions that come directly from installing R (i.e., Base R functions), or use functions from external packages.\n\nA function might help you add numbers together, create a plot, or organize your data. In fact, we have already used three functions in Module 1: `c()`, `matrix()`, and `list()`. Here is another one, `sum()`\n\n\n::: {.cell}\n\n```{.r .cell-code}\nsum(1, 20234)\n```\n\n::: {.cell-output .cell-output-stdout}\n```\n[1] 20235\n```\n:::\n:::\n\n\n\n## Function\n\nThe general usage for a function is the name of the function followed by parentheses (i.e., the function signature). Within the parentheses are **arguments**.\n\n\n::: {.cell}\n\n```{.r .cell-code}\nfunction_name(argument1, argument2, ...)\n```\n:::\n\n\n\n## Arguments - Basic term\n\n**Arguments** are what you pass to the function and can include:\n\n1.  the physical object on which the function carries out a task (e.g., data such as the numbers 1 and 20234)\n\n\n::: {.cell}\n\n```{.r .cell-code}\nsum(1, 20234)\n```\n\n::: {.cell-output .cell-output-stdout}\n```\n[1] 20235\n```\n:::\n:::\n\n\n2.  
options that alter the way the function operates (e.g., the `base` argument in the function `log()`)\n\n\n::: {.cell}\n\n```{.r .cell-code}\nlog(10, base = 10)\n```\n\n::: {.cell-output .cell-output-stdout}\n```\n[1] 1\n```\n:::\n\n```{.r .cell-code}\nlog(10, base = 2)\n```\n\n::: {.cell-output .cell-output-stdout}\n```\n[1] 3.321928\n```\n:::\n\n```{.r .cell-code}\nlog(10, base=exp(1))\n```\n\n::: {.cell-output .cell-output-stdout}\n```\n[1] 2.302585\n```\n:::\n:::\n\n\n## Arguments\n\nMost functions are created with **default argument options**. The defaults represent standard values that the author of the function specified as being \"good enough in standard cases\". This means if you don't specify an argument when calling the function, it will use a default.\n\n-   If you want something specific, simply change the argument yourself with a value of your choice.\n-   If an argument is required, you did not specify it, and the function defines no default for it, you will receive an error.\n\n## Example\n\nWhat is the default for the `base` argument of the `log()` function?\n\n\n::: {.cell}\n\n```{.r .cell-code}\nlog(10)\n```\n\n::: {.cell-output .cell-output-stdout}\n```\n[1] 2.302585\n```\n:::\n:::\n\n\n## Sure that is easy enough, but how do you know\n\n- the purpose of a function? \n- what arguments a function includes? \n- how to specify the arguments?\n\n## Seeking help for using functions\n\nThe best way of finding out this information is to use `?` followed by the name of the function. Doing this will open up the help manual in the RStudio Help pane. It provides a description of the function, usage, arguments, details, and examples. 
Let's look at the help file for the function `log()`\n\n\n::: {.cell}\n\n```{.r .cell-code}\n?log\n```\n:::\n\n\n```\nRegistered S3 method overwritten by 'printr':\n  method                from     \n  knit_print.data.frame rmarkdown\n```\n\nLogarithms and Exponentials\n\nDescription:\n\n     'log' computes logarithms, by default natural logarithms, 'log10'\n     computes common (i.e., base 10) logarithms, and 'log2' computes\n     binary (i.e., base 2) logarithms.  The general form 'log(x, base)'\n     computes logarithms with base 'base'.\n\n     'log1p(x)' computes log(1+x) accurately also for |x| << 1.\n\n     'exp' computes the exponential function.\n\n     'expm1(x)' computes exp(x) - 1 accurately also for |x| << 1.\n\nUsage:\n\n     log(x, base = exp(1))\n     logb(x, base = exp(1))\n     log10(x)\n     log2(x)\n     \n     log1p(x)\n     \n     exp(x)\n     expm1(x)\n     \nArguments:\n\n       x: a numeric or complex vector.\n\n    base: a positive or complex number: the base with respect to which\n          logarithms are computed.  Defaults to e='exp(1)'.\n\nDetails:\n\n     All except 'logb' are generic functions: methods can be defined\n     for them individually or via the 'Math' group generic.\n\n     'log10' and 'log2' are only convenience wrappers, but logs to\n     bases 10 and 2 (whether computed _via_ 'log' or the wrappers) will\n     be computed more efficiently and accurately where supported by the\n     OS.  Methods can be set for them individually (and otherwise\n     methods for 'log' will be used).\n\n     'logb' is a wrapper for 'log' for compatibility with S.  If (S3 or\n     S4) methods are set for 'log' they will be dispatched.  Do not set\n     S4 methods on 'logb' itself.\n\n     All except 'log' are primitive functions.\n\nValue:\n\n     A vector of the same length as 'x' containing the transformed\n     values.  'log(0)' gives '-Inf', and 'log(x)' for negative values\n     of 'x' is 'NaN'.  
'exp(-Inf)' is '0'.\n\n     For complex inputs to the log functions, the value is a complex\n     number with imaginary part in the range [-pi, pi]: which end of\n     the range is used might be platform-specific.\n\nS4 methods:\n\n     'exp', 'expm1', 'log', 'log10', 'log2' and 'log1p' are S4 generic\n     and are members of the 'Math' group generic.\n\n     Note that this means that the S4 generic for 'log' has a signature\n     with only one argument, 'x', but that 'base' can be passed to\n     methods (but will not be used for method selection).  On the other\n     hand, if you only set a method for the 'Math' group generic then\n     'base' argument of 'log' will be ignored for your class.\n\nSource:\n\n     'log1p' and 'expm1' may be taken from the operating system, but if\n     not available there then they are based on the Fortran subroutine\n     'dlnrel' by W. Fullerton of Los Alamos Scientific Laboratory (see\n     <https://netlib.org/slatec/fnlib/dlnrel.f>) and (for small x) a\n     single Newton step for the solution of 'log1p(y) = x'\n     respectively.\n\nReferences:\n\n     Becker, R. A., Chambers, J. M. and Wilks, A. R. (1988) _The New S\n     Language_.  Wadsworth & Brooks/Cole.  (for 'log', 'log10' and\n     'exp'.)\n\n     Chambers, J. M. (1998) _Programming with Data.  A Guide to the S\n     Language_.  Springer. (for 'logb'.)\n\nSee Also:\n\n     'Trig', 'sqrt', 'Arithmetic'.\n\nExamples:\n\n     log(exp(3))\n     log10(1e7) # = 7\n     \n     x <- 10^-(1+2*1:9)\n     cbind(deparse.level=2, # to get nice column names\n           x, log(1+x), log1p(x), exp(x)-1, expm1(x))\n\n\n## How to specify arguments\n\n1.  Arguments are separated with a comma\n2.  
You can specify arguments either by including them in the correct order OR by assigning the argument by name within the function parentheses.\n\n\n::: {.cell}\n::: {.cell-output-display}\n![](images/log_args.png){width=70%}\n:::\n:::\n\n::: {.cell}\n\n```{.r .cell-code}\nlog(10, 2)\n```\n\n::: {.cell-output .cell-output-stdout}\n```\n[1] 3.321928\n```\n:::\n\n```{.r .cell-code}\nlog(base=2, x=10)\n```\n\n::: {.cell-output .cell-output-stdout}\n```\n[1] 3.321928\n```\n:::\n\n```{.r .cell-code}\nlog(x=10, 2)\n```\n\n::: {.cell-output .cell-output-stdout}\n```\n[1] 3.321928\n```\n:::\n\n```{.r .cell-code}\nlog(10, base=2)\n```\n\n::: {.cell-output .cell-output-stdout}\n```\n[1] 3.321928\n```\n:::\n:::\n\n\n## Package - Basic term\n\nWhen you download R, it has a \"base\" set of functions that are associated with a \"base\" set of packages including: 'base', 'datasets', 'graphics', 'grDevices', 'methods', 'stats' (typically just referred to as **Base R**).\n\n-   e.g., the `log()` function comes from the 'base' package\n\n**Package** - a package in R is a bundle or \"package\" of code (and possibly data) that can be loaded together for easy repeated use or for **sharing** with others.\n\nPackages are analogous to software applications like Microsoft Word. Once a package is installed, you can use it, just as you can use Word once it is installed.\n\n## Packages\n\nThe Packages pane in RStudio can help you identify which packages have been installed (listed) and which ones have been attached (check mark).\n\nLet's go look at the Packages window, find the `base` package and find the `log()` function. It automatically loads the help file that we looked at earlier using `?log`.\n\n\n## Additional Packages\n\nYou can install additional packages for your use from [CRAN](https://cran.r-project.org/) or [GitHub](https://github.com/). 
These additional packages are written by RStudio or R users/developers (like us)\n\n-   Not all packages available on CRAN or GitHub are trustworthy\n-   RStudio (the company) makes a lot of great packages\n-   Who wrote it? **Hadley Wickham** is a major authority on R (Employee and Developer at RStudio)\n-   How to [trust](https://simplystatistics.org/posts/2015-11-06-how-i-decide-when-to-trust-an-r-package/#:~:text=The%20first%20thing%20I%20do,I%20immediately%20trust%20the%20package.) an R package\n\n## **Installing** and attaching packages\n\nTo use the bundle or \"package\" of code (and possibly data) from a package, you need to install and also attach the package.\n\nTo install a package you can \n\n1. go to the Tools menu in the RStudio menu bar ---\\> Install Packages\n\nOR\n\n2. use the following code:\n\n::: {.cell}\n\n```{.r .cell-code}\ninstall.packages(\"package_name\")\n```\n:::\n\n\n\n## Installing and **attaching** packages\n\nTo attach (i.e., be able to use the package) you can use the following code:\n\n\n::: {.cell}\n\n```{.r .cell-code}\nrequire(package_name) #library(package_name) also works\n```\n:::\n\n\nMore on installing and attaching packages later...\n\n\n## Mini Exercise\n\nFind and execute a **Base R** function that will round the number 0.86424 to two digits.\n\n\n## Functions from Module 1\n\nThe combine function `c()` concatenates/collects/combines single R objects into a vector of R objects. It is mostly used for creating vectors of numbers, character strings, and other data types. \n\n\n::: {.cell}\n\n```{.r .cell-code}\n?c\n```\n:::\n\n::: {.cell}\n::: {.cell-output .cell-output-stdout}\n```\nCombine Values into a Vector or List\n\nDescription:\n\n     This is a generic function which combines its arguments.\n\n     The default method combines its arguments to form a vector.  
All\n     arguments are coerced to a common type which is the type of the\n     returned value, and all attributes except names are removed.\n\nUsage:\n\n     ## S3 Generic function\n     c(...)\n     \n     ## Default S3 method:\n     c(..., recursive = FALSE, use.names = TRUE)\n     \nArguments:\n\n     ...: objects to be concatenated.  All 'NULL' entries are dropped\n          before method dispatch unless at the very beginning of the\n          argument list.\n\nrecursive: logical.  If 'recursive = TRUE', the function recursively\n          descends through lists (and pairlists) combining all their\n          elements into a vector.\n\nuse.names: logical indicating if 'names' should be preserved.\n\nDetails:\n\n     The output type is determined from the highest type of the\n     components in the hierarchy NULL < raw < logical < integer <\n     double < complex < character < list < expression.  Pairlists are\n     treated as lists, whereas non-vector components (such as 'name's /\n     'symbol's and 'call's) are treated as one-element 'list's which\n     cannot be unlisted even if 'recursive = TRUE'.\n\n     There is a 'c.factor' method which combines factors into a factor.\n\n     'c' is sometimes used for its side effect of removing attributes\n     except names, for example to turn an 'array' into a vector.\n     'as.vector' is a more intuitive way to do this, but also drops\n     names.  Note that methods other than the default are not required\n     to do this (and they will almost certainly preserve a class\n     attribute).\n\n     This is a primitive function.\n\nValue:\n\n     'NULL' or an expression or a vector of an appropriate mode.  (With\n     no arguments the value is 'NULL'.)\n\nS4 methods:\n\n     This function is S4 generic, but with argument list '(x, ...)'.\n\nReferences:\n\n     Becker, R. A., Chambers, J. M. and Wilks, A. R. (1988) _The New S\n     Language_.  
Wadsworth & Brooks/Cole.\n\nSee Also:\n\n     'unlist' and 'as.vector' to produce attribute-free vectors.\n\nExamples:\n\n     c(1,7:9)\n     c(1:5, 10.5, \"next\")\n     \n     ## uses with a single argument to drop attributes\n     x <- 1:4\n     names(x) <- letters[1:4]\n     x\n     c(x)          # has names\n     as.vector(x)  # no names\n     dim(x) <- c(2,2)\n     x\n     c(x)\n     as.vector(x)\n     \n     ## append to a list:\n     ll <- list(A = 1, c = \"C\")\n     ## do *not* use\n     c(ll, d = 1:3) # which is == c(ll, as.list(c(d = 1:3)))\n     ## but rather\n     c(ll, d = list(1:3))  # c() combining two lists\n     \n     c(list(A = c(B = 1)), recursive = TRUE)\n     \n     c(options(), recursive = TRUE)\n     c(list(A = c(B = 1, C = 2), B = c(E = 7)), recursive = TRUE)\n```\n:::\n:::\n\n\n## Functions from Module 1\n\nThe `matrix()` function creates a matrix from the given set of values.\n\n\n::: {.cell}\n\n```{.r .cell-code}\n?matrix\n```\n:::\n\n::: {.cell}\n::: {.cell-output .cell-output-stdout}\n```\nMatrices\n\nDescription:\n\n     'matrix' creates a matrix from the given set of values.\n\n     'as.matrix' attempts to turn its argument into a matrix.\n\n     'is.matrix' tests if its argument is a (strict) matrix.\n\nUsage:\n\n     matrix(data = NA, nrow = 1, ncol = 1, byrow = FALSE,\n            dimnames = NULL)\n     \n     as.matrix(x, ...)\n     ## S3 method for class 'data.frame'\n     as.matrix(x, rownames.force = NA, ...)\n     \n     is.matrix(x)\n     \nArguments:\n\n    data: an optional data vector (including a list or 'expression'\n          vector).  Non-atomic classed R objects are coerced by\n          'as.vector' and all attributes discarded.\n\n    nrow: the desired number of rows.\n\n    ncol: the desired number of columns.\n\n   byrow: logical. 
If 'FALSE' (the default) the matrix is filled by\n          columns, otherwise the matrix is filled by rows.\n\ndimnames: A 'dimnames' attribute for the matrix: 'NULL' or a 'list' of\n          length 2 giving the row and column names respectively.  An\n          empty list is treated as 'NULL', and a list of length one as\n          row names.  The list can be named, and the list names will be\n          used as names for the dimensions.\n\n       x: an R object.\n\n     ...: additional arguments to be passed to or from methods.\n\nrownames.force: logical indicating if the resulting matrix should have\n          character (rather than 'NULL') 'rownames'.  The default,\n          'NA', uses 'NULL' rownames if the data frame has 'automatic'\n          row.names or for a zero-row data frame.\n\nDetails:\n\n     If one of 'nrow' or 'ncol' is not given, an attempt is made to\n     infer it from the length of 'data' and the other parameter.  If\n     neither is given, a one-column matrix is returned.\n\n     If there are too few elements in 'data' to fill the matrix, then\n     the elements in 'data' are recycled.  If 'data' has length zero,\n     'NA' of an appropriate type is used for atomic vectors ('0' for\n     raw vectors) and 'NULL' for lists.\n\n     'is.matrix' returns 'TRUE' if 'x' is a vector and has a '\"dim\"'\n     attribute of length 2 and 'FALSE' otherwise.  Note that a\n     'data.frame' is *not* a matrix by this test.  The function is\n     generic: you can write methods to handle specific classes of\n     objects, see InternalMethods.\n\n     'as.matrix' is a generic function.  The method for data frames\n     will return a character matrix if there is only atomic columns and\n     any non-(numeric/logical/complex) column, applying 'as.vector' to\n     factors and 'format' to other non-character columns.  
Otherwise,\n     the usual coercion hierarchy (logical < integer < double <\n     complex) will be used, e.g., all-logical data frames will be\n     coerced to a logical matrix, mixed logical-integer will give a\n     integer matrix, etc.\n\n     The default method for 'as.matrix' calls 'as.vector(x)', and hence\n     e.g. coerces factors to character vectors.\n\n     When coercing a vector, it produces a one-column matrix, and\n     promotes the names (if any) of the vector to the rownames of the\n     matrix.\n\n     'is.matrix' is a primitive function.\n\n     The 'print' method for a matrix gives a rectangular layout with\n     dimnames or indices.  For a list matrix, the entries of length not\n     one are printed in the form 'integer,7' indicating the type and\n     length.\n\nNote:\n\n     If you just want to convert a vector to a matrix, something like\n\n       dim(x) <- c(nx, ny)\n       dimnames(x) <- list(row_names, col_names)\n     \n     will avoid duplicating 'x' _and_ preserve 'class(x)' which may be\n     useful, e.g., for 'Date' objects.\n\nReferences:\n\n     Becker, R. A., Chambers, J. M. and Wilks, A. R. (1988) _The New S\n     Language_.  Wadsworth & Brooks/Cole.\n\nSee Also:\n\n     'data.matrix', which attempts to convert to a numeric matrix.\n\n     A matrix is the special case of a two-dimensional 'array'.\n     'inherits(m, \"array\")' is true for a 'matrix' 'm'.\n\nExamples:\n\n     is.matrix(as.matrix(1:10))\n     !is.matrix(warpbreaks)  # data.frame, NOT matrix!\n     warpbreaks[1:10,]\n     as.matrix(warpbreaks[1:10,])  # using as.matrix.data.frame(.) 
method\n     \n     ## Example of setting row and column names\n     mdat <- matrix(c(1,2,3, 11,12,13), nrow = 2, ncol = 3, byrow = TRUE,\n                    dimnames = list(c(\"row1\", \"row2\"),\n                                    c(\"C.1\", \"C.2\", \"C.3\")))\n     mdat\n```\n:::\n:::\n\n\n\n## Summary\n\n- Functions are \"self-contained\" modules of code that accomplish specific tasks.\n- Arguments are what you pass to functions (e.g., objects on which you carry out the task or options for how to carry out the task)\n- Arguments may include defaults that the author of the function specified as being \"good enough in standard cases\", but that can be changed.\n- An R Package is a bundle or \"package\" of code (and possibly data) that can be used by installing it once and attaching it (using `library()`) each time R/RStudio is opened\n- The Help window in RStudio is useful for getting more information about functions and packages \n\n\n## Acknowledgements\n\nThese are the materials we looked through, modified, or extracted to complete this module's lecture.\n\n- [\"Introduction to R - ARCHIVED\" from  Harvard Chan Bioinformatics Core (HBC)](https://hbctraining.github.io/Intro-to-R/lessons/03_introR-functions-and-arguments.html#:\\~:text=A%20key%20feature%20of%20R,it%2C%20and%20return%20a%20result.)\n\n\n",
     "supporting": [],
     "filters": [
       "rmarkdown/pagebreak.lua"
diff --git a/_freeze/modules/Module03-WorkingDirectories/execute-results/html.json b/_freeze/modules/Module03-WorkingDirectories/execute-results/html.json
index 2e6560c..3061fc1 100644
--- a/_freeze/modules/Module03-WorkingDirectories/execute-results/html.json
+++ b/_freeze/modules/Module03-WorkingDirectories/execute-results/html.json
@@ -1,8 +1,7 @@
 {
-  "hash": "cc0e87ad3f332df20d0071d6ad92faff",
+  "hash": "8434fd2c84bea4b8dd46c1e3247e7a9d",
   "result": {
-    "engine": "knitr",
-    "markdown": "---\ntitle: \"Module 3: Working Directories\"\nformat: \n  revealjs:\n    scrollable: true\n    smaller: true\n---\n\n\n\n## Learning Objectives\n\nAfter module 3, you should be able to...\n\n-   Understand your own systems file structure and the purpose of the working directory\n-   Determine the working directory\n-   Change the working directory\n\n## File Structure\n\nxxzane slide(s)\n\n## Working Directory -- Basic term\n\n-   R \"looks\" for files on your computer relative to the \"working\" directory\n-   For example, if you want to load data into R or save a figure, you will need to tell R where/store the file\n-   Many people recommend not setting a directory in the scripts, rather assume you're in the directory the script is in\n\n\n## Getting and setting the working directory using code\n\n\n\n::: {.cell}\n\n```{.r .cell-code}\n## get the working directory\ngetwd()\nsetwd(\"~/\") \n```\n:::\n\n\n\n## Setting a working directory\n\n-   Setting the directory can sometimes (almost always when new to R) be finicky\n    -   **Windows**: Default directory structure involves single backslashes (\"`\\`\"), but R interprets these as\"escape\" characters. So you must replace the backslash with forward slashes (\"/\") or two backslashes (\"`\\\\`\")\n    -   **Mac/Linux**: Default is forward slashes, so you are okay\n-   Typical directory structure syntax applies\n    -   \"..\" - goes up one level\n    -   \"./\" - is the current directory\n    -   \"\\~\" - is your \"home\" directory\n\n\n## Absolute vs. relative paths\n\nFrom Wiki\n\n-   An **absolute or full path** points to the same location in a file system, regardless of the current working directory. To do that, it must include the root directory. Absolute path is specific to your system alone. 
This means if I try your code, and you use absolute paths, it won't work unless we have the exact same folder structure where R is looking (bad).\n\n-   By contrast, a **relative path starts from some given working directory**, avoiding the need to provide the full absolute path.\n\n## Relative path\n\nYou want to set you code up based on relative paths.  This allows sharing of code, and also, allows you to modify your own file structure (above the working directory) without breaking your own code.\n\n\n## Setting the working directory using your cursor\n\nRemember above \"Many people recommend not setting a directory in the scripts, rather assume you're in the directory the script is in.\" To do so, go to Session --\\> Set Working Directory --\\> To Source File Location\n\nRStudio will show the code in the Console for the action you took with your cursor. This is a good way to learn about your file system how to set a correct working directory!\n\n\n\n::: {.cell}\n\n```{.r .cell-code}\nsetwd(\"~/Dropbox/Git/SISMID-2024\")\n```\n:::\n\n\n\n\n## Setting the Working Directory\n\nIf you have not yet saved a \"source\" file, it will set working directory to the default location. 
See RStudio -\\> Preferences -\\> General for default location.\n\nTo change the working directory to another location, go to Session --\\> Set Working Directory --\\> Choose Directory`\n\nAgain, RStudio will show the code in the Console for the action you took with your cursor.\n\n\n## Summary\n\n-   R \"looks\" for files on your computer relative to the \"working\" directory\n-   Absolute path points to the same location in a file system - it is specific to your system and your system alone\n-   Relative path points is based on the current working directory \n-   Two functions, `setwd()` and `getwd()`, are your new best friends.\n\n\n## Acknowledgements\n\nThese are the materials we looked through, modified, or extracted to complete this module's lecture.\n\n-   [\"Introduction to R for Public Health Researchers\" Johns Hopkins University](https://jhudatascience.org/intro_to_r/)\n",
+    "markdown": "---\ntitle: \"Module 3: Working Directories\"\nformat: \n  revealjs:\n    scrollable: true\n    smaller: true\n    toc: false\n---\n\n\n## Learning Objectives\n\nAfter module 3, you should be able to...\n\n-   Understand your own system's file structure and the purpose of the working directory\n-   Determine the working directory\n-   Change the working directory\n\n## File Structure\n\nxxzane slide(s)\n\n## Working Directory -- Basic term\n\n-   R \"looks\" for files on your computer relative to the \"working\" directory\n-   For example, if you want to load data into R or save a figure, you will need to tell R where to look for or store the file\n-   Many people recommend not setting a directory in the scripts, rather assume you're in the directory the script is in\n\n\n## Getting and setting the working directory using code\n\n\n::: {.cell}\n\n```{.r .cell-code}\n## get the working directory\ngetwd()\nsetwd(\"~/\") \n```\n:::\n\n\n## Setting a working directory\n\n-   Setting the directory can sometimes (almost always when new to R) be finicky\n    -   **Windows**: Default directory structure involves single backslashes (\"`\\`\"), but R interprets these as \"escape\" characters. So you must replace the backslash with forward slashes (\"/\") or two backslashes (\"`\\\\`\")\n    -   **Mac/Linux**: Default is forward slashes, so you are okay\n-   Typical directory structure syntax applies\n    -   \"..\" - goes up one level\n    -   \"./\" - is the current directory\n    -   \"\\~\" - is your \"home\" directory\n\n\n## Absolute vs. relative paths\n\nFrom Wiki\n\n-   An **absolute or full path** points to the same location in a file system, regardless of the current working directory. To do that, it must include the root directory. Absolute path is specific to your system alone. 
This means if I try your code, and you use absolute paths, it won't work unless we have the exact same folder structure where R is looking (bad).\n\n-   By contrast, a **relative path starts from some given working directory**, avoiding the need to provide the full absolute path.\n\n## Relative path\n\nYou want to set your code up based on relative paths.  This allows you to share code and also to modify your own file structure (above the working directory) without breaking your own code.\n\n\n## Setting the working directory using your cursor\n\nRemember above \"Many people recommend not setting a directory in the scripts, rather assume you're in the directory the script is in.\" To do so, go to Session --\\> Set Working Directory --\\> To Source File Location\n\nRStudio will show the code in the Console for the action you took with your cursor. This is a good way to learn about your file system and how to set a correct working directory!\n\n\n::: {.cell}\n\n```{.r .cell-code}\nsetwd(\"~/Dropbox/Git/SISMID-2024\")\n```\n:::\n\n\n\n## Setting the Working Directory\n\nIf you have not yet saved a \"source\" file, it will set the working directory to the default location. To find the default location, go to the Tools menu in the menu bar -\\> Global Options -\\> General.\n\nTo change the working directory to another location, go to the Session menu in the menu bar --\\> Set Working Directory --\\> Choose Directory\n\nAgain, RStudio will show the code in the Console for the action you took with your cursor.\n\n\n## Summary\n\n-   R \"looks\" for files on your computer relative to the \"working\" directory\n-   An absolute path points to the same location in a file system - it is specific to your system and your system alone\n-   A relative path is based on the current working directory \n-   Two functions, `setwd()` and `getwd()`, are useful for identifying and manipulating the working directory.\n\n\n## Acknowledgements\n\nThese are the materials we looked through, modified, or 
extracted to complete this module's lecture.\n\n-   [\"Introduction to R for Public Health Researchers\" Johns Hopkins University](https://jhudatascience.org/intro_to_r/)\n",
     "supporting": [],
     "filters": [
       "rmarkdown/pagebreak.lua"
diff --git a/_freeze/modules/Module05-DataImportExport/execute-results/html.json b/_freeze/modules/Module05-DataImportExport/execute-results/html.json
index 5833ba8..2e3766f 100644
--- a/_freeze/modules/Module05-DataImportExport/execute-results/html.json
+++ b/_freeze/modules/Module05-DataImportExport/execute-results/html.json
@@ -1,8 +1,7 @@
 {
-  "hash": "2ccda6bd4bed2b1d83f2f251ba9dd6ce",
+  "hash": "a0db8f0dfe70bceb90779858e46280be",
   "result": {
-    "engine": "knitr",
-    "markdown": "---\ntitle: \"Module 5: Data Import and Export\"\nformat: \n  revealjs:\n    scrollable: true\n    smaller: true\n---\n\n\n\n## Learning Objectives\n\nAfter module 5, you should be able to...\n\n-   Use Base R functions to load data\n-   Install and call external R Packages to extend R's functionality\n-   Install any type of data into R\n-   Find loaded data in the Global Environment window of RStudio\n-   Reading and writing R .Rds and .Rda/.RData files\n\n\n## Import (read) Data\n\n-   Importing or 'Reading in' data is the first step of any real project/analysis\n-   R can read almost any file format, especially with external, non-Base R, packages\n-   We are going to focus on simple delimited files first. \n    -   comma separated (e.g. '.csv')\n    -   tab delimited (e.g. '.txt')\n\nA delimited file is a sequential file with column delimiters. Each delimited file is a stream of records, which consists of fields that are ordered by column. Each record contains fields for one row. Within each row, individual fields are separated by column **delimiters** (IBM.com definition)\n\n## Mini exercise\n\n1. Download Module 5 data from the website and save the data to your data subdirectory -- specifically `SISMID_IntroToR_RProject/data`\n\n2. Open the data files in a text editor application and familiarize you self with the data.\n\n3. 
Determine the delminiter of the two '.txt' files\n\n\n## Import delimited data\n\nWithin the Base R 'util' package we can find a handful of useful functions including  `read.csv()` and `read.delim()` to importing data.\n\n\n\n::: {.cell}\n\n```{.r .cell-code}\n?read.csv\n```\n:::\n\n::: {.cell}\n::: {.cell-output .cell-output-stderr}\n\n```\nRegistered S3 method overwritten by 'printr':\n  method                from     \n  knit_print.data.frame rmarkdown\n```\n\n\n:::\n\n::: {.cell-output .cell-output-stdout}\n\n```\nData Input\n\nDescription:\n\n     Reads a file in table format and creates a data frame from it,\n     with cases corresponding to lines and variables to fields in the\n     file.\n\nUsage:\n\n     read.table(file, header = FALSE, sep = \"\", quote = \"\\\"'\",\n                dec = \".\", numerals = c(\"allow.loss\", \"warn.loss\", \"no.loss\"),\n                row.names, col.names, as.is = !stringsAsFactors, tryLogical = TRUE,\n                na.strings = \"NA\", colClasses = NA, nrows = -1,\n                skip = 0, check.names = TRUE, fill = !blank.lines.skip,\n                strip.white = FALSE, blank.lines.skip = TRUE,\n                comment.char = \"#\",\n                allowEscapes = FALSE, flush = FALSE,\n                stringsAsFactors = FALSE,\n                fileEncoding = \"\", encoding = \"unknown\", text, skipNul = FALSE)\n     \n     read.csv(file, header = TRUE, sep = \",\", quote = \"\\\"\",\n              dec = \".\", fill = TRUE, comment.char = \"\", ...)\n     \n     read.csv2(file, header = TRUE, sep = \";\", quote = \"\\\"\",\n               dec = \",\", fill = TRUE, comment.char = \"\", ...)\n     \n     read.delim(file, header = TRUE, sep = \"\\t\", quote = \"\\\"\",\n                dec = \".\", fill = TRUE, comment.char = \"\", ...)\n     \n     read.delim2(file, header = TRUE, sep = \"\\t\", quote = \"\\\"\",\n                 dec = \",\", fill = TRUE, comment.char = \"\", ...)\n     \nArguments:\n\n    file: the 
name of the file which the data are to be read from.\n          Each row of the table appears as one line of the file.  If it\n          does not contain an _absolute_ path, the file name is\n          _relative_ to the current working directory, 'getwd()'.\n          Tilde-expansion is performed where supported.  This can be a\n          compressed file (see 'file').\n\n          Alternatively, 'file' can be a readable text-mode connection\n          (which will be opened for reading if necessary, and if so\n          'close'd (and hence destroyed) at the end of the function\n          call).  (If 'stdin()' is used, the prompts for lines may be\n          somewhat confusing.  Terminate input with a blank line or an\n          EOF signal, 'Ctrl-D' on Unix and 'Ctrl-Z' on Windows.  Any\n          pushback on 'stdin()' will be cleared before return.)\n\n          'file' can also be a complete URL.  (For the supported URL\n          schemes, see the 'URLs' section of the help for 'url'.)\n\n  header: a logical value indicating whether the file contains the\n          names of the variables as its first line.  If missing, the\n          value is determined from the file format: 'header' is set to\n          'TRUE' if and only if the first row contains one fewer field\n          than the number of columns.\n\n     sep: the field separator character.  Values on each line of the\n          file are separated by this character.  If 'sep = \"\"' (the\n          default for 'read.table') the separator is 'white space',\n          that is one or more spaces, tabs, newlines or carriage\n          returns.\n\n   quote: the set of quoting characters. To disable quoting altogether,\n          use 'quote = \"\"'.  See 'scan' for the behaviour on quotes\n          embedded in quotes.  
Quoting is only considered for columns\n          read as character, which is all of them unless 'colClasses'\n          is specified.\n\n     dec: the character used in the file for decimal points.\n\nnumerals: string indicating how to convert numbers whose conversion to\n          double precision would lose accuracy, see 'type.convert'.\n          Can be abbreviated.  (Applies also to complex-number inputs.)\n\nrow.names: a vector of row names.  This can be a vector giving the\n          actual row names, or a single number giving the column of the\n          table which contains the row names, or character string\n          giving the name of the table column containing the row names.\n\n          If there is a header and the first row contains one fewer\n          field than the number of columns, the first column in the\n          input is used for the row names.  Otherwise if 'row.names' is\n          missing, the rows are numbered.\n\n          Using 'row.names = NULL' forces row numbering. Missing or\n          'NULL' 'row.names' generate row names that are considered to\n          be 'automatic' (and not preserved by 'as.matrix').\n\ncol.names: a vector of optional names for the variables.  The default\n          is to use '\"V\"' followed by the column number.\n\n   as.is: controls conversion of character variables (insofar as they\n          are not converted to logical, numeric or complex) to factors,\n          if not otherwise specified by 'colClasses'.  
Its value is\n          either a vector of logicals (values are recycled if\n          necessary), or a vector of numeric or character indices which\n          specify which columns should not be converted to factors.\n\n          Note: to suppress all conversions including those of numeric\n          columns, set 'colClasses = \"character\"'.\n\n          Note that 'as.is' is specified per column (not per variable)\n          and so includes the column of row names (if any) and any\n          columns to be skipped.\n\ntryLogical: a 'logical' determining if columns consisting entirely of\n          '\"F\"', '\"T\"', '\"FALSE\"', and '\"TRUE\"' should be converted to\n          'logical'; passed to 'type.convert', true by default.\n\nna.strings: a character vector of strings which are to be interpreted\n          as 'NA' values.  Blank fields are also considered to be\n          missing values in logical, integer, numeric and complex\n          fields.  Note that the test happens _after_ white space is\n          stripped from the input, so 'na.strings' values may need\n          their own white space stripped in advance.\n\ncolClasses: character.  A vector of classes to be assumed for the\n          columns.  If unnamed, recycled as necessary.  
If named, names\n          are matched with unspecified values being taken to be 'NA'.\n\n          Possible values are 'NA' (the default, when 'type.convert' is\n          used), '\"NULL\"' (when the column is skipped), one of the\n          atomic vector classes (logical, integer, numeric, complex,\n          character, raw), or '\"factor\"', '\"Date\"' or '\"POSIXct\"'.\n          Otherwise there needs to be an 'as' method (from package\n          'methods') for conversion from '\"character\"' to the specified\n          formal class.\n\n          Note that 'colClasses' is specified per column (not per\n          variable) and so includes the column of row names (if any).\n\n   nrows: integer: the maximum number of rows to read in.  Negative and\n          other invalid values are ignored.\n\n    skip: integer: the number of lines of the data file to skip before\n          beginning to read data.\n\ncheck.names: logical.  If 'TRUE' then the names of the variables in the\n          data frame are checked to ensure that they are syntactically\n          valid variable names.  If necessary they are adjusted (by\n          'make.names') so that they are, and also to ensure that there\n          are no duplicates.\n\n    fill: logical. If 'TRUE' then in case the rows have unequal length,\n          blank fields are implicitly added.  See 'Details'.\n\nstrip.white: logical. Used only when 'sep' has been specified, and\n          allows the stripping of leading and trailing white space from\n          unquoted 'character' fields ('numeric' fields are always\n          stripped).  See 'scan' for further details (including the\n          exact meaning of 'white space'), remembering that the columns\n          may include the row names.\n\nblank.lines.skip: logical: if 'TRUE' blank lines in the input are\n          ignored.\n\ncomment.char: character: a character vector of length one containing a\n          single character or an empty string.  
Use '\"\"' to turn off\n          the interpretation of comments altogether.\n\nallowEscapes: logical.  Should C-style escapes such as '\\n' be\n          processed or read verbatim (the default)?  Note that if not\n          within quotes these could be interpreted as a delimiter (but\n          not as a comment character).  For more details see 'scan'.\n\n   flush: logical: if 'TRUE', 'scan' will flush to the end of the line\n          after reading the last of the fields requested.  This allows\n          putting comments after the last field.\n\nstringsAsFactors: logical: should character vectors be converted to\n          factors?  Note that this is overridden by 'as.is' and\n          'colClasses', both of which allow finer control.\n\nfileEncoding: character string: if non-empty declares the encoding used\n          on a file (not a connection) so the character data can be\n          re-encoded.  See the 'Encoding' section of the help for\n          'file', the 'R Data Import/Export' manual and 'Note'.\n\nencoding: encoding to be assumed for input strings.  It is used to mark\n          character strings as known to be in Latin-1 or UTF-8 (see\n          'Encoding'): it is not used to re-encode the input, but\n          allows R to handle encoded strings in their native encoding\n          (if one of those two).  
See 'Value' and 'Note'.\n\n    text: character string: if 'file' is not supplied and this is, then\n          data are read from the value of 'text' via a text connection.\n          Notice that a literal string can be used to include (small)\n          data sets within R code.\n\n skipNul: logical: should nuls be skipped?\n\n     ...: Further arguments to be passed to 'read.table'.\n\nDetails:\n\n     This function is the principal means of reading tabular data into\n     R.\n\n     Unless 'colClasses' is specified, all columns are read as\n     character columns and then converted using 'type.convert' to\n     logical, integer, numeric, complex or (depending on 'as.is')\n     factor as appropriate.  Quotes are (by default) interpreted in all\n     fields, so a column of values like '\"42\"' will result in an\n     integer column.\n\n     A field or line is 'blank' if it contains nothing (except\n     whitespace if no separator is specified) before a comment\n     character or the end of the field or line.\n\n     If 'row.names' is not specified and the header line has one less\n     entry than the number of columns, the first column is taken to be\n     the row names.  This allows data frames to be read in from the\n     format in which they are printed.  If 'row.names' is specified and\n     does not refer to the first column, that column is discarded from\n     such files.\n\n     The number of data columns is determined by looking at the first\n     five lines of input (or the whole input if it has less than five\n     lines), or from the length of 'col.names' if it is specified and\n     is longer.  This could conceivably be wrong if 'fill' or\n     'blank.lines.skip' are true, so specify 'col.names' if necessary\n     (as in the 'Examples').\n\n     'read.csv' and 'read.csv2' are identical to 'read.table' except\n     for the defaults.  
They are intended for reading 'comma separated\n     value' files ('.csv') or ('read.csv2') the variant used in\n     countries that use a comma as decimal point and a semicolon as\n     field separator.  Similarly, 'read.delim' and 'read.delim2' are\n     for reading delimited files, defaulting to the TAB character for\n     the delimiter.  Notice that 'header = TRUE' and 'fill = TRUE' in\n     these variants, and that the comment character is disabled.\n\n     The rest of the line after a comment character is skipped; quotes\n     are not processed in comments.  Complete comment lines are allowed\n     provided 'blank.lines.skip = TRUE'; however, comment lines prior\n     to the header must have the comment character in the first\n     non-blank column.\n\n     Quoted fields with embedded newlines are supported except after a\n     comment character.  Embedded nuls are unsupported: skipping them\n     (with 'skipNul = TRUE') may work.\n\nValue:\n\n     A data frame ('data.frame') containing a representation of the\n     data in the file.\n\n     Empty input is an error unless 'col.names' is specified, when a\n     0-row data frame is returned: similarly giving just a header line\n     if 'header = TRUE' results in a 0-row data frame.  Note that in\n     either case the columns will be logical unless 'colClasses' was\n     supplied.\n\n     Character strings in the result (including factor levels) will\n     have a declared encoding if 'encoding' is '\"latin1\"' or '\"UTF-8\"'.\n\nCSV files:\n\n     See the help on 'write.csv' for the various conventions for '.csv'\n     files.  The commonest form of CSV file with row names needs to be\n     read with 'read.csv(..., row.names = 1)' to use the names in the\n     first column of the file as row names.\n\nMemory usage:\n\n     These functions can use a surprising amount of memory when reading\n     large files.  
There is extensive discussion in the 'R Data\n     Import/Export' manual, supplementing the notes here.\n\n     Less memory will be used if 'colClasses' is specified as one of\n     the six atomic vector classes.  This can be particularly so when\n     reading a column that takes many distinct numeric values, as\n     storing each distinct value as a character string can take up to\n     14 times as much memory as storing it as an integer.\n\n     Using 'nrows', even as a mild over-estimate, will help memory\n     usage.\n\n     Using 'comment.char = \"\"' will be appreciably faster than the\n     'read.table' default.\n\n     'read.table' is not the right tool for reading large matrices,\n     especially those with many columns: it is designed to read _data\n     frames_ which may have columns of very different classes.  Use\n     'scan' instead for matrices.\n\nNote:\n\n     The columns referred to in 'as.is' and 'colClasses' include the\n     column of row names (if any).\n\n     There are two approaches for reading input that is not in the\n     local encoding.  If the input is known to be UTF-8 or Latin1, use\n     the 'encoding' argument to declare that.  If the input is in some\n     other encoding, then it may be translated on input.  The\n     'fileEncoding' argument achieves this by setting up a connection\n     to do the re-encoding into the current locale.  Note that on\n     Windows or other systems not running in a UTF-8 locale, this may\n     not be possible.\n\nReferences:\n\n     Chambers, J. M. (1992) _Data for models._ Chapter 3 of\n     _Statistical Models in S_ eds J. M. Chambers and T. J. 
Hastie,\n     Wadsworth & Brooks/Cole.\n\nSee Also:\n\n     The 'R Data Import/Export' manual.\n\n     'scan', 'type.convert', 'read.fwf' for reading _f_ixed _w_idth\n     _f_ormatted input; 'write.table'; 'data.frame'.\n\n     'count.fields' can be useful to determine problems with reading\n     files which result in reports of incorrect record lengths (see the\n     'Examples' below).\n\n     <https://www.rfc-editor.org/rfc/rfc4180> for the IANA definition\n     of CSV files (which requires comma as separator and CRLF line\n     endings).\n\nExamples:\n\n     ## using count.fields to handle unknown maximum number of fields\n     ## when fill = TRUE\n     test1 <- c(1:5, \"6,7\", \"8,9,10\")\n     tf <- tempfile()\n     writeLines(test1, tf)\n     \n     read.csv(tf, fill = TRUE) # 1 column\n     ncol <- max(count.fields(tf, sep = \",\"))\n     read.csv(tf, fill = TRUE, header = FALSE,\n              col.names = paste0(\"V\", seq_len(ncol)))\n     unlink(tf)\n     \n     ## \"Inline\" data set, using text=\n     ## Notice that leading and trailing empty lines are auto-trimmed\n     \n     read.table(header = TRUE, text = \"\n     a b\n     1 2\n     3 4\n     \")\n```\n\n\n:::\n:::\n\n\n\n## Import .csv files\n\nReminder\n```\nread.csv(file, header = TRUE, sep = \",\", quote = \"\\\"\",\n         dec = \".\", fill = TRUE, comment.char = \"\", ...)\n```\n\n`file` is the first argument and is the path to your file, in quotes \n\n\t- \t\tcan be a path on your local computer -- absolute file path or relative file path \n\t- \t\tcan be a path to a file on a website\n\n## Mini Exercise\n\nIf your R Project is not already open, open it so that we can take advantage of the working directory it sets for us when importing data.\n\n\n## Import .csv files\n\nLet's import a new data file\n\n\n\n::: {.cell}\n\n```{.r .cell-code}\n## Examples\ndf <- read.csv(file = \"data/serodata.csv\") #relative path\ndf <- read.csv(file = \"~/Dropbox/Git/SISMID-2024/modules/data/serodata.csv\") 
#absolute path starting from my home directory\n```\n:::\n\n\n\n\nNote #1, I assigned the data frame to an object called `df`.  I could have called the data anything, but in order to use the data (i.e., as an object we can find in the Environment), I need to assign it as an object. \n\nNote #2, Look to the Environment window, you will see the `df` object ready to be used.\n\n\n## Import .txt files\n\n`read.csv()` and `read.delim()` are both special cases of `read.table()` -- a general function to read a delimited file into a data frame  \n\n```\nread.delim(file, header = TRUE, sep = \"\\t\", quote = \"\\\"\",\n           dec = \".\", fill = TRUE, comment.char = \"\", ...)\n```\n\n- `file` is the path to your file, in quotes \n- `sep` is what separates the fields within a record. The default for `read.delim()` is a tab; for `read.csv()` it is a comma\n\n## Import .txt files\n\nLet's first import 'serodata1.txt', which uses a tab delimiter, and then 'serodata2.txt', which uses a semicolon delimiter.\n\n\n\n\n::: {.cell}\n\n```{.r .cell-code}\n## Examples\ndf <- read.delim(file = \"data/serodata1.txt\", sep = \"\\t\")\ndf <- read.delim(file = \"data/serodata2.txt\", sep = \";\")\n```\n:::\n\n\n\nThe data is now successfully read into your R workspace, **many times actually.** Notice that each time we imported the data we assigned it to the `df` object, meaning we replaced the previous contents each time we reassigned the `df` object.  \n\n\n## What if we have a .xlsx file - what do we do?\n\n1. Google / Ask ChatGPT\n2. Find and vet function and package you want\n3. Install package\n4. Call package\n5. Use function\n\n\n## 1. Internet Search\n\n\n\n::: {.cell}\n::: {.cell-output-display}\n![](images/ChatGPT.png){width=100%}\n:::\n\n::: {.cell-output-display}\n![](images/GoogleSearch.png){width=100%}\n:::\n\n::: {.cell-output-display}\n![](images/StackOverflow.png){width=100%}\n:::\n:::\n\n\n\n## 2. Find and vet function and package you want\n\nI am getting a consistent message to use the `read_excel()` function found in the `readxl` package.  
This package was developed by Hadley Wickham, who we know is reputable. Also, you can check that the data was read in correctly, because this is a straightforward task. \n\n## 3. Install Package\n\nTo use the bundle or \"package\" of code (and/or possibly data) from a package, you need to install and also call the package.\n\nTo install a package you can \n\n1. go to Tools ---\\> Install Packages in the RStudio header\n\nOR\n\n2. use the following code:\n\n\n::: {.cell}\n\n```{.r .cell-code}\ninstall.packages(\"package_name\")\n```\n:::\n\n\n\n\nTherefore,\n\n\n\n::: {.cell}\n\n```{.r .cell-code}\ninstall.packages(\"readxl\")\n```\n:::\n\n\n\n## 4. Call Package\n\nReminder -- Installing and calling packages\n\nTo call a package (i.e., to be able to use it), you can use the following code:\n\n\n::: {.cell}\n\n```{.r .cell-code}\nlibrary(package_name)\n```\n:::\n\n\n\nTherefore, \n\n\n\n::: {.cell}\n\n```{.r .cell-code}\nlibrary(readxl)\n```\n:::\n\n\n\n## 5. Use Function\n\n\n\n::: {.cell}\n\n```{.r .cell-code}\n?read_excel\n```\n:::\n\nRead xls and xlsx files\n\nDescription:\n\n     Read xls and xlsx files\n\n     'read_excel()' calls 'excel_format()' to determine if 'path' is\n     xls or xlsx, based on the file extension and the file itself, in\n     that order. 
Use 'read_xls()' and 'read_xlsx()' directly if you\n     know better and want to prevent such guessing.\n\nUsage:\n\n     read_excel(\n       path,\n       sheet = NULL,\n       range = NULL,\n       col_names = TRUE,\n       col_types = NULL,\n       na = \"\",\n       trim_ws = TRUE,\n       skip = 0,\n       n_max = Inf,\n       guess_max = min(1000, n_max),\n       progress = readxl_progress(),\n       .name_repair = \"unique\"\n     )\n     \n     read_xls(\n       path,\n       sheet = NULL,\n       range = NULL,\n       col_names = TRUE,\n       col_types = NULL,\n       na = \"\",\n       trim_ws = TRUE,\n       skip = 0,\n       n_max = Inf,\n       guess_max = min(1000, n_max),\n       progress = readxl_progress(),\n       .name_repair = \"unique\"\n     )\n     \n     read_xlsx(\n       path,\n       sheet = NULL,\n       range = NULL,\n       col_names = TRUE,\n       col_types = NULL,\n       na = \"\",\n       trim_ws = TRUE,\n       skip = 0,\n       n_max = Inf,\n       guess_max = min(1000, n_max),\n       progress = readxl_progress(),\n       .name_repair = \"unique\"\n     )\n     \nArguments:\n\n    path: Path to the xls/xlsx file.\n\n   sheet: Sheet to read. Either a string (the name of a sheet), or an\n          integer (the position of the sheet). Ignored if the sheet is\n          specified via 'range'. If neither argument specifies the\n          sheet, defaults to the first sheet.\n\n   range: A cell range to read from, as described in\n          cell-specification. Includes typical Excel ranges like\n          \"B3:D87\", possibly including the sheet name like\n          \"Budget!B2:G14\", and more. Interpreted strictly, even if the\n          range forces the inclusion of leading or trailing empty rows\n          or columns. 
Takes precedence over 'skip', 'n_max' and\n          'sheet'.\n\ncol_names: 'TRUE' to use the first row as column names, 'FALSE' to get\n          default names, or a character vector giving a name for each\n          column. If user provides 'col_types' as a vector, 'col_names'\n          can have one entry per column, i.e. have the same length as\n          'col_types', or one entry per unskipped column.\n\ncol_types: Either 'NULL' to guess all from the spreadsheet or a\n          character vector containing one entry per column from these\n          options: \"skip\", \"guess\", \"logical\", \"numeric\", \"date\",\n          \"text\" or \"list\". If exactly one 'col_type' is specified, it\n          will be recycled. The content of a cell in a skipped column\n          is never read and that column will not appear in the data\n          frame output. A list cell loads a column as a list of length\n          1 vectors, which are typed using the type guessing logic from\n          'col_types = NULL', but on a cell-by-cell basis.\n\n      na: Character vector of strings to interpret as missing values.\n          By default, readxl treats blank cells as missing data.\n\n trim_ws: Should leading and trailing whitespace be trimmed?\n\n    skip: Minimum number of rows to skip before reading anything, be it\n          column names or data. Leading empty rows are automatically\n          skipped, so this is a lower bound. Ignored if 'range' is\n          given.\n\n   n_max: Maximum number of data rows to read. Trailing empty rows are\n          automatically skipped, so this is an upper bound on the\n          number of rows in the returned tibble. Ignored if 'range' is\n          given.\n\nguess_max: Maximum number of data rows to use for guessing column\n          types.\n\nprogress: Display a progress spinner? 
By default, the spinner appears\n          only in an interactive session, outside the context of\n          knitting a document, and when the call is likely to run for\n          several seconds or more. See 'readxl_progress()' for more\n          details.\n\n.name_repair: Handling of column names. Passed along to\n          'tibble::as_tibble()'. readxl's default is `.name_repair =\n          \"unique\", which ensures column names are not empty and are\n          unique.\n\nValue:\n\n     A tibble\n\nSee Also:\n\n     cell-specification for more details on targetting cells with the\n     'range' argument\n\nExamples:\n\n     datasets <- readxl_example(\"datasets.xlsx\")\n     read_excel(datasets)\n     \n     # Specify sheet either by position or by name\n     read_excel(datasets, 2)\n     read_excel(datasets, \"mtcars\")\n     \n     # Skip rows and use default column names\n     read_excel(datasets, skip = 148, col_names = FALSE)\n     \n     # Recycle a single column type\n     read_excel(datasets, col_types = \"text\")\n     \n     # Specify some col_types and guess others\n     read_excel(datasets, col_types = c(\"text\", \"guess\", \"numeric\", \"guess\", \"guess\"))\n     \n     # Accomodate a column with disparate types via col_type = \"list\"\n     df <- read_excel(readxl_example(\"clippy.xlsx\"), col_types = c(\"text\", \"list\"))\n     df\n     df$value\n     sapply(df$value, class)\n     \n     # Limit the number of data rows read\n     read_excel(datasets, n_max = 3)\n     \n     # Read from an Excel range using A1 or R1C1 notation\n     read_excel(datasets, range = \"C1:E7\")\n     read_excel(datasets, range = \"R1C2:R2C5\")\n     \n     # Specify the sheet as part of the range\n     read_excel(datasets, range = \"mtcars!B1:D5\")\n     \n     # Read only specific rows or columns\n     read_excel(datasets, range = cell_rows(102:151), col_names = FALSE)\n     read_excel(datasets, range = cell_cols(\"B:D\"))\n     \n     # Get a preview of column 
names\n     names(read_excel(readxl_example(\"datasets.xlsx\"), n_max = 0))\n     \n     # exploit full .name_repair flexibility from tibble\n     \n     # \"universal\" names are unique and syntactic\n     read_excel(\n       readxl_example(\"deaths.xlsx\"),\n       range = \"arts!A5:F15\",\n       .name_repair = \"universal\"\n     )\n     \n     # specify name repair as a built-in function\n     read_excel(readxl_example(\"clippy.xlsx\"), .name_repair = toupper)\n     \n     # specify name repair as a custom function\n     my_custom_name_repair <- function(nms) tolower(gsub(\"[.]\", \"_\", nms))\n     read_excel(\n       readxl_example(\"datasets.xlsx\"),\n       .name_repair = my_custom_name_repair\n     )\n     \n     # specify name repair as an anonymous function\n     read_excel(\n       readxl_example(\"datasets.xlsx\"),\n       sheet = \"chickwts\",\n       .name_repair = ~ substr(.x, start = 1, stop = 3)\n     )\n\n\n\n## 5. Use Function\n\nReminder\n```\nread_excel(\n  path,\n  sheet = NULL,\n  range = NULL,\n  col_names = TRUE,\n  col_types = NULL,\n  na = \"\",\n  trim_ws = TRUE,\n  skip = 0,\n  n_max = Inf,\n  guess_max = min(1000, n_max),\n  progress = readxl_progress(),\n  .name_repair = \"unique\"\n)\n```\n\nLet's practice\n\n\n::: {.cell}\n\n```{.r .cell-code}\ndf <- read_excel(path = \"data/serodata.xlsx\", sheet = \"Data\")\n```\n:::\n\n\n\n\n## Mini exercise\n\nLet's make some mistakes\n\n1. What if we read in the data without assigning it to an object (i.e., `read_excel(path = \"data/serodata.xlsx\", sheet = \"Data\")`)?\n\n2. What if we forget to specify the sheet argument (i.e., `dd <- read_excel(path = \"data/serodata.xlsx\")`)?\n\n\n## Installing and calling packages - Common confusion\n\nYou only need to install a package once (unless you update R), but you will need to call or load a package each time you want to use it. 
\n\nThe exception to this rule is the \"base\" set of packages (i.e., **Base R**) that are installed automatically when you install R and that are automatically called whenever you open R or RStudio.\n\n\n## Common Error\n\nBe prepared to see the error \n\n\n\n::: {.cell}\n\n```{.r .cell-code}\nError: could not find function \"some_function\"\n```\n:::\n\n\n\nThis usually means that either \n\n- you called the function by the wrong name \n- you have not installed a package that contains the function\n- you have installed a package but you forgot to call it (i.e., `library(package_name)`) -- **most likely**\n\n\n## Export (write) Data \n\n-   Exporting or 'Writing out' data allows you to save modified files for future use or sharing\n-   R can write almost any file format, especially with external, non-Base R, packages\n-   We are going to focus again on writing delimited files\n\n\n## Export delimited data\n\nWithin the Base R 'utils' package we can find a handful of useful functions, including `write.csv()` and `write.table()`, for exporting data.\n\n\n\n::: {.cell}\n::: {.cell-output .cell-output-stdout}\n\n```\nData Output\n\nDescription:\n\n     'write.table' prints its required argument 'x' (after converting\n     it to a data frame if it is not one nor a matrix) to a file or\n     connection.\n\nUsage:\n\n     write.table(x, file = \"\", append = FALSE, quote = TRUE, sep = \" \",\n                 eol = \"\\n\", na = \"NA\", dec = \".\", row.names = TRUE,\n                 col.names = TRUE, qmethod = c(\"escape\", \"double\"),\n                 fileEncoding = \"\")\n     \n     write.csv(...)\n     write.csv2(...)\n     \nArguments:\n\n       x: the object to be written, preferably a matrix or data frame.\n          If not, it is attempted to coerce 'x' to a data frame.\n\n    file: either a character string naming a file or a connection open\n          for writing.  '\"\"' indicates output to the console.\n\n  append: logical. 
Only relevant if 'file' is a character string.  If\n          'TRUE', the output is appended to the file.  If 'FALSE', any\n          existing file of the name is destroyed.\n\n   quote: a logical value ('TRUE' or 'FALSE') or a numeric vector.  If\n          'TRUE', any character or factor columns will be surrounded by\n          double quotes.  If a numeric vector, its elements are taken\n          as the indices of columns to quote.  In both cases, row and\n          column names are quoted if they are written.  If 'FALSE',\n          nothing is quoted.\n\n     sep: the field separator string.  Values within each row of 'x'\n          are separated by this string.\n\n     eol: the character(s) to print at the end of each line (row).  For\n          example, 'eol = \"\\r\\n\"' will produce Windows' line endings on\n          a Unix-alike OS, and 'eol = \"\\r\"' will produce files as\n          expected by Excel:mac 2004.\n\n      na: the string to use for missing values in the data.\n\n     dec: the string to use for decimal points in numeric or complex\n          columns: must be a single character.\n\nrow.names: either a logical value indicating whether the row names of\n          'x' are to be written along with 'x', or a character vector\n          of row names to be written.\n\ncol.names: either a logical value indicating whether the column names\n          of 'x' are to be written along with 'x', or a character\n          vector of column names to be written.  See the section on\n          'CSV files' for the meaning of 'col.names = NA'.\n\n qmethod: a character string specifying how to deal with embedded\n          double quote characters when quoting strings.  Must be one of\n          '\"escape\"' (default for 'write.table'), in which case the\n          quote character is escaped in C style by a backslash, or\n          '\"double\"' (default for 'write.csv' and 'write.csv2'), in\n          which case it is doubled.  
You can specify just the initial\n          letter.\n\nfileEncoding: character string: if non-empty declares the encoding to\n          be used on a file (not a connection) so the character data\n          can be re-encoded as they are written.  See 'file'.\n\n     ...: arguments to 'write.table': 'append', 'col.names', 'sep',\n          'dec' and 'qmethod' cannot be altered.\n\nDetails:\n\n     If the table has no columns the rownames will be written only if\n     'row.names = TRUE', and _vice versa_.\n\n     Real and complex numbers are written to the maximal possible\n     precision.\n\n     If a data frame has matrix-like columns these will be converted to\n     multiple columns in the result (_via_ 'as.matrix') and so a\n     character 'col.names' or a numeric 'quote' should refer to the\n     columns in the result, not the input.  Such matrix-like columns\n     are unquoted by default.\n\n     Any columns in a data frame which are lists or have a class (e.g.,\n     dates) will be converted by the appropriate 'as.character' method:\n     such columns are unquoted by default.  On the other hand, any\n     class information for a matrix is discarded and non-atomic (e.g.,\n     list) matrices are coerced to character.\n\n     Only columns which have been converted to character will be quoted\n     if specified by 'quote'.\n\n     The 'dec' argument only applies to columns that are not subject to\n     conversion to character because they have a class or are part of a\n     matrix-like column (or matrix), in particular to columns protected\n     by 'I()'.  Use 'options(\"OutDec\")' to control such conversions.\n\n     In almost all cases the conversion of numeric quantities is\n     governed by the option '\"scipen\"' (see 'options'), but with the\n     internal equivalent of 'digits = 15'.  
For finer control, use\n     'format' to make a character matrix/data frame, and call\n     'write.table' on that.\n\n     These functions check for a user interrupt every 1000 lines of\n     output.\n\n     If 'file' is a non-open connection, an attempt is made to open it\n     and then close it after use.\n\n     To write a Unix-style file on Windows, use a binary connection\n     e.g. 'file = file(\"filename\", \"wb\")'.\n\nCSV files:\n\n     By default there is no column name for a column of row names.  If\n     'col.names = NA' and 'row.names = TRUE' a blank column name is\n     added, which is the convention used for CSV files to be read by\n     spreadsheets.  Note that such CSV files can be read in R by\n\n       read.csv(file = \"<filename>\", row.names = 1)\n     \n     'write.csv' and 'write.csv2' provide convenience wrappers for\n     writing CSV files.  They set 'sep' and 'dec' (see below), 'qmethod\n     = \"double\"', and 'col.names' to 'NA' if 'row.names = TRUE' (the\n     default) and to 'TRUE' otherwise.\n\n     'write.csv' uses '\".\"' for the decimal point and a comma for the\n     separator.\n\n     'write.csv2' uses a comma for the decimal point and a semicolon\n     for the separator, the Excel convention for CSV files in some\n     Western European locales.\n\n     These wrappers are deliberately inflexible: they are designed to\n     ensure that the correct conventions are used to write a valid\n     file.  Attempts to change 'append', 'col.names', 'sep', 'dec' or\n     'qmethod' are ignored, with a warning.\n\n     CSV files do not record an encoding, and this causes problems if\n     they are not ASCII for many other applications.  
Windows Excel\n     2007/10 will open files (e.g., by the file association mechanism)\n     correctly if they are ASCII or UTF-16 (use 'fileEncoding =\n     \"UTF-16LE\"') or perhaps in the current Windows codepage (e.g.,\n     '\"CP1252\"'), but the 'Text Import Wizard' (from the 'Data' tab)\n     allows far more choice of encodings.  Excel:mac 2004/8 can\n     _import_ only 'Macintosh' (which seems to mean Mac Roman),\n     'Windows' (perhaps Latin-1) and 'PC-8' files.  OpenOffice 3.x asks\n     for the character set when opening the file.\n\n     There is an IETF RFC4180\n     (<https://www.rfc-editor.org/rfc/rfc4180>) for CSV files, which\n     mandates comma as the separator and CRLF line endings.\n     'write.csv' writes compliant files on Windows: use 'eol = \"\\r\\n\"'\n     on other platforms.\n\nNote:\n\n     'write.table' can be slow for data frames with large numbers\n     (hundreds or more) of columns: this is inevitable as each column\n     could be of a different class and so must be handled separately.\n     If they are all of the same class, consider using a matrix\n     instead.\n\nSee Also:\n\n     The 'R Data Import/Export' manual.\n\n     'read.table', 'write'.\n\n     'write.matrix' in package 'MASS'.\n\nExamples:\n\n     x <- data.frame(a = I(\"a \\\" quote\"), b = pi)\n     tf <- tempfile(fileext = \".csv\")\n     \n     ## To write a CSV file for input to Excel one might use\n     write.table(x, file = tf, sep = \",\", col.names = NA,\n                 qmethod = \"double\")\n     file.show(tf)\n     ## and to read this file back into R one needs\n     read.table(tf, header = TRUE, sep = \",\", row.names = 1)\n     ## NB: you do need to specify a separator if qmethod = \"double\".\n     \n     ### Alternatively\n     write.csv(x, file = tf)\n     read.csv(tf, row.names = 1)\n     ## or without row names\n     write.csv(x, file = tf, row.names = FALSE)\n     read.csv(tf)\n     \n     ## Not run:\n     \n     ## To write a file in Mac Roman 
for simple use in Mac Excel 2004/8\n     write.csv(x, file = \"foo.csv\", fileEncoding = \"macroman\")\n     ## or for Windows Excel 2007/10\n     write.csv(x, file = \"foo.csv\", fileEncoding = \"UTF-16LE\")\n     ## End(Not run)\n```\n\n\n:::\n:::\n\n\n\n## Export delimited data\n\n\n\n::: {.cell}\n\n```{.r .cell-code}\nwrite.csv(df, file=\"data/serodata_new.csv\", row.names = FALSE) #comma delimited\nwrite.table(df, file=\"data/serodata1_new.txt\", sep=\"\\t\", row.names = FALSE) #tab delimited\nwrite.table(df, file=\"data/serodata2_new.txt\", sep=\";\", row.names = FALSE) #semicolon delimited\n```\n:::\n\n\n\nNote, I wrote the data to new file names.  Even though we didn't change the data at all in this module, it is good practice to keep raw data raw, and not to write over it.\n\n## R .rds and .rda/RData files\n\nThere are two file extensions worth discussing.\n\nR has two native data formats: Rdata (sometimes shortened to Rda) and Rds. These formats are used when R objects are saved for later use. Rdata is used to save multiple R objects, while Rds is used to save a single R object. \n\n## .rds binary file\n\nSaving datasets in `.rds` format can save time if you have to read them back in later.\n\n`write_rds()` and `read_rds()` from the `readr` package can be used to write/read a single R object to/from a file.\n\n```\nlibrary(readr)\nwrite_rds(object1, file = \"filename.rds\")\nobject1 <- read_rds(file = \"filename.rds\")\n```\n\n\n## .rda/RData files \n\nThe Base R functions `save()` and `load()` can be used to save and load multiple R objects. \n\n`save()` writes an external representation of R objects to the specified file, which can be loaded back into the environment using `load()`. A nice feature about using `save` and `load` is that the R object is directly imported into the environment and you don't have to assign it to an object. 
The files can be saved as `.RData` or `.rda` files.\n\n```\nsave(object1, object2, file = \"filename.RData\")\nload(\"filename.RData\")\n```\n\nNote that when you read .RData files you don't need to assign the result to an object.  `load()` simply reads in the objects as they were saved.  Therefore, `load(\"filename.RData\")` will read `object1` and `object2` directly into the Global Environment.\n\n\n\n## Summary\n\n- Importing or 'Reading in' data is the first step of any real project/analysis\n- Within the Base R 'utils' package we can find a handful of useful functions, including `read.csv()` and `read.delim()` for importing/reading data and `write.csv()` and `write.table()` for exporting/writing data\n- When importing data (exception is object from .RData), you must assign it to an object, otherwise it cannot be called/used\n- Properly read data can be found in the Environment window of RStudio\n- You only need to install a package once (unless you update R), but you will need to call or load a package each time you want to use it. \n- To complete a task you don't know how to do (e.g., reading in an Excel data file) use the following steps: 1. Google / Ask ChatGPT, 2. Find and vet function and package you want, 3. Install package, 4. Call package, 5. Use function\n\n\n## Acknowledgements\n\nThese are the materials we looked through, modified, or extracted to complete this module's lecture.\n\n-   [\"Introduction to R for Public Health Researchers\" Johns Hopkins University](https://jhudatascience.org/intro_to_r/)\n\n",
+    "markdown": "---\ntitle: \"Module 5: Data Import and Export\"\nformat: \n  revealjs:\n    scrollable: true\n    smaller: true\n    toc: false\n---\n\n\n## Learning Objectives\n\nAfter module 5, you should be able to...\n\n-   Use Base R functions to load data\n-   Install and attach external R Packages to extend R's functionality\n-   Load any type of data into R\n-   Find loaded data in the Environment pane of RStudio\n-   Read and write R .Rds and .Rda/.RData files\n\n\n## Import (read) Data\n\n-   Importing or 'Reading in' data is the first step of any real project / data analysis\n-   R can read almost any file format, especially with external, non-Base R, packages\n-   We are going to focus on simple delimited files first. \n    -   comma separated (e.g. '.csv')\n    -   tab delimited (e.g. '.txt')\n\nA delimited file is a sequential file with column delimiters. Each delimited file is a stream of records, which consists of fields that are ordered by column. Each record contains fields for one row. Within each row, individual fields are separated by column **delimiters** (IBM.com definition)\n\n## Mini exercise\n\n1. Download Module 5 data from the website and save the data to your data subdirectory -- specifically `SISMID_IntroToR_RProject/data`\n\n1. Open the '.csv' and '.txt' data files in a text editor application and familiarize yourself with the data (e.g., Notepad for Windows and TextEdit for Mac)\n\n1. Open the '.xlsx' data file in Excel and familiarize yourself with the data\n\t\t-\t\tif you use a Mac **do not** open in Numbers, it can corrupt the file\n\t\t-\t\tif you do not have Excel, you can upload it to Google Sheets\n\n1. 
Determine the delimiter of the two '.txt' files\n\n\n## Import delimited data\n\nWithin the Base R 'utils' package we can find a handful of useful functions, including `read.csv()` and `read.delim()`, for importing data.\n\n\n::: {.cell}\n\n```{.r .cell-code}\n?read.csv\n```\n:::\n\n::: {.cell}\n::: {.cell-output .cell-output-stderr}\n```\nRegistered S3 method overwritten by 'printr':\n  method                from     \n  knit_print.data.frame rmarkdown\n```\n:::\n\n::: {.cell-output .cell-output-stdout}\n```\nData Input\n\nDescription:\n\n     Reads a file in table format and creates a data frame from it,\n     with cases corresponding to lines and variables to fields in the\n     file.\n\nUsage:\n\n     read.table(file, header = FALSE, sep = \"\", quote = \"\\\"'\",\n                dec = \".\", numerals = c(\"allow.loss\", \"warn.loss\", \"no.loss\"),\n                row.names, col.names, as.is = !stringsAsFactors, tryLogical = TRUE,\n                na.strings = \"NA\", colClasses = NA, nrows = -1,\n                skip = 0, check.names = TRUE, fill = !blank.lines.skip,\n                strip.white = FALSE, blank.lines.skip = TRUE,\n                comment.char = \"#\",\n                allowEscapes = FALSE, flush = FALSE,\n                stringsAsFactors = FALSE,\n                fileEncoding = \"\", encoding = \"unknown\", text, skipNul = FALSE)\n     \n     read.csv(file, header = TRUE, sep = \",\", quote = \"\\\"\",\n              dec = \".\", fill = TRUE, comment.char = \"\", ...)\n     \n     read.csv2(file, header = TRUE, sep = \";\", quote = \"\\\"\",\n               dec = \",\", fill = TRUE, comment.char = \"\", ...)\n     \n     read.delim(file, header = TRUE, sep = \"\\t\", quote = \"\\\"\",\n                dec = \".\", fill = TRUE, comment.char = \"\", ...)\n     \n     read.delim2(file, header = TRUE, sep = \"\\t\", quote = \"\\\"\",\n                 dec = \",\", fill = TRUE, comment.char = \"\", ...)\n     \nArguments:\n\n    file: the name of 
the file which the data are to be read from.\n          Each row of the table appears as one line of the file.  If it\n          does not contain an _absolute_ path, the file name is\n          _relative_ to the current working directory, 'getwd()'.\n          Tilde-expansion is performed where supported.  This can be a\n          compressed file (see 'file').\n\n          Alternatively, 'file' can be a readable text-mode connection\n          (which will be opened for reading if necessary, and if so\n          'close'd (and hence destroyed) at the end of the function\n          call).  (If 'stdin()' is used, the prompts for lines may be\n          somewhat confusing.  Terminate input with a blank line or an\n          EOF signal, 'Ctrl-D' on Unix and 'Ctrl-Z' on Windows.  Any\n          pushback on 'stdin()' will be cleared before return.)\n\n          'file' can also be a complete URL.  (For the supported URL\n          schemes, see the 'URLs' section of the help for 'url'.)\n\n  header: a logical value indicating whether the file contains the\n          names of the variables as its first line.  If missing, the\n          value is determined from the file format: 'header' is set to\n          'TRUE' if and only if the first row contains one fewer field\n          than the number of columns.\n\n     sep: the field separator character.  Values on each line of the\n          file are separated by this character.  If 'sep = \"\"' (the\n          default for 'read.table') the separator is 'white space',\n          that is one or more spaces, tabs, newlines or carriage\n          returns.\n\n   quote: the set of quoting characters. To disable quoting altogether,\n          use 'quote = \"\"'.  See 'scan' for the behaviour on quotes\n          embedded in quotes.  
Quoting is only considered for columns\n          read as character, which is all of them unless 'colClasses'\n          is specified.\n\n     dec: the character used in the file for decimal points.\n\nnumerals: string indicating how to convert numbers whose conversion to\n          double precision would lose accuracy, see 'type.convert'.\n          Can be abbreviated.  (Applies also to complex-number inputs.)\n\nrow.names: a vector of row names.  This can be a vector giving the\n          actual row names, or a single number giving the column of the\n          table which contains the row names, or character string\n          giving the name of the table column containing the row names.\n\n          If there is a header and the first row contains one fewer\n          field than the number of columns, the first column in the\n          input is used for the row names.  Otherwise if 'row.names' is\n          missing, the rows are numbered.\n\n          Using 'row.names = NULL' forces row numbering. Missing or\n          'NULL' 'row.names' generate row names that are considered to\n          be 'automatic' (and not preserved by 'as.matrix').\n\ncol.names: a vector of optional names for the variables.  The default\n          is to use '\"V\"' followed by the column number.\n\n   as.is: controls conversion of character variables (insofar as they\n          are not converted to logical, numeric or complex) to factors,\n          if not otherwise specified by 'colClasses'.  
Its value is\n          either a vector of logicals (values are recycled if\n          necessary), or a vector of numeric or character indices which\n          specify which columns should not be converted to factors.\n\n          Note: to suppress all conversions including those of numeric\n          columns, set 'colClasses = \"character\"'.\n\n          Note that 'as.is' is specified per column (not per variable)\n          and so includes the column of row names (if any) and any\n          columns to be skipped.\n\ntryLogical: a 'logical' determining if columns consisting entirely of\n          '\"F\"', '\"T\"', '\"FALSE\"', and '\"TRUE\"' should be converted to\n          'logical'; passed to 'type.convert', true by default.\n\nna.strings: a character vector of strings which are to be interpreted\n          as 'NA' values.  Blank fields are also considered to be\n          missing values in logical, integer, numeric and complex\n          fields.  Note that the test happens _after_ white space is\n          stripped from the input, so 'na.strings' values may need\n          their own white space stripped in advance.\n\ncolClasses: character.  A vector of classes to be assumed for the\n          columns.  If unnamed, recycled as necessary.  
If named, names\n          are matched with unspecified values being taken to be 'NA'.\n\n          Possible values are 'NA' (the default, when 'type.convert' is\n          used), '\"NULL\"' (when the column is skipped), one of the\n          atomic vector classes (logical, integer, numeric, complex,\n          character, raw), or '\"factor\"', '\"Date\"' or '\"POSIXct\"'.\n          Otherwise there needs to be an 'as' method (from package\n          'methods') for conversion from '\"character\"' to the specified\n          formal class.\n\n          Note that 'colClasses' is specified per column (not per\n          variable) and so includes the column of row names (if any).\n\n   nrows: integer: the maximum number of rows to read in.  Negative and\n          other invalid values are ignored.\n\n    skip: integer: the number of lines of the data file to skip before\n          beginning to read data.\n\ncheck.names: logical.  If 'TRUE' then the names of the variables in the\n          data frame are checked to ensure that they are syntactically\n          valid variable names.  If necessary they are adjusted (by\n          'make.names') so that they are, and also to ensure that there\n          are no duplicates.\n\n    fill: logical. If 'TRUE' then in case the rows have unequal length,\n          blank fields are implicitly added.  See 'Details'.\n\nstrip.white: logical. Used only when 'sep' has been specified, and\n          allows the stripping of leading and trailing white space from\n          unquoted 'character' fields ('numeric' fields are always\n          stripped).  See 'scan' for further details (including the\n          exact meaning of 'white space'), remembering that the columns\n          may include the row names.\n\nblank.lines.skip: logical: if 'TRUE' blank lines in the input are\n          ignored.\n\ncomment.char: character: a character vector of length one containing a\n          single character or an empty string.  
Use '\"\"' to turn off\n          the interpretation of comments altogether.\n\nallowEscapes: logical.  Should C-style escapes such as '\\n' be\n          processed or read verbatim (the default)?  Note that if not\n          within quotes these could be interpreted as a delimiter (but\n          not as a comment character).  For more details see 'scan'.\n\n   flush: logical: if 'TRUE', 'scan' will flush to the end of the line\n          after reading the last of the fields requested.  This allows\n          putting comments after the last field.\n\nstringsAsFactors: logical: should character vectors be converted to\n          factors?  Note that this is overridden by 'as.is' and\n          'colClasses', both of which allow finer control.\n\nfileEncoding: character string: if non-empty declares the encoding used\n          on a file (not a connection) so the character data can be\n          re-encoded.  See the 'Encoding' section of the help for\n          'file', the 'R Data Import/Export' manual and 'Note'.\n\nencoding: encoding to be assumed for input strings.  It is used to mark\n          character strings as known to be in Latin-1 or UTF-8 (see\n          'Encoding'): it is not used to re-encode the input, but\n          allows R to handle encoded strings in their native encoding\n          (if one of those two).  
See 'Value' and 'Note'.\n\n    text: character string: if 'file' is not supplied and this is, then\n          data are read from the value of 'text' via a text connection.\n          Notice that a literal string can be used to include (small)\n          data sets within R code.\n\n skipNul: logical: should nuls be skipped?\n\n     ...: Further arguments to be passed to 'read.table'.\n\nDetails:\n\n     This function is the principal means of reading tabular data into\n     R.\n\n     Unless 'colClasses' is specified, all columns are read as\n     character columns and then converted using 'type.convert' to\n     logical, integer, numeric, complex or (depending on 'as.is')\n     factor as appropriate.  Quotes are (by default) interpreted in all\n     fields, so a column of values like '\"42\"' will result in an\n     integer column.\n\n     A field or line is 'blank' if it contains nothing (except\n     whitespace if no separator is specified) before a comment\n     character or the end of the field or line.\n\n     If 'row.names' is not specified and the header line has one less\n     entry than the number of columns, the first column is taken to be\n     the row names.  This allows data frames to be read in from the\n     format in which they are printed.  If 'row.names' is specified and\n     does not refer to the first column, that column is discarded from\n     such files.\n\n     The number of data columns is determined by looking at the first\n     five lines of input (or the whole input if it has less than five\n     lines), or from the length of 'col.names' if it is specified and\n     is longer.  This could conceivably be wrong if 'fill' or\n     'blank.lines.skip' are true, so specify 'col.names' if necessary\n     (as in the 'Examples').\n\n     'read.csv' and 'read.csv2' are identical to 'read.table' except\n     for the defaults.  
They are intended for reading 'comma separated\n     value' files ('.csv') or ('read.csv2') the variant used in\n     countries that use a comma as decimal point and a semicolon as\n     field separator.  Similarly, 'read.delim' and 'read.delim2' are\n     for reading delimited files, defaulting to the TAB character for\n     the delimiter.  Notice that 'header = TRUE' and 'fill = TRUE' in\n     these variants, and that the comment character is disabled.\n\n     The rest of the line after a comment character is skipped; quotes\n     are not processed in comments.  Complete comment lines are allowed\n     provided 'blank.lines.skip = TRUE'; however, comment lines prior\n     to the header must have the comment character in the first\n     non-blank column.\n\n     Quoted fields with embedded newlines are supported except after a\n     comment character.  Embedded nuls are unsupported: skipping them\n     (with 'skipNul = TRUE') may work.\n\nValue:\n\n     A data frame ('data.frame') containing a representation of the\n     data in the file.\n\n     Empty input is an error unless 'col.names' is specified, when a\n     0-row data frame is returned: similarly giving just a header line\n     if 'header = TRUE' results in a 0-row data frame.  Note that in\n     either case the columns will be logical unless 'colClasses' was\n     supplied.\n\n     Character strings in the result (including factor levels) will\n     have a declared encoding if 'encoding' is '\"latin1\"' or '\"UTF-8\"'.\n\nCSV files:\n\n     See the help on 'write.csv' for the various conventions for '.csv'\n     files.  The commonest form of CSV file with row names needs to be\n     read with 'read.csv(..., row.names = 1)' to use the names in the\n     first column of the file as row names.\n\nMemory usage:\n\n     These functions can use a surprising amount of memory when reading\n     large files.  
There is extensive discussion in the 'R Data\n     Import/Export' manual, supplementing the notes here.\n\n     Less memory will be used if 'colClasses' is specified as one of\n     the six atomic vector classes.  This can be particularly so when\n     reading a column that takes many distinct numeric values, as\n     storing each distinct value as a character string can take up to\n     14 times as much memory as storing it as an integer.\n\n     Using 'nrows', even as a mild over-estimate, will help memory\n     usage.\n\n     Using 'comment.char = \"\"' will be appreciably faster than the\n     'read.table' default.\n\n     'read.table' is not the right tool for reading large matrices,\n     especially those with many columns: it is designed to read _data\n     frames_ which may have columns of very different classes.  Use\n     'scan' instead for matrices.\n\nNote:\n\n     The columns referred to in 'as.is' and 'colClasses' include the\n     column of row names (if any).\n\n     There are two approaches for reading input that is not in the\n     local encoding.  If the input is known to be UTF-8 or Latin1, use\n     the 'encoding' argument to declare that.  If the input is in some\n     other encoding, then it may be translated on input.  The\n     'fileEncoding' argument achieves this by setting up a connection\n     to do the re-encoding into the current locale.  Note that on\n     Windows or other systems not running in a UTF-8 locale, this may\n     not be possible.\n\nReferences:\n\n     Chambers, J. M. (1992) _Data for models._ Chapter 3 of\n     _Statistical Models in S_ eds J. M. Chambers and T. J. 
Hastie,\n     Wadsworth & Brooks/Cole.\n\nSee Also:\n\n     The 'R Data Import/Export' manual.\n\n     'scan', 'type.convert', 'read.fwf' for reading _f_ixed _w_idth\n     _f_ormatted input; 'write.table'; 'data.frame'.\n\n     'count.fields' can be useful to determine problems with reading\n     files which result in reports of incorrect record lengths (see the\n     'Examples' below).\n\n     <https://www.rfc-editor.org/rfc/rfc4180> for the IANA definition\n     of CSV files (which requires comma as separator and CRLF line\n     endings).\n\nExamples:\n\n     ## using count.fields to handle unknown maximum number of fields\n     ## when fill = TRUE\n     test1 <- c(1:5, \"6,7\", \"8,9,10\")\n     tf <- tempfile()\n     writeLines(test1, tf)\n     \n     read.csv(tf, fill = TRUE) # 1 column\n     ncol <- max(count.fields(tf, sep = \",\"))\n     read.csv(tf, fill = TRUE, header = FALSE,\n              col.names = paste0(\"V\", seq_len(ncol)))\n     unlink(tf)\n     \n     ## \"Inline\" data set, using text=\n     ## Notice that leading and trailing empty lines are auto-trimmed\n     \n     read.table(header = TRUE, text = \"\n     a b\n     1 2\n     3 4\n     \")\n```\n:::\n:::\n\n\n## Import .csv files\n\nFunction signature reminder\n```\nread.csv(file, header = TRUE, sep = \",\", quote = \"\\\"\",\n         dec = \".\", fill = TRUE, comment.char = \"\", ...)\n```\n\n\t\t-\t\t`file` is the first argument and is the path to your file, in quotes \n\t\t\n\t\t\t\t-\t\tcan be a path on your local computer -- absolute file path or relative file path \n\t\t\t\t-\t\tcan be a path to a file on a website\n\n## Mini exercise\n\nIf your R Project is not already open, open it so we can take advantage of the working directory it sets for us when importing data.\n\n\n## Import .csv files\n\nLet's import a new data file\n\n\n::: {.cell}\n\n```{.r .cell-code}\n## Examples\ndf <- read.csv(file = \"data/serodata.csv\") #relative path\n```\n:::\n\n\n\nNote #1, I assigned the data 
frame to an object called `df`.  I could have called the data anything, but in order to use the data (i.e., as an object we can find in the Environment), I need to assign it to an object. \n\nNote #2, Look at the Environment pane and you will see the `df` object ready to be used.\n\n\n## Import .txt files\n\n`read.csv()` and `read.delim()` are both special cases of `read.table()` -- a general function to read a delimited file into a data frame  \n\nFunction signature reminder\n```\nread.delim(file, header = TRUE, sep = \"\\t\", quote = \"\\\"\",\n           dec = \".\", fill = TRUE, comment.char = \"\", ...)\n```\n\n\t\t- `file` is the path to your file, in quotes \n\t\t- `sep` is what separates the fields within a record. The default for `read.delim()` is a tab; the default for `read.csv()` is a comma\n\n## Import .txt files\n\nLet's first import 'serodata1.txt', which uses a tab delimiter, and then 'serodata2.txt', which uses a semicolon delimiter.\n\n\n\n::: {.cell}\n\n```{.r .cell-code}\n## Examples\ndf <- read.delim(file = \"data/serodata1.txt\", sep = \"\\t\")\ndf <- read.delim(file = \"data/serodata2.txt\", sep = \";\")\n```\n:::\n\n\nThe dataset is now successfully read into your R workspace, **many times actually.** Notice that each time we imported the data we assigned it to the `df` object, meaning we replaced the previous contents of `df` each time we reassigned it.  \n\n\n## What if we have a .xlsx file - what do we do?\n\n1. Google / Ask ChatGPT\n2. Find and vet function and package you want\n3. Install package\n4. Attach package\n5. Use function\n\n\n## 1. Internet Search\n\n\n::: {.cell}\n::: {.cell-output-display}\n![](images/ChatGPT.png){width=100%}\n:::\n\n::: {.cell-output-display}\n![](images/GoogleSearch.png){width=100%}\n:::\n\n::: {.cell-output-display}\n![](images/StackOverflow.png){width=100%}\n:::\n:::\n\n\n## 2. Find and vet function and package you want\n\nI am getting a consistent message to use the `read_excel()` function found in the `readxl` package.  This package was developed by Hadley Wickham, who we know is reputable. 
Also, you can check that the data were read in correctly, because this is a straightforward task. \n\n## 3. Install Package\n\nTo use the bundle or \"package\" of code (and possibly data) from a package, you need to install and also call the package.\n\nTo install a package you can \n\n1. go to Tools ---\\> Install Packages in the RStudio header\n\nOR\n\n2. use the following code:\n\n::: {.cell}\n\n```{.r .cell-code}\ninstall.packages(\"package_name\")\n```\n:::\n\n\n\nTherefore,\n\n\n::: {.cell}\n\n```{.r .cell-code}\ninstall.packages(\"readxl\")\n```\n:::\n\n\n## 4. Attach Package\n\nReminder - To attach a package (i.e., to be able to use it) you can use the following code:\n\n::: {.cell}\n\n```{.r .cell-code}\nrequire(package_name)\n```\n:::\n\n\nTherefore, \n\n\n::: {.cell}\n\n```{.r .cell-code}\nrequire(readxl)\n```\n:::\n\n\n## 5. Use Function\n\n\n::: {.cell}\n\n```{.r .cell-code}\n?read_excel\n```\n:::\n\nRead xls and xlsx files\n\nDescription:\n\n     Read xls and xlsx files\n\n     'read_excel()' calls 'excel_format()' to determine if 'path' is\n     xls or xlsx, based on the file extension and the file itself, in\n     that order. 
Use 'read_xls()' and 'read_xlsx()' directly if you\n     know better and want to prevent such guessing.\n\nUsage:\n\n     read_excel(\n       path,\n       sheet = NULL,\n       range = NULL,\n       col_names = TRUE,\n       col_types = NULL,\n       na = \"\",\n       trim_ws = TRUE,\n       skip = 0,\n       n_max = Inf,\n       guess_max = min(1000, n_max),\n       progress = readxl_progress(),\n       .name_repair = \"unique\"\n     )\n     \n     read_xls(\n       path,\n       sheet = NULL,\n       range = NULL,\n       col_names = TRUE,\n       col_types = NULL,\n       na = \"\",\n       trim_ws = TRUE,\n       skip = 0,\n       n_max = Inf,\n       guess_max = min(1000, n_max),\n       progress = readxl_progress(),\n       .name_repair = \"unique\"\n     )\n     \n     read_xlsx(\n       path,\n       sheet = NULL,\n       range = NULL,\n       col_names = TRUE,\n       col_types = NULL,\n       na = \"\",\n       trim_ws = TRUE,\n       skip = 0,\n       n_max = Inf,\n       guess_max = min(1000, n_max),\n       progress = readxl_progress(),\n       .name_repair = \"unique\"\n     )\n     \nArguments:\n\n    path: Path to the xls/xlsx file.\n\n   sheet: Sheet to read. Either a string (the name of a sheet), or an\n          integer (the position of the sheet). Ignored if the sheet is\n          specified via 'range'. If neither argument specifies the\n          sheet, defaults to the first sheet.\n\n   range: A cell range to read from, as described in\n          cell-specification. Includes typical Excel ranges like\n          \"B3:D87\", possibly including the sheet name like\n          \"Budget!B2:G14\", and more. Interpreted strictly, even if the\n          range forces the inclusion of leading or trailing empty rows\n          or columns. 
Takes precedence over 'skip', 'n_max' and\n          'sheet'.\n\ncol_names: 'TRUE' to use the first row as column names, 'FALSE' to get\n          default names, or a character vector giving a name for each\n          column. If user provides 'col_types' as a vector, 'col_names'\n          can have one entry per column, i.e. have the same length as\n          'col_types', or one entry per unskipped column.\n\ncol_types: Either 'NULL' to guess all from the spreadsheet or a\n          character vector containing one entry per column from these\n          options: \"skip\", \"guess\", \"logical\", \"numeric\", \"date\",\n          \"text\" or \"list\". If exactly one 'col_type' is specified, it\n          will be recycled. The content of a cell in a skipped column\n          is never read and that column will not appear in the data\n          frame output. A list cell loads a column as a list of length\n          1 vectors, which are typed using the type guessing logic from\n          'col_types = NULL', but on a cell-by-cell basis.\n\n      na: Character vector of strings to interpret as missing values.\n          By default, readxl treats blank cells as missing data.\n\n trim_ws: Should leading and trailing whitespace be trimmed?\n\n    skip: Minimum number of rows to skip before reading anything, be it\n          column names or data. Leading empty rows are automatically\n          skipped, so this is a lower bound. Ignored if 'range' is\n          given.\n\n   n_max: Maximum number of data rows to read. Trailing empty rows are\n          automatically skipped, so this is an upper bound on the\n          number of rows in the returned tibble. Ignored if 'range' is\n          given.\n\nguess_max: Maximum number of data rows to use for guessing column\n          types.\n\nprogress: Display a progress spinner? 
By default, the spinner appears\n          only in an interactive session, outside the context of\n          knitting a document, and when the call is likely to run for\n          several seconds or more. See 'readxl_progress()' for more\n          details.\n\n.name_repair: Handling of column names. Passed along to\n          'tibble::as_tibble()'. readxl's default is `.name_repair =\n          \"unique\", which ensures column names are not empty and are\n          unique.\n\nValue:\n\n     A tibble\n\nSee Also:\n\n     cell-specification for more details on targetting cells with the\n     'range' argument\n\nExamples:\n\n     datasets <- readxl_example(\"datasets.xlsx\")\n     read_excel(datasets)\n     \n     # Specify sheet either by position or by name\n     read_excel(datasets, 2)\n     read_excel(datasets, \"mtcars\")\n     \n     # Skip rows and use default column names\n     read_excel(datasets, skip = 148, col_names = FALSE)\n     \n     # Recycle a single column type\n     read_excel(datasets, col_types = \"text\")\n     \n     # Specify some col_types and guess others\n     read_excel(datasets, col_types = c(\"text\", \"guess\", \"numeric\", \"guess\", \"guess\"))\n     \n     # Accomodate a column with disparate types via col_type = \"list\"\n     df <- read_excel(readxl_example(\"clippy.xlsx\"), col_types = c(\"text\", \"list\"))\n     df\n     df$value\n     sapply(df$value, class)\n     \n     # Limit the number of data rows read\n     read_excel(datasets, n_max = 3)\n     \n     # Read from an Excel range using A1 or R1C1 notation\n     read_excel(datasets, range = \"C1:E7\")\n     read_excel(datasets, range = \"R1C2:R2C5\")\n     \n     # Specify the sheet as part of the range\n     read_excel(datasets, range = \"mtcars!B1:D5\")\n     \n     # Read only specific rows or columns\n     read_excel(datasets, range = cell_rows(102:151), col_names = FALSE)\n     read_excel(datasets, range = cell_cols(\"B:D\"))\n     \n     # Get a preview of column 
names\n     names(read_excel(readxl_example(\"datasets.xlsx\"), n_max = 0))\n     \n     # exploit full .name_repair flexibility from tibble\n     \n     # \"universal\" names are unique and syntactic\n     read_excel(\n       readxl_example(\"deaths.xlsx\"),\n       range = \"arts!A5:F15\",\n       .name_repair = \"universal\"\n     )\n     \n     # specify name repair as a built-in function\n     read_excel(readxl_example(\"clippy.xlsx\"), .name_repair = toupper)\n     \n     # specify name repair as a custom function\n     my_custom_name_repair <- function(nms) tolower(gsub(\"[.]\", \"_\", nms))\n     read_excel(\n       readxl_example(\"datasets.xlsx\"),\n       .name_repair = my_custom_name_repair\n     )\n     \n     # specify name repair as an anonymous function\n     read_excel(\n       readxl_example(\"datasets.xlsx\"),\n       sheet = \"chickwts\",\n       .name_repair = ~ substr(.x, start = 1, stop = 3)\n     )\n\n\n## 5. Use Function\n\nReminder of function signature\n```\nread_excel(\n  path,\n  sheet = NULL,\n  range = NULL,\n  col_names = TRUE,\n  col_types = NULL,\n  na = \"\",\n  trim_ws = TRUE,\n  skip = 0,\n  n_max = Inf,\n  guess_max = min(1000, n_max),\n  progress = readxl_progress(),\n  .name_repair = \"unique\"\n)\n```\n\nLet's practice\n\n::: {.cell}\n\n```{.r .cell-code}\ndf <- read_excel(path = \"data/serodata.xlsx\", sheet = \"Data\")\n```\n:::\n\n\n\n## Let's make some mistakes\n\n1. What if we read in the data without assigning it to an object (i.e., `read_excel(path = \"data/serodata.xlsx\", sheet = \"Data\")`)?\n\n2. What if we forget to specify the sheet argument (i.e., `dd <- read_excel(path = \"data/serodata.xlsx\")`)?\n\n\n## Installing and calling packages - Common confusion\n\n</br>\n\nYou only need to install a package once (unless you update R or want to update the package), but you will need to call or load a package each time you want to use it. 
\n\n</br>\n\nThe exception to this rule is the \"base\" set of packages (i.e., **Base R**) that are installed automatically when you install R and that are automatically called whenever you open R or RStudio.\n\n\n## Common Error\n\nBe prepared to see this error\n\n\n::: {.cell}\n\n```{.r .cell-code}\nError: could not find function \"some_function_name\"\n```\n:::\n\n\nThis usually means that either \n\n- you called the function by the wrong name \n- you have not installed a package that contains the function\n- you have installed a package but you forgot to attach it (i.e., `require(package_name)`) -- **most likely**\n\n\n## Export (write) Data \n\n-   Exporting or 'Writing out' data allows you to save modified files for future use or sharing\n-   R can write almost any file format, especially with external, non-Base R, packages\n-   We are going to focus again on writing delimited files\n\n\n## Export delimited data\n\nWithin the Base R 'utils' package we can find a handful of useful functions, including `write.csv()` and `write.table()`, for exporting data.\n\n\n::: {.cell}\n::: {.cell-output .cell-output-stdout}\n```\nData Output\n\nDescription:\n\n     'write.table' prints its required argument 'x' (after converting\n     it to a data frame if it is not one nor a matrix) to a file or\n     connection.\n\nUsage:\n\n     write.table(x, file = \"\", append = FALSE, quote = TRUE, sep = \" \",\n                 eol = \"\\n\", na = \"NA\", dec = \".\", row.names = TRUE,\n                 col.names = TRUE, qmethod = c(\"escape\", \"double\"),\n                 fileEncoding = \"\")\n     \n     write.csv(...)\n     write.csv2(...)\n     \nArguments:\n\n       x: the object to be written, preferably a matrix or data frame.\n          If not, it is attempted to coerce 'x' to a data frame.\n\n    file: either a character string naming a file or a connection open\n          for writing.  '\"\"' indicates output to the console.\n\n  append: logical. 
Only relevant if 'file' is a character string.  If\n          'TRUE', the output is appended to the file.  If 'FALSE', any\n          existing file of the name is destroyed.\n\n   quote: a logical value ('TRUE' or 'FALSE') or a numeric vector.  If\n          'TRUE', any character or factor columns will be surrounded by\n          double quotes.  If a numeric vector, its elements are taken\n          as the indices of columns to quote.  In both cases, row and\n          column names are quoted if they are written.  If 'FALSE',\n          nothing is quoted.\n\n     sep: the field separator string.  Values within each row of 'x'\n          are separated by this string.\n\n     eol: the character(s) to print at the end of each line (row).  For\n          example, 'eol = \"\\r\\n\"' will produce Windows' line endings on\n          a Unix-alike OS, and 'eol = \"\\r\"' will produce files as\n          expected by Excel:mac 2004.\n\n      na: the string to use for missing values in the data.\n\n     dec: the string to use for decimal points in numeric or complex\n          columns: must be a single character.\n\nrow.names: either a logical value indicating whether the row names of\n          'x' are to be written along with 'x', or a character vector\n          of row names to be written.\n\ncol.names: either a logical value indicating whether the column names\n          of 'x' are to be written along with 'x', or a character\n          vector of column names to be written.  See the section on\n          'CSV files' for the meaning of 'col.names = NA'.\n\n qmethod: a character string specifying how to deal with embedded\n          double quote characters when quoting strings.  Must be one of\n          '\"escape\"' (default for 'write.table'), in which case the\n          quote character is escaped in C style by a backslash, or\n          '\"double\"' (default for 'write.csv' and 'write.csv2'), in\n          which case it is doubled.  
You can specify just the initial\n          letter.\n\nfileEncoding: character string: if non-empty declares the encoding to\n          be used on a file (not a connection) so the character data\n          can be re-encoded as they are written.  See 'file'.\n\n     ...: arguments to 'write.table': 'append', 'col.names', 'sep',\n          'dec' and 'qmethod' cannot be altered.\n\nDetails:\n\n     If the table has no columns the rownames will be written only if\n     'row.names = TRUE', and _vice versa_.\n\n     Real and complex numbers are written to the maximal possible\n     precision.\n\n     If a data frame has matrix-like columns these will be converted to\n     multiple columns in the result (_via_ 'as.matrix') and so a\n     character 'col.names' or a numeric 'quote' should refer to the\n     columns in the result, not the input.  Such matrix-like columns\n     are unquoted by default.\n\n     Any columns in a data frame which are lists or have a class (e.g.,\n     dates) will be converted by the appropriate 'as.character' method:\n     such columns are unquoted by default.  On the other hand, any\n     class information for a matrix is discarded and non-atomic (e.g.,\n     list) matrices are coerced to character.\n\n     Only columns which have been converted to character will be quoted\n     if specified by 'quote'.\n\n     The 'dec' argument only applies to columns that are not subject to\n     conversion to character because they have a class or are part of a\n     matrix-like column (or matrix), in particular to columns protected\n     by 'I()'.  Use 'options(\"OutDec\")' to control such conversions.\n\n     In almost all cases the conversion of numeric quantities is\n     governed by the option '\"scipen\"' (see 'options'), but with the\n     internal equivalent of 'digits = 15'.  
For finer control, use\n     'format' to make a character matrix/data frame, and call\n     'write.table' on that.\n\n     These functions check for a user interrupt every 1000 lines of\n     output.\n\n     If 'file' is a non-open connection, an attempt is made to open it\n     and then close it after use.\n\n     To write a Unix-style file on Windows, use a binary connection\n     e.g. 'file = file(\"filename\", \"wb\")'.\n\nCSV files:\n\n     By default there is no column name for a column of row names.  If\n     'col.names = NA' and 'row.names = TRUE' a blank column name is\n     added, which is the convention used for CSV files to be read by\n     spreadsheets.  Note that such CSV files can be read in R by\n\n       read.csv(file = \"<filename>\", row.names = 1)\n     \n     'write.csv' and 'write.csv2' provide convenience wrappers for\n     writing CSV files.  They set 'sep' and 'dec' (see below), 'qmethod\n     = \"double\"', and 'col.names' to 'NA' if 'row.names = TRUE' (the\n     default) and to 'TRUE' otherwise.\n\n     'write.csv' uses '\".\"' for the decimal point and a comma for the\n     separator.\n\n     'write.csv2' uses a comma for the decimal point and a semicolon\n     for the separator, the Excel convention for CSV files in some\n     Western European locales.\n\n     These wrappers are deliberately inflexible: they are designed to\n     ensure that the correct conventions are used to write a valid\n     file.  Attempts to change 'append', 'col.names', 'sep', 'dec' or\n     'qmethod' are ignored, with a warning.\n\n     CSV files do not record an encoding, and this causes problems if\n     they are not ASCII for many other applications.  
Windows Excel\n     2007/10 will open files (e.g., by the file association mechanism)\n     correctly if they are ASCII or UTF-16 (use 'fileEncoding =\n     \"UTF-16LE\"') or perhaps in the current Windows codepage (e.g.,\n     '\"CP1252\"'), but the 'Text Import Wizard' (from the 'Data' tab)\n     allows far more choice of encodings.  Excel:mac 2004/8 can\n     _import_ only 'Macintosh' (which seems to mean Mac Roman),\n     'Windows' (perhaps Latin-1) and 'PC-8' files.  OpenOffice 3.x asks\n     for the character set when opening the file.\n\n     There is an IETF RFC4180\n     (<https://www.rfc-editor.org/rfc/rfc4180>) for CSV files, which\n     mandates comma as the separator and CRLF line endings.\n     'write.csv' writes compliant files on Windows: use 'eol = \"\\r\\n\"'\n     on other platforms.\n\nNote:\n\n     'write.table' can be slow for data frames with large numbers\n     (hundreds or more) of columns: this is inevitable as each column\n     could be of a different class and so must be handled separately.\n     If they are all of the same class, consider using a matrix\n     instead.\n\nSee Also:\n\n     The 'R Data Import/Export' manual.\n\n     'read.table', 'write'.\n\n     'write.matrix' in package 'MASS'.\n\nExamples:\n\n     x <- data.frame(a = I(\"a \\\" quote\"), b = pi)\n     tf <- tempfile(fileext = \".csv\")\n     \n     ## To write a CSV file for input to Excel one might use\n     write.table(x, file = tf, sep = \",\", col.names = NA,\n                 qmethod = \"double\")\n     file.show(tf)\n     ## and to read this file back into R one needs\n     read.table(tf, header = TRUE, sep = \",\", row.names = 1)\n     ## NB: you do need to specify a separator if qmethod = \"double\".\n     \n     ### Alternatively\n     write.csv(x, file = tf)\n     read.csv(tf, row.names = 1)\n     ## or without row names\n     write.csv(x, file = tf, row.names = FALSE)\n     read.csv(tf)\n     \n     ## Not run:\n     \n     ## To write a file in Mac Roman 
for simple use in Mac Excel 2004/8\n     write.csv(x, file = \"foo.csv\", fileEncoding = \"macroman\")\n     ## or for Windows Excel 2007/10\n     write.csv(x, file = \"foo.csv\", fileEncoding = \"UTF-16LE\")\n     ## End(Not run)\n```\n:::\n:::\n\n\n## Export delimited data\n\nLet's practice exporting the data as three files with three different delimiters (comma, tab, semicolon).\n\n\n::: {.cell}\n\n```{.r .cell-code}\nwrite.csv(df, file=\"data/serodata_new.csv\", row.names = FALSE) #comma delimited\nwrite.table(df, file=\"data/serodata1_new.txt\", sep=\"\\t\", row.names = FALSE) #tab delimited\nwrite.table(df, file=\"data/serodata2_new.txt\", sep=\";\", row.names = FALSE) #semicolon delimited\n```\n:::\n\n\nNote that I wrote the data to new file names.  Even though we didn't change the data at all in this module, it is good practice to keep raw data raw and not to write over it.\n\n## R .rds and .rda/RData files\n\nThere are two file extensions worth discussing.\n\nR has two native data formats: 'Rdata' (sometimes shortened to 'Rda') and 'Rds'. These formats are used when R objects are saved for later use. 'Rdata' is used to save multiple R objects, while 'Rds' is used to save a single R object. 'Rds' files are fast to write and read, and they are compact.\n\n## .rds binary file\n\nSaving datasets in `.rds` format can save time if you have to read them back in later.\n\n`write_rds()` and `read_rds()` from the `readr` package can be used to write/read a single R object to/from a file.\n\n```\nlibrary(readr)\nwrite_rds(object1, file = \"filename.rds\")\nobject1 <- read_rds(file = \"filename.rds\")\n```\n\n\n## .rda/RData files \n\nThe Base R functions `save()` and `load()` can be used to save and load multiple R objects. \n\n`save()` writes an external representation of R objects to the specified file, which can be loaded back into the environment using `load()`. 
A nice feature of `save()` and `load()` is that the R objects are restored directly into the environment under their original names, so you don't have to assign them to a name. The files can be saved as `.RData` or `.Rda` files.\n\nExample usage\n```\nsave(object1, object2, file = \"filename.RData\")\nload(\"filename.RData\")\n```\n\nNote that you separate the objects you want to save with commas.\n\n\n\n## Summary\n\n- Importing or 'Reading in' data is the first step of any real project / data analysis\n- The Base R 'utils' package has useful functions, including `read.csv()` and `read.delim()` for importing/reading data and `write.csv()` and `write.table()` for exporting/writing data\n- When importing data (the exception is objects loaded from .RData), you must assign it to an object, otherwise it cannot be used\n- If data are imported correctly, they can be found in the Environment pane of RStudio\n- You only need to install a package once (unless you update R or the package), but you will need to attach a package each time you want to use it. \n- To complete a task you don't know how to do (e.g., reading in an Excel data file) use the following steps: 1. Google / Ask ChatGPT, 2. Find and vet the function and package you want, 3. Install the package, 4. Attach the package, 5. Use the function\n\n\n## Acknowledgements\n\nThese are the materials we looked through, modified, or extracted to complete this module's lecture.\n\n-   [\"Introduction to R for Public Health Researchers\" Johns Hopkins University](https://jhudatascience.org/intro_to_r/)\n\n",
     "supporting": [],
     "filters": [
       "rmarkdown/pagebreak.lua"
diff --git a/_freeze/modules/Module06-DataSubset/execute-results/html.json b/_freeze/modules/Module06-DataSubset/execute-results/html.json
index 9376191..f443d53 100644
--- a/_freeze/modules/Module06-DataSubset/execute-results/html.json
+++ b/_freeze/modules/Module06-DataSubset/execute-results/html.json
@@ -1,7 +1,7 @@
 {
-  "hash": "a55663183334bb6cd6f8411f5a7fd0e8",
+  "hash": "cb8299ad6bc8167b765b1cfd90875b0a",
   "result": {
-    "markdown": "---\ntitle: \"Module 6: Get to Know Your Data and Subsetting\"\nformat: \n  revealjs:\n    scrollable: true\n    smaller: true\n    toc: false\n---\n\n\n## Learning Objectives\n\nAfter module 6, you should be able to...\n\n-   Use basic functions to get to know you data\n-   Use three indexing approaches\n-   Rely on indexing to extract part of an object (e.g., subset data) and to replace parts of an object (e.g., rename variables / columns)\n-   Describe what logical operators are and how to use them\n-   Use on the `subset()` function to subset data\n\n\n## Getting to know our data\n\nThe `dim()`, `nrow()`, and `ncol()` functions are good options to check the dimensions of your data before moving forward. \n\nLet's first read in the data from the previous module.\n\n\n::: {.cell}\n\n```{.r .cell-code}\ndf <- read.csv(file = \"data/serodata.csv\") #relative path\n```\n:::\n\n::: {.cell}\n\n```{.r .cell-code}\ndim(df) # rows, columns\n```\n\n::: {.cell-output .cell-output-stdout}\n```\n[1] 651   5\n```\n:::\n\n```{.r .cell-code}\nnrow(df) # number of rows\n```\n\n::: {.cell-output .cell-output-stdout}\n```\n[1] 651\n```\n:::\n\n```{.r .cell-code}\nncol(df) # number of columns\n```\n\n::: {.cell-output .cell-output-stdout}\n```\n[1] 5\n```\n:::\n:::\n\n\n## Quick summary of data\n\nThe `colnames()`, `str()` and `summary()`functions from Base R are great functions to assess the data type and some summary statistics.    \n\n\n::: {.cell}\n\n```{.r .cell-code}\ncolnames(df)\n```\n\n::: {.cell-output .cell-output-stdout}\n```\n[1] \"observation_id\"    \"IgG_concentration\" \"age\"              \n[4] \"gender\"            \"slum\"             \n```\n:::\n\n```{.r .cell-code}\nstr(df)\n```\n\n::: {.cell-output .cell-output-stdout}\n```\n'data.frame':\t651 obs. 
of  5 variables:\n $ observation_id   : int  5772 8095 9784 9338 6369 6885 6252 8913 7332 6941 ...\n $ IgG_concentration: num  0.318 3.437 0.3 143.236 0.448 ...\n $ age              : int  2 4 4 4 1 4 4 NA 4 2 ...\n $ gender           : chr  \"Female\" \"Female\" \"Male\" \"Male\" ...\n $ slum             : chr  \"Non slum\" \"Non slum\" \"Non slum\" \"Non slum\" ...\n```\n:::\n\n```{.r .cell-code}\nsummary(df)\n```\n\n::: {.cell-output .cell-output-stdout}\n```\n observation_id IgG_concentration       age            gender         \n Min.   :5006   Min.   :  0.0054   Min.   : 1.000   Length:651        \n 1st Qu.:6306   1st Qu.:  0.3000   1st Qu.: 3.000   Class :character  \n Median :7495   Median :  1.6658   Median : 6.000   Mode  :character  \n Mean   :7492   Mean   : 87.3683   Mean   : 6.606                     \n 3rd Qu.:8749   3rd Qu.:141.4405   3rd Qu.:10.000                     \n Max.   :9982   Max.   :916.4179   Max.   :15.000                     \n                NA's   :10         NA's   :9                          \n     slum          \n Length:651        \n Class :character  \n Mode  :character  \n                   \n                   \n                   \n                   \n```\n:::\n:::\n\n\nNote, if you have a very large dataset with 15+ variables, `summary()` is not so efficient. \n\n## Description of data\n\nThis is data based on a simulated pathogen X IgG antibody serological survey.  The rows represent individuals. Variables include IgG concentrations in IU/mL, age in years, gender, and residence based on slum characterization.  
We will use this dataset for lectures throughout the Workshop.\n\n## View the data as a whole dataframe\n\nThe `View()` function, one of the few Base R functions with a capital letter can be used to open a new tab in the Console and view the data as you would in excel.\n\n\n::: {.cell}\n\n```{.r .cell-code}\nView(df)\n```\n:::\n\n::: {.cell}\n::: {.cell-output-display}\n![](images/ViewTab.png){width=100%}\n:::\n:::\n\n\n## View the data as a whole dataframe\n\nYou can also open a new tab of the data by clicking on the data icon beside the object in the Environment window.\n\n\n::: {.cell}\n::: {.cell-output-display}\n![](images/View.png){width=90%}\n:::\n:::\n\n\n## Indexing\n\nR contains several constructs which allow access to individual elements or subsets through indexing operations. Indexing can be used both to extract part of an object and to replace parts of an object (or to add parts). There are three basic indexing syntax: `[ ]`, `[[ ]]` and `$`.\n\n\n::: {.cell}\n\n```{.r .cell-code}\nx[i] #if x is a vector\nx[i, j] #if x is a matrix/data frame\nx[[i]] #if x is a list\nx$a #if x is a data frame or list\nx$\"a\" #if x is a data frame or list\n```\n:::\n\n\n## Vectors and multi-dimensional objects\n\nTo index a vector, `vector[i]` select the ith element. To index a multi-dimensional objects such as a matrix, `matrix[i, j]` selects the element in row i and column j, where as in a three dimensional `array[k, i, i, j]` selects the element in matrix k, row i, and column j. 
\n\nLet's practice by first creating the same objects as we did in Module 1.\n\n::: {.cell}\n\n```{.r .cell-code}\nnumber.object <- 3\ncharacter.object <- \"blue\"\nvector.object1 <- c(2,3,4,5)\nvector.object2 <- c(\"blue\", \"red\", \"yellow\")\nmatrix.object <- matrix(data=vector.object1, nrow=2, ncol=2, byrow=TRUE)\n```\n:::\n\n\nHere is a reminder of what these objects look like.\n\n::: {.cell}\n\n```{.r .cell-code}\nvector.object1\n```\n\n::: {.cell-output .cell-output-stdout}\n```\n[1] 2 3 4 5\n```\n:::\n\n```{.r .cell-code}\nmatrix.object\n```\n\n::: {.cell-output .cell-output-stdout}\n```\n     [,1] [,2]\n[1,]    2    3\n[2,]    4    5\n```\n:::\n:::\n\n\nFinally, let's use indexing to pull our elements of the objects.  \n\n::: {.cell}\n\n```{.r .cell-code}\nvector.object1[2] #pulling the second element\n```\n\n::: {.cell-output .cell-output-stdout}\n```\n[1] 3\n```\n:::\n\n```{.r .cell-code}\nmatrix.object[1,2] #pulling the element in row 1 column 2\n```\n\n::: {.cell-output .cell-output-stdout}\n```\n[1] 3\n```\n:::\n:::\n\n\n\n## List objects\n\nFor lists, one generally uses `list[[p]]` to select any single element p.\n\nLet's practice by creating the same list as we did in Module 1.\n\n::: {.cell}\n\n```{.r .cell-code}\nlist.object <- list(number.object, vector.object2, matrix.object)\nlist.object\n```\n\n::: {.cell-output .cell-output-stdout}\n```\n[[1]]\n[1] 3\n\n[[2]]\n[1] \"blue\"   \"red\"    \"yellow\"\n\n[[3]]\n     [,1] [,2]\n[1,]    2    3\n[2,]    4    5\n```\n:::\n:::\n\n\nNow we use indexing to pull out the 3rd element in the list.\n\n::: {.cell}\n\n```{.r .cell-code}\nlist.object[[3]]\n```\n\n::: {.cell-output .cell-output-stdout}\n```\n     [,1] [,2]\n[1,]    2    3\n[2,]    4    5\n```\n:::\n:::\n\n\n## $ for indexing\n\n`$` allows only a literal character string or a symbol as the index.\n\n\n::: {.cell}\n\n```{.r .cell-code}\ndf$IgG_concentration\n```\n\n::: {.cell-output .cell-output-stdout}\n```\n  [1] 3.176895e-01 3.436823e+00 
3.000000e-01 1.432363e+02 4.476534e-01\n  [6] 2.527076e-02 6.101083e-01 3.000000e-01 2.916968e+00 1.649819e+00\n [11] 4.574007e+00 1.583904e+02           NA 1.065068e+02 1.113870e+02\n [16] 4.144893e+01 3.000000e-01 2.527076e-01 8.159247e+01 1.825342e+02\n [21] 4.244656e+01 1.193493e+02 3.000000e-01 3.000000e-01 9.025271e-01\n [26] 3.501805e-01 3.000000e-01 1.227437e+00 1.702055e+02 3.000000e-01\n [31] 4.801444e-01 2.527076e-02 3.000000e-01 5.776173e-02 4.801444e-01\n [36] 3.826715e-01 3.000000e-01 4.048558e+02 3.000000e-01 5.451264e-01\n [41] 3.000000e-01 5.590753e+01 2.202166e-01 1.709760e+02 1.227437e+00\n [46] 4.567527e+02 4.838480e+01 1.227437e-01 1.877256e-01 3.000000e-01\n [51] 3.501805e-01 3.339350e+00 3.000000e-01 5.451264e-01           NA\n [56] 2.104693e+00           NA 3.826715e-01 3.926366e+01 1.129964e+00\n [61] 3.501805e+00 7.542808e+01 4.800475e+01 1.000000e+00 4.068884e+01\n [66] 3.000000e-01 4.377672e+01 1.193493e+02 6.977740e+01 1.373288e+02\n [71] 1.642979e+02           NA 1.542808e+02 6.033058e-01 2.809917e-01\n [76] 1.966942e+00 2.041322e+00 2.115702e+00 4.663043e+02 3.000000e-01\n [81] 1.500796e+02 1.543790e+02 2.561983e-01 1.596338e+02 1.732484e+02\n [86] 4.641304e+02 3.736364e+01 1.572452e+02 3.000000e-01 3.000000e-01\n [91] 8.264463e-02 6.776859e-01 7.272727e-01 2.066116e-01 1.966942e+00\n [96] 3.000000e-01 3.000000e-01 2.809917e-01 8.016529e-01 1.818182e-01\n[101] 1.818182e-01 8.264463e-02 3.422727e+01 8.743506e+00 3.000000e-01\n[106] 1.641720e+02 4.049587e-01 1.001592e+02 4.489130e+02 1.101911e+02\n[111] 4.440909e+01 1.288217e+02 2.840909e+01 1.003981e+02 8.512397e-01\n[116] 1.322314e-01 1.297521e+00 1.570248e-01 1.966942e+00 1.536624e+02\n[121] 3.000000e-01 3.000000e-01 1.074380e+00 1.099174e+00 3.057851e-01\n[126] 3.000000e-01 5.785124e-02 4.391304e+02 6.130435e+02 1.074380e-01\n[131] 7.125796e+01 4.222727e+01 1.620223e+02 3.750000e+01 1.534236e+02\n[136] 6.239130e+02 5.521739e+02 5.785124e-02 6.547945e-01 8.767123e-02\n[141] 
3.000000e-01 2.849315e+00 3.835616e-02 2.849315e-01 4.649315e+00\n[146] 1.369863e-01 3.589041e-01 1.049315e+00 4.668998e+01 1.473510e+02\n[151] 4.589744e+01 2.109589e-01 1.741722e+02 2.496503e+01 1.850993e+02\n[156] 1.863014e-01 1.863014e-01 4.589744e+01 1.942881e+02 5.079646e+02\n[161] 8.767123e-01 2.750685e+00 1.503311e+02 3.000000e-01 3.095890e-01\n[166] 3.000000e-01 6.371681e+02 6.054795e-01 1.955298e+02 1.786424e+02\n[171] 1.120861e+02 1.331954e+02 2.159292e+02 5.628319e+02 1.900662e+02\n[176] 6.547945e-01 1.665753e+00 1.739238e+02 9.991722e+01 9.321192e+01\n[181] 8.767123e-02           NA 6.794521e-01 5.808219e-01 1.369863e-01\n[186] 2.060274e+00 1.610099e+02 4.082192e-01 8.273973e-01 4.601770e+02\n[191] 1.389073e+02 3.867133e+01 9.260274e-01 5.918874e+01 1.870861e+02\n[196] 4.328767e-01 6.301370e-02 3.000000e-01 1.548013e+02 5.819536e+01\n[201] 1.724338e+02 1.932401e+01 2.164420e+00 9.757412e-01 1.509434e-01\n[206] 1.509434e-01 7.766571e+01 4.319563e+01 1.752022e-01 3.094775e+01\n[211] 1.266846e-01 2.919806e+01 9.545455e+00 2.735115e+01 1.314841e+02\n[216] 3.643985e+01 1.498559e+02 9.363636e+00 2.479784e-01 5.390836e-02\n[221] 8.787062e-01 1.994609e-01 3.000000e-01 3.000000e-01 5.390836e-03\n[226] 4.177898e-01 3.000000e-01 2.479784e-01 2.964960e-02 2.964960e-01\n[231] 5.148248e+00 1.994609e-01 3.000000e-01 1.779539e+02 3.290210e+02\n[236] 3.000000e-01 1.809798e+02 4.905660e-01 1.266846e-01 1.543948e+02\n[241] 1.379683e+02 6.153846e+02 1.474784e+02 3.000000e-01 1.024259e+00\n[246] 4.444056e+02 3.000000e-01 2.504043e+00 3.000000e-01 3.000000e-01\n[251] 7.816712e-02 3.000000e-01 5.390836e-02 1.494236e+02 5.972622e+01\n[256] 6.361186e-01 1.837896e+02 1.320809e+02 1.571906e-01 1.520231e+02\n[261] 3.000000e-01 3.000000e-01 1.823699e+02 3.000000e-01 2.173913e+00\n[266] 2.142202e+01 3.000000e-01 3.408027e+00 4.155963e+01 9.698997e-02\n[271] 1.238532e+01 9.528926e+00 1.916185e+02 1.060201e+00 3.679104e+02\n[276] 4.288991e+01 9.971098e+01 3.000000e-01 1.208092e+02 
3.000000e-01\n[281] 6.688963e-03 2.505017e+00 1.481605e+00 3.000000e-01 5.183946e-01\n[286] 3.000000e-01 1.872910e-01 3.678930e-01 3.000000e-01 4.529851e+02\n[291] 3.169725e+01 3.000000e-01 4.922018e+01 2.548507e+02 1.661850e+02\n[296] 9.164179e+02 3.678930e-01 1.236994e+02 6.705202e+01 3.834862e+01\n[301] 1.963211e+00 3.000000e-01 2.474916e-01 3.000000e-01 2.173913e-01\n[306] 8.193980e-01 2.444816e+00 3.000000e-01 1.571906e-01 1.849711e+02\n[311] 6.119403e+02 3.000000e-01 4.280936e-01 9.698997e-02 3.678930e-02\n[316] 4.832090e+02 1.390173e+02 3.000000e-01 6.555970e+02 1.526012e+02\n[321] 3.000000e-01 7.222222e-01 7.724426e+01 3.000000e-01 6.111111e-01\n[326] 1.555556e+00 3.055556e-01 1.500000e+00 1.470772e+02 1.694444e+00\n[331] 3.138298e+02 1.414405e+02 1.990605e+02 4.212766e+02 3.000000e-01\n[336] 3.000000e-01 6.478723e+02 3.000000e-01 2.222222e+00 3.000000e-01\n[341] 2.055556e+00 2.777778e-02 8.333333e-02 1.032359e+02 1.611111e+00\n[346] 8.333333e-02 2.333333e+00 5.755319e+02 1.686848e+02 1.111111e-01\n[351] 3.000000e-01 8.372340e+02 3.000000e-01 3.784504e+01 3.819149e+02\n[356] 5.555556e-02 3.000000e+02 1.855950e+02 1.944444e-01 3.000000e-01\n[361] 5.555556e-02 1.138889e+00 4.254237e+01 3.000000e-01 3.000000e-01\n[366] 3.000000e-01 3.000000e-01 3.138298e+02 1.235908e+02 4.159574e+02\n[371] 3.009685e+01 1.567850e+02 1.367432e+02 3.731235e+01 9.164927e+01\n[376] 2.936170e+02 8.820459e+01 1.035491e+02 7.379958e+01 3.000000e-01\n[381] 1.718750e+02 2.128527e+00 1.253918e+00 2.382445e-01 4.639498e-01\n[386] 1.253918e-01 1.253918e-01 3.000000e-01 1.000000e+00 1.570043e+02\n[391] 4.344086e+02 2.184953e+00 1.507837e+00 3.228840e-01 4.588024e+01\n[396] 1.660560e+02 3.000000e-01 3.043011e+02 2.612903e+02 1.621767e+02\n[401] 3.228840e-01 4.639498e-01 2.495298e+00 3.257053e+00 3.793103e-01\n[406]           NA 6.896552e-02 3.000000e-01 1.423197e+00 3.000000e-01\n[411] 3.000000e-01 1.786638e+02 3.279570e+02           NA 1.903017e+02\n[416] 1.654095e+02 4.639498e-01 
1.815733e+02 1.366771e+00 1.536050e-01\n[421] 1.306587e+01 2.129032e+02 1.925647e+02 3.000000e-01 1.028213e+00\n[426] 3.793103e-01 8.025078e-01 4.860215e+02 3.000000e-01 2.100313e-01\n[431] 2.767665e+01 1.592476e+00 9.717868e-02 1.028213e+00 3.793103e-01\n[436] 1.292026e+02 4.425150e+01 3.193548e+02 1.860991e+02 6.614420e-01\n[441] 5.203762e-01 1.330819e+02 1.673491e+02 3.000000e-01 1.117457e+02\n[446] 3.045509e+01 3.000000e-01 8.280255e-02 3.000000e-01 1.200637e+00\n[451] 1.687898e-01 7.367273e+02 8.280255e-02 5.127389e-01 1.974522e-01\n[456] 7.993631e-01 3.000000e-01 3.298182e+02 9.736842e+01 3.000000e-01\n[461] 3.000000e-01 4.214545e+02 3.000000e-01 2.578182e+02 2.261147e-01\n[466] 3.000000e-01 1.883901e+02 9.458204e+01 3.000000e-01 3.000000e-01\n[471] 7.707006e-01 5.032727e+02 1.544586e+00 1.431115e+02 3.000000e-01\n[476] 1.458599e+00 1.247678e+02           NA 4.334545e+02 3.000000e-01\n[481] 6.156364e+02 9.574303e+01 1.928019e+02 1.888545e+02 1.598297e+02\n[486] 5.127389e-01 1.171053e+02           NA 2.547771e-02 1.707430e+02\n[491] 3.000000e-01 1.869969e+02 4.731481e+01 1.988390e+02 3.000000e-01\n[496] 8.808050e+01 2.003185e+00 3.000000e-01 3.509259e+01 9.365325e+01\n[501] 3.000000e-01 3.736111e+01 1.674923e+02 8.808050e+01 1.656347e+02\n[506] 3.722222e+01 6.756364e+02 3.000000e-01 1.698142e+02 1.628483e+02\n[511] 5.985130e-01 1.903346e+00 3.000000e-01 3.000000e-01 8.996283e-01\n[516] 3.977695e-01 3.000000e-01 3.000000e-01 3.000000e-01 3.000000e-01\n[521] 7.446809e+02 6.095745e+02 1.427445e+02 3.000000e-01 2.973978e-02\n[526] 3.977695e-01 4.095745e+02 4.595745e+02 3.000000e-01 1.976341e+02\n[531] 3.776596e+02 1.777603e+02 4.312268e-01 6.765957e+02 7.978723e+02\n[536] 9.665427e-02 1.879338e+02 4.358670e+01 3.000000e-01 3.000000e-01\n[541] 2.638955e+01 3.180523e+01 1.746845e+02 1.876972e+02 1.044164e+02\n[546] 1.202681e+02 1.630915e+02 1.276025e+02 8.880126e+01 3.563830e+02\n[551] 2.212766e+02 1.969121e+01 3.755319e+02 1.214511e+02 1.034700e+02\n[556] 
3.000000e-01 3.643123e-01 6.319703e-02 3.000000e-01 3.000000e-01\n[561] 3.000000e-01 3.000000e-01 3.000000e-01 3.000000e-01 3.000000e-01\n[566] 3.000000e-01 1.664038e+02 2.946809e+02 4.391924e+01 1.874606e+02\n[571] 1.143533e+02 1.600158e+02 1.635688e-01 8.809148e+01 1.337539e+02\n[576] 1.985804e+02 1.578864e+02 3.000000e-01 3.000000e-01 1.953642e-01\n[581] 1.119205e+00 2.523636e+02 3.000000e-01 4.844371e+00 3.000000e-01\n[586] 1.492553e+02 1.993617e+02 2.847682e-01 3.145695e-01 3.000000e-01\n[591] 3.406429e+01 6.595745e+01 3.000000e-01 2.174545e+02           NA\n[596] 5.957447e+01 7.236364e+02 3.000000e-01 3.000000e-01 3.000000e-01\n[601] 2.676364e+02 1.891489e+02 3.036364e+02 3.000000e-01 3.000000e-01\n[606] 3.000000e-01 3.000000e-01 3.000000e-01 1.447020e+00 2.130909e+02\n[611] 1.357616e-01 3.000000e-01 3.000000e-01 5.534545e+02 1.891489e+02\n[616] 7.202128e+01 3.250287e+01 1.655629e-02 3.123636e+02 3.000000e-01\n[621] 7.138298e+01 3.000000e-01 6.946809e+01 4.012629e+01 1.629787e+02\n[626] 1.508511e+02 1.655629e-02 3.000000e-01 4.635762e-02 3.000000e-01\n[631] 3.000000e-01 3.000000e-01 1.942553e+02 3.690909e+02 3.000000e-01\n[636] 3.000000e-01 2.847682e+00 1.435106e+02 3.000000e-01 4.752009e+01\n[641] 2.621125e+01 1.055319e+02 3.000000e-01 1.149007e+00 2.927273e+02\n[646] 3.000000e-01 3.000000e-01 4.839265e+01 3.000000e-01 3.000000e-01\n[651] 2.251656e-01\n```\n:::\n:::\n\n\nNote, if you have spaces in your variable name, you will need to use back ticks `variable name` after the `$`.  
This is a good reason to not create variables / column names with spaces.\n\n## $ for indexing with lists\n\nList elements can be named\n\n::: {.cell}\n\n```{.r .cell-code}\nlist.object.named <- list(\n  emory = number.object,\n  uga = vector.object2,\n  gsu = matrix.object\n)\nlist.object.named\n```\n\n::: {.cell-output .cell-output-stdout}\n```\n$emory\n[1] 3\n\n$uga\n[1] \"blue\"   \"red\"    \"yellow\"\n\n$gsu\n     [,1] [,2]\n[1,]    2    3\n[2,]    4    5\n```\n:::\n:::\n\n\nIf list elements are named, than you can reference data from list using `$` or using double square brackets, `[[ ]]`\n\n::: {.cell}\n\n```{.r .cell-code}\nlist.object.named$uga \n```\n\n::: {.cell-output .cell-output-stdout}\n```\n[1] \"blue\"   \"red\"    \"yellow\"\n```\n:::\n\n```{.r .cell-code}\nlist.object.named[[\"uga\"]] \n```\n\n::: {.cell-output .cell-output-stdout}\n```\n[1] \"blue\"   \"red\"    \"yellow\"\n```\n:::\n:::\n\n\n\n## Using indexing to rename columns\n\nAs mentioned above, indexing can be used both to extract part of an object and to replace parts of an object (or to add parts).\n\n\n::: {.cell}\n\n```{.r .cell-code}\ncolnames(df) # just prints\n```\n\n::: {.cell-output .cell-output-stdout}\n```\n[1] \"observation_id\"    \"IgG_concentration\" \"age\"              \n[4] \"gender\"            \"slum\"             \n```\n:::\n\n```{.r .cell-code}\ncolnames(df)[1:2] <- c(\"IgG_concentration_mIU/mL\", \"age_year\") # reassigns\ncolnames(df)\n```\n\n::: {.cell-output .cell-output-stdout}\n```\n[1] \"IgG_concentration_mIU/mL\" \"age_year\"                \n[3] \"age\"                      \"gender\"                  \n[5] \"slum\"                    \n```\n:::\n\n```{.r .cell-code}\ncolnames(df)[1:2] <- c(\"IgG_concentration\", \"age\") #reset\n```\n:::\n\n\n##  Using indexing to subset by columns\n\nWe can also subset a data frames and matrices (2-dimensional objects) using the bracket `[ row , column ]`.  
We can subset by columns and pull the `x` column using the index of the column or the column name. \n\nFor example, here I am pulling the 3nd column, which has the variable name `age`\n\n::: {.cell}\n\n```{.r .cell-code}\ndf[ , \"age\"] #same as df[ , 3]\n```\n\n::: {.cell-output .cell-output-stdout}\n```\n  [1] 3.176895e-01 3.436823e+00 3.000000e-01 1.432363e+02 4.476534e-01\n  [6] 2.527076e-02 6.101083e-01 3.000000e-01 2.916968e+00 1.649819e+00\n [11] 4.574007e+00 1.583904e+02           NA 1.065068e+02 1.113870e+02\n [16] 4.144893e+01 3.000000e-01 2.527076e-01 8.159247e+01 1.825342e+02\n [21] 4.244656e+01 1.193493e+02 3.000000e-01 3.000000e-01 9.025271e-01\n [26] 3.501805e-01 3.000000e-01 1.227437e+00 1.702055e+02 3.000000e-01\n [31] 4.801444e-01 2.527076e-02 3.000000e-01 5.776173e-02 4.801444e-01\n [36] 3.826715e-01 3.000000e-01 4.048558e+02 3.000000e-01 5.451264e-01\n [41] 3.000000e-01 5.590753e+01 2.202166e-01 1.709760e+02 1.227437e+00\n [46] 4.567527e+02 4.838480e+01 1.227437e-01 1.877256e-01 3.000000e-01\n [51] 3.501805e-01 3.339350e+00 3.000000e-01 5.451264e-01           NA\n [56] 2.104693e+00           NA 3.826715e-01 3.926366e+01 1.129964e+00\n [61] 3.501805e+00 7.542808e+01 4.800475e+01 1.000000e+00 4.068884e+01\n [66] 3.000000e-01 4.377672e+01 1.193493e+02 6.977740e+01 1.373288e+02\n [71] 1.642979e+02           NA 1.542808e+02 6.033058e-01 2.809917e-01\n [76] 1.966942e+00 2.041322e+00 2.115702e+00 4.663043e+02 3.000000e-01\n [81] 1.500796e+02 1.543790e+02 2.561983e-01 1.596338e+02 1.732484e+02\n [86] 4.641304e+02 3.736364e+01 1.572452e+02 3.000000e-01 3.000000e-01\n [91] 8.264463e-02 6.776859e-01 7.272727e-01 2.066116e-01 1.966942e+00\n [96] 3.000000e-01 3.000000e-01 2.809917e-01 8.016529e-01 1.818182e-01\n[101] 1.818182e-01 8.264463e-02 3.422727e+01 8.743506e+00 3.000000e-01\n[106] 1.641720e+02 4.049587e-01 1.001592e+02 4.489130e+02 1.101911e+02\n[111] 4.440909e+01 1.288217e+02 2.840909e+01 1.003981e+02 8.512397e-01\n[116] 1.322314e-01 1.297521e+00 
1.570248e-01 1.966942e+00 1.536624e+02\n[121] 3.000000e-01 3.000000e-01 1.074380e+00 1.099174e+00 3.057851e-01\n[126] 3.000000e-01 5.785124e-02 4.391304e+02 6.130435e+02 1.074380e-01\n[131] 7.125796e+01 4.222727e+01 1.620223e+02 3.750000e+01 1.534236e+02\n[136] 6.239130e+02 5.521739e+02 5.785124e-02 6.547945e-01 8.767123e-02\n[141] 3.000000e-01 2.849315e+00 3.835616e-02 2.849315e-01 4.649315e+00\n[146] 1.369863e-01 3.589041e-01 1.049315e+00 4.668998e+01 1.473510e+02\n[151] 4.589744e+01 2.109589e-01 1.741722e+02 2.496503e+01 1.850993e+02\n[156] 1.863014e-01 1.863014e-01 4.589744e+01 1.942881e+02 5.079646e+02\n[161] 8.767123e-01 2.750685e+00 1.503311e+02 3.000000e-01 3.095890e-01\n[166] 3.000000e-01 6.371681e+02 6.054795e-01 1.955298e+02 1.786424e+02\n[171] 1.120861e+02 1.331954e+02 2.159292e+02 5.628319e+02 1.900662e+02\n[176] 6.547945e-01 1.665753e+00 1.739238e+02 9.991722e+01 9.321192e+01\n[181] 8.767123e-02           NA 6.794521e-01 5.808219e-01 1.369863e-01\n[186] 2.060274e+00 1.610099e+02 4.082192e-01 8.273973e-01 4.601770e+02\n[191] 1.389073e+02 3.867133e+01 9.260274e-01 5.918874e+01 1.870861e+02\n[196] 4.328767e-01 6.301370e-02 3.000000e-01 1.548013e+02 5.819536e+01\n[201] 1.724338e+02 1.932401e+01 2.164420e+00 9.757412e-01 1.509434e-01\n[206] 1.509434e-01 7.766571e+01 4.319563e+01 1.752022e-01 3.094775e+01\n[211] 1.266846e-01 2.919806e+01 9.545455e+00 2.735115e+01 1.314841e+02\n[216] 3.643985e+01 1.498559e+02 9.363636e+00 2.479784e-01 5.390836e-02\n[221] 8.787062e-01 1.994609e-01 3.000000e-01 3.000000e-01 5.390836e-03\n[226] 4.177898e-01 3.000000e-01 2.479784e-01 2.964960e-02 2.964960e-01\n[231] 5.148248e+00 1.994609e-01 3.000000e-01 1.779539e+02 3.290210e+02\n[236] 3.000000e-01 1.809798e+02 4.905660e-01 1.266846e-01 1.543948e+02\n[241] 1.379683e+02 6.153846e+02 1.474784e+02 3.000000e-01 1.024259e+00\n[246] 4.444056e+02 3.000000e-01 2.504043e+00 3.000000e-01 3.000000e-01\n[251] 7.816712e-02 3.000000e-01 5.390836e-02 1.494236e+02 5.972622e+01\n[256] 
6.361186e-01 1.837896e+02 1.320809e+02 1.571906e-01 1.520231e+02\n[261] 3.000000e-01 3.000000e-01 1.823699e+02 3.000000e-01 2.173913e+00\n[266] 2.142202e+01 3.000000e-01 3.408027e+00 4.155963e+01 9.698997e-02\n[271] 1.238532e+01 9.528926e+00 1.916185e+02 1.060201e+00 3.679104e+02\n[276] 4.288991e+01 9.971098e+01 3.000000e-01 1.208092e+02 3.000000e-01\n[281] 6.688963e-03 2.505017e+00 1.481605e+00 3.000000e-01 5.183946e-01\n[286] 3.000000e-01 1.872910e-01 3.678930e-01 3.000000e-01 4.529851e+02\n[291] 3.169725e+01 3.000000e-01 4.922018e+01 2.548507e+02 1.661850e+02\n[296] 9.164179e+02 3.678930e-01 1.236994e+02 6.705202e+01 3.834862e+01\n[301] 1.963211e+00 3.000000e-01 2.474916e-01 3.000000e-01 2.173913e-01\n[306] 8.193980e-01 2.444816e+00 3.000000e-01 1.571906e-01 1.849711e+02\n[311] 6.119403e+02 3.000000e-01 4.280936e-01 9.698997e-02 3.678930e-02\n[316] 4.832090e+02 1.390173e+02 3.000000e-01 6.555970e+02 1.526012e+02\n[321] 3.000000e-01 7.222222e-01 7.724426e+01 3.000000e-01 6.111111e-01\n[326] 1.555556e+00 3.055556e-01 1.500000e+00 1.470772e+02 1.694444e+00\n[331] 3.138298e+02 1.414405e+02 1.990605e+02 4.212766e+02 3.000000e-01\n[336] 3.000000e-01 6.478723e+02 3.000000e-01 2.222222e+00 3.000000e-01\n[341] 2.055556e+00 2.777778e-02 8.333333e-02 1.032359e+02 1.611111e+00\n[346] 8.333333e-02 2.333333e+00 5.755319e+02 1.686848e+02 1.111111e-01\n[351] 3.000000e-01 8.372340e+02 3.000000e-01 3.784504e+01 3.819149e+02\n[356] 5.555556e-02 3.000000e+02 1.855950e+02 1.944444e-01 3.000000e-01\n[361] 5.555556e-02 1.138889e+00 4.254237e+01 3.000000e-01 3.000000e-01\n[366] 3.000000e-01 3.000000e-01 3.138298e+02 1.235908e+02 4.159574e+02\n[371] 3.009685e+01 1.567850e+02 1.367432e+02 3.731235e+01 9.164927e+01\n[376] 2.936170e+02 8.820459e+01 1.035491e+02 7.379958e+01 3.000000e-01\n[381] 1.718750e+02 2.128527e+00 1.253918e+00 2.382445e-01 4.639498e-01\n[386] 1.253918e-01 1.253918e-01 3.000000e-01 1.000000e+00 1.570043e+02\n[391] 4.344086e+02 2.184953e+00 1.507837e+00 3.228840e-01 
4.588024e+01\n[396] 1.660560e+02 3.000000e-01 3.043011e+02 2.612903e+02 1.621767e+02\n[401] 3.228840e-01 4.639498e-01 2.495298e+00 3.257053e+00 3.793103e-01\n[406]           NA 6.896552e-02 3.000000e-01 1.423197e+00 3.000000e-01\n[411] 3.000000e-01 1.786638e+02 3.279570e+02           NA 1.903017e+02\n[416] 1.654095e+02 4.639498e-01 1.815733e+02 1.366771e+00 1.536050e-01\n[421] 1.306587e+01 2.129032e+02 1.925647e+02 3.000000e-01 1.028213e+00\n[426] 3.793103e-01 8.025078e-01 4.860215e+02 3.000000e-01 2.100313e-01\n[431] 2.767665e+01 1.592476e+00 9.717868e-02 1.028213e+00 3.793103e-01\n[436] 1.292026e+02 4.425150e+01 3.193548e+02 1.860991e+02 6.614420e-01\n[441] 5.203762e-01 1.330819e+02 1.673491e+02 3.000000e-01 1.117457e+02\n[446] 3.045509e+01 3.000000e-01 8.280255e-02 3.000000e-01 1.200637e+00\n[451] 1.687898e-01 7.367273e+02 8.280255e-02 5.127389e-01 1.974522e-01\n[456] 7.993631e-01 3.000000e-01 3.298182e+02 9.736842e+01 3.000000e-01\n[461] 3.000000e-01 4.214545e+02 3.000000e-01 2.578182e+02 2.261147e-01\n[466] 3.000000e-01 1.883901e+02 9.458204e+01 3.000000e-01 3.000000e-01\n[471] 7.707006e-01 5.032727e+02 1.544586e+00 1.431115e+02 3.000000e-01\n[476] 1.458599e+00 1.247678e+02           NA 4.334545e+02 3.000000e-01\n[481] 6.156364e+02 9.574303e+01 1.928019e+02 1.888545e+02 1.598297e+02\n[486] 5.127389e-01 1.171053e+02           NA 2.547771e-02 1.707430e+02\n[491] 3.000000e-01 1.869969e+02 4.731481e+01 1.988390e+02 3.000000e-01\n[496] 8.808050e+01 2.003185e+00 3.000000e-01 3.509259e+01 9.365325e+01\n[501] 3.000000e-01 3.736111e+01 1.674923e+02 8.808050e+01 1.656347e+02\n[506] 3.722222e+01 6.756364e+02 3.000000e-01 1.698142e+02 1.628483e+02\n[511] 5.985130e-01 1.903346e+00 3.000000e-01 3.000000e-01 8.996283e-01\n[516] 3.977695e-01 3.000000e-01 3.000000e-01 3.000000e-01 3.000000e-01\n[521] 7.446809e+02 6.095745e+02 1.427445e+02 3.000000e-01 2.973978e-02\n[526] 3.977695e-01 4.095745e+02 4.595745e+02 3.000000e-01 1.976341e+02\n[531] 3.776596e+02 1.777603e+02 
4.312268e-01 6.765957e+02 7.978723e+02\n[536] 9.665427e-02 1.879338e+02 4.358670e+01 3.000000e-01 3.000000e-01\n[541] 2.638955e+01 3.180523e+01 1.746845e+02 1.876972e+02 1.044164e+02\n[546] 1.202681e+02 1.630915e+02 1.276025e+02 8.880126e+01 3.563830e+02\n[551] 2.212766e+02 1.969121e+01 3.755319e+02 1.214511e+02 1.034700e+02\n[556] 3.000000e-01 3.643123e-01 6.319703e-02 3.000000e-01 3.000000e-01\n[561] 3.000000e-01 3.000000e-01 3.000000e-01 3.000000e-01 3.000000e-01\n[566] 3.000000e-01 1.664038e+02 2.946809e+02 4.391924e+01 1.874606e+02\n[571] 1.143533e+02 1.600158e+02 1.635688e-01 8.809148e+01 1.337539e+02\n[576] 1.985804e+02 1.578864e+02 3.000000e-01 3.000000e-01 1.953642e-01\n[581] 1.119205e+00 2.523636e+02 3.000000e-01 4.844371e+00 3.000000e-01\n[586] 1.492553e+02 1.993617e+02 2.847682e-01 3.145695e-01 3.000000e-01\n[591] 3.406429e+01 6.595745e+01 3.000000e-01 2.174545e+02           NA\n[596] 5.957447e+01 7.236364e+02 3.000000e-01 3.000000e-01 3.000000e-01\n[601] 2.676364e+02 1.891489e+02 3.036364e+02 3.000000e-01 3.000000e-01\n[606] 3.000000e-01 3.000000e-01 3.000000e-01 1.447020e+00 2.130909e+02\n[611] 1.357616e-01 3.000000e-01 3.000000e-01 5.534545e+02 1.891489e+02\n[616] 7.202128e+01 3.250287e+01 1.655629e-02 3.123636e+02 3.000000e-01\n[621] 7.138298e+01 3.000000e-01 6.946809e+01 4.012629e+01 1.629787e+02\n[626] 1.508511e+02 1.655629e-02 3.000000e-01 4.635762e-02 3.000000e-01\n[631] 3.000000e-01 3.000000e-01 1.942553e+02 3.690909e+02 3.000000e-01\n[636] 3.000000e-01 2.847682e+00 1.435106e+02 3.000000e-01 4.752009e+01\n[641] 2.621125e+01 1.055319e+02 3.000000e-01 1.149007e+00 2.927273e+02\n[646] 3.000000e-01 3.000000e-01 4.839265e+01 3.000000e-01 3.000000e-01\n[651] 2.251656e-01\n```\n:::\n:::\n\nWe can select multiple columns using multiple column names:\n\n::: {.cell}\n\n```{.r .cell-code}\ndf[, c(\"age\", \"gender\")] #same as df[ , c(3,4)]\n```\n\n::: {.cell-output .cell-output-stdout}\n```\n             age gender\n1   3.176895e-01 Female\n2   
3.436823e+00 Female\n3   3.000000e-01   Male\n4   1.432363e+02   Male\n5   4.476534e-01   Male\n6   2.527076e-02   Male\n7   6.101083e-01 Female\n8   3.000000e-01 Female\n9   2.916968e+00   Male\n10  1.649819e+00   Male\n11  4.574007e+00   Male\n12  1.583904e+02 Female\n13            NA   Male\n14  1.065068e+02   Male\n15  1.113870e+02   Male\n16  4.144893e+01   Male\n17  3.000000e-01   Male\n18  2.527076e-01 Female\n19  8.159247e+01 Female\n20  1.825342e+02   Male\n21  4.244656e+01   Male\n22  1.193493e+02 Female\n23  3.000000e-01   Male\n24  3.000000e-01 Female\n25  9.025271e-01 Female\n26  3.501805e-01   Male\n27  3.000000e-01   Male\n28  1.227437e+00 Female\n29  1.702055e+02 Female\n30  3.000000e-01 Female\n31  4.801444e-01   Male\n32  2.527076e-02   Male\n33  3.000000e-01 Female\n34  5.776173e-02   Male\n35  4.801444e-01 Female\n36  3.826715e-01 Female\n37  3.000000e-01   Male\n38  4.048558e+02   Male\n39  3.000000e-01   Male\n40  5.451264e-01   Male\n41  3.000000e-01 Female\n42  5.590753e+01   Male\n43  2.202166e-01 Female\n44  1.709760e+02   Male\n45  1.227437e+00   Male\n46  4.567527e+02   Male\n47  4.838480e+01   Male\n48  1.227437e-01 Female\n49  1.877256e-01 Female\n50  3.000000e-01 Female\n51  3.501805e-01   Male\n52  3.339350e+00   Male\n53  3.000000e-01 Female\n54  5.451264e-01 Female\n55            NA   Male\n56  2.104693e+00   Male\n57            NA   Male\n58  3.826715e-01 Female\n59  3.926366e+01 Female\n60  1.129964e+00   Male\n61  3.501805e+00 Female\n62  7.542808e+01 Female\n63  4.800475e+01 Female\n64  1.000000e+00   Male\n65  4.068884e+01   Male\n66  3.000000e-01 Female\n67  4.377672e+01 Female\n68  1.193493e+02   Male\n69  6.977740e+01   Male\n70  1.373288e+02 Female\n71  1.642979e+02   Male\n72            NA Female\n73  1.542808e+02   Male\n74  6.033058e-01   Male\n75  2.809917e-01   Male\n76  1.966942e+00   Male\n77  2.041322e+00   Male\n78  2.115702e+00 Female\n79  4.663043e+02   Male\n80  3.000000e-01   Male\n81  1.500796e+02   Male\n82  
1.543790e+02 Female\n83  2.561983e-01 Female\n84  1.596338e+02   Male\n85  1.732484e+02 Female\n86  4.641304e+02 Female\n87  3.736364e+01   Male\n88  1.572452e+02 Female\n89  3.000000e-01   Male\n90  3.000000e-01   Male\n91  8.264463e-02   Male\n92  6.776859e-01 Female\n93  7.272727e-01   Male\n94  2.066116e-01 Female\n95  1.966942e+00   Male\n96  3.000000e-01   Male\n97  3.000000e-01   Male\n98  2.809917e-01 Female\n99  8.016529e-01 Female\n100 1.818182e-01 Female\n101 1.818182e-01   Male\n102 8.264463e-02 Female\n103 3.422727e+01 Female\n104 8.743506e+00   Male\n105 3.000000e-01   Male\n106 1.641720e+02 Female\n107 4.049587e-01   Male\n108 1.001592e+02   Male\n109 4.489130e+02 Female\n110 1.101911e+02 Female\n111 4.440909e+01   Male\n112 1.288217e+02 Female\n113 2.840909e+01   Male\n114 1.003981e+02 Female\n115 8.512397e-01 Female\n116 1.322314e-01   Male\n117 1.297521e+00 Female\n118 1.570248e-01   Male\n119 1.966942e+00 Female\n120 1.536624e+02   Male\n121 3.000000e-01 Female\n122 3.000000e-01 Female\n123 1.074380e+00   Male\n124 1.099174e+00 Female\n125 3.057851e-01 Female\n126 3.000000e-01 Female\n127 5.785124e-02 Female\n128 4.391304e+02 Female\n129 6.130435e+02 Female\n130 1.074380e-01   Male\n131 7.125796e+01   Male\n132 4.222727e+01   Male\n133 1.620223e+02 Female\n134 3.750000e+01 Female\n135 1.534236e+02 Female\n136 6.239130e+02 Female\n137 5.521739e+02   Male\n138 5.785124e-02 Female\n139 6.547945e-01 Female\n140 8.767123e-02 Female\n141 3.000000e-01   Male\n142 2.849315e+00 Female\n143 3.835616e-02   Male\n144 2.849315e-01   Male\n145 4.649315e+00   Male\n146 1.369863e-01 Female\n147 3.589041e-01   Male\n148 1.049315e+00   Male\n149 4.668998e+01 Female\n150 1.473510e+02 Female\n151 4.589744e+01   Male\n152 2.109589e-01   Male\n153 1.741722e+02 Female\n154 2.496503e+01 Female\n155 1.850993e+02   Male\n156 1.863014e-01   Male\n157 1.863014e-01   Male\n158 4.589744e+01 Female\n159 1.942881e+02 Female\n160 5.079646e+02 Female\n161 8.767123e-01   Male\n162 
2.750685e+00   Male\n163 1.503311e+02 Female\n164 3.000000e-01   Male\n165 3.095890e-01   Male\n166 3.000000e-01   Male\n167 6.371681e+02 Female\n168 6.054795e-01 Female\n169 1.955298e+02 Female\n170 1.786424e+02   Male\n171 1.120861e+02 Female\n172 1.331954e+02   Male\n173 2.159292e+02   Male\n174 5.628319e+02   Male\n175 1.900662e+02 Female\n176 6.547945e-01   Male\n177 1.665753e+00   Male\n178 1.739238e+02   Male\n179 9.991722e+01   Male\n180 9.321192e+01   Male\n181 8.767123e-02 Female\n182           NA   Male\n183 6.794521e-01 Female\n184 5.808219e-01   Male\n185 1.369863e-01 Female\n186 2.060274e+00 Female\n187 1.610099e+02   Male\n188 4.082192e-01 Female\n189 8.273973e-01   Male\n190 4.601770e+02 Female\n191 1.389073e+02 Female\n192 3.867133e+01 Female\n193 9.260274e-01 Female\n194 5.918874e+01 Female\n195 1.870861e+02 Female\n196 4.328767e-01   Male\n197 6.301370e-02   Male\n198 3.000000e-01 Female\n199 1.548013e+02   Male\n200 5.819536e+01 Female\n201 1.724338e+02 Female\n202 1.932401e+01 Female\n203 2.164420e+00 Female\n204 9.757412e-01 Female\n205 1.509434e-01   Male\n206 1.509434e-01 Female\n207 7.766571e+01   Male\n208 4.319563e+01 Female\n209 1.752022e-01   Male\n210 3.094775e+01 Female\n211 1.266846e-01   Male\n212 2.919806e+01   Male\n213 9.545455e+00 Female\n214 2.735115e+01 Female\n215 1.314841e+02 Female\n216 3.643985e+01   Male\n217 1.498559e+02 Female\n218 9.363636e+00 Female\n219 2.479784e-01   Male\n220 5.390836e-02 Female\n221 8.787062e-01 Female\n222 1.994609e-01   Male\n223 3.000000e-01 Female\n224 3.000000e-01   Male\n225 5.390836e-03 Female\n226 4.177898e-01 Female\n227 3.000000e-01 Female\n228 2.479784e-01   Male\n229 2.964960e-02   Male\n230 2.964960e-01   Male\n231 5.148248e+00 Female\n232 1.994609e-01   Male\n233 3.000000e-01   Male\n234 1.779539e+02   Male\n235 3.290210e+02 Female\n236 3.000000e-01   Male\n237 1.809798e+02 Female\n238 4.905660e-01   Male\n239 1.266846e-01   Male\n240 1.543948e+02 Female\n241 1.379683e+02 Female\n242 
6.153846e+02   Male\n243 1.474784e+02   Male\n244 3.000000e-01 Female\n245 1.024259e+00   Male\n246 4.444056e+02 Female\n247 3.000000e-01   Male\n248 2.504043e+00 Female\n249 3.000000e-01 Female\n250 3.000000e-01 Female\n251 7.816712e-02 Female\n252 3.000000e-01 Female\n253 5.390836e-02   Male\n254 1.494236e+02 Female\n255 5.972622e+01   Male\n256 6.361186e-01 Female\n257 1.837896e+02 Female\n258 1.320809e+02 Female\n259 1.571906e-01   Male\n260 1.520231e+02   Male\n261 3.000000e-01 Female\n262 3.000000e-01 Female\n263 1.823699e+02   Male\n264 3.000000e-01   Male\n265 2.173913e+00   Male\n266 2.142202e+01   Male\n267 3.000000e-01 Female\n268 3.408027e+00   Male\n269 4.155963e+01   Male\n270 9.698997e-02   Male\n271 1.238532e+01 Female\n272 9.528926e+00   Male\n273 1.916185e+02 Female\n274 1.060201e+00   Male\n275 3.679104e+02 Female\n276 4.288991e+01   Male\n277 9.971098e+01   Male\n278 3.000000e-01   Male\n279 1.208092e+02   Male\n280 3.000000e-01   Male\n281 6.688963e-03 Female\n282 2.505017e+00 Female\n283 1.481605e+00   Male\n284 3.000000e-01 Female\n285 5.183946e-01 Female\n286 3.000000e-01 Female\n287 1.872910e-01   Male\n288 3.678930e-01 Female\n289 3.000000e-01   Male\n290 4.529851e+02 Female\n291 3.169725e+01 Female\n292 3.000000e-01   Male\n293 4.922018e+01   Male\n294 2.548507e+02   Male\n295 1.661850e+02   Male\n296 9.164179e+02   Male\n297 3.678930e-01 Female\n298 1.236994e+02   Male\n299 6.705202e+01   Male\n300 3.834862e+01   Male\n301 1.963211e+00 Female\n302 3.000000e-01   Male\n303 2.474916e-01   Male\n304 3.000000e-01 Female\n305 2.173913e-01   Male\n306 8.193980e-01   Male\n307 2.444816e+00 Female\n308 3.000000e-01   Male\n309 1.571906e-01 Female\n310 1.849711e+02   Male\n311 6.119403e+02 Female\n312 3.000000e-01 Female\n313 4.280936e-01 Female\n314 9.698997e-02   Male\n315 3.678930e-02 Female\n316 4.832090e+02   Male\n317 1.390173e+02 Female\n318 3.000000e-01   Male\n319 6.555970e+02 Female\n320 1.526012e+02 Female\n321 3.000000e-01 Female\n322 
7.222222e-01   Male\n323 7.724426e+01   Male\n324 3.000000e-01   Male\n325 6.111111e-01 Female\n326 1.555556e+00 Female\n327 3.055556e-01   Male\n328 1.500000e+00   Male\n329 1.470772e+02   Male\n330 1.694444e+00 Female\n331 3.138298e+02 Female\n332 1.414405e+02 Female\n333 1.990605e+02 Female\n334 4.212766e+02   Male\n335 3.000000e-01   Male\n336 3.000000e-01   Male\n337 6.478723e+02   Male\n338 3.000000e-01   Male\n339 2.222222e+00 Female\n340 3.000000e-01   Male\n341 2.055556e+00   Male\n342 2.777778e-02 Female\n343 8.333333e-02   Male\n344 1.032359e+02 Female\n345 1.611111e+00 Female\n346 8.333333e-02 Female\n347 2.333333e+00 Female\n348 5.755319e+02   Male\n349 1.686848e+02 Female\n350 1.111111e-01   Male\n351 3.000000e-01   Male\n352 8.372340e+02 Female\n353 3.000000e-01   Male\n354 3.784504e+01   Male\n355 3.819149e+02   Male\n356 5.555556e-02 Female\n357 3.000000e+02 Female\n358 1.855950e+02   Male\n359 1.944444e-01 Female\n360 3.000000e-01   Male\n361 5.555556e-02 Female\n362 1.138889e+00   Male\n363 4.254237e+01 Female\n364 3.000000e-01   Male\n365 3.000000e-01   Male\n366 3.000000e-01 Female\n367 3.000000e-01 Female\n368 3.138298e+02 Female\n369 1.235908e+02   Male\n370 4.159574e+02   Male\n371 3.009685e+01 Female\n372 1.567850e+02 Female\n373 1.367432e+02 Female\n374 3.731235e+01 Female\n375 9.164927e+01   Male\n376 2.936170e+02 Female\n377 8.820459e+01 Female\n378 1.035491e+02   Male\n379 7.379958e+01 Female\n380 3.000000e-01   Male\n381 1.718750e+02   Male\n382 2.128527e+00   Male\n383 1.253918e+00 Female\n384 2.382445e-01   Male\n385 4.639498e-01 Female\n386 1.253918e-01   Male\n387 1.253918e-01   Male\n388 3.000000e-01 Female\n389 1.000000e+00   Male\n390 1.570043e+02   Male\n391 4.344086e+02 Female\n392 2.184953e+00   Male\n393 1.507837e+00 Female\n394 3.228840e-01 Female\n395 4.588024e+01   Male\n396 1.660560e+02   Male\n397 3.000000e-01   Male\n398 3.043011e+02   Male\n399 2.612903e+02 Female\n400 1.621767e+02   Male\n401 3.228840e-01   Male\n402 
4.639498e-01 Female\n403 2.495298e+00 Female\n404 3.257053e+00 Female\n405 3.793103e-01 Female\n406           NA   Male\n407 6.896552e-02 Female\n408 3.000000e-01   Male\n409 1.423197e+00 Female\n410 3.000000e-01 Female\n411 3.000000e-01 Female\n412 1.786638e+02   Male\n413 3.279570e+02   Male\n414           NA Female\n415 1.903017e+02   Male\n416 1.654095e+02 Female\n417 4.639498e-01 Female\n418 1.815733e+02   Male\n419 1.366771e+00   Male\n420 1.536050e-01 Female\n421 1.306587e+01   Male\n422 2.129032e+02 Female\n423 1.925647e+02   Male\n424 3.000000e-01 Female\n425 1.028213e+00 Female\n426 3.793103e-01 Female\n427 8.025078e-01 Female\n428 4.860215e+02 Female\n429 3.000000e-01 Female\n430 2.100313e-01   Male\n431 2.767665e+01 Female\n432 1.592476e+00   Male\n433 9.717868e-02 Female\n434 1.028213e+00 Female\n435 3.793103e-01   Male\n436 1.292026e+02   Male\n437 4.425150e+01 Female\n438 3.193548e+02 Female\n439 1.860991e+02 Female\n440 6.614420e-01 Female\n441 5.203762e-01   Male\n442 1.330819e+02   Male\n443 1.673491e+02 Female\n444 3.000000e-01   Male\n445 1.117457e+02   Male\n446 3.045509e+01 Female\n447 3.000000e-01   Male\n448 8.280255e-02 Female\n449 3.000000e-01 Female\n450 1.200637e+00 Female\n451 1.687898e-01   Male\n452 7.367273e+02 Female\n453 8.280255e-02   Male\n454 5.127389e-01   Male\n455 1.974522e-01   Male\n456 7.993631e-01 Female\n457 3.000000e-01   Male\n458 3.298182e+02   Male\n459 9.736842e+01 Female\n460 3.000000e-01 Female\n461 3.000000e-01 Female\n462 4.214545e+02 Female\n463 3.000000e-01   Male\n464 2.578182e+02 Female\n465 2.261147e-01   Male\n466 3.000000e-01 Female\n467 1.883901e+02   Male\n468 9.458204e+01 Female\n469 3.000000e-01 Female\n470 3.000000e-01   Male\n471 7.707006e-01 Female\n472 5.032727e+02   Male\n473 1.544586e+00 Female\n474 1.431115e+02 Female\n475 3.000000e-01   Male\n476 1.458599e+00   Male\n477 1.247678e+02 Female\n478           NA Female\n479 4.334545e+02   Male\n480 3.000000e-01 Female\n481 6.156364e+02 Female\n482 
9.574303e+01   Male\n483 1.928019e+02   Male\n484 1.888545e+02   Male\n485 1.598297e+02 Female\n486 5.127389e-01   Male\n487 1.171053e+02 Female\n488           NA   Male\n489 2.547771e-02 Female\n490 1.707430e+02 Female\n491 3.000000e-01   Male\n492 1.869969e+02   Male\n493 4.731481e+01   Male\n494 1.988390e+02 Female\n495 3.000000e-01   Male\n496 8.808050e+01   Male\n497 2.003185e+00 Female\n498 3.000000e-01   Male\n499 3.509259e+01 Female\n500 9.365325e+01 Female\n501 3.000000e-01   Male\n502 3.736111e+01 Female\n503 1.674923e+02 Female\n504 8.808050e+01   Male\n505 1.656347e+02 Female\n506 3.722222e+01 Female\n507 6.756364e+02 Female\n508 3.000000e-01   Male\n509 1.698142e+02   Male\n510 1.628483e+02 Female\n511 5.985130e-01   Male\n512 1.903346e+00 Female\n513 3.000000e-01   Male\n514 3.000000e-01   Male\n515 8.996283e-01   Male\n516 3.977695e-01 Female\n517 3.000000e-01   Male\n518 3.000000e-01   Male\n519 3.000000e-01   Male\n520 3.000000e-01 Female\n521 7.446809e+02   Male\n522 6.095745e+02 Female\n523 1.427445e+02   Male\n524 3.000000e-01 Female\n525 2.973978e-02   Male\n526 3.977695e-01 Female\n527 4.095745e+02 Female\n528 4.595745e+02   Male\n529 3.000000e-01 Female\n530 1.976341e+02 Female\n531 3.776596e+02 Female\n532 1.777603e+02 Female\n533 4.312268e-01   Male\n534 6.765957e+02 Female\n535 7.978723e+02   Male\n536 9.665427e-02   Male\n537 1.879338e+02   Male\n538 4.358670e+01 Female\n539 3.000000e-01 Female\n540 3.000000e-01   Male\n541 2.638955e+01   Male\n542 3.180523e+01 Female\n543 1.746845e+02   Male\n544 1.876972e+02   Male\n545 1.044164e+02   Male\n546 1.202681e+02   Male\n547 1.630915e+02 Female\n548 1.276025e+02 Female\n549 8.880126e+01   Male\n550 3.563830e+02   Male\n551 2.212766e+02   Male\n552 1.969121e+01 Female\n553 3.755319e+02 Female\n554 1.214511e+02   Male\n555 1.034700e+02 Female\n556 3.000000e-01 Female\n557 3.643123e-01 Female\n558 6.319703e-02 Female\n559 3.000000e-01   Male\n560 3.000000e-01   Male\n561 3.000000e-01 Female\n562 
3.000000e-01 Female\n563 3.000000e-01   Male\n564 3.000000e-01   Male\n565 3.000000e-01 Female\n566 3.000000e-01   Male\n567 1.664038e+02 Female\n568 2.946809e+02 Female\n569 4.391924e+01   Male\n570 1.874606e+02 Female\n571 1.143533e+02   Male\n572 1.600158e+02   Male\n573 1.635688e-01   Male\n574 8.809148e+01 Female\n575 1.337539e+02   Male\n576 1.985804e+02   Male\n577 1.578864e+02 Female\n578 3.000000e-01 Female\n579 3.000000e-01   Male\n580 1.953642e-01 Female\n581 1.119205e+00   Male\n582 2.523636e+02   Male\n583 3.000000e-01   Male\n584 4.844371e+00 Female\n585 3.000000e-01   Male\n586 1.492553e+02 Female\n587 1.993617e+02   Male\n588 2.847682e-01 Female\n589 3.145695e-01 Female\n590 3.000000e-01   Male\n591 3.406429e+01 Female\n592 6.595745e+01   Male\n593 3.000000e-01   Male\n594 2.174545e+02   Male\n595           NA Female\n596 5.957447e+01 Female\n597 7.236364e+02 Female\n598 3.000000e-01   Male\n599 3.000000e-01 Female\n600 3.000000e-01   Male\n601 2.676364e+02   Male\n602 1.891489e+02   Male\n603 3.036364e+02 Female\n604 3.000000e-01 Female\n605 3.000000e-01   Male\n606 3.000000e-01   Male\n607 3.000000e-01 Female\n608 3.000000e-01   Male\n609 1.447020e+00   Male\n610 2.130909e+02 Female\n611 1.357616e-01 Female\n612 3.000000e-01 Female\n613 3.000000e-01 Female\n614 5.534545e+02 Female\n615 1.891489e+02 Female\n616 7.202128e+01 Female\n617 3.250287e+01   Male\n618 1.655629e-02   Male\n619 3.123636e+02   Male\n620 3.000000e-01   Male\n621 7.138298e+01   Male\n622 3.000000e-01 Female\n623 6.946809e+01 Female\n624 4.012629e+01   Male\n625 1.629787e+02 Female\n626 1.508511e+02 Female\n627 1.655629e-02   Male\n628 3.000000e-01   Male\n629 4.635762e-02   Male\n630 3.000000e-01 Female\n631 3.000000e-01 Female\n632 3.000000e-01   Male\n633 1.942553e+02   Male\n634 3.690909e+02   Male\n635 3.000000e-01 Female\n636 3.000000e-01 Female\n637 2.847682e+00   Male\n638 1.435106e+02 Female\n639 3.000000e-01   Male\n640 4.752009e+01 Female\n641 2.621125e+01 Female\n642 
1.055319e+02 Female\n643 3.000000e-01 Female\n644 1.149007e+00   Male\n645 2.927273e+02 Female\n646 3.000000e-01 Female\n647 3.000000e-01 Female\n648 4.839265e+01   Male\n649 3.000000e-01   Male\n650 3.000000e-01 Female\n651 2.251656e-01 Female\n```\n:::\n:::\n\nWe can also remove selected columns using negative indexing, or by simply assigning `NULL` to the column.\n\n::: {.cell}\n\n```{.r .cell-code}\ndf[, -5] #remove column 5, \"slum\" variable\n```\n\n::: {.cell-output .cell-output-stdout}\n```\n    IgG_concentration          age age.1 gender\n1                5772 3.176895e-01     2 Female\n2                8095 3.436823e+00     4 Female\n3                9784 3.000000e-01     4   Male\n4                9338 1.432363e+02     4   Male\n5                6369 4.476534e-01     1   Male\n6                6885 2.527076e-02     4   Male\n7                6252 6.101083e-01     4 Female\n8                8913 3.000000e-01    NA Female\n9                7332 2.916968e+00     4   Male\n10               6941 1.649819e+00     2   Male\n11               5104 4.574007e+00     3   Male\n12               9078 1.583904e+02    15 Female\n13               9960           NA     8   Male\n14               9651 1.065068e+02    12   Male\n15               9229 1.113870e+02    15   Male\n16               5210 4.144893e+01     9   Male\n17               5105 3.000000e-01     8   Male\n18               7607 2.527076e-01     7 Female\n19               7582 8.159247e+01    11 Female\n20               8179 1.825342e+02    10   Male\n21               5660 4.244656e+01     8   Male\n22               6696 1.193493e+02    11 Female\n23               7842 3.000000e-01     2   Male\n24               6578 3.000000e-01     2 Female\n25               9619 9.025271e-01     3 Female\n26               9838 3.501805e-01     5   Male\n27               6935 3.000000e-01     1   Male\n28               5885 1.227437e+00     3 Female\n29               9657 1.702055e+02     5 Female\n30               9146 3.000000e-01 
    5 Female\n31               7056 4.801444e-01     3   Male\n32               9144 2.527076e-02     1   Male\n33               8696 3.000000e-01     4 Female\n34               7042 5.776173e-02     3   Male\n35               5278 4.801444e-01     2 Female\n36               6541 3.826715e-01    11 Female\n37               6070 3.000000e-01     7   Male\n38               5490 4.048558e+02     8   Male\n39               6527 3.000000e-01     6   Male\n40               5389 5.451264e-01     6   Male\n41               9003 3.000000e-01    11 Female\n42               6682 5.590753e+01    10   Male\n43               7844 2.202166e-01     6 Female\n44               8257 1.709760e+02    12   Male\n45               7767 1.227437e+00    11   Male\n46               8391 4.567527e+02    10   Male\n47               8317 4.838480e+01    11   Male\n48               7397 1.227437e-01    13 Female\n49               8495 1.877256e-01     3 Female\n50               8093 3.000000e-01     4 Female\n51               7375 3.501805e-01     3   Male\n52               5255 3.339350e+00     1   Male\n53               8445 3.000000e-01     2 Female\n54               8959 5.451264e-01     2 Female\n55               8400           NA     4   Male\n56               7420 2.104693e+00     2   Male\n57               5206           NA     2   Male\n58               7431 3.826715e-01     3 Female\n59               7230 3.926366e+01     3 Female\n60               8208 1.129964e+00     4   Male\n61               8538 3.501805e+00     1 Female\n62               6125 7.542808e+01    13 Female\n63               5767 4.800475e+01    13 Female\n64               5487 1.000000e+00     6   Male\n65               5539 4.068884e+01    13   Male\n66               5759 3.000000e-01     5 Female\n67               6845 4.377672e+01    13 Female\n68               7170 1.193493e+02    14   Male\n69               6588 6.977740e+01    13   Male\n70               7939 1.373288e+02     8 Female\n71               5006 
1.642979e+02     7   Male\n72               9180           NA     6 Female\n73               9638 1.542808e+02    13   Male\n74               7781 6.033058e-01     3   Male\n75               6932 2.809917e-01     4   Male\n76               8120 1.966942e+00     2   Male\n77               9292 2.041322e+00    NA   Male\n78               9228 2.115702e+00     5 Female\n79               8185 4.663043e+02     3   Male\n80               6797 3.000000e-01     3   Male\n81               5970 1.500796e+02    14   Male\n82               7219 1.543790e+02    11 Female\n83               6870 2.561983e-01     7 Female\n84               7653 1.596338e+02     7   Male\n85               8824 1.732484e+02    11 Female\n86               8311 4.641304e+02     9 Female\n87               9458 3.736364e+01    14   Male\n88               8275 1.572452e+02    13 Female\n89               6786 3.000000e-01     1   Male\n90               6595 3.000000e-01     1   Male\n91               5264 8.264463e-02     4   Male\n92               9188 6.776859e-01     1 Female\n93               6611 7.272727e-01     2   Male\n94               6840 2.066116e-01     3 Female\n95               5663 1.966942e+00     2   Male\n96               9611 3.000000e-01     1   Male\n97               7717 3.000000e-01     2   Male\n98               8374 2.809917e-01     2 Female\n99               5134 8.016529e-01     4 Female\n100              8122 1.818182e-01     5 Female\n101              6192 1.818182e-01     5   Male\n102              9668 8.264463e-02     6 Female\n103              9577 3.422727e+01    14 Female\n104              6403 8.743506e+00    14   Male\n105              9464 3.000000e-01    10   Male\n106              8157 1.641720e+02     6 Female\n107              9451 4.049587e-01     6   Male\n108              6615 1.001592e+02     8   Male\n109              9074 4.489130e+02     6 Female\n110              7479 1.101911e+02    12 Female\n111              8946 4.440909e+01    12   Male\n112          
    5296 1.288217e+02    14 Female\n113              6238 2.840909e+01    15   Male\n114              6303 1.003981e+02    12 Female\n115              6662 8.512397e-01     4 Female\n116              6251 1.322314e-01     4   Male\n117              9110 1.297521e+00     3 Female\n118              8480 1.570248e-01    NA   Male\n119              5229 1.966942e+00     2 Female\n120              9173 1.536624e+02     3   Male\n121              9896 3.000000e-01    NA Female\n122              5057 3.000000e-01     3 Female\n123              7732 1.074380e+00     3   Male\n124              6882 1.099174e+00     2 Female\n125              9587 3.057851e-01     4 Female\n126              9930 3.000000e-01    10 Female\n127              6960 5.785124e-02     7 Female\n128              6335 4.391304e+02    11 Female\n129              6286 6.130435e+02     6 Female\n130              9035 1.074380e-01    11   Male\n131              5720 7.125796e+01     9   Male\n132              7368 4.222727e+01     6   Male\n133              5170 1.620223e+02    13 Female\n134              6691 3.750000e+01    10 Female\n135              6173 1.534236e+02     6 Female\n136              8170 6.239130e+02    11 Female\n137              9637 5.521739e+02     7   Male\n138              9482 5.785124e-02     6 Female\n139              7880 6.547945e-01     4 Female\n140              6307 8.767123e-02     4 Female\n141              8822 3.000000e-01     4   Male\n142              8190 2.849315e+00     4 Female\n143              7554 3.835616e-02     4   Male\n144              6519 2.849315e-01     4   Male\n145              9764 4.649315e+00     3   Male\n146              8792 1.369863e-01     4 Female\n147              6721 3.589041e-01     3   Male\n148              9042 1.049315e+00     3   Male\n149              7407 4.668998e+01    13 Female\n150              7229 1.473510e+02     7 Female\n151              7532 4.589744e+01    10   Male\n152              6516 2.109589e-01     6   Male\n153 
             7941 1.741722e+02    10 Female\n154              8124 2.496503e+01    12 Female\n155              7869 1.850993e+02    10   Male\n156              5647 1.863014e-01    10   Male\n157              9120 1.863014e-01    13   Male\n158              6608 4.589744e+01    13 Female\n159              8635 1.942881e+02     5 Female\n160              9341 5.079646e+02     3 Female\n161              9982 8.767123e-01     4   Male\n162              6976 2.750685e+00     1   Male\n163              6008 1.503311e+02     3 Female\n164              5432 3.000000e-01     4   Male\n165              5749 3.095890e-01     4   Male\n166              6428 3.000000e-01     1   Male\n167              5947 6.371681e+02     5 Female\n168              6027 6.054795e-01     6 Female\n169              5064 1.955298e+02    14 Female\n170              5861 1.786424e+02     6   Male\n171              6702 1.120861e+02    13 Female\n172              7851 1.331954e+02     9   Male\n173              8310 2.159292e+02    11   Male\n174              5897 5.628319e+02    10   Male\n175              9249 1.900662e+02     5 Female\n176              9163 6.547945e-01    14   Male\n177              6550 1.665753e+00     7   Male\n178              5859 1.739238e+02    10   Male\n179              5607 9.991722e+01     6   Male\n180              8746 9.321192e+01     5   Male\n181              5274 8.767123e-02     3 Female\n182              9412           NA     4   Male\n183              5691 6.794521e-01     2 Female\n184              9016 5.808219e-01     3   Male\n185              9128 1.369863e-01     3 Female\n186              8539 2.060274e+00     2 Female\n187              5703 1.610099e+02     3   Male\n188              9573 4.082192e-01     5 Female\n189              5852 8.273973e-01     2   Male\n190              5971 4.601770e+02     3 Female\n191              7015 1.389073e+02    14 Female\n192              8221 3.867133e+01     9 Female\n193              6752 9.260274e-01    14 
Female\n194              7436 5.918874e+01     9 Female\n195              6869 1.870861e+02     8 Female\n196              8947 4.328767e-01     7   Male\n197              7360 6.301370e-02    13   Male\n198              7494 3.000000e-01     8 Female\n199              8243 1.548013e+02     6   Male\n200              6176 5.819536e+01    12 Female\n201              6818 1.724338e+02    14 Female\n202              8083 1.932401e+01    15 Female\n203              6711 2.164420e+00     2 Female\n204              8890 9.757412e-01     4 Female\n205              5576 1.509434e-01     3   Male\n206              8396 1.509434e-01     3 Female\n207              5986 7.766571e+01     3   Male\n208              9758 4.319563e+01     4 Female\n209              5444 1.752022e-01     3   Male\n210              6394 3.094775e+01    14 Female\n211              5694 1.266846e-01     8   Male\n212              9604 2.919806e+01     7   Male\n213              7895 9.545455e+00    14 Female\n214              5141 2.735115e+01    13 Female\n215              8034 1.314841e+02    13 Female\n216              6566 3.643985e+01     7   Male\n217              6827 1.498559e+02     8 Female\n218              7400 9.363636e+00    10 Female\n219              9094 2.479784e-01     9   Male\n220              9474 5.390836e-02     9 Female\n221              7984 8.787062e-01     3 Female\n222              9524 1.994609e-01     4   Male\n223              9598 3.000000e-01     4 Female\n224              9664 3.000000e-01     4   Male\n225              9910 5.390836e-03     2 Female\n226              9216 4.177898e-01     1 Female\n227              9706 3.000000e-01     3 Female\n228              5320 2.479784e-01     2   Male\n229              5256 2.964960e-02     3   Male\n230              9006 2.964960e-01     5   Male\n231              6413 5.148248e+00     2 Female\n232              8717 1.994609e-01     2   Male\n233              9873 3.000000e-01     9   Male\n234              6699 
1.779539e+02    13   Male\n235              8228 3.290210e+02    10 Female\n236              6494 3.000000e-01     6   Male\n237              9294 1.809798e+02    13 Female\n238              7680 4.905660e-01    11   Male\n239              7534 1.266846e-01    10   Male\n240              9920 1.543948e+02     8 Female\n241              9814 1.379683e+02     9 Female\n242              5363 6.153846e+02    10   Male\n243              5842 1.474784e+02    14   Male\n244              7992 3.000000e-01     1 Female\n245              5565 1.024259e+00     2   Male\n246              5258 4.444056e+02     3 Female\n247              8200 3.000000e-01     2   Male\n248              8795 2.504043e+00     3 Female\n249              7676 3.000000e-01     2 Female\n250              7029 3.000000e-01     3 Female\n251              7535 7.816712e-02     5 Female\n252              5026 3.000000e-01    10 Female\n253              8630 5.390836e-02     7   Male\n254              6989 1.494236e+02    13 Female\n255              8454 5.972622e+01    15   Male\n256              9741 6.361186e-01    11 Female\n257              6418 1.837896e+02    10 Female\n258              9922 1.320809e+02     3 Female\n259              8504 1.571906e-01     2   Male\n260              6491 1.520231e+02     3   Male\n261              6002 3.000000e-01     3 Female\n262              7127 3.000000e-01     3 Female\n263              8540 1.823699e+02     4   Male\n264              7115 3.000000e-01     3   Male\n265              7268 2.173913e+00     2   Male\n266              8279 2.142202e+01     4   Male\n267              8880 3.000000e-01     2 Female\n268              8076 3.408027e+00     8   Male\n269              6250 4.155963e+01    11   Male\n270              8542 9.698997e-02     6   Male\n271              5393 1.238532e+01    14 Female\n272              9197 9.528926e+00    14   Male\n273              6651 1.916185e+02     5 Female\n274              7473 1.060201e+00     5   Male\n275          
    6589 3.679104e+02    10 Female\n276              6867 4.288991e+01    13   Male\n277              5413 9.971098e+01     6   Male\n278              6765 3.000000e-01     5   Male\n279              8933 1.208092e+02    12   Male\n280              6294 3.000000e-01     2   Male\n281              8688 6.688963e-03     3 Female\n282              8108 2.505017e+00     1 Female\n283              6926 1.481605e+00     1   Male\n284              5880 3.000000e-01     1 Female\n285              5529 5.183946e-01     2 Female\n286              8963 3.000000e-01     5 Female\n287              9594 1.872910e-01     5   Male\n288              8075 3.678930e-01     4 Female\n289              5680 3.000000e-01     2   Male\n290              5617 4.529851e+02    NA Female\n291              5080 3.169725e+01     6 Female\n292              7719 3.000000e-01     8   Male\n293              6780 4.922018e+01    15   Male\n294              8768 2.548507e+02    11   Male\n295              7031 1.661850e+02    14   Male\n296              7740 9.164179e+02     6   Male\n297              8855 3.678930e-01    10 Female\n298              7241 1.236994e+02    12   Male\n299              8156 6.705202e+01    14   Male\n300              7333 3.834862e+01    10   Male\n301              6906 1.963211e+00     1 Female\n302              9511 3.000000e-01     3   Male\n303              9336 2.474916e-01     2   Male\n304              6644 3.000000e-01     3 Female\n305              5554 2.173913e-01     4   Male\n306              8094 8.193980e-01     3   Male\n307              8836 2.444816e+00     4 Female\n308              7147 3.000000e-01     4   Male\n309              7745 1.571906e-01     1 Female\n310              9345 1.849711e+02     7   Male\n311              5606 6.119403e+02    11 Female\n312              9766 3.000000e-01     7 Female\n313              6666 4.280936e-01     5 Female\n314              9965 9.698997e-02    10   Male\n315              7927 3.678930e-02     9 Female\n316 
             6266 4.832090e+02    13   Male\n317              9487 1.390173e+02    11 Female\n318              7089 3.000000e-01    13   Male\n319              5731 6.555970e+02     9 Female\n320              7962 1.526012e+02    15 Female\n321              9532 3.000000e-01     7 Female\n322              6687 7.222222e-01     4   Male\n323              6570 7.724426e+01     1   Male\n324              5781 3.000000e-01     1   Male\n325              8935 6.111111e-01     2 Female\n326              5780 1.555556e+00     2 Female\n327              9029 3.055556e-01     3   Male\n328              5668 1.500000e+00     2   Male\n329              8203 1.470772e+02     3   Male\n330              7381 1.694444e+00     4 Female\n331              7734 3.138298e+02     7 Female\n332              7257 1.414405e+02    11 Female\n333              8418 1.990605e+02    10 Female\n334              8259 4.212766e+02     5   Male\n335              5587 3.000000e-01     8   Male\n336              8499 3.000000e-01    15   Male\n337              7897 6.478723e+02    14   Male\n338              8300 3.000000e-01     2   Male\n339              9691 2.222222e+00     2 Female\n340              5873 3.000000e-01     2   Male\n341              6690 2.055556e+00     5   Male\n342              9970 2.777778e-02     4 Female\n343              8978 8.333333e-02     3   Male\n344              6181 1.032359e+02     5 Female\n345              8218 1.611111e+00     4 Female\n346              5387 8.333333e-02     2 Female\n347              7850 2.333333e+00     1 Female\n348              7326 5.755319e+02     7   Male\n349              8448 1.686848e+02     8 Female\n350              7264 1.111111e-01    NA   Male\n351              8361 3.000000e-01     9   Male\n352              7497 8.372340e+02     8 Female\n353              5559 3.000000e-01     5   Male\n354              7321 3.784504e+01    14   Male\n355              8372 3.819149e+02    14   Male\n356              5030 5.555556e-02     7 
Female\n357              6936 3.000000e+02    13 Female\n358              9628 1.855950e+02     2   Male\n359              8558 1.944444e-01     1 Female\n360              7840 3.000000e-01     1   Male\n361              5100 5.555556e-02     4 Female\n362              8244 1.138889e+00     3   Male\n363              9115 4.254237e+01     4 Female\n364              5489 3.000000e-01     3   Male\n365              5766 3.000000e-01     1   Male\n366              5024 3.000000e-01     5 Female\n367              8599 3.000000e-01     4 Female\n368              8895 3.138298e+02     4 Female\n369              7708 1.235908e+02     4   Male\n370              7646 4.159574e+02    11   Male\n371              6640 3.009685e+01    15 Female\n372              8958 1.567850e+02    12 Female\n373              6477 1.367432e+02    11 Female\n374              7910 3.731235e+01     8 Female\n375              7829 9.164927e+01    13   Male\n376              7503 2.936170e+02    10 Female\n377              5209 8.820459e+01    10 Female\n378              6763 1.035491e+02    15   Male\n379              8976 7.379958e+01     8 Female\n380              9223 3.000000e-01    14   Male\n381              7692 1.718750e+02     4   Male\n382              7453 2.128527e+00     1   Male\n383              9775 1.253918e+00     5 Female\n384              9662 2.382445e-01     2   Male\n385              8733 4.639498e-01     2 Female\n386              5695 1.253918e-01     4   Male\n387              7714 1.253918e-01     4   Male\n388              9224 3.000000e-01     2 Female\n389              7635 1.000000e+00     3   Male\n390              7176 1.570043e+02    11   Male\n391              6102 4.344086e+02    10 Female\n392              7817 2.184953e+00     6   Male\n393              9719 1.507837e+00    12 Female\n394              9740 3.228840e-01    10 Female\n395              9528 4.588024e+01     8   Male\n396              7142 1.660560e+02     8   Male\n397              5689 
3.000000e-01    13   Male\n398              5439 3.043011e+02    10   Male\n399              6718 2.612903e+02    13 Female\n400              6569 1.621767e+02    10   Male\n401              9444 3.228840e-01     2   Male\n402              6964 4.639498e-01     4 Female\n403              6420 2.495298e+00     3 Female\n404              9189 3.257053e+00     2 Female\n405              9368 3.793103e-01     1 Female\n406              6360           NA     3   Male\n407              8196 6.896552e-02     3 Female\n408              8297 3.000000e-01     4   Male\n409              6674 1.423197e+00     5 Female\n410              5269 3.000000e-01     5 Female\n411              6599 3.000000e-01     1 Female\n412              7713 1.786638e+02    11   Male\n413              8644 3.279570e+02     6   Male\n414              9680           NA    14 Female\n415              6305 1.903017e+02     8   Male\n416              8493 1.654095e+02     8 Female\n417              5297 4.639498e-01     9 Female\n418              7723 1.815733e+02     7   Male\n419              7510 1.366771e+00     6   Male\n420              5102 1.536050e-01    12 Female\n421              7816 1.306587e+01     8   Male\n422              5143 2.129032e+02    11 Female\n423              7414 1.925647e+02    14   Male\n424              5127 3.000000e-01     3 Female\n425              5830 1.028213e+00     1 Female\n426              8929 3.793103e-01     5 Female\n427              7993 8.025078e-01     2 Female\n428              8092 4.860215e+02     3 Female\n429              9750 3.000000e-01     4 Female\n430              6660 2.100313e-01     2   Male\n431              8054 2.767665e+01     3 Female\n432              6086 1.592476e+00     4   Male\n433              6878 9.717868e-02     1 Female\n434              8125 1.028213e+00     7 Female\n435              9500 3.793103e-01    10   Male\n436              8105 1.292026e+02    11   Male\n437              9593 4.425150e+01     7 Female\n438          
    5202 3.193548e+02    10 Female\n439              7207 1.860991e+02    14 Female\n440              5518 6.614420e-01     7 Female\n441              9820 5.203762e-01    11   Male\n442              6958 1.330819e+02    12   Male\n443              9445 1.673491e+02    10 Female\n444              8774 3.000000e-01     6   Male\n445              9614 1.117457e+02    13   Male\n446              9810 3.045509e+01     8 Female\n447              7271 3.000000e-01     2   Male\n448              8031 8.280255e-02     3 Female\n449              7232 3.000000e-01     1 Female\n450              7452 1.200637e+00     2 Female\n451              5921 1.687898e-01    NA   Male\n452              8136 7.367273e+02    NA Female\n453              6605 8.280255e-02     4   Male\n454              5125 5.127389e-01     4   Male\n455              5911 1.974522e-01     1   Male\n456              9644 7.993631e-01     2 Female\n457              5760 3.000000e-01     2   Male\n458              7055 3.298182e+02    12   Male\n459              9064 9.736842e+01    12 Female\n460              6925 3.000000e-01     8 Female\n461              7757 3.000000e-01    14 Female\n462              8527 4.214545e+02    13 Female\n463              8521 3.000000e-01     6   Male\n464              6260 2.578182e+02    11 Female\n465              9578 2.261147e-01    11   Male\n466              9570 3.000000e-01    10 Female\n467              6246 1.883901e+02    12   Male\n468              9622 9.458204e+01    14 Female\n469              7661 3.000000e-01    11 Female\n470              9374 3.000000e-01     1   Male\n471              8446 7.707006e-01     2 Female\n472              8332 5.032727e+02     3   Male\n473              8008 1.544586e+00     3 Female\n474              9365 1.431115e+02     5 Female\n475              9819 3.000000e-01     3   Male\n476              5173 1.458599e+00     1   Male\n477              6722 1.247678e+02     4 Female\n478              7668           NA     4 Female\n479 
             8980 4.334545e+02     4   Male\n480              5204 3.000000e-01     2 Female\n481              6412 6.156364e+02     5 Female\n482              6404 9.574303e+01     7   Male\n483              5693 1.928019e+02     8   Male\n484              8100 1.888545e+02    10   Male\n485              9760 1.598297e+02     6 Female\n486              6377 5.127389e-01     7   Male\n487              6012 1.171053e+02    10 Female\n488              6224           NA     6   Male\n489              6561 2.547771e-02     6 Female\n490              8475 1.707430e+02    15 Female\n491              6629 3.000000e-01     5   Male\n492              7200 1.869969e+02     3   Male\n493              9453 4.731481e+01     5   Male\n494              6449 1.988390e+02     3 Female\n495              9452 3.000000e-01     5   Male\n496              7162 8.808050e+01     5   Male\n497              8962 2.003185e+00     1 Female\n498              7328 3.000000e-01     1   Male\n499              9097 3.509259e+01     7 Female\n500              9131 9.365325e+01    14 Female\n501              7280 3.000000e-01     9   Male\n502              5783 3.736111e+01    10 Female\n503              9895 1.674923e+02    10 Female\n504              7986 8.808050e+01    11   Male\n505              7146 1.656347e+02    11 Female\n506              8671 3.722222e+01    12 Female\n507              5273 6.756364e+02    11 Female\n508              5063 3.000000e-01    12   Male\n509              6729 1.698142e+02    12   Male\n510              9085 1.628483e+02    10 Female\n511              9929 5.985130e-01     1   Male\n512              8479 1.903346e+00     2 Female\n513              7395 3.000000e-01     4   Male\n514              6374 3.000000e-01     2   Male\n515              7878 8.996283e-01     3   Male\n516              9603 3.977695e-01     3 Female\n517              7994 3.000000e-01     2   Male\n518              5277 3.000000e-01     4   Male\n519              5054 3.000000e-01     3   
Male\n520              5440 3.000000e-01     1 Female\n521              6551 7.446809e+02     4   Male\n522              5281 6.095745e+02    12 Female\n523              7145 1.427445e+02     6   Male\n524              5275 3.000000e-01     7 Female\n525              9542 2.973978e-02     7   Male\n526              9371 3.977695e-01    13 Female\n527              5598 4.095745e+02     8 Female\n528              7148 4.595745e+02     7   Male\n529              5624 3.000000e-01     8 Female\n530              6998 1.976341e+02     8 Female\n531              9286 3.776596e+02    11 Female\n532              7589 1.777603e+02    14 Female\n533              7095 4.312268e-01     3   Male\n534              5455 6.765957e+02     2 Female\n535              6257 7.978723e+02     2   Male\n536              8627 9.665427e-02     3   Male\n537              9786 1.879338e+02     2   Male\n538              8176 4.358670e+01     2 Female\n539              9198 3.000000e-01     3 Female\n540              6586 3.000000e-01     2   Male\n541              8850 2.638955e+01     5   Male\n542              9560 3.180523e+01    10 Female\n543              7144 1.746845e+02    14   Male\n544              8230 1.876972e+02     9   Male\n545              7559 1.044164e+02     6   Male\n546              5312 1.202681e+02     7   Male\n547              6560 1.630915e+02    14 Female\n548              6091 1.276025e+02     7 Female\n549              5578 8.880126e+01     7   Male\n550              5837 3.563830e+02     9   Male\n551              8347 2.212766e+02    14   Male\n552              6453 1.969121e+01    10 Female\n553              5758 3.755319e+02    13 Female\n554              5569 1.214511e+02     5   Male\n555              8766 1.034700e+02     4 Female\n556              8002 3.000000e-01     4 Female\n557              7839 3.643123e-01     5 Female\n558              5434 6.319703e-02     4 Female\n559              7636 3.000000e-01     4   Male\n560              6164 
3.000000e-01     4   Male\n561              9243 3.000000e-01     3 Female\n562              5872 3.000000e-01     1 Female\n563              8079 3.000000e-01     4   Male\n564              9762 3.000000e-01     1   Male\n565              9476 3.000000e-01     1 Female\n566              8345 3.000000e-01     7   Male\n567              8128 1.664038e+02    13 Female\n568              7956 2.946809e+02    10 Female\n569              8677 4.391924e+01    14   Male\n570              5881 1.874606e+02    12 Female\n571              7498 1.143533e+02    14   Male\n572              8134 1.600158e+02     8   Male\n573              7748 1.635688e-01     7   Male\n574              7990 8.809148e+01    11 Female\n575              6184 1.337539e+02     8   Male\n576              6339 1.985804e+02    12   Male\n577              5113 1.578864e+02     9 Female\n578              9449 3.000000e-01     5 Female\n579              8110 3.000000e-01     4   Male\n580              9307 1.953642e-01     3 Female\n581              5555 1.119205e+00     2   Male\n582              9152 2.523636e+02     2   Male\n583              7969 3.000000e-01     3   Male\n584              6116 4.844371e+00     4 Female\n585              8294 3.000000e-01     4   Male\n586              8938 1.492553e+02     4 Female\n587              9539 1.993617e+02     5   Male\n588              9470 2.847682e-01     3 Female\n589              6677 3.145695e-01     6 Female\n590              8752 3.000000e-01     3   Male\n591              5574 3.406429e+01    11 Female\n592              5989 6.595745e+01    11   Male\n593              9813 3.000000e-01     7   Male\n594              6150 2.174545e+02     8   Male\n595              5730           NA     6 Female\n596              8038 5.957447e+01    10 Female\n597              5964 7.236364e+02     8 Female\n598              9043 3.000000e-01     8   Male\n599              5095 3.000000e-01     9 Female\n600              8922 3.000000e-01     8   Male\n601          
    5469 2.676364e+02    13   Male\n602              6726 1.891489e+02    11   Male\n603              7495 3.036364e+02     8 Female\n604              8159 3.000000e-01     2 Female\n605              6709 3.000000e-01     4   Male\n606              5855 3.000000e-01     2   Male\n607              6058 3.000000e-01     2 Female\n608              7292 3.000000e-01     4   Male\n609              6437 1.447020e+00     2   Male\n610              9326 2.130909e+02     4 Female\n611              8222 1.357616e-01     2 Female\n612              6789 3.000000e-01     4 Female\n613              6348 3.000000e-01     1 Female\n614              5958 5.534545e+02     4 Female\n615              9211 1.891489e+02    12 Female\n616              9450 7.202128e+01     7 Female\n617              6540 3.250287e+01    11   Male\n618              8796 1.655629e-02     6   Male\n619              7971 3.123636e+02     8   Male\n620              7549 3.000000e-01    14   Male\n621              9799 7.138298e+01    11   Male\n622              7013 3.000000e-01     7 Female\n623              5599 6.946809e+01    14 Female\n624              8601 4.012629e+01     6   Male\n625              7383 1.629787e+02    13 Female\n626              6656 1.508511e+02    13 Female\n627              5641 1.655629e-02     3   Male\n628              6222 3.000000e-01     1   Male\n629              7674 4.635762e-02     3   Male\n630              5293 3.000000e-01     1 Female\n631              6715 3.000000e-01     1 Female\n632              7057 3.000000e-01     2   Male\n633              7072 1.942553e+02     4   Male\n634              6380 3.690909e+02     4   Male\n635              6762 3.000000e-01     2 Female\n636              5799 3.000000e-01     4 Female\n637              6681 2.847682e+00     5   Male\n638              8755 1.435106e+02     3 Female\n639              6896 3.000000e-01     3   Male\n640              5945 4.752009e+01     6 Female\n641              5035 2.621125e+01    11 Female\n642 
             6776 1.055319e+02     9 Female\n643              7863 3.000000e-01     7 Female\n644              9836 1.149007e+00     8   Male\n645              7860 2.927273e+02    NA Female\n646              5248 3.000000e-01     8 Female\n647              5677 3.000000e-01    14 Female\n648              9576 4.839265e+01    10   Male\n649              5824 3.000000e-01    10   Male\n650              9184 3.000000e-01    11 Female\n651              5397 2.251656e-01    13 Female\n```\n:::\n:::\n\n::: {.cell}\n\n```{.r .cell-code}\ndf$slum <- NULL # this is the same as above\n```\n:::\n\nWe can also grab the `age` column using the `$` operator. \n\n::: {.cell}\n\n```{.r .cell-code}\ndf$age\n```\n\n::: {.cell-output .cell-output-stdout}\n```\n  [1] 3.176895e-01 3.436823e+00 3.000000e-01 1.432363e+02 4.476534e-01\n  [6] 2.527076e-02 6.101083e-01 3.000000e-01 2.916968e+00 1.649819e+00\n [11] 4.574007e+00 1.583904e+02           NA 1.065068e+02 1.113870e+02\n [16] 4.144893e+01 3.000000e-01 2.527076e-01 8.159247e+01 1.825342e+02\n [21] 4.244656e+01 1.193493e+02 3.000000e-01 3.000000e-01 9.025271e-01\n [26] 3.501805e-01 3.000000e-01 1.227437e+00 1.702055e+02 3.000000e-01\n [31] 4.801444e-01 2.527076e-02 3.000000e-01 5.776173e-02 4.801444e-01\n [36] 3.826715e-01 3.000000e-01 4.048558e+02 3.000000e-01 5.451264e-01\n [41] 3.000000e-01 5.590753e+01 2.202166e-01 1.709760e+02 1.227437e+00\n [46] 4.567527e+02 4.838480e+01 1.227437e-01 1.877256e-01 3.000000e-01\n [51] 3.501805e-01 3.339350e+00 3.000000e-01 5.451264e-01           NA\n [56] 2.104693e+00           NA 3.826715e-01 3.926366e+01 1.129964e+00\n [61] 3.501805e+00 7.542808e+01 4.800475e+01 1.000000e+00 4.068884e+01\n [66] 3.000000e-01 4.377672e+01 1.193493e+02 6.977740e+01 1.373288e+02\n [71] 1.642979e+02           NA 1.542808e+02 6.033058e-01 2.809917e-01\n [76] 1.966942e+00 2.041322e+00 2.115702e+00 4.663043e+02 3.000000e-01\n [81] 1.500796e+02 1.543790e+02 2.561983e-01 1.596338e+02 1.732484e+02\n [86] 4.641304e+02 
3.736364e+01 1.572452e+02 3.000000e-01 3.000000e-01\n [91] 8.264463e-02 6.776859e-01 7.272727e-01 2.066116e-01 1.966942e+00\n [96] 3.000000e-01 3.000000e-01 2.809917e-01 8.016529e-01 1.818182e-01\n[101] 1.818182e-01 8.264463e-02 3.422727e+01 8.743506e+00 3.000000e-01\n[106] 1.641720e+02 4.049587e-01 1.001592e+02 4.489130e+02 1.101911e+02\n[111] 4.440909e+01 1.288217e+02 2.840909e+01 1.003981e+02 8.512397e-01\n[116] 1.322314e-01 1.297521e+00 1.570248e-01 1.966942e+00 1.536624e+02\n[121] 3.000000e-01 3.000000e-01 1.074380e+00 1.099174e+00 3.057851e-01\n[126] 3.000000e-01 5.785124e-02 4.391304e+02 6.130435e+02 1.074380e-01\n[131] 7.125796e+01 4.222727e+01 1.620223e+02 3.750000e+01 1.534236e+02\n[136] 6.239130e+02 5.521739e+02 5.785124e-02 6.547945e-01 8.767123e-02\n[141] 3.000000e-01 2.849315e+00 3.835616e-02 2.849315e-01 4.649315e+00\n[146] 1.369863e-01 3.589041e-01 1.049315e+00 4.668998e+01 1.473510e+02\n[151] 4.589744e+01 2.109589e-01 1.741722e+02 2.496503e+01 1.850993e+02\n[156] 1.863014e-01 1.863014e-01 4.589744e+01 1.942881e+02 5.079646e+02\n[161] 8.767123e-01 2.750685e+00 1.503311e+02 3.000000e-01 3.095890e-01\n[166] 3.000000e-01 6.371681e+02 6.054795e-01 1.955298e+02 1.786424e+02\n[171] 1.120861e+02 1.331954e+02 2.159292e+02 5.628319e+02 1.900662e+02\n[176] 6.547945e-01 1.665753e+00 1.739238e+02 9.991722e+01 9.321192e+01\n[181] 8.767123e-02           NA 6.794521e-01 5.808219e-01 1.369863e-01\n[186] 2.060274e+00 1.610099e+02 4.082192e-01 8.273973e-01 4.601770e+02\n[191] 1.389073e+02 3.867133e+01 9.260274e-01 5.918874e+01 1.870861e+02\n[196] 4.328767e-01 6.301370e-02 3.000000e-01 1.548013e+02 5.819536e+01\n[201] 1.724338e+02 1.932401e+01 2.164420e+00 9.757412e-01 1.509434e-01\n[206] 1.509434e-01 7.766571e+01 4.319563e+01 1.752022e-01 3.094775e+01\n[211] 1.266846e-01 2.919806e+01 9.545455e+00 2.735115e+01 1.314841e+02\n[216] 3.643985e+01 1.498559e+02 9.363636e+00 2.479784e-01 5.390836e-02\n[221] 8.787062e-01 1.994609e-01 3.000000e-01 3.000000e-01 
5.390836e-03\n[226] 4.177898e-01 3.000000e-01 2.479784e-01 2.964960e-02 2.964960e-01\n[231] 5.148248e+00 1.994609e-01 3.000000e-01 1.779539e+02 3.290210e+02\n[236] 3.000000e-01 1.809798e+02 4.905660e-01 1.266846e-01 1.543948e+02\n[241] 1.379683e+02 6.153846e+02 1.474784e+02 3.000000e-01 1.024259e+00\n[246] 4.444056e+02 3.000000e-01 2.504043e+00 3.000000e-01 3.000000e-01\n[251] 7.816712e-02 3.000000e-01 5.390836e-02 1.494236e+02 5.972622e+01\n[256] 6.361186e-01 1.837896e+02 1.320809e+02 1.571906e-01 1.520231e+02\n[261] 3.000000e-01 3.000000e-01 1.823699e+02 3.000000e-01 2.173913e+00\n[266] 2.142202e+01 3.000000e-01 3.408027e+00 4.155963e+01 9.698997e-02\n[271] 1.238532e+01 9.528926e+00 1.916185e+02 1.060201e+00 3.679104e+02\n[276] 4.288991e+01 9.971098e+01 3.000000e-01 1.208092e+02 3.000000e-01\n[281] 6.688963e-03 2.505017e+00 1.481605e+00 3.000000e-01 5.183946e-01\n[286] 3.000000e-01 1.872910e-01 3.678930e-01 3.000000e-01 4.529851e+02\n[291] 3.169725e+01 3.000000e-01 4.922018e+01 2.548507e+02 1.661850e+02\n[296] 9.164179e+02 3.678930e-01 1.236994e+02 6.705202e+01 3.834862e+01\n[301] 1.963211e+00 3.000000e-01 2.474916e-01 3.000000e-01 2.173913e-01\n[306] 8.193980e-01 2.444816e+00 3.000000e-01 1.571906e-01 1.849711e+02\n[311] 6.119403e+02 3.000000e-01 4.280936e-01 9.698997e-02 3.678930e-02\n[316] 4.832090e+02 1.390173e+02 3.000000e-01 6.555970e+02 1.526012e+02\n[321] 3.000000e-01 7.222222e-01 7.724426e+01 3.000000e-01 6.111111e-01\n[326] 1.555556e+00 3.055556e-01 1.500000e+00 1.470772e+02 1.694444e+00\n[331] 3.138298e+02 1.414405e+02 1.990605e+02 4.212766e+02 3.000000e-01\n[336] 3.000000e-01 6.478723e+02 3.000000e-01 2.222222e+00 3.000000e-01\n[341] 2.055556e+00 2.777778e-02 8.333333e-02 1.032359e+02 1.611111e+00\n[346] 8.333333e-02 2.333333e+00 5.755319e+02 1.686848e+02 1.111111e-01\n[351] 3.000000e-01 8.372340e+02 3.000000e-01 3.784504e+01 3.819149e+02\n[356] 5.555556e-02 3.000000e+02 1.855950e+02 1.944444e-01 3.000000e-01\n[361] 5.555556e-02 1.138889e+00 
4.254237e+01 3.000000e-01 3.000000e-01\n[366] 3.000000e-01 3.000000e-01 3.138298e+02 1.235908e+02 4.159574e+02\n[371] 3.009685e+01 1.567850e+02 1.367432e+02 3.731235e+01 9.164927e+01\n[376] 2.936170e+02 8.820459e+01 1.035491e+02 7.379958e+01 3.000000e-01\n[381] 1.718750e+02 2.128527e+00 1.253918e+00 2.382445e-01 4.639498e-01\n[386] 1.253918e-01 1.253918e-01 3.000000e-01 1.000000e+00 1.570043e+02\n[391] 4.344086e+02 2.184953e+00 1.507837e+00 3.228840e-01 4.588024e+01\n[396] 1.660560e+02 3.000000e-01 3.043011e+02 2.612903e+02 1.621767e+02\n[401] 3.228840e-01 4.639498e-01 2.495298e+00 3.257053e+00 3.793103e-01\n[406]           NA 6.896552e-02 3.000000e-01 1.423197e+00 3.000000e-01\n[411] 3.000000e-01 1.786638e+02 3.279570e+02           NA 1.903017e+02\n[416] 1.654095e+02 4.639498e-01 1.815733e+02 1.366771e+00 1.536050e-01\n[421] 1.306587e+01 2.129032e+02 1.925647e+02 3.000000e-01 1.028213e+00\n[426] 3.793103e-01 8.025078e-01 4.860215e+02 3.000000e-01 2.100313e-01\n[431] 2.767665e+01 1.592476e+00 9.717868e-02 1.028213e+00 3.793103e-01\n[436] 1.292026e+02 4.425150e+01 3.193548e+02 1.860991e+02 6.614420e-01\n[441] 5.203762e-01 1.330819e+02 1.673491e+02 3.000000e-01 1.117457e+02\n[446] 3.045509e+01 3.000000e-01 8.280255e-02 3.000000e-01 1.200637e+00\n[451] 1.687898e-01 7.367273e+02 8.280255e-02 5.127389e-01 1.974522e-01\n[456] 7.993631e-01 3.000000e-01 3.298182e+02 9.736842e+01 3.000000e-01\n[461] 3.000000e-01 4.214545e+02 3.000000e-01 2.578182e+02 2.261147e-01\n[466] 3.000000e-01 1.883901e+02 9.458204e+01 3.000000e-01 3.000000e-01\n[471] 7.707006e-01 5.032727e+02 1.544586e+00 1.431115e+02 3.000000e-01\n[476] 1.458599e+00 1.247678e+02           NA 4.334545e+02 3.000000e-01\n[481] 6.156364e+02 9.574303e+01 1.928019e+02 1.888545e+02 1.598297e+02\n[486] 5.127389e-01 1.171053e+02           NA 2.547771e-02 1.707430e+02\n[491] 3.000000e-01 1.869969e+02 4.731481e+01 1.988390e+02 3.000000e-01\n[496] 8.808050e+01 2.003185e+00 3.000000e-01 3.509259e+01 9.365325e+01\n[501] 
3.000000e-01 3.736111e+01 1.674923e+02 8.808050e+01 1.656347e+02\n[506] 3.722222e+01 6.756364e+02 3.000000e-01 1.698142e+02 1.628483e+02\n[511] 5.985130e-01 1.903346e+00 3.000000e-01 3.000000e-01 8.996283e-01\n[516] 3.977695e-01 3.000000e-01 3.000000e-01 3.000000e-01 3.000000e-01\n[521] 7.446809e+02 6.095745e+02 1.427445e+02 3.000000e-01 2.973978e-02\n[526] 3.977695e-01 4.095745e+02 4.595745e+02 3.000000e-01 1.976341e+02\n[531] 3.776596e+02 1.777603e+02 4.312268e-01 6.765957e+02 7.978723e+02\n[536] 9.665427e-02 1.879338e+02 4.358670e+01 3.000000e-01 3.000000e-01\n[541] 2.638955e+01 3.180523e+01 1.746845e+02 1.876972e+02 1.044164e+02\n[546] 1.202681e+02 1.630915e+02 1.276025e+02 8.880126e+01 3.563830e+02\n[551] 2.212766e+02 1.969121e+01 3.755319e+02 1.214511e+02 1.034700e+02\n[556] 3.000000e-01 3.643123e-01 6.319703e-02 3.000000e-01 3.000000e-01\n[561] 3.000000e-01 3.000000e-01 3.000000e-01 3.000000e-01 3.000000e-01\n[566] 3.000000e-01 1.664038e+02 2.946809e+02 4.391924e+01 1.874606e+02\n[571] 1.143533e+02 1.600158e+02 1.635688e-01 8.809148e+01 1.337539e+02\n[576] 1.985804e+02 1.578864e+02 3.000000e-01 3.000000e-01 1.953642e-01\n[581] 1.119205e+00 2.523636e+02 3.000000e-01 4.844371e+00 3.000000e-01\n[586] 1.492553e+02 1.993617e+02 2.847682e-01 3.145695e-01 3.000000e-01\n[591] 3.406429e+01 6.595745e+01 3.000000e-01 2.174545e+02           NA\n[596] 5.957447e+01 7.236364e+02 3.000000e-01 3.000000e-01 3.000000e-01\n[601] 2.676364e+02 1.891489e+02 3.036364e+02 3.000000e-01 3.000000e-01\n[606] 3.000000e-01 3.000000e-01 3.000000e-01 1.447020e+00 2.130909e+02\n[611] 1.357616e-01 3.000000e-01 3.000000e-01 5.534545e+02 1.891489e+02\n[616] 7.202128e+01 3.250287e+01 1.655629e-02 3.123636e+02 3.000000e-01\n[621] 7.138298e+01 3.000000e-01 6.946809e+01 4.012629e+01 1.629787e+02\n[626] 1.508511e+02 1.655629e-02 3.000000e-01 4.635762e-02 3.000000e-01\n[631] 3.000000e-01 3.000000e-01 1.942553e+02 3.690909e+02 3.000000e-01\n[636] 3.000000e-01 2.847682e+00 1.435106e+02 3.000000e-01 
4.752009e+01\n[641] 2.621125e+01 1.055319e+02 3.000000e-01 1.149007e+00 2.927273e+02\n[646] 3.000000e-01 3.000000e-01 4.839265e+01 3.000000e-01 3.000000e-01\n[651] 2.251656e-01\n```\n:::\n:::\n\n\n\n##  Using indexing to subset by rows\n\nWe can also use indexing to subset by rows. For example, here we pull the 100th observation/row.\n\n::: {.cell}\n\n```{.r .cell-code}\ndf[100,] \n```\n\n::: {.cell-output .cell-output-stdout}\n```\n    IgG_concentration       age age gender     slum\n100              8122 0.1818182   5 Female Non slum\n```\n:::\n:::\n\nAnd, here we pull the `age` of the 100th observation/row.\n\n::: {.cell}\n\n```{.r .cell-code}\ndf[100,\"age\"] \n```\n\n::: {.cell-output .cell-output-stdout}\n```\n[1] 0.1818182\n```\n:::\n:::\n\n \n\n## Logical operators\n\nLogical operators can be evaluated on object(s) in order to return a binary response of TRUE/FALSE.\n\noperator | operator option |description\n-----|-----|-----:\n`<`|%l%|less than\n`<=`|%le%|less than or equal to\n`>`|%g%|greater than\n`>=`|%ge%|greater than or equal to\n`==`||equal to\n`!=`||not equal to\n`x&y`||x and y\n`x|y`||x or y\n`%in%`||match\n`%!in%`||do not match\n\n\n## Logical operators examples\n\nLet's practice.  
First, here is a reminder of what `number.object` contains.\n\n::: {.cell}\n\n```{.r .cell-code}\nnumber.object\n```\n\n::: {.cell-output .cell-output-stdout}\n```\n[1] 3\n```\n:::\n:::\n\n\nNow, we will use logical operators to evaluate the object.\n\n::: {.cell}\n\n```{.r .cell-code}\nnumber.object<4\n```\n\n::: {.cell-output .cell-output-stdout}\n```\n[1] TRUE\n```\n:::\n\n```{.r .cell-code}\nnumber.object>=3\n```\n\n::: {.cell-output .cell-output-stdout}\n```\n[1] TRUE\n```\n:::\n\n```{.r .cell-code}\nnumber.object!=5\n```\n\n::: {.cell-output .cell-output-stdout}\n```\n[1] TRUE\n```\n:::\n\n```{.r .cell-code}\nnumber.object %in% c(6,7,2)\n```\n\n::: {.cell-output .cell-output-stdout}\n```\n[1] FALSE\n```\n:::\n:::\n\n\n\n## Using indexing and logical operators to rename columns\n\n1. We can assign the column names from data frame `df` to an object `cn`, modify `cn` directly using indexing and logical operators, and finally reassign the column names, `cn`, back to the data frame `df`:\n\n\n::: {.cell}\n\n```{.r .cell-code}\ncn <- colnames(df)\ncn\n```\n\n::: {.cell-output .cell-output-stdout}\n```\n[1] \"IgG_concentration\" \"age\"               \"age\"              \n[4] \"gender\"            \"slum\"             \n```\n:::\n\n```{.r .cell-code}\ncn[cn==\"IgG_concentration\"] <- \"IgG_concentration_mIU\" #rename cn to \"IgG_concentration_mIU\" when cn is \"IgG_concentration\"\ncolnames(df) <- cn\n```\n:::\n\n\nNote, I am resetting the column name back to the original name for the sake of the rest of the module.\n\n::: {.cell}\n\n```{.r .cell-code}\ncolnames(df)[colnames(df)==\"IgG_concentration_mIU\"] <- \"IgG_concentration\" #reset\n```\n:::\n\n\n\n##  Using indexing and logical operators to subset data\n\n\nIn this example, we subset by rows, pull only observations with an age of less than or equal to 10, and then save the subset data to `df_lte10`. 
Note that the logical expression `df$age<=10` comes before the comma because we want to subset by rows (the first dimension).\n\n::: {.cell}\n\n```{.r .cell-code}\ndf_lte10 <- df[df$age<=10, ]\n```\n:::\n\nIn this example, we subset by rows and pull only observations with an age of less than or equal to 5 OR greater than 10.\n\n::: {.cell}\n\n```{.r .cell-code}\ndf_lte5_gt10 <- df[df$age<=5 | df$age>10, ]\n```\n:::\n\nLet's check that the subsets worked using the `summary()` function. \n\n::: {.cell}\n\n```{.r .cell-code}\nsummary(df_lte10$age)\n```\n\n::: {.cell-output .cell-output-stdout}\n```\n    Min.  1st Qu.   Median     Mean  3rd Qu.     Max.     NA's \n0.005391 0.300000 0.300000 0.724742 0.640788 9.545455       10 \n```\n:::\n\n```{.r .cell-code}\nsummary(df_lte5_gt10$age)\n```\n\n::: {.cell-output .cell-output-stdout}\n```\n    Min.  1st Qu.   Median     Mean  3rd Qu.     Max.     NA's \n  0.0054   0.3000   1.6018  87.9886 142.8362 916.4179       10 \n```\n:::\n:::\n\n\n\n## Missing values \n\nMissing data need to be carefully described and dealt with in data analysis. 
Understanding the different types of missing data and how you can identify them is the first step to data cleaning.\n\nTypes of \"missing\" values:\n\n-   `NA` - general missing data\n-   `NaN` - stands for \"**N**ot **a** **N**umber\", happens when you do\n    0/0.\n-   `Inf` and `-Inf` - Infinity, happens when you divide a positive\n    number (or negative number) by 0.\n-   blank space - sometimes when data is read in, there is a blank space left\n\n## Logical operators to help identify missing data\n\noperator | operator option |description\n-----|-----|-----:\n`is.na`||is NaN or NA\n`is.nan`||is NaN\n`!is.na`||is not NaN or NA\n`!is.nan`||is not NaN\n`is.infinite`||is infinite\n`any`||are any TRUE\n`which`||which are TRUE\n\n## More logical operators examples\n\n\n::: {.cell}\n\n```{.r .cell-code}\ntest <- c(0,NA, -1)/0\ntest\n```\n\n::: {.cell-output .cell-output-stdout}\n```\n[1]  NaN   NA -Inf\n```\n:::\n\n```{.r .cell-code}\nis.na(test)\n```\n\n::: {.cell-output .cell-output-stdout}\n```\n[1]  TRUE  TRUE FALSE\n```\n:::\n\n```{.r .cell-code}\nis.nan(test)\n```\n\n::: {.cell-output .cell-output-stdout}\n```\n[1]  TRUE FALSE FALSE\n```\n:::\n\n```{.r .cell-code}\nis.infinite(test)\n```\n\n::: {.cell-output .cell-output-stdout}\n```\n[1] FALSE FALSE  TRUE\n```\n:::\n:::\n\n\n## More logical operators examples\n\n`any(is.na(x))` means do we have any `NA`'s in the object `x`?\n\n\n::: {.cell}\n\n```{.r .cell-code}\nany(is.na(df$IgG_concentration)) # are there any NAs - YES/TRUE\n```\n\n::: {.cell-output .cell-output-stdout}\n```\n[1] FALSE\n```\n:::\n\n```{.r .cell-code}\nany(is.na(df$slum)) # are there any NAs- NO/FALSE\n```\n\n::: {.cell-output .cell-output-stdout}\n```\n[1] FALSE\n```\n:::\n:::\n\n\n`which(is.na(x))` means which of the elements in object `x` are `NA`'s?\n\n\n::: {.cell}\n\n```{.r .cell-code}\nwhich(is.na(df$IgG_concentration)) \n```\n\n::: {.cell-output .cell-output-stdout}\n```\ninteger(0)\n```\n:::\n\n```{.r 
.cell-code}\nwhich(is.na(df$slum)) \n```\n\n::: {.cell-output .cell-output-stdout}\n```\ninteger(0)\n```\n:::\n:::\n\n\n## `subset()` function\n\nThe Base R `subset()` function is a slightly easier way to select variables and observations.\n\n\n::: {.cell}\n\n```{.r .cell-code}\n?subset\n```\n:::\n\n\n```\nRegistered S3 method overwritten by 'printr':\n  method                from     \n  knit_print.data.frame rmarkdown\n```\n\nSubsetting Vectors, Matrices and Data Frames\n\nDescription:\n\n     Return subsets of vectors, matrices or data frames which meet\n     conditions.\n\nUsage:\n\n     subset(x, ...)\n     \n     ## Default S3 method:\n     subset(x, subset, ...)\n     \n     ## S3 method for class 'matrix'\n     subset(x, subset, select, drop = FALSE, ...)\n     \n     ## S3 method for class 'data.frame'\n     subset(x, subset, select, drop = FALSE, ...)\n     \nArguments:\n\n       x: object to be subsetted.\n\n  subset: logical expression indicating elements or rows to keep:\n          missing values are taken as false.\n\n  select: expression, indicating columns to select from a data frame.\n\n    drop: passed on to '[' indexing operator.\n\n     ...: further arguments to be passed to or from other methods.\n\nDetails:\n\n     This is a generic function, with methods supplied for matrices,\n     data frames and vectors (including lists).  Packages and users can\n     add further methods.\n\n     For ordinary vectors, the result is simply 'x[subset &\n     !is.na(subset)]'.\n\n     For data frames, the 'subset' argument works on the rows.  Note\n     that 'subset' will be evaluated in the data frame, so columns can\n     be referred to (by name) as variables in the expression (see the\n     examples).\n\n     The 'select' argument exists only for the methods for data frames\n     and matrices.  
It works by first replacing column names in the\n     selection expression with the corresponding column numbers in the\n     data frame and then using the resulting integer vector to index\n     the columns.  This allows the use of the standard indexing\n     conventions so that for example ranges of columns can be specified\n     easily, or single columns can be dropped (see the examples).\n\n     The 'drop' argument is passed on to the indexing method for\n     matrices and data frames: note that the default for matrices is\n     different from that for indexing.\n\n     Factors may have empty levels after subsetting; unused levels are\n     not automatically removed.  See 'droplevels' for a way to drop all\n     unused levels from a data frame.\n\nValue:\n\n     An object similar to 'x' contain just the selected elements (for a\n     vector), rows and columns (for a matrix or data frame), and so on.\n\nWarning:\n\n     This is a convenience function intended for use interactively.\n     For programming it is better to use the standard subsetting\n     functions like '[', and in particular the non-standard evaluation\n     of argument 'subset' can have unanticipated consequences.\n\nAuthor(s):\n\n     Peter Dalgaard and Brian Ripley\n\nSee Also:\n\n     '[', 'transform' 'droplevels'\n\nExamples:\n\n     subset(airquality, Temp > 80, select = c(Ozone, Temp))\n     subset(airquality, Day == 1, select = -Temp)\n     subset(airquality, select = Ozone:Wind)\n     \n     with(airquality, subset(Ozone, Temp > 80))\n     \n     ## sometimes requiring a logical 'subset' argument is a nuisance\n     nm <- rownames(state.x77)\n     start_with_M <- nm %in% grep(\"^M\", nm, value = TRUE)\n     subset(state.x77, start_with_M, Illiteracy:Murder)\n     # but in recent versions of R this can simply be\n     subset(state.x77, grepl(\"^M\", nm), Illiteracy:Murder)\n\n\n## Subsetting use the `subset()` function\n\nHere are a few examples using the `subset()` function\n\n\n::: 
{.cell}\n\n```{.r .cell-code}\ndf_lte10_v2 <- subset(df, df$age<=10, select=c(IgG_concentration, age))\ndf_lt5_f <- subset(df, df$age<=5 & gender==\"Female\", select=c(IgG_concentration, slum))\n```\n:::\n\n\n## `subset()` function vs logical operators\n\n`subset()` automatically drops rows where the condition evaluates to `NA`, which differs from logical indexing with `[`, where an `NA` condition returns a row of `NA`s.\n\n\n::: {.cell}\n\n```{.r .cell-code}\nsummary(df_lte10$age)\n```\n\n::: {.cell-output-display}\n|      Min.| 1st Qu.| Median|      Mean|   3rd Qu.|     Max.| NA's|\n|---------:|-------:|------:|---------:|---------:|--------:|----:|\n| 0.0053908|     0.3|    0.3| 0.7247421| 0.6407876| 9.545454|   10|\n:::\n\n```{.r .cell-code}\nsummary(df_lte10_v2$age)\n```\n\n::: {.cell-output-display}\n|      Min.| 1st Qu.| Median|      Mean|   3rd Qu.|     Max.|\n|---------:|-------:|------:|---------:|---------:|--------:|\n| 0.0053908|     0.3|    0.3| 0.7247421| 0.6407876| 9.545454|\n:::\n:::\n\n\nWe can also see this by looking at the number of rows in each dataset.\n\n\n::: {.cell}\n\n```{.r .cell-code}\nnrow(df_lte10)\n```\n\n::: {.cell-output .cell-output-stdout}\n```\n[1] 370\n```\n:::\n\n```{.r .cell-code}\nnrow(df_lte10_v2)\n```\n\n::: {.cell-output .cell-output-stdout}\n```\n[1] 360\n```\n:::\n:::\n\n\n\n\n## Summary\n\n- The `colnames()`, `str()` and `summary()` functions from Base R are great functions to assess the data type and some summary statistics\n- There are three basic indexing operators: `[ ]`, `[[ ]]` and `$`\n- Indexing can be used to extract part of an object (e.g., subset data) and to replace parts of an object (e.g., rename variables / columns)\n- Logical operators can be evaluated on object(s) in order to return a binary response of TRUE/FALSE, and are useful for decision rules for indexing\n- There are 5 “types” of missing values, the most common being “NA”\n- Logical operators meant to determine missing values are very helpful for data cleaning\n- The Base R `subset()` function is a slightly easier 
way to select variables and observations.\n\n## Acknowledgements\n\nThese are the materials we looked through, modified, or extracted to complete this module's lecture.\n\n-   [\"Introduction to R for Public Health Researchers\" Johns Hopkins University](https://jhudatascience.org/intro_to_r/)\n-   [\"Indexing\" CRAN Project](https://cran.r-project.org/doc/manuals/R-lang.html#Indexing)\n-   [\"Logical operators\" CRAN Project](https://cran.r-project.org/web/packages/extraoperators/vignettes/logicals-vignette.html)\n\n",
+    "markdown": "---\ntitle: \"Module 6: Get to Know Your Data and Subsetting\"\nformat: \n  revealjs:\n    scrollable: true\n    smaller: true\n    toc: false\n#execute: \n#  echo: true\n---\n\n\n## Learning Objectives\n\nAfter module 6, you should be able to...\n\n-   Use basic functions to get to know your data\n-   Use three indexing approaches\n-   Rely on indexing to extract part of an object (e.g., subset data) and to replace parts of an object (e.g., rename variables / columns)\n-   Describe what logical operators are and how to use them\n-   Use the `subset()` function to subset data\n\n\n## Getting to know our data\n\nThe `dim()`, `nrow()`, and `ncol()` functions are good options to check the dimensions of your data before moving forward. \n\nLet's first read in the data from the previous module.\n\n\n::: {.cell}\n\n```{.r .cell-code}\ndf <- read.csv(file = \"data/serodata.csv\") #relative path\n```\n:::\n\n::: {.cell}\n\n```{.r .cell-code}\ndim(df) # rows, columns\n```\n\n::: {.cell-output .cell-output-stdout}\n```\n[1] 651   5\n```\n:::\n\n```{.r .cell-code}\nnrow(df) # number of rows\n```\n\n::: {.cell-output .cell-output-stdout}\n```\n[1] 651\n```\n:::\n\n```{.r .cell-code}\nncol(df) # number of columns\n```\n\n::: {.cell-output .cell-output-stdout}\n```\n[1] 5\n```\n:::\n:::\n\n\n## Quick summary of data\n\nThe `colnames()`, `str()` and `summary()` functions from Base R are great functions to assess the data type and some summary statistics.    \n\n\n::: {.cell}\n\n```{.r .cell-code}\ncolnames(df)\n```\n\n::: {.cell-output .cell-output-stdout}\n```\n[1] \"observation_id\"    \"IgG_concentration\" \"age\"              \n[4] \"gender\"            \"slum\"             \n```\n:::\n\n```{.r .cell-code}\nstr(df)\n```\n\n::: {.cell-output .cell-output-stdout}\n```\n'data.frame':\t651 obs. 
of  5 variables:\n $ observation_id   : int  5772 8095 9784 9338 6369 6885 6252 8913 7332 6941 ...\n $ IgG_concentration: num  0.318 3.437 0.3 143.236 0.448 ...\n $ age              : int  2 4 4 4 1 4 4 NA 4 2 ...\n $ gender           : chr  \"Female\" \"Female\" \"Male\" \"Male\" ...\n $ slum             : chr  \"Non slum\" \"Non slum\" \"Non slum\" \"Non slum\" ...\n```\n:::\n\n```{.r .cell-code}\nsummary(df)\n```\n\n::: {.cell-output .cell-output-stdout}\n```\n observation_id IgG_concentration       age            gender         \n Min.   :5006   Min.   :  0.0054   Min.   : 1.000   Length:651        \n 1st Qu.:6306   1st Qu.:  0.3000   1st Qu.: 3.000   Class :character  \n Median :7495   Median :  1.6658   Median : 6.000   Mode  :character  \n Mean   :7492   Mean   : 87.3683   Mean   : 6.606                     \n 3rd Qu.:8749   3rd Qu.:141.4405   3rd Qu.:10.000                     \n Max.   :9982   Max.   :916.4179   Max.   :15.000                     \n                NA's   :10         NA's   :9                          \n     slum          \n Length:651        \n Class :character  \n Mode  :character  \n                   \n                   \n                   \n                   \n```\n:::\n:::\n\n\nNote, if you have a very large dataset with 15+ variables, `summary()` is not so efficient. \n\n## Description of data\n\nThis is data based on a simulated pathogen X IgG antibody serological survey.  The rows represent individuals. Variables include IgG concentrations in IU/mL, age in years, gender, and residence based on slum characterization.  
We will use this dataset for modules throughout the Workshop.\n\n## View the data as a whole dataframe\n\nThe `View()` function, one of the few Base R functions with a capital letter, can be used to open a new tab in the Console and view the data as you would in Excel.\n\n\n::: {.cell}\n\n```{.r .cell-code}\nView(df)\n```\n:::\n\n::: {.cell}\n::: {.cell-output-display}\n![](images/ViewTab.png){width=100%}\n:::\n:::\n\n\n## View the data as a whole dataframe\n\nYou can also open a new tab of the data by clicking on the data icon beside the object in the Environment pane.\n\n\n::: {.cell}\n::: {.cell-output-display}\n![](images/View.png){width=90%}\n:::\n:::\n\n\nYou can also hold down `Cmd` or `CTRL` and click on the name of a data frame in your code.\n\n## Indexing\n\nR contains several operators which allow access to individual elements or subsets through indexing. Indexing can be used both to extract part of an object and to replace parts of an object (or to add parts). There are three basic indexing operators: `[`, `[[` and `$`. \n\n\n::: {.cell}\n\n```{.r .cell-code}\nx[i] #if x is a vector\nx[i, j] #if x is a matrix/data frame\nx[[i]] #if x is a list\nx$a #if x is a data frame or list\nx$\"a\" #if x is a data frame or list\n```\n:::\n\n\n## Vectors and multi-dimensional objects\n\nTo index a vector, `vector[i]` selects the ith element. To index a multi-dimensional object such as a matrix, `matrix[i, j]` selects the element in row i and column j, whereas for a three-dimensional array, `array[i, j, k]` selects the element in row i, column j, and matrix k. 
\n\nLet's practice by first creating the same objects as we did in Module 1.\n\n::: {.cell}\n\n```{.r .cell-code}\nnumber.object <- 3\ncharacter.object <- \"blue\"\nvector.object1 <- c(2,3,4,5)\nvector.object2 <- c(\"blue\", \"red\", \"yellow\")\nmatrix.object <- matrix(data=vector.object1, nrow=2, ncol=2, byrow=TRUE)\n```\n:::\n\n\nHere is a reminder of what these objects look like.\n\n::: {.cell}\n\n```{.r .cell-code}\nvector.object1\n```\n\n::: {.cell-output .cell-output-stdout}\n```\n[1] 2 3 4 5\n```\n:::\n\n```{.r .cell-code}\nmatrix.object\n```\n\n::: {.cell-output .cell-output-stdout}\n```\n     [,1] [,2]\n[1,]    2    3\n[2,]    4    5\n```\n:::\n:::\n\n\nFinally, let's use indexing to pull out elements of the objects.  \n\n::: {.cell}\n\n```{.r .cell-code}\nvector.object1[2] #pulling the second element\n```\n\n::: {.cell-output .cell-output-stdout}\n```\n[1] 3\n```\n:::\n\n```{.r .cell-code}\nmatrix.object[1,2] #pulling the element in row 1 column 2\n```\n\n::: {.cell-output .cell-output-stdout}\n```\n[1] 3\n```\n:::\n:::\n\n\n\n## List objects\n\nFor lists, one generally uses `list[[p]]` to select any single element p.\n\nLet's practice by creating the same list as we did in Module 1.\n\n::: {.cell}\n\n```{.r .cell-code}\nlist.object <- list(number.object, vector.object2, matrix.object)\nlist.object\n```\n\n::: {.cell-output .cell-output-stdout}\n```\n[[1]]\n[1] 3\n\n[[2]]\n[1] \"blue\"   \"red\"    \"yellow\"\n\n[[3]]\n     [,1] [,2]\n[1,]    2    3\n[2,]    4    5\n```\n:::\n:::\n\n\nNow we use indexing to pull out the 3rd element in the list.\n\n::: {.cell}\n\n```{.r .cell-code}\nlist.object[[3]]\n```\n\n::: {.cell-output .cell-output-stdout}\n```\n     [,1] [,2]\n[1,]    2    3\n[2,]    4    5\n```\n:::\n:::\n\n\nWhat happens if we use a single square bracket?\n\n::: {.cell}\n\n```{.r .cell-code}\nlist.object[3]\n```\n\n::: {.cell-output .cell-output-stdout}\n```\n[[1]]\n     [,1] [,2]\n[1,]    2    3\n[2,]    4    5\n```\n:::\n:::\n\n\nThe `[[` 
operator is called the \"extract\" operator and gives us the element\nfrom the list. The `[` operator is called the \"subset\" operator and gives\nus a subset of the list that is still a list.\n\n## $ for indexing with data frames\n\n`$` allows only a literal character string or a symbol as the index.  For a data frame it extracts a variable.\n\n\n::: {.cell}\n\n```{.r .cell-code}\ndf$IgG_concentration\n```\n:::\n\n\nNote, if you have spaces in your variable name, you will need to wrap the name in backticks (\\`) after the `$`.  This is a good reason to not create variables / column names with spaces.\n\n## $ for indexing with lists\n\n`$` allows only a literal character string or a symbol as the index.  For a list it extracts a named element.\n\nList elements can be named.\n\n::: {.cell}\n\n```{.r .cell-code}\nlist.object.named <- list(\n  emory = number.object,\n  uga = vector.object2,\n  gsu = matrix.object\n)\nlist.object.named\n```\n\n::: {.cell-output .cell-output-stdout}\n```\n$emory\n[1] 3\n\n$uga\n[1] \"blue\"   \"red\"    \"yellow\"\n\n$gsu\n     [,1] [,2]\n[1,]    2    3\n[2,]    4    5\n```\n:::\n:::\n\n\nIf list elements are named, then you can reference data from the list using `$` or using double square brackets, `[[`.\n\n::: {.cell}\n\n```{.r .cell-code}\nlist.object.named$uga \n```\n\n::: {.cell-output .cell-output-stdout}\n```\n[1] \"blue\"   \"red\"    \"yellow\"\n```\n:::\n\n```{.r .cell-code}\nlist.object.named[[\"uga\"]] \n```\n\n::: {.cell-output .cell-output-stdout}\n```\n[1] \"blue\"   \"red\"    \"yellow\"\n```\n:::\n:::\n\n\n\n## Using indexing to rename columns\n\nAs mentioned above, indexing can be used both to extract part of an object and to replace parts of an object (or to add parts).\n\n\n::: {.cell}\n\n```{.r .cell-code}\ncolnames(df) \n```\n\n::: {.cell-output .cell-output-stdout}\n```\n[1] \"observation_id\"    \"IgG_concentration\" \"age\"              \n[4] \"gender\"            \"slum\"             \n```\n:::\n\n```{.r 
.cell-code}\ncolnames(df)[2:3] <- c(\"IgG_concentration_IU/mL\", \"age_year\") # reassigns\ncolnames(df)\n```\n\n::: {.cell-output .cell-output-stdout}\n```\n[1] \"observation_id\"          \"IgG_concentration_IU/mL\"\n[3] \"age_year\"                \"gender\"                 \n[5] \"slum\"                   \n```\n:::\n:::\n\n\nFor the sake of the module, I am going to reassign them back to the original variable names.\n\n::: {.cell}\n\n```{.r .cell-code}\ncolnames(df)[2:3] <- c(\"IgG_concentration\", \"age\") #reset\n```\n:::\n\n\n##  Using indexing to subset by columns\n\nWe can also subset data frames and matrices (2-dimensional objects) using the bracket `[ row , column ]`.  We can subset by columns and pull the `x` column using the index of the column or the column name. Leaving either the row or column dimension blank means to select all of them.\n\nFor example, here I am pulling the 3rd column, which has the variable name `age`, for all of the rows.\n\n::: {.cell}\n\n```{.r .cell-code}\ndf[ , \"age\"] #same as df[ , 3]\n```\n:::\n\nWe can select multiple columns using multiple column names; again, this selects these variables for all of the rows.\n\n::: {.cell}\n\n```{.r .cell-code}\ndf[, c(\"age\", \"gender\")] #same as df[ , c(3,4)]\n```\n\n::: {.cell-output .cell-output-stdout}\n```\n    age gender\n1     2 Female\n2     4 Female\n3     4   Male\n4     4   Male\n5     1   Male\n6     4   Male\n7     4 Female\n8    NA Female\n9     4   Male\n10    2   Male\n11    3   Male\n12   15 Female\n13    8   Male\n14   12   Male\n15   15   Male\n16    9   Male\n17    8   Male\n18    7 Female\n19   11 Female\n20   10   Male\n21    8   Male\n22   11 Female\n23    2   Male\n24    2 Female\n25    3 Female\n26    5   Male\n27    1   Male\n28    3 Female\n29    5 Female\n30    5 Female\n31    3   Male\n32    1   Male\n33    4 Female\n34    3   Male\n35    2 Female\n36   11 Female\n37    7   Male\n38    8   Male\n39    6   Male\n40    6   Male\n41   11 Female\n42   10   
Male\n43    6 Female\n44   12   Male\n45   11   Male\n46   10   Male\n47   11   Male\n48   13 Female\n49    3 Female\n50    4 Female\n51    3   Male\n52    1   Male\n53    2 Female\n54    2 Female\n55    4   Male\n56    2   Male\n57    2   Male\n58    3 Female\n59    3 Female\n60    4   Male\n61    1 Female\n62   13 Female\n63   13 Female\n64    6   Male\n65   13   Male\n66    5 Female\n67   13 Female\n68   14   Male\n69   13   Male\n70    8 Female\n71    7   Male\n72    6 Female\n73   13   Male\n74    3   Male\n75    4   Male\n76    2   Male\n77   NA   Male\n78    5 Female\n79    3   Male\n80    3   Male\n81   14   Male\n82   11 Female\n83    7 Female\n84    7   Male\n85   11 Female\n86    9 Female\n87   14   Male\n88   13 Female\n89    1   Male\n90    1   Male\n91    4   Male\n92    1 Female\n93    2   Male\n94    3 Female\n95    2   Male\n96    1   Male\n97    2   Male\n98    2 Female\n99    4 Female\n100   5 Female\n101   5   Male\n102   6 Female\n103  14 Female\n104  14   Male\n105  10   Male\n106   6 Female\n107   6   Male\n108   8   Male\n109   6 Female\n110  12 Female\n111  12   Male\n112  14 Female\n113  15   Male\n114  12 Female\n115   4 Female\n116   4   Male\n117   3 Female\n118  NA   Male\n119   2 Female\n120   3   Male\n121  NA Female\n122   3 Female\n123   3   Male\n124   2 Female\n125   4 Female\n126  10 Female\n127   7 Female\n128  11 Female\n129   6 Female\n130  11   Male\n131   9   Male\n132   6   Male\n133  13 Female\n134  10 Female\n135   6 Female\n136  11 Female\n137   7   Male\n138   6 Female\n139   4 Female\n140   4 Female\n141   4   Male\n142   4 Female\n143   4   Male\n144   4   Male\n145   3   Male\n146   4 Female\n147   3   Male\n148   3   Male\n149  13 Female\n150   7 Female\n151  10   Male\n152   6   Male\n153  10 Female\n154  12 Female\n155  10   Male\n156  10   Male\n157  13   Male\n158  13 Female\n159   5 Female\n160   3 Female\n161   4   Male\n162   1   Male\n163   3 Female\n164   4   Male\n165   4   Male\n166   1   Male\n167   5 
Female\n168   6 Female\n169  14 Female\n170   6   Male\n171  13 Female\n172   9   Male\n173  11   Male\n174  10   Male\n175   5 Female\n176  14   Male\n177   7   Male\n178  10   Male\n179   6   Male\n180   5   Male\n181   3 Female\n182   4   Male\n183   2 Female\n184   3   Male\n185   3 Female\n186   2 Female\n187   3   Male\n188   5 Female\n189   2   Male\n190   3 Female\n191  14 Female\n192   9 Female\n193  14 Female\n194   9 Female\n195   8 Female\n196   7   Male\n197  13   Male\n198   8 Female\n199   6   Male\n200  12 Female\n201  14 Female\n202  15 Female\n203   2 Female\n204   4 Female\n205   3   Male\n206   3 Female\n207   3   Male\n208   4 Female\n209   3   Male\n210  14 Female\n211   8   Male\n212   7   Male\n213  14 Female\n214  13 Female\n215  13 Female\n216   7   Male\n217   8 Female\n218  10 Female\n219   9   Male\n220   9 Female\n221   3 Female\n222   4   Male\n223   4 Female\n224   4   Male\n225   2 Female\n226   1 Female\n227   3 Female\n228   2   Male\n229   3   Male\n230   5   Male\n231   2 Female\n232   2   Male\n233   9   Male\n234  13   Male\n235  10 Female\n236   6   Male\n237  13 Female\n238  11   Male\n239  10   Male\n240   8 Female\n241   9 Female\n242  10   Male\n243  14   Male\n244   1 Female\n245   2   Male\n246   3 Female\n247   2   Male\n248   3 Female\n249   2 Female\n250   3 Female\n251   5 Female\n252  10 Female\n253   7   Male\n254  13 Female\n255  15   Male\n256  11 Female\n257  10 Female\n258   3 Female\n259   2   Male\n260   3   Male\n261   3 Female\n262   3 Female\n263   4   Male\n264   3   Male\n265   2   Male\n266   4   Male\n267   2 Female\n268   8   Male\n269  11   Male\n270   6   Male\n271  14 Female\n272  14   Male\n273   5 Female\n274   5   Male\n275  10 Female\n276  13   Male\n277   6   Male\n278   5   Male\n279  12   Male\n280   2   Male\n281   3 Female\n282   1 Female\n283   1   Male\n284   1 Female\n285   2 Female\n286   5 Female\n287   5   Male\n288   4 Female\n289   2   Male\n290  NA Female\n291   6 Female\n292   8 
  Male\n293  15   Male\n294  11   Male\n295  14   Male\n296   6   Male\n297  10 Female\n298  12   Male\n299  14   Male\n300  10   Male\n301   1 Female\n302   3   Male\n303   2   Male\n304   3 Female\n305   4   Male\n306   3   Male\n307   4 Female\n308   4   Male\n309   1 Female\n310   7   Male\n311  11 Female\n312   7 Female\n313   5 Female\n314  10   Male\n315   9 Female\n316  13   Male\n317  11 Female\n318  13   Male\n319   9 Female\n320  15 Female\n321   7 Female\n322   4   Male\n323   1   Male\n324   1   Male\n325   2 Female\n326   2 Female\n327   3   Male\n328   2   Male\n329   3   Male\n330   4 Female\n331   7 Female\n332  11 Female\n333  10 Female\n334   5   Male\n335   8   Male\n336  15   Male\n337  14   Male\n338   2   Male\n339   2 Female\n340   2   Male\n341   5   Male\n342   4 Female\n343   3   Male\n344   5 Female\n345   4 Female\n346   2 Female\n347   1 Female\n348   7   Male\n349   8 Female\n350  NA   Male\n351   9   Male\n352   8 Female\n353   5   Male\n354  14   Male\n355  14   Male\n356   7 Female\n357  13 Female\n358   2   Male\n359   1 Female\n360   1   Male\n361   4 Female\n362   3   Male\n363   4 Female\n364   3   Male\n365   1   Male\n366   5 Female\n367   4 Female\n368   4 Female\n369   4   Male\n370  11   Male\n371  15 Female\n372  12 Female\n373  11 Female\n374   8 Female\n375  13   Male\n376  10 Female\n377  10 Female\n378  15   Male\n379   8 Female\n380  14   Male\n381   4   Male\n382   1   Male\n383   5 Female\n384   2   Male\n385   2 Female\n386   4   Male\n387   4   Male\n388   2 Female\n389   3   Male\n390  11   Male\n391  10 Female\n392   6   Male\n393  12 Female\n394  10 Female\n395   8   Male\n396   8   Male\n397  13   Male\n398  10   Male\n399  13 Female\n400  10   Male\n401   2   Male\n402   4 Female\n403   3 Female\n404   2 Female\n405   1 Female\n406   3   Male\n407   3 Female\n408   4   Male\n409   5 Female\n410   5 Female\n411   1 Female\n412  11   Male\n413   6   Male\n414  14 Female\n415   8   Male\n416   8 Female\n417   9 
Female\n418   7   Male\n419   6   Male\n420  12 Female\n421   8   Male\n422  11 Female\n423  14   Male\n424   3 Female\n425   1 Female\n426   5 Female\n427   2 Female\n428   3 Female\n429   4 Female\n430   2   Male\n431   3 Female\n432   4   Male\n433   1 Female\n434   7 Female\n435  10   Male\n436  11   Male\n437   7 Female\n438  10 Female\n439  14 Female\n440   7 Female\n441  11   Male\n442  12   Male\n443  10 Female\n444   6   Male\n445  13   Male\n446   8 Female\n447   2   Male\n448   3 Female\n449   1 Female\n450   2 Female\n451  NA   Male\n452  NA Female\n453   4   Male\n454   4   Male\n455   1   Male\n456   2 Female\n457   2   Male\n458  12   Male\n459  12 Female\n460   8 Female\n461  14 Female\n462  13 Female\n463   6   Male\n464  11 Female\n465  11   Male\n466  10 Female\n467  12   Male\n468  14 Female\n469  11 Female\n470   1   Male\n471   2 Female\n472   3   Male\n473   3 Female\n474   5 Female\n475   3   Male\n476   1   Male\n477   4 Female\n478   4 Female\n479   4   Male\n480   2 Female\n481   5 Female\n482   7   Male\n483   8   Male\n484  10   Male\n485   6 Female\n486   7   Male\n487  10 Female\n488   6   Male\n489   6 Female\n490  15 Female\n491   5   Male\n492   3   Male\n493   5   Male\n494   3 Female\n495   5   Male\n496   5   Male\n497   1 Female\n498   1   Male\n499   7 Female\n500  14 Female\n501   9   Male\n502  10 Female\n503  10 Female\n504  11   Male\n505  11 Female\n506  12 Female\n507  11 Female\n508  12   Male\n509  12   Male\n510  10 Female\n511   1   Male\n512   2 Female\n513   4   Male\n514   2   Male\n515   3   Male\n516   3 Female\n517   2   Male\n518   4   Male\n519   3   Male\n520   1 Female\n521   4   Male\n522  12 Female\n523   6   Male\n524   7 Female\n525   7   Male\n526  13 Female\n527   8 Female\n528   7   Male\n529   8 Female\n530   8 Female\n531  11 Female\n532  14 Female\n533   3   Male\n534   2 Female\n535   2   Male\n536   3   Male\n537   2   Male\n538   2 Female\n539   3 Female\n540   2   Male\n541   5   Male\n542  10 
Female\n543  14   Male\n544   9   Male\n545   6   Male\n546   7   Male\n547  14 Female\n548   7 Female\n549   7   Male\n550   9   Male\n551  14   Male\n552  10 Female\n553  13 Female\n554   5   Male\n555   4 Female\n556   4 Female\n557   5 Female\n558   4 Female\n559   4   Male\n560   4   Male\n561   3 Female\n562   1 Female\n563   4   Male\n564   1   Male\n565   1 Female\n566   7   Male\n567  13 Female\n568  10 Female\n569  14   Male\n570  12 Female\n571  14   Male\n572   8   Male\n573   7   Male\n574  11 Female\n575   8   Male\n576  12   Male\n577   9 Female\n578   5 Female\n579   4   Male\n580   3 Female\n581   2   Male\n582   2   Male\n583   3   Male\n584   4 Female\n585   4   Male\n586   4 Female\n587   5   Male\n588   3 Female\n589   6 Female\n590   3   Male\n591  11 Female\n592  11   Male\n593   7   Male\n594   8   Male\n595   6 Female\n596  10 Female\n597   8 Female\n598   8   Male\n599   9 Female\n600   8   Male\n601  13   Male\n602  11   Male\n603   8 Female\n604   2 Female\n605   4   Male\n606   2   Male\n607   2 Female\n608   4   Male\n609   2   Male\n610   4 Female\n611   2 Female\n612   4 Female\n613   1 Female\n614   4 Female\n615  12 Female\n616   7 Female\n617  11   Male\n618   6   Male\n619   8   Male\n620  14   Male\n621  11   Male\n622   7 Female\n623  14 Female\n624   6   Male\n625  13 Female\n626  13 Female\n627   3   Male\n628   1   Male\n629   3   Male\n630   1 Female\n631   1 Female\n632   2   Male\n633   4   Male\n634   4   Male\n635   2 Female\n636   4 Female\n637   5   Male\n638   3 Female\n639   3   Male\n640   6 Female\n641  11 Female\n642   9 Female\n643   7 Female\n644   8   Male\n645  NA Female\n646   8 Female\n647  14 Female\n648  10   Male\n649  10   Male\n650  11 Female\n651  13 Female\n```\n:::\n:::\n\nWe can remove select columns using indexing as well, OR by simply changing the column to `NULL`\n\n::: {.cell}\n\n```{.r .cell-code}\ndf[, -5] #remove column 5, \"slum\" variable\n```\n:::\n\n::: {.cell}\n\n```{.r 
.cell-code}\ndf$slum <- NULL # this is the same as above\n```\n:::\n\nWe can also grab the `age` column using the `$` operator, again this is selecting the variable for all of the rows.\n\n::: {.cell}\n\n```{.r .cell-code}\ndf$age\n```\n:::\n\n\n\n##  Using indexing to subset by rows\n\nWe can use indexing to also subset by rows. For example, here we pull the 100th observation/row.\n\n::: {.cell}\n\n```{.r .cell-code}\ndf[100,] \n```\n\n::: {.cell-output .cell-output-stdout}\n```\n    observation_id IgG_concentration age gender     slum\n100           8122         0.1818182   5 Female Non slum\n```\n:::\n:::\n\nAnd, here we pull the `age` of the 100th observation/row.\n\n::: {.cell}\n\n```{.r .cell-code}\ndf[100,\"age\"] \n```\n\n::: {.cell-output .cell-output-stdout}\n```\n[1] 5\n```\n:::\n:::\n\n \n\n## Logical operators\n\nLogical operators can be evaluated on object(s) in order to return a binary response of TRUE/FALSE\n\noperator | operator option |description\n-----|-----|-----:\n`<`|%l%|less than\n`<=`|%le%|less than or equal to\n`>`|%g%|greater than\n`>=`|%ge%|greater than or equal to\n`==`||equal to\n`!=`||not equal to\n`x&y`||x and y\n`x|y`||x or y\n`%in%`||match\n`%!in%`||do not match\n\n\n## Logical operators examples\n\nLet's practice.  
First, here is a reminder of what `number.object` contains.\n\n::: {.cell}\n\n```{.r .cell-code}\nnumber.object\n```\n\n::: {.cell-output .cell-output-stdout}\n```\n[1] 3\n```\n:::\n:::\n\n\nNow, we will use logical operators to evaluate the object.\n\n::: {.cell}\n\n```{.r .cell-code}\nnumber.object<4\n```\n\n::: {.cell-output .cell-output-stdout}\n```\n[1] TRUE\n```\n:::\n\n```{.r .cell-code}\nnumber.object>=3\n```\n\n::: {.cell-output .cell-output-stdout}\n```\n[1] TRUE\n```\n:::\n\n```{.r .cell-code}\nnumber.object!=5\n```\n\n::: {.cell-output .cell-output-stdout}\n```\n[1] TRUE\n```\n:::\n\n```{.r .cell-code}\nnumber.object %in% c(6,7,2)\n```\n\n::: {.cell-output .cell-output-stdout}\n```\n[1] FALSE\n```\n:::\n:::\n\n\n\n## Using indexing and logical operators to rename columns\n\n1. We can assign the column names from data frame `df` to an object `cn`, modify `cn` directly using indexing and logical operators, and finally reassign the column names, `cn`, back to the data frame `df`:\n\n\n::: {.cell}\n\n```{.r .cell-code}\ncn <- colnames(df)\ncn\n```\n\n::: {.cell-output .cell-output-stdout}\n```\n[1] \"observation_id\"    \"IgG_concentration\" \"age\"              \n[4] \"gender\"            \"slum\"             \n```\n:::\n\n```{.r .cell-code}\ncn==\"IgG_concentration\"\n```\n\n::: {.cell-output .cell-output-stdout}\n```\n[1] FALSE  TRUE FALSE FALSE FALSE\n```\n:::\n\n```{.r .cell-code}\ncn[cn==\"IgG_concentration\"] <-\"IgG_concentration_mIU\" #rename cn to \"IgG_concentration_mIU\" when cn is \"IgG_concentration\"\ncolnames(df) <- cn\ncolnames(df)\n```\n\n::: {.cell-output .cell-output-stdout}\n```\n[1] \"observation_id\"        \"IgG_concentration_mIU\" \"age\"                  \n[4] \"gender\"                \"slum\"                 \n```\n:::\n:::\n\n\nNote, I am resetting the column name back to the original name for the sake of the rest of the module.\n\n::: {.cell}\n\n```{.r 
.cell-code}\ncolnames(df)[colnames(df)==\"IgG_concentration_mIU\"] <- \"IgG_concentration\" #reset\n```\n:::\n\n\n\n##  Using indexing and logical operators to subset data\n\n\nIn this example, we subset by rows, pull only observations with an age of less than or equal to 10, and save the subset data to `df_lte10`. Note that the logical expression `df$age<=10` comes before the comma because we want to subset by rows (the first dimension).\n\n::: {.cell}\n\n```{.r .cell-code}\ndf_lte10 <- df[df$age<=10, ]\n```\n:::\n\nLet's check that the subset worked using the `summary()` function. \n\n::: {.cell}\n\n```{.r .cell-code}\nsummary(df_lte10$age)\n```\n\n::: {.cell-output .cell-output-stdout}\n```\n   Min. 1st Qu.  Median    Mean 3rd Qu.    Max.    NA's \n    1.0     3.0     4.0     4.8     7.0    10.0       9 \n```\n:::\n:::\n\n\n</br>\n\nIn the next example, we subset by rows and pull only observations with an age of less than or equal to 5 OR greater than 10.\n\n::: {.cell}\n\n```{.r .cell-code}\ndf_lte5_gt10 <- df[df$age<=5 | df$age>10, ]\n```\n:::\n\nLet's check that the subset worked using the `summary()` function. \n\n::: {.cell}\n\n```{.r .cell-code}\nsummary(df_lte5_gt10$age)\n```\n\n::: {.cell-output .cell-output-stdout}\n```\n   Min. 1st Qu.  Median    Mean 3rd Qu.    Max.    NA's \n   1.00    2.50    4.00    6.08   11.00   15.00       9 \n```\n:::\n:::\n\n\n\n## Missing values \n\nMissing data need to be carefully described and dealt with in data analysis. 
Understanding the different types of missing data and how to identify them is the first step to data cleaning.\n\nTypes of \"missing\" values:\n\n- `NA` - \"**N**ot **A**vailable\", general missing data\n- `NaN` - stands for \"**N**ot **a** **N**umber\", happens when you do 0/0.\n- `Inf` and `-Inf` - Infinity, happens when you divide a positive number (or negative number) by 0.\n- blank space - sometimes when data is read in, there is a blank space left\n- an empty string (e.g., `\"\"`) \n- `NULL`- undefined value that represents something that does not exist\n\n## Logical operators to help identify missing data\n\noperator |description\n-----|-----:\n`is.na`|is `NaN` or `NA`\n`is.nan`|is `NaN`\n`!is.na`|is not `NaN` or `NA`\n`!is.nan`|is not `NaN`\n`is.infinite`|is infinite\n`any`|are any TRUE\n`all`|all are TRUE\n`which`|which are TRUE\n\n## More logical operators examples\n\n\n::: {.cell}\n\n```{.r .cell-code}\ntest <- c(0,NA, -1)/0\ntest\n```\n\n::: {.cell-output .cell-output-stdout}\n```\n[1]  NaN   NA -Inf\n```\n:::\n\n```{.r .cell-code}\nis.na(test)\n```\n\n::: {.cell-output .cell-output-stdout}\n```\n[1]  TRUE  TRUE FALSE\n```\n:::\n\n```{.r .cell-code}\nis.nan(test)\n```\n\n::: {.cell-output .cell-output-stdout}\n```\n[1]  TRUE FALSE FALSE\n```\n:::\n\n```{.r .cell-code}\nis.infinite(test)\n```\n\n::: {.cell-output .cell-output-stdout}\n```\n[1] FALSE FALSE  TRUE\n```\n:::\n:::\n\n\n## More logical operators examples\n\n`any(is.na(x))` means do we have any `NA`'s in the object `x`?\n\n\n::: {.cell}\n\n```{.r .cell-code}\nany(is.na(df$IgG_concentration)) # are there any NAs - YES/TRUE\n```\n\n::: {.cell-output .cell-output-stdout}\n```\n[1] TRUE\n```\n:::\n\n```{.r .cell-code}\nany(is.na(df$slum)) # are there any NAs- NO/FALSE\n```\n\n::: {.cell-output .cell-output-stdout}\n```\n[1] FALSE\n```\n:::\n:::\n\n\n`which(is.na(x))` means which of the elements in object `x` are `NA`'s?\n\n\n::: {.cell}\n\n```{.r .cell-code}\nwhich(is.na(df$IgG_concentration)) 
\n```\n\n::: {.cell-output .cell-output-stdout}\n```\n [1]  13  55  57  72 182 406 414 478 488 595\n```\n:::\n\n```{.r .cell-code}\nwhich(is.na(df$slum)) \n```\n\n::: {.cell-output .cell-output-stdout}\n```\ninteger(0)\n```\n:::\n:::\n\n\n## `subset()` function\n\nThe Base R `subset()` function is a slightly easier way to select variables and observations.\n\n\n::: {.cell}\n\n```{.r .cell-code}\n?subset\n```\n:::\n\n\n```\nRegistered S3 method overwritten by 'printr':\n  method                from     \n  knit_print.data.frame rmarkdown\n```\n\nSubsetting Vectors, Matrices and Data Frames\n\nDescription:\n\n     Return subsets of vectors, matrices or data frames which meet\n     conditions.\n\nUsage:\n\n     subset(x, ...)\n     \n     ## Default S3 method:\n     subset(x, subset, ...)\n     \n     ## S3 method for class 'matrix'\n     subset(x, subset, select, drop = FALSE, ...)\n     \n     ## S3 method for class 'data.frame'\n     subset(x, subset, select, drop = FALSE, ...)\n     \nArguments:\n\n       x: object to be subsetted.\n\n  subset: logical expression indicating elements or rows to keep:\n          missing values are taken as false.\n\n  select: expression, indicating columns to select from a data frame.\n\n    drop: passed on to '[' indexing operator.\n\n     ...: further arguments to be passed to or from other methods.\n\nDetails:\n\n     This is a generic function, with methods supplied for matrices,\n     data frames and vectors (including lists).  Packages and users can\n     add further methods.\n\n     For ordinary vectors, the result is simply 'x[subset &\n     !is.na(subset)]'.\n\n     For data frames, the 'subset' argument works on the rows.  Note\n     that 'subset' will be evaluated in the data frame, so columns can\n     be referred to (by name) as variables in the expression (see the\n     examples).\n\n     The 'select' argument exists only for the methods for data frames\n     and matrices.  
It works by first replacing column names in the\n     selection expression with the corresponding column numbers in the\n     data frame and then using the resulting integer vector to index\n     the columns.  This allows the use of the standard indexing\n     conventions so that for example ranges of columns can be specified\n     easily, or single columns can be dropped (see the examples).\n\n     The 'drop' argument is passed on to the indexing method for\n     matrices and data frames: note that the default for matrices is\n     different from that for indexing.\n\n     Factors may have empty levels after subsetting; unused levels are\n     not automatically removed.  See 'droplevels' for a way to drop all\n     unused levels from a data frame.\n\nValue:\n\n     An object similar to 'x' contain just the selected elements (for a\n     vector), rows and columns (for a matrix or data frame), and so on.\n\nWarning:\n\n     This is a convenience function intended for use interactively.\n     For programming it is better to use the standard subsetting\n     functions like '[', and in particular the non-standard evaluation\n     of argument 'subset' can have unanticipated consequences.\n\nAuthor(s):\n\n     Peter Dalgaard and Brian Ripley\n\nSee Also:\n\n     '[', 'transform' 'droplevels'\n\nExamples:\n\n     subset(airquality, Temp > 80, select = c(Ozone, Temp))\n     subset(airquality, Day == 1, select = -Temp)\n     subset(airquality, select = Ozone:Wind)\n     \n     with(airquality, subset(Ozone, Temp > 80))\n     \n     ## sometimes requiring a logical 'subset' argument is a nuisance\n     nm <- rownames(state.x77)\n     start_with_M <- nm %in% grep(\"^M\", nm, value = TRUE)\n     subset(state.x77, start_with_M, Illiteracy:Murder)\n     # but in recent versions of R this can simply be\n     subset(state.x77, grepl(\"^M\", nm), Illiteracy:Murder)\n\n\n## Subsetting use the `subset()` function\n\nHere are a few examples using the `subset()` function\n\n\n::: 
{.cell}\n\n```{.r .cell-code}\ndf_lte10_v2 <- subset(df, df$age<=10, select=c(IgG_concentration, age))\ndf_lt5_f <- subset(df, df$age<=5 & gender==\"Female\", select=c(IgG_concentration, slum))\n```\n:::\n\n\n## `subset()` function vs logical operators\n\n`subset()` automatically drops rows where the condition evaluates to `NA`, whereas indexing with a logical vector keeps those rows and fills them with `NA`s.\n\n\n::: {.cell}\n\n```{.r .cell-code}\nsummary(df_lte10$age) #created with indexing\n```\n\n::: {.cell-output-display}\n| Min.| 1st Qu.| Median| Mean| 3rd Qu.| Max.| NA's|\n|----:|-------:|------:|----:|-------:|----:|----:|\n|    1|       3|      4|  4.8|       7|   10|    9|\n:::\n\n```{.r .cell-code}\nsummary(df_lte10_v2$age) #created with the subset function\n```\n\n::: {.cell-output-display}\n| Min.| 1st Qu.| Median| Mean| 3rd Qu.| Max.|\n|----:|-------:|------:|----:|-------:|----:|\n|    1|       3|      4|  4.8|       7|   10|\n:::\n:::\n\n\nWe can also see this by looking at the number of rows in each dataset.\n\n\n::: {.cell}\n\n```{.r .cell-code}\nnrow(df_lte10)\n```\n\n::: {.cell-output .cell-output-stdout}\n```\n[1] 504\n```\n:::\n\n```{.r .cell-code}\nnrow(df_lte10_v2)\n```\n\n::: {.cell-output .cell-output-stdout}\n```\n[1] 495\n```\n:::\n:::\n\n\n\n\n## Summary\n\n- The `colnames()`, `str()` and `summary()` functions from Base R help assess the data type and some summary statistics\n- There are three basic indexing syntaxes: `[`, `[[` and `$`\n- Indexing can be used to extract part of an object (e.g., subset data) and to replace parts of an object (e.g., rename variables / columns)\n- Logical operators can be evaluated on object(s) in order to return a binary response of TRUE/FALSE, and are useful for decision rules for indexing\n- There are 7 “types” of missing values, the most common being “NA”\n- Logical operators meant to determine missing values are very helpful for data cleaning\n- The Base R `subset()` function is a slightly easier way to select variables and observations.\n\n## 
Acknowledgements\n\nThese are the materials we looked through, modified, or extracted to complete this module's lecture.\n\n-   [\"Introduction to R for Public Health Researchers\" Johns Hopkins University](https://jhudatascience.org/intro_to_r/)\n-   [\"Indexing\" CRAN Project](https://cran.r-project.org/doc/manuals/R-lang.html#Indexing)\n-   [\"Logical operators\" CRAN Project](https://cran.r-project.org/web/packages/extraoperators/vignettes/logicals-vignette.html)\n\n",
     "supporting": [],
     "filters": [
       "rmarkdown/pagebreak.lua"
diff --git a/_freeze/modules/Module07-VarCreationClassesSummaries/execute-results/html.json b/_freeze/modules/Module07-VarCreationClassesSummaries/execute-results/html.json
index 2996645..89c0109 100644
--- a/_freeze/modules/Module07-VarCreationClassesSummaries/execute-results/html.json
+++ b/_freeze/modules/Module07-VarCreationClassesSummaries/execute-results/html.json
@@ -1,7 +1,7 @@
 {
-  "hash": "5ecd3b27a4a72d2ba1db1285b9852998",
+  "hash": "d36a9161972c30d45b4350606c8bff8d",
   "result": {
-    "markdown": "---\ntitle: \"Module 7: Variable Creation, Classes, and Summaries\"\nformat: \n  revealjs:\n    scrollable: true\n    smaller: true\n---\n\n\n## Learning Objectives\n\nAfter module 7, you should be able to...\n\n-   Create new variables\n-   Characterize variable classes\n-   Manipulate the classes of variables\n-   Conduct 1 variable data summaries\n\n## Import data for this module\nLet's first read in the data from the previous module and look at it briefly with a new function `head()`.\n\n\n::: {.cell}\n\n```{.r .cell-code}\ndf <- read.csv(file = \"data/serodata.csv\") #relative path\nhead(x=df, n=3)\n```\n\n::: {.cell-output .cell-output-stdout}\n```\n  observation_id IgG_concentration age gender     slum\n1           5772         0.3176895   2 Female Non slum\n2           8095         3.4368231   4 Female Non slum\n3           9784         0.3000000   4   Male Non slum\n```\n:::\n:::\n\n\n```\nRegistered S3 method overwritten by 'printr':\n  method                from     \n  knit_print.data.frame rmarkdown\n```\n\nReturn the First or Last Parts of an Object\n\nDescription:\n\n     Returns the first or last parts of a vector, matrix, table, data\n     frame or function.  Since 'head()' and 'tail()' are generic\n     functions, they may also have been extended to other classes.\n\nUsage:\n\n     head(x, ...)\n     ## Default S3 method:\n     head(x, n = 6L, ...)\n     \n     ## S3 method for class 'matrix'\n     head(x, n = 6L, ...) # is exported as head.matrix()\n     ## NB: The methods for 'data.frame' and 'array'  are identical to the 'matrix' one\n     \n     ## S3 method for class 'ftable'\n     head(x, n = 6L, ...)\n     ## S3 method for class 'function'\n     head(x, n = 6L, ...)\n     \n     \n     tail(x, ...)\n     ## Default S3 method:\n     tail(x, n = 6L, keepnums = FALSE, addrownums, ...)\n     ## S3 method for class 'matrix'\n     tail(x, n = 6L, keepnums = TRUE, addrownums, ...) 
# exported as tail.matrix()\n     ## NB: The methods for 'data.frame', 'array', and 'table'\n     ##     are identical to the  'matrix'  one\n     \n     ## S3 method for class 'ftable'\n     tail(x, n = 6L, keepnums = FALSE, addrownums, ...)\n     ## S3 method for class 'function'\n     tail(x, n = 6L, ...)\n     \nArguments:\n\n       x: an object\n\n       n: an integer vector of length up to 'dim(x)' (or 1, for\n          non-dimensioned objects).  A 'logical' is silently coerced to\n          integer.  Values specify the indices to be selected in the\n          corresponding dimension (or along the length) of the object.\n          A positive value of 'n[i]' includes the first/last 'n[i]'\n          indices in that dimension, while a negative value excludes\n          the last/first 'abs(n[i])', including all remaining indices.\n          'NA' or non-specified values (when 'length(n) <\n          length(dim(x))') select all indices in that dimension. Must\n          contain at least one non-missing value.\n\nkeepnums: in each dimension, if no names in that dimension are present,\n          create them using the indices included in that dimension.\n          Ignored if 'dim(x)' is 'NULL' or its length 1.\n\naddrownums: deprecated - 'keepnums' should be used instead. Taken as\n          the value of 'keepnums' if it is explicitly set when\n          'keepnums' is not.\n\n     ...: arguments to be passed to or from other methods.\n\nDetails:\n\n     For vector/array based objects, 'head()' ('tail()') returns a\n     subset of the same dimensionality as 'x', usually of the same\n     class. For historical reasons, by default they select the first\n     (last) 6 indices in the first dimension (\"rows\") or along the\n     length of a non-dimensioned vector, and the full extent (all\n     indices) in any remaining dimensions. 
'head.matrix()' and\n     'tail.matrix()' are exported.\n\n     The default and array(/matrix) methods for 'head()' and 'tail()'\n     are quite general. They will work as is for any class which has a\n     'dim()' method, a 'length()' method (only required if 'dim()'\n     returns 'NULL'), and a '[' method (that accepts the 'drop'\n     argument and can subset in all dimensions in the dimensioned\n     case).\n\n     For functions, the lines of the deparsed function are returned as\n     character strings.\n\n     When 'x' is an array(/matrix) of dimensionality two and more,\n     'tail()' will add dimnames similar to how they would appear in a\n     full printing of 'x' for all dimensions 'k' where 'n[k]' is\n     specified and non-missing and 'dimnames(x)[[k]]' (or 'dimnames(x)'\n     itself) is 'NULL'.  Specifically, the form of the added dimnames\n     will vary for different dimensions as follows:\n\n     'k=1' (rows): '\"[n,]\"' (right justified with whitespace padding)\n\n     'k=2' (columns): '\"[,n]\"' (with _no_ whitespace padding)\n\n     'k>2' (higher dims): '\"n\"', i.e., the indices as _character_\n          values\n\n     Setting 'keepnums = FALSE' suppresses this behaviour.\n\n     As 'data.frame' subsetting ('indexing') keeps 'attributes', so do\n     the 'head()' and 'tail()' methods for data frames.\n\nValue:\n\n     An object (usually) like 'x' but generally smaller.  Hence, for\n     'array's, the result corresponds to 'x[.., drop=FALSE]'.  For\n     'ftable' objects 'x', a transformed 'format(x)'.\n\nNote:\n\n     For array inputs the output of 'tail' when 'keepnums' is 'TRUE',\n     any dimnames vectors added for dimensions '>2' are the original\n     numeric indices in that dimension _as character vectors_.  
This\n     means that, e.g., for 3-dimensional array 'arr', 'tail(arr,\n     c(2,2,-1))[ , , 2]' and 'tail(arr, c(2,2,-1))[ , , \"2\"]' may both\n     be valid but have completely different meanings.\n\nAuthor(s):\n\n     Patrick Burns, improved and corrected by R-Core. Negative argument\n     added by Vincent Goulet.  Multi-dimension support added by Gabriel\n     Becker.\n\nExamples:\n\n     head(letters)\n     head(letters, n = -6L)\n     \n     head(freeny.x, n = 10L)\n     head(freeny.y)\n     \n     head(iris3)\n     head(iris3, c(6L, 2L))\n     head(iris3, c(6L, -1L, 2L))\n     \n     tail(letters)\n     tail(letters, n = -6L)\n     \n     tail(freeny.x)\n     ## the bottom-right \"corner\" :\n     tail(freeny.x, n = c(4, 2))\n     tail(freeny.y)\n     \n     tail(iris3)\n     tail(iris3, c(6L, 2L))\n     tail(iris3, c(6L, -1L, 2L))\n     \n     ## iris with dimnames stripped\n     a3d <- iris3 ; dimnames(a3d) <- NULL\n     tail(a3d, c(6, -1, 2)) # keepnums = TRUE is default here!\n     tail(a3d, c(6, -1, 2), keepnums = FALSE)\n     \n     ## data frame w/ a (non-standard) attribute:\n     treeS <- structure(trees, foo = \"bar\")\n     (n <- nrow(treeS))\n     stopifnot(exprs = { # attribute is kept\n         identical(htS <- head(treeS), treeS[1:6, ])\n         identical(attr(htS, \"foo\") , \"bar\")\n         identical(tlS <- tail(treeS), treeS[(n-5):n, ])\n         ## BUT if I use \"useAttrib(.)\", this is *not* ok, when n is of length 2:\n         ## --- because [i,j]-indexing of data frames *also* drops \"other\" attributes ..\n         identical(tail(treeS, 3:2), treeS[(n-2):n, 2:3] )\n     })\n     \n     tail(library) # last lines of function\n     \n     head(stats::ftable(Titanic))\n     \n     ## 1d-array (with named dim) :\n     a1 <- array(1:7, 7); names(dim(a1)) <- \"O2\"\n     stopifnot(exprs = {\n       identical( tail(a1, 10), a1)\n       identical( head(a1, 10), a1)\n       identical( head(a1, 1), a1 [1 , drop=FALSE] ) # was a1[1] in R <= 
3.6.x\n       identical( tail(a1, 2), a1[6:7])\n       identical( tail(a1, 1), a1 [7 , drop=FALSE] ) # was a1[7] in R <= 3.6.x\n     })\n\n\n\n## Adding new columns\n\nYou can add a new column, called `newcol` to `df`, using the `$` operator:\n\n::: {.cell}\n\n```{.r .cell-code}\ndf$log_IgG <- log(df$IgG_concentration)\nhead(df,3)\n```\n\n::: {.cell-output-display}\n| observation_id| IgG_concentration| age|gender |slum     |   log_IgG|\n|--------------:|-----------------:|---:|:------|:--------|---------:|\n|           5772|         0.3176895|   2|Female |Non slum | -1.146681|\n|           8095|         3.4368231|   4|Female |Non slum |  1.234547|\n|           9784|         0.3000000|   4|Male   |Non slum | -1.203973|\n:::\n:::\n\n\n## Creating conditional variables\n\nOne frequently-used tool is creating variables with conditions. A general function for creating new variables based on existing variables is the Base R `ifelse()` function, which \"returns a value depending on whether the element of test is `TRUE` or `FALSE`.\"\n\n\nConditional Element Selection\n\nDescription:\n\n     'ifelse' returns a value with the same shape as 'test' which is\n     filled with elements selected from either 'yes' or 'no' depending\n     on whether the element of 'test' is 'TRUE' or 'FALSE'.\n\nUsage:\n\n     ifelse(test, yes, no)\n     \nArguments:\n\n    test: an object which can be coerced to logical mode.\n\n     yes: return values for true elements of 'test'.\n\n      no: return values for false elements of 'test'.\n\nDetails:\n\n     If 'yes' or 'no' are too short, their elements are recycled.\n     'yes' will be evaluated if and only if any element of 'test' is\n     true, and analogously for 'no'.\n\n     Missing values in 'test' give missing values in the result.\n\nValue:\n\n     A vector of the same length and attributes (including dimensions\n     and '\"class\"') as 'test' and data values from the values of 'yes'\n     or 'no'.  
The mode of the answer will be coerced from logical to\n     accommodate first any values taken from 'yes' and then any values\n     taken from 'no'.\n\nWarning:\n\n     The mode of the result may depend on the value of 'test' (see the\n     examples), and the class attribute (see 'oldClass') of the result\n     is taken from 'test' and may be inappropriate for the values\n     selected from 'yes' and 'no'.\n\n     Sometimes it is better to use a construction such as\n\n       (tmp <- yes; tmp[!test] <- no[!test]; tmp)\n     \n     , possibly extended to handle missing values in 'test'.\n\n     Further note that 'if(test) yes else no' is much more efficient\n     and often much preferable to 'ifelse(test, yes, no)' whenever\n     'test' is a simple true/false result, i.e., when 'length(test) ==\n     1'.\n\n     The 'srcref' attribute of functions is handled specially: if\n     'test' is a simple true result and 'yes' evaluates to a function\n     with 'srcref' attribute, 'ifelse' returns 'yes' including its\n     attribute (the same applies to a false 'test' and 'no' argument).\n     This functionality is only for backwards compatibility, the form\n     'if(test) yes else no' should be used whenever 'yes' and 'no' are\n     functions.\n\nReferences:\n\n     Becker, R. A., Chambers, J. M. and Wilks, A. R. (1988) _The New S\n     Language_.  Wadsworth & Brooks/Cole.\n\nSee Also:\n\n     'if'.\n\nExamples:\n\n     x <- c(6:-4)\n     sqrt(x)  #- gives warning\n     sqrt(ifelse(x >= 0, x, NA))  # no warning\n     \n     ## Note: the following also gives the warning !\n     ifelse(x >= 0, sqrt(x), NA)\n     \n     \n     ## ifelse() strips attributes\n     ## This is important when working with Dates and factors\n     x <- seq(as.Date(\"2000-02-29\"), as.Date(\"2004-10-04\"), by = \"1 month\")\n     ## has many \"yyyy-mm-29\", but a few \"yyyy-03-01\" in the non-leap years\n     y <- ifelse(as.POSIXlt(x)$mday == 29, x, NA)\n     head(y) # not what you expected ... 
==> need restore the class attribute:\n     class(y) <- class(x)\n     y\n     ## This is a (not atypical) case where it is better *not* to use ifelse(),\n     ## but rather the more efficient and still clear:\n     y2 <- x\n     y2[as.POSIXlt(x)$mday != 29] <- NA\n     ## which gives the same as ifelse()+class() hack:\n     stopifnot(identical(y2, y))\n     \n     \n     ## example of different return modes (and 'test' alone determining length):\n     yes <- 1:3\n     no  <- pi^(1:4)\n     utils::str( ifelse(NA,    yes, no) ) # logical, length 1\n     utils::str( ifelse(TRUE,  yes, no) ) # integer, length 1\n     utils::str( ifelse(FALSE, yes, no) ) # double,  length 1\n\n\n\n## `ifelse` example\n\nReminder of the first three arguments in the `ifelse()` function are `ifelse(test, yes, no)`.\n\n\n::: {.cell}\n\n```{.r .cell-code}\ndf$age_group <- ifelse(df$age <= 5, \"young\", \"old\")\nhead(df)\n```\n\n::: {.cell-output-display}\n| observation_id| IgG_concentration| age|gender |slum     |    log_IgG|age_group |\n|--------------:|-----------------:|---:|:------|:--------|----------:|:---------|\n|           5772|         0.3176895|   2|Female |Non slum | -1.1466807|young     |\n|           8095|         3.4368231|   4|Female |Non slum |  1.2345475|young     |\n|           9784|         0.3000000|   4|Male   |Non slum | -1.2039728|young     |\n|           9338|       143.2363014|   4|Male   |Non slum |  4.9644957|young     |\n|           6369|         0.4476534|   1|Male   |Non slum | -0.8037359|young     |\n|           6885|         0.0252708|   4|Male   |Non slum | -3.6781074|young     |\n:::\n:::\n\n\n\n## Nesting `ifelse` statements example\n\n\n::: {.cell}\n\n```{.r .cell-code}\ndf$age_group <- ifelse(df$age <= 5, \"young\", \n                       ifelse(df$age<=10 & df$age>5, \"middle\", \n                              ifelse(df$age>10, \"old\", NA)))\nhead(df)\n```\n\n::: {.cell-output-display}\n| observation_id| IgG_concentration| age|gender |slum     |    
log_IgG|age_group |\n|--------------:|-----------------:|---:|:------|:--------|----------:|:---------|\n|           5772|         0.3176895|   2|Female |Non slum | -1.1466807|young     |\n|           8095|         3.4368231|   4|Female |Non slum |  1.2345475|young     |\n|           9784|         0.3000000|   4|Male   |Non slum | -1.2039728|young     |\n|           9338|       143.2363014|   4|Male   |Non slum |  4.9644957|young     |\n|           6369|         0.4476534|   1|Male   |Non slum | -0.8037359|young     |\n|           6885|         0.0252708|   4|Male   |Non slum | -3.6781074|young     |\n:::\n:::\n\n\n\n# Data Classes\n\n## Overview - Data Classes\n\n1. One dimensional types (i.e., vectors of characters, numeric, logical, or factor values)\n\n2. Two dimensional types (e.g., matrix, data frame, tibble)\n\n3. Special data classes (e.g., lists, dates). \n\n## \t`class()` function\n\nThe `class()` function allows you to evaluate the class of an object.\n\n\n::: {.cell}\n\n```{.r .cell-code}\nclass(df$IgG_concentration)\n```\n\n::: {.cell-output .cell-output-stdout}\n```\n[1] \"numeric\"\n```\n:::\n\n```{.r .cell-code}\nclass(df$age)\n```\n\n::: {.cell-output .cell-output-stdout}\n```\n[1] \"integer\"\n```\n:::\n\n```{.r .cell-code}\nclass(df$gender)\n```\n\n::: {.cell-output .cell-output-stdout}\n```\n[1] \"character\"\n```\n:::\n:::\n\nReturn the First or Last Parts of an Object\n\nDescription:\n\n     Returns the first or last parts of a vector, matrix, table, data\n     frame or function.  Since 'head()' and 'tail()' are generic\n     functions, they may also have been extended to other classes.\n\nUsage:\n\n     head(x, ...)\n     ## Default S3 method:\n     head(x, n = 6L, ...)\n     \n     ## S3 method for class 'matrix'\n     head(x, n = 6L, ...) 
# is exported as head.matrix()\n     ## NB: The methods for 'data.frame' and 'array'  are identical to the 'matrix' one\n     \n     ## S3 method for class 'ftable'\n     head(x, n = 6L, ...)\n     ## S3 method for class 'function'\n     head(x, n = 6L, ...)\n     \n     \n     tail(x, ...)\n     ## Default S3 method:\n     tail(x, n = 6L, keepnums = FALSE, addrownums, ...)\n     ## S3 method for class 'matrix'\n     tail(x, n = 6L, keepnums = TRUE, addrownums, ...) # exported as tail.matrix()\n     ## NB: The methods for 'data.frame', 'array', and 'table'\n     ##     are identical to the  'matrix'  one\n     \n     ## S3 method for class 'ftable'\n     tail(x, n = 6L, keepnums = FALSE, addrownums, ...)\n     ## S3 method for class 'function'\n     tail(x, n = 6L, ...)\n     \nArguments:\n\n       x: an object\n\n       n: an integer vector of length up to 'dim(x)' (or 1, for\n          non-dimensioned objects).  A 'logical' is silently coerced to\n          integer.  Values specify the indices to be selected in the\n          corresponding dimension (or along the length) of the object.\n          A positive value of 'n[i]' includes the first/last 'n[i]'\n          indices in that dimension, while a negative value excludes\n          the last/first 'abs(n[i])', including all remaining indices.\n          'NA' or non-specified values (when 'length(n) <\n          length(dim(x))') select all indices in that dimension. Must\n          contain at least one non-missing value.\n\nkeepnums: in each dimension, if no names in that dimension are present,\n          create them using the indices included in that dimension.\n          Ignored if 'dim(x)' is 'NULL' or its length 1.\n\naddrownums: deprecated - 'keepnums' should be used instead. 
Taken as\n          the value of 'keepnums' if it is explicitly set when\n          'keepnums' is not.\n\n     ...: arguments to be passed to or from other methods.\n\nDetails:\n\n     For vector/array based objects, 'head()' ('tail()') returns a\n     subset of the same dimensionality as 'x', usually of the same\n     class. For historical reasons, by default they select the first\n     (last) 6 indices in the first dimension (\"rows\") or along the\n     length of a non-dimensioned vector, and the full extent (all\n     indices) in any remaining dimensions. 'head.matrix()' and\n     'tail.matrix()' are exported.\n\n     The default and array(/matrix) methods for 'head()' and 'tail()'\n     are quite general. They will work as is for any class which has a\n     'dim()' method, a 'length()' method (only required if 'dim()'\n     returns 'NULL'), and a '[' method (that accepts the 'drop'\n     argument and can subset in all dimensions in the dimensioned\n     case).\n\n     For functions, the lines of the deparsed function are returned as\n     character strings.\n\n     When 'x' is an array(/matrix) of dimensionality two and more,\n     'tail()' will add dimnames similar to how they would appear in a\n     full printing of 'x' for all dimensions 'k' where 'n[k]' is\n     specified and non-missing and 'dimnames(x)[[k]]' (or 'dimnames(x)'\n     itself) is 'NULL'.  Specifically, the form of the added dimnames\n     will vary for different dimensions as follows:\n\n     'k=1' (rows): '\"[n,]\"' (right justified with whitespace padding)\n\n     'k=2' (columns): '\"[,n]\"' (with _no_ whitespace padding)\n\n     'k>2' (higher dims): '\"n\"', i.e., the indices as _character_\n          values\n\n     Setting 'keepnums = FALSE' suppresses this behaviour.\n\n     As 'data.frame' subsetting ('indexing') keeps 'attributes', so do\n     the 'head()' and 'tail()' methods for data frames.\n\nValue:\n\n     An object (usually) like 'x' but generally smaller.  
Hence, for\n     'array's, the result corresponds to 'x[.., drop=FALSE]'.  For\n     'ftable' objects 'x', a transformed 'format(x)'.\n\nNote:\n\n     For array inputs the output of 'tail' when 'keepnums' is 'TRUE',\n     any dimnames vectors added for dimensions '>2' are the original\n     numeric indices in that dimension _as character vectors_.  This\n     means that, e.g., for 3-dimensional array 'arr', 'tail(arr,\n     c(2,2,-1))[ , , 2]' and 'tail(arr, c(2,2,-1))[ , , \"2\"]' may both\n     be valid but have completely different meanings.\n\nAuthor(s):\n\n     Patrick Burns, improved and corrected by R-Core. Negative argument\n     added by Vincent Goulet.  Multi-dimension support added by Gabriel\n     Becker.\n\nExamples:\n\n     head(letters)\n     head(letters, n = -6L)\n     \n     head(freeny.x, n = 10L)\n     head(freeny.y)\n     \n     head(iris3)\n     head(iris3, c(6L, 2L))\n     head(iris3, c(6L, -1L, 2L))\n     \n     tail(letters)\n     tail(letters, n = -6L)\n     \n     tail(freeny.x)\n     ## the bottom-right \"corner\" :\n     tail(freeny.x, n = c(4, 2))\n     tail(freeny.y)\n     \n     tail(iris3)\n     tail(iris3, c(6L, 2L))\n     tail(iris3, c(6L, -1L, 2L))\n     \n     ## iris with dimnames stripped\n     a3d <- iris3 ; dimnames(a3d) <- NULL\n     tail(a3d, c(6, -1, 2)) # keepnums = TRUE is default here!\n     tail(a3d, c(6, -1, 2), keepnums = FALSE)\n     \n     ## data frame w/ a (non-standard) attribute:\n     treeS <- structure(trees, foo = \"bar\")\n     (n <- nrow(treeS))\n     stopifnot(exprs = { # attribute is kept\n         identical(htS <- head(treeS), treeS[1:6, ])\n         identical(attr(htS, \"foo\") , \"bar\")\n         identical(tlS <- tail(treeS), treeS[(n-5):n, ])\n         ## BUT if I use \"useAttrib(.)\", this is *not* ok, when n is of length 2:\n         ## --- because [i,j]-indexing of data frames *also* drops \"other\" attributes ..\n         identical(tail(treeS, 3:2), treeS[(n-2):n, 2:3] )\n     })\n     \n     
tail(library) # last lines of function\n     \n     head(stats::ftable(Titanic))\n     \n     ## 1d-array (with named dim) :\n     a1 <- array(1:7, 7); names(dim(a1)) <- \"O2\"\n     stopifnot(exprs = {\n       identical( tail(a1, 10), a1)\n       identical( head(a1, 10), a1)\n       identical( head(a1, 1), a1 [1 , drop=FALSE] ) # was a1[1] in R <= 3.6.x\n       identical( tail(a1, 2), a1[6:7])\n       identical( tail(a1, 1), a1 [7 , drop=FALSE] ) # was a1[7] in R <= 3.6.x\n     })\n\n\n\n## One dimensional data types\n\n* Character: strings or individual characters, quoted\n* Numeric: any real number(s)\n    - Double: contains fractional values (i.e., double precision) - default numeric\n    - Integer: any integer(s)/whole numbers\n* Logical: variables composed of TRUE or FALSE\n* Factor: categorical/qualitative variables\n\n## Character and numeric\n\nThis can also be a bit tricky. \n\nIf only one character in the whole vector, the class is assumed to be character\n\n::: {.cell}\n\n```{.r .cell-code}\nclass(c(1, 2, \"tree\")) \n```\n\n::: {.cell-output .cell-output-stdout}\n```\n[1] \"character\"\n```\n:::\n:::\n\n\nHere because integers are in quotations, it is read as a character class by R.\n\n::: {.cell}\n\n```{.r .cell-code}\nclass(c(\"1\", \"4\", \"7\")) \n```\n\n::: {.cell-output .cell-output-stdout}\n```\n[1] \"character\"\n```\n:::\n:::\n\n\nNote, this is the first time we have shown you nested functions.  Here, instead of creating a new vector object (e.g., `x <- c(\"1\", \"4\", \"7\")`) and then feeding the vector object `x` into the first argument of the `class()` function (e.g., `class(x)`), we combined the two steps and directly fed a vector object into the class function.\n\n## Numeric Subclasses\n\nThere are two major numeric subclasses\n\n1. `Double` is a special subset of `numeric` that contains <span style=\"color: red;\">fractional values</span>. 
`Double` stands for [double-precision](https://en.wikipedia.org/wiki/Double-precision_floating-point_format)\n2. `Integer` is a special subset of `numeric` that contains only <span style=\"color: red;\">whole numbers</span>. \n\n`typeof()` identifies the vector type (double, integer, logical, or character), whereas `class()` identifies the root class. The difference between the two will become clearer when we look at two-dimensional classes below.\n\n\n\n::: {.cell}\n\n```{.r .cell-code}\nclass(df$IgG_concentration)\n```\n\n::: {.cell-output .cell-output-stdout}\n```\n[1] \"numeric\"\n```\n:::\n\n```{.r .cell-code}\nclass(df$age)\n```\n\n::: {.cell-output .cell-output-stdout}\n```\n[1] \"integer\"\n```\n:::\n\n```{.r .cell-code}\ntypeof(df$IgG_concentration)\n```\n\n::: {.cell-output .cell-output-stdout}\n```\n[1] \"double\"\n```\n:::\n\n```{.r .cell-code}\ntypeof(df$age)\n```\n\n::: {.cell-output .cell-output-stdout}\n```\n[1] \"integer\"\n```\n:::\n:::\n\n\n\n## Logical\n\nAs a reminder, `logical` is a type that has only two possible elements: `TRUE` and `FALSE`. \n\n\n::: {.cell}\n\n```{.r .cell-code}\nclass(c(TRUE, FALSE, TRUE, TRUE, FALSE))\n```\n\n::: {.cell-output .cell-output-stdout}\n```\n[1] \"logical\"\n```\n:::\n:::\n\n\nNote that `logical` elements are NOT in quotes. Putting R special values (e.g., `NA` or `FALSE`) in quotation marks turns them into character values. \n\n\n## Other useful functions for evaluating/setting classes\n\nThere are two useful functions associated with practically all R classes: \n\n- `is.CLASS_NAME(x)` to **logically check** whether or not `x` is of a certain class. For example, `is.integer` or `is.character` or `is.numeric`.\n- `as.CLASS_NAME(x)` to **coerce between classes**, converting `x` from its current class into a certain class. For example, `as.integer` or `as.character` or `as.numeric`.  
This is particularly useful when, for example, an integer variable was read in as a character variable, or when you need to change a character variable to a factor variable (more on this later).\n\n## Examples `is.CLASS_NAME(x)`\n\n\n::: {.cell}\n\n```{.r .cell-code}\nis.numeric(df$IgG_concentration)\n```\n\n::: {.cell-output .cell-output-stdout}\n```\n[1] TRUE\n```\n:::\n\n```{.r .cell-code}\nis.character(df$age)\n```\n\n::: {.cell-output .cell-output-stdout}\n```\n[1] FALSE\n```\n:::\n\n```{.r .cell-code}\nis.character(df$gender)\n```\n\n::: {.cell-output .cell-output-stdout}\n```\n[1] TRUE\n```\n:::\n:::\n\n\n## Examples `as.CLASS_NAME(x)`\n\nIn some cases, coercion is seamless:\n\n::: {.cell}\n\n```{.r .cell-code}\nas.character(c(1, 4, 7))\n```\n\n::: {.cell-output .cell-output-stdout}\n```\n[1] \"1\" \"4\" \"7\"\n```\n:::\n\n```{.r .cell-code}\nas.numeric(c(\"1\", \"4\", \"7\"))\n```\n\n::: {.cell-output .cell-output-stdout}\n```\n[1] 1 4 7\n```\n:::\n\n```{.r .cell-code}\nas.logical(c(\"TRUE\", \"FALSE\", \"FALSE\"))\n```\n\n::: {.cell-output .cell-output-stdout}\n```\n[1]  TRUE FALSE FALSE\n```\n:::\n:::\n\n\nIn other cases, coercion is not possible; if attempted, R returns `NA` (an R constant representing \"**N**ot **A**vailable\", i.e., a missing value):\n\n::: {.cell}\n\n```{.r .cell-code}\nas.numeric(c(\"1\", \"4\", \"7a\"))\n```\n\n::: {.cell-output .cell-output-stderr}\n```\nWarning: NAs introduced by coercion\n```\n:::\n\n::: {.cell-output .cell-output-stdout}\n```\n[1]  1  4 NA\n```\n:::\n\n```{.r .cell-code}\nas.logical(c(\"TRUE\", \"FALSE\", \"UNKNOWN\"))\n```\n\n::: {.cell-output .cell-output-stdout}\n```\n[1]  TRUE FALSE    NA\n```\n:::\n:::\n\n\n\n## Factors\n\nA `factor` is a special `character` vector where the elements have pre-defined groups or 'levels'. You can think of these as qualitative or categorical variables. Use the `factor()` function to create factors from character values. 
\n\n\n::: {.cell}\n\n```{.r .cell-code}\nclass(df$age_group)\n```\n\n::: {.cell-output .cell-output-stdout}\n```\n[1] \"character\"\n```\n:::\n\n```{.r .cell-code}\ndf$age_group_factor <- factor(df$age_group)\nclass(df$age_group_factor)\n```\n\n::: {.cell-output .cell-output-stdout}\n```\n[1] \"factor\"\n```\n:::\n\n```{.r .cell-code}\nlevels(df$age_group_factor)\n```\n\n::: {.cell-output .cell-output-stdout}\n```\n[1] \"middle\" \"old\"    \"young\" \n```\n:::\n:::\n\n\nNote that levels are, by default, set to **alphanumerical** order! The first level is always the \"reference\" group. However, we often prefer a different reference group.\n\n## Reference Groups \n\n**Why do we care about reference groups?** \n\nGeneralized linear regression allows you to compare the outcome of two or more groups. Your reference group is the group that everything else is compared to. Say we want to assess whether being <5 years old is associated with higher IgG antibody concentrations.\n\nBy default, `middle` is the reference group, so we would only generate beta coefficients comparing `middle` to `young` and `middle` to `old`.  But we want `young` to be the reference group, so that we generate beta coefficients comparing `young` to `middle` and `young` to `old`.\n\n## Changing factor reference \n\nTo change the reference group of a factor variable:\n\n- If the object is already a factor, use the `relevel()` function and the `ref` argument to specify the reference.\n- If the object is a character, use the `factor()` function and the `levels` argument to specify the order of the values, the first being the reference.\n\n\n\nReorder Levels of Factor\n\nDescription:\n\n     The levels of a factor are re-ordered so that the level specified\n     by 'ref' is first and the others are moved down. 
This is useful\n     for 'contr.treatment' contrasts which take the first level as the\n     reference.\n\nUsage:\n\n     relevel(x, ref, ...)\n     \nArguments:\n\n       x: an unordered factor.\n\n     ref: the reference level, typically a string.\n\n     ...: additional arguments for future methods.\n\nDetails:\n\n     This, as 'reorder()', is a special case of simply calling\n     'factor(x, levels = levels(x)[....])'.\n\nValue:\n\n     A factor of the same length as 'x'.\n\nSee Also:\n\n     'factor', 'contr.treatment', 'levels', 'reorder'.\n\nExamples:\n\n     warpbreaks$tension <- relevel(warpbreaks$tension, ref = \"M\")\n     summary(lm(breaks ~ wool + tension, data = warpbreaks))\n\nFactors\n\nDescription:\n\n     The function 'factor' is used to encode a vector as a factor (the\n     terms 'category' and 'enumerated type' are also used for factors).\n     If argument 'ordered' is 'TRUE', the factor levels are assumed to\n     be ordered.  For compatibility with S there is also a function\n     'ordered'.\n\n     'is.factor', 'is.ordered', 'as.factor' and 'as.ordered' are the\n     membership and coercion functions for these classes.\n\nUsage:\n\n     factor(x = character(), levels, labels = levels,\n            exclude = NA, ordered = is.ordered(x), nmax = NA)\n     \n     ordered(x = character(), ...)\n     \n     is.factor(x)\n     is.ordered(x)\n     \n     as.factor(x)\n     as.ordered(x)\n     \n     addNA(x, ifany = FALSE)\n     \n     .valid.factor(object)\n     \nArguments:\n\n       x: a vector of data, usually taking a small number of distinct\n          values.\n\n  levels: an optional vector of the unique values (as character\n          strings) that 'x' might have taken.  The default is the\n          unique set of values taken by 'as.character(x)', sorted into\n          increasing order _of 'x'_.  
Note that this set can be\n          specified as smaller than 'sort(unique(x))'.\n\n  labels: _either_ an optional character vector of labels for the\n          levels (in the same order as 'levels' after removing those in\n          'exclude'), _or_ a character string of length 1.  Duplicated\n          values in 'labels' can be used to map different values of 'x'\n          to the same factor level.\n\n exclude: a vector of values to be excluded when forming the set of\n          levels.  This may be factor with the same level set as 'x' or\n          should be a 'character'.\n\n ordered: logical flag to determine if the levels should be regarded as\n          ordered (in the order given).\n\n    nmax: an upper bound on the number of levels; see 'Details'.\n\n     ...: (in 'ordered(.)'): any of the above, apart from 'ordered'\n          itself.\n\n   ifany: only add an 'NA' level if it is used, i.e.  if\n          'any(is.na(x))'.\n\n  object: an R object.\n\nDetails:\n\n     The type of the vector 'x' is not restricted; it only must have an\n     'as.character' method and be sortable (by 'order').\n\n     Ordered factors differ from factors only in their class, but\n     methods and the model-fitting functions treat the two classes\n     quite differently.\n\n     The encoding of the vector happens as follows.  First all the\n     values in 'exclude' are removed from 'levels'. If 'x[i]' equals\n     'levels[j]', then the 'i'-th element of the result is 'j'.  If no\n     match is found for 'x[i]' in 'levels' (which will happen for\n     excluded values) then the 'i'-th element of the result is set to\n     'NA'.\n\n     Normally the 'levels' used as an attribute of the result are the\n     reduced set of levels after removing those in 'exclude', but this\n     can be altered by supplying 'labels'.  
This should either be a set\n     of new labels for the levels, or a character string, in which case\n     the levels are that character string with a sequence number\n     appended.\n\n     'factor(x, exclude = NULL)' applied to a factor without 'NA's is a\n     no-operation unless there are unused levels: in that case, a\n     factor with the reduced level set is returned.  If 'exclude' is\n     used, since R version 3.4.0, excluding non-existing character\n     levels is equivalent to excluding nothing, and when 'exclude' is a\n     'character' vector, that _is_ applied to the levels of 'x'.\n     Alternatively, 'exclude' can be factor with the same level set as\n     'x' and will exclude the levels present in 'exclude'.\n\n     The codes of a factor may contain 'NA'.  For a numeric 'x', set\n     'exclude = NULL' to make 'NA' an extra level (prints as '<NA>');\n     by default, this is the last level.\n\n     If 'NA' is a level, the way to set a code to be missing (as\n     opposed to the code of the missing level) is to use 'is.na' on the\n     left-hand-side of an assignment (as in 'is.na(f)[i] <- TRUE';\n     indexing inside 'is.na' does not work).  Under those circumstances\n     missing values are currently printed as '<NA>', i.e., identical to\n     entries of level 'NA'.\n\n     'is.factor' is generic: you can write methods to handle specific\n     classes of objects, see InternalMethods.\n\n     Where 'levels' is not supplied, 'unique' is called.  
Since factors\n     typically have quite a small number of levels, for large vectors\n     'x' it is helpful to supply 'nmax' as an upper bound on the number\n     of unique values.\n\n     When using 'c' to combine a (possibly ordered) factor with other\n     objects, if all objects are (possibly ordered) factors, the result\n     will be a factor with levels the union of the level sets of the\n     elements, in the order the levels occur in the level sets of the\n     elements (which means that if all the elements have the same level\n     set, that is the level set of the result), equivalent to how\n     'unlist' operates on a list of factor objects.\n\nValue:\n\n     'factor' returns an object of class '\"factor\"' which has a set of\n     integer codes the length of 'x' with a '\"levels\"' attribute of\n     mode 'character' and unique ('!anyDuplicated(.)') entries.  If\n     argument 'ordered' is true (or 'ordered()' is used) the result has\n     class 'c(\"ordered\", \"factor\")'.  Undocumentedly for a long time,\n     'factor(x)' loses all 'attributes(x)' but '\"names\"', and resets\n     '\"levels\"' and '\"class\"'.\n\n     Applying 'factor' to an ordered or unordered factor returns a\n     factor (of the same type) with just the levels which occur: see\n     also '[.factor' for a more transparent way to achieve this.\n\n     'is.factor' returns 'TRUE' or 'FALSE' depending on whether its\n     argument is of type factor or not.  Correspondingly, 'is.ordered'\n     returns 'TRUE' when its argument is an ordered factor and 'FALSE'\n     otherwise.\n\n     'as.factor' coerces its argument to a factor.  
It is an\n     abbreviated (sometimes faster) form of 'factor'.\n\n     'as.ordered(x)' returns 'x' if this is ordered, and 'ordered(x)'\n     otherwise.\n\n     'addNA' modifies a factor by turning 'NA' into an extra level (so\n     that 'NA' values are counted in tables, for instance).\n\n     '.valid.factor(object)' checks the validity of a factor, currently\n     only 'levels(object)', and returns 'TRUE' if it is valid,\n     otherwise a string describing the validity problem.  This function\n     is used for 'validObject(<factor>)'.\n\nWarning:\n\n     The interpretation of a factor depends on both the codes and the\n     '\"levels\"' attribute.  Be careful only to compare factors with the\n     same set of levels (in the same order).  In particular,\n     'as.numeric' applied to a factor is meaningless, and may happen by\n     implicit coercion.  To transform a factor 'f' to approximately its\n     original numeric values, 'as.numeric(levels(f))[f]' is recommended\n     and slightly more efficient than 'as.numeric(as.character(f))'.\n\n     The levels of a factor are by default sorted, but the sort order\n     may well depend on the locale at the time of creation, and should\n     not be assumed to be ASCII.\n\n     There are some anomalies associated with factors that have 'NA' as\n     a level.  It is suggested to use them sparingly, e.g., only for\n     tabulation purposes.\n\nComparison operators and group generic methods:\n\n     There are '\"factor\"' and '\"ordered\"' methods for the group generic\n     'Ops' which provide methods for the Comparison operators, and for\n     the 'min', 'max', and 'range' generics in 'Summary' of\n     '\"ordered\"'.  
(The rest of the groups and the 'Math' group\n     generate an error as they are not meaningful for factors.)\n\n     Only '==' and '!=' can be used for factors: a factor can only be\n     compared to another factor with an identical set of levels (not\n     necessarily in the same ordering) or to a character vector.\n     Ordered factors are compared in the same way, but the general\n     dispatch mechanism precludes comparing ordered and unordered\n     factors.\n\n     All the comparison operators are available for ordered factors.\n     Collation is done by the levels of the operands: if both operands\n     are ordered factors they must have the same level set.\n\nNote:\n\n     In earlier versions of R, storing character data as a factor was\n     more space efficient if there is even a small proportion of\n     repeats.  However, identical character strings now share storage,\n     so the difference is small in most cases.  (Integer values are\n     stored in 4 bytes whereas each reference to a character string\n     needs a pointer of 4 or 8 bytes.)\n\nReferences:\n\n     Chambers, J. M. and Hastie, T. J. (1992) _Statistical Models in\n     S_.  Wadsworth & Brooks/Cole.\n\nSee Also:\n\n     '[.factor' for subsetting of factors.\n\n     'gl' for construction of balanced factors and 'C' for factors with\n     specified contrasts.  'levels' and 'nlevels' for accessing the\n     levels, and 'unclass' to get integer codes.\n\nExamples:\n\n     (ff <- factor(substring(\"statistics\", 1:10, 1:10), levels = letters))\n     as.integer(ff)      # the internal codes\n     (f. 
<- factor(ff))  # drops the levels that do not occur\n     ff[, drop = TRUE]   # the same, more transparently\n     \n     factor(letters[1:20], labels = \"letter\")\n     \n     class(ordered(4:1)) # \"ordered\", inheriting from \"factor\"\n     z <- factor(LETTERS[3:1], ordered = TRUE)\n     ## and \"relational\" methods work:\n     stopifnot(sort(z)[c(1,3)] == range(z), min(z) < max(z))\n     \n     \n     ## suppose you want \"NA\" as a level, and to allow missing values.\n     (x <- factor(c(1, 2, NA), exclude = NULL))\n     is.na(x)[2] <- TRUE\n     x  # [1] 1    <NA> <NA>\n     is.na(x)\n     # [1] FALSE  TRUE FALSE\n     \n     ## More rational, since R 3.4.0 :\n     factor(c(1:2, NA), exclude =  \"\" ) # keeps <NA> , as\n     factor(c(1:2, NA), exclude = NULL) # always did\n     ## exclude = <character>\n     z # ordered levels 'A < B < C'\n     factor(z, exclude = \"C\") # does exclude\n     factor(z, exclude = \"B\") # ditto\n     \n     ## Now, labels maybe duplicated:\n     ## factor() with duplicated labels allowing to \"merge levels\"\n     x <- c(\"Man\", \"Male\", \"Man\", \"Lady\", \"Female\")\n     ## Map from 4 different values to only two levels:\n     (xf <- factor(x, levels = c(\"Male\", \"Man\" , \"Lady\",   \"Female\"),\n                      labels = c(\"Male\", \"Male\", \"Female\", \"Female\")))\n     #> [1] Male   Male   Male   Female Female\n     #> Levels: Male Female\n     \n     ## Using addNA()\n     Month <- airquality$Month\n     table(addNA(Month))\n     table(addNA(Month, ifany = TRUE))\n\n\n\n## Changing factor reference examples\n\n\n::: {.cell}\n\n```{.r .cell-code}\ndf$age_group_factor <- relevel(df$age_group_factor, ref=\"young\")\nlevels(df$age_group_factor)\n```\n\n::: {.cell-output .cell-output-stdout}\n```\n[1] \"young\"  \"middle\" \"old\"   \n```\n:::\n:::\n\n\nOR\n\n\n::: {.cell}\n\n```{.r .cell-code}\ndf$age_group_factor <- factor(df$age_group, levels=c(\"young\", \"middle\", 
\"old\"))\nlevels(df$age_group_factor)\n```\n\n::: {.cell-output .cell-output-stdout}\n```\n[1] \"young\"  \"middle\" \"old\"   \n```\n:::\n:::\n\n\nArranging, tabulating, and plotting the data will reflect the new order.\n\n\n## Two-dimensional data classes\n\nTwo-dimensional classes are those we would often use to store data read from a file:\n\n* a matrix (`matrix` class)\n* a data frame (`data.frame` or `tibble` classes)\n\n\n## Matrices\n\nMatrices, like data frames, are composed of rows and columns. Unlike a `data.frame`, however, the entire matrix is composed of one R class. **For example: all entries are `numeric`, or all entries are `character`**\n\n`as.matrix()` creates a matrix from a data frame (where all values are the same class).\n\nYou can also create a matrix from scratch using `matrix()`. Use `?matrix` to see the arguments.  \n\n\n::: {.cell}\n\n```{.r .cell-code}\nmatrix(1:6, ncol = 2) \n```\n\n::: {.cell-output-display}\n|   |   |\n|--:|--:|\n|  1|  4|\n|  2|  5|\n|  3|  6|\n:::\n\n```{.r .cell-code}\nmatrix(1:6, ncol=2, byrow=TRUE) \n```\n\n::: {.cell-output-display}\n|   |   |\n|--:|--:|\n|  1|  2|\n|  3|  4|\n|  5|  6|\n:::\n:::\n\n\nNotice that the first matrix is filled with the numbers 1-6 by columns first and then rows, because the default value of the `byrow` argument is `FALSE`. In the second matrix, we changed the argument `byrow` to `TRUE`, and now the numbers 1-6 are filled by rows first and then columns.\n\n## Data Frame \n\nYou can transform an existing matrix into a data frame using `as.data.frame()`.  \n\n\n::: {.cell}\n\n```{.r .cell-code}\nas.data.frame(matrix(1:6, ncol = 2) ) \n```\n\n::: {.cell-output-display}\n| V1| V2|\n|--:|--:|\n|  1|  4|\n|  2|  5|\n|  3|  6|\n:::\n:::\n\n\n\n## Numeric variable data summary\n\nData summarization on numeric vectors/variables:\n\n-\t\t`mean()`: takes the mean of x\n-\t\t`sd()`: takes the standard deviation of x\n-\t\t`median()`: takes the median of x\n-\t\t`quantile()`: displays sample quantiles of x. 
The default is the min, 1st quartile, median, 3rd quartile, and max\n-\t\t`range()`: displays the range. Same as `c(min(x), max(x))`\n-\t\t`sum()`: sum of x\n-\t\t`max()`: maximum value in x\n-\t\t`min()`: minimum value in x\n\nNote, **all have the** `na.rm =` **argument for missing data**\n\n\nArithmetic Mean\n\nDescription:\n\n     Generic function for the (trimmed) arithmetic mean.\n\nUsage:\n\n     mean(x, ...)\n     \n     ## Default S3 method:\n     mean(x, trim = 0, na.rm = FALSE, ...)\n     \nArguments:\n\n       x: An R object.  Currently there are methods for numeric/logical\n          vectors and date, date-time and time interval objects.\n          Complex vectors are allowed for 'trim = 0', only.\n\n    trim: the fraction (0 to 0.5) of observations to be trimmed from\n          each end of 'x' before the mean is computed.  Values of trim\n          outside that range are taken as the nearest endpoint.\n\n   na.rm: a logical evaluating to 'TRUE' or 'FALSE' indicating whether\n          'NA' values should be stripped before the computation\n          proceeds.\n\n     ...: further arguments passed to or from other methods.\n\nValue:\n\n     If 'trim' is zero (the default), the arithmetic mean of the values\n     in 'x' is computed, as a numeric or complex vector of length one.\n     If 'x' is not logical (coerced to numeric), numeric (including\n     integer) or complex, 'NA_real_' is returned, with a warning.\n\n     If 'trim' is non-zero, a symmetrically trimmed mean is computed\n     with a fraction of 'trim' observations deleted from each end\n     before the mean is computed.\n\nReferences:\n\n     Becker, R. A., Chambers, J. M. and Wilks, A. R. (1988) _The New S\n     Language_.  
Wadsworth & Brooks/Cole.\n\nSee Also:\n\n     'weighted.mean', 'mean.POSIXct', 'colMeans' for row and column\n     means.\n\nExamples:\n\n     x <- c(0:10, 50)\n     xm <- mean(x)\n     c(xm, mean(x, trim = 0.10))\n\n\n## Numeric variable data summary examples\n\n\n::: {.cell}\n\n```{.r .cell-code}\nsummary(df)\n```\n\n::: {.cell-output-display}\n|   |observation_id |IgG_concentration |     age       |   gender        |    slum         |   log_IgG      | age_group       |age_group_factor |\n|:--|:--------------|:-----------------|:--------------|:----------------|:----------------|:---------------|:----------------|:----------------|\n|   |Min.   :5006   |Min.   :  0.0054  |Min.   : 1.000 |Length:651       |Length:651       |Min.   :-5.2231 |Length:651       |young :316       |\n|   |1st Qu.:6306   |1st Qu.:  0.3000  |1st Qu.: 3.000 |Class :character |Class :character |1st Qu.:-1.2040 |Class :character |middle:179       |\n|   |Median :7495   |Median :  1.6658  |Median : 6.000 |Mode  :character |Mode  :character |Median : 0.5103 |Mode  :character |old   :147       |\n|   |Mean   :7492   |Mean   : 87.3683  |Mean   : 6.606 |NA               |NA               |Mean   : 1.6074 |NA               |NA's  :  9       |\n|   |3rd Qu.:8749   |3rd Qu.:141.4405  |3rd Qu.:10.000 |NA               |NA               |3rd Qu.: 4.9519 |NA               |NA               |\n|   |Max.   :9982   |Max.   :916.4179  |Max.   :15.000 |NA               |NA               |Max.   
: 6.8205 |NA               |NA               |\n|   |NA             |NA's   :10        |NA's   :9      |NA               |NA               |NA's   :10      |NA               |NA               |\n:::\n\n```{.r .cell-code}\nrange(df$age)\n```\n\n::: {.cell-output .cell-output-stdout}\n```\n[1] NA NA\n```\n:::\n\n```{.r .cell-code}\nrange(df$age, na.rm=TRUE)\n```\n\n::: {.cell-output .cell-output-stdout}\n```\n[1]  1 15\n```\n:::\n\n```{.r .cell-code}\nmedian(df$IgG_concentration, na.rm=TRUE)\n```\n\n::: {.cell-output .cell-output-stdout}\n```\n[1] 1.665753\n```\n:::\n:::\n\n\n\n## Character Variable Data Summaries\n\nData summarization on character or factor vectors/variables\n\t\t* `table()`\n\t\t\n\nCross Tabulation and Table Creation\n\nDescription:\n\n     'table' uses cross-classifying factors to build a contingency\n     table of the counts at each combination of factor levels.\n\nUsage:\n\n     table(...,\n           exclude = if (useNA == \"no\") c(NA, NaN),\n           useNA = c(\"no\", \"ifany\", \"always\"),\n           dnn = list.names(...), deparse.level = 1)\n     \n     as.table(x, ...)\n     is.table(x)\n     \n     ## S3 method for class 'table'\n     as.data.frame(x, row.names = NULL, ...,\n                   responseName = \"Freq\", stringsAsFactors = TRUE,\n                   sep = \"\", base = list(LETTERS))\n     \nArguments:\n\n     ...: one or more objects which can be interpreted as factors\n          (including numbers or character strings), or a 'list' (such\n          as a data frame) whose components can be so interpreted.\n          (For 'as.table', arguments passed to specific methods; for\n          'as.data.frame', unused.)\n\n exclude: levels to remove for all factors in '...'.  If it does not\n          contain 'NA' and 'useNA' is not specified, it implies 'useNA\n          = \"ifany\"'.  See 'Details' for its interpretation for\n          non-factor arguments.\n\n   useNA: whether to include 'NA' values in the table.  
See 'Details'.\n          Can be abbreviated.\n\n     dnn: the names to be given to the dimensions in the result (the\n          _dimnames names_).\n\ndeparse.level: controls how the default 'dnn' is constructed.  See\n          'Details'.\n\n       x: an arbitrary R object, or an object inheriting from class\n          '\"table\"' for the 'as.data.frame' method. Note that\n          'as.data.frame.table(x, *)' may be called explicitly for\n          non-table 'x' for \"reshaping\" 'array's.\n\nrow.names: a character vector giving the row names for the data frame.\n\nresponseName: The name to be used for the column of table entries,\n          usually counts.\n\nstringsAsFactors: logical: should the classifying factors be returned\n          as factors (the default) or character vectors?\n\nsep, base: passed to 'provideDimnames'.\n\nDetails:\n\n     If the argument 'dnn' is not supplied, the internal function\n     'list.names' is called to compute the 'dimname names' as follows:\n     If '...' is one 'list' with its own 'names()', these 'names' are\n     used.  Otherwise, if the arguments in '...' are named, those names\n     are used.  For the remaining arguments, 'deparse.level = 0' gives\n     an empty name, 'deparse.level = 1' uses the supplied argument if\n     it is a symbol, and 'deparse.level = 2' will deparse the argument.\n\n     Only when 'exclude' is specified (i.e., not by default) and\n     non-empty, will 'table' potentially drop levels of factor\n     arguments.\n\n     'useNA' controls if the table includes counts of 'NA' values: the\n     allowed values correspond to never ('\"no\"'), only if the count is\n     positive ('\"ifany\"') and even for zero counts ('\"always\"').  Note\n     the somewhat \"pathological\" case of two different kinds of 'NA's\n     which are treated differently, depending on both 'useNA' and\n     'exclude', see 'd.patho' in the 'Examples:' below.\n\n     Both 'exclude' and 'useNA' operate on an \"all or none\" basis.  
If\n     you want to control the dimensions of a multiway table separately,\n     modify each argument using 'factor' or 'addNA'.\n\n     Non-factor arguments 'a' are coerced via 'factor(a,\n     exclude=exclude)'.  Since R 3.4.0, care is taken _not_ to count\n     the excluded values (where they were included in the 'NA' count,\n     previously).\n\n     The 'summary' method for class '\"table\"' (used for objects created\n     by 'table' or 'xtabs') which gives basic information and performs\n     a chi-squared test for independence of factors (note that the\n     function 'chisq.test' currently only handles 2-d tables).\n\nValue:\n\n     'table()' returns a _contingency table_, an object of class\n     '\"table\"', an array of integer values.  Note that unlike S the\n     result is always an 'array', a 1D array if one factor is given.\n\n     'as.table' and 'is.table' coerce to and test for contingency\n     table, respectively.\n\n     The 'as.data.frame' method for objects inheriting from class\n     '\"table\"' can be used to convert the array-based representation of\n     a contingency table to a data frame containing the classifying\n     factors and the corresponding entries (the latter as component\n     named by 'responseName').  This is the inverse of 'xtabs'.\n\nReferences:\n\n     Becker, R. A., Chambers, J. M. and Wilks, A. R. (1988) _The New S\n     Language_.  
Wadsworth & Brooks/Cole.\n\nSee Also:\n\n     'tabulate' is the underlying function and allows finer control.\n\n     Use 'ftable' for printing (and more) of multidimensional tables.\n     'margin.table', 'prop.table', 'addmargins'.\n\n     'addNA' for constructing factors with 'NA' as a level.\n\n     'xtabs' for cross tabulation of data frames with a formula\n     interface.\n\nExamples:\n\n     require(stats) # for rpois and xtabs\n     ## Simple frequency distribution\n     table(rpois(100, 5))\n     ## Check the design:\n     with(warpbreaks, table(wool, tension))\n     table(state.division, state.region)\n     \n     # simple two-way contingency table\n     with(airquality, table(cut(Temp, quantile(Temp)), Month))\n     \n     a <- letters[1:3]\n     table(a, sample(a))                    # dnn is c(\"a\", \"\")\n     table(a, sample(a), deparse.level = 0) # dnn is c(\"\", \"\")\n     table(a, sample(a), deparse.level = 2) # dnn is c(\"a\", \"sample(a)\")\n     \n     ## xtabs() <-> as.data.frame.table() :\n     UCBAdmissions ## already a contingency table\n     DF <- as.data.frame(UCBAdmissions)\n     class(tab <- xtabs(Freq ~ ., DF)) # xtabs & table\n     ## tab *is* \"the same\" as the original table:\n     all(tab == UCBAdmissions)\n     all.equal(dimnames(tab), dimnames(UCBAdmissions))\n     \n     a <- rep(c(NA, 1/0:3), 10)\n     table(a)                 # does not report NA's\n     table(a, exclude = NULL) # reports NA's\n     b <- factor(rep(c(\"A\",\"B\",\"C\"), 10))\n     table(b)\n     table(b, exclude = \"B\")\n     d <- factor(rep(c(\"A\",\"B\",\"C\"), 10), levels = c(\"A\",\"B\",\"C\",\"D\",\"E\"))\n     table(d, exclude = \"B\")\n     print(table(b, d), zero.print = \".\")\n     \n     ## NA counting:\n     is.na(d) <- 3:4\n     d. <- addNA(d)\n     d.[1:7]\n     table(d.) 
# \", exclude = NULL\" is not needed\n     ## i.e., if you want to count the NA's of 'd', use\n     table(d, useNA = \"ifany\")\n     \n     ## \"pathological\" case:\n     d.patho <- addNA(c(1,NA,1:2,1:3))[-7]; is.na(d.patho) <- 3:4\n     d.patho\n     ## just 3 consecutive NA's ? --- well, have *two* kinds of NAs here :\n     as.integer(d.patho) # 1 4 NA NA 1 2\n     ##\n     ## In R >= 3.4.0, table() allows to differentiate:\n     table(d.patho)                   # counts the \"unusual\" NA\n     table(d.patho, useNA = \"ifany\")  # counts all three\n     table(d.patho, exclude = NULL)   #  (ditto)\n     table(d.patho, exclude = NA)     # counts none\n     \n     ## Two-way tables with NA counts. The 3rd variant is absurd, but shows\n     ## something that cannot be done using exclude or useNA.\n     with(airquality,\n        table(OzHi = Ozone > 80, Month, useNA = \"ifany\"))\n     with(airquality,\n        table(OzHi = Ozone > 80, Month, useNA = \"always\"))\n     with(airquality,\n        table(OzHi = Ozone > 80, addNA(Month)))\n\n\n\n\n## Character variable data summary examples\n\nNumber of observations in each category\n\n\n::: {.cell}\n\n```{.r .cell-code}\ntable(df$gender)\n```\n\n::: {.cell-output-display}\n| Female| Male|\n|------:|----:|\n|    325|  326|\n:::\n\n```{.r .cell-code}\ntable(df$gender, useNA=\"always\")\n```\n\n::: {.cell-output-display}\n| Female| Male| NA|\n|------:|----:|--:|\n|    325|  326|  0|\n:::\n\n```{.r .cell-code}\ntable(df$age_group, useNA=\"always\")\n```\n\n::: {.cell-output-display}\n| middle| old| young| NA|\n|------:|---:|-----:|--:|\n|    179| 147|   316|  9|\n:::\n:::\n\n\nProportion of observations in each category\n\n\n::: {.cell}\n\n```{.r .cell-code}\ntable(df$gender)/nrow(df) #if no NA values\n```\n\n::: {.cell-output-display}\n|   Female|     Male|\n|--------:|--------:|\n| 0.499232| 0.500768|\n:::\n\n```{.r .cell-code}\ntable(df$age_group)/nrow(df[!is.na(df$age_group),]) #if there 
are NA values\n```\n\n::: {.cell-output-display}\n|    middle|      old|     young|\n|---------:|--------:|---------:|\n| 0.2788162| 0.228972| 0.4922118|\n:::\n\n```{.r .cell-code}\ntable(df$age_group)/nrow(subset(df, !is.na(df$age_group),)) #if there are NA values\n```\n\n::: {.cell-output-display}\n|    middle|      old|     young|\n|---------:|--------:|---------:|\n| 0.2788162| 0.228972| 0.4922118|\n:::\n:::\n\n\n\n\n## Summary\n\n-   Add (or modify) columns/variables in a data frame using `$`\n-   There are two types of numeric class objects: integer and double\n-   Logical class objects only have `TRUE` or `FALSE` (without quotes)\n-   `is.CLASS_NAME(x)` can be used to test the class of an object x\n-   `as.CLASS_NAME(x)` can be used to change the class of an object x\n-   Factors are a special character class that has levels \n-   ...\n\t\t\n\n## Acknowledgements\n\nThese are the materials I looked through, modified, or extracted to complete this module's lecture.\n\n-   [\"Introduction to R for Public Health Researchers\" Johns Hopkins University](https://jhudatascience.org/intro_to_r/)\n\n",
+    "markdown": "---\ntitle: \"Module 7: Variable Creation, Classes, and Summaries\"\nformat:\n  revealjs:\n    smaller: true\n    scrollable: true\n    toc: false\n---\n\n\n## Learning Objectives\n\nAfter module 7, you should be able to...\n\n-   Create new variables\n-   Characterize variable classes\n-   Manipulate the classes of variables\n-   Conduct 1 variable data summaries\n\n## Import data for this module\nLet's first read in the data from the previous module and look at it briefly with a new function `head()`.\n\n\n::: {.cell}\n\n```{.r .cell-code}\ndf <- read.csv(file = \"data/serodata.csv\") #relative path\nhead(x=df, n=3)\n```\n\n::: {.cell-output .cell-output-stdout}\n```\n  observation_id IgG_concentration age gender     slum\n1           5772         0.3176895   2 Female Non slum\n2           8095         3.4368231   4 Female Non slum\n3           9784         0.3000000   4   Male Non slum\n```\n:::\n:::\n\n\n```\nRegistered S3 method overwritten by 'printr':\n  method                from     \n  knit_print.data.frame rmarkdown\n```\n\nReturn the First or Last Parts of an Object\n\nDescription:\n\n     Returns the first or last parts of a vector, matrix, table, data\n     frame or function.  Since 'head()' and 'tail()' are generic\n     functions, they may also have been extended to other classes.\n\nUsage:\n\n     head(x, ...)\n     ## Default S3 method:\n     head(x, n = 6L, ...)\n     \n     ## S3 method for class 'matrix'\n     head(x, n = 6L, ...) # is exported as head.matrix()\n     ## NB: The methods for 'data.frame' and 'array'  are identical to the 'matrix' one\n     \n     ## S3 method for class 'ftable'\n     head(x, n = 6L, ...)\n     ## S3 method for class 'function'\n     head(x, n = 6L, ...)\n     \n     \n     tail(x, ...)\n     ## Default S3 method:\n     tail(x, n = 6L, keepnums = FALSE, addrownums, ...)\n     ## S3 method for class 'matrix'\n     tail(x, n = 6L, keepnums = TRUE, addrownums, ...) 
# exported as tail.matrix()\n     ## NB: The methods for 'data.frame', 'array', and 'table'\n     ##     are identical to the  'matrix'  one\n     \n     ## S3 method for class 'ftable'\n     tail(x, n = 6L, keepnums = FALSE, addrownums, ...)\n     ## S3 method for class 'function'\n     tail(x, n = 6L, ...)\n     \nArguments:\n\n       x: an object\n\n       n: an integer vector of length up to 'dim(x)' (or 1, for\n          non-dimensioned objects).  A 'logical' is silently coerced to\n          integer.  Values specify the indices to be selected in the\n          corresponding dimension (or along the length) of the object.\n          A positive value of 'n[i]' includes the first/last 'n[i]'\n          indices in that dimension, while a negative value excludes\n          the last/first 'abs(n[i])', including all remaining indices.\n          'NA' or non-specified values (when 'length(n) <\n          length(dim(x))') select all indices in that dimension. Must\n          contain at least one non-missing value.\n\nkeepnums: in each dimension, if no names in that dimension are present,\n          create them using the indices included in that dimension.\n          Ignored if 'dim(x)' is 'NULL' or its length 1.\n\naddrownums: deprecated - 'keepnums' should be used instead. Taken as\n          the value of 'keepnums' if it is explicitly set when\n          'keepnums' is not.\n\n     ...: arguments to be passed to or from other methods.\n\nDetails:\n\n     For vector/array based objects, 'head()' ('tail()') returns a\n     subset of the same dimensionality as 'x', usually of the same\n     class. For historical reasons, by default they select the first\n     (last) 6 indices in the first dimension (\"rows\") or along the\n     length of a non-dimensioned vector, and the full extent (all\n     indices) in any remaining dimensions. 
'head.matrix()' and\n     'tail.matrix()' are exported.\n\n     The default and array(/matrix) methods for 'head()' and 'tail()'\n     are quite general. They will work as is for any class which has a\n     'dim()' method, a 'length()' method (only required if 'dim()'\n     returns 'NULL'), and a '[' method (that accepts the 'drop'\n     argument and can subset in all dimensions in the dimensioned\n     case).\n\n     For functions, the lines of the deparsed function are returned as\n     character strings.\n\n     When 'x' is an array(/matrix) of dimensionality two and more,\n     'tail()' will add dimnames similar to how they would appear in a\n     full printing of 'x' for all dimensions 'k' where 'n[k]' is\n     specified and non-missing and 'dimnames(x)[[k]]' (or 'dimnames(x)'\n     itself) is 'NULL'.  Specifically, the form of the added dimnames\n     will vary for different dimensions as follows:\n\n     'k=1' (rows): '\"[n,]\"' (right justified with whitespace padding)\n\n     'k=2' (columns): '\"[,n]\"' (with _no_ whitespace padding)\n\n     'k>2' (higher dims): '\"n\"', i.e., the indices as _character_\n          values\n\n     Setting 'keepnums = FALSE' suppresses this behaviour.\n\n     As 'data.frame' subsetting ('indexing') keeps 'attributes', so do\n     the 'head()' and 'tail()' methods for data frames.\n\nValue:\n\n     An object (usually) like 'x' but generally smaller.  Hence, for\n     'array's, the result corresponds to 'x[.., drop=FALSE]'.  For\n     'ftable' objects 'x', a transformed 'format(x)'.\n\nNote:\n\n     For array inputs the output of 'tail' when 'keepnums' is 'TRUE',\n     any dimnames vectors added for dimensions '>2' are the original\n     numeric indices in that dimension _as character vectors_.  
This\n     means that, e.g., for 3-dimensional array 'arr', 'tail(arr,\n     c(2,2,-1))[ , , 2]' and 'tail(arr, c(2,2,-1))[ , , \"2\"]' may both\n     be valid but have completely different meanings.\n\nAuthor(s):\n\n     Patrick Burns, improved and corrected by R-Core. Negative argument\n     added by Vincent Goulet.  Multi-dimension support added by Gabriel\n     Becker.\n\nExamples:\n\n     head(letters)\n     head(letters, n = -6L)\n     \n     head(freeny.x, n = 10L)\n     head(freeny.y)\n     \n     head(iris3)\n     head(iris3, c(6L, 2L))\n     head(iris3, c(6L, -1L, 2L))\n     \n     tail(letters)\n     tail(letters, n = -6L)\n     \n     tail(freeny.x)\n     ## the bottom-right \"corner\" :\n     tail(freeny.x, n = c(4, 2))\n     tail(freeny.y)\n     \n     tail(iris3)\n     tail(iris3, c(6L, 2L))\n     tail(iris3, c(6L, -1L, 2L))\n     \n     ## iris with dimnames stripped\n     a3d <- iris3 ; dimnames(a3d) <- NULL\n     tail(a3d, c(6, -1, 2)) # keepnums = TRUE is default here!\n     tail(a3d, c(6, -1, 2), keepnums = FALSE)\n     \n     ## data frame w/ a (non-standard) attribute:\n     treeS <- structure(trees, foo = \"bar\")\n     (n <- nrow(treeS))\n     stopifnot(exprs = { # attribute is kept\n         identical(htS <- head(treeS), treeS[1:6, ])\n         identical(attr(htS, \"foo\") , \"bar\")\n         identical(tlS <- tail(treeS), treeS[(n-5):n, ])\n         ## BUT if I use \"useAttrib(.)\", this is *not* ok, when n is of length 2:\n         ## --- because [i,j]-indexing of data frames *also* drops \"other\" attributes ..\n         identical(tail(treeS, 3:2), treeS[(n-2):n, 2:3] )\n     })\n     \n     tail(library) # last lines of function\n     \n     head(stats::ftable(Titanic))\n     \n     ## 1d-array (with named dim) :\n     a1 <- array(1:7, 7); names(dim(a1)) <- \"O2\"\n     stopifnot(exprs = {\n       identical( tail(a1, 10), a1)\n       identical( head(a1, 10), a1)\n       identical( head(a1, 1), a1 [1 , drop=FALSE] ) # was a1[1] in R <= 
3.6.x\n       identical( tail(a1, 2), a1[6:7])\n       identical( tail(a1, 1), a1 [7 , drop=FALSE] ) # was a1[7] in R <= 3.6.x\n     })\n\n\n\n## Adding new columns\n\nYou can add a new column called `log_IgG` to `df` using the `$` operator:\n\n::: {.cell}\n\n```{.r .cell-code}\ndf$log_IgG <- log(df$IgG_concentration)\nhead(df,3)\n```\n\n::: {.cell-output-display}\n| observation_id| IgG_concentration| age|gender |slum     |   log_IgG|\n|--------------:|-----------------:|---:|:------|:--------|---------:|\n|           5772|         0.3176895|   2|Female |Non slum | -1.146681|\n|           8095|         3.4368231|   4|Female |Non slum |  1.234547|\n|           9784|         0.3000000|   4|Male   |Non slum | -1.203973|\n:::\n:::\n\n\nNote my use of an underscore, rather than a space, in the variable name.  This is good coding practice and makes calling variables much less prone to error.\n\n## Adding new columns\n\nWe can also add a new column using the `transform()` function:\n\n\n::: {.cell}\n::: {.cell-output .cell-output-stdout}\n```\nTransform an Object, for Example a Data Frame\n\nDescription:\n\n     'transform' is a generic function, which-at least currently-only\n     does anything useful with data frames.  'transform.default'\n     converts its first argument to a data frame if possible and calls\n     'transform.data.frame'.\n\nUsage:\n\n     transform(`_data`, ...)\n     \nArguments:\n\n   _data: The object to be transformed\n\n     ...: Further arguments of the form 'tag=value'\n\nDetails:\n\n     The '...' arguments to 'transform.data.frame' are tagged vector\n     expressions, which are evaluated in the data frame '_data'.  
The\n     tags are matched against 'names(_data)', and for those that match,\n     the value replace the corresponding variable in '_data', and the\n     others are appended to '_data'.\n\nValue:\n\n     The modified value of '_data'.\n\nWarning:\n\n     This is a convenience function intended for use interactively.\n     For programming it is better to use the standard subsetting\n     arithmetic functions, and in particular the non-standard\n     evaluation of argument 'transform' can have unanticipated\n     consequences.\n\nNote:\n\n     If some of the values are not vectors of the appropriate length,\n     you deserve whatever you get!\n\nAuthor(s):\n\n     Peter Dalgaard\n\nSee Also:\n\n     'within' for a more flexible approach, 'subset', 'list',\n     'data.frame'\n\nExamples:\n\n     transform(airquality, Ozone = -Ozone)\n     transform(airquality, new = -Ozone, Temp = (Temp-32)/1.8)\n     \n     attach(airquality)\n     transform(Ozone, logOzone = log(Ozone)) # marginally interesting ...\n     detach(airquality)\n```\n:::\n:::\n\n\nFor example, adding a binary column for seropositivity called `seropos`:\n\n\n::: {.cell}\n\n```{.r .cell-code}\ndf <- transform(df, seropos = IgG_concentration >= 10)\nhead(df)\n```\n\n::: {.cell-output-display}\n| observation_id| IgG_concentration| age|gender |slum     |    log_IgG|seropos |\n|--------------:|-----------------:|---:|:------|:--------|----------:|:-------|\n|           5772|         0.3176895|   2|Female |Non slum | -1.1466807|FALSE   |\n|           8095|         3.4368231|   4|Female |Non slum |  1.2345475|FALSE   |\n|           9784|         0.3000000|   4|Male   |Non slum | -1.2039728|FALSE   |\n|           9338|       143.2363014|   4|Male   |Non slum |  4.9644957|TRUE    |\n|           6369|         0.4476534|   1|Male   |Non slum | -0.8037359|FALSE   |\n|           6885|         0.0252708|   4|Male   |Non slum | -3.6781074|FALSE   |\n:::\n:::\n\n\n\n## Creating conditional variables\n\nOne 
frequently-used tool is creating variables with conditions. A general function for creating new variables based on existing variables is the Base R `ifelse()` function, which \"returns a value depending on whether the element of test is `TRUE` or `FALSE`.\"\n\n\nConditional Element Selection\n\nDescription:\n\n     'ifelse' returns a value with the same shape as 'test' which is\n     filled with elements selected from either 'yes' or 'no' depending\n     on whether the element of 'test' is 'TRUE' or 'FALSE'.\n\nUsage:\n\n     ifelse(test, yes, no)\n     \nArguments:\n\n    test: an object which can be coerced to logical mode.\n\n     yes: return values for true elements of 'test'.\n\n      no: return values for false elements of 'test'.\n\nDetails:\n\n     If 'yes' or 'no' are too short, their elements are recycled.\n     'yes' will be evaluated if and only if any element of 'test' is\n     true, and analogously for 'no'.\n\n     Missing values in 'test' give missing values in the result.\n\nValue:\n\n     A vector of the same length and attributes (including dimensions\n     and '\"class\"') as 'test' and data values from the values of 'yes'\n     or 'no'.  
The mode of the answer will be coerced from logical to\n     accommodate first any values taken from 'yes' and then any values\n     taken from 'no'.\n\nWarning:\n\n     The mode of the result may depend on the value of 'test' (see the\n     examples), and the class attribute (see 'oldClass') of the result\n     is taken from 'test' and may be inappropriate for the values\n     selected from 'yes' and 'no'.\n\n     Sometimes it is better to use a construction such as\n\n       (tmp <- yes; tmp[!test] <- no[!test]; tmp)\n     \n     , possibly extended to handle missing values in 'test'.\n\n     Further note that 'if(test) yes else no' is much more efficient\n     and often much preferable to 'ifelse(test, yes, no)' whenever\n     'test' is a simple true/false result, i.e., when 'length(test) ==\n     1'.\n\n     The 'srcref' attribute of functions is handled specially: if\n     'test' is a simple true result and 'yes' evaluates to a function\n     with 'srcref' attribute, 'ifelse' returns 'yes' including its\n     attribute (the same applies to a false 'test' and 'no' argument).\n     This functionality is only for backwards compatibility, the form\n     'if(test) yes else no' should be used whenever 'yes' and 'no' are\n     functions.\n\nReferences:\n\n     Becker, R. A., Chambers, J. M. and Wilks, A. R. (1988) _The New S\n     Language_.  Wadsworth & Brooks/Cole.\n\nSee Also:\n\n     'if'.\n\nExamples:\n\n     x <- c(6:-4)\n     sqrt(x)  #- gives warning\n     sqrt(ifelse(x >= 0, x, NA))  # no warning\n     \n     ## Note: the following also gives the warning !\n     ifelse(x >= 0, sqrt(x), NA)\n     \n     \n     ## ifelse() strips attributes\n     ## This is important when working with Dates and factors\n     x <- seq(as.Date(\"2000-02-29\"), as.Date(\"2004-10-04\"), by = \"1 month\")\n     ## has many \"yyyy-mm-29\", but a few \"yyyy-03-01\" in the non-leap years\n     y <- ifelse(as.POSIXlt(x)$mday == 29, x, NA)\n     head(y) # not what you expected ... 
==> need restore the class attribute:\n     class(y) <- class(x)\n     y\n     ## This is a (not atypical) case where it is better *not* to use ifelse(),\n     ## but rather the more efficient and still clear:\n     y2 <- x\n     y2[as.POSIXlt(x)$mday != 29] <- NA\n     ## which gives the same as ifelse()+class() hack:\n     stopifnot(identical(y2, y))\n     \n     \n     ## example of different return modes (and 'test' alone determining length):\n     yes <- 1:3\n     no  <- pi^(1:4)\n     utils::str( ifelse(NA,    yes, no) ) # logical, length 1\n     utils::str( ifelse(TRUE,  yes, no) ) # integer, length 1\n     utils::str( ifelse(FALSE, yes, no) ) # double,  length 1\n\n\n\n## `ifelse` example\n\nReminder: the three arguments of the `ifelse()` function are `ifelse(test, yes, no)`.\n\n\n::: {.cell}\n\n```{.r .cell-code}\ndf$age_group <- ifelse(df$age <= 5, \"young\", \"old\")\nhead(df)\n```\n\n::: {.cell-output-display}\n| observation_id| IgG_concentration| age|gender |slum     |    log_IgG|seropos |age_group |\n|--------------:|-----------------:|---:|:------|:--------|----------:|:-------|:---------|\n|           5772|         0.3176895|   2|Female |Non slum | -1.1466807|FALSE   |young     |\n|           8095|         3.4368231|   4|Female |Non slum |  1.2345475|FALSE   |young     |\n|           9784|         0.3000000|   4|Male   |Non slum | -1.2039728|FALSE   |young     |\n|           9338|       143.2363014|   4|Male   |Non slum |  4.9644957|TRUE    |young     |\n|           6369|         0.4476534|   1|Male   |Non slum | -0.8037359|FALSE   |young     |\n|           6885|         0.0252708|   4|Male   |Non slum | -3.6781074|FALSE   |young     |\n:::\n:::\n\n\nLet's look at what is actually happening, with a focus on the NA values in the `age` variable.\n\n\n::: {.cell}\n\n```{.r .cell-code}\ndf$age <= 5\n```\n\n::: {.cell-output .cell-output-stdout}\n```\n  [1]  TRUE  TRUE  TRUE  TRUE  TRUE  TRUE  TRUE    NA  TRUE  TRUE  TRUE FALSE\n [13] FALSE FALSE 
FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE  TRUE  TRUE\n [25]  TRUE  TRUE  TRUE  TRUE  TRUE  TRUE  TRUE  TRUE  TRUE  TRUE  TRUE FALSE\n [37] FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE\n [49]  TRUE  TRUE  TRUE  TRUE  TRUE  TRUE  TRUE  TRUE  TRUE  TRUE  TRUE  TRUE\n [61]  TRUE FALSE FALSE FALSE FALSE  TRUE FALSE FALSE FALSE FALSE FALSE FALSE\n [73] FALSE  TRUE  TRUE  TRUE    NA  TRUE  TRUE  TRUE FALSE FALSE FALSE FALSE\n [85] FALSE FALSE FALSE FALSE  TRUE  TRUE  TRUE  TRUE  TRUE  TRUE  TRUE  TRUE\n [97]  TRUE  TRUE  TRUE  TRUE  TRUE FALSE FALSE FALSE FALSE FALSE FALSE FALSE\n[109] FALSE FALSE FALSE FALSE FALSE FALSE  TRUE  TRUE  TRUE    NA  TRUE  TRUE\n[121]    NA  TRUE  TRUE  TRUE  TRUE FALSE FALSE FALSE FALSE FALSE FALSE FALSE\n[133] FALSE FALSE FALSE FALSE FALSE FALSE  TRUE  TRUE  TRUE  TRUE  TRUE  TRUE\n[145]  TRUE  TRUE  TRUE  TRUE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE\n[157] FALSE FALSE  TRUE  TRUE  TRUE  TRUE  TRUE  TRUE  TRUE  TRUE  TRUE FALSE\n[169] FALSE FALSE FALSE FALSE FALSE FALSE  TRUE FALSE FALSE FALSE FALSE  TRUE\n[181]  TRUE  TRUE  TRUE  TRUE  TRUE  TRUE  TRUE  TRUE  TRUE  TRUE FALSE FALSE\n[193] FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE  TRUE  TRUE\n[205]  TRUE  TRUE  TRUE  TRUE  TRUE FALSE FALSE FALSE FALSE FALSE FALSE FALSE\n[217] FALSE FALSE FALSE FALSE  TRUE  TRUE  TRUE  TRUE  TRUE  TRUE  TRUE  TRUE\n[229]  TRUE  TRUE  TRUE  TRUE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE\n[241] FALSE FALSE FALSE  TRUE  TRUE  TRUE  TRUE  TRUE  TRUE  TRUE  TRUE FALSE\n[253] FALSE FALSE FALSE FALSE FALSE  TRUE  TRUE  TRUE  TRUE  TRUE  TRUE  TRUE\n[265]  TRUE  TRUE  TRUE FALSE FALSE FALSE FALSE FALSE  TRUE  TRUE FALSE FALSE\n[277] FALSE  TRUE FALSE  TRUE  TRUE  TRUE  TRUE  TRUE  TRUE  TRUE  TRUE  TRUE\n[289]  TRUE    NA FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE\n[301]  TRUE  TRUE  TRUE  TRUE  TRUE  TRUE  TRUE  TRUE  TRUE FALSE FALSE FALSE\n[313]  TRUE FALSE FALSE FALSE FALSE FALSE 
FALSE FALSE FALSE  TRUE  TRUE  TRUE\n[325]  TRUE  TRUE  TRUE  TRUE  TRUE  TRUE FALSE FALSE FALSE  TRUE FALSE FALSE\n[337] FALSE  TRUE  TRUE  TRUE  TRUE  TRUE  TRUE  TRUE  TRUE  TRUE  TRUE FALSE\n[349] FALSE    NA FALSE FALSE  TRUE FALSE FALSE FALSE FALSE  TRUE  TRUE  TRUE\n[361]  TRUE  TRUE  TRUE  TRUE  TRUE  TRUE  TRUE  TRUE  TRUE FALSE FALSE FALSE\n[373] FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE  TRUE  TRUE  TRUE  TRUE\n[385]  TRUE  TRUE  TRUE  TRUE  TRUE FALSE FALSE FALSE FALSE FALSE FALSE FALSE\n[397] FALSE FALSE FALSE FALSE  TRUE  TRUE  TRUE  TRUE  TRUE  TRUE  TRUE  TRUE\n[409]  TRUE  TRUE  TRUE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE\n[421] FALSE FALSE FALSE  TRUE  TRUE  TRUE  TRUE  TRUE  TRUE  TRUE  TRUE  TRUE\n[433]  TRUE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE\n[445] FALSE FALSE  TRUE  TRUE  TRUE  TRUE    NA    NA  TRUE  TRUE  TRUE  TRUE\n[457]  TRUE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE\n[469] FALSE  TRUE  TRUE  TRUE  TRUE  TRUE  TRUE  TRUE  TRUE  TRUE  TRUE  TRUE\n[481]  TRUE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE  TRUE  TRUE\n[493]  TRUE  TRUE  TRUE  TRUE  TRUE  TRUE FALSE FALSE FALSE FALSE FALSE FALSE\n[505] FALSE FALSE FALSE FALSE FALSE FALSE  TRUE  TRUE  TRUE  TRUE  TRUE  TRUE\n[517]  TRUE  TRUE  TRUE  TRUE  TRUE FALSE FALSE FALSE FALSE FALSE FALSE FALSE\n[529] FALSE FALSE FALSE FALSE  TRUE  TRUE  TRUE  TRUE  TRUE  TRUE  TRUE  TRUE\n[541]  TRUE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE\n[553] FALSE  TRUE  TRUE  TRUE  TRUE  TRUE  TRUE  TRUE  TRUE  TRUE  TRUE  TRUE\n[565]  TRUE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE\n[577] FALSE  TRUE  TRUE  TRUE  TRUE  TRUE  TRUE  TRUE  TRUE  TRUE  TRUE  TRUE\n[589] FALSE  TRUE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE\n[601] FALSE FALSE FALSE  TRUE  TRUE  TRUE  TRUE  TRUE  TRUE  TRUE  TRUE  TRUE\n[613]  TRUE  TRUE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE 
FALSE FALSE\n[625] FALSE FALSE  TRUE  TRUE  TRUE  TRUE  TRUE  TRUE  TRUE  TRUE  TRUE  TRUE\n[637]  TRUE  TRUE  TRUE FALSE FALSE FALSE FALSE FALSE    NA FALSE FALSE FALSE\n[649] FALSE FALSE FALSE\n```\n:::\n\n```{.r .cell-code}\ntable(df$age, df$age_group, useNA=\"always\", dnn=list(\"age\", \"\"))\n```\n\n::: {.cell-output-display}\n|age/ | old| young| NA|\n|:----|---:|-----:|--:|\n|1    |   0|    44|  0|\n|2    |   0|    72|  0|\n|3    |   0|    79|  0|\n|4    |   0|    80|  0|\n|5    |   0|    41|  0|\n|6    |  38|     0|  0|\n|7    |  38|     0|  0|\n|8    |  39|     0|  0|\n|9    |  20|     0|  0|\n|10   |  44|     0|  0|\n|11   |  41|     0|  0|\n|12   |  23|     0|  0|\n|13   |  35|     0|  0|\n|14   |  37|     0|  0|\n|15   |  11|     0|  0|\n|NA   |   0|     0|  9|\n:::\n:::\n\n\n## Nesting `ifelse` statements example\n\n\n::: {.cell}\n\n```{.r .cell-code}\ndf$age_group <- ifelse(df$age <= 5, \"young\", \n                       ifelse(df$age<=10 & df$age>5, \"middle\", \"old\"))\ntable(df$age, df$age_group, useNA=\"always\", dnn=list(\"age\", \"\"))\n```\n\n::: {.cell-output-display}\n|age/ | middle| old| young| NA|\n|:----|------:|---:|-----:|--:|\n|1    |      0|   0|    44|  0|\n|2    |      0|   0|    72|  0|\n|3    |      0|   0|    79|  0|\n|4    |      0|   0|    80|  0|\n|5    |      0|   0|    41|  0|\n|6    |     38|   0|     0|  0|\n|7    |     38|   0|     0|  0|\n|8    |     39|   0|     0|  0|\n|9    |     20|   0|     0|  0|\n|10   |     44|   0|     0|  0|\n|11   |      0|  41|     0|  0|\n|12   |      0|  23|     0|  0|\n|13   |      0|  35|     0|  0|\n|14   |      0|  37|     0|  0|\n|15   |      0|  11|     0|  0|\n|NA   |      0|   0|     0|  9|\n:::\n:::\n\n\nNote that the categories are listed in alphabetical order; we will show how to change this later.\n\n# Data Classes\n\n## Overview - Data Classes\n\n1. One dimensional types (i.e., vectors of characters, numeric, logical, or factor values)\n\n2. 
Two dimensional types (e.g., matrix, data frame, tibble)\n\n3. Special data classes (e.g., lists, dates). \n\n## `class()` function\n\nThe `class()` function returns the class of an object.\n\n\n::: {.cell}\n\n```{.r .cell-code}\nclass(df$IgG_concentration)\n```\n\n::: {.cell-output .cell-output-stdout}\n```\n[1] \"numeric\"\n```\n:::\n\n```{.r .cell-code}\nclass(df$age)\n```\n\n::: {.cell-output .cell-output-stdout}\n```\n[1] \"integer\"\n```\n:::\n\n```{.r .cell-code}\nclass(df$gender)\n```\n\n::: {.cell-output .cell-output-stdout}\n```\n[1] \"character\"\n```\n:::\n:::\n\n\n\n## One dimensional data types\n\n* Character: strings or individual characters, quoted\n* Numeric: any real number(s)\n    - Double: contains fractional values (i.e., double precision) - default numeric\n    - Integer: any integer(s)/whole numbers\n* Logical: variables composed of TRUE or FALSE\n* Factor: categorical/qualitative variables\n\n## Character and numeric\n\nThis can also be a bit tricky. \n\nIf even one element of a vector is a character, the class of the whole vector is character.\n\n::: {.cell}\n\n```{.r .cell-code}\nclass(c(1, 2, \"tree\")) \n```\n\n::: {.cell-output .cell-output-stdout}\n```\n[1] \"character\"\n```\n:::\n:::\n\n\nHere, because the integers are in quotation marks, R reads them as character values.\n\n::: {.cell}\n\n```{.r .cell-code}\nclass(c(\"1\", \"4\", \"7\")) \n```\n\n::: {.cell-output .cell-output-stdout}\n```\n[1] \"character\"\n```\n:::\n:::\n\n\nNote, instead of creating a new vector object (e.g., `x <- c(\"1\", \"4\", \"7\")`) and then feeding the vector object `x` into the first argument of the `class()` function (e.g., `class(x)`), we combined the two steps and directly fed a vector object into the `class()` function.\n\n## Numeric Subclasses\n\nThere are two major numeric subclasses\n\n1. `Double` is a special subset of `numeric` that contains <span style=\"color: red;\">fractional values</span>. 
`Double` stands for [double-precision](https://en.wikipedia.org/wiki/Double-precision_floating-point_format)\n2. `Integer` is a special subset of `numeric` that contains only <span style=\"color: red;\">whole numbers</span>. \n\n`typeof()` identifies the vector type (double, integer, logical, or character), whereas `class()` identifies the root class. The difference between the two will become clearer when we look at two dimensional classes below.\n\n\n\n::: {.cell}\n\n```{.r .cell-code}\nclass(df$IgG_concentration)\n```\n\n::: {.cell-output .cell-output-stdout}\n```\n[1] \"numeric\"\n```\n:::\n\n```{.r .cell-code}\nclass(df$age)\n```\n\n::: {.cell-output .cell-output-stdout}\n```\n[1] \"integer\"\n```\n:::\n\n```{.r .cell-code}\ntypeof(df$IgG_concentration)\n```\n\n::: {.cell-output .cell-output-stdout}\n```\n[1] \"double\"\n```\n:::\n\n```{.r .cell-code}\ntypeof(df$age)\n```\n\n::: {.cell-output .cell-output-stdout}\n```\n[1] \"integer\"\n```\n:::\n:::\n\n\n\n## Logical\n\nReminder: `logical` is a type with only three possible values: `TRUE`, `FALSE`, and `NA`.\n\n\n::: {.cell}\n\n```{.r .cell-code}\nclass(c(TRUE, FALSE, TRUE, TRUE, FALSE))\n```\n\n::: {.cell-output .cell-output-stdout}\n```\n[1] \"logical\"\n```\n:::\n:::\n\n\nNote that when creating a `logical` object, `TRUE` and `FALSE` are NOT in quotes. Putting R special values (e.g., `NA` or `FALSE`) in quotation marks turns them into character values. \n\n\n## Other useful functions for evaluating/setting classes\n\nThere are two useful functions associated with practically all R classes: \n\n- `is.CLASS_NAME(x)` to **logically check** whether or not `x` is of a certain class.  For example, `is.integer`, `is.character`, or `is.numeric`.\n- `as.CLASS_NAME(x)` to **coerce** `x` from its current class into another class. For example, `as.integer`, `as.character`, or `as.numeric`.  
This is particularly useful if, for example, an integer variable was read in as a character variable, or when you need to change a character variable to a factor variable (more on this later).\n\n## Examples `is.CLASS_NAME(x)`\n\n\n::: {.cell}\n\n```{.r .cell-code}\nis.numeric(df$IgG_concentration)\n```\n\n::: {.cell-output .cell-output-stdout}\n```\n[1] TRUE\n```\n:::\n\n```{.r .cell-code}\nis.character(df$age)\n```\n\n::: {.cell-output .cell-output-stdout}\n```\n[1] FALSE\n```\n:::\n\n```{.r .cell-code}\nis.character(df$gender)\n```\n\n::: {.cell-output .cell-output-stdout}\n```\n[1] TRUE\n```\n:::\n:::\n\n\n## Examples `as.CLASS_NAME(x)`\n\nIn some cases, coercing is seamless\n\n::: {.cell}\n\n```{.r .cell-code}\nas.character(c(1, 4, 7))\n```\n\n::: {.cell-output .cell-output-stdout}\n```\n[1] \"1\" \"4\" \"7\"\n```\n:::\n\n```{.r .cell-code}\nas.numeric(c(\"1\", \"4\", \"7\"))\n```\n\n::: {.cell-output .cell-output-stdout}\n```\n[1] 1 4 7\n```\n:::\n\n```{.r .cell-code}\nas.logical(c(\"TRUE\", \"FALSE\", \"FALSE\"))\n```\n\n::: {.cell-output .cell-output-stdout}\n```\n[1]  TRUE FALSE FALSE\n```\n:::\n:::\n\n\nIn some cases, coercion is not possible; if attempted, it returns `NA`\n\n::: {.cell}\n\n```{.r .cell-code}\nas.numeric(c(\"1\", \"4\", \"7a\"))\n```\n\n::: {.cell-output .cell-output-stderr}\n```\nWarning: NAs introduced by coercion\n```\n:::\n\n::: {.cell-output .cell-output-stdout}\n```\n[1]  1  4 NA\n```\n:::\n\n```{.r .cell-code}\nas.logical(c(\"TRUE\", \"FALSE\", \"UNKNOWN\"))\n```\n\n::: {.cell-output .cell-output-stdout}\n```\n[1]  TRUE FALSE    NA\n```\n:::\n:::\n\n\n\n## Factors\n\nA `factor` is a special `character` vector where the elements have pre-defined groups or 'levels'. You can think of these as qualitative or categorical variables. Use the `factor()` function to create factors from character values. 
\n\n\n::: {.cell}\n\n```{.r .cell-code}\nclass(df$age_group)\n```\n\n::: {.cell-output .cell-output-stdout}\n```\n[1] \"character\"\n```\n:::\n\n```{.r .cell-code}\ndf$age_group_factor <- factor(df$age_group)\nclass(df$age_group_factor)\n```\n\n::: {.cell-output .cell-output-stdout}\n```\n[1] \"factor\"\n```\n:::\n\n```{.r .cell-code}\nlevels(df$age_group_factor)\n```\n\n::: {.cell-output .cell-output-stdout}\n```\n[1] \"middle\" \"old\"    \"young\" \n```\n:::\n:::\n\n\nNote 1: levels are, by default, set in **alphanumerical** order, and the first level is always the \"reference\" group. However, we often prefer a different reference group.\n\nNote 2: we can also make ordered factors using `factor(..., ordered=TRUE)`, but we won't talk more about that.\n\n## Reference Groups \n\n**Why do we care about reference groups?** \n\nGeneralized linear regression allows you to compare the outcome of two or more groups. Your reference group is the group that everything else is compared to. Say we want to assess whether being young (5 years old or younger) is associated with higher IgG antibody concentrations.\n\nBy default, `middle` is the reference group, so we would only generate beta coefficients comparing `middle` to `young` and `middle` to `old`.  But we want `young` to be the reference group, so that we generate beta coefficients comparing `young` to `middle` and `young` to `old`.\n\n## Changing factor reference \n\nChanging the reference group of a factor variable.\n\n- If the object is already a factor, use the `relevel()` function and the `ref` argument to specify the reference.\n- If the object is a character, use the `factor()` function and the `levels` argument to specify the order of the values, the first being the reference.\n\n\nLet's look at the `relevel()` help file\n\nReorder Levels of Factor\n\nDescription:\n\n     The levels of a factor are re-ordered so that the level specified\n     by 'ref' is first and the others are moved down. 
This is useful\n     for 'contr.treatment' contrasts which take the first level as the\n     reference.\n\nUsage:\n\n     relevel(x, ref, ...)\n     \nArguments:\n\n       x: an unordered factor.\n\n     ref: the reference level, typically a string.\n\n     ...: additional arguments for future methods.\n\nDetails:\n\n     This, as 'reorder()', is a special case of simply calling\n     'factor(x, levels = levels(x)[....])'.\n\nValue:\n\n     A factor of the same length as 'x'.\n\nSee Also:\n\n     'factor', 'contr.treatment', 'levels', 'reorder'.\n\nExamples:\n\n     warpbreaks$tension <- relevel(warpbreaks$tension, ref = \"M\")\n     summary(lm(breaks ~ wool + tension, data = warpbreaks))\n\n\n</br>\n\nLet's look at the `factor()` help file\n\nFactors\n\nDescription:\n\n     The function 'factor' is used to encode a vector as a factor (the\n     terms 'category' and 'enumerated type' are also used for factors).\n     If argument 'ordered' is 'TRUE', the factor levels are assumed to\n     be ordered.  For compatibility with S there is also a function\n     'ordered'.\n\n     'is.factor', 'is.ordered', 'as.factor' and 'as.ordered' are the\n     membership and coercion functions for these classes.\n\nUsage:\n\n     factor(x = character(), levels, labels = levels,\n            exclude = NA, ordered = is.ordered(x), nmax = NA)\n     \n     ordered(x = character(), ...)\n     \n     is.factor(x)\n     is.ordered(x)\n     \n     as.factor(x)\n     as.ordered(x)\n     \n     addNA(x, ifany = FALSE)\n     \n     .valid.factor(object)\n     \nArguments:\n\n       x: a vector of data, usually taking a small number of distinct\n          values.\n\n  levels: an optional vector of the unique values (as character\n          strings) that 'x' might have taken.  The default is the\n          unique set of values taken by 'as.character(x)', sorted into\n          increasing order _of 'x'_.  
Note that this set can be\n          specified as smaller than 'sort(unique(x))'.\n\n  labels: _either_ an optional character vector of labels for the\n          levels (in the same order as 'levels' after removing those in\n          'exclude'), _or_ a character string of length 1.  Duplicated\n          values in 'labels' can be used to map different values of 'x'\n          to the same factor level.\n\n exclude: a vector of values to be excluded when forming the set of\n          levels.  This may be factor with the same level set as 'x' or\n          should be a 'character'.\n\n ordered: logical flag to determine if the levels should be regarded as\n          ordered (in the order given).\n\n    nmax: an upper bound on the number of levels; see 'Details'.\n\n     ...: (in 'ordered(.)'): any of the above, apart from 'ordered'\n          itself.\n\n   ifany: only add an 'NA' level if it is used, i.e.  if\n          'any(is.na(x))'.\n\n  object: an R object.\n\nDetails:\n\n     The type of the vector 'x' is not restricted; it only must have an\n     'as.character' method and be sortable (by 'order').\n\n     Ordered factors differ from factors only in their class, but\n     methods and the model-fitting functions treat the two classes\n     quite differently.\n\n     The encoding of the vector happens as follows.  First all the\n     values in 'exclude' are removed from 'levels'. If 'x[i]' equals\n     'levels[j]', then the 'i'-th element of the result is 'j'.  If no\n     match is found for 'x[i]' in 'levels' (which will happen for\n     excluded values) then the 'i'-th element of the result is set to\n     'NA'.\n\n     Normally the 'levels' used as an attribute of the result are the\n     reduced set of levels after removing those in 'exclude', but this\n     can be altered by supplying 'labels'.  
This should either be a set\n     of new labels for the levels, or a character string, in which case\n     the levels are that character string with a sequence number\n     appended.\n\n     'factor(x, exclude = NULL)' applied to a factor without 'NA's is a\n     no-operation unless there are unused levels: in that case, a\n     factor with the reduced level set is returned.  If 'exclude' is\n     used, since R version 3.4.0, excluding non-existing character\n     levels is equivalent to excluding nothing, and when 'exclude' is a\n     'character' vector, that _is_ applied to the levels of 'x'.\n     Alternatively, 'exclude' can be factor with the same level set as\n     'x' and will exclude the levels present in 'exclude'.\n\n     The codes of a factor may contain 'NA'.  For a numeric 'x', set\n     'exclude = NULL' to make 'NA' an extra level (prints as '<NA>');\n     by default, this is the last level.\n\n     If 'NA' is a level, the way to set a code to be missing (as\n     opposed to the code of the missing level) is to use 'is.na' on the\n     left-hand-side of an assignment (as in 'is.na(f)[i] <- TRUE';\n     indexing inside 'is.na' does not work).  Under those circumstances\n     missing values are currently printed as '<NA>', i.e., identical to\n     entries of level 'NA'.\n\n     'is.factor' is generic: you can write methods to handle specific\n     classes of objects, see InternalMethods.\n\n     Where 'levels' is not supplied, 'unique' is called.  
Since factors\n     typically have quite a small number of levels, for large vectors\n     'x' it is helpful to supply 'nmax' as an upper bound on the number\n     of unique values.\n\n     When using 'c' to combine a (possibly ordered) factor with other\n     objects, if all objects are (possibly ordered) factors, the result\n     will be a factor with levels the union of the level sets of the\n     elements, in the order the levels occur in the level sets of the\n     elements (which means that if all the elements have the same level\n     set, that is the level set of the result), equivalent to how\n     'unlist' operates on a list of factor objects.\n\nValue:\n\n     'factor' returns an object of class '\"factor\"' which has a set of\n     integer codes the length of 'x' with a '\"levels\"' attribute of\n     mode 'character' and unique ('!anyDuplicated(.)') entries.  If\n     argument 'ordered' is true (or 'ordered()' is used) the result has\n     class 'c(\"ordered\", \"factor\")'.  Undocumentedly for a long time,\n     'factor(x)' loses all 'attributes(x)' but '\"names\"', and resets\n     '\"levels\"' and '\"class\"'.\n\n     Applying 'factor' to an ordered or unordered factor returns a\n     factor (of the same type) with just the levels which occur: see\n     also '[.factor' for a more transparent way to achieve this.\n\n     'is.factor' returns 'TRUE' or 'FALSE' depending on whether its\n     argument is of type factor or not.  Correspondingly, 'is.ordered'\n     returns 'TRUE' when its argument is an ordered factor and 'FALSE'\n     otherwise.\n\n     'as.factor' coerces its argument to a factor.  
It is an\n     abbreviated (sometimes faster) form of 'factor'.\n\n     'as.ordered(x)' returns 'x' if this is ordered, and 'ordered(x)'\n     otherwise.\n\n     'addNA' modifies a factor by turning 'NA' into an extra level (so\n     that 'NA' values are counted in tables, for instance).\n\n     '.valid.factor(object)' checks the validity of a factor, currently\n     only 'levels(object)', and returns 'TRUE' if it is valid,\n     otherwise a string describing the validity problem.  This function\n     is used for 'validObject(<factor>)'.\n\nWarning:\n\n     The interpretation of a factor depends on both the codes and the\n     '\"levels\"' attribute.  Be careful only to compare factors with the\n     same set of levels (in the same order).  In particular,\n     'as.numeric' applied to a factor is meaningless, and may happen by\n     implicit coercion.  To transform a factor 'f' to approximately its\n     original numeric values, 'as.numeric(levels(f))[f]' is recommended\n     and slightly more efficient than 'as.numeric(as.character(f))'.\n\n     The levels of a factor are by default sorted, but the sort order\n     may well depend on the locale at the time of creation, and should\n     not be assumed to be ASCII.\n\n     There are some anomalies associated with factors that have 'NA' as\n     a level.  It is suggested to use them sparingly, e.g., only for\n     tabulation purposes.\n\nComparison operators and group generic methods:\n\n     There are '\"factor\"' and '\"ordered\"' methods for the group generic\n     'Ops' which provide methods for the Comparison operators, and for\n     the 'min', 'max', and 'range' generics in 'Summary' of\n     '\"ordered\"'.  
(The rest of the groups and the 'Math' group\n     generate an error as they are not meaningful for factors.)\n\n     Only '==' and '!=' can be used for factors: a factor can only be\n     compared to another factor with an identical set of levels (not\n     necessarily in the same ordering) or to a character vector.\n     Ordered factors are compared in the same way, but the general\n     dispatch mechanism precludes comparing ordered and unordered\n     factors.\n\n     All the comparison operators are available for ordered factors.\n     Collation is done by the levels of the operands: if both operands\n     are ordered factors they must have the same level set.\n\nNote:\n\n     In earlier versions of R, storing character data as a factor was\n     more space efficient if there is even a small proportion of\n     repeats.  However, identical character strings now share storage,\n     so the difference is small in most cases.  (Integer values are\n     stored in 4 bytes whereas each reference to a character string\n     needs a pointer of 4 or 8 bytes.)\n\nReferences:\n\n     Chambers, J. M. and Hastie, T. J. (1992) _Statistical Models in\n     S_.  Wadsworth & Brooks/Cole.\n\nSee Also:\n\n     '[.factor' for subsetting of factors.\n\n     'gl' for construction of balanced factors and 'C' for factors with\n     specified contrasts.  'levels' and 'nlevels' for accessing the\n     levels, and 'unclass' to get integer codes.\n\nExamples:\n\n     (ff <- factor(substring(\"statistics\", 1:10, 1:10), levels = letters))\n     as.integer(ff)      # the internal codes\n     (f. 
<- factor(ff))  # drops the levels that do not occur\n     ff[, drop = TRUE]   # the same, more transparently\n     \n     factor(letters[1:20], labels = \"letter\")\n     \n     class(ordered(4:1)) # \"ordered\", inheriting from \"factor\"\n     z <- factor(LETTERS[3:1], ordered = TRUE)\n     ## and \"relational\" methods work:\n     stopifnot(sort(z)[c(1,3)] == range(z), min(z) < max(z))\n     \n     \n     ## suppose you want \"NA\" as a level, and to allow missing values.\n     (x <- factor(c(1, 2, NA), exclude = NULL))\n     is.na(x)[2] <- TRUE\n     x  # [1] 1    <NA> <NA>\n     is.na(x)\n     # [1] FALSE  TRUE FALSE\n     \n     ## More rational, since R 3.4.0 :\n     factor(c(1:2, NA), exclude =  \"\" ) # keeps <NA> , as\n     factor(c(1:2, NA), exclude = NULL) # always did\n     ## exclude = <character>\n     z # ordered levels 'A < B < C'\n     factor(z, exclude = \"C\") # does exclude\n     factor(z, exclude = \"B\") # ditto\n     \n     ## Now, labels maybe duplicated:\n     ## factor() with duplicated labels allowing to \"merge levels\"\n     x <- c(\"Man\", \"Male\", \"Man\", \"Lady\", \"Female\")\n     ## Map from 4 different values to only two levels:\n     (xf <- factor(x, levels = c(\"Male\", \"Man\" , \"Lady\",   \"Female\"),\n                      labels = c(\"Male\", \"Male\", \"Female\", \"Female\")))\n     #> [1] Male   Male   Male   Female Female\n     #> Levels: Male Female\n     \n     ## Using addNA()\n     Month <- airquality$Month\n     table(addNA(Month))\n     table(addNA(Month, ifany = TRUE))\n\n\n\n## Changing factor reference examples\n\n\n::: {.cell}\n\n```{.r .cell-code}\ndf$age_group_factor <- relevel(df$age_group_factor, ref=\"young\")\nlevels(df$age_group_factor)\n```\n\n::: {.cell-output .cell-output-stdout}\n```\n[1] \"young\"  \"middle\" \"old\"   \n```\n:::\n:::\n\n\nOR\n\n\n::: {.cell}\n\n```{.r .cell-code}\ndf$age_group_factor <- factor(df$age_group, levels=c(\"young\", \"middle\", 
\"old\"))\nlevels(df$age_group_factor)\n```\n\n::: {.cell-output .cell-output-stdout}\n```\n[1] \"young\"  \"middle\" \"old\"   \n```\n:::\n:::\n\n\nArranging, tabulating, and plotting the data will reflect the new order\n\n\n## Two-dimensional data classes\n\nTwo-dimensional classes are those we would often use to store data read from a file \n\n* a matrix (`matrix` class)\n* a data frame (`data.frame` or `tibble` classes)\n\n\n## Matrices\n\nMatrices, like data frames, are composed of rows and columns. Unlike a `data.frame`, the entire matrix is composed of one R class. **For example: all entries are `numeric`, or all entries are `character`**\n\n`as.matrix()` creates a matrix from a data frame (where all values are the same class).\n\nYou can also create a matrix from scratch using `matrix()`. Use `?matrix` to see the arguments.  \n\n\n::: {.cell}\n\n```{.r .cell-code}\nmatrix(data=1:6, ncol = 2) \n```\n\n::: {.cell-output-display}\n|   |   |\n|--:|--:|\n|  1|  4|\n|  2|  5|\n|  3|  6|\n:::\n\n```{.r .cell-code}\nmatrix(data=1:6, ncol=2, byrow=TRUE) \n```\n\n::: {.cell-output-display}\n|   |   |\n|--:|--:|\n|  1|  2|\n|  3|  4|\n|  5|  6|\n:::\n:::\n\n\nNote, the first matrix filled in numbers 1-6 by columns first and then rows because the default value of the `byrow` argument is `FALSE`. In the second matrix, we changed the argument `byrow` to `TRUE`, and now numbers 1-6 are filled by rows first and then columns.\n\n## Data frame \n\nYou can transform an existing matrix into a data frame using `as.data.frame()`  \n\n\n::: {.cell}\n\n```{.r .cell-code}\nas.data.frame(matrix(1:6, ncol = 2) ) \n```\n\n::: {.cell-output-display}\n| V1| V2|\n|--:|--:|\n|  1|  4|\n|  2|  5|\n|  3|  6|\n:::\n:::\n\n\n\n## Numeric variable data summary\n\nData summarization on numeric vectors/variables:\n\n-\t`mean()`: takes the mean of x\n-\t`sd()`: takes the standard deviation of x\n-\t`median()`: takes the median of x\n-\t`quantile()`: displays sample quantiles of x. 
Default is min, quartiles (25%, 50%, 75%), and max\n-\t`range()`: displays the range. Same as `c(min(x), max(x))`\n-\t`sum()`: sum of x\n-\t`max()`: maximum value in x\n-\t`min()`: minimum value in x\n\nNote, **all have the `na.rm` argument for missing data**\n\n\nArithmetic Mean\n\nDescription:\n\n     Generic function for the (trimmed) arithmetic mean.\n\nUsage:\n\n     mean(x, ...)\n     \n     ## Default S3 method:\n     mean(x, trim = 0, na.rm = FALSE, ...)\n     \nArguments:\n\n       x: An R object.  Currently there are methods for numeric/logical\n          vectors and date, date-time and time interval objects.\n          Complex vectors are allowed for 'trim = 0', only.\n\n    trim: the fraction (0 to 0.5) of observations to be trimmed from\n          each end of 'x' before the mean is computed.  Values of trim\n          outside that range are taken as the nearest endpoint.\n\n   na.rm: a logical evaluating to 'TRUE' or 'FALSE' indicating whether\n          'NA' values should be stripped before the computation\n          proceeds.\n\n     ...: further arguments passed to or from other methods.\n\nValue:\n\n     If 'trim' is zero (the default), the arithmetic mean of the values\n     in 'x' is computed, as a numeric or complex vector of length one.\n     If 'x' is not logical (coerced to numeric), numeric (including\n     integer) or complex, 'NA_real_' is returned, with a warning.\n\n     If 'trim' is non-zero, a symmetrically trimmed mean is computed\n     with a fraction of 'trim' observations deleted from each end\n     before the mean is computed.\n\nReferences:\n\n     Becker, R. A., Chambers, J. M. and Wilks, A. R. (1988) _The New S\n     Language_.  
Wadsworth & Brooks/Cole.\n\nSee Also:\n\n     'weighted.mean', 'mean.POSIXct', 'colMeans' for row and column\n     means.\n\nExamples:\n\n     x <- c(0:10, 50)\n     xm <- mean(x)\n     c(xm, mean(x, trim = 0.10))\n\n\n## Numeric variable data summary examples\n\n\n::: {.cell}\n\n```{.r .cell-code}\nsummary(df)\n```\n\n::: {.cell-output-display}\n|   |observation_id |IgG_concentration |     age       |   gender        |    slum         |   log_IgG      | seropos      | age_group       |age_group_factor |\n|:--|:--------------|:-----------------|:--------------|:----------------|:----------------|:---------------|:-------------|:----------------|:----------------|\n|   |Min.   :5006   |Min.   :  0.0054  |Min.   : 1.000 |Length:651       |Length:651       |Min.   :-5.2231 |Mode :logical |Length:651       |young :316       |\n|   |1st Qu.:6306   |1st Qu.:  0.3000  |1st Qu.: 3.000 |Class :character |Class :character |1st Qu.:-1.2040 |FALSE:360     |Class :character |middle:179       |\n|   |Median :7495   |Median :  1.6658  |Median : 6.000 |Mode  :character |Mode  :character |Median : 0.5103 |TRUE :281     |Mode  :character |old   :147       |\n|   |Mean   :7492   |Mean   : 87.3683  |Mean   : 6.606 |NA               |NA               |Mean   : 1.6074 |NA's :10      |NA               |NA's  :  9       |\n|   |3rd Qu.:8749   |3rd Qu.:141.4405  |3rd Qu.:10.000 |NA               |NA               |3rd Qu.: 4.9519 |NA            |NA               |NA               |\n|   |Max.   :9982   |Max.   :916.4179  |Max.   :15.000 |NA               |NA               |Max.   
: 6.8205 |NA            |NA               |NA               |\n|   |NA             |NA's   :10        |NA's   :9      |NA               |NA               |NA's   :10      |NA            |NA               |NA               |\n:::\n\n```{.r .cell-code}\nrange(df$age)\n```\n\n::: {.cell-output .cell-output-stdout}\n```\n[1] NA NA\n```\n:::\n\n```{.r .cell-code}\nrange(df$age, na.rm=TRUE)\n```\n\n::: {.cell-output .cell-output-stdout}\n```\n[1]  1 15\n```\n:::\n\n```{.r .cell-code}\nmedian(df$IgG_concentration, na.rm=TRUE)\n```\n\n::: {.cell-output .cell-output-stdout}\n```\n[1] 1.665753\n```\n:::\n:::\n\n\n\n## Character variable data summaries\n\nData summarization on character or factor vectors/variables using `table()`\n\t\t\n\nCross Tabulation and Table Creation\n\nDescription:\n\n     'table' uses cross-classifying factors to build a contingency\n     table of the counts at each combination of factor levels.\n\nUsage:\n\n     table(...,\n           exclude = if (useNA == \"no\") c(NA, NaN),\n           useNA = c(\"no\", \"ifany\", \"always\"),\n           dnn = list.names(...), deparse.level = 1)\n     \n     as.table(x, ...)\n     is.table(x)\n     \n     ## S3 method for class 'table'\n     as.data.frame(x, row.names = NULL, ...,\n                   responseName = \"Freq\", stringsAsFactors = TRUE,\n                   sep = \"\", base = list(LETTERS))\n     \nArguments:\n\n     ...: one or more objects which can be interpreted as factors\n          (including numbers or character strings), or a 'list' (such\n          as a data frame) whose components can be so interpreted.\n          (For 'as.table', arguments passed to specific methods; for\n          'as.data.frame', unused.)\n\n exclude: levels to remove for all factors in '...'.  If it does not\n          contain 'NA' and 'useNA' is not specified, it implies 'useNA\n          = \"ifany\"'.  
See 'Details' for its interpretation for\n          non-factor arguments.\n\n   useNA: whether to include 'NA' values in the table.  See 'Details'.\n          Can be abbreviated.\n\n     dnn: the names to be given to the dimensions in the result (the\n          _dimnames names_).\n\ndeparse.level: controls how the default 'dnn' is constructed.  See\n          'Details'.\n\n       x: an arbitrary R object, or an object inheriting from class\n          '\"table\"' for the 'as.data.frame' method. Note that\n          'as.data.frame.table(x, *)' may be called explicitly for\n          non-table 'x' for \"reshaping\" 'array's.\n\nrow.names: a character vector giving the row names for the data frame.\n\nresponseName: The name to be used for the column of table entries,\n          usually counts.\n\nstringsAsFactors: logical: should the classifying factors be returned\n          as factors (the default) or character vectors?\n\nsep, base: passed to 'provideDimnames'.\n\nDetails:\n\n     If the argument 'dnn' is not supplied, the internal function\n     'list.names' is called to compute the 'dimname names' as follows:\n     If '...' is one 'list' with its own 'names()', these 'names' are\n     used.  Otherwise, if the arguments in '...' are named, those names\n     are used.  For the remaining arguments, 'deparse.level = 0' gives\n     an empty name, 'deparse.level = 1' uses the supplied argument if\n     it is a symbol, and 'deparse.level = 2' will deparse the argument.\n\n     Only when 'exclude' is specified (i.e., not by default) and\n     non-empty, will 'table' potentially drop levels of factor\n     arguments.\n\n     'useNA' controls if the table includes counts of 'NA' values: the\n     allowed values correspond to never ('\"no\"'), only if the count is\n     positive ('\"ifany\"') and even for zero counts ('\"always\"').  
Note\n     the somewhat \"pathological\" case of two different kinds of 'NA's\n     which are treated differently, depending on both 'useNA' and\n     'exclude', see 'd.patho' in the 'Examples:' below.\n\n     Both 'exclude' and 'useNA' operate on an \"all or none\" basis.  If\n     you want to control the dimensions of a multiway table separately,\n     modify each argument using 'factor' or 'addNA'.\n\n     Non-factor arguments 'a' are coerced via 'factor(a,\n     exclude=exclude)'.  Since R 3.4.0, care is taken _not_ to count\n     the excluded values (where they were included in the 'NA' count,\n     previously).\n\n     The 'summary' method for class '\"table\"' (used for objects created\n     by 'table' or 'xtabs') which gives basic information and performs\n     a chi-squared test for independence of factors (note that the\n     function 'chisq.test' currently only handles 2-d tables).\n\nValue:\n\n     'table()' returns a _contingency table_, an object of class\n     '\"table\"', an array of integer values.  Note that unlike S the\n     result is always an 'array', a 1D array if one factor is given.\n\n     'as.table' and 'is.table' coerce to and test for contingency\n     table, respectively.\n\n     The 'as.data.frame' method for objects inheriting from class\n     '\"table\"' can be used to convert the array-based representation of\n     a contingency table to a data frame containing the classifying\n     factors and the corresponding entries (the latter as component\n     named by 'responseName').  This is the inverse of 'xtabs'.\n\nReferences:\n\n     Becker, R. A., Chambers, J. M. and Wilks, A. R. (1988) _The New S\n     Language_.  
Wadsworth & Brooks/Cole.\n\nSee Also:\n\n     'tabulate' is the underlying function and allows finer control.\n\n     Use 'ftable' for printing (and more) of multidimensional tables.\n     'margin.table', 'prop.table', 'addmargins'.\n\n     'addNA' for constructing factors with 'NA' as a level.\n\n     'xtabs' for cross tabulation of data frames with a formula\n     interface.\n\nExamples:\n\n     require(stats) # for rpois and xtabs\n     ## Simple frequency distribution\n     table(rpois(100, 5))\n     ## Check the design:\n     with(warpbreaks, table(wool, tension))\n     table(state.division, state.region)\n     \n     # simple two-way contingency table\n     with(airquality, table(cut(Temp, quantile(Temp)), Month))\n     \n     a <- letters[1:3]\n     table(a, sample(a))                    # dnn is c(\"a\", \"\")\n     table(a, sample(a), deparse.level = 0) # dnn is c(\"\", \"\")\n     table(a, sample(a), deparse.level = 2) # dnn is c(\"a\", \"sample(a)\")\n     \n     ## xtabs() <-> as.data.frame.table() :\n     UCBAdmissions ## already a contingency table\n     DF <- as.data.frame(UCBAdmissions)\n     class(tab <- xtabs(Freq ~ ., DF)) # xtabs & table\n     ## tab *is* \"the same\" as the original table:\n     all(tab == UCBAdmissions)\n     all.equal(dimnames(tab), dimnames(UCBAdmissions))\n     \n     a <- rep(c(NA, 1/0:3), 10)\n     table(a)                 # does not report NA's\n     table(a, exclude = NULL) # reports NA's\n     b <- factor(rep(c(\"A\",\"B\",\"C\"), 10))\n     table(b)\n     table(b, exclude = \"B\")\n     d <- factor(rep(c(\"A\",\"B\",\"C\"), 10), levels = c(\"A\",\"B\",\"C\",\"D\",\"E\"))\n     table(d, exclude = \"B\")\n     print(table(b, d), zero.print = \".\")\n     \n     ## NA counting:\n     is.na(d) <- 3:4\n     d. <- addNA(d)\n     d.[1:7]\n     table(d.) 
# \", exclude = NULL\" is not needed\n     ## i.e., if you want to count the NA's of 'd', use\n     table(d, useNA = \"ifany\")\n     \n     ## \"pathological\" case:\n     d.patho <- addNA(c(1,NA,1:2,1:3))[-7]; is.na(d.patho) <- 3:4\n     d.patho\n     ## just 3 consecutive NA's ? --- well, have *two* kinds of NAs here :\n     as.integer(d.patho) # 1 4 NA NA 1 2\n     ##\n     ## In R >= 3.4.0, table() allows to differentiate:\n     table(d.patho)                   # counts the \"unusual\" NA\n     table(d.patho, useNA = \"ifany\")  # counts all three\n     table(d.patho, exclude = NULL)   #  (ditto)\n     table(d.patho, exclude = NA)     # counts none\n     \n     ## Two-way tables with NA counts. The 3rd variant is absurd, but shows\n     ## something that cannot be done using exclude or useNA.\n     with(airquality,\n        table(OzHi = Ozone > 80, Month, useNA = \"ifany\"))\n     with(airquality,\n        table(OzHi = Ozone > 80, Month, useNA = \"always\"))\n     with(airquality,\n        table(OzHi = Ozone > 80, addNA(Month)))\n\n\n\n## Character variable data summary examples\n\nNumber of observations in each category\n\n\n::: {.cell}\n\n```{.r .cell-code}\ntable(df$gender)\n```\n\n::: {.cell-output-display}\n| Female| Male|\n|------:|----:|\n|    325|  326|\n:::\n\n```{.r .cell-code}\ntable(df$gender, useNA=\"always\")\n```\n\n::: {.cell-output-display}\n| Female| Male| NA|\n|------:|----:|--:|\n|    325|  326|  0|\n:::\n\n```{.r .cell-code}\ntable(df$age_group, useNA=\"always\")\n```\n\n::: {.cell-output-display}\n| middle| old| young| NA|\n|------:|---:|-----:|--:|\n|    179| 147|   316|  9|\n:::\n:::\n\n::: {.cell}\n\n```{.r .cell-code}\ntable(df$gender)/nrow(df) #if no NA values\n```\n\n::: {.cell-output-display}\n|   Female|     Male|\n|--------:|--------:|\n| 0.499232| 0.500768|\n:::\n\n```{.r .cell-code}\ntable(df$age_group)/nrow(df[!is.na(df$age_group),]) #if there are NA values\n```\n\n::: {.cell-output-display}\n|    middle|      old|     
young|\n|---------:|--------:|---------:|\n| 0.2788162| 0.228972| 0.4922118|\n:::\n\n```{.r .cell-code}\ntable(df$age_group)/nrow(subset(df, !is.na(df$age_group),)) #if there are NA values\n```\n\n::: {.cell-output-display}\n|    middle|      old|     young|\n|---------:|--------:|---------:|\n| 0.2788162| 0.228972| 0.4922118|\n:::\n:::\n\n\n\n## Summary\n\n-   You can add (or modify) columns/variables in a data frame by using `$` \n-   There are two types of numeric class objects: integer and double\n-   Logical class objects only have `TRUE` or `FALSE` (without quotes)\n-   `is.CLASS_NAME(x)` can be used to test the class of an object x\n-   `as.CLASS_NAME(x)` can be used to change the class of an object x\n-   Factors are a special character class that has levels \n\t\t\n\n## Acknowledgements\n\nThese are the materials we looked through, modified, or extracted to complete this module's lecture.\n\n-   [\"Introduction to R for Public Health Researchers\" Johns Hopkins University](https://jhudatascience.org/intro_to_r/)\n\n",
     "supporting": [],
     "filters": [
       "rmarkdown/pagebreak.lua"
diff --git a/_freeze/modules/Module08-DataMergeReshape/execute-results/html.json b/_freeze/modules/Module08-DataMergeReshape/execute-results/html.json
new file mode 100644
index 0000000..90dab95
--- /dev/null
+++ b/_freeze/modules/Module08-DataMergeReshape/execute-results/html.json
@@ -0,0 +1,18 @@
+{
+  "hash": "a098429eb85b4995ea1507646b4860d7",
+  "result": {
+    "markdown": "---\ntitle: \"Module 8: Data Merging and Reshaping\"\nformat:\n  revealjs:\n    scrollable: true\n    smaller: true\n    toc: false\n---\n\n\n## Learning Objectives\n\nAfter module 8, you should be able to...\n\n-   Merge/join data together\n-   Reshape data from wide to long\n-   Reshape data from long to wide\n\n## Joining types\n\nPay close attention to the number of rows in your data set before and after a join. This will help flag when an issue has arisen. The number of rows to expect depends on the type of merge:\n\n-   1:1 merge (one-to-one merge) – Simplest merge (sometimes things go wrong)\n-   1:m merge (one-to-many merge) – More complex (things often go wrong)\n    -   The \"one\" suggests that one dataset has each value of the merging variable (e.g., id) represented once and the \"many\" implies that the other dataset has the merging variable represented multiple times\n-   m:m merge (many-to-many merge) – Danger zone (can be unpredictable)\n    \n\n## one-to-one merge\n\n-   This means that each row of data represents a unique unit of analysis that exists in another dataset (e.g., 
id variable)\n-   The other dataset will likely have variables that don’t exist in the current dataset (that’s why you are trying to merge it in)\n-   Each value of the merging variable (e.g., id) is represented a single time in each dataset\n-   You should try to structure your data so that a 1:1 merge or 1:m merge is possible so that fewer things can go wrong.\n\n## `merge()` function\n\nWe will use the `merge()` function to conduct a one-to-one merge\n\n\n\n```\nRegistered S3 method overwritten by 'printr':\n  method                from     \n  knit_print.data.frame rmarkdown\n```\n\nMerge Two Data Frames\n\nDescription:\n\n     Merge two data frames by common columns or row names, or do other\n     versions of database _join_ operations.\n\nUsage:\n\n     merge(x, y, ...)\n     \n     ## Default S3 method:\n     merge(x, y, ...)\n     \n     ## S3 method for class 'data.frame'\n     merge(x, y, by = intersect(names(x), names(y)),\n           by.x = by, by.y = by, all = FALSE, all.x = all, all.y = all,\n           sort = TRUE, suffixes = c(\".x\",\".y\"), no.dups = TRUE,\n           incomparables = NULL, ...)\n     \nArguments:\n\n    x, y: data frames, or objects to be coerced to one.\n\nby, by.x, by.y: specifications of the columns used for merging.  See\n          'Details'.\n\n     all: logical; 'all = L' is shorthand for 'all.x = L' and 'all.y =\n          L', where 'L' is either 'TRUE' or 'FALSE'.\n\n   all.x: logical; if 'TRUE', then extra rows will be added to the\n          output, one for each row in 'x' that has no matching row in\n          'y'.  These rows will have 'NA's in those columns that are\n          usually filled with values from 'y'.  The default is 'FALSE',\n          so that only rows with data from both 'x' and 'y' are\n          included in the output.\n\n   all.y: logical; analogous to 'all.x'.\n\n    sort: logical.  
Should the result be sorted on the 'by' columns?\n\nsuffixes: a character vector of length 2 specifying the suffixes to be\n          used for making unique the names of columns in the result\n          which are not used for merging (appearing in 'by' etc).\n\n no.dups: logical indicating that 'suffixes' are appended in more cases\n          to avoid duplicated column names in the result.  This was\n          implicitly false before R version 3.5.0.\n\nincomparables: values which cannot be matched.  See 'match'.  This is\n          intended to be used for merging on one column, so these are\n          incomparable values of that column.\n\n     ...: arguments to be passed to or from methods.\n\nDetails:\n\n     'merge' is a generic function whose principal method is for data\n     frames: the default method coerces its arguments to data frames\n     and calls the '\"data.frame\"' method.\n\n     By default the data frames are merged on the columns with names\n     they both have, but separate specifications of the columns can be\n     given by 'by.x' and 'by.y'.  The rows in the two data frames that\n     match on the specified columns are extracted, and joined together.\n     If there is more than one match, all possible matches contribute\n     one row each.  For the precise meaning of 'match', see 'match'.\n\n     Columns to merge on can be specified by name, number or by a\n     logical vector: the name '\"row.names\"' or the number '0' specifies\n     the row names.  
If specified by name it must correspond uniquely\n     to a named column in the input.\n\n     If 'by' or both 'by.x' and 'by.y' are of length 0 (a length zero\n     vector or 'NULL'), the result, 'r', is the _Cartesian product_ of\n     'x' and 'y', i.e., 'dim(r) = c(nrow(x)*nrow(y), ncol(x) +\n     ncol(y))'.\n\n     If 'all.x' is true, all the non matching cases of 'x' are appended\n     to the result as well, with 'NA' filled in the corresponding\n     columns of 'y'; analogously for 'all.y'.\n\n     If the columns in the data frames not used in merging have any\n     common names, these have 'suffixes' ('\".x\"' and '\".y\"' by default)\n     appended to try to make the names of the result unique.  If this\n     is not possible, an error is thrown.\n\n     If a 'by.x' column name matches one of 'y', and if 'no.dups' is\n     true (as by default), the y version gets suffixed as well,\n     avoiding duplicate column names in the result.\n\n     The complexity of the algorithm used is proportional to the length\n     of the answer.\n\n     In SQL database terminology, the default value of 'all = FALSE'\n     gives a _natural join_, a special case of an _inner join_.\n     Specifying 'all.x = TRUE' gives a _left (outer) join_, 'all.y =\n     TRUE' a _right (outer) join_, and both ('all = TRUE') a _(full)\n     outer join_.  DBMSes do not match 'NULL' records, equivalent to\n     'incomparables = NA' in R.\n\nValue:\n\n     A data frame.  The rows are by default lexicographically sorted on\n     the common columns, but for 'sort = FALSE' are in an unspecified\n     order.  The columns are the common columns followed by the\n     remaining columns in 'x' and then those in 'y'.  
If the matching\n     involved row names, an extra character column called 'Row.names'\n     is added at the left, and in all cases the result has 'automatic'\n     row names.\n\nNote:\n\n     This is intended to work with data frames with vector-like\n     columns: some aspects work with data frames containing matrices,\n     but not all.\n\n     Currently long vectors are not accepted for inputs, which are thus\n     restricted to less than 2^31 rows. That restriction also applies\n     to the result for 32-bit platforms.\n\nSee Also:\n\n     'data.frame', 'by', 'cbind'.\n\n     'dendrogram' for a class which has a 'merge' method.\n\nExamples:\n\n     authors <- data.frame(\n         ## I(*) : use character columns of names to get sensible sort order\n         surname = I(c(\"Tukey\", \"Venables\", \"Tierney\", \"Ripley\", \"McNeil\")),\n         nationality = c(\"US\", \"Australia\", \"US\", \"UK\", \"Australia\"),\n         deceased = c(\"yes\", rep(\"no\", 4)))\n     authorN <- within(authors, { name <- surname; rm(surname) })\n     books <- data.frame(\n         name = I(c(\"Tukey\", \"Venables\", \"Tierney\",\n                  \"Ripley\", \"Ripley\", \"McNeil\", \"R Core\")),\n         title = c(\"Exploratory Data Analysis\",\n                   \"Modern Applied Statistics ...\",\n                   \"LISP-STAT\",\n                   \"Spatial Statistics\", \"Stochastic Simulation\",\n                   \"Interactive Data Analysis\",\n                   \"An Introduction to R\"),\n         other.author = c(NA, \"Ripley\", NA, NA, NA, NA,\n                          \"Venables & Smith\"))\n     \n     (m0 <- merge(authorN, books))\n     (m1 <- merge(authors, books, by.x = \"surname\", by.y = \"name\"))\n      m2 <- merge(books, authors, by.x = \"name\", by.y = \"surname\")\n     stopifnot(exprs = {\n        identical(m0, m2[, names(m0)])\n        as.character(m1[, 1]) == as.character(m2[, 1])\n        all.equal(m1[, -1], m2[, -1][ names(m1)[-1] ])\n        
identical(dim(merge(m1, m2, by = NULL)),\n                  c(nrow(m1)*nrow(m2), ncol(m1)+ncol(m2)))\n     })\n     \n     ## \"R core\" is missing from authors and appears only here :\n     merge(authors, books, by.x = \"surname\", by.y = \"name\", all = TRUE)\n     \n     \n     ## example of using 'incomparables'\n     x <- data.frame(k1 = c(NA,NA,3,4,5), k2 = c(1,NA,NA,4,5), data = 1:5)\n     y <- data.frame(k1 = c(NA,2,NA,4,5), k2 = c(NA,NA,3,4,5), data = 1:5)\n     merge(x, y, by = c(\"k1\",\"k2\")) # NA's match\n     merge(x, y, by = \"k1\") # NA's match, so 6 rows\n     merge(x, y, by = \"k2\", incomparables = NA) # 2 rows\n\n\n    \n## Lets import the new data we want to merge and take a look\n\nThe new data `serodata_new.csv` represents a follow-up serological survey four years later. At this follow-up individuals were retested for IgG antibody concentrations and their ages were collected.\n\n\n::: {.cell}\n\n```{.r .cell-code}\ndf_new <- read.csv(\"data/serodata_new.csv\")\nstr(df_new)\n```\n\n::: {.cell-output .cell-output-stdout}\n```\n'data.frame':\t636 obs. of  3 variables:\n $ observation_id   : int  5772 8095 9784 9338 6369 6885 6252 8913 7332 6941 ...\n $ IgG_concentration: num  0.261 2.981 0.282 136.638 0.381 ...\n $ age              : int  6 8 8 8 5 8 8 NA 8 6 ...\n```\n:::\n\n```{.r .cell-code}\nsummary(df_new)\n```\n\n::: {.cell-output-display}\n|   |observation_id |IgG_concentration |     age      |\n|:--|:--------------|:-----------------|:-------------|\n|   |Min.   :5006   |Min.   :  0.0051  |Min.   : 5.00 |\n|   |1st Qu.:6328   |1st Qu.:  0.2751  |1st Qu.: 7.00 |\n|   |Median :7494   |Median :  1.5477  |Median :10.00 |\n|   |Mean   :7490   |Mean   : 82.7684  |Mean   :10.63 |\n|   |3rd Qu.:8736   |3rd Qu.:129.6389  |3rd Qu.:14.00 |\n|   |Max.   :9982   |Max.   :950.6590  |Max.   
:19.00 |\n|   |NA             |NA                |NA's   :9     |\n:::\n:::\n\n\n\n## Merge the new data with the original data\n\nLet's load the old data as well and look for a variable, or variables, to merge by.\n\n\n::: {.cell}\n\n```{.r .cell-code}\ndf <- read.csv(\"data/serodata.csv\")\ncolnames(df)\n```\n\n::: {.cell-output .cell-output-stdout}\n```\n[1] \"observation_id\"    \"IgG_concentration\" \"age\"              \n[4] \"gender\"            \"slum\"             \n```\n:::\n:::\n\n\nWe notice that `observation_id` seems to be the obvious variable by which to merge.  However, `IgG_concentration` and `age` also appear under exactly the same names in both data sets.  If we merge now, we see that `merge()` appends the suffixes `.x` and `.y` to tell them apart:\n\n\n::: {.cell}\n\n```{.r .cell-code}\nhead(merge(df, df_new, all.x=T, all.y=T, by=c('observation_id')))\n```\n\n::: {.cell-output-display}\n| observation_id| IgG_concentration.x| age.x|gender |slum     | IgG_concentration.y| age.y|\n|--------------:|-------------------:|-----:|:------|:--------|-------------------:|-----:|\n|           5006|         164.2979452|     7|Male   |Non slum |         155.5811325|    11|\n|           5024|           0.3000000|     5|Female |Non slum |           0.2918605|     9|\n|           5026|           0.3000000|    10|Female |Non slum |           0.2542945|    14|\n|           5030|           0.0555556|     7|Female |Non slum |           0.0533262|    11|\n|           5035|          26.2112514|    11|Female |Non slum |          22.0159300|    15|\n|           5054|           0.3000000|     3|Male   |Non slum |           0.2709671|     7|\n:::\n:::\n\n\n## Merge the new data with the original data\n\nThe first option is to rename the `IgG_concentration` and `age` variables before the merge, so that it is clear which is time point 1 and time point 2. 
\n\n::: {.cell}\n\n```{.r .cell-code}\ndf$IgG_concentration_time1 <- df$IgG_concentration\ndf$age_time1 <- df$age\ndf$IgG_concentration <- df$age <- NULL #remove the original variables\n\ndf_new$IgG_concentration_time2 <- df_new$IgG_concentration\ndf_new$age_time2 <- df_new$age\ndf_new$IgG_concentration <- df_new$age <- NULL #remove the original variables\n```\n:::\n\n\nNow, let's merge.\n\n::: {.cell}\n\n```{.r .cell-code}\ndf_all_wide <- merge(df, df_new, all.x=T, all.y=T, by=c('observation_id'))\nstr(df_all_wide)\n```\n\n::: {.cell-output .cell-output-stdout}\n```\n'data.frame':\t651 obs. of  7 variables:\n $ observation_id         : int  5006 5024 5026 5030 5035 5054 5057 5063 5064 5080 ...\n $ gender                 : chr  \"Male\" \"Female\" \"Female\" \"Female\" ...\n $ slum                   : chr  \"Non slum\" \"Non slum\" \"Non slum\" \"Non slum\" ...\n $ IgG_concentration_time1: num  164.2979 0.3 0.3 0.0556 26.2113 ...\n $ age_time1              : int  7 5 10 7 11 3 3 12 14 6 ...\n $ IgG_concentration_time2: num  155.5811 0.2919 0.2543 0.0533 22.0159 ...\n $ age_time2              : int  11 9 14 11 15 7 7 16 18 10 ...\n```\n:::\n:::\n\n\n## Merge the new data with the original data\n\nThe second option is to add a time variable to the two data sets and then merge by `observation_id`, `time`, `age`, and `IgG_concentration`. 
Note, I need to read in the data again b/c I removed the `IgG_concentration` and `age` variables.\n\n\n::: {.cell}\n\n```{.r .cell-code}\ndf <- read.csv(\"data/serodata.csv\")\ndf_new <- read.csv(\"data/serodata_new.csv\")\n```\n:::\n\n::: {.cell}\n\n```{.r .cell-code}\ndf$time <- 1 #you can put in one number and it will repeat it\ndf_new$time <- 2\nhead(df)\n```\n\n::: {.cell-output-display}\n| observation_id| IgG_concentration| age|gender |slum     | time|\n|--------------:|-----------------:|---:|:------|:--------|----:|\n|           5772|         0.3176895|   2|Female |Non slum |    1|\n|           8095|         3.4368231|   4|Female |Non slum |    1|\n|           9784|         0.3000000|   4|Male   |Non slum |    1|\n|           9338|       143.2363014|   4|Male   |Non slum |    1|\n|           6369|         0.4476534|   1|Male   |Non slum |    1|\n|           6885|         0.0252708|   4|Male   |Non slum |    1|\n:::\n\n```{.r .cell-code}\nhead(df_new)\n```\n\n::: {.cell-output-display}\n| observation_id| IgG_concentration| age| time|\n|--------------:|-----------------:|---:|----:|\n|           5772|         0.2612388|   6|    2|\n|           8095|         2.9809049|   8|    2|\n|           9784|         0.2819489|   8|    2|\n|           9338|       136.6382260|   8|    2|\n|           6369|         0.3810119|   5|    2|\n|           6885|         0.0245951|   8|    2|\n:::\n:::\n\n\nNow, lets merge. Note, \"By default the data frames are merged on the columns with names they both have\" therefore if I don't specify the by argument it will merge on all matching variables.\n\n::: {.cell}\n\n```{.r .cell-code}\ndf_all_long <- merge(df, df_new, all.x=T, all.y=T) \nstr(df_all_long)\n```\n\n::: {.cell-output .cell-output-stdout}\n```\n'data.frame':\t1287 obs. 
of  6 variables:\n $ observation_id   : int  5006 5006 5024 5024 5026 5026 5030 5030 5035 5035 ...\n $ IgG_concentration: num  155.581 164.298 0.292 0.3 0.254 ...\n $ age              : int  11 7 9 5 14 10 11 7 15 11 ...\n $ time             : num  2 1 2 1 2 1 2 1 2 1 ...\n $ gender           : chr  NA \"Male\" NA \"Female\" ...\n $ slum             : chr  NA \"Non slum\" NA \"Non slum\" ...\n```\n:::\n:::\n\nNote, there are 1287 rows, which is the sum of the number of rows of `df` (651 rows) and `df_new` (636 rows)\n\n\n## What is wide/long data?\n\nAbove, we actually created a wide and long version of the data.\n\nWide: has many columns\n\n- multiple columns per individual, values spread across multiple columns \n- easier for humans to read\n    \nLong: has many rows\n\n- column names become data\n- multiple rows per observation, a single column contains the values\n- easier for R to make plots & do analysis\n\n## `reshape()` function \n\nThe `reshape()` function allows you to toggle between wide and long data\n\n\nReshape Grouped Data\n\nDescription:\n\n     This function reshapes a data frame between 'wide' format (with\n     repeated measurements in separate columns of the same row) and\n     'long' format (with the repeated measurements in separate rows).\n\nUsage:\n\n     reshape(data, varying = NULL, v.names = NULL, timevar = \"time\",\n             idvar = \"id\", ids = 1:NROW(data),\n             times = seq_along(varying[[1]]),\n             drop = NULL, direction, new.row.names = NULL,\n             sep = \".\",\n             split = if (sep == \"\") {\n                 list(regexp = \"[A-Za-z][0-9]\", include = TRUE)\n             } else {\n                 list(regexp = sep, include = FALSE, fixed = TRUE)}\n             )\n     \n     ### Typical usage for converting from long to wide format:\n     \n     # reshape(data, direction = \"wide\",\n     #         idvar = \"___\", timevar = \"___\", # mandatory\n     #         v.names = c(___),    # 
time-varying variables\n     #         varying = list(___)) # auto-generated if missing\n     \n     ### Typical usage for converting from wide to long format:\n     \n     ### If names of wide-format variables are in a 'nice' format\n     \n     # reshape(data, direction = \"long\",\n     #         varying = c(___), # vector \n     #         sep)              # to help guess 'v.names' and 'times'\n     \n     ### To specify long-format variable names explicitly\n     \n     # reshape(data, direction = \"long\",\n     #         varying = ___,  # list / matrix / vector (use with care)\n     #         v.names = ___,  # vector of variable names in long format\n     #         timevar, times, # name / values of constructed time variable\n     #         idvar, ids)     # name / values of constructed id variable\n     \nArguments:\n\n    data: a data frame\n\n varying: names of sets of variables in the wide format that correspond\n          to single variables in long format ('time-varying').  This is\n          canonically a list of vectors of variable names, but it can\n          optionally be a matrix of names, or a single vector of names.\n          In each case, when 'direction = \"long\"', the names can be\n          replaced by indices which are interpreted as referring to\n          'names(data)'.  See 'Details' for more details and options.\n\n v.names: names of variables in the long format that correspond to\n          multiple variables in the wide format.  See 'Details'.\n\n timevar: the variable in long format that differentiates multiple\n          records from the same group or individual.  If more than one\n          record matches, the first will be taken (with a warning).\n\n   idvar: Names of one or more variables in long format that identify\n          multiple records from the same group/individual.  
These\n          variables may also be present in wide format.\n\n     ids: the values to use for a newly created 'idvar' variable in\n          long format.\n\n   times: the values to use for a newly created 'timevar' variable in\n          long format.  See 'Details'.\n\n    drop: a vector of names of variables to drop before reshaping.\n\ndirection: character string, partially matched to either '\"wide\"' to\n          reshape to wide format, or '\"long\"' to reshape to long\n          format.\n\nnew.row.names: character or 'NULL': a non-null value will be used for\n          the row names of the result.\n\n     sep: A character vector of length 1, indicating a separating\n          character in the variable names in the wide format.  This is\n          used for guessing 'v.names' and 'times' arguments based on\n          the names in 'varying'.  If 'sep == \"\"', the split is just\n          before the first numeral that follows an alphabetic\n          character.  This is also used to create variable names when\n          reshaping to wide format.\n\n   split: A list with three components, 'regexp', 'include', and\n          (optionally) 'fixed'.  This allows an extended interface to\n          variable name splitting.  See 'Details'.\n\nDetails:\n\n     Although 'reshape()' can be used in a variety of contexts, the\n     motivating application is data from longitudinal studies, and the\n     arguments of this function are named and described in those terms.\n     A longitudinal study is characterized by repeated measurements of\n     the same variable(s), e.g., height and weight, on each unit being\n     studied (e.g., individual persons) at different time points (which\n     are assumed to be the same for all units). These variables are\n     called time-varying variables. 
The study may include other\n     variables that are measured only once for each unit and do not\n     vary with time (e.g., gender and race); these are called\n     time-constant variables.\n\n     A 'wide' format representation of a longitudinal dataset will have\n     one record (row) for each unit, typically with some time-constant\n     variables that occupy single columns, and some time-varying\n     variables that occupy multiple columns (one column for each time\n     point).  A 'long' format representation of the same dataset will\n     have multiple records (rows) for each individual, with the\n     time-constant variables being constant across these records and\n     the time-varying variables varying across the records.  The 'long'\n     format dataset will have two additional variables: a 'time'\n     variable identifying which time point each record comes from, and\n     an 'id' variable showing which records refer to the same unit.\n\n     The type of conversion (long to wide or wide to long) is\n     determined by the 'direction' argument, which is mandatory unless\n     the 'data' argument is the result of a previous call to 'reshape'.\n     In that case, the operation can be reversed simply using\n     'reshape(data)' (the other arguments are stored as attributes on\n     the data frame).\n\n     Conversion from long to wide format with 'direction = \"wide\"' is\n     the simpler operation, and is mainly useful in the context of\n     multivariate analysis where data is often expected as a\n     wide-format matrix. In this case, the time variable 'timevar' and\n     id variable 'idvar' must be specified. All other variables are\n     assumed to be time-varying, unless the time-varying variables are\n     explicitly specified via the 'v.names' argument.  A warning is\n     issued if time-constant variables are not actually constant.\n\n     Each time-varying variable is expanded into multiple variables in\n     the wide format.  
The names of these expanded variables are\n     generated automatically, unless they are specified as the\n     'varying' argument in the form of a list (or matrix) with one\n     component (or row) for each time-varying variable. If 'varying' is\n     a vector of names, it is implicitly converted into a matrix, with\n     one row for each time-varying variable. Use this option with care\n     if there are multiple time-varying variables, as the ordering (by\n     column, the default in the 'matrix' constructor) may be\n     unintuitive, whereas the explicit list or matrix form is\n     unambiguous.\n\n     Conversion from wide to long with 'direction = \"long\"' is the more\n     common operation as most (univariate) statistical modeling\n     functions expect data in the long format. In the simpler case\n     where there is only one time-varying variable, the corresponding\n     columns in the wide format input can be specified as the 'varying'\n     argument, which can be either a vector of column names or the\n     corresponding column indices. The name of the corresponding\n     variable in the long format output combining these columns can be\n     optionally specified as the 'v.names' argument, and the name of\n     the time variables as the 'timevar' argument. The values to use as\n     the time values corresponding to the different columns in the wide\n     format can be specified as the 'times' argument.  If 'v.names' is\n     unspecified, the function will attempt to guess 'v.names' and\n     'times' from 'varying' (an explicitly specified 'times' argument\n     is unused in that case).  The default expects variable names like\n     'x.1', 'x.2', where 'sep = \".\"' specifies to split at the dot and\n     drop it from the name.  
To have alphabetic followed by numeric\n     times use 'sep = \"\"'.\n\n     Multiple time-varying variables can be specified in two ways,\n     either with 'varying' as an atomic vector as above, or as a list\n     (or a matrix). The first form is useful (and mandatory) if the\n     automatic variable name splitting as described above is used; this\n     requires the names of all time-varying variables to be suitably\n     formatted in the same manner, and 'v.names' to be unspecified. If\n     'varying' is a list (with one component for each time-varying\n     variable) or a matrix (one row for each time-varying variable),\n     variable name splitting is not attempted, and 'v.names' and\n     'times' will generally need to be specified, although they will\n     default to, respectively, the first variable name in each set, and\n     sequential times.\n\n     Also, guessing is not attempted if 'v.names' is given explicitly,\n     even if 'varying' is an atomic vector. In that case, the number of\n     time-varying variables is taken to be the length of 'v.names', and\n     'varying' is implicitly converted into a matrix, with one row for\n     each time-varying variable. As in the case of long to wide\n     conversion, the matrix is filled up by column, so careful\n     attention needs to be paid to the order of variable names (or\n     indices) in 'varying', which is taken to be like 'x.1', 'y.1',\n     'x.2', 'y.2' (i.e., variables corresponding to the same time point\n     need to be grouped together).\n\n     The 'split' argument should not usually be necessary.  The\n     'split$regexp' component is passed to either 'strsplit' or\n     'regexpr', where the latter is used if 'split$include' is 'TRUE',\n     in which case the splitting occurs after the first character of\n     the matched string.  
In the 'strsplit' case, the separator is not\n     included in the result, and it is possible to specify fixed-string\n     matching using 'split$fixed'.\n\nValue:\n\n     The reshaped data frame with added attributes to simplify\n     reshaping back to the original form.\n\nSee Also:\n\n     'stack', 'aperm'; 'relist' for reshaping the result of 'unlist'.\n     'xtabs' and 'as.data.frame.table' for creating contingency tables\n     and converting them back to data frames.\n\nExamples:\n\n     summary(Indometh) # data in long format\n     \n     ## long to wide (direction = \"wide\") requires idvar and timevar at a minimum\n     reshape(Indometh, direction = \"wide\", idvar = \"Subject\", timevar = \"time\")\n     \n     ## can also explicitly specify name of combined variable\n     wide <- reshape(Indometh, direction = \"wide\", idvar = \"Subject\",\n                     timevar = \"time\", v.names = \"conc\", sep= \"_\")\n     wide\n     \n     ## reverse transformation\n     reshape(wide, direction = \"long\")\n     reshape(wide, idvar = \"Subject\", varying = list(2:12),\n             v.names = \"conc\", direction = \"long\")\n     \n     ## times need not be numeric\n     df <- data.frame(id = rep(1:4, rep(2,4)),\n                      visit = I(rep(c(\"Before\",\"After\"), 4)),\n                      x = rnorm(4), y = runif(4))\n     df\n     reshape(df, timevar = \"visit\", idvar = \"id\", direction = \"wide\")\n     ## warns that y is really varying\n     reshape(df, timevar = \"visit\", idvar = \"id\", direction = \"wide\", v.names = \"x\")\n     \n     \n     ##  unbalanced 'long' data leads to NA fill in 'wide' form\n     df2 <- df[1:7, ]\n     df2\n     reshape(df2, timevar = \"visit\", idvar = \"id\", direction = \"wide\")\n     \n     ## Alternative regular expressions for guessing names\n     df3 <- data.frame(id = 1:4, age = c(40,50,60,50), dose1 = c(1,2,1,2),\n                       dose2 = c(2,1,2,1), dose4 = c(3,3,3,3))\n     reshape(df3, 
direction = \"long\", varying = 3:5, sep = \"\")\n     \n     \n     ## an example that isn't longitudinal data\n     state.x77 <- as.data.frame(state.x77)\n     long <- reshape(state.x77, idvar = \"state\", ids = row.names(state.x77),\n                     times = names(state.x77), timevar = \"Characteristic\",\n                     varying = list(names(state.x77)), direction = \"long\")\n     \n     reshape(long, direction = \"wide\")\n     \n     reshape(long, direction = \"wide\", new.row.names = unique(long$state))\n     \n     ## multiple id variables\n     df3 <- data.frame(school = rep(1:3, each = 4), class = rep(9:10, 6),\n                       time = rep(c(1,1,2,2), 3), score = rnorm(12))\n     wide <- reshape(df3, idvar = c(\"school\", \"class\"), direction = \"wide\")\n     wide\n     ## transform back\n     reshape(wide)\n\n\n\n## long to wide data\n\nxxzane - help\n\n\n## wide to long data\n\nxxzane - help\n\n\n## Let's get real\n\nUse the `pivot_wider()` and `pivot_longer()` from the tidyr package!\n\n\n\n## Summary\n\n-   ...\n\t\t\n\n## Acknowledgements\n\nThese are the materials we looked through, modified, or extracted to complete this module's lecture.\n\n-   [\"Introduction to R for Public Health Researchers\" Johns Hopkins University](https://jhudatascience.org/intro_to_r/)\n\n",
+    "supporting": [],
+    "filters": [
+      "rmarkdown/pagebreak.lua"
+    ],
+    "includes": {
+      "include-after-body": [
+        "\n<script>\n  // htmlwidgets need to know to resize themselves when slides are shown/hidden.\n  // Fire the \"slideenter\" event (handled by htmlwidgets.js) when the current\n  // slide changes (different for each slide format).\n  (function () {\n    // dispatch for htmlwidgets\n    function fireSlideEnter() {\n      const event = window.document.createEvent(\"Event\");\n      event.initEvent(\"slideenter\", true, true);\n      window.document.dispatchEvent(event);\n    }\n\n    function fireSlideChanged(previousSlide, currentSlide) {\n      fireSlideEnter();\n\n      // dispatch for shiny\n      if (window.jQuery) {\n        if (previousSlide) {\n          window.jQuery(previousSlide).trigger(\"hidden\");\n        }\n        if (currentSlide) {\n          window.jQuery(currentSlide).trigger(\"shown\");\n        }\n      }\n    }\n\n    // hookup for slidy\n    if (window.w3c_slidy) {\n      window.w3c_slidy.add_observer(function (slide_num) {\n        // slide_num starts at position 1\n        fireSlideChanged(null, w3c_slidy.slides[slide_num - 1]);\n      });\n    }\n\n  })();\n</script>\n\n"
+      ]
+    },
+    "engineDependencies": {},
+    "preserve": {},
+    "postProcess": true
+  }
+}
\ No newline at end of file
diff --git a/_freeze/modules/Module09-DataAnalysis/execute-results/html.json b/_freeze/modules/Module09-DataAnalysis/execute-results/html.json
index 0497888..4af0d22 100644
--- a/_freeze/modules/Module09-DataAnalysis/execute-results/html.json
+++ b/_freeze/modules/Module09-DataAnalysis/execute-results/html.json
@@ -1,8 +1,7 @@
 {
-  "hash": "ebcf08f6d0a895a7c6ee1c74583797e5",
+  "hash": "662b02c140c1e96bfb158859db710b34",
   "result": {
-    "engine": "knitr",
-    "markdown": "---\ntitle: \"Module 9: Data Analysis\"\nformat: \n  revealjs:\n    scrollable: true\n    smaller: true\n---\n\n\n\n## Learning Objectives\n\nAfter module 9, you should be able to...\n\n-\t\tDescriptively assess association between two variables\n-\t\tCompute basic statistics \n-\t\tFit a generalized linear model\n\n## Import data for this module\n\nLet's read in our data (again) and take a quick look.\n\n\n\n::: {.cell}\n\n```{.r .cell-code}\ndf <- read.csv(file = \"data/serodata.csv\") #relative path\nhead(x=df, n=3)\n```\n\n::: {.cell-output .cell-output-stdout}\n\n```\n  observation_id IgG_concentration age gender     slum\n1           5772         0.3176895   2 Female Non slum\n2           8095         3.4368231   4 Female Non slum\n3           9784         0.3000000   4   Male Non slum\n```\n\n\n:::\n:::\n\n\n\n## Prep data\n\nCreate `age_group` three level factor variable\n\n\n::: {.cell}\n\n```{.r .cell-code}\ndf$age_group <- ifelse(df$age <= 5, \"young\", \n                       ifelse(df$age<=10 & df$age>5, \"middle\", \n                              ifelse(df$age>10, \"old\", NA)))\ndf$age_group <- factor(df$age_group, levels=c(\"young\", \"middle\", \"old\"))\n```\n:::\n\n\n\nCreate `seropos` binary variable representing seropositivity if antibody concentrations are >10 mIUmL.\n\n\n::: {.cell}\n\n```{.r .cell-code}\ndf$seropos <- ifelse(df$IgG_concentration<10, 0, \n\t\t\t\t\t\t\t\t\t\tifelse(df$IgG_concentration>=10, 1, NA))\n```\n:::\n\n\n\n\n## 2 variable contingency tables\n\nWe use `table()` prior to look at one variable, now we can generate frequency tables for 2 plus variables.  To get cell percentages, the `prop.table()` is useful.  
\n\n\n\n::: {.cell}\n\n```{.r .cell-code}\nfreq <- table(df$age_group, df$seropo)\nprop <- prop.table(freq)\nfreq\n```\n\n::: {.cell-output .cell-output-stdout}\n\n```\n        \n           0   1\n  young  254  57\n  middle  70 105\n  old     30 116\n```\n\n\n:::\n\n```{.r .cell-code}\nprop\n```\n\n::: {.cell-output .cell-output-stdout}\n\n```\n        \n                  0          1\n  young  0.40189873 0.09018987\n  middle 0.11075949 0.16613924\n  old    0.04746835 0.18354430\n```\n\n\n:::\n:::\n\n\n\n## Chi-Square test\n\nThe `chisq.test()` function test of independence of factor variables from `stats` package.\n\n\n\n\n```\nRegistered S3 method overwritten by 'printr':\n  method                from     \n  knit_print.data.frame rmarkdown\n```\n\nPearson's Chi-squared Test for Count Data\n\nDescription:\n\n     'chisq.test' performs chi-squared contingency table tests and\n     goodness-of-fit tests.\n\nUsage:\n\n     chisq.test(x, y = NULL, correct = TRUE,\n                p = rep(1/length(x), length(x)), rescale.p = FALSE,\n                simulate.p.value = FALSE, B = 2000)\n     \nArguments:\n\n       x: a numeric vector or matrix. 'x' and 'y' can also both be\n          factors.\n\n       y: a numeric vector; ignored if 'x' is a matrix.  If 'x' is a\n          factor, 'y' should be a factor of the same length.\n\n correct: a logical indicating whether to apply continuity correction\n          when computing the test statistic for 2 by 2 tables: one half\n          is subtracted from all |O - E| differences; however, the\n          correction will not be bigger than the differences\n          themselves.  No correction is done if 'simulate.p.value =\n          TRUE'.\n\n       p: a vector of probabilities of the same length as 'x'.  An\n          error is given if any entry of 'p' is negative.\n\nrescale.p: a logical scalar; if TRUE then 'p' is rescaled (if\n          necessary) to sum to 1.  
If 'rescale.p' is FALSE, and 'p'\n          does not sum to 1, an error is given.\n\nsimulate.p.value: a logical indicating whether to compute p-values by\n          Monte Carlo simulation.\n\n       B: an integer specifying the number of replicates used in the\n          Monte Carlo test.\n\nDetails:\n\n     If 'x' is a matrix with one row or column, or if 'x' is a vector\n     and 'y' is not given, then a _goodness-of-fit test_ is performed\n     ('x' is treated as a one-dimensional contingency table).  The\n     entries of 'x' must be non-negative integers.  In this case, the\n     hypothesis tested is whether the population probabilities equal\n     those in 'p', or are all equal if 'p' is not given.\n\n     If 'x' is a matrix with at least two rows and columns, it is taken\n     as a two-dimensional contingency table: the entries of 'x' must be\n     non-negative integers.  Otherwise, 'x' and 'y' must be vectors or\n     factors of the same length; cases with missing values are removed,\n     the objects are coerced to factors, and the contingency table is\n     computed from these.  Then Pearson's chi-squared test is performed\n     of the null hypothesis that the joint distribution of the cell\n     counts in a 2-dimensional contingency table is the product of the\n     row and column marginals.\n\n     If 'simulate.p.value' is 'FALSE', the p-value is computed from the\n     asymptotic chi-squared distribution of the test statistic;\n     continuity correction is only used in the 2-by-2 case (if\n     'correct' is 'TRUE', the default).  
Otherwise the p-value is\n     computed for a Monte Carlo test (Hope, 1968) with 'B' replicates.\n     The default 'B = 2000' implies a minimum p-value of about 0.0005\n     (1/(B+1)).\n\n     In the contingency table case, simulation is done by random\n     sampling from the set of all contingency tables with given\n     marginals, and works only if the marginals are strictly positive.\n     Continuity correction is never used, and the statistic is quoted\n     without it.  Note that this is not the usual sampling situation\n     assumed for the chi-squared test but rather that for Fisher's\n     exact test.\n\n     In the goodness-of-fit case simulation is done by random sampling\n     from the discrete distribution specified by 'p', each sample being\n     of size 'n = sum(x)'.  This simulation is done in R and may be\n     slow.\n\nValue:\n\n     A list with class '\"htest\"' containing the following components:\n\nstatistic: the value the chi-squared test statistic.\n\nparameter: the degrees of freedom of the approximate chi-squared\n          distribution of the test statistic, 'NA' if the p-value is\n          computed by Monte Carlo simulation.\n\n p.value: the p-value for the test.\n\n  method: a character string indicating the type of test performed, and\n          whether Monte Carlo simulation or continuity correction was\n          used.\n\ndata.name: a character string giving the name(s) of the data.\n\nobserved: the observed counts.\n\nexpected: the expected counts under the null hypothesis.\n\nresiduals: the Pearson residuals, '(observed - expected) /\n          sqrt(expected)'.\n\n  stdres: standardized residuals, '(observed - expected) / sqrt(V)',\n          where 'V' is the residual cell variance (Agresti, 2007,\n          section 2.4.5 for the case where 'x' is a matrix, 'n * p * (1\n          - p)' otherwise).\n\nSource:\n\n     The code for Monte Carlo simulation is a C translation of the\n     Fortran algorithm of Patefield 
(1981).\n\nReferences:\n\n     Hope, A. C. A. (1968).  A simplified Monte Carlo significance test\n     procedure.  _Journal of the Royal Statistical Society Series B_,\n     *30*, 582-598.  doi:10.1111/j.2517-6161.1968.tb00759.x\n     <https://doi.org/10.1111/j.2517-6161.1968.tb00759.x>.\n\n     Patefield, W. M. (1981).  Algorithm AS 159: An efficient method of\n     generating r x c tables with given row and column totals.\n     _Applied Statistics_, *30*, 91-97.  doi:10.2307/2346669\n     <https://doi.org/10.2307/2346669>.\n\n     Agresti, A. (2007).  _An Introduction to Categorical Data\n     Analysis_, 2nd ed.  New York: John Wiley & Sons.  Page 38.\n\nSee Also:\n\n     For goodness-of-fit testing, notably of continuous distributions,\n     'ks.test'.\n\nExamples:\n\n     ## From Agresti(2007) p.39\n     M <- as.table(rbind(c(762, 327, 468), c(484, 239, 477)))\n     dimnames(M) <- list(gender = c(\"F\", \"M\"),\n                         party = c(\"Democrat\",\"Independent\", \"Republican\"))\n     (Xsq <- chisq.test(M))  # Prints test summary\n     Xsq$observed   # observed counts (same as M)\n     Xsq$expected   # expected counts under the null\n     Xsq$residuals  # Pearson residuals\n     Xsq$stdres     # standardized residuals\n     \n     \n     ## Effect of simulating p-values\n     x <- matrix(c(12, 5, 7, 7), ncol = 2)\n     chisq.test(x)$p.value           # 0.4233\n     chisq.test(x, simulate.p.value = TRUE, B = 10000)$p.value\n                                     # around 0.29!\n     \n     ## Testing for population probabilities\n     ## Case A. 
Tabulated data\n     x <- c(A = 20, B = 15, C = 25)\n     chisq.test(x)\n     chisq.test(as.table(x))             # the same\n     x <- c(89,37,30,28,2)\n     p <- c(40,20,20,15,5)\n     try(\n     chisq.test(x, p = p)                # gives an error\n     )\n     chisq.test(x, p = p, rescale.p = TRUE)\n                                     # works\n     p <- c(0.40,0.20,0.20,0.19,0.01)\n                                     # Expected count in category 5\n                                     # is 1.86 < 5 ==> chi square approx.\n     chisq.test(x, p = p)            #               maybe doubtful, but is ok!\n     chisq.test(x, p = p, simulate.p.value = TRUE)\n     \n     ## Case B. Raw data\n     x <- trunc(5 * runif(100))\n     chisq.test(table(x))            # NOT 'chisq.test(x)'!\n\n\n\n\n## Chi-Square test\n\n\n\n::: {.cell}\n\n```{.r .cell-code}\nchisq.test(freq)\n```\n\n::: {.cell-output .cell-output-stdout}\n\n```\n\n\tPearson's Chi-squared test\n\ndata:  freq\nX-squared = 175.85, df = 2, p-value < 2.2e-16\n```\n\n\n:::\n:::\n\n\n\nWe reject the null hypothesis that the proportion of seropositive individuals who are young (<5yo) is the same for individuals who are middle (5-10yo) or old (>10yo).\n\n\n## Correlation\n\nFirst, we compute correlation by providing two vectors.\n\nLike other functions, if there are `NA`s, you get `NA` as the result. 
But if you specify `use = \"complete.obs\"`, only the complete observations are used, and you get the correlation of the non-missing data.\n\n\n\n::: {.cell}\n\n```{.r .cell-code}\ncor(df$age, df$IgG_concentration, method=\"pearson\")\n```\n\n::: {.cell-output .cell-output-stdout}\n\n```\n[1] NA\n```\n\n\n:::\n\n```{.r .cell-code}\ncor(df$age, df$IgG_concentration, method=\"pearson\", use = \"complete.obs\") # if you have missing data\n```\n\n::: {.cell-output .cell-output-stdout}\n\n```\n[1] 0.2604783\n```\n\n\n:::\n:::\n\n\n\nThere is a small positive correlation between IgG concentration and age.\n\n## T-test\n\nThe most commonly used t-tests are:\n\n-   **one-sample t-test** -- tests the mean of a variable in one group against the null hypothesis mean\n-   **two-sample t-test** -- tests the difference in means of a variable between two groups (null hypothesis: the group means are the *same*); if the \"two groups\" are data from the *same* individuals collected at 2 time points, it is a paired two-sample t-test\n\n## T-test\n\nWe can use the `t.test()` function from the `stats` package.\n\n\n\nStudent's t-Test\n\nDescription:\n\n     Performs one and two sample t-tests on vectors of data.\n\nUsage:\n\n     t.test(x, ...)\n     \n     ## Default S3 method:\n     t.test(x, y = NULL,\n            alternative = c(\"two.sided\", \"less\", \"greater\"),\n            mu = 0, paired = FALSE, var.equal = FALSE,\n            conf.level = 0.95, ...)\n     \n     ## S3 method for class 'formula'\n     t.test(formula, data, subset, na.action, ...)\n     \nArguments:\n\n       x: a (non-empty) numeric vector of data values.\n\n       y: an optional (non-empty) numeric vector of data values.\n\nalternative: a character string specifying the alternative hypothesis,\n          must be one of '\"two.sided\"' (default), '\"greater\"' or\n          '\"less\"'.  
You can specify just the initial letter.\n\n      mu: a number indicating the true value of the mean (or difference\n          in means if you are performing a two sample test).\n\n  paired: a logical indicating whether you want a paired t-test.\n\nvar.equal: a logical variable indicating whether to treat the two\n          variances as being equal. If 'TRUE' then the pooled variance\n          is used to estimate the variance otherwise the Welch (or\n          Satterthwaite) approximation to the degrees of freedom is\n          used.\n\nconf.level: confidence level of the interval.\n\n formula: a formula of the form 'lhs ~ rhs' where 'lhs' is a numeric\n          variable giving the data values and 'rhs' either '1' for a\n          one-sample or paired test or a factor with two levels giving\n          the corresponding groups. If 'lhs' is of class '\"Pair\"' and\n          'rhs' is '1', a paired test is done.\n\n    data: an optional matrix or data frame (or similar: see\n          'model.frame') containing the variables in the formula\n          'formula'.  By default the variables are taken from\n          'environment(formula)'.\n\n  subset: an optional vector specifying a subset of observations to be\n          used.\n\nna.action: a function which indicates what should happen when the data\n          contain 'NA's.  Defaults to 'getOption(\"na.action\")'.\n\n     ...: further arguments to be passed to or from methods.\n\nDetails:\n\n     'alternative = \"greater\"' is the alternative that 'x' has a larger\n     mean than 'y'. For the one-sample case: that the mean is positive.\n\n     If 'paired' is 'TRUE' then both 'x' and 'y' must be specified and\n     they must be the same length.  Missing values are silently removed\n     (in pairs if 'paired' is 'TRUE').  If 'var.equal' is 'TRUE' then\n     the pooled estimate of the variance is used.  
By default, if\n     'var.equal' is 'FALSE' then the variance is estimated separately\n     for both groups and the Welch modification to the degrees of\n     freedom is used.\n\n     If the input data are effectively constant (compared to the larger\n     of the two means) an error is generated.\n\nValue:\n\n     A list with class '\"htest\"' containing the following components:\n\nstatistic: the value of the t-statistic.\n\nparameter: the degrees of freedom for the t-statistic.\n\n p.value: the p-value for the test.\n\nconf.int: a confidence interval for the mean appropriate to the\n          specified alternative hypothesis.\n\nestimate: the estimated mean or difference in means depending on\n          whether it was a one-sample test or a two-sample test.\n\nnull.value: the specified hypothesized value of the mean or mean\n          difference depending on whether it was a one-sample test or a\n          two-sample test.\n\n  stderr: the standard error of the mean (difference), used as\n          denominator in the t-statistic formula.\n\nalternative: a character string describing the alternative hypothesis.\n\n  method: a character string indicating what type of t-test was\n          performed.\n\ndata.name: a character string giving the name(s) of the data.\n\nSee Also:\n\n     'prop.test'\n\nExamples:\n\n     require(graphics)\n     \n     t.test(1:10, y = c(7:20))      # P = .00001855\n     t.test(1:10, y = c(7:20, 200)) # P = .1245    -- NOT significant anymore\n     \n     ## Classical example: Student's sleep data\n     plot(extra ~ group, data = sleep)\n     ## Traditional interface\n     with(sleep, t.test(extra[group == 1], extra[group == 2]))\n     \n     ## Formula interface\n     t.test(extra ~ group, data = sleep)\n     \n     ## Formula interface to one-sample test\n     t.test(extra ~ 1, data = sleep)\n     \n     ## Formula interface to paired test\n     ## The sleep data are actually paired, so could have been in wide format:\n     sleep2 <- 
reshape(sleep, direction = \"wide\", \n                       idvar = \"ID\", timevar = \"group\")\n     t.test(Pair(extra.1, extra.2) ~ 1, data = sleep2)\n\n\n\n## Running two-sample t-test\n\nIn **base R**, use the `t.test()` function from the `stats` package. It tests the difference in means of a variable between two groups. By default, it:\n\n-   tests whether the difference in means of a variable is equal to 0 (default `mu=0`)\n-   uses a two-sided alternative (`alternative = \"two.sided\"`)\n-   returns a confidence interval at confidence level 0.95 (`conf.level = 0.95`)\n-   assumes data are not paired (`paired = FALSE`)\n-   does not assume the true variances in the two groups are equal (`var.equal = FALSE`)\n\n## Running two-sample t-test\n\n\n\n::: {.cell}\n\n```{.r .cell-code}\nIgG_young <- df$IgG_concentration[df$age_group==\"young\"]\nIgG_old <- df$IgG_concentration[df$age_group==\"old\"]\n\nt.test(IgG_young, IgG_old)\n```\n\n::: {.cell-output .cell-output-stdout}\n\n```\n\n\tWelch Two Sample t-test\n\ndata:  IgG_young and IgG_old\nt = -6.1969, df = 259.54, p-value = 2.25e-09\nalternative hypothesis: true difference in means is not equal to 0\n95 percent confidence interval:\n -111.09281  -57.51515\nsample estimates:\nmean of x mean of y \n 45.05056 129.35454 \n```\n\n\n:::\n:::\n\n\n\nThe mean IgG concentration in the young and old groups is 45.05 and 129.35 mIU/mL, respectively. 
We reject the null hypothesis that the difference in mean IgG concentration between the young and old groups is 0 mIU/mL.\n\n## Linear regression fit in R\n\nTo fit regression models in R, we use the function `glm()` (Generalized Linear Model).\n\n\n\n\nFitting Generalized Linear Models\n\nDescription:\n\n     'glm' is used to fit generalized linear models, specified by\n     giving a symbolic description of the linear predictor and a\n     description of the error distribution.\n\nUsage:\n\n     glm(formula, family = gaussian, data, weights, subset,\n         na.action, start = NULL, etastart, mustart, offset,\n         control = list(...), model = TRUE, method = \"glm.fit\",\n         x = FALSE, y = TRUE, singular.ok = TRUE, contrasts = NULL, ...)\n     \n     glm.fit(x, y, weights = rep.int(1, nobs),\n             start = NULL, etastart = NULL, mustart = NULL,\n             offset = rep.int(0, nobs), family = gaussian(),\n             control = list(), intercept = TRUE, singular.ok = TRUE)\n     \n     ## S3 method for class 'glm'\n     weights(object, type = c(\"prior\", \"working\"), ...)\n     \nArguments:\n\n formula: an object of class '\"formula\"' (or one that can be coerced to\n          that class): a symbolic description of the model to be\n          fitted.  The details of model specification are given under\n          'Details'.\n\n  family: a description of the error distribution and link function to\n          be used in the model.  For 'glm' this can be a character\n          string naming a family function, a family function or the\n          result of a call to a family function.  For 'glm.fit' only\n          the third option is supported.  (See 'family' for details of\n          family functions.)\n\n    data: an optional data frame, list or environment (or object\n          coercible by 'as.data.frame' to a data frame) containing the\n          variables in the model.  
If not found in 'data', the\n          variables are taken from 'environment(formula)', typically\n          the environment from which 'glm' is called.\n\n weights: an optional vector of 'prior weights' to be used in the\n          fitting process.  Should be 'NULL' or a numeric vector.\n\n  subset: an optional vector specifying a subset of observations to be\n          used in the fitting process.\n\nna.action: a function which indicates what should happen when the data\n          contain 'NA's.  The default is set by the 'na.action' setting\n          of 'options', and is 'na.fail' if that is unset.  The\n          'factory-fresh' default is 'na.omit'.  Another possible value\n          is 'NULL', no action.  Value 'na.exclude' can be useful.\n\n   start: starting values for the parameters in the linear predictor.\n\netastart: starting values for the linear predictor.\n\n mustart: starting values for the vector of means.\n\n  offset: this can be used to specify an _a priori_ known component to\n          be included in the linear predictor during fitting.  This\n          should be 'NULL' or a numeric vector of length equal to the\n          number of cases.  One or more 'offset' terms can be included\n          in the formula instead or as well, and if more than one is\n          specified their sum is used.  See 'model.offset'.\n\n control: a list of parameters for controlling the fitting process.\n          For 'glm.fit' this is passed to 'glm.control'.\n\n   model: a logical value indicating whether _model frame_ should be\n          included as a component of the returned value.\n\n  method: the method to be used in fitting the model.  
The default\n          method '\"glm.fit\"' uses iteratively reweighted least squares\n          (IWLS): the alternative '\"model.frame\"' returns the model\n          frame and does no fitting.\n\n          User-supplied fitting functions can be supplied either as a\n          function or a character string naming a function, with a\n          function which takes the same arguments as 'glm.fit'.  If\n          specified as a character string it is looked up from within\n          the 'stats' namespace.\n\n    x, y: For 'glm': logical values indicating whether the response\n          vector and model matrix used in the fitting process should be\n          returned as components of the returned value.\n\n          For 'glm.fit': 'x' is a design matrix of dimension 'n * p',\n          and 'y' is a vector of observations of length 'n'.\n\nsingular.ok: logical; if 'FALSE' a singular fit is an error.\n\ncontrasts: an optional list. See the 'contrasts.arg' of\n          'model.matrix.default'.\n\nintercept: logical. Should an intercept be included in the _null_\n          model?\n\n  object: an object inheriting from class '\"glm\"'.\n\n    type: character, partial matching allowed.  Type of weights to\n          extract from the fitted model object.  Can be abbreviated.\n\n     ...: For 'glm': arguments to be used to form the default 'control'\n          argument if it is not supplied directly.\n\n          For 'weights': further arguments passed to or from other\n          methods.\n\nDetails:\n\n     A typical predictor has the form 'response ~ terms' where\n     'response' is the (numeric) response vector and 'terms' is a\n     series of terms which specifies a linear predictor for 'response'.\n     For 'binomial' and 'quasibinomial' families the response can also\n     be specified as a 'factor' (when the first level denotes failure\n     and all others success) or as a two-column matrix with the columns\n     giving the numbers of successes and failures.  
A terms\n     specification of the form 'first + second' indicates all the terms\n     in 'first' together with all the terms in 'second' with any\n     duplicates removed.\n\n     A specification of the form 'first:second' indicates the set of\n     terms obtained by taking the interactions of all terms in 'first'\n     with all terms in 'second'.  The specification 'first*second'\n     indicates the _cross_ of 'first' and 'second'.  This is the same\n     as 'first + second + first:second'.\n\n     The terms in the formula will be re-ordered so that main effects\n     come first, followed by the interactions, all second-order, all\n     third-order and so on: to avoid this pass a 'terms' object as the\n     formula.\n\n     Non-'NULL' 'weights' can be used to indicate that different\n     observations have different dispersions (with the values in\n     'weights' being inversely proportional to the dispersions); or\n     equivalently, when the elements of 'weights' are positive integers\n     w_i, that each response y_i is the mean of w_i unit-weight\n     observations.  For a binomial GLM prior weights are used to give\n     the number of trials when the response is the proportion of\n     successes: they would rarely be used for a Poisson GLM.\n\n     'glm.fit' is the workhorse function: it is not normally called\n     directly but can be more efficient where the response vector,\n     design matrix and family have already been calculated.\n\n     If more than one of 'etastart', 'start' and 'mustart' is\n     specified, the first in the list will be used.  
It is often\n     advisable to supply starting values for a 'quasi' family, and also\n     for families with unusual links such as 'gaussian(\"log\")'.\n\n     All of 'weights', 'subset', 'offset', 'etastart' and 'mustart' are\n     evaluated in the same way as variables in 'formula', that is first\n     in 'data' and then in the environment of 'formula'.\n\n     For the background to warning messages about 'fitted probabilities\n     numerically 0 or 1 occurred' for binomial GLMs, see Venables &\n     Ripley (2002, pp. 197-8).\n\nValue:\n\n     'glm' returns an object of class inheriting from '\"glm\"' which\n     inherits from the class '\"lm\"'. See later in this section.  If a\n     non-standard 'method' is used, the object will also inherit from\n     the class (if any) returned by that function.\n\n     The function 'summary' (i.e., 'summary.glm') can be used to obtain\n     or print a summary of the results and the function 'anova' (i.e.,\n     'anova.glm') to produce an analysis of variance table.\n\n     The generic accessor functions 'coefficients', 'effects',\n     'fitted.values' and 'residuals' can be used to extract various\n     useful features of the value returned by 'glm'.\n\n     'weights' extracts a vector of weights, one for each case in the\n     fit (after subsetting and 'na.action').\n\n     An object of class '\"glm\"' is a list containing at least the\n     following components:\n\ncoefficients: a named vector of coefficients\n\nresiduals: the _working_ residuals, that is the residuals in the final\n          iteration of the IWLS fit.  
Since cases with zero weights are\n          omitted, their working residuals are 'NA'.\n\nfitted.values: the fitted mean values, obtained by transforming the\n          linear predictors by the inverse of the link function.\n\n    rank: the numeric rank of the fitted linear model.\n\n  family: the 'family' object used.\n\nlinear.predictors: the linear fit on link scale.\n\ndeviance: up to a constant, minus twice the maximized log-likelihood.\n          Where sensible, the constant is chosen so that a saturated\n          model has deviance zero.\n\n     aic: A version of Akaike's _An Information Criterion_, minus twice\n          the maximized log-likelihood plus twice the number of\n          parameters, computed via the 'aic' component of the family.\n          For binomial and Poison families the dispersion is fixed at\n          one and the number of parameters is the number of\n          coefficients.  For gaussian, Gamma and inverse gaussian\n          families the dispersion is estimated from the residual\n          deviance, and the number of parameters is the number of\n          coefficients plus one.  For a gaussian family the MLE of the\n          dispersion is used so this is a valid value of AIC, but for\n          Gamma and inverse gaussian families it is not.  For families\n          fitted by quasi-likelihood the value is 'NA'.\n\nnull.deviance: The deviance for the null model, comparable with\n          'deviance'. The null model will include the offset, and an\n          intercept if there is one in the model.  
Note that this will\n          be incorrect if the link function depends on the data other\n          than through the fitted mean: specify a zero offset to force\n          a correct calculation.\n\n    iter: the number of iterations of IWLS used.\n\n weights: the _working_ weights, that is the weights in the final\n          iteration of the IWLS fit.\n\nprior.weights: the weights initially supplied, a vector of '1's if none\n          were.\n\ndf.residual: the residual degrees of freedom.\n\n df.null: the residual degrees of freedom for the null model.\n\n       y: if requested (the default) the 'y' vector used. (It is a\n          vector even for a binomial model.)\n\n       x: if requested, the model matrix.\n\n   model: if requested (the default), the model frame.\n\nconverged: logical. Was the IWLS algorithm judged to have converged?\n\nboundary: logical. Is the fitted value on the boundary of the\n          attainable values?\n\n    call: the matched call.\n\n formula: the formula supplied.\n\n   terms: the 'terms' object used.\n\n    data: the 'data argument'.\n\n  offset: the offset vector used.\n\n control: the value of the 'control' argument used.\n\n  method: the name of the fitter function used (when provided as a\n          'character' string to 'glm()') or the fitter 'function' (when\n          provided as that).\n\ncontrasts: (where relevant) the contrasts used.\n\n xlevels: (where relevant) a record of the levels of the factors used\n          in fitting.\n\nna.action: (where relevant) information returned by 'model.frame' on\n          the special handling of 'NA's.\n\n     In addition, non-empty fits will have components 'qr', 'R' and\n     'effects' relating to the final weighted linear fit.\n\n     Objects of class '\"glm\"' are normally of class 'c(\"glm\", \"lm\")',\n     that is inherit from class '\"lm\"', and well-designed methods for\n     class '\"lm\"' will be applied to the weighted linear model at the\n     final iteration of IWLS.  
However, care is needed, as extractor\n     functions for class '\"glm\"' such as 'residuals' and 'weights' do\n     *not* just pick out the component of the fit with the same name.\n\n     If a 'binomial' 'glm' model was specified by giving a two-column\n     response, the weights returned by 'prior.weights' are the total\n     numbers of cases (factored by the supplied case weights) and the\n     component 'y' of the result is the proportion of successes.\n\nFitting functions:\n\n     The argument 'method' serves two purposes.  One is to allow the\n     model frame to be recreated with no fitting.  The other is to\n     allow the default fitting function 'glm.fit' to be replaced by a\n     function which takes the same arguments and uses a different\n     fitting algorithm.  If 'glm.fit' is supplied as a character string\n     it is used to search for a function of that name, starting in the\n     'stats' namespace.\n\n     The class of the object return by the fitter (if any) will be\n     prepended to the class returned by 'glm'.\n\nAuthor(s):\n\n     The original R implementation of 'glm' was written by Simon Davies\n     working for Ross Ihaka at the University of Auckland, but has\n     since been extensively re-written by members of the R Core team.\n\n     The design was inspired by the S function of the same name\n     described in Hastie & Pregibon (1992).\n\nReferences:\n\n     Dobson, A. J. (1990) _An Introduction to Generalized Linear\n     Models._ London: Chapman and Hall.\n\n     Hastie, T. J. and Pregibon, D. (1992) _Generalized linear models._\n     Chapter 6 of _Statistical Models in S_ eds J. M. Chambers and T.\n     J. Hastie, Wadsworth & Brooks/Cole.\n\n     McCullagh P. and Nelder, J. A. (1989) _Generalized Linear Models._\n     London: Chapman and Hall.\n\n     Venables, W. N. and Ripley, B. D. (2002) _Modern Applied\n     Statistics with S._ New York: Springer.\n\nSee Also:\n\n     'anova.glm', 'summary.glm', etc. 
for 'glm' methods, and the\n     generic functions 'anova', 'summary', 'effects', 'fitted.values',\n     and 'residuals'.\n\n     'lm' for non-generalized _linear_ models (which SAS calls GLMs,\n     for 'general' linear models).\n\n     'loglin' and 'loglm' (package 'MASS') for fitting log-linear\n     models (which binomial and Poisson GLMs are) to contingency\n     tables.\n\n     'bigglm' in package 'biglm' for an alternative way to fit GLMs to\n     large datasets (especially those with many cases).\n\n     'esoph', 'infert' and 'predict.glm' have examples of fitting\n     binomial glms.\n\nExamples:\n\n     ## Dobson (1990) Page 93: Randomized Controlled Trial :\n     counts <- c(18,17,15,20,10,20,25,13,12)\n     outcome <- gl(3,1,9)\n     treatment <- gl(3,3)\n     data.frame(treatment, outcome, counts) # showing data\n     glm.D93 <- glm(counts ~ outcome + treatment, family = poisson())\n     anova(glm.D93)\n     summary(glm.D93)\n     ## Computing AIC [in many ways]:\n     (A0 <- AIC(glm.D93))\n     (ll <- logLik(glm.D93))\n     A1 <- -2*c(ll) + 2*attr(ll, \"df\")\n     A2 <- glm.D93$family$aic(counts, mu=fitted(glm.D93), wt=1) +\n             2 * length(coef(glm.D93))\n     stopifnot(exprs = {\n       all.equal(A0, A1)\n       all.equal(A1, A2)\n       all.equal(A1, glm.D93$aic)\n     })\n     \n     \n     ## an example with offsets from Venables & Ripley (2002, p.189)\n     utils::data(anorexia, package = \"MASS\")\n     \n     anorex.1 <- glm(Postwt ~ Prewt + Treat + offset(Prewt),\n                     family = gaussian, data = anorexia)\n     summary(anorex.1)\n     \n     \n     # A Gamma example, from McCullagh & Nelder (1989, pp. 
300-2)\n     clotting <- data.frame(\n         u = c(5,10,15,20,30,40,60,80,100),\n         lot1 = c(118,58,42,35,27,25,21,19,18),\n         lot2 = c(69,35,26,21,18,16,13,12,12))\n     summary(glm(lot1 ~ log(u), data = clotting, family = Gamma))\n     summary(glm(lot2 ~ log(u), data = clotting, family = Gamma))\n     ## Aliased (\"S\"ingular) -> 1 NA coefficient\n     (fS <- glm(lot2 ~ log(u) + log(u^2), data = clotting, family = Gamma))\n     tools::assertError(update(fS, singular.ok=FALSE), verbose=interactive())\n     ## -> .. \"singular fit encountered\"\n     \n     ## Not run:\n     \n     ## for an example of the use of a terms object as a formula\n     demo(glm.vr)\n     ## End(Not run)\n\n\n\n## Linear regression fit in R\n\nWe tend to focus on three arguments:\n\n-   `formula` -- model formula written using names of columns in our data\n-   `data` -- our data frame\n-\t\t`family` -- error distribution and link function\n\n\n\n::: {.cell}\n\n```{.r .cell-code}\nfit1 <- glm(IgG_concentration~age+gender+slum, data=df, family=gaussian())\nfit2 <- glm(seropos~age_group+gender+slum, data=df, family = binomial(link = \"logit\"))\n```\n:::\n\n\n\n## `summary.glm()`\n\nThe `summary()` function when applied to a fit object based on a glm is technically the `summary.glm()` function and produces details of the model fit. 
Note on object oriented code.\n\n\n\n::: {.cell}\n::: {.cell-output-display}\n![](images/rstudio_script.png){width=200%}\n:::\n:::\n\nSummarizing Generalized Linear Model Fits\n\nDescription:\n\n     These functions are all 'methods' for class 'glm' or 'summary.glm'\n     objects.\n\nUsage:\n\n     ## S3 method for class 'glm'\n     summary(object, dispersion = NULL, correlation = FALSE,\n             symbolic.cor = FALSE, ...)\n     \n     ## S3 method for class 'summary.glm'\n     print(x, digits = max(3, getOption(\"digits\") - 3),\n           symbolic.cor = x$symbolic.cor,\n           signif.stars = getOption(\"show.signif.stars\"),\n           show.residuals = FALSE, ...)\n     \nArguments:\n\n  object: an object of class '\"glm\"', usually, a result of a call to\n          'glm'.\n\n       x: an object of class '\"summary.glm\"', usually, a result of a\n          call to 'summary.glm'.\n\ndispersion: the dispersion parameter for the family used.  Either a\n          single numerical value or 'NULL' (the default), when it is\n          inferred from 'object' (see 'Details').\n\ncorrelation: logical; if 'TRUE', the correlation matrix of the\n          estimated parameters is returned and printed.\n\n  digits: the number of significant digits to use when printing.\n\nsymbolic.cor: logical. If 'TRUE', print the correlations in a symbolic\n          form (see 'symnum') rather than as numbers.\n\nsignif.stars: logical. If 'TRUE', 'significance stars' are printed for\n          each coefficient.\n\nshow.residuals: logical. If 'TRUE' then a summary of the deviance\n          residuals is printed at the head of the output.\n\n     ...: further arguments passed to or from other methods.\n\nDetails:\n\n     'print.summary.glm' tries to be smart about formatting the\n     coefficients, standard errors, etc. and additionally gives\n     'significance stars' if 'signif.stars' is 'TRUE'.  
The\n     'coefficients' component of the result gives the estimated\n     coefficients and their estimated standard errors, together with\n     their ratio.  This third column is labelled 't ratio' if the\n     dispersion is estimated, and 'z ratio' if the dispersion is known\n     (or fixed by the family).  A fourth column gives the two-tailed\n     p-value corresponding to the t or z ratio based on a Student t or\n     Normal reference distribution.  (It is possible that the\n     dispersion is not known and there are no residual degrees of\n     freedom from which to estimate it.  In that case the estimate is\n     'NaN'.)\n\n     Aliased coefficients are omitted in the returned object but\n     restored by the 'print' method.\n\n     Correlations are printed to two decimal places (or symbolically):\n     to see the actual correlations print 'summary(object)$correlation'\n     directly.\n\n     The dispersion of a GLM is not used in the fitting process, but it\n     is needed to find standard errors.  
If 'dispersion' is not\n     supplied or 'NULL', the dispersion is taken as '1' for the\n     'binomial' and 'Poisson' families, and otherwise estimated by the\n     residual Chisquared statistic (calculated from cases with non-zero\n     weights) divided by the residual degrees of freedom.\n\n     'summary' can be used with Gaussian 'glm' fits to handle the case\n     of a linear regression with known error variance, something not\n     handled by 'summary.lm'.\n\nValue:\n\n     'summary.glm' returns an object of class '\"summary.glm\"', a list\n     with components\n\n    call: the component from 'object'.\n\n  family: the component from 'object'.\n\ndeviance: the component from 'object'.\n\ncontrasts: the component from 'object'.\n\ndf.residual: the component from 'object'.\n\nnull.deviance: the component from 'object'.\n\n df.null: the component from 'object'.\n\ndeviance.resid: the deviance residuals: see 'residuals.glm'.\n\ncoefficients: the matrix of coefficients, standard errors, z-values and\n          p-values.  Aliased coefficients are omitted.\n\n aliased: named logical vector showing if the original coefficients are\n          aliased.\n\ndispersion: either the supplied argument or the inferred/estimated\n          dispersion if the former is 'NULL'.\n\n      df: a 3-vector of the rank of the model and the number of\n          residual degrees of freedom, plus number of coefficients\n          (including aliased ones).\n\ncov.unscaled: the unscaled ('dispersion = 1') estimated covariance\n          matrix of the estimated coefficients.\n\ncov.scaled: ditto, scaled by 'dispersion'.\n\ncorrelation: (only if 'correlation' is true.)  The estimated\n          correlations of the estimated coefficients.\n\nsymbolic.cor: (only if 'correlation' is true.)  
The value of the\n          argument 'symbolic.cor'.\n\nSee Also:\n\n     'glm', 'summary'.\n\nExamples:\n\n     ## For examples see example(glm)\n\n\n\n\n## Linear regression fit in R\n\nLet's look at the output...\n\n\n\n::: {.cell}\n\n```{.r .cell-code}\nsummary(fit1)\n```\n\n::: {.cell-output .cell-output-stdout}\n\n```\n\nCall:\nglm(formula = IgG_concentration ~ age + gender + slum, family = gaussian(), \n    data = df)\n\nCoefficients:\n             Estimate Std. Error t value Pr(>|t|)    \n(Intercept)    46.132     16.774   2.750  0.00613 ** \nage             9.324      1.388   6.718 4.15e-11 ***\ngenderMale     -9.655     11.543  -0.836  0.40321    \nslumNon slum  -20.353     14.299  -1.423  0.15513    \nslumSlum      -29.705     25.009  -1.188  0.23536    \n---\nSignif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1\n\n(Dispersion parameter for gaussian family taken to be 20918.39)\n\n    Null deviance: 14141483  on 631  degrees of freedom\nResidual deviance: 13115831  on 627  degrees of freedom\n  (19 observations deleted due to missingness)\nAIC: 8087.9\n\nNumber of Fisher Scoring iterations: 2\n```\n\n\n:::\n\n```{.r .cell-code}\nsummary(fit2)\n```\n\n::: {.cell-output .cell-output-stdout}\n\n```\n\nCall:\nglm(formula = seropos ~ age_group + gender + slum, family = binomial(link = \"logit\"), \n    data = df)\n\nCoefficients:\n                Estimate Std. Error z value Pr(>|z|)    \n(Intercept)      -1.3220     0.2516  -5.254 1.49e-07 ***\nage_groupmiddle   1.9020     0.2133   8.916  < 2e-16 ***\nage_groupold      2.8443     0.2522  11.278  < 2e-16 ***\ngenderMale       -0.1725     0.1895  -0.910    0.363    \nslumNon slum     -0.1099     0.2329  -0.472    0.637    \nslumSlum         -0.1073     0.4118  -0.261    0.794    \n---\nSignif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 
0.1 ' ' 1\n\n(Dispersion parameter for binomial family taken to be 1)\n\n    Null deviance: 866.98  on 631  degrees of freedom\nResidual deviance: 679.10  on 626  degrees of freedom\n  (19 observations deleted due to missingness)\nAIC: 691.1\n\nNumber of Fisher Scoring iterations: 4\n```\n\n\n:::\n:::\n\n\n\n\n\n## Summary\n\n-   Use `cor()` to calculate correlation between two numeric vectors.\n-   `corrplot()` and `ggpairs()` are nice for a quick visualization of correlations\n-   `t.test()` or `t_test()` tests the mean against a null value, or the difference in means between two groups\n-   `chisq.test()` tests the independence of two factor variables\n-   `glm()` fits generalized linear models, and `summary()` reports the details of the fit\n\n## Acknowledgements\n\nThese are the materials I looked through, modified, or extracted to complete this module's lecture.\n\n-   [\"Introduction to R for Public Health Researchers\" Johns Hopkins University](https://jhudatascience.org/intro_to_r/)\n",
+    "markdown": "---\ntitle: \"Module 9: Data Analysis\"\nformat: \n  revealjs:\n    scrollable: true\n    smaller: true\n    toc: false\n---\n\n\n## Learning Objectives\n\nAfter module 9, you should be able to...\n\n-\tDescriptively assess association between two variables\n-\tCompute basic statistics \n-\tFit a generalized linear model\n\n## Import data for this module\n\nLet's read in our data (again) and take a quick look.\n\n\n::: {.cell}\n\n```{.r .cell-code}\ndf <- read.csv(file = \"data/serodata.csv\") #relative path\nhead(x=df, n=3)\n```\n\n::: {.cell-output .cell-output-stdout}\n```\n  observation_id IgG_concentration age gender     slum\n1           5772         0.3176895   2 Female Non slum\n2           8095         3.4368231   4 Female Non slum\n3           9784         0.3000000   4   Male Non slum\n```\n:::\n:::\n\n\n## Prep data\n\nCreate the three-level factor variable `age_group`.\n\n::: {.cell}\n\n```{.r .cell-code}\ndf$age_group <- ifelse(df$age <= 5, \"young\", \n                       ifelse(df$age<=10 & df$age>5, \"middle\", \"old\"))\ndf$age_group <- factor(df$age_group, levels=c(\"young\", \"middle\", \"old\"))\n```\n:::\n\n\nCreate the binary variable `seropos`, which represents seropositivity when the antibody concentration is ≥10 IU/mL.\n\n::: {.cell}\n\n```{.r .cell-code}\ndf$seropos <- ifelse(df$IgG_concentration<10, 0, 1)\n```\n:::\n\n\n\n## 2 variable contingency tables\n\nWe previously used `table()` to look at one variable; now we can generate frequency tables for two or more variables.  To get cell percentages, the `prop.table()` function is useful.  
\n\n\n::: {.cell}\n\n```{.r .cell-code}\n?prop.table\n```\n:::\n\n::: {.cell}\n\n```{.r .cell-code}\nlibrary(printr)\n```\n\n::: {.cell-output .cell-output-stderr}\n```\nRegistered S3 method overwritten by 'printr':\n  method                from     \n  knit_print.data.frame rmarkdown\n```\n:::\n\n```{.r .cell-code}\n?prop.table\n```\n\n::: {.cell-output .cell-output-stdout}\n```\nExpress Table Entries as Fraction of Marginal Table\n\nDescription:\n\n     Returns conditional proportions given 'margins', i.e. entries of\n     'x', divided by the appropriate marginal sums.\n\nUsage:\n\n     proportions(x, margin = NULL)\n     prop.table(x, margin = NULL)\n     \nArguments:\n\n       x: table\n\n  margin: a vector giving the margins to split by.  E.g., for a matrix\n          '1' indicates rows, '2' indicates columns, 'c(1, 2)'\n          indicates rows and columns.  When 'x' has named dimnames, it\n          can be a character vector selecting dimension names.\n\nValue:\n\n     Table like 'x' expressed relative to 'margin'\n\nNote:\n\n     'prop.table' is an earlier name, retained for back-compatibility.\n\nAuthor(s):\n\n     Peter Dalgaard\n\nSee Also:\n\n     'marginSums'. 
'apply', 'sweep' are a more general mechanism for\n     sweeping out marginal statistics.\n\nExamples:\n\n     m <- matrix(1:4, 2)\n     m\n     proportions(m, 1)\n     \n     DF <- as.data.frame(UCBAdmissions)\n     tbl <- xtabs(Freq ~ Gender + Admit, DF)\n     \n     proportions(tbl, \"Gender\")\n```\n:::\n:::\n\n\n## 2 variable contingency tables\n\nLet's practice\n\n::: {.cell}\n\n```{.r .cell-code}\nfreq <- table(df$age_group, df$seropos)\nfreq\n```\n\n::: {.cell-output-display}\n|/      |   0|   1|\n|:------|---:|---:|\n|young  | 254|  57|\n|middle |  70| 105|\n|old    |  30| 116|\n:::\n:::\n\n\nNow, let's move to percentages\n\n::: {.cell}\n\n```{.r .cell-code}\nprop.cell.percentages <- prop.table(freq)\nprop.cell.percentages\n```\n\n::: {.cell-output-display}\n|/      |         0|         1|\n|:------|---------:|---------:|\n|young  | 0.4018987| 0.0901899|\n|middle | 0.1107595| 0.1661392|\n|old    | 0.0474684| 0.1835443|\n:::\n\n```{.r .cell-code}\nprop.column.percentages <- prop.table(freq, margin=2)\nprop.column.percentages\n```\n\n::: {.cell-output-display}\n|/      |         0|         1|\n|:------|---------:|---------:|\n|young  | 0.7175141| 0.2050360|\n|middle | 0.1977401| 0.3776978|\n|old    | 0.0847458| 0.4172662|\n:::\n:::\n\n\n\n## Chi-Square test\n\nThe `chisq.test()` function from the `stats` package tests the independence of factor variables.\n\n\n::: {.cell}\n\n```{.r .cell-code}\n?chisq.test\n```\n:::\n\nPearson's Chi-squared Test for Count Data\n\nDescription:\n\n     'chisq.test' performs chi-squared contingency table tests and\n     goodness-of-fit tests.\n\nUsage:\n\n     chisq.test(x, y = NULL, correct = TRUE,\n                p = rep(1/length(x), length(x)), rescale.p = FALSE,\n                simulate.p.value = FALSE, B = 2000)\n     \nArguments:\n\n       x: a numeric vector or matrix. 'x' and 'y' can also both be\n          factors.\n\n       y: a numeric vector; ignored if 'x' is a matrix.  
If 'x' is a\n          factor, 'y' should be a factor of the same length.\n\n correct: a logical indicating whether to apply continuity correction\n          when computing the test statistic for 2 by 2 tables: one half\n          is subtracted from all |O - E| differences; however, the\n          correction will not be bigger than the differences\n          themselves.  No correction is done if 'simulate.p.value =\n          TRUE'.\n\n       p: a vector of probabilities of the same length as 'x'.  An\n          error is given if any entry of 'p' is negative.\n\nrescale.p: a logical scalar; if TRUE then 'p' is rescaled (if\n          necessary) to sum to 1.  If 'rescale.p' is FALSE, and 'p'\n          does not sum to 1, an error is given.\n\nsimulate.p.value: a logical indicating whether to compute p-values by\n          Monte Carlo simulation.\n\n       B: an integer specifying the number of replicates used in the\n          Monte Carlo test.\n\nDetails:\n\n     If 'x' is a matrix with one row or column, or if 'x' is a vector\n     and 'y' is not given, then a _goodness-of-fit test_ is performed\n     ('x' is treated as a one-dimensional contingency table).  The\n     entries of 'x' must be non-negative integers.  In this case, the\n     hypothesis tested is whether the population probabilities equal\n     those in 'p', or are all equal if 'p' is not given.\n\n     If 'x' is a matrix with at least two rows and columns, it is taken\n     as a two-dimensional contingency table: the entries of 'x' must be\n     non-negative integers.  Otherwise, 'x' and 'y' must be vectors or\n     factors of the same length; cases with missing values are removed,\n     the objects are coerced to factors, and the contingency table is\n     computed from these.  
Then Pearson's chi-squared test is performed\n     of the null hypothesis that the joint distribution of the cell\n     counts in a 2-dimensional contingency table is the product of the\n     row and column marginals.\n\n     If 'simulate.p.value' is 'FALSE', the p-value is computed from the\n     asymptotic chi-squared distribution of the test statistic;\n     continuity correction is only used in the 2-by-2 case (if\n     'correct' is 'TRUE', the default).  Otherwise the p-value is\n     computed for a Monte Carlo test (Hope, 1968) with 'B' replicates.\n     The default 'B = 2000' implies a minimum p-value of about 0.0005\n     (1/(B+1)).\n\n     In the contingency table case, simulation is done by random\n     sampling from the set of all contingency tables with given\n     marginals, and works only if the marginals are strictly positive.\n     Continuity correction is never used, and the statistic is quoted\n     without it.  Note that this is not the usual sampling situation\n     assumed for the chi-squared test but rather that for Fisher's\n     exact test.\n\n     In the goodness-of-fit case simulation is done by random sampling\n     from the discrete distribution specified by 'p', each sample being\n     of size 'n = sum(x)'.  
This simulation is done in R and may be\n     slow.\n\nValue:\n\n     A list with class '\"htest\"' containing the following components:\n\nstatistic: the value the chi-squared test statistic.\n\nparameter: the degrees of freedom of the approximate chi-squared\n          distribution of the test statistic, 'NA' if the p-value is\n          computed by Monte Carlo simulation.\n\n p.value: the p-value for the test.\n\n  method: a character string indicating the type of test performed, and\n          whether Monte Carlo simulation or continuity correction was\n          used.\n\ndata.name: a character string giving the name(s) of the data.\n\nobserved: the observed counts.\n\nexpected: the expected counts under the null hypothesis.\n\nresiduals: the Pearson residuals, '(observed - expected) /\n          sqrt(expected)'.\n\n  stdres: standardized residuals, '(observed - expected) / sqrt(V)',\n          where 'V' is the residual cell variance (Agresti, 2007,\n          section 2.4.5 for the case where 'x' is a matrix, 'n * p * (1\n          - p)' otherwise).\n\nSource:\n\n     The code for Monte Carlo simulation is a C translation of the\n     Fortran algorithm of Patefield (1981).\n\nReferences:\n\n     Hope, A. C. A. (1968).  A simplified Monte Carlo significance test\n     procedure.  _Journal of the Royal Statistical Society Series B_,\n     *30*, 582-598.  doi:10.1111/j.2517-6161.1968.tb00759.x\n     <https://doi.org/10.1111/j.2517-6161.1968.tb00759.x>.\n\n     Patefield, W. M. (1981).  Algorithm AS 159: An efficient method of\n     generating r x c tables with given row and column totals.\n     _Applied Statistics_, *30*, 91-97.  doi:10.2307/2346669\n     <https://doi.org/10.2307/2346669>.\n\n     Agresti, A. (2007).  _An Introduction to Categorical Data\n     Analysis_, 2nd ed.  New York: John Wiley & Sons.  
Page 38.\n\nSee Also:\n\n     For goodness-of-fit testing, notably of continuous distributions,\n     'ks.test'.\n\nExamples:\n\n     ## From Agresti(2007) p.39\n     M <- as.table(rbind(c(762, 327, 468), c(484, 239, 477)))\n     dimnames(M) <- list(gender = c(\"F\", \"M\"),\n                         party = c(\"Democrat\",\"Independent\", \"Republican\"))\n     (Xsq <- chisq.test(M))  # Prints test summary\n     Xsq$observed   # observed counts (same as M)\n     Xsq$expected   # expected counts under the null\n     Xsq$residuals  # Pearson residuals\n     Xsq$stdres     # standardized residuals\n     \n     \n     ## Effect of simulating p-values\n     x <- matrix(c(12, 5, 7, 7), ncol = 2)\n     chisq.test(x)$p.value           # 0.4233\n     chisq.test(x, simulate.p.value = TRUE, B = 10000)$p.value\n                                     # around 0.29!\n     \n     ## Testing for population probabilities\n     ## Case A. Tabulated data\n     x <- c(A = 20, B = 15, C = 25)\n     chisq.test(x)\n     chisq.test(as.table(x))             # the same\n     x <- c(89,37,30,28,2)\n     p <- c(40,20,20,15,5)\n     try(\n     chisq.test(x, p = p)                # gives an error\n     )\n     chisq.test(x, p = p, rescale.p = TRUE)\n                                     # works\n     p <- c(0.40,0.20,0.20,0.19,0.01)\n                                     # Expected count in category 5\n                                     # is 1.86 < 5 ==> chi square approx.\n     chisq.test(x, p = p)            #               maybe doubtful, but is ok!\n     chisq.test(x, p = p, simulate.p.value = TRUE)\n     \n     ## Case B. 
Raw data\n     x <- trunc(5 * runif(100))\n     chisq.test(table(x))            # NOT 'chisq.test(x)'!\n\n\n\n## Chi-Square test\n\n\n::: {.cell}\n\n```{.r .cell-code}\nchisq.test(freq)\n```\n\n::: {.cell-output .cell-output-stdout}\n```\n\n\tPearson's Chi-squared test\n\ndata:  freq\nX-squared = 175.85, df = 2, p-value < 2.2e-16\n```\n:::\n:::\n\n\nWe reject the null hypothesis that the proportion of seropositive individuals is the same in the young, middle, and old age groups.\n\n\n## Correlation\n\nFirst, we compute correlation by providing two vectors.\n\nLike many other functions, `cor()` returns `NA` if there are `NA`s in the data. But if you specify `use = \"complete.obs\"`, it computes the correlation using only the complete (non-missing) observations.\n\n\n::: {.cell}\n\n```{.r .cell-code}\ncor(df$age, df$IgG_concentration, method=\"pearson\")\n```\n\n::: {.cell-output .cell-output-stdout}\n```\n[1] NA\n```\n:::\n\n```{.r .cell-code}\ncor(df$age, df$IgG_concentration, method=\"pearson\", use = \"complete.obs\") # use if there are missing data\n```\n\n::: {.cell-output .cell-output-stdout}\n```\n[1] 0.2604783\n```\n:::\n:::\n\n\nThere is a small positive correlation between IgG concentration and age.\n\n## Correlation confidence interval\n\nThe function `cor.test()` also gives you the confidence interval of the correlation statistic. Note that it uses complete observations by default. 
\n\n\n::: {.cell}\n\n```{.r .cell-code}\ncor.test(df$age, df$IgG_concentration, method=\"pearson\")\n```\n\n::: {.cell-output .cell-output-stdout}\n```\n\n\tPearson's product-moment correlation\n\ndata:  df$age and df$IgG_concentration\nt = 6.7717, df = 630, p-value = 2.921e-11\nalternative hypothesis: true correlation is not equal to 0\n95 percent confidence interval:\n 0.1862722 0.3317295\nsample estimates:\n      cor \n0.2604783 \n```\n:::\n:::\n\n\n\n## T-test\n\nThe most commonly used t-tests are:\n\n-   **one-sample t-test** -- used to test the mean of a variable in one group against the null hypothesis mean\n-   **two-sample t-test** -- used to test the difference in means of a variable between two groups (null hypothesis: the group means are the *same*)\n\n## T-test\n\nWe can use the `t.test()` function from the `stats` package.\n\n\n\n::: {.cell}\n\n```{.r .cell-code}\n?t.test\n```\n:::\n\nStudent's t-Test\n\nDescription:\n\n     Performs one and two sample t-tests on vectors of data.\n\nUsage:\n\n     t.test(x, ...)\n     \n     ## Default S3 method:\n     t.test(x, y = NULL,\n            alternative = c(\"two.sided\", \"less\", \"greater\"),\n            mu = 0, paired = FALSE, var.equal = FALSE,\n            conf.level = 0.95, ...)\n     \n     ## S3 method for class 'formula'\n     t.test(formula, data, subset, na.action, ...)\n     \nArguments:\n\n       x: a (non-empty) numeric vector of data values.\n\n       y: an optional (non-empty) numeric vector of data values.\n\nalternative: a character string specifying the alternative hypothesis,\n          must be one of '\"two.sided\"' (default), '\"greater\"' or\n          '\"less\"'.  
You can specify just the initial letter.\n\n      mu: a number indicating the true value of the mean (or difference\n          in means if you are performing a two sample test).\n\n  paired: a logical indicating whether you want a paired t-test.\n\nvar.equal: a logical variable indicating whether to treat the two\n          variances as being equal. If 'TRUE' then the pooled variance\n          is used to estimate the variance otherwise the Welch (or\n          Satterthwaite) approximation to the degrees of freedom is\n          used.\n\nconf.level: confidence level of the interval.\n\n formula: a formula of the form 'lhs ~ rhs' where 'lhs' is a numeric\n          variable giving the data values and 'rhs' either '1' for a\n          one-sample or paired test or a factor with two levels giving\n          the corresponding groups. If 'lhs' is of class '\"Pair\"' and\n          'rhs' is '1', a paired test is done.\n\n    data: an optional matrix or data frame (or similar: see\n          'model.frame') containing the variables in the formula\n          'formula'.  By default the variables are taken from\n          'environment(formula)'.\n\n  subset: an optional vector specifying a subset of observations to be\n          used.\n\nna.action: a function which indicates what should happen when the data\n          contain 'NA's.  Defaults to 'getOption(\"na.action\")'.\n\n     ...: further arguments to be passed to or from methods.\n\nDetails:\n\n     'alternative = \"greater\"' is the alternative that 'x' has a larger\n     mean than 'y'. For the one-sample case: that the mean is positive.\n\n     If 'paired' is 'TRUE' then both 'x' and 'y' must be specified and\n     they must be the same length.  Missing values are silently removed\n     (in pairs if 'paired' is 'TRUE').  If 'var.equal' is 'TRUE' then\n     the pooled estimate of the variance is used.  
By default, if\n     'var.equal' is 'FALSE' then the variance is estimated separately\n     for both groups and the Welch modification to the degrees of\n     freedom is used.\n\n     If the input data are effectively constant (compared to the larger\n     of the two means) an error is generated.\n\nValue:\n\n     A list with class '\"htest\"' containing the following components:\n\nstatistic: the value of the t-statistic.\n\nparameter: the degrees of freedom for the t-statistic.\n\n p.value: the p-value for the test.\n\nconf.int: a confidence interval for the mean appropriate to the\n          specified alternative hypothesis.\n\nestimate: the estimated mean or difference in means depending on\n          whether it was a one-sample test or a two-sample test.\n\nnull.value: the specified hypothesized value of the mean or mean\n          difference depending on whether it was a one-sample test or a\n          two-sample test.\n\n  stderr: the standard error of the mean (difference), used as\n          denominator in the t-statistic formula.\n\nalternative: a character string describing the alternative hypothesis.\n\n  method: a character string indicating what type of t-test was\n          performed.\n\ndata.name: a character string giving the name(s) of the data.\n\nSee Also:\n\n     'prop.test'\n\nExamples:\n\n     require(graphics)\n     \n     t.test(1:10, y = c(7:20))      # P = .00001855\n     t.test(1:10, y = c(7:20, 200)) # P = .1245    -- NOT significant anymore\n     \n     ## Classical example: Student's sleep data\n     plot(extra ~ group, data = sleep)\n     ## Traditional interface\n     with(sleep, t.test(extra[group == 1], extra[group == 2]))\n     \n     ## Formula interface\n     t.test(extra ~ group, data = sleep)\n     \n     ## Formula interface to one-sample test\n     t.test(extra ~ 1, data = sleep)\n     \n     ## Formula interface to paired test\n     ## The sleep data are actually paired, so could have been in wide format:\n     sleep2 <- 
reshape(sleep, direction = \"wide\", \n                       idvar = \"ID\", timevar = \"group\")\n     t.test(Pair(extra.1, extra.2) ~ 1, data = sleep2)\n\n\n## Running two-sample t-test\n\nIn **base R**, we use the `t.test()` function from the `stats` package. It tests the difference in means of a variable between two groups. By default, it:\n\n-   tests whether the difference in means of a variable is equal to 0 (default `mu=0`)\n-   uses the \"two sided\" alternative (`alternative = \"two.sided\"`)\n-   returns the result assuming a confidence level of 0.95 (`conf.level = 0.95`)\n-   assumes the data are not paired (`paired = FALSE`)\n-   assumes the true variances in the two groups are not equal (`var.equal = FALSE`)\n\n## Running two-sample t-test\n\n\n::: {.cell}\n\n```{.r .cell-code}\nIgG_young <- df$IgG_concentration[df$age_group==\"young\"]\nIgG_old <- df$IgG_concentration[df$age_group==\"old\"]\n\nt.test(IgG_young, IgG_old)\n```\n\n::: {.cell-output .cell-output-stdout}\n```\n\n\tWelch Two Sample t-test\n\ndata:  IgG_young and IgG_old\nt = -6.1969, df = 259.54, p-value = 2.25e-09\nalternative hypothesis: true difference in means is not equal to 0\n95 percent confidence interval:\n -111.09281  -57.51515\nsample estimates:\nmean of x mean of y \n 45.05056 129.35454 \n```\n:::\n:::\n\n\nThe mean IgG concentrations of the young and old groups are 45.05 and 129.35 IU/mL, respectively. 
We reject the null hypothesis that the difference in mean IgG concentration between the young and old groups is 0 IU/mL.\n\n## Linear regression fit in R\n\nTo fit regression models in R, we use the function `glm()` (Generalized Linear Model).\n\n\n\n::: {.cell}\n\n```{.r .cell-code}\n?glm\n```\n:::\n\nFitting Generalized Linear Models\n\nDescription:\n\n     'glm' is used to fit generalized linear models, specified by\n     giving a symbolic description of the linear predictor and a\n     description of the error distribution.\n\nUsage:\n\n     glm(formula, family = gaussian, data, weights, subset,\n         na.action, start = NULL, etastart, mustart, offset,\n         control = list(...), model = TRUE, method = \"glm.fit\",\n         x = FALSE, y = TRUE, singular.ok = TRUE, contrasts = NULL, ...)\n     \n     glm.fit(x, y, weights = rep.int(1, nobs),\n             start = NULL, etastart = NULL, mustart = NULL,\n             offset = rep.int(0, nobs), family = gaussian(),\n             control = list(), intercept = TRUE, singular.ok = TRUE)\n     \n     ## S3 method for class 'glm'\n     weights(object, type = c(\"prior\", \"working\"), ...)\n     \nArguments:\n\n formula: an object of class '\"formula\"' (or one that can be coerced to\n          that class): a symbolic description of the model to be\n          fitted.  The details of model specification are given under\n          'Details'.\n\n  family: a description of the error distribution and link function to\n          be used in the model.  For 'glm' this can be a character\n          string naming a family function, a family function or the\n          result of a call to a family function.  For 'glm.fit' only\n          the third option is supported.  (See 'family' for details of\n          family functions.)\n\n    data: an optional data frame, list or environment (or object\n          coercible by 'as.data.frame' to a data frame) containing the\n          variables in the model.  
If not found in 'data', the\n          variables are taken from 'environment(formula)', typically\n          the environment from which 'glm' is called.\n\n weights: an optional vector of 'prior weights' to be used in the\n          fitting process.  Should be 'NULL' or a numeric vector.\n\n  subset: an optional vector specifying a subset of observations to be\n          used in the fitting process.\n\nna.action: a function which indicates what should happen when the data\n          contain 'NA's.  The default is set by the 'na.action' setting\n          of 'options', and is 'na.fail' if that is unset.  The\n          'factory-fresh' default is 'na.omit'.  Another possible value\n          is 'NULL', no action.  Value 'na.exclude' can be useful.\n\n   start: starting values for the parameters in the linear predictor.\n\netastart: starting values for the linear predictor.\n\n mustart: starting values for the vector of means.\n\n  offset: this can be used to specify an _a priori_ known component to\n          be included in the linear predictor during fitting.  This\n          should be 'NULL' or a numeric vector of length equal to the\n          number of cases.  One or more 'offset' terms can be included\n          in the formula instead or as well, and if more than one is\n          specified their sum is used.  See 'model.offset'.\n\n control: a list of parameters for controlling the fitting process.\n          For 'glm.fit' this is passed to 'glm.control'.\n\n   model: a logical value indicating whether _model frame_ should be\n          included as a component of the returned value.\n\n  method: the method to be used in fitting the model.  
The default\n          method '\"glm.fit\"' uses iteratively reweighted least squares\n          (IWLS): the alternative '\"model.frame\"' returns the model\n          frame and does no fitting.\n\n          User-supplied fitting functions can be supplied either as a\n          function or a character string naming a function, with a\n          function which takes the same arguments as 'glm.fit'.  If\n          specified as a character string it is looked up from within\n          the 'stats' namespace.\n\n    x, y: For 'glm': logical values indicating whether the response\n          vector and model matrix used in the fitting process should be\n          returned as components of the returned value.\n\n          For 'glm.fit': 'x' is a design matrix of dimension 'n * p',\n          and 'y' is a vector of observations of length 'n'.\n\nsingular.ok: logical; if 'FALSE' a singular fit is an error.\n\ncontrasts: an optional list. See the 'contrasts.arg' of\n          'model.matrix.default'.\n\nintercept: logical. Should an intercept be included in the _null_\n          model?\n\n  object: an object inheriting from class '\"glm\"'.\n\n    type: character, partial matching allowed.  Type of weights to\n          extract from the fitted model object.  Can be abbreviated.\n\n     ...: For 'glm': arguments to be used to form the default 'control'\n          argument if it is not supplied directly.\n\n          For 'weights': further arguments passed to or from other\n          methods.\n\nDetails:\n\n     A typical predictor has the form 'response ~ terms' where\n     'response' is the (numeric) response vector and 'terms' is a\n     series of terms which specifies a linear predictor for 'response'.\n     For 'binomial' and 'quasibinomial' families the response can also\n     be specified as a 'factor' (when the first level denotes failure\n     and all others success) or as a two-column matrix with the columns\n     giving the numbers of successes and failures.  
A terms\n     specification of the form 'first + second' indicates all the terms\n     in 'first' together with all the terms in 'second' with any\n     duplicates removed.\n\n     A specification of the form 'first:second' indicates the set of\n     terms obtained by taking the interactions of all terms in 'first'\n     with all terms in 'second'.  The specification 'first*second'\n     indicates the _cross_ of 'first' and 'second'.  This is the same\n     as 'first + second + first:second'.\n\n     The terms in the formula will be re-ordered so that main effects\n     come first, followed by the interactions, all second-order, all\n     third-order and so on: to avoid this pass a 'terms' object as the\n     formula.\n\n     Non-'NULL' 'weights' can be used to indicate that different\n     observations have different dispersions (with the values in\n     'weights' being inversely proportional to the dispersions); or\n     equivalently, when the elements of 'weights' are positive integers\n     w_i, that each response y_i is the mean of w_i unit-weight\n     observations.  For a binomial GLM prior weights are used to give\n     the number of trials when the response is the proportion of\n     successes: they would rarely be used for a Poisson GLM.\n\n     'glm.fit' is the workhorse function: it is not normally called\n     directly but can be more efficient where the response vector,\n     design matrix and family have already been calculated.\n\n     If more than one of 'etastart', 'start' and 'mustart' is\n     specified, the first in the list will be used.  
It is often\n     advisable to supply starting values for a 'quasi' family, and also\n     for families with unusual links such as 'gaussian(\"log\")'.\n\n     All of 'weights', 'subset', 'offset', 'etastart' and 'mustart' are\n     evaluated in the same way as variables in 'formula', that is first\n     in 'data' and then in the environment of 'formula'.\n\n     For the background to warning messages about 'fitted probabilities\n     numerically 0 or 1 occurred' for binomial GLMs, see Venables &\n     Ripley (2002, pp. 197-8).\n\nValue:\n\n     'glm' returns an object of class inheriting from '\"glm\"' which\n     inherits from the class '\"lm\"'. See later in this section.  If a\n     non-standard 'method' is used, the object will also inherit from\n     the class (if any) returned by that function.\n\n     The function 'summary' (i.e., 'summary.glm') can be used to obtain\n     or print a summary of the results and the function 'anova' (i.e.,\n     'anova.glm') to produce an analysis of variance table.\n\n     The generic accessor functions 'coefficients', 'effects',\n     'fitted.values' and 'residuals' can be used to extract various\n     useful features of the value returned by 'glm'.\n\n     'weights' extracts a vector of weights, one for each case in the\n     fit (after subsetting and 'na.action').\n\n     An object of class '\"glm\"' is a list containing at least the\n     following components:\n\ncoefficients: a named vector of coefficients\n\nresiduals: the _working_ residuals, that is the residuals in the final\n          iteration of the IWLS fit.  
Since cases with zero weights are\n          omitted, their working residuals are 'NA'.\n\nfitted.values: the fitted mean values, obtained by transforming the\n          linear predictors by the inverse of the link function.\n\n    rank: the numeric rank of the fitted linear model.\n\n  family: the 'family' object used.\n\nlinear.predictors: the linear fit on link scale.\n\ndeviance: up to a constant, minus twice the maximized log-likelihood.\n          Where sensible, the constant is chosen so that a saturated\n          model has deviance zero.\n\n     aic: A version of Akaike's _An Information Criterion_, minus twice\n          the maximized log-likelihood plus twice the number of\n          parameters, computed via the 'aic' component of the family.\n          For binomial and Poison families the dispersion is fixed at\n          one and the number of parameters is the number of\n          coefficients.  For gaussian, Gamma and inverse gaussian\n          families the dispersion is estimated from the residual\n          deviance, and the number of parameters is the number of\n          coefficients plus one.  For a gaussian family the MLE of the\n          dispersion is used so this is a valid value of AIC, but for\n          Gamma and inverse gaussian families it is not.  For families\n          fitted by quasi-likelihood the value is 'NA'.\n\nnull.deviance: The deviance for the null model, comparable with\n          'deviance'. The null model will include the offset, and an\n          intercept if there is one in the model.  
Note that this will\n          be incorrect if the link function depends on the data other\n          than through the fitted mean: specify a zero offset to force\n          a correct calculation.\n\n    iter: the number of iterations of IWLS used.\n\n weights: the _working_ weights, that is the weights in the final\n          iteration of the IWLS fit.\n\nprior.weights: the weights initially supplied, a vector of '1's if none\n          were.\n\ndf.residual: the residual degrees of freedom.\n\n df.null: the residual degrees of freedom for the null model.\n\n       y: if requested (the default) the 'y' vector used. (It is a\n          vector even for a binomial model.)\n\n       x: if requested, the model matrix.\n\n   model: if requested (the default), the model frame.\n\nconverged: logical. Was the IWLS algorithm judged to have converged?\n\nboundary: logical. Is the fitted value on the boundary of the\n          attainable values?\n\n    call: the matched call.\n\n formula: the formula supplied.\n\n   terms: the 'terms' object used.\n\n    data: the 'data argument'.\n\n  offset: the offset vector used.\n\n control: the value of the 'control' argument used.\n\n  method: the name of the fitter function used (when provided as a\n          'character' string to 'glm()') or the fitter 'function' (when\n          provided as that).\n\ncontrasts: (where relevant) the contrasts used.\n\n xlevels: (where relevant) a record of the levels of the factors used\n          in fitting.\n\nna.action: (where relevant) information returned by 'model.frame' on\n          the special handling of 'NA's.\n\n     In addition, non-empty fits will have components 'qr', 'R' and\n     'effects' relating to the final weighted linear fit.\n\n     Objects of class '\"glm\"' are normally of class 'c(\"glm\", \"lm\")',\n     that is inherit from class '\"lm\"', and well-designed methods for\n     class '\"lm\"' will be applied to the weighted linear model at the\n     final iteration of IWLS.  
However, care is needed, as extractor\n     functions for class '\"glm\"' such as 'residuals' and 'weights' do\n     *not* just pick out the component of the fit with the same name.\n\n     If a 'binomial' 'glm' model was specified by giving a two-column\n     response, the weights returned by 'prior.weights' are the total\n     numbers of cases (factored by the supplied case weights) and the\n     component 'y' of the result is the proportion of successes.\n\nFitting functions:\n\n     The argument 'method' serves two purposes.  One is to allow the\n     model frame to be recreated with no fitting.  The other is to\n     allow the default fitting function 'glm.fit' to be replaced by a\n     function which takes the same arguments and uses a different\n     fitting algorithm.  If 'glm.fit' is supplied as a character string\n     it is used to search for a function of that name, starting in the\n     'stats' namespace.\n\n     The class of the object return by the fitter (if any) will be\n     prepended to the class returned by 'glm'.\n\nAuthor(s):\n\n     The original R implementation of 'glm' was written by Simon Davies\n     working for Ross Ihaka at the University of Auckland, but has\n     since been extensively re-written by members of the R Core team.\n\n     The design was inspired by the S function of the same name\n     described in Hastie & Pregibon (1992).\n\nReferences:\n\n     Dobson, A. J. (1990) _An Introduction to Generalized Linear\n     Models._ London: Chapman and Hall.\n\n     Hastie, T. J. and Pregibon, D. (1992) _Generalized linear models._\n     Chapter 6 of _Statistical Models in S_ eds J. M. Chambers and T.\n     J. Hastie, Wadsworth & Brooks/Cole.\n\n     McCullagh P. and Nelder, J. A. (1989) _Generalized Linear Models._\n     London: Chapman and Hall.\n\n     Venables, W. N. and Ripley, B. D. (2002) _Modern Applied\n     Statistics with S._ New York: Springer.\n\nSee Also:\n\n     'anova.glm', 'summary.glm', etc. 
for 'glm' methods, and the\n     generic functions 'anova', 'summary', 'effects', 'fitted.values',\n     and 'residuals'.\n\n     'lm' for non-generalized _linear_ models (which SAS calls GLMs,\n     for 'general' linear models).\n\n     'loglin' and 'loglm' (package 'MASS') for fitting log-linear\n     models (which binomial and Poisson GLMs are) to contingency\n     tables.\n\n     'bigglm' in package 'biglm' for an alternative way to fit GLMs to\n     large datasets (especially those with many cases).\n\n     'esoph', 'infert' and 'predict.glm' have examples of fitting\n     binomial glms.\n\nExamples:\n\n     ## Dobson (1990) Page 93: Randomized Controlled Trial :\n     counts <- c(18,17,15,20,10,20,25,13,12)\n     outcome <- gl(3,1,9)\n     treatment <- gl(3,3)\n     data.frame(treatment, outcome, counts) # showing data\n     glm.D93 <- glm(counts ~ outcome + treatment, family = poisson())\n     anova(glm.D93)\n     summary(glm.D93)\n     ## Computing AIC [in many ways]:\n     (A0 <- AIC(glm.D93))\n     (ll <- logLik(glm.D93))\n     A1 <- -2*c(ll) + 2*attr(ll, \"df\")\n     A2 <- glm.D93$family$aic(counts, mu=fitted(glm.D93), wt=1) +\n             2 * length(coef(glm.D93))\n     stopifnot(exprs = {\n       all.equal(A0, A1)\n       all.equal(A1, A2)\n       all.equal(A1, glm.D93$aic)\n     })\n     \n     \n     ## an example with offsets from Venables & Ripley (2002, p.189)\n     utils::data(anorexia, package = \"MASS\")\n     \n     anorex.1 <- glm(Postwt ~ Prewt + Treat + offset(Prewt),\n                     family = gaussian, data = anorexia)\n     summary(anorex.1)\n     \n     \n     # A Gamma example, from McCullagh & Nelder (1989, pp. 
300-2)\n     clotting <- data.frame(\n         u = c(5,10,15,20,30,40,60,80,100),\n         lot1 = c(118,58,42,35,27,25,21,19,18),\n         lot2 = c(69,35,26,21,18,16,13,12,12))\n     summary(glm(lot1 ~ log(u), data = clotting, family = Gamma))\n     summary(glm(lot2 ~ log(u), data = clotting, family = Gamma))\n     ## Aliased (\"S\"ingular) -> 1 NA coefficient\n     (fS <- glm(lot2 ~ log(u) + log(u^2), data = clotting, family = Gamma))\n     tools::assertError(update(fS, singular.ok=FALSE), verbose=interactive())\n     ## -> .. \"singular fit encountered\"\n     \n     ## Not run:\n     \n     ## for an example of the use of a terms object as a formula\n     demo(glm.vr)\n     ## End(Not run)\n\n\n## Linear regression fit in R\n\nWe tend to focus on three arguments:\n\n- `formula` -- model formula written using names of columns in our data\n- `data` -- our data frame\n- `family` -- error distribution and link function\n\n\n::: {.cell}\n\n```{.r .cell-code}\nfit1 <- glm(IgG_concentration~age+gender+slum, data=df, family=gaussian())\nfit2 <- glm(seropos~age_group+gender+slum, data=df, family = binomial(link = \"logit\"))\n```\n:::\n\n\n## `summary.glm()`\n\nWhen `summary()` is applied to a fitted `glm` object, R actually calls `summary.glm()`, which reports the details of the model fit. 
This is R's S3 object-oriented dispatch at work: the generic `summary()` selects a method based on the class of its argument.\n\n\n::: {.cell}\n::: {.cell-output-display}\n![](images/rstudio_script.png){width=200%}\n:::\n:::\n\nSummarizing Generalized Linear Model Fits\n\nDescription:\n\n     These functions are all 'methods' for class 'glm' or 'summary.glm'\n     objects.\n\nUsage:\n\n     ## S3 method for class 'glm'\n     summary(object, dispersion = NULL, correlation = FALSE,\n             symbolic.cor = FALSE, ...)\n     \n     ## S3 method for class 'summary.glm'\n     print(x, digits = max(3, getOption(\"digits\") - 3),\n           symbolic.cor = x$symbolic.cor,\n           signif.stars = getOption(\"show.signif.stars\"),\n           show.residuals = FALSE, ...)\n     \nArguments:\n\n  object: an object of class '\"glm\"', usually, a result of a call to\n          'glm'.\n\n       x: an object of class '\"summary.glm\"', usually, a result of a\n          call to 'summary.glm'.\n\ndispersion: the dispersion parameter for the family used.  Either a\n          single numerical value or 'NULL' (the default), when it is\n          inferred from 'object' (see 'Details').\n\ncorrelation: logical; if 'TRUE', the correlation matrix of the\n          estimated parameters is returned and printed.\n\n  digits: the number of significant digits to use when printing.\n\nsymbolic.cor: logical. If 'TRUE', print the correlations in a symbolic\n          form (see 'symnum') rather than as numbers.\n\nsignif.stars: logical. If 'TRUE', 'significance stars' are printed for\n          each coefficient.\n\nshow.residuals: logical. If 'TRUE' then a summary of the deviance\n          residuals is printed at the head of the output.\n\n     ...: further arguments passed to or from other methods.\n\nDetails:\n\n     'print.summary.glm' tries to be smart about formatting the\n     coefficients, standard errors, etc. and additionally gives\n     'significance stars' if 'signif.stars' is 'TRUE'.  
The\n     'coefficients' component of the result gives the estimated\n     coefficients and their estimated standard errors, together with\n     their ratio.  This third column is labelled 't ratio' if the\n     dispersion is estimated, and 'z ratio' if the dispersion is known\n     (or fixed by the family).  A fourth column gives the two-tailed\n     p-value corresponding to the t or z ratio based on a Student t or\n     Normal reference distribution.  (It is possible that the\n     dispersion is not known and there are no residual degrees of\n     freedom from which to estimate it.  In that case the estimate is\n     'NaN'.)\n\n     Aliased coefficients are omitted in the returned object but\n     restored by the 'print' method.\n\n     Correlations are printed to two decimal places (or symbolically):\n     to see the actual correlations print 'summary(object)$correlation'\n     directly.\n\n     The dispersion of a GLM is not used in the fitting process, but it\n     is needed to find standard errors.  
If 'dispersion' is not\n     supplied or 'NULL', the dispersion is taken as '1' for the\n     'binomial' and 'Poisson' families, and otherwise estimated by the\n     residual Chisquared statistic (calculated from cases with non-zero\n     weights) divided by the residual degrees of freedom.\n\n     'summary' can be used with Gaussian 'glm' fits to handle the case\n     of a linear regression with known error variance, something not\n     handled by 'summary.lm'.\n\nValue:\n\n     'summary.glm' returns an object of class '\"summary.glm\"', a list\n     with components\n\n    call: the component from 'object'.\n\n  family: the component from 'object'.\n\ndeviance: the component from 'object'.\n\ncontrasts: the component from 'object'.\n\ndf.residual: the component from 'object'.\n\nnull.deviance: the component from 'object'.\n\n df.null: the component from 'object'.\n\ndeviance.resid: the deviance residuals: see 'residuals.glm'.\n\ncoefficients: the matrix of coefficients, standard errors, z-values and\n          p-values.  Aliased coefficients are omitted.\n\n aliased: named logical vector showing if the original coefficients are\n          aliased.\n\ndispersion: either the supplied argument or the inferred/estimated\n          dispersion if the former is 'NULL'.\n\n      df: a 3-vector of the rank of the model and the number of\n          residual degrees of freedom, plus number of coefficients\n          (including aliased ones).\n\ncov.unscaled: the unscaled ('dispersion = 1') estimated covariance\n          matrix of the estimated coefficients.\n\ncov.scaled: ditto, scaled by 'dispersion'.\n\ncorrelation: (only if 'correlation' is true.)  The estimated\n          correlations of the estimated coefficients.\n\nsymbolic.cor: (only if 'correlation' is true.)  
The value of the\n          argument 'symbolic.cor'.\n\nSee Also:\n\n     'glm', 'summary'.\n\nExamples:\n\n     ## For examples see example(glm)\n\n\n\n## Linear regression fit in R\n\nLet's look at the output...\n\n\n::: {.cell}\n\n```{.r .cell-code}\nsummary(fit1)\n```\n\n::: {.cell-output .cell-output-stdout}\n```\n\nCall:\nglm(formula = IgG_concentration ~ age + gender + slum, family = gaussian(), \n    data = df)\n\nCoefficients:\n             Estimate Std. Error t value Pr(>|t|)    \n(Intercept)    46.132     16.774   2.750  0.00613 ** \nage             9.324      1.388   6.718 4.15e-11 ***\ngenderMale     -9.655     11.543  -0.836  0.40321    \nslumNon slum  -20.353     14.299  -1.423  0.15513    \nslumSlum      -29.705     25.009  -1.188  0.23536    \n---\nSignif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1\n\n(Dispersion parameter for gaussian family taken to be 20918.39)\n\n    Null deviance: 14141483  on 631  degrees of freedom\nResidual deviance: 13115831  on 627  degrees of freedom\n  (19 observations deleted due to missingness)\nAIC: 8087.9\n\nNumber of Fisher Scoring iterations: 2\n```\n:::\n\n```{.r .cell-code}\nsummary(fit2)\n```\n\n::: {.cell-output .cell-output-stdout}\n```\n\nCall:\nglm(formula = seropos ~ age_group + gender + slum, family = binomial(link = \"logit\"), \n    data = df)\n\nCoefficients:\n                Estimate Std. Error z value Pr(>|z|)    \n(Intercept)      -1.3220     0.2516  -5.254 1.49e-07 ***\nage_groupmiddle   1.9020     0.2133   8.916  < 2e-16 ***\nage_groupold      2.8443     0.2522  11.278  < 2e-16 ***\ngenderMale       -0.1725     0.1895  -0.910    0.363    \nslumNon slum     -0.1099     0.2329  -0.472    0.637    \nslumSlum         -0.1073     0.4118  -0.261    0.794    \n---\nSignif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 
0.1 ' ' 1\n\n(Dispersion parameter for binomial family taken to be 1)\n\n    Null deviance: 866.98  on 631  degrees of freedom\nResidual deviance: 679.10  on 626  degrees of freedom\n  (19 observations deleted due to missingness)\nAIC: 691.1\n\nNumber of Fisher Scoring iterations: 4\n```\n:::\n:::\n\n\n\n\n## Summary\n\n- Use `cor()` or `cor.test()` to calculate the correlation between two numeric vectors.\n- `t.test()` tests a mean against a null value, or the difference in means between two groups.\n- ... and many more\n\n## Acknowledgements\n\nThese are the materials we looked through, modified, or extracted to complete this module's lecture.\n\n-   [\"Introduction to R for Public Health Researchers\" Johns Hopkins University](https://jhudatascience.org/intro_to_r/)\n",
     "supporting": [],
     "filters": [
       "rmarkdown/pagebreak.lua"
diff --git a/_freeze/modules/Module10-DataVisualization/execute-results/html.json b/_freeze/modules/Module10-DataVisualization/execute-results/html.json
index 217dc3b..059f6ff 100644
--- a/_freeze/modules/Module10-DataVisualization/execute-results/html.json
+++ b/_freeze/modules/Module10-DataVisualization/execute-results/html.json
@@ -1,8 +1,7 @@
 {
-  "hash": "e546ac5cfa3fec481cca7f255f1ede69",
+  "hash": "f5db9a97f56b293b9a271bfe4464641d",
   "result": {
-    "engine": "knitr",
-    "markdown": "---\ntitle: \"Module 10: Data Visualization\"\nformat: \n  revealjs:\n    scrollable: true\n    smaller: true\n---\n\n\n\n## Learning Objectives\n\nAfter module 10, you should be able to:\n\n- Create Base R plots\n\n## Import data for this module\n\nLet's read in our data (again) and take a quick look.\n\n\n\n::: {.cell}\n\n```{.r .cell-code}\ndf <- read.csv(file = \"data/serodata.csv\") #relative path\nhead(x=df, n=3)\n```\n\n::: {.cell-output .cell-output-stdout}\n\n```\n  observation_id IgG_concentration age gender     slum\n1           5772         0.3176895   2 Female Non slum\n2           8095         3.4368231   4 Female Non slum\n3           9784         0.3000000   4   Male Non slum\n```\n\n\n:::\n:::\n\n\n\n## Prep data\n\nCreate `age_group`, a three-level factor variable.\n\n\n::: {.cell}\n\n```{.r .cell-code}\ndf$age_group <- ifelse(df$age <= 5, \"young\", \n                       ifelse(df$age<=10 & df$age>5, \"middle\", \n                              ifelse(df$age>10, \"old\", NA)))\ndf$age_group <- factor(df$age_group, levels=c(\"young\", \"middle\", \"old\"))\n```\n:::\n\n\n\nCreate `seropos`, a binary variable indicating seropositivity (antibody concentration >= 10 mIU/mL).\n\n\n::: {.cell}\n\n```{.r .cell-code}\ndf$seropos <- ifelse(df$IgG_concentration<10, 0, \n\t\t\t\t\t\t\t\t\t\tifelse(df$IgG_concentration>=10, 1, NA))\n```\n:::\n\n\n\n## Base R data visualization functions\n\nThe Base R 'graphics' package provides many plotting functions. 
\n\n\n\n::: {.cell}\n\n```{.r .cell-code}\nlibrary(help = \"graphics\")\n```\n:::\n\n::: {.cell}\n::: {.cell-output .cell-output-stderr}\n\n```\nRegistered S3 method overwritten by 'printr':\n  method                from     \n  knit_print.data.frame rmarkdown\n```\n\n\n:::\n\n::: {.cell-output .cell-output-stdout}\n\n```\n\t\tInformation on package 'graphics'\n\nDescription:\n\nPackage:            graphics\nVersion:            4.3.1\nPriority:           base\nTitle:              The R Graphics Package\nAuthor:             R Core Team and contributors worldwide\nMaintainer:         R Core Team <do-use-Contact-address@r-project.org>\nContact:            R-help mailing list <r-help@r-project.org>\nDescription:        R functions for base graphics.\nImports:            grDevices\nLicense:            Part of R 4.3.1\nNeedsCompilation:   yes\nBuilt:              R 4.3.1; aarch64-apple-darwin20; 2023-06-16\n                    21:53:01 UTC; unix\n\nIndex:\n\nAxis                    Generic Function to Add an Axis to a Plot\nabline                  Add Straight Lines to a Plot\narrows                  Add Arrows to a Plot\nassocplot               Association Plots\naxTicks                 Compute Axis Tickmark Locations\naxis                    Add an Axis to a Plot\naxis.POSIXct            Date and Date-time Plotting Functions\nbarplot                 Bar Plots\nbox                     Draw a Box around a Plot\nboxplot                 Box Plots\nboxplot.matrix          Draw a Boxplot for each Column (Row) of a\n                        Matrix\nbxp                     Draw Box Plots from Summaries\ncdplot                  Conditional Density Plots\nclip                    Set Clipping Region\ncontour                 Display Contours\ncoplot                  Conditioning Plots\ncurve                   Draw Function Plots\ndotchart                Cleveland's Dot Plots\nfilled.contour          Level (Contour) Plots\nfourfoldplot            Fourfold Plots\nframe                
   Create / Start a New Plot Frame\ngraphics-package        The R Graphics Package\ngrconvertX              Convert between Graphics Coordinate Systems\ngrid                    Add Grid to a Plot\nhist                    Histograms\nhist.POSIXt             Histogram of a Date or Date-Time Object\nidentify                Identify Points in a Scatter Plot\nimage                   Display a Color Image\nlayout                  Specifying Complex Plot Arrangements\nlegend                  Add Legends to Plots\nlines                   Add Connected Line Segments to a Plot\nlocator                 Graphical Input\nmatplot                 Plot Columns of Matrices\nmosaicplot              Mosaic Plots\nmtext                   Write Text into the Margins of a Plot\npairs                   Scatterplot Matrices\npanel.smooth            Simple Panel Plot\npar                     Set or Query Graphical Parameters\npersp                   Perspective Plots\npie                     Pie Charts\nplot.data.frame         Plot Method for Data Frames\nplot.default            The Default Scatterplot Function\nplot.design             Plot Univariate Effects of a Design or Model\nplot.factor             Plotting Factor Variables\nplot.formula            Formula Notation for Scatterplots\nplot.histogram          Plot Histograms\nplot.raster             Plotting Raster Images\nplot.table              Plot Methods for 'table' Objects\nplot.window             Set up World Coordinates for Graphics Window\nplot.xy                 Basic Internal Plot Function\npoints                  Add Points to a Plot\npolygon                 Polygon Drawing\npolypath                Path Drawing\nrasterImage             Draw One or More Raster Images\nrect                    Draw One or More Rectangles\nrug                     Add a Rug to a Plot\nscreen                  Creating and Controlling Multiple Screens on a\n                        Single Device\nsegments                Add Line Segments to a 
Plot\nsmoothScatter           Scatterplots with Smoothed Densities Color\n                        Representation\nspineplot               Spine Plots and Spinograms\nstars                   Star (Spider/Radar) Plots and Segment Diagrams\nstem                    Stem-and-Leaf Plots\nstripchart              1-D Scatter Plots\nstrwidth                Plotting Dimensions of Character Strings and\n                        Math Expressions\nsunflowerplot           Produce a Sunflower Scatter Plot\nsymbols                 Draw Symbols (Circles, Squares, Stars,\n                        Thermometers, Boxplots)\ntext                    Add Text to a Plot\ntitle                   Plot Annotation\nxinch                   Graphical Units\nxspline                 Draw an X-spline\n```\n\n\n:::\n:::\n\n\n\n\n\n## Base R Plotting\n\nTo make a plot you often need to specify the following features:\n\n1. Parameters\n2. Plot attributes\n3. The legend\n\n## 1. Parameters\n\nThe parameter section sets the options that apply to all of your plots. Setting attributes with `par()` before you call a plotting function creates ‘global’ settings for the plots that follow.\n\nIn the example below, we set two commonly used attributes in the global plot settings:\n\n- `mfrow` specifies one row and two columns of plots, that is, two plots side by side.\n- `mar` is a vector of margin widths in the order bottom (5), left (5), top (4), and right (1).\n\n```\npar(mfrow = c(1,2), mar = c(5,5,4,1))\n```\n\n\n\n::: {.cell}\n::: {.cell-output-display}\n![](images/par.png){width=70%}\n:::\n:::\n\n\n\n\n## Lots of parameter options\n\nMany more parameters can be set, either in the 'global' settings via `par()` or as arguments to a specific plotting function. 
\n\n\n\n::: {.cell}\n\n```{.r .cell-code}\n?par\n```\n:::\n\nSet or Query Graphical Parameters\n\nDescription:\n\n     'par' can be used to set or query graphical parameters.\n     Parameters can be set by specifying them as arguments to 'par' in\n     'tag = value' form, or by passing them as a list of tagged values.\n\nUsage:\n\n     par(..., no.readonly = FALSE)\n     \n     <highlevel plot> (...., <tag> = <value>)\n     \nArguments:\n\n     ...: arguments in 'tag = value' form, a single list of tagged\n          values, or character vectors of parameter names. Supported\n          parameters are described in the 'Graphical Parameters'\n          section.\n\nno.readonly: logical; if 'TRUE' and there are no other arguments, only\n          parameters are returned which can be set by a subsequent\n          'par()' call _on the same device_.\n\nDetails:\n\n     Each device has its own set of graphical parameters.  If the\n     current device is the null device, 'par' will open a new device\n     before querying/setting parameters.  (What device is controlled by\n     'options(\"device\")'.)\n\n     Parameters are queried by giving one or more character vectors of\n     parameter names to 'par'.\n\n     'par()' (no arguments) or 'par(no.readonly = TRUE)' is used to get\n     _all_ the graphical parameters (as a named list).  Their names are\n     currently taken from the unexported variable 'graphics:::.Pars'.\n\n     _*R.O.*_ indicates _*read-only arguments*_: These may only be used\n     in queries and cannot be set.  
('\"cin\"', '\"cra\"', '\"csi\"',\n     '\"cxy\"', '\"din\"' and '\"page\"' are always read-only.)\n\n     Several parameters can only be set by a call to 'par()':\n\n        • '\"ask\"',\n\n        • '\"fig\"', '\"fin\"',\n\n        • '\"lheight\"',\n\n        • '\"mai\"', '\"mar\"', '\"mex\"', '\"mfcol\"', '\"mfrow\"', '\"mfg\"',\n\n        • '\"new\"',\n\n        • '\"oma\"', '\"omd\"', '\"omi\"',\n\n        • '\"pin\"', '\"plt\"', '\"ps\"', '\"pty\"',\n\n        • '\"usr\"',\n\n        • '\"xlog\"', '\"ylog\"',\n\n        • '\"ylbias\"'\n\n     The remaining parameters can also be set as arguments (often via\n     '...') to high-level plot functions such as 'plot.default',\n     'plot.window', 'points', 'lines', 'abline', 'axis', 'title',\n     'text', 'mtext', 'segments', 'symbols', 'arrows', 'polygon',\n     'rect', 'box', 'contour', 'filled.contour' and 'image'.  Such\n     settings will be active during the execution of the function,\n     only.  However, see the comments on 'bg', 'cex', 'col', 'lty',\n     'lwd' and 'pch' which may be taken as _arguments_ to certain plot\n     functions rather than as graphical parameters.\n\n     The meaning of 'character size' is not well-defined: this is set\n     up for the device taking 'pointsize' into account but often not\n     the actual font family in use.  Internally the corresponding pars\n     ('cra', 'cin', 'cxy' and 'csi') are used only to set the\n     inter-line spacing used to convert 'mar' and 'oma' to physical\n     margins.  (The same inter-line spacing multiplied by 'lheight' is\n     used for multi-line strings in 'text' and 'strheight'.)\n\n     Note that graphical parameters are suggestions: plotting functions\n     and devices need not make use of them (and this is particularly\n     true of non-default methods for e.g. 'plot').\n\nValue:\n\n     When parameters are set, their previous values are returned in an\n     invisible named list.  
Such a list can be passed as an argument to\n     'par' to restore the parameter values.  Use 'par(no.readonly =\n     TRUE)' for the full list of parameters that can be restored.\n     However, restoring all of these is not wise: see the 'Note'\n     section.\n\n     When just one parameter is queried, the value of that parameter is\n     returned as (atomic) vector.  When two or more parameters are\n     queried, their values are returned in a list, with the list names\n     giving the parameters.\n\n     Note the inconsistency: setting one parameter returns a list, but\n     querying one parameter returns a vector.\n\nGraphical Parameters:\n\n     'adj' The value of 'adj' determines the way in which text strings\n          are justified in 'text', 'mtext' and 'title'.  A value of '0'\n          produces left-justified text, '0.5' (the default) centered\n          text and '1' right-justified text.  (Any value in [0, 1] is\n          allowed, and on most devices values outside that interval\n          will also work.)\n\n          Note that the 'adj' _argument_ of 'text' also allows 'adj =\n          c(x, y)' for different adjustment in x- and y- directions.\n          Note that whereas for 'text' it refers to positioning of text\n          about a point, for 'mtext' and 'title' it controls placement\n          within the plot or device region.\n\n     'ann' If set to 'FALSE', high-level plotting functions calling\n          'plot.default' do not annotate the plots they produce with\n          axis titles and overall titles.  The default is to do\n          annotation.\n\n     'ask' logical.  If 'TRUE' (and the R session is interactive) the\n          user is asked for input, before a new figure is drawn.  As\n          this applies to the device, it also affects output by\n          packages 'grid' and 'lattice'.  
It can be set even on\n          non-screen devices but may have no effect there.\n\n          This not really a graphics parameter, and its use is\n          deprecated in favour of 'devAskNewPage'.\n\n     'bg' The color to be used for the background of the device region.\n          When called from 'par()' it also sets 'new = FALSE'. See\n          section 'Color Specification' for suitable values.  For many\n          devices the initial value is set from the 'bg' argument of\n          the device, and for the rest it is normally '\"white\"'.\n\n          Note that some graphics functions such as 'plot.default' and\n          'points' have an _argument_ of this name with a different\n          meaning.\n\n     'bty' A character string which determined the type of 'box' which\n          is drawn about plots.  If 'bty' is one of '\"o\"' (the\n          default), '\"l\"', '\"7\"', '\"c\"', '\"u\"', or '\"]\"' the resulting\n          box resembles the corresponding upper case letter.  A value\n          of '\"n\"' suppresses the box.\n\n     'cex' A numerical value giving the amount by which plotting text\n          and symbols should be magnified relative to the default.\n          This starts as '1' when a device is opened, and is reset when\n          the layout is changed, e.g. 
by setting 'mfrow'.\n\n          Note that some graphics functions such as 'plot.default' have\n          an _argument_ of this name which _multiplies_ this graphical\n          parameter, and some functions such as 'points' and 'text'\n          accept a vector of values which are recycled.\n\n     'cex.axis' The magnification to be used for axis annotation\n          relative to the current setting of 'cex'.\n\n     'cex.lab' The magnification to be used for x and y labels relative\n          to the current setting of 'cex'.\n\n     'cex.main' The magnification to be used for main titles relative\n          to the current setting of 'cex'.\n\n     'cex.sub' The magnification to be used for sub-titles relative to\n          the current setting of 'cex'.\n\n     'cin' _*R.O.*_; character size '(width, height)' in inches.  These\n          are the same measurements as 'cra', expressed in different\n          units.\n\n     'col' A specification for the default plotting color.  See section\n          'Color Specification'.\n\n          Some functions such as 'lines' and 'text' accept a vector of\n          values which are recycled and may be interpreted slightly\n          differently.\n\n     'col.axis' The color to be used for axis annotation.  Defaults to\n          '\"black\"'.\n\n     'col.lab' The color to be used for x and y labels.  Defaults to\n          '\"black\"'.\n\n     'col.main' The color to be used for plot main titles.  Defaults to\n          '\"black\"'.\n\n     'col.sub' The color to be used for plot sub-titles.  Defaults to\n          '\"black\"'.\n\n     'cra' _*R.O.*_; size of default character '(width, height)' in\n          'rasters' (pixels).  
Some devices have no concept of pixels\n          and so assume an arbitrary pixel size, usually 1/72 inch.\n          These are the same measurements as 'cin', expressed in\n          different units.\n\n     'crt' A numerical value specifying (in degrees) how single\n          characters should be rotated.  It is unwise to expect values\n          other than multiples of 90 to work.  Compare with 'srt' which\n          does string rotation.\n\n     'csi' _*R.O.*_; height of (default-sized) characters in inches.\n          The same as 'par(\"cin\")[2]'.\n\n     'cxy' _*R.O.*_; size of default character '(width, height)' in\n          user coordinate units.  'par(\"cxy\")' is\n          'par(\"cin\")/par(\"pin\")' scaled to user coordinates.  Note\n          that 'c(strwidth(ch), strheight(ch))' for a given string 'ch'\n          is usually much more precise.\n\n     'din' _*R.O.*_; the device dimensions, '(width, height)', in\n          inches.  See also 'dev.size', which is updated immediately\n          when an on-screen device windows is re-sized.\n\n     'err' (_Unimplemented_; R is silent when points outside the plot\n          region are _not_ plotted.)  The degree of error reporting\n          desired.\n\n     'family' The name of a font family for drawing text.  The maximum\n          allowed length is 200 bytes.  This name gets mapped by each\n          graphics device to a device-specific font description.  The\n          default value is '\"\"' which means that the default device\n          fonts will be used (and what those are should be listed on\n          the help page for the device).  Standard values are\n          '\"serif\"', '\"sans\"' and '\"mono\"', and the Hershey font\n          families are also available.  (Devices may define others, and\n          some devices will ignore this setting completely.  Names\n          starting with '\"Hershey\"' are treated specially and should\n          only be used for the built-in Hershey font families.) 
 This\n          can be specified inline for 'text'.\n\n     'fg' The color to be used for the foreground of plots.  This is\n          the default color used for things like axes and boxes around\n          plots.  When called from 'par()' this also sets parameter\n          'col' to the same value.  See section 'Color Specification'.\n          A few devices have an argument to set the initial value,\n          which is otherwise '\"black\"'.\n\n     'fig' A numerical vector of the form 'c(x1, x2, y1, y2)' which\n          gives the (NDC) coordinates of the figure region in the\n          display region of the device. If you set this, unlike S, you\n          start a new plot, so to add to an existing plot use 'new =\n          TRUE' as well.\n\n     'fin' The figure region dimensions, '(width, height)', in inches.\n          If you set this, unlike S, you start a new plot.\n\n     'font' An integer which specifies which font to use for text.  If\n          possible, device drivers arrange so that 1 corresponds to\n          plain text (the default), 2 to bold face, 3 to italic and 4\n          to bold italic.  Also, font 5 is expected to be the symbol\n          font, in Adobe symbol encoding.  On some devices font\n          families can be selected by 'family' to choose different sets\n          of 5 fonts.\n\n     'font.axis' The font to be used for axis annotation.\n\n     'font.lab' The font to be used for x and y labels.\n\n     'font.main' The font to be used for plot main titles.\n\n     'font.sub' The font to be used for plot sub-titles.\n\n     'lab' A numerical vector of the form 'c(x, y, len)' which modifies\n          the default way that axes are annotated.  The values of 'x'\n          and 'y' give the (approximate) number of tickmarks on the x\n          and y axes and 'len' specifies the label length.  The default\n          is 'c(5, 5, 7)'.  
'len' _is unimplemented_ in R.\n\n     'las' numeric in {0,1,2,3}; the style of axis labels.\n\n          0: always parallel to the axis [_default_],\n\n          1: always horizontal,\n\n          2: always perpendicular to the axis,\n\n          3: always vertical.\n\n          Also supported by 'mtext'.  Note that string/character\n          rotation _via_ argument 'srt' to 'par' does _not_ affect the\n          axis labels.\n\n     'lend' The line end style.  This can be specified as an integer or\n          string:\n\n          '0' and '\"round\"' mean rounded line caps [_default_];\n\n          '1' and '\"butt\"' mean butt line caps;\n\n          '2' and '\"square\"' mean square line caps.\n\n     'lheight' The line height multiplier.  The height of a line of\n          text (used to vertically space multi-line text) is found by\n          multiplying the character height both by the current\n          character expansion and by the line height multiplier.\n          Default value is 1.  Used in 'text' and 'strheight'.\n\n     'ljoin' The line join style.  This can be specified as an integer\n          or string:\n\n          '0' and '\"round\"' mean rounded line joins [_default_];\n\n          '1' and '\"mitre\"' mean mitred line joins;\n\n          '2' and '\"bevel\"' mean bevelled line joins.\n\n     'lmitre' The line mitre limit.  This controls when mitred line\n          joins are automatically converted into bevelled line joins.\n          The value must be larger than 1 and the default is 10.  Not\n          all devices will honour this setting.\n\n     'lty' The line type.  
Line types can either be specified as an\n          integer (0=blank, 1=solid (default), 2=dashed, 3=dotted,\n          4=dotdash, 5=longdash, 6=twodash) or as one of the character\n          strings '\"blank\"', '\"solid\"', '\"dashed\"', '\"dotted\"',\n          '\"dotdash\"', '\"longdash\"', or '\"twodash\"', where '\"blank\"'\n          uses 'invisible lines' (i.e., does not draw them).\n\n          Alternatively, a string of up to 8 characters (from 'c(1:9,\n          \"A\":\"F\")') may be given, giving the length of line segments\n          which are alternatively drawn and skipped.  See section 'Line\n          Type Specification'.\n\n          Functions such as 'lines' and 'segments' accept a vector of\n          values which are recycled.\n\n     'lwd' The line width, a _positive_ number, defaulting to '1'.  The\n          interpretation is device-specific, and some devices do not\n          implement line widths less than one.  (See the help on the\n          device for details of the interpretation.)\n\n          Functions such as 'lines' and 'segments' accept a vector of\n          values which are recycled: in such uses lines corresponding\n          to values 'NA' or 'NaN' are omitted.  The interpretation of\n          '0' is device-specific.\n\n     'mai' A numerical vector of the form 'c(bottom, left, top, right)'\n          which gives the margin size specified in inches.\n\n     'mar' A numerical vector of the form 'c(bottom, left, top, right)'\n          which gives the number of lines of margin to be specified on\n          the four sides of the plot.  The default is 'c(5, 4, 4, 2) +\n          0.1'.\n\n     'mex' 'mex' is a character size expansion factor which is used to\n          describe coordinates in the margins of plots. 
Note that this\n          does not change the font size, rather specifies the size of\n          font (as a multiple of 'csi') used to convert between 'mar'\n          and 'mai', and between 'oma' and 'omi'.\n\n          This starts as '1' when the device is opened, and is reset\n          when the layout is changed (alongside resetting 'cex').\n\n     'mfcol, mfrow' A vector of the form 'c(nr, nc)'.  Subsequent\n          figures will be drawn in an 'nr'-by-'nc' array on the device\n          by _columns_ ('mfcol'), or _rows_ ('mfrow'), respectively.\n\n          In a layout with exactly two rows and columns the base value\n          of '\"cex\"' is reduced by a factor of 0.83: if there are three\n          or more of either rows or columns, the reduction factor is\n          0.66.\n\n          Setting a layout resets the base value of 'cex' and that of\n          'mex' to '1'.\n\n          If either of these is queried it will give the current\n          layout, so querying cannot tell you the order in which the\n          array will be filled.\n\n          Consider the alternatives, 'layout' and 'split.screen'.\n\n     'mfg' A numerical vector of the form 'c(i, j)' where 'i' and 'j'\n          indicate which figure in an array of figures is to be drawn\n          next (if setting) or is being drawn (if enquiring).  The\n          array must already have been set by 'mfcol' or 'mfrow'.\n\n          For compatibility with S, the form 'c(i, j, nr, nc)' is also\n          accepted, when 'nr' and 'nc' should be the current number of\n          rows and number of columns.  Mismatches will be ignored, with\n          a warning.\n\n     'mgp' The margin line (in 'mex' units) for the axis title, axis\n          labels and axis line.  Note that 'mgp[1]' affects 'title'\n          whereas 'mgp[2:3]' affect 'axis'.  The default is 'c(3, 1,\n          0)'.\n\n     'mkh' The height in inches of symbols to be drawn when the value\n          of 'pch' is an integer. 
_Completely ignored in R_.\n\n     'new' logical, defaulting to 'FALSE'.  If set to 'TRUE', the next\n          high-level plotting command (actually 'plot.new') should _not\n          clean_ the frame before drawing _as if it were on a *_new_*\n          device_.  It is an error (ignored with a warning) to try to\n          use 'new = TRUE' on a device that does not currently contain\n          a high-level plot.\n\n     'oma' A vector of the form 'c(bottom, left, top, right)' giving\n          the size of the outer margins in lines of text.\n\n     'omd' A vector of the form 'c(x1, x2, y1, y2)' giving the region\n          _inside_ outer margins in NDC (= normalized device\n          coordinates), i.e., as a fraction (in [0, 1]) of the device\n          region.\n\n     'omi' A vector of the form 'c(bottom, left, top, right)' giving\n          the size of the outer margins in inches.\n\n     'page' _*R.O.*_; A boolean value indicating whether the next call\n          to 'plot.new' is going to start a new page.  This value may\n          be 'FALSE' if there are multiple figures on the page.\n\n     'pch' Either an integer specifying a symbol or a single character\n          to be used as the default in plotting points.  See 'points'\n          for possible values and their interpretation.  Note that only\n          integers and single-character strings can be set as a\n          graphics parameter (and not 'NA' nor 'NULL').\n\n          Some functions such as 'points' accept a vector of values\n          which are recycled.\n\n     'pin' The current plot dimensions, '(width, height)', in inches.\n\n     'plt' A vector of the form 'c(x1, x2, y1, y2)' giving the\n          coordinates of the plot region as fractions of the current\n          figure region.\n\n     'ps' integer; the point size of text (but not symbols).  
Unlike\n          the 'pointsize' argument of most devices, this does not\n          change the relationship between 'mar' and 'mai' (nor 'oma'\n          and 'omi').\n\n          What is meant by 'point size' is device-specific, but most\n          devices mean a multiple of 1bp, that is 1/72 of an inch.\n\n     'pty' A character specifying the type of plot region to be used;\n          '\"s\"' generates a square plotting region and '\"m\"' generates\n          the maximal plotting region.\n\n     'smo' (_Unimplemented_) a value which indicates how smooth circles\n          and circular arcs should be.\n\n     'srt' The string rotation in degrees.  See the comment about\n          'crt'.  Only supported by 'text'.\n\n     'tck' The length of tick marks as a fraction of the smaller of the\n          width or height of the plotting region.  If 'tck >= 0.5' it\n          is interpreted as a fraction of the relevant side, so if 'tck\n          = 1' grid lines are drawn.  The default setting ('tck = NA')\n          is to use 'tcl = -0.5'.\n\n     'tcl' The length of tick marks as a fraction of the height of a\n          line of text.  The default value is '-0.5'; setting 'tcl =\n          NA' sets 'tck = -0.01' which is S' default.\n\n     'usr' A vector of the form 'c(x1, x2, y1, y2)' giving the extremes\n          of the user coordinates of the plotting region.  When a\n          logarithmic scale is in use (i.e., 'par(\"xlog\")' is true, see\n          below), then the x-limits will be '10 ^ par(\"usr\")[1:2]'.\n          Similarly for the y-axis.\n\n     'xaxp' A vector of the form 'c(x1, x2, n)' giving the coordinates\n          of the extreme tick marks and the number of intervals between\n          tick-marks when 'par(\"xlog\")' is false.  
Otherwise, when\n          _log_ coordinates are active, the three values have a\n          different meaning: For a small range, 'n' is _negative_, and\n          the ticks are as in the linear case, otherwise, 'n' is in\n          '1:3', specifying a case number, and 'x1' and 'x2' are the\n          lowest and highest power of 10 inside the user coordinates,\n          '10 ^ par(\"usr\")[1:2]'. (The '\"usr\"' coordinates are\n          log10-transformed here!)\n\n          n = 1 will produce tick marks at 10^j for integer j,\n\n          n = 2 gives marks k 10^j with k in {1,5},\n\n          n = 3 gives marks k 10^j with k in {1,2,5}.\n\n          See 'axTicks()' for a pure R implementation of this.\n\n          This parameter is reset when a user coordinate system is set\n          up, for example by starting a new page or by calling\n          'plot.window' or setting 'par(\"usr\")': 'n' is taken from\n          'par(\"lab\")'.  It affects the default behaviour of subsequent\n          calls to 'axis' for sides 1 or 3.\n\n          It is only relevant to default numeric axis systems, and not\n          for example to dates.\n\n     'xaxs' The style of axis interval calculation to be used for the\n          x-axis.  Possible values are '\"r\"', '\"i\"', '\"e\"', '\"s\"',\n          '\"d\"'.  
The styles are generally controlled by the range of\n          data or 'xlim', if given.\n          Style '\"r\"' (regular) first extends the data range by 4\n          percent at each end and then finds an axis with pretty labels\n          that fits within the extended range.\n          Style '\"i\"' (internal) just finds an axis with pretty labels\n          that fits within the original data range.\n          Style '\"s\"' (standard) finds an axis with pretty labels\n          within which the original data range fits.\n          Style '\"e\"' (extended) is like style '\"s\"', except that it is\n          also ensures that there is room for plotting symbols within\n          the bounding box.\n          Style '\"d\"' (direct) specifies that the current axis should\n          be used on subsequent plots.\n          (_Only '\"r\"' and '\"i\"' styles have been implemented in R._)\n\n     'xaxt' A character which specifies the x axis type.  Specifying\n          '\"n\"' suppresses plotting of the axis.  The standard value is\n          '\"s\"': for compatibility with S values '\"l\"' and '\"t\"' are\n          accepted but are equivalent to '\"s\"': any value other than\n          '\"n\"' implies plotting.\n\n     'xlog' A logical value (see 'log' in 'plot.default').  If 'TRUE',\n          a logarithmic scale is in use (e.g., after 'plot(*, log =\n          \"x\")').  For a new device, it defaults to 'FALSE', i.e.,\n          linear scale.\n\n     'xpd' A logical value or 'NA'.  If 'FALSE', all plotting is\n          clipped to the plot region, if 'TRUE', all plotting is\n          clipped to the figure region, and if 'NA', all plotting is\n          clipped to the device region.  
See also 'clip'.\n\n     'yaxp' A vector of the form 'c(y1, y2, n)' giving the coordinates\n          of the extreme tick marks and the number of intervals between\n          tick-marks unless for log coordinates, see 'xaxp' above.\n\n     'yaxs' The style of axis interval calculation to be used for the\n          y-axis.  See 'xaxs' above.\n\n     'yaxt' A character which specifies the y axis type.  Specifying\n          '\"n\"' suppresses plotting.\n\n     'ylbias' A positive real value used in the positioning of text in\n          the margins by 'axis' and 'mtext'.  The default is in\n          principle device-specific, but currently '0.2' for all of R's\n          own devices.  Set this to '0.2' for compatibility with R <\n          2.14.0 on 'x11' and 'windows()' devices.\n\n     'ylog' A logical value; see 'xlog' above.\n\nColor Specification:\n\n     Colors can be specified in several different ways. The simplest\n     way is with a character string giving the color name (e.g.,\n     '\"red\"').  A list of the possible colors can be obtained with the\n     function 'colors'.  Alternatively, colors can be specified\n     directly in terms of their RGB components with a string of the\n     form '\"#RRGGBB\"' where each of the pairs 'RR', 'GG', 'BB' consist\n     of two hexadecimal digits giving a value in the range '00' to\n     'FF'.  Colors can also be specified by giving an index into a\n     small table of colors, the 'palette': indices wrap round so with\n     the default palette of size 8, '10' is the same as '2'.  This\n     provides compatibility with S.  Index '0' corresponds to the\n     background color.  Note that the palette (apart from '0' which is\n     per-device) is a per-session setting.\n\n     Negative integer colours are errors.\n\n     Additionally, '\"transparent\"' is _transparent_, useful for filled\n     areas (such as the background!), and just invisible for things\n     like lines or text.  
In most circumstances (integer) 'NA' is\n     equivalent to '\"transparent\"' (but not for 'text' and 'mtext').\n\n     Semi-transparent colors are available for use on devices that\n     support them.\n\n     The functions 'rgb', 'hsv', 'hcl', 'gray' and 'rainbow' provide\n     additional ways of generating colors.\n\nLine Type Specification:\n\n     Line types can either be specified by giving an index into a small\n     built-in table of line types (1 = solid, 2 = dashed, etc, see\n     'lty' above) or directly as the lengths of on/off stretches of\n     line.  This is done with a string of an even number (up to eight)\n     of characters, namely _non-zero_ (hexadecimal) digits which give\n     the lengths in consecutive positions in the string.  For example,\n     the string '\"33\"' specifies three units on followed by three off\n     and '\"3313\"' specifies three units on followed by three off\n     followed by one on and finally three off.  The 'units' here are\n     (on most devices) proportional to 'lwd', and with 'lwd = 1' are in\n     pixels or points or 1/96 inch.\n\n     The five standard dash-dot line types ('lty = 2:6') correspond to\n     'c(\"44\", \"13\", \"1343\", \"73\", \"2262\")'.\n\n     Note that 'NA' is not a valid value for 'lty'.\n\nNote:\n\n     The effect of restoring all the (settable) graphics parameters as\n     in the examples is hard to predict if the device has been resized.\n     Several of them are attempting to set the same things in different\n     ways, and those last in the alphabet will win.  In particular, the\n     settings of 'mai', 'mar', 'pin', 'plt' and 'pty' interact, as do\n     the outer margin settings, the figure layout and figure region\n     size.\n\nReferences:\n\n     Becker, R. A., Chambers, J. M. and Wilks, A. R. (1988) _The New S\n     Language_.  Wadsworth & Brooks/Cole.\n\n     Murrell, P. (2005) _R Graphics_. 
Chapman & Hall/CRC Press.\n\nSee Also:\n\n     'plot.default' for some high-level plotting parameters; 'colors';\n     'clip'; 'options' for other setup parameters; graphic devices\n     'x11', 'postscript' and setting up device regions by 'layout' and\n     'split.screen'.\n\nExamples:\n\n     op <- par(mfrow = c(2, 2), # 2 x 2 pictures on one plot\n               pty = \"s\")       # square plotting region,\n                                # independent of device size\n     \n     ## At end of plotting, reset to previous settings:\n     par(op)\n     \n     ## Alternatively,\n     op <- par(no.readonly = TRUE) # the whole list of settable par's.\n     ## do lots of plotting and par(.) calls, then reset:\n     par(op)\n     ## Note this is not in general good practice\n     \n     par(\"ylog\") # FALSE\n     plot(1 : 12, log = \"y\")\n     par(\"ylog\") # TRUE\n     \n     plot(1:2, xaxs = \"i\") # 'inner axis' w/o extra space\n     par(c(\"usr\", \"xaxp\"))\n     \n     ( nr.prof <-\n     c(prof.pilots = 16, lawyers = 11, farmers = 10, salesmen = 9, physicians = 9,\n       mechanics = 6, policemen = 6, managers = 6, engineers = 5, teachers = 4,\n       housewives = 3, students = 3, armed.forces = 1))\n     par(las = 3)\n     barplot(rbind(nr.prof)) # R 0.63.2: shows alignment problem\n     par(las = 0)  # reset to default\n     \n     require(grDevices) # for gray\n     ## 'fg' use:\n     plot(1:12, type = \"b\", main = \"'fg' : axes, ticks and box in gray\",\n          fg = gray(0.7), bty = \"7\" , sub = R.version.string)\n     \n     ex <- function() {\n        old.par <- par(no.readonly = TRUE) # all par settings which\n                                           # could be changed.\n        on.exit(par(old.par))\n        ## ...\n        ## ... do lots of par() settings and plots\n        ## ...\n        invisible() #-- now,  par(old.par)  will be executed\n     }\n     ex()\n     \n     ## Line types\n     showLty <- function(ltys, xoff = 0, ...) 
{\n        stopifnot((n <- length(ltys)) >= 1)\n        op <- par(mar = rep(.5,4)); on.exit(par(op))\n        plot(0:1, 0:1, type = \"n\", axes = FALSE, ann = FALSE)\n        y <- (n:1)/(n+1)\n        clty <- as.character(ltys)\n        mytext <- function(x, y, txt)\n           text(x, y, txt, adj = c(0, -.3), cex = 0.8, ...)\n        abline(h = y, lty = ltys, ...); mytext(xoff, y, clty)\n        y <- y - 1/(3*(n+1))\n        abline(h = y, lty = ltys, lwd = 2, ...)\n        mytext(1/8+xoff, y, paste(clty,\" lwd = 2\"))\n     }\n     showLty(c(\"solid\", \"dashed\", \"dotted\", \"dotdash\", \"longdash\", \"twodash\"))\n     par(new = TRUE)  # the same:\n     showLty(c(\"solid\", \"44\", \"13\", \"1343\", \"73\", \"2262\"), xoff = .2, col = 2)\n     showLty(c(\"11\", \"22\", \"33\", \"44\",   \"12\", \"13\", \"14\",   \"21\", \"31\"))\n\n\n\n## Common parameter options\n\nEight useful parameter arguments help improve the readability of the plot:\n\n- `xlab`: specifies the x-axis label of the plot\n- `ylab`: specifies the y-axis label\n- `main`: titles your graph\n- `pch`: specifies the symbology of your graph\n- `lty`: specifies the line type of your graph\n- `lwd`: specifies line thickness\n- `cex`: specifies size\n- `col`: specifies the colors for your graph.\n\nWe will explore use of these arguments below.\n\n## Common parameter options\n\n\n\n::: {.cell}\n::: {.cell-output-display}\n![](images/atrributes.png){width=200%}\n:::\n:::\n\n\n\n\n## 2. Plot Attributes\n\nPlot attributes are those that map your data to the plot. This means this is where you specify which variables in the data frame you want to plot. 
\n\nWe will only look at four types of plots today:\n\n- `hist()` displays a histogram of one variable\n- `plot()` displays an x-y plot of two variables\n- `boxplot()` displays a boxplot\n- `barplot()` displays a barplot\n\n\n## `hist()` Help File\n\n\n\n::: {.cell}\n\n```{.r .cell-code}\n?hist\n```\n:::\n\nHistograms\n\nDescription:\n\n     The generic function 'hist' computes a histogram of the given data\n     values.  If 'plot = TRUE', the resulting object of class\n     '\"histogram\"' is plotted by 'plot.histogram', before it is\n     returned.\n\nUsage:\n\n     hist(x, ...)\n     \n     ## Default S3 method:\n     hist(x, breaks = \"Sturges\",\n          freq = NULL, probability = !freq,\n          include.lowest = TRUE, right = TRUE, fuzz = 1e-7,\n          density = NULL, angle = 45, col = \"lightgray\", border = NULL,\n          main = paste(\"Histogram of\" , xname),\n          xlim = range(breaks), ylim = NULL,\n          xlab = xname, ylab,\n          axes = TRUE, plot = TRUE, labels = FALSE,\n          nclass = NULL, warn.unused = TRUE, ...)\n     \nArguments:\n\n       x: a vector of values for which the histogram is desired.\n\n  breaks: one of:\n\n            • a vector giving the breakpoints between histogram cells,\n\n            • a function to compute the vector of breakpoints,\n\n            • a single number giving the number of cells for the\n              histogram,\n\n            • a character string naming an algorithm to compute the\n              number of cells (see 'Details'),\n\n            • a function to compute the number of cells.\n\n          In the last three cases the number is a suggestion only; as\n          the breakpoints will be set to 'pretty' values, the number is\n          limited to '1e6' (with a warning if it was larger).  
If\n          'breaks' is a function, the 'x' vector is supplied to it as\n          the only argument (and the number of breaks is only limited\n          by the amount of available memory).\n\n    freq: logical; if 'TRUE', the histogram graphic is a representation\n          of frequencies, the 'counts' component of the result; if\n          'FALSE', probability densities, component 'density', are\n          plotted (so that the histogram has a total area of one).\n          Defaults to 'TRUE' _if and only if_ 'breaks' are equidistant\n          (and 'probability' is not specified).\n\nprobability: an _alias_ for '!freq', for S compatibility.\n\ninclude.lowest: logical; if 'TRUE', an 'x[i]' equal to the 'breaks'\n          value will be included in the first (or last, for 'right =\n          FALSE') bar.  This will be ignored (with a warning) unless\n          'breaks' is a vector.\n\n   right: logical; if 'TRUE', the histogram cells are right-closed\n          (left open) intervals.\n\n    fuzz: non-negative number, for the case when the data is \"pretty\"\n          and some observations 'x[.]' are close but not exactly on a\n          'break'.  For counting fuzzy breaks proportional to 'fuzz'\n          are used.  The default is occasionally suboptimal.\n\n density: the density of shading lines, in lines per inch.  The default\n          value of 'NULL' means that no shading lines are drawn.\n          Non-positive values of 'density' also inhibit the drawing of\n          shading lines.\n\n   angle: the slope of shading lines, given as an angle in degrees\n          (counter-clockwise).\n\n     col: a colour to be used to fill the bars.\n\n  border: the color of the border around the bars.  
The default is to\n          use the standard foreground color.\n\nmain, xlab, ylab: main title and axis labels: these arguments to\n          'title()' get \"smart\" defaults here, e.g., the default 'ylab'\n          is '\"Frequency\"' iff 'freq' is true.\n\nxlim, ylim: the range of x and y values with sensible defaults.  Note\n          that 'xlim' is _not_ used to define the histogram (breaks),\n          but only for plotting (when 'plot = TRUE').\n\n    axes: logical.  If 'TRUE' (default), axes are draw if the plot is\n          drawn.\n\n    plot: logical.  If 'TRUE' (default), a histogram is plotted,\n          otherwise a list of breaks and counts is returned.  In the\n          latter case, a warning is used if (typically graphical)\n          arguments are specified that only apply to the 'plot = TRUE'\n          case.\n\n  labels: logical or character string.  Additionally draw labels on top\n          of bars, if not 'FALSE'; see 'plot.histogram'.\n\n  nclass: numeric (integer).  For S(-PLUS) compatibility only, 'nclass'\n          is equivalent to 'breaks' for a scalar or character argument.\n\nwarn.unused: logical.  If 'plot = FALSE' and 'warn.unused = TRUE', a\n          warning will be issued when graphical parameters are passed\n          to 'hist.default()'.\n\n     ...: further arguments and graphical parameters passed to\n          'plot.histogram' and thence to 'title' and 'axis' (if 'plot =\n          TRUE').\n\nDetails:\n\n     The definition of _histogram_ differs by source (with\n     country-specific biases).  R's default with equi-spaced breaks\n     (also the default) is to plot the counts in the cells defined by\n     'breaks'.  
Thus the height of a rectangle is proportional to the\n     number of points falling into the cell, as is the area _provided_\n     the breaks are equally-spaced.\n\n     The default with non-equi-spaced breaks is to give a plot of area\n     one, in which the _area_ of the rectangles is the fraction of the\n     data points falling in the cells.\n\n     If 'right = TRUE' (default), the histogram cells are intervals of\n     the form (a, b], i.e., they include their right-hand endpoint, but\n     not their left one, with the exception of the first cell when\n     'include.lowest' is 'TRUE'.\n\n     For 'right = FALSE', the intervals are of the form [a, b), and\n     'include.lowest' means '_include highest_'.\n\n     A numerical tolerance of 1e-7 times the median bin size (for more\n     than four bins, otherwise the median is substituted) is applied\n     when counting entries on the edges of bins.  This is not included\n     in the reported 'breaks' nor in the calculation of 'density'.\n\n     The default for 'breaks' is '\"Sturges\"': see 'nclass.Sturges'.\n     Other names for which algorithms are supplied are '\"Scott\"' and\n     '\"FD\"' / '\"Freedman-Diaconis\"' (with corresponding functions\n     'nclass.scott' and 'nclass.FD').  Case is ignored and partial\n     matching is used.  Alternatively, a function can be supplied which\n     will compute the intended number of breaks or the actual\n     breakpoints as a function of 'x'.\n\nValue:\n\n     an object of class '\"histogram\"' which is a list with components:\n\n  breaks: the n+1 cell boundaries (= 'breaks' if that was a vector).\n          These are the nominal breaks, not with the boundary fuzz.\n\n  counts: n integers; for each cell, the number of 'x[]' inside.\n\n density: values f^(x[i]), as estimated density values. 
If\n          'all(diff(breaks) == 1)', they are the relative frequencies\n          'counts/n' and in general satisfy sum[i; f^(x[i])\n          (b[i+1]-b[i])] = 1, where b[i] = 'breaks[i]'.\n\n    mids: the n cell midpoints.\n\n   xname: a character string with the actual 'x' argument name.\n\nequidist: logical, indicating if the distances between 'breaks' are all\n          the same.\n\nReferences:\n\n     Becker, R. A., Chambers, J. M. and Wilks, A. R. (1988) _The New S\n     Language_.  Wadsworth & Brooks/Cole.\n\n     Venables, W. N. and Ripley. B. D. (2002) _Modern Applied\n     Statistics with S_.  Springer.\n\nSee Also:\n\n     'nclass.Sturges', 'stem', 'density', 'truehist' in package 'MASS'.\n\n     Typical plots with vertical bars are _not_ histograms.  Consider\n     'barplot' or 'plot(*, type = \"h\")' for such bar plots.\n\nExamples:\n\n     op <- par(mfrow = c(2, 2))\n     hist(islands)\n     utils::str(hist(islands, col = \"gray\", labels = TRUE))\n     \n     hist(sqrt(islands), breaks = 12, col = \"lightblue\", border = \"pink\")\n     ##-- For non-equidistant breaks, counts should NOT be graphed unscaled:\n     r <- hist(sqrt(islands), breaks = c(4*0:5, 10*3:5, 70, 100, 140),\n               col = \"blue1\")\n     text(r$mids, r$density, r$counts, adj = c(.5, -.5), col = \"blue3\")\n     sapply(r[2:3], sum)\n     sum(r$density * diff(r$breaks)) # == 1\n     lines(r, lty = 3, border = \"purple\") # -> lines.histogram(*)\n     par(op)\n     \n     require(utils) # for str\n     str(hist(islands, breaks = 12, plot =  FALSE)) #-> 10 (~= 12) breaks\n     str(hist(islands, breaks = c(12,20,36,80,200,1000,17000), plot = FALSE))\n     \n     hist(islands, breaks = c(12,20,36,80,200,1000,17000), freq = TRUE,\n          main = \"WRONG histogram\") # and warning\n     \n     ## Extreme outliers; the \"FD\" rule would take very large number of 'breaks':\n     XXL <- c(1:9, c(-1,1)*1e300)\n     hh <- hist(XXL, \"FD\") # did not work in R <= 3.4.1; now gives 
warning\n     ## pretty() determines how many counts are used (platform dependently!):\n     length(hh$breaks) ## typically 1 million -- though 1e6 was \"a suggestion only\"\n     \n     ## R >= 4.2.0: no \"*.5\" labels on y-axis:\n     hist(c(2,3,3,5,5,6,6,6,7))\n     \n     require(stats)\n     set.seed(14)\n     x <- rchisq(100, df = 4)\n     \n     ## Histogram with custom x-axis:\n     hist(x, xaxt = \"n\")\n     axis(1, at = 0:17)\n     \n     \n     ## Comparing data with a model distribution should be done with qqplot()!\n     qqplot(x, qchisq(ppoints(x), df = 4)); abline(0, 1, col = 2, lty = 2)\n     \n     ## if you really insist on using hist() ... :\n     hist(x, freq = FALSE, ylim = c(0, 0.2))\n     curve(dchisq(x, df = 4), col = 2, lty = 2, lwd = 2, add = TRUE)\n\n\n\n## `hist()` example\n\nReminder:\n```\nhist(x, breaks = \"Sturges\",\n     freq = NULL, probability = !freq,\n     include.lowest = TRUE, right = TRUE, fuzz = 1e-7,\n     density = NULL, angle = 45, col = \"lightgray\", border = NULL,\n     main = paste(\"Histogram of\" , xname),\n     xlim = range(breaks), ylim = NULL,\n     xlab = xname, ylab,\n     axes = TRUE, plot = TRUE, labels = FALSE,\n     nclass = NULL, warn.unused = TRUE, ...)\n```\n\nLet's practice:\n\n\n::: {.cell}\n\n```{.r .cell-code}\nhist(df$age)\n```\n\n::: {.cell-output-display}\n![](Module10-DataVisualization_files/figure-revealjs/unnamed-chunk-12-1.png){width=960}\n:::\n\n```{.r .cell-code}\nhist(\n\tdf$age, \n\tfreq=FALSE, \n\tmain=\"Histogram\", \n\txlab=\"Age (years)\"\n\t)\n```\n\n::: {.cell-output-display}\n![](Module10-DataVisualization_files/figure-revealjs/unnamed-chunk-12-2.png){width=960}\n:::\n:::\n\n\n\n\n## `plot()` Help File\n\n\n\n::: {.cell}\n\n```{.r .cell-code}\n?plot\n```\n:::\n\nGeneric X-Y Plotting\n\nDescription:\n\n     Generic function for plotting of R objects.\n\n     For simple scatter plots, 'plot.default' will be used.  
However,\n     there are 'plot' methods for many R objects, including\n     'function's, 'data.frame's, 'density' objects, etc.  Use\n     'methods(plot)' and the documentation for these. Most of these\n     methods are implemented using traditional graphics (the 'graphics'\n     package), but this is not mandatory.\n\n     For more details about graphical parameter arguments used by\n     traditional graphics, see 'par'.\n\nUsage:\n\n     plot(x, y, ...)\n     \nArguments:\n\n       x: the coordinates of points in the plot. Alternatively, a\n          single plotting structure, function or _any R object with a\n          'plot' method_ can be provided.\n\n       y: the y coordinates of points in the plot, _optional_ if 'x' is\n          an appropriate structure.\n\n     ...: Arguments to be passed to methods, such as graphical\n          parameters (see 'par').  Many methods will accept the\n          following arguments:\n\n          'type' what type of plot should be drawn.  Possible types are\n\n                • '\"p\"' for *p*oints,\n\n                • '\"l\"' for *l*ines,\n\n                • '\"b\"' for *b*oth,\n\n                • '\"c\"' for the lines part alone of '\"b\"',\n\n                • '\"o\"' for both '*o*verplotted',\n\n                • '\"h\"' for '*h*istogram' like (or 'high-density')\n                  vertical lines,\n\n                • '\"s\"' for stair *s*teps,\n\n                • '\"S\"' for other *s*teps, see 'Details' below,\n\n                • '\"n\"' for no plotting.\n\n              All other 'type's give a warning or an error; using,\n              e.g., 'type = \"punkte\"' being equivalent to 'type = \"p\"'\n              for S compatibility.  
Note that some methods, e.g.\n              'plot.factor', do not accept this.\n\n          'main' an overall title for the plot: see 'title'.\n\n          'sub' a subtitle for the plot: see 'title'.\n\n          'xlab' a title for the x axis: see 'title'.\n\n          'ylab' a title for the y axis: see 'title'.\n\n          'asp' the y/x aspect ratio, see 'plot.window'.\n\nDetails:\n\n     The two step types differ in their x-y preference: Going from\n     (x1,y1) to (x2,y2) with x1 < x2, 'type = \"s\"' moves first\n     horizontal, then vertical, whereas 'type = \"S\"' moves the other\n     way around.\n\nNote:\n\n     The 'plot' generic was moved from the 'graphics' package to the\n     'base' package in R 4.0.0. It is currently re-exported from the\n     'graphics' namespace to allow packages importing it from there to\n     continue working, but this may change in future versions of R.\n\nSee Also:\n\n     'plot.default', 'plot.formula' and other methods; 'points',\n     'lines', 'par'.  
For thousands of points, consider using\n     'smoothScatter()' instead of 'plot()'.\n\n     For X-Y-Z plotting see 'contour', 'persp' and 'image'.\n\nExamples:\n\n     require(stats) # for lowess, rpois, rnorm\n     require(graphics) # for plot methods\n     plot(cars)\n     lines(lowess(cars))\n     \n     plot(sin, -pi, 2*pi) # see ?plot.function\n     \n     ## Discrete Distribution Plot:\n     plot(table(rpois(100, 5)), type = \"h\", col = \"red\", lwd = 10,\n          main = \"rpois(100, lambda = 5)\")\n     \n     ## Simple quantiles/ECDF, see ecdf() {library(stats)} for a better one:\n     plot(x <- sort(rnorm(47)), type = \"s\", main = \"plot(x, type = \\\"s\\\")\")\n     points(x, cex = .5, col = \"dark red\")\n\n\n\n\n## `plot()` example\n\n\n\n\n::: {.cell}\n\n```{.r .cell-code}\nplot(df$age, df$IgG_concentration)\n```\n\n::: {.cell-output-display}\n![](Module10-DataVisualization_files/figure-revealjs/unnamed-chunk-15-1.png){width=960}\n:::\n\n```{.r .cell-code}\nplot(\n\tdf$age, \n\tdf$IgG_concentration, \n\ttype=\"p\", \n\tmain=\"Age by IgG Concentrations\", \n\txlab=\"Age (years)\", \n\tylab=\"IgG Concentration (mIU/mL)\", \n\tpch=16, \n\tcex=0.9,\n\tcol=\"lightblue\")\n```\n\n::: {.cell-output-display}\n![](Module10-DataVisualization_files/figure-revealjs/unnamed-chunk-15-2.png){width=960}\n:::\n:::\n\n\n\n\n## `boxplot()` Help File\n\n\n\n\n::: {.cell}\n\n```{.r .cell-code}\n?boxplot\n```\n:::\n\nBox Plots\n\nDescription:\n\n     Produce box-and-whisker plot(s) of the given (grouped) values.\n\nUsage:\n\n     boxplot(x, ...)\n     \n     ## S3 method for class 'formula'\n     boxplot(formula, data = NULL, ..., subset, na.action = NULL,\n             xlab = mklab(y_var = horizontal),\n             ylab = mklab(y_var =!horizontal),\n             add = FALSE, ann = !add, horizontal = FALSE,\n             drop = FALSE, sep = \".\", lex.order = FALSE)\n     \n     ## Default S3 method:\n     boxplot(x, ..., range = 1.5, width = NULL, varwidth = FALSE,\n 
            notch = FALSE, outline = TRUE, names, plot = TRUE,\n             border = par(\"fg\"), col = \"lightgray\", log = \"\",\n             pars = list(boxwex = 0.8, staplewex = 0.5, outwex = 0.5),\n              ann = !add, horizontal = FALSE, add = FALSE, at = NULL)\n     \nArguments:\n\n formula: a formula, such as 'y ~ grp', where 'y' is a numeric vector\n          of data values to be split into groups according to the\n          grouping variable 'grp' (usually a factor).  Note that '~ g1\n          + g2' is equivalent to 'g1:g2'.\n\n    data: a data.frame (or list) from which the variables in 'formula'\n          should be taken.\n\n  subset: an optional vector specifying a subset of observations to be\n          used for plotting.\n\nna.action: a function which indicates what should happen when the data\n          contain 'NA's.  The default is to ignore missing values in\n          either the response or the group.\n\nxlab, ylab: x- and y-axis annotation, since R 3.6.0 with a non-empty\n          default.  Can be suppressed by 'ann=FALSE'.\n\n     ann: 'logical' indicating if axes should be annotated (by 'xlab'\n          and 'ylab').\n\ndrop, sep, lex.order: passed to 'split.default', see there.\n\n       x: for specifying data from which the boxplots are to be\n          produced. Either a numeric vector, or a single list\n          containing such vectors. Additional unnamed arguments specify\n          further data as separate vectors (each corresponding to a\n          component boxplot).  
'NA's are allowed in the data.\n\n     ...: For the 'formula' method, named arguments to be passed to the\n          default method.\n\n          For the default method, unnamed arguments are additional data\n          vectors (unless 'x' is a list when they are ignored), and\n          named arguments are arguments and graphical parameters to be\n          passed to 'bxp' in addition to the ones given by argument\n          'pars' (and override those in 'pars'). Note that 'bxp' may or\n          may not make use of graphical parameters it is passed: see\n          its documentation.\n\n   range: this determines how far the plot whiskers extend out from the\n          box.  If 'range' is positive, the whiskers extend to the most\n          extreme data point which is no more than 'range' times the\n          interquartile range from the box. A value of zero causes the\n          whiskers to extend to the data extremes.\n\n   width: a vector giving the relative widths of the boxes making up\n          the plot.\n\nvarwidth: if 'varwidth' is 'TRUE', the boxes are drawn with widths\n          proportional to the square-roots of the number of\n          observations in the groups.\n\n   notch: if 'notch' is 'TRUE', a notch is drawn in each side of the\n          boxes.  If the notches of two plots do not overlap this is\n          'strong evidence' that the two medians differ (Chambers _et\n          al_, 1983, p. 62).  See 'boxplot.stats' for the calculations\n          used.\n\n outline: if 'outline' is not true, the outliers are not drawn (as\n          points whereas S+ uses lines).\n\n   names: group labels which will be printed under each boxplot.  Can\n          be a character vector or an expression (see plotmath).\n\n  boxwex: a scale factor to be applied to all boxes.  
When there are\n          only a few groups, the appearance of the plot can be improved\n          by making the boxes narrower.\n\nstaplewex: staple line width expansion, proportional to box width.\n\n  outwex: outlier line width expansion, proportional to box width.\n\n    plot: if 'TRUE' (the default) then a boxplot is produced.  If not,\n          the summaries which the boxplots are based on are returned.\n\n  border: an optional vector of colors for the outlines of the\n          boxplots.  The values in 'border' are recycled if the length\n          of 'border' is less than the number of plots.\n\n     col: if 'col' is non-null it is assumed to contain colors to be\n          used to colour the bodies of the box plots. By default they\n          are in the background colour.\n\n     log: character indicating if x or y or both coordinates should be\n          plotted in log scale.\n\n    pars: a list of (potentially many) more graphical parameters, e.g.,\n          'boxwex' or 'outpch'; these are passed to 'bxp' (if 'plot' is\n          true); for details, see there.\n\nhorizontal: logical indicating if the boxplots should be horizontal;\n          default 'FALSE' means vertical boxes.\n\n     add: logical, if true _add_ boxplot to current plot.\n\n      at: numeric vector giving the locations where the boxplots should\n          be drawn, particularly when 'add = TRUE'; defaults to '1:n'\n          where 'n' is the number of boxes.\n\nDetails:\n\n     The generic function 'boxplot' currently has a default method\n     ('boxplot.default') and a formula interface ('boxplot.formula').\n\n     If multiple groups are supplied either as multiple arguments or\n     via a formula, parallel boxplots will be plotted, in the order of\n     the arguments or the order of the levels of the factor (see\n     'factor').\n\n     Missing values are ignored when forming boxplots.\n\nValue:\n\n     List with the following components:\n\n   stats: a matrix, each column contains 
the extreme of the lower\n          whisker, the lower hinge, the median, the upper hinge and the\n          extreme of the upper whisker for one group/plot.  If all the\n          inputs have the same class attribute, so will this component.\n\n       n: a vector with the number of (non-'NA') observations in each\n          group.\n\n    conf: a matrix where each column contains the lower and upper\n          extremes of the notch.\n\n     out: the values of any data points which lie beyond the extremes\n          of the whiskers.\n\n   group: a vector of the same length as 'out' whose elements indicate\n          to which group the outlier belongs.\n\n   names: a vector of names for the groups.\n\nReferences:\n\n     Becker, R. A., Chambers, J. M. and Wilks, A. R. (1988).  _The New\n     S Language_.  Wadsworth & Brooks/Cole.\n\n     Chambers, J. M., Cleveland, W. S., Kleiner, B. and Tukey, P. A.\n     (1983).  _Graphical Methods for Data Analysis_.  Wadsworth &\n     Brooks/Cole.\n\n     Murrell, P. (2005).  _R Graphics_.  Chapman & Hall/CRC Press.\n\n     See also 'boxplot.stats'.\n\nSee Also:\n\n     'boxplot.stats' which does the computation, 'bxp' for the plotting\n     and more examples; and 'stripchart' for an alternative (with small\n     data sets).\n\nExamples:\n\n     ## boxplot on a formula:\n     boxplot(count ~ spray, data = InsectSprays, col = \"lightgray\")\n     # *add* notches (somewhat funny here <--> warning \"notches .. 
outside hinges\"):\n     boxplot(count ~ spray, data = InsectSprays,\n             notch = TRUE, add = TRUE, col = \"blue\")\n     \n     boxplot(decrease ~ treatment, data = OrchardSprays, col = \"bisque\",\n             log = \"y\")\n     ## horizontal=TRUE, switching  y <--> x :\n     boxplot(decrease ~ treatment, data = OrchardSprays, col = \"bisque\",\n             log = \"x\", horizontal=TRUE)\n     \n     rb <- boxplot(decrease ~ treatment, data = OrchardSprays, col = \"bisque\")\n     title(\"Comparing boxplot()s and non-robust mean +/- SD\")\n     mn.t <- tapply(OrchardSprays$decrease, OrchardSprays$treatment, mean)\n     sd.t <- tapply(OrchardSprays$decrease, OrchardSprays$treatment, sd)\n     xi <- 0.3 + seq(rb$n)\n     points(xi, mn.t, col = \"orange\", pch = 18)\n     arrows(xi, mn.t - sd.t, xi, mn.t + sd.t,\n            code = 3, col = \"pink\", angle = 75, length = .1)\n     \n     ## boxplot on a matrix:\n     mat <- cbind(Uni05 = (1:100)/21, Norm = rnorm(100),\n                  `5T` = rt(100, df = 5), Gam2 = rgamma(100, shape = 2))\n     boxplot(mat) # directly, calling boxplot.matrix()\n     \n     ## boxplot on a data frame:\n     df. 
<- as.data.frame(mat)\n     par(las = 1) # all axis labels horizontal\n     boxplot(df., main = \"boxplot(*, horizontal = TRUE)\", horizontal = TRUE)\n     \n     ## Using 'at = ' and adding boxplots -- example idea by Roger Bivand :\n     boxplot(len ~ dose, data = ToothGrowth,\n             boxwex = 0.25, at = 1:3 - 0.2,\n             subset = supp == \"VC\", col = \"yellow\",\n             main = \"Guinea Pigs' Tooth Growth\",\n             xlab = \"Vitamin C dose mg\",\n             ylab = \"tooth length\",\n             xlim = c(0.5, 3.5), ylim = c(0, 35), yaxs = \"i\")\n     boxplot(len ~ dose, data = ToothGrowth, add = TRUE,\n             boxwex = 0.25, at = 1:3 + 0.2,\n             subset = supp == \"OJ\", col = \"orange\")\n     legend(2, 9, c(\"Ascorbic acid\", \"Orange juice\"),\n            fill = c(\"yellow\", \"orange\"))\n     \n     ## With less effort (slightly different) using factor *interaction*:\n     boxplot(len ~ dose:supp, data = ToothGrowth,\n             boxwex = 0.5, col = c(\"orange\", \"yellow\"),\n             main = \"Guinea Pigs' Tooth Growth\",\n             xlab = \"Vitamin C dose mg\", ylab = \"tooth length\",\n             sep = \":\", lex.order = TRUE, ylim = c(0, 35), yaxs = \"i\")\n     \n     ## more examples in  help(bxp)\n\n\n\n\n## `boxplot()` example\n\nReminder\n```\nboxplot(formula, data = NULL, ..., subset, na.action = NULL,\n        xlab = mklab(y_var = horizontal),\n        ylab = mklab(y_var =!horizontal),\n        add = FALSE, ann = !add, horizontal = FALSE,\n        drop = FALSE, sep = \".\", lex.order = FALSE)\n```\n\nLet's practice\n\n\n::: {.cell}\n\n```{.r .cell-code}\nboxplot(IgG_concentration~age_group, data=df)\n```\n\n::: {.cell-output-display}\n![](Module10-DataVisualization_files/figure-revealjs/unnamed-chunk-18-1.png){width=960}\n:::\n\n```{.r .cell-code}\nboxplot(\n\tlog(df$IgG_concentration)~df$age_group, \n\tmain=\"Age by IgG Concentrations\", \n\txlab=\"Age Group (years)\", \n\tylab=\"log IgG 
Concentration (mIU/mL)\", \n\tnames=c(\"1-5\",\"6-10\", \"11-15\"), \n\tvarwidth=T\n\t)\n```\n\n::: {.cell-output-display}\n![](Module10-DataVisualization_files/figure-revealjs/unnamed-chunk-18-2.png){width=960}\n:::\n:::\n\n\n\n\n## `barplot()` Help File\n\n\n\n::: {.cell}\n\n```{.r .cell-code}\n?barplot\n```\n:::\n\n\n\n## `barplot()` example\n\nThe function takes a lot of arguments to control the way our data is plotted.
\n\nReminder\n```\nbarplot(height, width = 1, space = NULL,\n        names.arg = NULL, legend.text = NULL, beside = FALSE,\n        horiz = FALSE, density = NULL, angle = 45,\n        col = NULL, border = par(\"fg\"),\n        main = NULL, sub = NULL, xlab = NULL, ylab = NULL,\n        xlim = NULL, ylim = NULL, xpd = TRUE, log = \"\",\n        axes = TRUE, axisnames = TRUE,\n        cex.axis = par(\"cex.axis\"), cex.names = par(\"cex.axis\"),\n        inside = TRUE, plot = TRUE, axis.lty = 0, offset = 0,\n        add = FALSE, ann = !add && par(\"ann\"), args.legend = NULL, ...)\n```\n\n\n\n::: {.cell}\n\n```{.r .cell-code}\nfreq <- table(df$seropos, df$age_group)\nbarplot(freq)\n```\n\n::: {.cell-output-display}\n![](Module10-DataVisualization_files/figure-revealjs/unnamed-chunk-21-1.png){width=960}\n:::\n\n```{.r .cell-code}\nprop <- prop.table(freq)\nbarplot(prop)\n```\n\n::: {.cell-output-display}\n![](Module10-DataVisualization_files/figure-revealjs/unnamed-chunk-21-2.png){width=960}\n:::\n:::\n\n\n\n## 3. Legend!\n\nIn Base R plotting the legend is not automatically generated.  This is nice because it gives you a huge amount of control over how your legend looks, but it is also easy to mislabel your colors, symbols, line types, etc. So, basically be careful.\n\n\n\n::: {.cell}\n\n```{.r .cell-code}\n?legend\n```\n:::\n\n::: {.cell}\n::: {.cell-output .cell-output-stdout}\n\n```\nAdd Legends to Plots\n\nDescription:\n\n     This function can be used to add legends to plots.  
Note that a\n     call to the function 'locator(1)' can be used in place of the 'x'\n     and 'y' arguments.\n\nUsage:\n\n     legend(x, y = NULL, legend, fill = NULL, col = par(\"col\"),\n            border = \"black\", lty, lwd, pch,\n            angle = 45, density = NULL, bty = \"o\", bg = par(\"bg\"),\n            box.lwd = par(\"lwd\"), box.lty = par(\"lty\"), box.col = par(\"fg\"),\n            pt.bg = NA, cex = 1, pt.cex = cex, pt.lwd = lwd,\n            xjust = 0, yjust = 1, x.intersp = 1, y.intersp = 1,\n            adj = c(0, 0.5), text.width = NULL, text.col = par(\"col\"),\n            text.font = NULL, merge = do.lines && has.pch, trace = FALSE,\n            plot = TRUE, ncol = 1, horiz = FALSE, title = NULL,\n            inset = 0, xpd, title.col = text.col[1], title.adj = 0.5,\n            title.cex = cex[1], title.font = text.font[1],\n            seg.len = 2)\n     \nArguments:\n\n    x, y: the x and y co-ordinates to be used to position the legend.\n          They can be specified by keyword or in any way which is\n          accepted by 'xy.coords': See 'Details'.\n\n  legend: a character or expression vector of length >= 1 to appear in\n          the legend.  Other objects will be coerced by\n          'as.graphicsAnnot'.\n\n    fill: if specified, this argument will cause boxes filled with the\n          specified colors (or shaded in the specified colors) to\n          appear beside the legend text.\n\n     col: the color of points or lines appearing in the legend.\n\n  border: the border color for the boxes (used only if 'fill' is\n          specified).\n\nlty, lwd: the line types and widths for lines appearing in the legend.\n          One of these two _must_ be specified for line drawing.\n\n     pch: the plotting symbols appearing in the legend, as numeric\n          vector or a vector of 1-character strings (see 'points').\n          Unlike 'points', this can all be specified as a single\n          multi-character string.  
_Must_ be specified for symbol\n          drawing.\n\n   angle: angle of shading lines.\n\n density: the density of shading lines, if numeric and positive. If\n          'NULL' or negative or 'NA' color filling is assumed.\n\n     bty: the type of box to be drawn around the legend.  The allowed\n          values are '\"o\"' (the default) and '\"n\"'.\n\n      bg: the background color for the legend box.  (Note that this is\n          only used if 'bty != \"n\"'.)\n\nbox.lty, box.lwd, box.col: the line type, width and color for the\n          legend box (if 'bty = \"o\"').\n\n   pt.bg: the background color for the 'points', corresponding to its\n          argument 'bg'.\n\n     cex: character expansion factor *relative* to current\n          'par(\"cex\")'.  Used for text, and provides the default for\n          'pt.cex'.\n\n  pt.cex: expansion factor(s) for the points.\n\n  pt.lwd: line width for the points, defaults to the one for lines, or\n          if that is not set, to 'par(\"lwd\")'.\n\n   xjust: how the legend is to be justified relative to the legend x\n          location.  A value of 0 means left justified, 0.5 means\n          centered and 1 means right justified.\n\n   yjust: the same as 'xjust' for the legend y location.\n\nx.intersp: character interspacing factor for horizontal (x) spacing\n          between symbol and legend text.\n\ny.intersp: vertical (y) distances (in lines of text shared above/below\n          each legend entry).  A vector with one element for each row\n          of the legend can be used.\n\n     adj: numeric of length 1 or 2; the string adjustment for legend\n          text.  Useful for y-adjustment when 'labels' are plotmath\n          expressions.\n\ntext.width: the width of the legend text in x ('\"user\"') coordinates.\n          (Should be positive even for a reversed x axis.)  
Can be a\n          single positive numeric value (same width for each column of\n          the legend), a vector (one element for each column of the\n          legend), 'NULL' (default) for computing a proper maximum\n          value of 'strwidth(legend)'), or 'NA' for computing a proper\n          column wise maximum value of 'strwidth(legend)').\n\ntext.col: the color used for the legend text.\n\ntext.font: the font used for the legend text, see 'text'.\n\n   merge: logical; if 'TRUE', merge points and lines but not filled\n          boxes.  Defaults to 'TRUE' if there are points and lines.\n\n   trace: logical; if 'TRUE', shows how 'legend' does all its magical\n          computations.\n\n    plot: logical.  If 'FALSE', nothing is plotted but the sizes are\n          returned.\n\n    ncol: the number of columns in which to set the legend items\n          (default is 1, a vertical legend).\n\n   horiz: logical; if 'TRUE', set the legend horizontally rather than\n          vertically (specifying 'horiz' overrides the 'ncol'\n          specification).\n\n   title: a character string or length-one expression giving a title to\n          be placed at the top of the legend.  
Other objects will be\n          coerced by 'as.graphicsAnnot'.\n\n   inset: inset distance(s) from the margins as a fraction of the plot\n          region when legend is placed by keyword.\n\n     xpd: if supplied, a value of the graphical parameter 'xpd' to be\n          used while the legend is being drawn.\n\ntitle.col: color for 'title', defaults to 'text.col[1]'.\n\ntitle.adj: horizontal adjustment for 'title': see the help for\n          'par(\"adj\")'.\n\ntitle.cex: expansion factor(s) for the title, defaults to 'cex[1]'.\n\ntitle.font: the font used for the legend title, defaults to\n          'text.font[1]', see 'text'.\n\n seg.len: the length of lines drawn to illustrate 'lty' and/or 'lwd'\n          (in units of character widths).\n\nDetails:\n\n     Arguments 'x', 'y', 'legend' are interpreted in a non-standard way\n     to allow the coordinates to be specified _via_ one or two\n     arguments.  If 'legend' is missing and 'y' is not numeric, it is\n     assumed that the second argument is intended to be 'legend' and\n     that the first argument specifies the coordinates.\n\n     The coordinates can be specified in any way which is accepted by\n     'xy.coords'.  If this gives the coordinates of one point, it is\n     used as the top-left coordinate of the rectangle containing the\n     legend.  If it gives the coordinates of two points, these specify\n     opposite corners of the rectangle (either pair of corners, in any\n     order).\n\n     The location may also be specified by setting 'x' to a single\n     keyword from the list '\"bottomright\"', '\"bottom\"', '\"bottomleft\"',\n     '\"left\"', '\"topleft\"', '\"top\"', '\"topright\"', '\"right\"' and\n     '\"center\"'. This places the legend on the inside of the plot frame\n     at the given location. Partial argument matching is used.  The\n     optional 'inset' argument specifies how far the legend is inset\n     from the plot margins.  
If a single value is given, it is used for\n     both margins; if two values are given, the first is used for 'x'-\n     distance, the second for 'y'-distance.\n\n     Attribute arguments such as 'col', 'pch', 'lty', etc, are recycled\n     if necessary: 'merge' is not.  Set entries of 'lty' to '0' or set\n     entries of 'lwd' to 'NA' to suppress lines in corresponding legend\n     entries; set 'pch' values to 'NA' to suppress points.\n\n     Points are drawn _after_ lines in order that they can cover the\n     line with their background color 'pt.bg', if applicable.\n\n     See the examples for how to right-justify labels.\n\n     Since they are not used for Unicode code points, values '-31:-1'\n     are silently omitted, as are 'NA' and '\"\"' values.\n\nValue:\n\n     A list with list components\n\n    rect: a list with components\n\n          'w', 'h' positive numbers giving *w*idth and *h*eight of the\n              legend's box.\n\n          'left', 'top' x and y coordinates of upper left corner of the\n              box.\n\n    text: a list with components\n\n          'x, y' numeric vectors of length 'length(legend)', giving the\n              x and y coordinates of the legend's text(s).\n\n     returned invisibly.\n\nReferences:\n\n     Becker, R. A., Chambers, J. M. and Wilks, A. R. (1988) _The New S\n     Language_.  Wadsworth & Brooks/Cole.\n\n     Murrell, P. (2005) _R Graphics_. 
Chapman & Hall/CRC Press.\n\nSee Also:\n\n     'plot', 'barplot' which uses 'legend()', and 'text' for more\n     examples of math expressions.\n\nExamples:\n\n     ## Run the example in '?matplot' or the following:\n     leg.txt <- c(\"Setosa     Petals\", \"Setosa     Sepals\",\n                  \"Versicolor Petals\", \"Versicolor Sepals\")\n     y.leg <- c(4.5, 3, 2.1, 1.4, .7)\n     cexv  <- c(1.2, 1, 4/5, 2/3, 1/2)\n     matplot(c(1, 8), c(0, 4.5), type = \"n\", xlab = \"Length\", ylab = \"Width\",\n             main = \"Petal and Sepal Dimensions in Iris Blossoms\")\n     for (i in seq(cexv)) {\n       text  (1, y.leg[i] - 0.1, paste(\"cex=\", formatC(cexv[i])), cex = 0.8, adj = 0)\n       legend(3, y.leg[i], leg.txt, pch = \"sSvV\", col = c(1, 3), cex = cexv[i])\n     }\n     ## cex *vector* [in R <= 3.5.1 has 'if(xc < 0)' w/ length(xc) == 2]\n     legend(\"right\", leg.txt, pch = \"sSvV\", col = c(1, 3),\n            cex = 1+(-1:2)/8, trace = TRUE)# trace: show computed lengths & coords\n     \n     ## 'merge = TRUE' for merging lines & points:\n     x <- seq(-pi, pi, length.out = 65)\n     for(reverse in c(FALSE, TRUE)) {  ## normal *and* reverse axes:\n       F <- if(reverse) rev else identity\n       plot(x, sin(x), type = \"l\", col = 3, lty = 2,\n            xlim = F(range(x)), ylim = F(c(-1.2, 1.8)))\n       points(x, cos(x), pch = 3, col = 4)\n       lines(x, tan(x), type = \"b\", lty = 1, pch = 4, col = 6)\n       title(\"legend('top', lty = c(2, -1, 1), pch = c(NA, 3, 4), merge = TRUE)\",\n             cex.main = 1.1)\n       legend(\"top\", c(\"sin\", \"cos\", \"tan\"), col = c(3, 4, 6),\n            text.col = \"green4\", lty = c(2, -1, 1), pch = c(NA, 3, 4),\n            merge = TRUE, bg = \"gray90\", trace=TRUE)\n       \n     } # for(..)\n     \n     ## right-justifying a set of labels: thanks to Uwe Ligges\n     x <- 1:5; y1 <- 1/x; y2 <- 2/x\n     plot(rep(x, 2), c(y1, y2), type = \"n\", xlab = \"x\", ylab = \"y\")\n     lines(x, y1); 
lines(x, y2, lty = 2)\n     temp <- legend(\"topright\", legend = c(\" \", \" \"),\n                    text.width = strwidth(\"1,000,000\"),\n                    lty = 1:2, xjust = 1, yjust = 1, inset = 1/10,\n                    title = \"Line Types\", title.cex = 0.5, trace=TRUE)\n     text(temp$rect$left + temp$rect$w, temp$text$y,\n          c(\"1,000\", \"1,000,000\"), pos = 2)\n     \n     \n     ##--- log scaled Examples ------------------------------\n     leg.txt <- c(\"a one\", \"a two\")\n     \n     par(mfrow = c(2, 2))\n     for(ll in c(\"\",\"x\",\"y\",\"xy\")) {\n       plot(2:10, log = ll, main = paste0(\"log = '\", ll, \"'\"))\n       abline(1, 1)\n       lines(2:3, 3:4, col = 2)\n       points(2, 2, col = 3)\n       rect(2, 3, 3, 2, col = 4)\n       text(c(3,3), 2:3, c(\"rect(2,3,3,2, col=4)\",\n                           \"text(c(3,3),2:3,\\\"c(rect(...)\\\")\"), adj = c(0, 0.3))\n       legend(list(x = 2,y = 8), legend = leg.txt, col = 2:3, pch = 1:2,\n              lty = 1)  #, trace = TRUE)\n     } #      ^^^^^^^ to force lines -> automatic merge=TRUE\n     par(mfrow = c(1,1))\n     \n     ##-- Math expressions:  ------------------------------\n     x <- seq(-pi, pi, length.out = 65)\n     plot(x, sin(x), type = \"l\", col = 2, xlab = expression(phi),\n          ylab = expression(f(phi)))\n     abline(h = -1:1, v = pi/2*(-6:6), col = \"gray90\")\n     lines(x, cos(x), col = 3, lty = 2)\n     ex.cs1 <- expression(plain(sin) * phi,  paste(\"cos\", phi))  # 2 ways\n     utils::str(legend(-3, .9, ex.cs1, lty = 1:2, plot = FALSE,\n                adj = c(0, 0.6)))  # adj y !\n     legend(-3, 0.9, ex.cs1, lty = 1:2, col = 2:3,  adj = c(0, 0.6))\n     \n     require(stats)\n     x <- rexp(100, rate = .5)\n     hist(x, main = \"Mean and Median of a Skewed Distribution\")\n     abline(v = mean(x),   col = 2, lty = 2, lwd = 2)\n     abline(v = median(x), col = 3, lty = 3, lwd = 2)\n     ex12 <- expression(bar(x) == sum(over(x[i], n), i == 1, n),\n      
                  hat(x) == median(x[i], i == 1, n))\n     utils::str(legend(4.1, 30, ex12, col = 2:3, lty = 2:3, lwd = 2))\n     \n     ## 'Filled' boxes -- see also example(barplot) which may call legend(*, fill=)\n     barplot(VADeaths)\n     legend(\"topright\", rownames(VADeaths), fill = gray.colors(nrow(VADeaths)))\n     \n     ## Using 'ncol'\n     x <- 0:64/64\n     for(R in c(identity, rev)) { # normal *and* reverse x-axis works fine:\n       xl <- R(range(x)); x1 <- xl[1]\n     matplot(x, outer(x, 1:7, function(x, k) sin(k * pi * x)), xlim=xl,\n             type = \"o\", col = 1:7, ylim = c(-1, 1.5), pch = \"*\")\n     op <- par(bg = \"antiquewhite1\")\n     legend(x1, 1.5, paste(\"sin(\", 1:7, \"pi * x)\"), col = 1:7, lty = 1:7,\n            pch = \"*\", ncol = 4, cex = 0.8)\n     legend(\"bottomright\", paste(\"sin(\", 1:7, \"pi * x)\"), col = 1:7, lty = 1:7,\n            pch = \"*\", cex = 0.8)\n     legend(x1, -.1, paste(\"sin(\", 1:4, \"pi * x)\"), col = 1:4, lty = 1:4,\n            ncol = 2, cex = 0.8)\n     legend(x1, -.4, paste(\"sin(\", 5:7, \"pi * x)\"), col = 4:6,  pch = 24,\n            ncol = 2, cex = 1.5, lwd = 2, pt.bg = \"pink\", pt.cex = 1:3)\n     par(op)\n       \n     } # for(..)\n     \n     ## point covering line :\n     y <- sin(3*pi*x)\n     plot(x, y, type = \"l\", col = \"blue\",\n         main = \"points with bg & legend(*, pt.bg)\")\n     points(x, y, pch = 21, bg = \"white\")\n     legend(.4,1, \"sin(c x)\", pch = 21, pt.bg = \"white\", lty = 1, col = \"blue\")\n     \n     ## legends with titles at different locations\n     plot(x, y, type = \"n\")\n     legend(\"bottomright\", \"(x,y)\", pch=1, title= \"bottomright\")\n     legend(\"bottom\",      \"(x,y)\", pch=1, title= \"bottom\")\n     legend(\"bottomleft\",  \"(x,y)\", pch=1, title= \"bottomleft\")\n     legend(\"left\",        \"(x,y)\", pch=1, title= \"left\")\n     legend(\"topleft\",     \"(x,y)\", pch=1, title= \"topleft, inset = .05\", inset = .05)\n     
legend(\"top\",         \"(x,y)\", pch=1, title= \"top\")\n     legend(\"topright\",    \"(x,y)\", pch=1, title= \"topright, inset = .02\",inset = .02)\n     legend(\"right\",       \"(x,y)\", pch=1, title= \"right\")\n     legend(\"center\",      \"(x,y)\", pch=1, title= \"center\")\n     \n     # using text.font (and text.col):\n     op <- par(mfrow = c(2, 2), mar = rep(2.1, 4))\n     c6 <- terrain.colors(10)[1:6]\n     for(i in 1:4) {\n        plot(1, type = \"n\", axes = FALSE, ann = FALSE); title(paste(\"text.font =\",i))\n        legend(\"top\", legend = LETTERS[1:6], col = c6,\n               ncol = 2, cex = 2, lwd = 3, text.font = i, text.col = c6)\n     }\n     par(op)\n     \n     # using text.width for several columns\n     plot(1, type=\"n\")\n     legend(\"topleft\", c(\"This legend\", \"has\", \"equally sized\", \"columns.\"),\n            pch = 1:4, ncol = 4)\n     legend(\"bottomleft\", c(\"This legend\", \"has\", \"optimally sized\", \"columns.\"),\n            pch = 1:4, ncol = 4, text.width = NA)\n     legend(\"right\", letters[1:4], pch = 1:4, ncol = 4,\n            text.width = 1:4 / 50)\n```\n\n\n:::\n:::\n\n\n\n\n\n## Add legend to the plot\n\nReminder\n```\nlegend(x, y = NULL, legend, fill = NULL, col = par(\"col\"),\n       border = \"black\", lty, lwd, pch,\n       angle = 45, density = NULL, bty = \"o\", bg = par(\"bg\"),\n       box.lwd = par(\"lwd\"), box.lty = par(\"lty\"), box.col = par(\"fg\"),\n       pt.bg = NA, cex = 1, pt.cex = cex, pt.lwd = lwd,\n       xjust = 0, yjust = 1, x.intersp = 1, y.intersp = 1,\n       adj = c(0, 0.5), text.width = NULL, text.col = par(\"col\"),\n       text.font = NULL, merge = do.lines && has.pch, trace = FALSE,\n       plot = TRUE, ncol = 1, horiz = FALSE, title = NULL,\n       inset = 0, xpd, title.col = text.col[1], title.adj = 0.5,\n       title.cex = cex[1], title.font = text.font[1],\n       seg.len = 2)\n```\n\nLet's practice\n\n\n::: {.cell}\n\n```{.r .cell-code}\nbarplot(prop, 
col=c(\"darkblue\",\"red\"), ylim=c(0,0.7), main=\"Seropositivity by Age Group\")\nlegend(x=2.5, y=0.7,\n\t\t\t fill=c(\"darkblue\",\"red\"), \n\t\t\t legend = c(\"seronegative\", \"seropositive\"))\n```\n\n::: {.cell-output-display}\n![](Module10-DataVisualization_files/figure-revealjs/unnamed-chunk-24-1.png){width=960}\n:::\n:::\n\n\n\n## `barplot()` example\n\nGetting closer, but what I really want is column proportions (i.e., the proportions should sum to one for each age group). Also, the age groups need more meaningful names.\n\n\n\n::: {.cell}\n\n```{.r .cell-code}\nfreq <- table(df$seropos, df$age_group)\ntot.per.age.group <- colSums(freq)\nage.seropos.matrix <- t(t(freq)/tot.per.age.group)\ncolnames(age.seropos.matrix) <- c(\"1-5 yo\", \"6-10 yo\", \"11-15 yo\")\n\nbarplot(age.seropos.matrix, col=c(\"darkblue\",\"red\"), ylim=c(0,1.35), main=\"Seropositivity by Age Group\")\naxis(2, at = c(0.2, 0.4, 0.6, 0.8,1))\nlegend(x=2.8, y=1.35,\n\t\t\t fill=c(\"darkblue\",\"red\"), \n\t\t\t legend = c(\"seronegative\", \"seropositive\"))\n```\n\n::: {.cell-output-display}\n![](Module10-DataVisualization_files/figure-revealjs/unnamed-chunk-25-1.png){width=960}\n:::\n:::\n\n\n\n## `barplot()` example\n\nNow, let's look at seropositivity by two individual-level characteristics in the same plot. 
\n\n\n\n::: {.cell}\n\n:::\n\n::: {.cell}\n\n```{.r .cell-code}\npar(mfrow = c(1,2))\nbarplot(age.seropos.matrix, col=c(\"darkblue\",\"red\"), ylim=c(0,1.35), main=\"Seropositivity by Age Group\")\naxis(2, at = c(0.2, 0.4, 0.6, 0.8,1))\nlegend(x=1, y=1.35, fill=c(\"darkblue\",\"red\"), legend = c(\"seronegative\", \"seropositive\"))\n\nbarplot(slum.seropos.matrix, col=c(\"darkblue\",\"red\"), ylim=c(0,1.35), main=\"Seropositivity by Residence\")\naxis(2, at = c(0.2, 0.4, 0.6, 0.8,1))\nlegend(x=1, y=1.35, fill=c(\"darkblue\",\"red\"),  legend = c(\"seronegative\", \"seropositive\"))\n```\n\n::: {.cell-output-display}\n![](Module10-DataVisualization_files/figure-revealjs/unnamed-chunk-27-1.png){width=960}\n:::\n:::\n\n\n\n\n\n## Summary\n\n-\t\t\n\n## Acknowledgements\n\nThese are the materials I looked through, modified, or extracted to complete this module's lecture.\n\n-   [\"Base Plotting in R\" by Medium](https://towardsdatascience.com/base-plotting-in-r-eb365da06b22)\n-\t\t[\"Base R margins: a cheatsheet\"](https://r-graph-gallery.com/74-margin-and-oma-cheatsheet.html)\n",
+    "markdown": "---\ntitle: \"Module 10: Data Visualization\"\nformat: \n  revealjs:\n    scrollable: true\n    smaller: true\n    toc: false\n---\n\n\n## Learning Objectives\n\nAfter module 10, you should be able to:\n\n- Create Base R plots\n\n## Import data for this module\n\nLet's read in our data (again) and take a quick look.\n\n\n::: {.cell}\n\n```{.r .cell-code}\ndf <- read.csv(file = \"data/serodata.csv\") #relative path\nhead(x=df, n=3)\n```\n\n::: {.cell-output .cell-output-stdout}\n```\n  observation_id IgG_concentration age gender     slum\n1           5772         0.3176895   2 Female Non slum\n2           8095         3.4368231   4 Female Non slum\n3           9784         0.3000000   4   Male Non slum\n```\n:::\n:::\n\n\n## Prep data\n\nCreate `age_group` three-level factor variable\n\n::: {.cell}\n\n```{.r .cell-code}\ndf$age_group <- ifelse(df$age <= 5, \"young\", \n                       ifelse(df$age<=10 & df$age>5, \"middle\", \"old\")) \ndf$age_group <- factor(df$age_group, levels=c(\"young\", \"middle\", \"old\"))\n```\n:::\n\n\nCreate `seropos` binary variable representing seropositivity if antibody concentrations are 10 IU/mL or higher.\n\n::: {.cell}\n\n```{.r .cell-code}\ndf$seropos <- ifelse(df$IgG_concentration<10, 0, 1)\n```\n:::\n\n\n## Base R data visualization functions\n\nThe Base R 'graphics' package has a ton of graphics options. 
\n\n\n::: {.cell}\n\n```{.r .cell-code}\nhelp(package = \"graphics\")\n```\n:::\n\n::: {.cell}\n::: {.cell-output .cell-output-stderr}\n```\nRegistered S3 method overwritten by 'printr':\n  method                from     \n  knit_print.data.frame rmarkdown\n```\n:::\n\n::: {.cell-output .cell-output-stdout}\n```\n\t\tInformation on package 'graphics'\n\nDescription:\n\nPackage:            graphics\nVersion:            4.3.1\nPriority:           base\nTitle:              The R Graphics Package\nAuthor:             R Core Team and contributors worldwide\nMaintainer:         R Core Team <do-use-Contact-address@r-project.org>\nContact:            R-help mailing list <r-help@r-project.org>\nDescription:        R functions for base graphics.\nImports:            grDevices\nLicense:            Part of R 4.3.1\nNeedsCompilation:   yes\nBuilt:              R 4.3.1; aarch64-apple-darwin20; 2023-06-16\n                    21:53:01 UTC; unix\n\nIndex:\n\nAxis                    Generic Function to Add an Axis to a Plot\nabline                  Add Straight Lines to a Plot\narrows                  Add Arrows to a Plot\nassocplot               Association Plots\naxTicks                 Compute Axis Tickmark Locations\naxis                    Add an Axis to a Plot\naxis.POSIXct            Date and Date-time Plotting Functions\nbarplot                 Bar Plots\nbox                     Draw a Box around a Plot\nboxplot                 Box Plots\nboxplot.matrix          Draw a Boxplot for each Column (Row) of a\n                        Matrix\nbxp                     Draw Box Plots from Summaries\ncdplot                  Conditional Density Plots\nclip                    Set Clipping Region\ncontour                 Display Contours\ncoplot                  Conditioning Plots\ncurve                   Draw Function Plots\ndotchart                Cleveland's Dot Plots\nfilled.contour          Level (Contour) Plots\nfourfoldplot            Fourfold Plots\nframe                   Create 
/ Start a New Plot Frame\ngraphics-package        The R Graphics Package\ngrconvertX              Convert between Graphics Coordinate Systems\ngrid                    Add Grid to a Plot\nhist                    Histograms\nhist.POSIXt             Histogram of a Date or Date-Time Object\nidentify                Identify Points in a Scatter Plot\nimage                   Display a Color Image\nlayout                  Specifying Complex Plot Arrangements\nlegend                  Add Legends to Plots\nlines                   Add Connected Line Segments to a Plot\nlocator                 Graphical Input\nmatplot                 Plot Columns of Matrices\nmosaicplot              Mosaic Plots\nmtext                   Write Text into the Margins of a Plot\npairs                   Scatterplot Matrices\npanel.smooth            Simple Panel Plot\npar                     Set or Query Graphical Parameters\npersp                   Perspective Plots\npie                     Pie Charts\nplot.data.frame         Plot Method for Data Frames\nplot.default            The Default Scatterplot Function\nplot.design             Plot Univariate Effects of a Design or Model\nplot.factor             Plotting Factor Variables\nplot.formula            Formula Notation for Scatterplots\nplot.histogram          Plot Histograms\nplot.raster             Plotting Raster Images\nplot.table              Plot Methods for 'table' Objects\nplot.window             Set up World Coordinates for Graphics Window\nplot.xy                 Basic Internal Plot Function\npoints                  Add Points to a Plot\npolygon                 Polygon Drawing\npolypath                Path Drawing\nrasterImage             Draw One or More Raster Images\nrect                    Draw One or More Rectangles\nrug                     Add a Rug to a Plot\nscreen                  Creating and Controlling Multiple Screens on a\n                        Single Device\nsegments                Add Line Segments to a 
Plot\nsmoothScatter           Scatterplots with Smoothed Densities Color\n                        Representation\nspineplot               Spine Plots and Spinograms\nstars                   Star (Spider/Radar) Plots and Segment Diagrams\nstem                    Stem-and-Leaf Plots\nstripchart              1-D Scatter Plots\nstrwidth                Plotting Dimensions of Character Strings and\n                        Math Expressions\nsunflowerplot           Produce a Sunflower Scatter Plot\nsymbols                 Draw Symbols (Circles, Squares, Stars,\n                        Thermometers, Boxplots)\ntext                    Add Text to a Plot\ntitle                   Plot Annotation\nxinch                   Graphical Units\nxspline                 Draw an X-spline\n```\n:::\n:::\n\n\n\n\n## Base R Plotting\n\nTo make a plot you often need to specify the following features:\n\n1. Parameters\n2. Plot attributes\n3. The legend\n\n## 1. Parameters\n\nThe parameter section fixes the settings for all your plots, basically the plot options. Adding attributes via `par()` before you call the plot creates ‘global’ settings for your plot.\n\nIn the example below, we have set two commonly used optional attributes in the global plot settings.\n\n-\tThe `mfrow` specifies that we have one row and two columns of plots, that is, two plots side by side. \n-\tThe `mar` attribute is a vector of our margin widths, with the first value indicating the margin below the plot (5), the second indicating the margin to the left of the plot (5), the third, the top of the plot (4), and the fourth to the right (1).\n\n```\npar(mfrow = c(1,2), mar = c(5,5,4,1))\n```\n\n\n## 1. Parameters\n\n\n::: {.cell figwidth='100%'}\n::: {.cell-output-display}\n![](images/par.png)\n:::\n:::\n\n\n\n## Lots of parameter options\n\nHowever, there are many more parameter options that can be specified in the 'global' settings or specific to a certain plot option. 
\n\n\n::: {.cell}\n\n```{.r .cell-code}\n?par\n```\n:::\n\nSet or Query Graphical Parameters\n\nDescription:\n\n     'par' can be used to set or query graphical parameters.\n     Parameters can be set by specifying them as arguments to 'par' in\n     'tag = value' form, or by passing them as a list of tagged values.\n\nUsage:\n\n     par(..., no.readonly = FALSE)\n     \n     <highlevel plot> (...., <tag> = <value>)\n     \nArguments:\n\n     ...: arguments in 'tag = value' form, a single list of tagged\n          values, or character vectors of parameter names. Supported\n          parameters are described in the 'Graphical Parameters'\n          section.\n\nno.readonly: logical; if 'TRUE' and there are no other arguments, only\n          parameters are returned which can be set by a subsequent\n          'par()' call _on the same device_.\n\nDetails:\n\n     Each device has its own set of graphical parameters.  If the\n     current device is the null device, 'par' will open a new device\n     before querying/setting parameters.  (What device is controlled by\n     'options(\"device\")'.)\n\n     Parameters are queried by giving one or more character vectors of\n     parameter names to 'par'.\n\n     'par()' (no arguments) or 'par(no.readonly = TRUE)' is used to get\n     _all_ the graphical parameters (as a named list).  Their names are\n     currently taken from the unexported variable 'graphics:::.Pars'.\n\n     _*R.O.*_ indicates _*read-only arguments*_: These may only be used\n     in queries and cannot be set.  
('\"cin\"', '\"cra\"', '\"csi\"',\n     '\"cxy\"', '\"din\"' and '\"page\"' are always read-only.)\n\n     Several parameters can only be set by a call to 'par()':\n\n        • '\"ask\"',\n\n        • '\"fig\"', '\"fin\"',\n\n        • '\"lheight\"',\n\n        • '\"mai\"', '\"mar\"', '\"mex\"', '\"mfcol\"', '\"mfrow\"', '\"mfg\"',\n\n        • '\"new\"',\n\n        • '\"oma\"', '\"omd\"', '\"omi\"',\n\n        • '\"pin\"', '\"plt\"', '\"ps\"', '\"pty\"',\n\n        • '\"usr\"',\n\n        • '\"xlog\"', '\"ylog\"',\n\n        • '\"ylbias\"'\n\n     The remaining parameters can also be set as arguments (often via\n     '...') to high-level plot functions such as 'plot.default',\n     'plot.window', 'points', 'lines', 'abline', 'axis', 'title',\n     'text', 'mtext', 'segments', 'symbols', 'arrows', 'polygon',\n     'rect', 'box', 'contour', 'filled.contour' and 'image'.  Such\n     settings will be active during the execution of the function,\n     only.  However, see the comments on 'bg', 'cex', 'col', 'lty',\n     'lwd' and 'pch' which may be taken as _arguments_ to certain plot\n     functions rather than as graphical parameters.\n\n     The meaning of 'character size' is not well-defined: this is set\n     up for the device taking 'pointsize' into account but often not\n     the actual font family in use.  Internally the corresponding pars\n     ('cra', 'cin', 'cxy' and 'csi') are used only to set the\n     inter-line spacing used to convert 'mar' and 'oma' to physical\n     margins.  (The same inter-line spacing multiplied by 'lheight' is\n     used for multi-line strings in 'text' and 'strheight'.)\n\n     Note that graphical parameters are suggestions: plotting functions\n     and devices need not make use of them (and this is particularly\n     true of non-default methods for e.g. 'plot').\n\nValue:\n\n     When parameters are set, their previous values are returned in an\n     invisible named list.  
Such a list can be passed as an argument to\n     'par' to restore the parameter values.  Use 'par(no.readonly =\n     TRUE)' for the full list of parameters that can be restored.\n     However, restoring all of these is not wise: see the 'Note'\n     section.\n\n     When just one parameter is queried, the value of that parameter is\n     returned as (atomic) vector.  When two or more parameters are\n     queried, their values are returned in a list, with the list names\n     giving the parameters.\n\n     Note the inconsistency: setting one parameter returns a list, but\n     querying one parameter returns a vector.\n\nGraphical Parameters:\n\n     'adj' The value of 'adj' determines the way in which text strings\n          are justified in 'text', 'mtext' and 'title'.  A value of '0'\n          produces left-justified text, '0.5' (the default) centered\n          text and '1' right-justified text.  (Any value in [0, 1] is\n          allowed, and on most devices values outside that interval\n          will also work.)\n\n          Note that the 'adj' _argument_ of 'text' also allows 'adj =\n          c(x, y)' for different adjustment in x- and y- directions.\n          Note that whereas for 'text' it refers to positioning of text\n          about a point, for 'mtext' and 'title' it controls placement\n          within the plot or device region.\n\n     'ann' If set to 'FALSE', high-level plotting functions calling\n          'plot.default' do not annotate the plots they produce with\n          axis titles and overall titles.  The default is to do\n          annotation.\n\n     'ask' logical.  If 'TRUE' (and the R session is interactive) the\n          user is asked for input, before a new figure is drawn.  As\n          this applies to the device, it also affects output by\n          packages 'grid' and 'lattice'.  
It can be set even on\n          non-screen devices but may have no effect there.\n\n          This not really a graphics parameter, and its use is\n          deprecated in favour of 'devAskNewPage'.\n\n     'bg' The color to be used for the background of the device region.\n          When called from 'par()' it also sets 'new = FALSE'. See\n          section 'Color Specification' for suitable values.  For many\n          devices the initial value is set from the 'bg' argument of\n          the device, and for the rest it is normally '\"white\"'.\n\n          Note that some graphics functions such as 'plot.default' and\n          'points' have an _argument_ of this name with a different\n          meaning.\n\n     'bty' A character string which determined the type of 'box' which\n          is drawn about plots.  If 'bty' is one of '\"o\"' (the\n          default), '\"l\"', '\"7\"', '\"c\"', '\"u\"', or '\"]\"' the resulting\n          box resembles the corresponding upper case letter.  A value\n          of '\"n\"' suppresses the box.\n\n     'cex' A numerical value giving the amount by which plotting text\n          and symbols should be magnified relative to the default.\n          This starts as '1' when a device is opened, and is reset when\n          the layout is changed, e.g. 
by setting 'mfrow'.\n\n          Note that some graphics functions such as 'plot.default' have\n          an _argument_ of this name which _multiplies_ this graphical\n          parameter, and some functions such as 'points' and 'text'\n          accept a vector of values which are recycled.\n\n     'cex.axis' The magnification to be used for axis annotation\n          relative to the current setting of 'cex'.\n\n     'cex.lab' The magnification to be used for x and y labels relative\n          to the current setting of 'cex'.\n\n     'cex.main' The magnification to be used for main titles relative\n          to the current setting of 'cex'.\n\n     'cex.sub' The magnification to be used for sub-titles relative to\n          the current setting of 'cex'.\n\n     'cin' _*R.O.*_; character size '(width, height)' in inches.  These\n          are the same measurements as 'cra', expressed in different\n          units.\n\n     'col' A specification for the default plotting color.  See section\n          'Color Specification'.\n\n          Some functions such as 'lines' and 'text' accept a vector of\n          values which are recycled and may be interpreted slightly\n          differently.\n\n     'col.axis' The color to be used for axis annotation.  Defaults to\n          '\"black\"'.\n\n     'col.lab' The color to be used for x and y labels.  Defaults to\n          '\"black\"'.\n\n     'col.main' The color to be used for plot main titles.  Defaults to\n          '\"black\"'.\n\n     'col.sub' The color to be used for plot sub-titles.  Defaults to\n          '\"black\"'.\n\n     'cra' _*R.O.*_; size of default character '(width, height)' in\n          'rasters' (pixels).  
Some devices have no concept of pixels\n          and so assume an arbitrary pixel size, usually 1/72 inch.\n          These are the same measurements as 'cin', expressed in\n          different units.\n\n     'crt' A numerical value specifying (in degrees) how single\n          characters should be rotated.  It is unwise to expect values\n          other than multiples of 90 to work.  Compare with 'srt' which\n          does string rotation.\n\n     'csi' _*R.O.*_; height of (default-sized) characters in inches.\n          The same as 'par(\"cin\")[2]'.\n\n     'cxy' _*R.O.*_; size of default character '(width, height)' in\n          user coordinate units.  'par(\"cxy\")' is\n          'par(\"cin\")/par(\"pin\")' scaled to user coordinates.  Note\n          that 'c(strwidth(ch), strheight(ch))' for a given string 'ch'\n          is usually much more precise.\n\n     'din' _*R.O.*_; the device dimensions, '(width, height)', in\n          inches.  See also 'dev.size', which is updated immediately\n          when an on-screen device windows is re-sized.\n\n     'err' (_Unimplemented_; R is silent when points outside the plot\n          region are _not_ plotted.)  The degree of error reporting\n          desired.\n\n     'family' The name of a font family for drawing text.  The maximum\n          allowed length is 200 bytes.  This name gets mapped by each\n          graphics device to a device-specific font description.  The\n          default value is '\"\"' which means that the default device\n          fonts will be used (and what those are should be listed on\n          the help page for the device).  Standard values are\n          '\"serif\"', '\"sans\"' and '\"mono\"', and the Hershey font\n          families are also available.  (Devices may define others, and\n          some devices will ignore this setting completely.  Names\n          starting with '\"Hershey\"' are treated specially and should\n          only be used for the built-in Hershey font families.) 
 This\n          can be specified inline for 'text'.\n\n     'fg' The color to be used for the foreground of plots.  This is\n          the default color used for things like axes and boxes around\n          plots.  When called from 'par()' this also sets parameter\n          'col' to the same value.  See section 'Color Specification'.\n          A few devices have an argument to set the initial value,\n          which is otherwise '\"black\"'.\n\n     'fig' A numerical vector of the form 'c(x1, x2, y1, y2)' which\n          gives the (NDC) coordinates of the figure region in the\n          display region of the device. If you set this, unlike S, you\n          start a new plot, so to add to an existing plot use 'new =\n          TRUE' as well.\n\n     'fin' The figure region dimensions, '(width, height)', in inches.\n          If you set this, unlike S, you start a new plot.\n\n     'font' An integer which specifies which font to use for text.  If\n          possible, device drivers arrange so that 1 corresponds to\n          plain text (the default), 2 to bold face, 3 to italic and 4\n          to bold italic.  Also, font 5 is expected to be the symbol\n          font, in Adobe symbol encoding.  On some devices font\n          families can be selected by 'family' to choose different sets\n          of 5 fonts.\n\n     'font.axis' The font to be used for axis annotation.\n\n     'font.lab' The font to be used for x and y labels.\n\n     'font.main' The font to be used for plot main titles.\n\n     'font.sub' The font to be used for plot sub-titles.\n\n     'lab' A numerical vector of the form 'c(x, y, len)' which modifies\n          the default way that axes are annotated.  The values of 'x'\n          and 'y' give the (approximate) number of tickmarks on the x\n          and y axes and 'len' specifies the label length.  The default\n          is 'c(5, 5, 7)'.  
'len' _is unimplemented_ in R.\n\n     'las' numeric in {0,1,2,3}; the style of axis labels.\n\n          0: always parallel to the axis [_default_],\n\n          1: always horizontal,\n\n          2: always perpendicular to the axis,\n\n          3: always vertical.\n\n          Also supported by 'mtext'.  Note that string/character\n          rotation _via_ argument 'srt' to 'par' does _not_ affect the\n          axis labels.\n\n     'lend' The line end style.  This can be specified as an integer or\n          string:\n\n          '0' and '\"round\"' mean rounded line caps [_default_];\n\n          '1' and '\"butt\"' mean butt line caps;\n\n          '2' and '\"square\"' mean square line caps.\n\n     'lheight' The line height multiplier.  The height of a line of\n          text (used to vertically space multi-line text) is found by\n          multiplying the character height both by the current\n          character expansion and by the line height multiplier.\n          Default value is 1.  Used in 'text' and 'strheight'.\n\n     'ljoin' The line join style.  This can be specified as an integer\n          or string:\n\n          '0' and '\"round\"' mean rounded line joins [_default_];\n\n          '1' and '\"mitre\"' mean mitred line joins;\n\n          '2' and '\"bevel\"' mean bevelled line joins.\n\n     'lmitre' The line mitre limit.  This controls when mitred line\n          joins are automatically converted into bevelled line joins.\n          The value must be larger than 1 and the default is 10.  Not\n          all devices will honour this setting.\n\n     'lty' The line type.  
Line types can either be specified as an\n          integer (0=blank, 1=solid (default), 2=dashed, 3=dotted,\n          4=dotdash, 5=longdash, 6=twodash) or as one of the character\n          strings '\"blank\"', '\"solid\"', '\"dashed\"', '\"dotted\"',\n          '\"dotdash\"', '\"longdash\"', or '\"twodash\"', where '\"blank\"'\n          uses 'invisible lines' (i.e., does not draw them).\n\n          Alternatively, a string of up to 8 characters (from 'c(1:9,\n          \"A\":\"F\")') may be given, giving the length of line segments\n          which are alternatively drawn and skipped.  See section 'Line\n          Type Specification'.\n\n          Functions such as 'lines' and 'segments' accept a vector of\n          values which are recycled.\n\n     'lwd' The line width, a _positive_ number, defaulting to '1'.  The\n          interpretation is device-specific, and some devices do not\n          implement line widths less than one.  (See the help on the\n          device for details of the interpretation.)\n\n          Functions such as 'lines' and 'segments' accept a vector of\n          values which are recycled: in such uses lines corresponding\n          to values 'NA' or 'NaN' are omitted.  The interpretation of\n          '0' is device-specific.\n\n     'mai' A numerical vector of the form 'c(bottom, left, top, right)'\n          which gives the margin size specified in inches.\n\n     'mar' A numerical vector of the form 'c(bottom, left, top, right)'\n          which gives the number of lines of margin to be specified on\n          the four sides of the plot.  The default is 'c(5, 4, 4, 2) +\n          0.1'.\n\n     'mex' 'mex' is a character size expansion factor which is used to\n          describe coordinates in the margins of plots. 
Note that this\n          does not change the font size, rather specifies the size of\n          font (as a multiple of 'csi') used to convert between 'mar'\n          and 'mai', and between 'oma' and 'omi'.\n\n          This starts as '1' when the device is opened, and is reset\n          when the layout is changed (alongside resetting 'cex').\n\n     'mfcol, mfrow' A vector of the form 'c(nr, nc)'.  Subsequent\n          figures will be drawn in an 'nr'-by-'nc' array on the device\n          by _columns_ ('mfcol'), or _rows_ ('mfrow'), respectively.\n\n          In a layout with exactly two rows and columns the base value\n          of '\"cex\"' is reduced by a factor of 0.83: if there are three\n          or more of either rows or columns, the reduction factor is\n          0.66.\n\n          Setting a layout resets the base value of 'cex' and that of\n          'mex' to '1'.\n\n          If either of these is queried it will give the current\n          layout, so querying cannot tell you the order in which the\n          array will be filled.\n\n          Consider the alternatives, 'layout' and 'split.screen'.\n\n     'mfg' A numerical vector of the form 'c(i, j)' where 'i' and 'j'\n          indicate which figure in an array of figures is to be drawn\n          next (if setting) or is being drawn (if enquiring).  The\n          array must already have been set by 'mfcol' or 'mfrow'.\n\n          For compatibility with S, the form 'c(i, j, nr, nc)' is also\n          accepted, when 'nr' and 'nc' should be the current number of\n          rows and number of columns.  Mismatches will be ignored, with\n          a warning.\n\n     'mgp' The margin line (in 'mex' units) for the axis title, axis\n          labels and axis line.  Note that 'mgp[1]' affects 'title'\n          whereas 'mgp[2:3]' affect 'axis'.  The default is 'c(3, 1,\n          0)'.\n\n     'mkh' The height in inches of symbols to be drawn when the value\n          of 'pch' is an integer. 
_Completely ignored in R_.\n\n     'new' logical, defaulting to 'FALSE'.  If set to 'TRUE', the next\n          high-level plotting command (actually 'plot.new') should _not\n          clean_ the frame before drawing _as if it were on a *_new_*\n          device_.  It is an error (ignored with a warning) to try to\n          use 'new = TRUE' on a device that does not currently contain\n          a high-level plot.\n\n     'oma' A vector of the form 'c(bottom, left, top, right)' giving\n          the size of the outer margins in lines of text.\n\n     'omd' A vector of the form 'c(x1, x2, y1, y2)' giving the region\n          _inside_ outer margins in NDC (= normalized device\n          coordinates), i.e., as a fraction (in [0, 1]) of the device\n          region.\n\n     'omi' A vector of the form 'c(bottom, left, top, right)' giving\n          the size of the outer margins in inches.\n\n     'page' _*R.O.*_; A boolean value indicating whether the next call\n          to 'plot.new' is going to start a new page.  This value may\n          be 'FALSE' if there are multiple figures on the page.\n\n     'pch' Either an integer specifying a symbol or a single character\n          to be used as the default in plotting points.  See 'points'\n          for possible values and their interpretation.  Note that only\n          integers and single-character strings can be set as a\n          graphics parameter (and not 'NA' nor 'NULL').\n\n          Some functions such as 'points' accept a vector of values\n          which are recycled.\n\n     'pin' The current plot dimensions, '(width, height)', in inches.\n\n     'plt' A vector of the form 'c(x1, x2, y1, y2)' giving the\n          coordinates of the plot region as fractions of the current\n          figure region.\n\n     'ps' integer; the point size of text (but not symbols).  
Unlike\n          the 'pointsize' argument of most devices, this does not\n          change the relationship between 'mar' and 'mai' (nor 'oma'\n          and 'omi').\n\n          What is meant by 'point size' is device-specific, but most\n          devices mean a multiple of 1bp, that is 1/72 of an inch.\n\n     'pty' A character specifying the type of plot region to be used;\n          '\"s\"' generates a square plotting region and '\"m\"' generates\n          the maximal plotting region.\n\n     'smo' (_Unimplemented_) a value which indicates how smooth circles\n          and circular arcs should be.\n\n     'srt' The string rotation in degrees.  See the comment about\n          'crt'.  Only supported by 'text'.\n\n     'tck' The length of tick marks as a fraction of the smaller of the\n          width or height of the plotting region.  If 'tck >= 0.5' it\n          is interpreted as a fraction of the relevant side, so if 'tck\n          = 1' grid lines are drawn.  The default setting ('tck = NA')\n          is to use 'tcl = -0.5'.\n\n     'tcl' The length of tick marks as a fraction of the height of a\n          line of text.  The default value is '-0.5'; setting 'tcl =\n          NA' sets 'tck = -0.01' which is S' default.\n\n     'usr' A vector of the form 'c(x1, x2, y1, y2)' giving the extremes\n          of the user coordinates of the plotting region.  When a\n          logarithmic scale is in use (i.e., 'par(\"xlog\")' is true, see\n          below), then the x-limits will be '10 ^ par(\"usr\")[1:2]'.\n          Similarly for the y-axis.\n\n     'xaxp' A vector of the form 'c(x1, x2, n)' giving the coordinates\n          of the extreme tick marks and the number of intervals between\n          tick-marks when 'par(\"xlog\")' is false.  
Otherwise, when\n          _log_ coordinates are active, the three values have a\n          different meaning: For a small range, 'n' is _negative_, and\n          the ticks are as in the linear case, otherwise, 'n' is in\n          '1:3', specifying a case number, and 'x1' and 'x2' are the\n          lowest and highest power of 10 inside the user coordinates,\n          '10 ^ par(\"usr\")[1:2]'. (The '\"usr\"' coordinates are\n          log10-transformed here!)\n\n          n = 1 will produce tick marks at 10^j for integer j,\n\n          n = 2 gives marks k 10^j with k in {1,5},\n\n          n = 3 gives marks k 10^j with k in {1,2,5}.\n\n          See 'axTicks()' for a pure R implementation of this.\n\n          This parameter is reset when a user coordinate system is set\n          up, for example by starting a new page or by calling\n          'plot.window' or setting 'par(\"usr\")': 'n' is taken from\n          'par(\"lab\")'.  It affects the default behaviour of subsequent\n          calls to 'axis' for sides 1 or 3.\n\n          It is only relevant to default numeric axis systems, and not\n          for example to dates.\n\n     'xaxs' The style of axis interval calculation to be used for the\n          x-axis.  Possible values are '\"r\"', '\"i\"', '\"e\"', '\"s\"',\n          '\"d\"'.  
The styles are generally controlled by the range of\n          data or 'xlim', if given.\n          Style '\"r\"' (regular) first extends the data range by 4\n          percent at each end and then finds an axis with pretty labels\n          that fits within the extended range.\n          Style '\"i\"' (internal) just finds an axis with pretty labels\n          that fits within the original data range.\n          Style '\"s\"' (standard) finds an axis with pretty labels\n          within which the original data range fits.\n          Style '\"e\"' (extended) is like style '\"s\"', except that it is\n          also ensures that there is room for plotting symbols within\n          the bounding box.\n          Style '\"d\"' (direct) specifies that the current axis should\n          be used on subsequent plots.\n          (_Only '\"r\"' and '\"i\"' styles have been implemented in R._)\n\n     'xaxt' A character which specifies the x axis type.  Specifying\n          '\"n\"' suppresses plotting of the axis.  The standard value is\n          '\"s\"': for compatibility with S values '\"l\"' and '\"t\"' are\n          accepted but are equivalent to '\"s\"': any value other than\n          '\"n\"' implies plotting.\n\n     'xlog' A logical value (see 'log' in 'plot.default').  If 'TRUE',\n          a logarithmic scale is in use (e.g., after 'plot(*, log =\n          \"x\")').  For a new device, it defaults to 'FALSE', i.e.,\n          linear scale.\n\n     'xpd' A logical value or 'NA'.  If 'FALSE', all plotting is\n          clipped to the plot region, if 'TRUE', all plotting is\n          clipped to the figure region, and if 'NA', all plotting is\n          clipped to the device region.  
See also 'clip'.\n\n     'yaxp' A vector of the form 'c(y1, y2, n)' giving the coordinates\n          of the extreme tick marks and the number of intervals between\n          tick-marks unless for log coordinates, see 'xaxp' above.\n\n     'yaxs' The style of axis interval calculation to be used for the\n          y-axis.  See 'xaxs' above.\n\n     'yaxt' A character which specifies the y axis type.  Specifying\n          '\"n\"' suppresses plotting.\n\n     'ylbias' A positive real value used in the positioning of text in\n          the margins by 'axis' and 'mtext'.  The default is in\n          principle device-specific, but currently '0.2' for all of R's\n          own devices.  Set this to '0.2' for compatibility with R <\n          2.14.0 on 'x11' and 'windows()' devices.\n\n     'ylog' A logical value; see 'xlog' above.\n\nColor Specification:\n\n     Colors can be specified in several different ways. The simplest\n     way is with a character string giving the color name (e.g.,\n     '\"red\"').  A list of the possible colors can be obtained with the\n     function 'colors'.  Alternatively, colors can be specified\n     directly in terms of their RGB components with a string of the\n     form '\"#RRGGBB\"' where each of the pairs 'RR', 'GG', 'BB' consist\n     of two hexadecimal digits giving a value in the range '00' to\n     'FF'.  Colors can also be specified by giving an index into a\n     small table of colors, the 'palette': indices wrap round so with\n     the default palette of size 8, '10' is the same as '2'.  This\n     provides compatibility with S.  Index '0' corresponds to the\n     background color.  Note that the palette (apart from '0' which is\n     per-device) is a per-session setting.\n\n     Negative integer colours are errors.\n\n     Additionally, '\"transparent\"' is _transparent_, useful for filled\n     areas (such as the background!), and just invisible for things\n     like lines or text.  
In most circumstances (integer) 'NA' is\n     equivalent to '\"transparent\"' (but not for 'text' and 'mtext').\n\n     Semi-transparent colors are available for use on devices that\n     support them.\n\n     The functions 'rgb', 'hsv', 'hcl', 'gray' and 'rainbow' provide\n     additional ways of generating colors.\n\nLine Type Specification:\n\n     Line types can either be specified by giving an index into a small\n     built-in table of line types (1 = solid, 2 = dashed, etc, see\n     'lty' above) or directly as the lengths of on/off stretches of\n     line.  This is done with a string of an even number (up to eight)\n     of characters, namely _non-zero_ (hexadecimal) digits which give\n     the lengths in consecutive positions in the string.  For example,\n     the string '\"33\"' specifies three units on followed by three off\n     and '\"3313\"' specifies three units on followed by three off\n     followed by one on and finally three off.  The 'units' here are\n     (on most devices) proportional to 'lwd', and with 'lwd = 1' are in\n     pixels or points or 1/96 inch.\n\n     The five standard dash-dot line types ('lty = 2:6') correspond to\n     'c(\"44\", \"13\", \"1343\", \"73\", \"2262\")'.\n\n     Note that 'NA' is not a valid value for 'lty'.\n\nNote:\n\n     The effect of restoring all the (settable) graphics parameters as\n     in the examples is hard to predict if the device has been resized.\n     Several of them are attempting to set the same things in different\n     ways, and those last in the alphabet will win.  In particular, the\n     settings of 'mai', 'mar', 'pin', 'plt' and 'pty' interact, as do\n     the outer margin settings, the figure layout and figure region\n     size.\n\nReferences:\n\n     Becker, R. A., Chambers, J. M. and Wilks, A. R. (1988) _The New S\n     Language_.  Wadsworth & Brooks/Cole.\n\n     Murrell, P. (2005) _R Graphics_. 
Chapman & Hall/CRC Press.\n\nSee Also:\n\n     'plot.default' for some high-level plotting parameters; 'colors';\n     'clip'; 'options' for other setup parameters; graphic devices\n     'x11', 'postscript' and setting up device regions by 'layout' and\n     'split.screen'.\n\nExamples:\n\n     op <- par(mfrow = c(2, 2), # 2 x 2 pictures on one plot\n               pty = \"s\")       # square plotting region,\n                                # independent of device size\n     \n     ## At end of plotting, reset to previous settings:\n     par(op)\n     \n     ## Alternatively,\n     op <- par(no.readonly = TRUE) # the whole list of settable par's.\n     ## do lots of plotting and par(.) calls, then reset:\n     par(op)\n     ## Note this is not in general good practice\n     \n     par(\"ylog\") # FALSE\n     plot(1 : 12, log = \"y\")\n     par(\"ylog\") # TRUE\n     \n     plot(1:2, xaxs = \"i\") # 'inner axis' w/o extra space\n     par(c(\"usr\", \"xaxp\"))\n     \n     ( nr.prof <-\n     c(prof.pilots = 16, lawyers = 11, farmers = 10, salesmen = 9, physicians = 9,\n       mechanics = 6, policemen = 6, managers = 6, engineers = 5, teachers = 4,\n       housewives = 3, students = 3, armed.forces = 1))\n     par(las = 3)\n     barplot(rbind(nr.prof)) # R 0.63.2: shows alignment problem\n     par(las = 0)  # reset to default\n     \n     require(grDevices) # for gray\n     ## 'fg' use:\n     plot(1:12, type = \"b\", main = \"'fg' : axes, ticks and box in gray\",\n          fg = gray(0.7), bty = \"7\" , sub = R.version.string)\n     \n     ex <- function() {\n        old.par <- par(no.readonly = TRUE) # all par settings which\n                                           # could be changed.\n        on.exit(par(old.par))\n        ## ...\n        ## ... do lots of par() settings and plots\n        ## ...\n        invisible() #-- now,  par(old.par)  will be executed\n     }\n     ex()\n     \n     ## Line types\n     showLty <- function(ltys, xoff = 0, ...) 
{\n        stopifnot((n <- length(ltys)) >= 1)\n        op <- par(mar = rep(.5,4)); on.exit(par(op))\n        plot(0:1, 0:1, type = \"n\", axes = FALSE, ann = FALSE)\n        y <- (n:1)/(n+1)\n        clty <- as.character(ltys)\n        mytext <- function(x, y, txt)\n           text(x, y, txt, adj = c(0, -.3), cex = 0.8, ...)\n        abline(h = y, lty = ltys, ...); mytext(xoff, y, clty)\n        y <- y - 1/(3*(n+1))\n        abline(h = y, lty = ltys, lwd = 2, ...)\n        mytext(1/8+xoff, y, paste(clty,\" lwd = 2\"))\n     }\n     showLty(c(\"solid\", \"dashed\", \"dotted\", \"dotdash\", \"longdash\", \"twodash\"))\n     par(new = TRUE)  # the same:\n     showLty(c(\"solid\", \"44\", \"13\", \"1343\", \"73\", \"2262\"), xoff = .2, col = 2)\n     showLty(c(\"11\", \"22\", \"33\", \"44\",   \"12\", \"13\", \"14\",   \"21\", \"31\"))\n\n\n## Common parameter options\n\nEight useful parameter arguments help improve the readability of the plot:\n\n- `xlab`: specifies the x-axis label of the plot\n- `ylab`: specifies the y-axis label\n- `main`: titles your graph\n- `pch`: specifies the plotting symbol used for points\n- `lty`: specifies the line type of your graph\n- `lwd`: specifies line thickness\n- `cex`: specifies the size of text and symbols\n- `col`: specifies the colors for your graph.\n\nWe will explore the use of these arguments below.\n\n## Common parameter options\n\n\n::: {.cell}\n::: {.cell-output-display}\n![](images/atrributes.png){width=200%}\n:::\n:::\n\n\n\n## 2. Plot Attributes\n\nPlot attributes are those that map your data to the plot. This means this is where you specify which variables in the data frame you want to plot. 
\n\nWe will only look at four types of plots today:\n\n- `hist()` displays a histogram of one variable\n- `plot()` displays an x-y plot of two variables\n- `boxplot()` displays a boxplot\n- `barplot()` displays a barplot\n\n\n## `hist()` Help File\n\n\n::: {.cell}\n\n```{.r .cell-code}\n?hist\n```\n:::\n\nHistograms\n\nDescription:\n\n     The generic function 'hist' computes a histogram of the given data\n     values.  If 'plot = TRUE', the resulting object of class\n     '\"histogram\"' is plotted by 'plot.histogram', before it is\n     returned.\n\nUsage:\n\n     hist(x, ...)\n     \n     ## Default S3 method:\n     hist(x, breaks = \"Sturges\",\n          freq = NULL, probability = !freq,\n          include.lowest = TRUE, right = TRUE, fuzz = 1e-7,\n          density = NULL, angle = 45, col = \"lightgray\", border = NULL,\n          main = paste(\"Histogram of\" , xname),\n          xlim = range(breaks), ylim = NULL,\n          xlab = xname, ylab,\n          axes = TRUE, plot = TRUE, labels = FALSE,\n          nclass = NULL, warn.unused = TRUE, ...)\n     \nArguments:\n\n       x: a vector of values for which the histogram is desired.\n\n  breaks: one of:\n\n            • a vector giving the breakpoints between histogram cells,\n\n            • a function to compute the vector of breakpoints,\n\n            • a single number giving the number of cells for the\n              histogram,\n\n            • a character string naming an algorithm to compute the\n              number of cells (see 'Details'),\n\n            • a function to compute the number of cells.\n\n          In the last three cases the number is a suggestion only; as\n          the breakpoints will be set to 'pretty' values, the number is\n          limited to '1e6' (with a warning if it was larger).  
If\n          'breaks' is a function, the 'x' vector is supplied to it as\n          the only argument (and the number of breaks is only limited\n          by the amount of available memory).\n\n    freq: logical; if 'TRUE', the histogram graphic is a representation\n          of frequencies, the 'counts' component of the result; if\n          'FALSE', probability densities, component 'density', are\n          plotted (so that the histogram has a total area of one).\n          Defaults to 'TRUE' _if and only if_ 'breaks' are equidistant\n          (and 'probability' is not specified).\n\nprobability: an _alias_ for '!freq', for S compatibility.\n\ninclude.lowest: logical; if 'TRUE', an 'x[i]' equal to the 'breaks'\n          value will be included in the first (or last, for 'right =\n          FALSE') bar.  This will be ignored (with a warning) unless\n          'breaks' is a vector.\n\n   right: logical; if 'TRUE', the histogram cells are right-closed\n          (left open) intervals.\n\n    fuzz: non-negative number, for the case when the data is \"pretty\"\n          and some observations 'x[.]' are close but not exactly on a\n          'break'.  For counting fuzzy breaks proportional to 'fuzz'\n          are used.  The default is occasionally suboptimal.\n\n density: the density of shading lines, in lines per inch.  The default\n          value of 'NULL' means that no shading lines are drawn.\n          Non-positive values of 'density' also inhibit the drawing of\n          shading lines.\n\n   angle: the slope of shading lines, given as an angle in degrees\n          (counter-clockwise).\n\n     col: a colour to be used to fill the bars.\n\n  border: the color of the border around the bars.  
The default is to\n          use the standard foreground color.\n\nmain, xlab, ylab: main title and axis labels: these arguments to\n          'title()' get \"smart\" defaults here, e.g., the default 'ylab'\n          is '\"Frequency\"' iff 'freq' is true.\n\nxlim, ylim: the range of x and y values with sensible defaults.  Note\n          that 'xlim' is _not_ used to define the histogram (breaks),\n          but only for plotting (when 'plot = TRUE').\n\n    axes: logical.  If 'TRUE' (default), axes are draw if the plot is\n          drawn.\n\n    plot: logical.  If 'TRUE' (default), a histogram is plotted,\n          otherwise a list of breaks and counts is returned.  In the\n          latter case, a warning is used if (typically graphical)\n          arguments are specified that only apply to the 'plot = TRUE'\n          case.\n\n  labels: logical or character string.  Additionally draw labels on top\n          of bars, if not 'FALSE'; see 'plot.histogram'.\n\n  nclass: numeric (integer).  For S(-PLUS) compatibility only, 'nclass'\n          is equivalent to 'breaks' for a scalar or character argument.\n\nwarn.unused: logical.  If 'plot = FALSE' and 'warn.unused = TRUE', a\n          warning will be issued when graphical parameters are passed\n          to 'hist.default()'.\n\n     ...: further arguments and graphical parameters passed to\n          'plot.histogram' and thence to 'title' and 'axis' (if 'plot =\n          TRUE').\n\nDetails:\n\n     The definition of _histogram_ differs by source (with\n     country-specific biases).  R's default with equi-spaced breaks\n     (also the default) is to plot the counts in the cells defined by\n     'breaks'.  
Thus the height of a rectangle is proportional to the\n     number of points falling into the cell, as is the area _provided_\n     the breaks are equally-spaced.\n\n     The default with non-equi-spaced breaks is to give a plot of area\n     one, in which the _area_ of the rectangles is the fraction of the\n     data points falling in the cells.\n\n     If 'right = TRUE' (default), the histogram cells are intervals of\n     the form (a, b], i.e., they include their right-hand endpoint, but\n     not their left one, with the exception of the first cell when\n     'include.lowest' is 'TRUE'.\n\n     For 'right = FALSE', the intervals are of the form [a, b), and\n     'include.lowest' means '_include highest_'.\n\n     A numerical tolerance of 1e-7 times the median bin size (for more\n     than four bins, otherwise the median is substituted) is applied\n     when counting entries on the edges of bins.  This is not included\n     in the reported 'breaks' nor in the calculation of 'density'.\n\n     The default for 'breaks' is '\"Sturges\"': see 'nclass.Sturges'.\n     Other names for which algorithms are supplied are '\"Scott\"' and\n     '\"FD\"' / '\"Freedman-Diaconis\"' (with corresponding functions\n     'nclass.scott' and 'nclass.FD').  Case is ignored and partial\n     matching is used.  Alternatively, a function can be supplied which\n     will compute the intended number of breaks or the actual\n     breakpoints as a function of 'x'.\n\nValue:\n\n     an object of class '\"histogram\"' which is a list with components:\n\n  breaks: the n+1 cell boundaries (= 'breaks' if that was a vector).\n          These are the nominal breaks, not with the boundary fuzz.\n\n  counts: n integers; for each cell, the number of 'x[]' inside.\n\n density: values f^(x[i]), as estimated density values. 
If\n          'all(diff(breaks) == 1)', they are the relative frequencies\n          'counts/n' and in general satisfy sum[i; f^(x[i])\n          (b[i+1]-b[i])] = 1, where b[i] = 'breaks[i]'.\n\n    mids: the n cell midpoints.\n\n   xname: a character string with the actual 'x' argument name.\n\nequidist: logical, indicating if the distances between 'breaks' are all\n          the same.\n\nReferences:\n\n     Becker, R. A., Chambers, J. M. and Wilks, A. R. (1988) _The New S\n     Language_.  Wadsworth & Brooks/Cole.\n\n     Venables, W. N. and Ripley. B. D. (2002) _Modern Applied\n     Statistics with S_.  Springer.\n\nSee Also:\n\n     'nclass.Sturges', 'stem', 'density', 'truehist' in package 'MASS'.\n\n     Typical plots with vertical bars are _not_ histograms.  Consider\n     'barplot' or 'plot(*, type = \"h\")' for such bar plots.\n\nExamples:\n\n     op <- par(mfrow = c(2, 2))\n     hist(islands)\n     utils::str(hist(islands, col = \"gray\", labels = TRUE))\n     \n     hist(sqrt(islands), breaks = 12, col = \"lightblue\", border = \"pink\")\n     ##-- For non-equidistant breaks, counts should NOT be graphed unscaled:\n     r <- hist(sqrt(islands), breaks = c(4*0:5, 10*3:5, 70, 100, 140),\n               col = \"blue1\")\n     text(r$mids, r$density, r$counts, adj = c(.5, -.5), col = \"blue3\")\n     sapply(r[2:3], sum)\n     sum(r$density * diff(r$breaks)) # == 1\n     lines(r, lty = 3, border = \"purple\") # -> lines.histogram(*)\n     par(op)\n     \n     require(utils) # for str\n     str(hist(islands, breaks = 12, plot =  FALSE)) #-> 10 (~= 12) breaks\n     str(hist(islands, breaks = c(12,20,36,80,200,1000,17000), plot = FALSE))\n     \n     hist(islands, breaks = c(12,20,36,80,200,1000,17000), freq = TRUE,\n          main = \"WRONG histogram\") # and warning\n     \n     ## Extreme outliers; the \"FD\" rule would take very large number of 'breaks':\n     XXL <- c(1:9, c(-1,1)*1e300)\n     hh <- hist(XXL, \"FD\") # did not work in R <= 3.4.1; now gives 
warning\n     ## pretty() determines how many counts are used (platform dependently!):\n     length(hh$breaks) ## typically 1 million -- though 1e6 was \"a suggestion only\"\n     \n     ## R >= 4.2.0: no \"*.5\" labels on y-axis:\n     hist(c(2,3,3,5,5,6,6,6,7))\n     \n     require(stats)\n     set.seed(14)\n     x <- rchisq(100, df = 4)\n     \n     ## Histogram with custom x-axis:\n     hist(x, xaxt = \"n\")\n     axis(1, at = 0:17)\n     \n     \n     ## Comparing data with a model distribution should be done with qqplot()!\n     qqplot(x, qchisq(ppoints(x), df = 4)); abline(0, 1, col = 2, lty = 2)\n     \n     ## if you really insist on using hist() ... :\n     hist(x, freq = FALSE, ylim = c(0, 0.2))\n     curve(dchisq(x, df = 4), col = 2, lty = 2, lwd = 2, add = TRUE)\n\n\n## `hist()` example\n\nReminder of the function signature:\n```\nhist(x, breaks = \"Sturges\",\n     freq = NULL, probability = !freq,\n     include.lowest = TRUE, right = TRUE, fuzz = 1e-7,\n     density = NULL, angle = 45, col = \"lightgray\", border = NULL,\n     main = paste(\"Histogram of\" , xname),\n     xlim = range(breaks), ylim = NULL,\n     xlab = xname, ylab,\n     axes = TRUE, plot = TRUE, labels = FALSE,\n     nclass = NULL, warn.unused = TRUE, ...)\n```\n\nLet's practice\n\n::: {.cell}\n\n```{.r .cell-code}\nhist(df$age)\n```\n\n::: {.cell-output-display}\n![](Module10-DataVisualization_files/figure-revealjs/unnamed-chunk-12-1.png){width=960}\n:::\n\n```{.r .cell-code}\nhist(\n\tdf$age, \n\tfreq=FALSE, \n\tmain=\"Histogram\", \n\txlab=\"Age (years)\"\n\t)\n```\n\n::: {.cell-output-display}\n![](Module10-DataVisualization_files/figure-revealjs/unnamed-chunk-12-2.png){width=960}\n:::\n:::\n\n\n\n## `plot()` Help File\n\n\n::: {.cell}\n\n```{.r .cell-code}\n?plot\n```\n:::\n\nGeneric X-Y Plotting\n\nDescription:\n\n     Generic function for plotting of R objects.\n\n     For simple scatter plots, 'plot.default' will be used.  
However,\n     there are 'plot' methods for many R objects, including\n     'function's, 'data.frame's, 'density' objects, etc.  Use\n     'methods(plot)' and the documentation for these. Most of these\n     methods are implemented using traditional graphics (the 'graphics'\n     package), but this is not mandatory.\n\n     For more details about graphical parameter arguments used by\n     traditional graphics, see 'par'.\n\nUsage:\n\n     plot(x, y, ...)\n     \nArguments:\n\n       x: the coordinates of points in the plot. Alternatively, a\n          single plotting structure, function or _any R object with a\n          'plot' method_ can be provided.\n\n       y: the y coordinates of points in the plot, _optional_ if 'x' is\n          an appropriate structure.\n\n     ...: Arguments to be passed to methods, such as graphical\n          parameters (see 'par').  Many methods will accept the\n          following arguments:\n\n          'type' what type of plot should be drawn.  Possible types are\n\n                • '\"p\"' for *p*oints,\n\n                • '\"l\"' for *l*ines,\n\n                • '\"b\"' for *b*oth,\n\n                • '\"c\"' for the lines part alone of '\"b\"',\n\n                • '\"o\"' for both '*o*verplotted',\n\n                • '\"h\"' for '*h*istogram' like (or 'high-density')\n                  vertical lines,\n\n                • '\"s\"' for stair *s*teps,\n\n                • '\"S\"' for other *s*teps, see 'Details' below,\n\n                • '\"n\"' for no plotting.\n\n              All other 'type's give a warning or an error; using,\n              e.g., 'type = \"punkte\"' being equivalent to 'type = \"p\"'\n              for S compatibility.  
Note that some methods, e.g.\n              'plot.factor', do not accept this.\n\n          'main' an overall title for the plot: see 'title'.\n\n          'sub' a subtitle for the plot: see 'title'.\n\n          'xlab' a title for the x axis: see 'title'.\n\n          'ylab' a title for the y axis: see 'title'.\n\n          'asp' the y/x aspect ratio, see 'plot.window'.\n\nDetails:\n\n     The two step types differ in their x-y preference: Going from\n     (x1,y1) to (x2,y2) with x1 < x2, 'type = \"s\"' moves first\n     horizontal, then vertical, whereas 'type = \"S\"' moves the other\n     way around.\n\nNote:\n\n     The 'plot' generic was moved from the 'graphics' package to the\n     'base' package in R 4.0.0. It is currently re-exported from the\n     'graphics' namespace to allow packages importing it from there to\n     continue working, but this may change in future versions of R.\n\nSee Also:\n\n     'plot.default', 'plot.formula' and other methods; 'points',\n     'lines', 'par'.  
For thousands of points, consider using\n     'smoothScatter()' instead of 'plot()'.\n\n     For X-Y-Z plotting see 'contour', 'persp' and 'image'.\n\nExamples:\n\n     require(stats) # for lowess, rpois, rnorm\n     require(graphics) # for plot methods\n     plot(cars)\n     lines(lowess(cars))\n     \n     plot(sin, -pi, 2*pi) # see ?plot.function\n     \n     ## Discrete Distribution Plot:\n     plot(table(rpois(100, 5)), type = \"h\", col = \"red\", lwd = 10,\n          main = \"rpois(100, lambda = 5)\")\n     \n     ## Simple quantiles/ECDF, see ecdf() {library(stats)} for a better one:\n     plot(x <- sort(rnorm(47)), type = \"s\", main = \"plot(x, type = \\\"s\\\")\")\n     points(x, cex = .5, col = \"dark red\")\n\n\n\n## `plot()` example\n\n\n\n::: {.cell}\n\n```{.r .cell-code}\nplot(df$age, df$IgG_concentration)\n```\n\n::: {.cell-output-display}\n![](Module10-DataVisualization_files/figure-revealjs/unnamed-chunk-15-1.png){width=960}\n:::\n\n```{.r .cell-code}\nplot(\n\tdf$age, \n\tdf$IgG_concentration, \n\ttype=\"p\", \n\tmain=\"Age by IgG Concentrations\", \n\txlab=\"Age (years)\", \n\tylab=\"IgG Concentration (IU/mL)\", \n\tpch=16, \n\tcex=0.9,\n\tcol=\"lightblue\")\n```\n\n::: {.cell-output-display}\n![](Module10-DataVisualization_files/figure-revealjs/unnamed-chunk-15-2.png){width=960}\n:::\n:::\n\n\n\n## `boxplot()` Help File\n\n\n\n::: {.cell}\n\n```{.r .cell-code}\n?boxplot\n```\n:::\n\nBox Plots\n\nDescription:\n\n     Produce box-and-whisker plot(s) of the given (grouped) values.\n\nUsage:\n\n     boxplot(x, ...)\n     \n     ## S3 method for class 'formula'\n     boxplot(formula, data = NULL, ..., subset, na.action = NULL,\n             xlab = mklab(y_var = horizontal),\n             ylab = mklab(y_var =!horizontal),\n             add = FALSE, ann = !add, horizontal = FALSE,\n             drop = FALSE, sep = \".\", lex.order = FALSE)\n     \n     ## Default S3 method:\n     boxplot(x, ..., range = 1.5, width = NULL, varwidth = FALSE,\n          
   notch = FALSE, outline = TRUE, names, plot = TRUE,\n             border = par(\"fg\"), col = \"lightgray\", log = \"\",\n             pars = list(boxwex = 0.8, staplewex = 0.5, outwex = 0.5),\n              ann = !add, horizontal = FALSE, add = FALSE, at = NULL)\n     \nArguments:\n\n formula: a formula, such as 'y ~ grp', where 'y' is a numeric vector\n          of data values to be split into groups according to the\n          grouping variable 'grp' (usually a factor).  Note that '~ g1\n          + g2' is equivalent to 'g1:g2'.\n\n    data: a data.frame (or list) from which the variables in 'formula'\n          should be taken.\n\n  subset: an optional vector specifying a subset of observations to be\n          used for plotting.\n\nna.action: a function which indicates what should happen when the data\n          contain 'NA's.  The default is to ignore missing values in\n          either the response or the group.\n\nxlab, ylab: x- and y-axis annotation, since R 3.6.0 with a non-empty\n          default.  Can be suppressed by 'ann=FALSE'.\n\n     ann: 'logical' indicating if axes should be annotated (by 'xlab'\n          and 'ylab').\n\ndrop, sep, lex.order: passed to 'split.default', see there.\n\n       x: for specifying data from which the boxplots are to be\n          produced. Either a numeric vector, or a single list\n          containing such vectors. Additional unnamed arguments specify\n          further data as separate vectors (each corresponding to a\n          component boxplot).  'NA's are allowed in the data.\n\n     ...: For the 'formula' method, named arguments to be passed to the\n          default method.\n\n          For the default method, unnamed arguments are additional data\n          vectors (unless 'x' is a list when they are ignored), and\n          named arguments are arguments and graphical parameters to be\n          passed to 'bxp' in addition to the ones given by argument\n          'pars' (and override those in 'pars'). 
Note that 'bxp' may or\n          may not make use of graphical parameters it is passed: see\n          its documentation.\n\n   range: this determines how far the plot whiskers extend out from the\n          box.  If 'range' is positive, the whiskers extend to the most\n          extreme data point which is no more than 'range' times the\n          interquartile range from the box. A value of zero causes the\n          whiskers to extend to the data extremes.\n\n   width: a vector giving the relative widths of the boxes making up\n          the plot.\n\nvarwidth: if 'varwidth' is 'TRUE', the boxes are drawn with widths\n          proportional to the square-roots of the number of\n          observations in the groups.\n\n   notch: if 'notch' is 'TRUE', a notch is drawn in each side of the\n          boxes.  If the notches of two plots do not overlap this is\n          'strong evidence' that the two medians differ (Chambers _et\n          al_, 1983, p. 62).  See 'boxplot.stats' for the calculations\n          used.\n\n outline: if 'outline' is not true, the outliers are not drawn (as\n          points whereas S+ uses lines).\n\n   names: group labels which will be printed under each boxplot.  Can\n          be a character vector or an expression (see plotmath).\n\n  boxwex: a scale factor to be applied to all boxes.  When there are\n          only a few groups, the appearance of the plot can be improved\n          by making the boxes narrower.\n\nstaplewex: staple line width expansion, proportional to box width.\n\n  outwex: outlier line width expansion, proportional to box width.\n\n    plot: if 'TRUE' (the default) then a boxplot is produced.  If not,\n          the summaries which the boxplots are based on are returned.\n\n  border: an optional vector of colors for the outlines of the\n          boxplots.  
The values in 'border' are recycled if the length\n          of 'border' is less than the number of plots.\n\n     col: if 'col' is non-null it is assumed to contain colors to be\n          used to colour the bodies of the box plots. By default they\n          are in the background colour.\n\n     log: character indicating if x or y or both coordinates should be\n          plotted in log scale.\n\n    pars: a list of (potentially many) more graphical parameters, e.g.,\n          'boxwex' or 'outpch'; these are passed to 'bxp' (if 'plot' is\n          true); for details, see there.\n\nhorizontal: logical indicating if the boxplots should be horizontal;\n          default 'FALSE' means vertical boxes.\n\n     add: logical, if true _add_ boxplot to current plot.\n\n      at: numeric vector giving the locations where the boxplots should\n          be drawn, particularly when 'add = TRUE'; defaults to '1:n'\n          where 'n' is the number of boxes.\n\nDetails:\n\n     The generic function 'boxplot' currently has a default method\n     ('boxplot.default') and a formula interface ('boxplot.formula').\n\n     If multiple groups are supplied either as multiple arguments or\n     via a formula, parallel boxplots will be plotted, in the order of\n     the arguments or the order of the levels of the factor (see\n     'factor').\n\n     Missing values are ignored when forming boxplots.\n\nValue:\n\n     List with the following components:\n\n   stats: a matrix, each column contains the extreme of the lower\n          whisker, the lower hinge, the median, the upper hinge and the\n          extreme of the upper whisker for one group/plot.  
If all the\n          inputs have the same class attribute, so will this component.\n\n       n: a vector with the number of (non-'NA') observations in each\n          group.\n\n    conf: a matrix where each column contains the lower and upper\n          extremes of the notch.\n\n     out: the values of any data points which lie beyond the extremes\n          of the whiskers.\n\n   group: a vector of the same length as 'out' whose elements indicate\n          to which group the outlier belongs.\n\n   names: a vector of names for the groups.\n\nReferences:\n\n     Becker, R. A., Chambers, J. M. and Wilks, A. R. (1988).  _The New\n     S Language_.  Wadsworth & Brooks/Cole.\n\n     Chambers, J. M., Cleveland, W. S., Kleiner, B. and Tukey, P. A.\n     (1983).  _Graphical Methods for Data Analysis_.  Wadsworth &\n     Brooks/Cole.\n\n     Murrell, P. (2005).  _R Graphics_.  Chapman & Hall/CRC Press.\n\n     See also 'boxplot.stats'.\n\nSee Also:\n\n     'boxplot.stats' which does the computation, 'bxp' for the plotting\n     and more examples; and 'stripchart' for an alternative (with small\n     data sets).\n\nExamples:\n\n     ## boxplot on a formula:\n     boxplot(count ~ spray, data = InsectSprays, col = \"lightgray\")\n     # *add* notches (somewhat funny here <--> warning \"notches .. 
outside hinges\"):\n     boxplot(count ~ spray, data = InsectSprays,\n             notch = TRUE, add = TRUE, col = \"blue\")\n     \n     boxplot(decrease ~ treatment, data = OrchardSprays, col = \"bisque\",\n             log = \"y\")\n     ## horizontal=TRUE, switching  y <--> x :\n     boxplot(decrease ~ treatment, data = OrchardSprays, col = \"bisque\",\n             log = \"x\", horizontal=TRUE)\n     \n     rb <- boxplot(decrease ~ treatment, data = OrchardSprays, col = \"bisque\")\n     title(\"Comparing boxplot()s and non-robust mean +/- SD\")\n     mn.t <- tapply(OrchardSprays$decrease, OrchardSprays$treatment, mean)\n     sd.t <- tapply(OrchardSprays$decrease, OrchardSprays$treatment, sd)\n     xi <- 0.3 + seq(rb$n)\n     points(xi, mn.t, col = \"orange\", pch = 18)\n     arrows(xi, mn.t - sd.t, xi, mn.t + sd.t,\n            code = 3, col = \"pink\", angle = 75, length = .1)\n     \n     ## boxplot on a matrix:\n     mat <- cbind(Uni05 = (1:100)/21, Norm = rnorm(100),\n                  `5T` = rt(100, df = 5), Gam2 = rgamma(100, shape = 2))\n     boxplot(mat) # directly, calling boxplot.matrix()\n     \n     ## boxplot on a data frame:\n     df. 
<- as.data.frame(mat)\n     par(las = 1) # all axis labels horizontal\n     boxplot(df., main = \"boxplot(*, horizontal = TRUE)\", horizontal = TRUE)\n     \n     ## Using 'at = ' and adding boxplots -- example idea by Roger Bivand :\n     boxplot(len ~ dose, data = ToothGrowth,\n             boxwex = 0.25, at = 1:3 - 0.2,\n             subset = supp == \"VC\", col = \"yellow\",\n             main = \"Guinea Pigs' Tooth Growth\",\n             xlab = \"Vitamin C dose mg\",\n             ylab = \"tooth length\",\n             xlim = c(0.5, 3.5), ylim = c(0, 35), yaxs = \"i\")\n     boxplot(len ~ dose, data = ToothGrowth, add = TRUE,\n             boxwex = 0.25, at = 1:3 + 0.2,\n             subset = supp == \"OJ\", col = \"orange\")\n     legend(2, 9, c(\"Ascorbic acid\", \"Orange juice\"),\n            fill = c(\"yellow\", \"orange\"))\n     \n     ## With less effort (slightly different) using factor *interaction*:\n     boxplot(len ~ dose:supp, data = ToothGrowth,\n             boxwex = 0.5, col = c(\"orange\", \"yellow\"),\n             main = \"Guinea Pigs' Tooth Growth\",\n             xlab = \"Vitamin C dose mg\", ylab = \"tooth length\",\n             sep = \":\", lex.order = TRUE, ylim = c(0, 35), yaxs = \"i\")\n     \n     ## more examples in  help(bxp)\n\n\n\n## `boxplot()` example\n\nReminder function signature\n```\nboxplot(formula, data = NULL, ..., subset, na.action = NULL,\n        xlab = mklab(y_var = horizontal),\n        ylab = mklab(y_var =!horizontal),\n        add = FALSE, ann = !add, horizontal = FALSE,\n        drop = FALSE, sep = \".\", lex.order = FALSE)\n```\n\nLet's practice\n\n::: {.cell}\n\n```{.r .cell-code}\nboxplot(IgG_concentration~age_group, data=df)\n```\n\n::: {.cell-output-display}\n![](Module10-DataVisualization_files/figure-revealjs/unnamed-chunk-18-1.png){width=960}\n:::\n\n```{.r .cell-code}\nboxplot(\n\tlog(df$IgG_concentration)~df$age_group, \n\tmain=\"Age by IgG Concentrations\", \n\txlab=\"Age Group (years)\", 
\n\tylab=\"log IgG Concentration (mIU/mL)\", \n\tnames=c(\"1-5\",\"6-10\", \"11-15\"), \n\tvarwidth=T\n\t)\n```\n\n::: {.cell-output-display}\n![](Module10-DataVisualization_files/figure-revealjs/unnamed-chunk-18-2.png){width=960}\n:::\n:::\n\n\n\n## `barplot()` Help File\n\n\n::: {.cell}\n\n```{.r .cell-code}\n?barplot\n```\n:::\n\nBar Plots\n\nDescription:\n\n     Creates a bar plot with vertical or horizontal bars.\n\nUsage:\n\n     barplot(height, ...)\n     \n     ## Default S3 method:\n     barplot(height, width = 1, space = NULL,\n             names.arg = NULL, legend.text = NULL, beside = FALSE,\n             horiz = FALSE, density = NULL, angle = 45,\n             col = NULL, border = par(\"fg\"),\n             main = NULL, sub = NULL, xlab = NULL, ylab = NULL,\n             xlim = NULL, ylim = NULL, xpd = TRUE, log = \"\",\n             axes = TRUE, axisnames = TRUE,\n             cex.axis = par(\"cex.axis\"), cex.names = par(\"cex.axis\"),\n             inside = TRUE, plot = TRUE, axis.lty = 0, offset = 0,\n             add = FALSE, ann = !add && par(\"ann\"), args.legend = NULL, ...)\n     \n     ## S3 method for class 'formula'\n     barplot(formula, data, subset, na.action,\n             horiz = FALSE, xlab = NULL, ylab = NULL, ...)\n     \nArguments:\n\n  height: either a vector or matrix of values describing the bars which\n          make up the plot.  If 'height' is a vector, the plot consists\n          of a sequence of rectangular bars with heights given by the\n          values in the vector.  If 'height' is a matrix and 'beside'\n          is 'FALSE' then each bar of the plot corresponds to a column\n          of 'height', with the values in the column giving the heights\n          of stacked sub-bars making up the bar.  If 'height' is a\n          matrix and 'beside' is 'TRUE', then the values in each column\n          are juxtaposed rather than stacked.\n\n   width: optional vector of bar widths. 
Re-cycled to length the number\n          of bars drawn.  Specifying a single value will have no\n          visible effect unless 'xlim' is specified.\n\n   space: the amount of space (as a fraction of the average bar width)\n          left before each bar.  May be given as a single number or one\n          number per bar.  If 'height' is a matrix and 'beside' is\n          'TRUE', 'space' may be specified by two numbers, where the\n          first is the space between bars in the same group, and the\n          second the space between the groups.  If not given\n          explicitly, it defaults to 'c(0,1)' if 'height' is a matrix\n          and 'beside' is 'TRUE', and to 0.2 otherwise.\n\nnames.arg: a vector of names to be plotted below each bar or group of\n          bars.  If this argument is omitted, then the names are taken\n          from the 'names' attribute of 'height' if this is a vector,\n          or the column names if it is a matrix.\n\nlegend.text: a vector of text used to construct a legend for the plot,\n          or a logical indicating whether a legend should be included.\n          This is only useful when 'height' is a matrix.  In that case\n          given legend labels should correspond to the rows of\n          'height'; if 'legend.text' is true, the row names of 'height'\n          will be used as labels if they are non-null.\n\n  beside: a logical value.  If 'FALSE', the columns of 'height' are\n          portrayed as stacked bars, and if 'TRUE' the columns are\n          portrayed as juxtaposed bars.\n\n   horiz: a logical value.  If 'FALSE', the bars are drawn vertically\n          with the first bar to the left.  If 'TRUE', the bars are\n          drawn horizontally with the first at the bottom.\n\n density: a vector giving the density of shading lines, in lines per\n          inch, for the bars or bar components.  The default value of\n          'NULL' means that no shading lines are drawn. 
Non-positive\n          values of 'density' also inhibit the drawing of shading\n          lines.\n\n   angle: the slope of shading lines, given as an angle in degrees\n          (counter-clockwise), for the bars or bar components.\n\n     col: a vector of colors for the bars or bar components.  By\n          default, '\"grey\"' is used if 'height' is a vector, and a\n          gamma-corrected grey palette if 'height' is a matrix; see\n          'grey.colors'.\n\n  border: the color to be used for the border of the bars.  Use 'border\n          = NA' to omit borders.  If there are shading lines, 'border =\n          TRUE' means use the same colour for the border as for the\n          shading lines.\n\nmain,sub: main title and subtitle for the plot.\n\n    xlab: a label for the x axis.\n\n    ylab: a label for the y axis.\n\n    xlim: limits for the x axis.\n\n    ylim: limits for the y axis.\n\n     xpd: logical. Should bars be allowed to go outside region?\n\n     log: string specifying if axis scales should be logarithmic; see\n          'plot.default'.\n\n    axes: logical.  If 'TRUE', a vertical (or horizontal, if 'horiz' is\n          true) axis is drawn.\n\naxisnames: logical.  If 'TRUE', and if there are 'names.arg' (see\n          above), the other axis is drawn (with 'lty = 0') and labeled.\n\ncex.axis: expansion factor for numeric axis labels (see 'par('cex')').\n\ncex.names: expansion factor for axis names (bar labels).\n\n  inside: logical.  If 'TRUE', the lines which divide adjacent\n          (non-stacked!) bars will be drawn.  Only applies when 'space\n          = 0' (which it partly is when 'beside = TRUE').\n\n    plot: logical.  If 'FALSE', nothing is plotted.\n\naxis.lty: the graphics parameter 'lty' (see 'par('lty')') applied to\n          the axis and tick marks of the categorical (default\n          horizontal) axis.  
Note that by default the axis is\n          suppressed.\n\n  offset: a vector indicating how much the bars should be shifted\n          relative to the x axis.\n\n     add: logical specifying if bars should be added to an already\n          existing plot; defaults to 'FALSE'.\n\n     ann: logical specifying if the default annotation ('main', 'sub',\n          'xlab', 'ylab') should appear on the plot, see 'title'.\n\nargs.legend: list of additional arguments to pass to 'legend()'; names\n          of the list are used as argument names.  Only used if\n          'legend.text' is supplied.\n\n formula: a formula where the 'y' variables are numeric data to plot\n          against the categorical 'x' variables.  The formula can have\n          one of three forms:\n\n                y ~ x\n                y ~ x1 + x2\n                cbind(y1, y2) ~ x\n          \n          (see the examples).\n\n    data: a data frame (or list) from which the variables in formula\n          should be taken.\n\n  subset: an optional vector specifying a subset of observations to be\n          used.\n\nna.action: a function which indicates what should happen when the data\n          contain 'NA' values.  The default is to ignore missing values\n          in the given variables.\n\n     ...: arguments to be passed to/from other methods.  For the\n          default method these can include further arguments (such as\n          'axes', 'asp' and 'main') and graphical parameters (see\n          'par') which are passed to 'plot.window()', 'title()' and\n          'axis'.\n\nValue:\n\n     A numeric vector (or matrix, when 'beside = TRUE'), say 'mp',\n     giving the coordinates of _all_ the bar midpoints drawn, useful\n     for adding to the graph.\n\n     If 'beside' is true, use 'colMeans(mp)' for the midpoints of each\n     _group_ of bars, see example.\n\nAuthor(s):\n\n     R Core, with a contribution by Arni Magnusson.\n\nReferences:\n\n     Becker, R. A., Chambers, J. M. and Wilks, A. R. 
(1988) _The New S\n     Language_.  Wadsworth & Brooks/Cole.\n\n     Murrell, P. (2005) _R Graphics_. Chapman & Hall/CRC Press.\n\nSee Also:\n\n     'plot(..., type = \"h\")', 'dotchart'; 'hist' for bars of a\n     _continuous_ variable.  'mosaicplot()', more sophisticated to\n     visualize _several_ categorical variables.\n\nExamples:\n\n     # Formula method\n     barplot(GNP ~ Year, data = longley)\n     barplot(cbind(Employed, Unemployed) ~ Year, data = longley)\n     \n     ## 3rd form of formula - 2 categories :\n     op <- par(mfrow = 2:1, mgp = c(3,1,0)/2, mar = .1+c(3,3:1))\n     summary(d.Titanic <- as.data.frame(Titanic))\n     barplot(Freq ~ Class + Survived, data = d.Titanic,\n             subset = Age == \"Adult\" & Sex == \"Male\",\n             main = \"barplot(Freq ~ Class + Survived, *)\", ylab = \"# {passengers}\", legend.text = TRUE)\n     # Corresponding table :\n     (xt <- xtabs(Freq ~ Survived + Class + Sex, d.Titanic, subset = Age==\"Adult\"))\n     # Alternatively, a mosaic plot :\n     mosaicplot(xt[,,\"Male\"], main = \"mosaicplot(Freq ~ Class + Survived, *)\", color=TRUE)\n     par(op)\n     \n     \n     # Default method\n     require(grDevices) # for colours\n     tN <- table(Ni <- stats::rpois(100, lambda = 5))\n     r <- barplot(tN, col = rainbow(20))\n     #- type = \"h\" plotting *is* 'bar'plot\n     lines(r, tN, type = \"h\", col = \"red\", lwd = 2)\n     \n     barplot(tN, space = 1.5, axisnames = FALSE,\n             sub = \"barplot(..., space= 1.5, axisnames = FALSE)\")\n     \n     barplot(VADeaths, plot = FALSE)\n     barplot(VADeaths, plot = FALSE, beside = TRUE)\n     \n     mp <- barplot(VADeaths) # default\n     tot <- colMeans(VADeaths)\n     text(mp, tot + 3, format(tot), xpd = TRUE, col = \"blue\")\n     barplot(VADeaths, beside = TRUE,\n             col = c(\"lightblue\", \"mistyrose\", \"lightcyan\",\n                     \"lavender\", \"cornsilk\"),\n             legend.text = rownames(VADeaths), ylim = c(0, 
100))\n     title(main = \"Death Rates in Virginia\", font.main = 4)\n     \n     hh <- t(VADeaths)[, 5:1]\n     mybarcol <- \"gray20\"\n     mp <- barplot(hh, beside = TRUE,\n             col = c(\"lightblue\", \"mistyrose\",\n                     \"lightcyan\", \"lavender\"),\n             legend.text = colnames(VADeaths), ylim = c(0,100),\n             main = \"Death Rates in Virginia\", font.main = 4,\n             sub = \"Faked upper 2*sigma error bars\", col.sub = mybarcol,\n             cex.names = 1.5)\n     segments(mp, hh, mp, hh + 2*sqrt(1000*hh/100), col = mybarcol, lwd = 1.5)\n     stopifnot(dim(mp) == dim(hh))  # corresponding matrices\n     mtext(side = 1, at = colMeans(mp), line = -2,\n           text = paste(\"Mean\", formatC(colMeans(hh))), col = \"red\")\n     \n     # Bar shading example\n     barplot(VADeaths, angle = 15+10*1:5, density = 20, col = \"black\",\n             legend.text = rownames(VADeaths))\n     title(main = list(\"Death Rates in Virginia\", font = 4))\n     \n     # Border color\n     barplot(VADeaths, border = \"dark blue\") \n     \n     \n     # Log scales (not much sense here)\n     barplot(tN, col = heat.colors(12), log = \"y\")\n     barplot(tN, col = gray.colors(20), log = \"xy\")\n     \n     # Legend location\n     barplot(height = cbind(x = c(465, 91) / 465 * 100,\n                            y = c(840, 200) / 840 * 100,\n                            z = c(37, 17) / 37 * 100),\n             beside = FALSE,\n             width = c(465, 840, 37),\n             col = c(1, 2),\n             legend.text = c(\"A\", \"B\"),\n             args.legend = list(x = \"topleft\"))\n\n\n\n## `barplot()` example\n\nThe function takes a lot of arguments to control the way our data is plotted. 
\n\nReminder function signature\n```\nbarplot(height, width = 1, space = NULL,\n        names.arg = NULL, legend.text = NULL, beside = FALSE,\n        horiz = FALSE, density = NULL, angle = 45,\n        col = NULL, border = par(\"fg\"),\n        main = NULL, sub = NULL, xlab = NULL, ylab = NULL,\n        xlim = NULL, ylim = NULL, xpd = TRUE, log = \"\",\n        axes = TRUE, axisnames = TRUE,\n        cex.axis = par(\"cex.axis\"), cex.names = par(\"cex.axis\"),\n        inside = TRUE, plot = TRUE, axis.lty = 0, offset = 0,\n        add = FALSE, ann = !add && par(\"ann\"), args.legend = NULL, ...)\n```\n\n\n::: {.cell}\n\n```{.r .cell-code}\nfreq <- table(df$seropos, df$age_group)\nbarplot(freq)\n```\n\n::: {.cell-output-display}\n![](Module10-DataVisualization_files/figure-revealjs/unnamed-chunk-21-1.png){width=960}\n:::\n\n```{.r .cell-code}\nprop.cell.percentages <- prop.table(freq)\nbarplot(prop.cell.percentages)\n```\n\n::: {.cell-output-display}\n![](Module10-DataVisualization_files/figure-revealjs/unnamed-chunk-21-2.png){width=960}\n:::\n:::\n\n\n## 3. Legend!\n\nIn Base R plotting the legend is not automatically generated.  This is nice because it gives you a huge amount of control over how your legend looks, but it is also easy to mislabel your colors, symbols, line types, etc. So, basically be careful.\n\n\n::: {.cell}\n\n```{.r .cell-code}\n?legend\n```\n:::\n\n::: {.cell}\n::: {.cell-output .cell-output-stdout}\n```\nAdd Legends to Plots\n\nDescription:\n\n     This function can be used to add legends to plots.  
Note that a\n     call to the function 'locator(1)' can be used in place of the 'x'\n     and 'y' arguments.\n\nUsage:\n\n     legend(x, y = NULL, legend, fill = NULL, col = par(\"col\"),\n            border = \"black\", lty, lwd, pch,\n            angle = 45, density = NULL, bty = \"o\", bg = par(\"bg\"),\n            box.lwd = par(\"lwd\"), box.lty = par(\"lty\"), box.col = par(\"fg\"),\n            pt.bg = NA, cex = 1, pt.cex = cex, pt.lwd = lwd,\n            xjust = 0, yjust = 1, x.intersp = 1, y.intersp = 1,\n            adj = c(0, 0.5), text.width = NULL, text.col = par(\"col\"),\n            text.font = NULL, merge = do.lines && has.pch, trace = FALSE,\n            plot = TRUE, ncol = 1, horiz = FALSE, title = NULL,\n            inset = 0, xpd, title.col = text.col[1], title.adj = 0.5,\n            title.cex = cex[1], title.font = text.font[1],\n            seg.len = 2)\n     \nArguments:\n\n    x, y: the x and y co-ordinates to be used to position the legend.\n          They can be specified by keyword or in any way which is\n          accepted by 'xy.coords': See 'Details'.\n\n  legend: a character or expression vector of length >= 1 to appear in\n          the legend.  Other objects will be coerced by\n          'as.graphicsAnnot'.\n\n    fill: if specified, this argument will cause boxes filled with the\n          specified colors (or shaded in the specified colors) to\n          appear beside the legend text.\n\n     col: the color of points or lines appearing in the legend.\n\n  border: the border color for the boxes (used only if 'fill' is\n          specified).\n\nlty, lwd: the line types and widths for lines appearing in the legend.\n          One of these two _must_ be specified for line drawing.\n\n     pch: the plotting symbols appearing in the legend, as numeric\n          vector or a vector of 1-character strings (see 'points').\n          Unlike 'points', this can all be specified as a single\n          multi-character string.  
_Must_ be specified for symbol\n          drawing.\n\n   angle: angle of shading lines.\n\n density: the density of shading lines, if numeric and positive. If\n          'NULL' or negative or 'NA' color filling is assumed.\n\n     bty: the type of box to be drawn around the legend.  The allowed\n          values are '\"o\"' (the default) and '\"n\"'.\n\n      bg: the background color for the legend box.  (Note that this is\n          only used if 'bty != \"n\"'.)\n\nbox.lty, box.lwd, box.col: the line type, width and color for the\n          legend box (if 'bty = \"o\"').\n\n   pt.bg: the background color for the 'points', corresponding to its\n          argument 'bg'.\n\n     cex: character expansion factor *relative* to current\n          'par(\"cex\")'.  Used for text, and provides the default for\n          'pt.cex'.\n\n  pt.cex: expansion factor(s) for the points.\n\n  pt.lwd: line width for the points, defaults to the one for lines, or\n          if that is not set, to 'par(\"lwd\")'.\n\n   xjust: how the legend is to be justified relative to the legend x\n          location.  A value of 0 means left justified, 0.5 means\n          centered and 1 means right justified.\n\n   yjust: the same as 'xjust' for the legend y location.\n\nx.intersp: character interspacing factor for horizontal (x) spacing\n          between symbol and legend text.\n\ny.intersp: vertical (y) distances (in lines of text shared above/below\n          each legend entry).  A vector with one element for each row\n          of the legend can be used.\n\n     adj: numeric of length 1 or 2; the string adjustment for legend\n          text.  Useful for y-adjustment when 'labels' are plotmath\n          expressions.\n\ntext.width: the width of the legend text in x ('\"user\"') coordinates.\n          (Should be positive even for a reversed x axis.)  
Can be a\n          single positive numeric value (same width for each column of\n          the legend), a vector (one element for each column of the\n          legend), 'NULL' (default) for computing a proper maximum\n          value of 'strwidth(legend)'), or 'NA' for computing a proper\n          column wise maximum value of 'strwidth(legend)').\n\ntext.col: the color used for the legend text.\n\ntext.font: the font used for the legend text, see 'text'.\n\n   merge: logical; if 'TRUE', merge points and lines but not filled\n          boxes.  Defaults to 'TRUE' if there are points and lines.\n\n   trace: logical; if 'TRUE', shows how 'legend' does all its magical\n          computations.\n\n    plot: logical.  If 'FALSE', nothing is plotted but the sizes are\n          returned.\n\n    ncol: the number of columns in which to set the legend items\n          (default is 1, a vertical legend).\n\n   horiz: logical; if 'TRUE', set the legend horizontally rather than\n          vertically (specifying 'horiz' overrides the 'ncol'\n          specification).\n\n   title: a character string or length-one expression giving a title to\n          be placed at the top of the legend.  
Other objects will be\n          coerced by 'as.graphicsAnnot'.\n\n   inset: inset distance(s) from the margins as a fraction of the plot\n          region when legend is placed by keyword.\n\n     xpd: if supplied, a value of the graphical parameter 'xpd' to be\n          used while the legend is being drawn.\n\ntitle.col: color for 'title', defaults to 'text.col[1]'.\n\ntitle.adj: horizontal adjustment for 'title': see the help for\n          'par(\"adj\")'.\n\ntitle.cex: expansion factor(s) for the title, defaults to 'cex[1]'.\n\ntitle.font: the font used for the legend title, defaults to\n          'text.font[1]', see 'text'.\n\n seg.len: the length of lines drawn to illustrate 'lty' and/or 'lwd'\n          (in units of character widths).\n\nDetails:\n\n     Arguments 'x', 'y', 'legend' are interpreted in a non-standard way\n     to allow the coordinates to be specified _via_ one or two\n     arguments.  If 'legend' is missing and 'y' is not numeric, it is\n     assumed that the second argument is intended to be 'legend' and\n     that the first argument specifies the coordinates.\n\n     The coordinates can be specified in any way which is accepted by\n     'xy.coords'.  If this gives the coordinates of one point, it is\n     used as the top-left coordinate of the rectangle containing the\n     legend.  If it gives the coordinates of two points, these specify\n     opposite corners of the rectangle (either pair of corners, in any\n     order).\n\n     The location may also be specified by setting 'x' to a single\n     keyword from the list '\"bottomright\"', '\"bottom\"', '\"bottomleft\"',\n     '\"left\"', '\"topleft\"', '\"top\"', '\"topright\"', '\"right\"' and\n     '\"center\"'. This places the legend on the inside of the plot frame\n     at the given location. Partial argument matching is used.  The\n     optional 'inset' argument specifies how far the legend is inset\n     from the plot margins.  
If a single value is given, it is used for\n     both margins; if two values are given, the first is used for 'x'-\n     distance, the second for 'y'-distance.\n\n     Attribute arguments such as 'col', 'pch', 'lty', etc, are recycled\n     if necessary: 'merge' is not.  Set entries of 'lty' to '0' or set\n     entries of 'lwd' to 'NA' to suppress lines in corresponding legend\n     entries; set 'pch' values to 'NA' to suppress points.\n\n     Points are drawn _after_ lines in order that they can cover the\n     line with their background color 'pt.bg', if applicable.\n\n     See the examples for how to right-justify labels.\n\n     Since they are not used for Unicode code points, values '-31:-1'\n     are silently omitted, as are 'NA' and '\"\"' values.\n\nValue:\n\n     A list with list components\n\n    rect: a list with components\n\n          'w', 'h' positive numbers giving *w*idth and *h*eight of the\n              legend's box.\n\n          'left', 'top' x and y coordinates of upper left corner of the\n              box.\n\n    text: a list with components\n\n          'x, y' numeric vectors of length 'length(legend)', giving the\n              x and y coordinates of the legend's text(s).\n\n     returned invisibly.\n\nReferences:\n\n     Becker, R. A., Chambers, J. M. and Wilks, A. R. (1988) _The New S\n     Language_.  Wadsworth & Brooks/Cole.\n\n     Murrell, P. (2005) _R Graphics_. 
Chapman & Hall/CRC Press.\n\nSee Also:\n\n     'plot', 'barplot' which uses 'legend()', and 'text' for more\n     examples of math expressions.\n\nExamples:\n\n     ## Run the example in '?matplot' or the following:\n     leg.txt <- c(\"Setosa     Petals\", \"Setosa     Sepals\",\n                  \"Versicolor Petals\", \"Versicolor Sepals\")\n     y.leg <- c(4.5, 3, 2.1, 1.4, .7)\n     cexv  <- c(1.2, 1, 4/5, 2/3, 1/2)\n     matplot(c(1, 8), c(0, 4.5), type = \"n\", xlab = \"Length\", ylab = \"Width\",\n             main = \"Petal and Sepal Dimensions in Iris Blossoms\")\n     for (i in seq(cexv)) {\n       text  (1, y.leg[i] - 0.1, paste(\"cex=\", formatC(cexv[i])), cex = 0.8, adj = 0)\n       legend(3, y.leg[i], leg.txt, pch = \"sSvV\", col = c(1, 3), cex = cexv[i])\n     }\n     ## cex *vector* [in R <= 3.5.1 has 'if(xc < 0)' w/ length(xc) == 2]\n     legend(\"right\", leg.txt, pch = \"sSvV\", col = c(1, 3),\n            cex = 1+(-1:2)/8, trace = TRUE)# trace: show computed lengths & coords\n     \n     ## 'merge = TRUE' for merging lines & points:\n     x <- seq(-pi, pi, length.out = 65)\n     for(reverse in c(FALSE, TRUE)) {  ## normal *and* reverse axes:\n       F <- if(reverse) rev else identity\n       plot(x, sin(x), type = \"l\", col = 3, lty = 2,\n            xlim = F(range(x)), ylim = F(c(-1.2, 1.8)))\n       points(x, cos(x), pch = 3, col = 4)\n       lines(x, tan(x), type = \"b\", lty = 1, pch = 4, col = 6)\n       title(\"legend('top', lty = c(2, -1, 1), pch = c(NA, 3, 4), merge = TRUE)\",\n             cex.main = 1.1)\n       legend(\"top\", c(\"sin\", \"cos\", \"tan\"), col = c(3, 4, 6),\n            text.col = \"green4\", lty = c(2, -1, 1), pch = c(NA, 3, 4),\n            merge = TRUE, bg = \"gray90\", trace=TRUE)\n       \n     } # for(..)\n     \n     ## right-justifying a set of labels: thanks to Uwe Ligges\n     x <- 1:5; y1 <- 1/x; y2 <- 2/x\n     plot(rep(x, 2), c(y1, y2), type = \"n\", xlab = \"x\", ylab = \"y\")\n     lines(x, y1); 
lines(x, y2, lty = 2)\n     temp <- legend(\"topright\", legend = c(\" \", \" \"),\n                    text.width = strwidth(\"1,000,000\"),\n                    lty = 1:2, xjust = 1, yjust = 1, inset = 1/10,\n                    title = \"Line Types\", title.cex = 0.5, trace=TRUE)\n     text(temp$rect$left + temp$rect$w, temp$text$y,\n          c(\"1,000\", \"1,000,000\"), pos = 2)\n     \n     \n     ##--- log scaled Examples ------------------------------\n     leg.txt <- c(\"a one\", \"a two\")\n     \n     par(mfrow = c(2, 2))\n     for(ll in c(\"\",\"x\",\"y\",\"xy\")) {\n       plot(2:10, log = ll, main = paste0(\"log = '\", ll, \"'\"))\n       abline(1, 1)\n       lines(2:3, 3:4, col = 2)\n       points(2, 2, col = 3)\n       rect(2, 3, 3, 2, col = 4)\n       text(c(3,3), 2:3, c(\"rect(2,3,3,2, col=4)\",\n                           \"text(c(3,3),2:3,\\\"c(rect(...)\\\")\"), adj = c(0, 0.3))\n       legend(list(x = 2,y = 8), legend = leg.txt, col = 2:3, pch = 1:2,\n              lty = 1)  #, trace = TRUE)\n     } #      ^^^^^^^ to force lines -> automatic merge=TRUE\n     par(mfrow = c(1,1))\n     \n     ##-- Math expressions:  ------------------------------\n     x <- seq(-pi, pi, length.out = 65)\n     plot(x, sin(x), type = \"l\", col = 2, xlab = expression(phi),\n          ylab = expression(f(phi)))\n     abline(h = -1:1, v = pi/2*(-6:6), col = \"gray90\")\n     lines(x, cos(x), col = 3, lty = 2)\n     ex.cs1 <- expression(plain(sin) * phi,  paste(\"cos\", phi))  # 2 ways\n     utils::str(legend(-3, .9, ex.cs1, lty = 1:2, plot = FALSE,\n                adj = c(0, 0.6)))  # adj y !\n     legend(-3, 0.9, ex.cs1, lty = 1:2, col = 2:3,  adj = c(0, 0.6))\n     \n     require(stats)\n     x <- rexp(100, rate = .5)\n     hist(x, main = \"Mean and Median of a Skewed Distribution\")\n     abline(v = mean(x),   col = 2, lty = 2, lwd = 2)\n     abline(v = median(x), col = 3, lty = 3, lwd = 2)\n     ex12 <- expression(bar(x) == sum(over(x[i], n), i == 1, n),\n      
                  hat(x) == median(x[i], i == 1, n))\n     utils::str(legend(4.1, 30, ex12, col = 2:3, lty = 2:3, lwd = 2))\n     \n     ## 'Filled' boxes -- see also example(barplot) which may call legend(*, fill=)\n     barplot(VADeaths)\n     legend(\"topright\", rownames(VADeaths), fill = gray.colors(nrow(VADeaths)))\n     \n     ## Using 'ncol'\n     x <- 0:64/64\n     for(R in c(identity, rev)) { # normal *and* reverse x-axis works fine:\n       xl <- R(range(x)); x1 <- xl[1]\n     matplot(x, outer(x, 1:7, function(x, k) sin(k * pi * x)), xlim=xl,\n             type = \"o\", col = 1:7, ylim = c(-1, 1.5), pch = \"*\")\n     op <- par(bg = \"antiquewhite1\")\n     legend(x1, 1.5, paste(\"sin(\", 1:7, \"pi * x)\"), col = 1:7, lty = 1:7,\n            pch = \"*\", ncol = 4, cex = 0.8)\n     legend(\"bottomright\", paste(\"sin(\", 1:7, \"pi * x)\"), col = 1:7, lty = 1:7,\n            pch = \"*\", cex = 0.8)\n     legend(x1, -.1, paste(\"sin(\", 1:4, \"pi * x)\"), col = 1:4, lty = 1:4,\n            ncol = 2, cex = 0.8)\n     legend(x1, -.4, paste(\"sin(\", 5:7, \"pi * x)\"), col = 4:6,  pch = 24,\n            ncol = 2, cex = 1.5, lwd = 2, pt.bg = \"pink\", pt.cex = 1:3)\n     par(op)\n       \n     } # for(..)\n     \n     ## point covering line :\n     y <- sin(3*pi*x)\n     plot(x, y, type = \"l\", col = \"blue\",\n         main = \"points with bg & legend(*, pt.bg)\")\n     points(x, y, pch = 21, bg = \"white\")\n     legend(.4,1, \"sin(c x)\", pch = 21, pt.bg = \"white\", lty = 1, col = \"blue\")\n     \n     ## legends with titles at different locations\n     plot(x, y, type = \"n\")\n     legend(\"bottomright\", \"(x,y)\", pch=1, title= \"bottomright\")\n     legend(\"bottom\",      \"(x,y)\", pch=1, title= \"bottom\")\n     legend(\"bottomleft\",  \"(x,y)\", pch=1, title= \"bottomleft\")\n     legend(\"left\",        \"(x,y)\", pch=1, title= \"left\")\n     legend(\"topleft\",     \"(x,y)\", pch=1, title= \"topleft, inset = .05\", inset = .05)\n     
legend(\"top\",         \"(x,y)\", pch=1, title= \"top\")\n     legend(\"topright\",    \"(x,y)\", pch=1, title= \"topright, inset = .02\",inset = .02)\n     legend(\"right\",       \"(x,y)\", pch=1, title= \"right\")\n     legend(\"center\",      \"(x,y)\", pch=1, title= \"center\")\n     \n     # using text.font (and text.col):\n     op <- par(mfrow = c(2, 2), mar = rep(2.1, 4))\n     c6 <- terrain.colors(10)[1:6]\n     for(i in 1:4) {\n        plot(1, type = \"n\", axes = FALSE, ann = FALSE); title(paste(\"text.font =\",i))\n        legend(\"top\", legend = LETTERS[1:6], col = c6,\n               ncol = 2, cex = 2, lwd = 3, text.font = i, text.col = c6)\n     }\n     par(op)\n     \n     # using text.width for several columns\n     plot(1, type=\"n\")\n     legend(\"topleft\", c(\"This legend\", \"has\", \"equally sized\", \"columns.\"),\n            pch = 1:4, ncol = 4)\n     legend(\"bottomleft\", c(\"This legend\", \"has\", \"optimally sized\", \"columns.\"),\n            pch = 1:4, ncol = 4, text.width = NA)\n     legend(\"right\", letters[1:4], pch = 1:4, ncol = 4,\n            text.width = 1:4 / 50)\n```\n:::\n:::\n\n\n\n\n## Add legend to the plot\n\nReminder function signature\n```\nlegend(x, y = NULL, legend, fill = NULL, col = par(\"col\"),\n       border = \"black\", lty, lwd, pch,\n       angle = 45, density = NULL, bty = \"o\", bg = par(\"bg\"),\n       box.lwd = par(\"lwd\"), box.lty = par(\"lty\"), box.col = par(\"fg\"),\n       pt.bg = NA, cex = 1, pt.cex = cex, pt.lwd = lwd,\n       xjust = 0, yjust = 1, x.intersp = 1, y.intersp = 1,\n       adj = c(0, 0.5), text.width = NULL, text.col = par(\"col\"),\n       text.font = NULL, merge = do.lines && has.pch, trace = FALSE,\n       plot = TRUE, ncol = 1, horiz = FALSE, title = NULL,\n       inset = 0, xpd, title.col = text.col[1], title.adj = 0.5,\n       title.cex = cex[1], title.font = text.font[1],\n       seg.len = 2)\n```\n\nLet's practice\n\n::: {.cell}\n\n```{.r 
.cell-code}\nbarplot(prop.cell.percentages, col=c(\"darkblue\",\"red\"), ylim=c(0,0.5), main=\"Seropositivity by Age Group\")\nlegend(x=2.5, y=0.5,\n\t\t\t fill=c(\"darkblue\",\"red\"), \n\t\t\t legend = c(\"seronegative\", \"seropositive\"))\n```\n:::\n\n\n\n## Add legend to the plot\n\n\n::: {.cell}\n::: {.cell-output-display}\n![](Module10-DataVisualization_files/figure-revealjs/unnamed-chunk-25-1.png){width=960}\n:::\n:::\n\n\n\n## `barplot()` example\n\nGetting closer, but what I really want is column proportions (i.e., the proportions should sum to one for each age group). Also, the age groups need more meaningful names.\n\n\n::: {.cell}\n\n```{.r .cell-code}\nfreq <- table(df$seropos, df$age_group)\nprop.column.percentages <- prop.table(freq, margin=2)\ncolnames(prop.column.percentages) <- c(\"1-5 yo\", \"6-10 yo\", \"11-15 yo\")\n\nbarplot(prop.column.percentages, col=c(\"darkblue\",\"red\"), ylim=c(0,1.35), main=\"Seropositivity by Age Group\")\naxis(2, at = c(0.2, 0.4, 0.6, 0.8,1))\nlegend(x=2.8, y=1.35,\n\t\t\t fill=c(\"darkblue\",\"red\"), \n\t\t\t legend = c(\"seronegative\", \"seropositive\"))\n```\n:::\n\n\n## `barplot()` example\n\n\n::: {.cell}\n::: {.cell-output-display}\n![](Module10-DataVisualization_files/figure-revealjs/unnamed-chunk-27-1.png){width=960}\n:::\n:::\n\n\n\n\n## `barplot()` example\n\nNow, let look at seropositivity by two individual level characteristics in the same plot. 
\n\n\n::: {.cell}\n\n:::\n\n::: {.cell}\n\n```{.r .cell-code}\npar(mfrow = c(1,2))\nbarplot(prop.column.percentages, col=c(\"darkblue\",\"red\"), ylim=c(0,1.35), main=\"Seropositivity by Age Group\")\naxis(2, at = c(0.2, 0.4, 0.6, 0.8,1))\nlegend(\"topright\",\n\t\t\t fill=c(\"darkblue\",\"red\"), \n\t\t\t legend = c(\"seronegative\", \"seropositive\"))\n\nbarplot(prop.column.percentages2, col=c(\"darkblue\",\"red\"), ylim=c(0,1.35), main=\"Seropositivity by Residence\")\naxis(2, at = c(0.2, 0.4, 0.6, 0.8,1))\nlegend(\"topright\", fill=c(\"darkblue\",\"red\"),  legend = c(\"seronegative\", \"seropositive\"))\n```\n:::\n\n\n\n## `barplot()` example\n\n\n::: {.cell}\n::: {.cell-output-display}\n![](Module10-DataVisualization_files/figure-revealjs/unnamed-chunk-30-1.png){width=960}\n:::\n:::\n\n\n\n## Summary\n\n-\t\t\n\n## Acknowledgements\n\nThese are the materials we looked through, modified, or extracted to complete this module's lecture.\n\n-   [\"Base Plotting in R\" by Medium](https://towardsdatascience.com/base-plotting-in-r-eb365da06b22)\n-\t\t[\"Base R margins: a cheatsheet\"](https://r-graph-gallery.com/74-margin-and-oma-cheatsheet.html)\n",
     "supporting": [
       "Module10-DataVisualization_files"
     ],
diff --git a/_freeze/modules/Module10-DataVisualization/figure-revealjs/unnamed-chunk-15-2.png b/_freeze/modules/Module10-DataVisualization/figure-revealjs/unnamed-chunk-15-2.png
index 5656265..4e5c9c8 100644
Binary files a/_freeze/modules/Module10-DataVisualization/figure-revealjs/unnamed-chunk-15-2.png and b/_freeze/modules/Module10-DataVisualization/figure-revealjs/unnamed-chunk-15-2.png differ
diff --git a/_freeze/modules/Module10-DataVisualization/figure-revealjs/unnamed-chunk-25-1.png b/_freeze/modules/Module10-DataVisualization/figure-revealjs/unnamed-chunk-25-1.png
index 232d44e..edfae88 100644
Binary files a/_freeze/modules/Module10-DataVisualization/figure-revealjs/unnamed-chunk-25-1.png and b/_freeze/modules/Module10-DataVisualization/figure-revealjs/unnamed-chunk-25-1.png differ
diff --git a/_freeze/modules/Module10-DataVisualization/figure-revealjs/unnamed-chunk-26-1.png b/_freeze/modules/Module10-DataVisualization/figure-revealjs/unnamed-chunk-26-1.png
index 1abfaa6..232d44e 100644
Binary files a/_freeze/modules/Module10-DataVisualization/figure-revealjs/unnamed-chunk-26-1.png and b/_freeze/modules/Module10-DataVisualization/figure-revealjs/unnamed-chunk-26-1.png differ
diff --git a/_freeze/modules/Module10-DataVisualization/figure-revealjs/unnamed-chunk-27-1.png b/_freeze/modules/Module10-DataVisualization/figure-revealjs/unnamed-chunk-27-1.png
index 1abfaa6..232d44e 100644
Binary files a/_freeze/modules/Module10-DataVisualization/figure-revealjs/unnamed-chunk-27-1.png and b/_freeze/modules/Module10-DataVisualization/figure-revealjs/unnamed-chunk-27-1.png differ
diff --git a/_freeze/modules/Module10-DataVisualization/figure-revealjs/unnamed-chunk-28-1.png b/_freeze/modules/Module10-DataVisualization/figure-revealjs/unnamed-chunk-28-1.png
new file mode 100644
index 0000000..c6eb02c
Binary files /dev/null and b/_freeze/modules/Module10-DataVisualization/figure-revealjs/unnamed-chunk-28-1.png differ
diff --git a/_freeze/modules/Module10-DataVisualization/figure-revealjs/unnamed-chunk-30-1.png b/_freeze/modules/Module10-DataVisualization/figure-revealjs/unnamed-chunk-30-1.png
new file mode 100644
index 0000000..c6eb02c
Binary files /dev/null and b/_freeze/modules/Module10-DataVisualization/figure-revealjs/unnamed-chunk-30-1.png differ
diff --git a/_freeze/modules/ModuleXX-Iteration/execute-results/html.json b/_freeze/modules/ModuleXX-Iteration/execute-results/html.json
index e645b03..1afe07a 100644
--- a/_freeze/modules/ModuleXX-Iteration/execute-results/html.json
+++ b/_freeze/modules/ModuleXX-Iteration/execute-results/html.json
@@ -1,8 +1,7 @@
 {
-  "hash": "08f7e544d2bf32fea09fb30b8607df0b",
+  "hash": "3038ecdd34c4713f40e08365819703e7",
   "result": {
-    "engine": "knitr",
-    "markdown": "---\ntitle: \"Iteration in R\"\nformat: revealjs\n---\n\n\n\n\n\n\n\n## Learning goals\n\n1. Replace repetitive code with a `for` loop\n1. Compare and contrast `for` loops and `*apply()` functions\n1. Use vectorization to replace unnecessary loops\n\n## What is iteration?\n\n* Whenever you repeat something, that's iteration.\n* In `R`, this means running the same code multiple times in a row.\n\n\n\n\n::: {.cell}\n\n```{.r .cell-code}\ndata(\"penguins\", package = \"palmerpenguins\")\nfor (this_island in levels(penguins$island)) {\n\tisland_mean <-\n\t\tpenguins$bill_depth_mm[penguins$island == this_island] |>\n\t\tmean(na.rm = TRUE) |>\n\t\tround(digits = 2)\n\t\n\tcat(paste(\"The mean bill depth on\", this_island, \"Island was\", island_mean,\n\t\t\t\t\t\t\t\"mm.\\n\"))\n}\n```\n\n::: {.cell-output .cell-output-stdout}\n\n```\nThe mean bill depth on Biscoe Island was 15.87 mm.\nThe mean bill depth on Dream Island was 18.34 mm.\nThe mean bill depth on Torgersen Island was 18.43 mm.\n```\n\n\n:::\n:::\n\n\n\n\n## Parts of a loop\n\n\n\n\n::: {.cell}\n\n```{.r .cell-code  code-line-numbers=\"1,9\"}\nfor (this_island in levels(penguins$island)) {\n\tisland_mean <-\n\t\tpenguins$bill_depth_mm[penguins$island == this_island] |>\n\t\tmean(na.rm = TRUE) |>\n\t\tround(digits = 2)\n\t\n\tcat(paste(\"The mean bill depth on\", this_island, \"Island was\", island_mean,\n\t\t\t\t\t\t\t\"mm.\\n\"))\n}\n```\n:::\n\n\n\n\nThe **header** declares how many times we will repeat the same code. 
The header\ncontains a **control variable** that changes in each repetition and a\n**sequence** of values for the control variable to take.\n\n## Parts of a loop\n\n\n\n\n::: {.cell}\n\n```{.r .cell-code  code-line-numbers=\"2-8\"}\nfor (this_island in levels(penguins$island)) {\n\tisland_mean <-\n\t\tpenguins$bill_depth_mm[penguins$island == this_island] |>\n\t\tmean(na.rm = TRUE) |>\n\t\tround(digits = 2)\n\t\n\tcat(paste(\"The mean bill depth on\", this_island, \"Island was\", island_mean,\n\t\t\t\t\t\t\t\"mm.\\n\"))\n}\n```\n:::\n\n\n\n\nThe **body** of the loop contains code that will be repeated a number of times\nbased on the header instructions. In `R`, the body has to be surrounded by\ncurly braces.\n\n## Header parts\n\n\n\n\n::: {.cell}\n\n```{.r .cell-code}\nfor (this_island in levels(penguins$island)) {...}\n```\n:::\n\n\n\n\n* `for`: keyword that declares we are doing a for loop.\n* `(...)`: parentheses after `for` declare the control variable and sequence.\n* `this_island`: the control variable.\n* `in`: keyword that separates the control varibale and sequence.\n* `levels(penguins$island)`: the sequence.\n* `{}`: curly braces will contain the body code.\n\n## Header parts\n\n\n\n\n::: {.cell}\n\n```{.r .cell-code}\nfor (this_island in levels(penguins$island)) {...}\n```\n:::\n\n\n\n\n* Since `levels(penguins$island)` evaluates to\n`c(\"Biscoe\", \"Dream\", \"Torgersen\")`, our loop will repeat 3 times.\n\n| Iteration | `this_island` |\n|-----------|---------------|\n| 1         | \"Biscoe\"      |\n| 2         | \"Dream\"       |\n| 3         | \"Torgersen\"   |\n\n* Everything inside of `{...}` will be repeated three times.\n\n## Loop iteration 1\n\n\n\n\n::: {.cell}\n\n```{.r .cell-code}\nisland_mean <-\n\tpenguins$bill_depth_mm[penguins$island == \"Biscoe\"] |>\n\tmean(na.rm = TRUE) |>\n\tround(digits = 2)\n\ncat(paste(\"The mean bill depth on\", \"Biscoe\", \"Island was\", island_mean,\n\t\t\t\t\t\"mm.\\n\"))\n```\n\n::: {.cell-output 
.cell-output-stdout}\n\n```\nThe mean bill depth on Biscoe Island was 15.87 mm.\n```\n\n\n:::\n:::\n\n\n\n\n## Loop iteration 2\n\n\n\n\n::: {.cell}\n\n```{.r .cell-code}\nisland_mean <-\n\tpenguins$bill_depth_mm[penguins$island == \"Dream\"] |>\n\tmean(na.rm = TRUE) |>\n\tround(digits = 2)\n\ncat(paste(\"The mean bill depth on\", \"Dream\", \"Island was\", island_mean,\n\t\t\t\t\t\"mm.\\n\"))\n```\n\n::: {.cell-output .cell-output-stdout}\n\n```\nThe mean bill depth on Dream Island was 18.34 mm.\n```\n\n\n:::\n:::\n\n\n\n\n## Loop iteration 3\n\n\n\n\n::: {.cell}\n\n```{.r .cell-code}\nisland_mean <-\n\tpenguins$bill_depth_mm[penguins$island == \"Torgersen\"] |>\n\tmean(na.rm = TRUE) |>\n\tround(digits = 2)\n\ncat(paste(\"The mean bill depth on\", \"Torgersen\", \"Island was\", island_mean,\n\t\t\t\t\t\"mm.\\n\"))\n```\n\n::: {.cell-output .cell-output-stdout}\n\n```\nThe mean bill depth on Torgersen Island was 18.43 mm.\n```\n\n\n:::\n:::\n\n\n\n\n## The loop structure automates this process for us so we don't have to copy and paste our code!\n\n\n\n\n::: {.cell}\n\n```{.r .cell-code}\nfor (this_island in levels(penguins$island)) {\n\tisland_mean <-\n\t\tpenguins$bill_depth_mm[penguins$island == this_island] |>\n\t\tmean(na.rm = TRUE) |>\n\t\tround(digits = 2)\n\t\n\tcat(paste(\"The mean bill depth on\", this_island, \"Island was\", island_mean,\n\t\t\t\t\t\t\t\"mm.\\n\"))\n}\n```\n\n::: {.cell-output .cell-output-stdout}\n\n```\nThe mean bill depth on Biscoe Island was 15.87 mm.\nThe mean bill depth on Dream Island was 18.34 mm.\nThe mean bill depth on Torgersen Island was 18.43 mm.\n```\n\n\n:::\n:::\n\n\n\n\n## Remember: write DRY code!\n\n* DRY = \"Don't Repeat Yourself\"\n* Instead of copying and pasting, write loops and functions.\n* Easier to debug and change in the future!\n\n. . .\n\n* Of course, we all copy and paste code sometimes. 
If you are running on a\ntight deadline or can't get a loop or function to work, you might need to.\n**DRY code is good, but working code is best!**\n\n## {#tweet-slide data-menu-title=\"Hadley tweet\" .center}\n\n\n\n\n::: {.cell}\n::: {.cell-output-display}\n![](../images/hadley-tweet.PNG)\n:::\n:::\n\n\n\n\n## You try it!\n\nWrite a loop that goes from 1 to 10, squares each of the numbers, and prints\nthe squared number.\n\n. . .\n\n\n\n\n::: {.cell}\n\n```{.r .cell-code}\nfor (i in 1:10) {\n\tcat(i ^ 2, \"\\n\")\n}\n```\n\n::: {.cell-output .cell-output-stdout}\n\n```\n1 \n4 \n9 \n16 \n25 \n36 \n49 \n64 \n81 \n100 \n```\n\n\n:::\n:::\n\n\n\n\n## Wait, did we need to do that? {.incremental}\n\n* Well, yes, because you need to practice loops!\n* But technically no, because we can use **vectorization**.\n* Almost all basic operations in R are **vectorized**: they work on a vector of\narguments all at the same time.\n\n## Wait, did we need to do that?\n\n* Well, yes, because you need to practice loops!\n* But technically no, because we can use **vectorization**.\n* Almost all basic operations in R are **vectorized**: they work on a vector of\narguments all at the same time.\n\n\n\n\n::: {.cell}\n\n```{.r .cell-code}\n# No loop needed!\n(1:10)^2\n```\n\n::: {.cell-output .cell-output-stdout}\n\n```\n [1]   1   4   9  16  25  36  49  64  81 100\n```\n\n\n:::\n:::\n\n\n\n\n## Wait, did we need to do that?\n\n* Well, yes, because you need to practice loops!\n* But technically no, because we can use **vectorization**.\n* Almost all basic operations in R are **vectorized**: they work on a vector of\narguments all at the same time.\n\n\n\n\n::: {.cell}\n\n```{.r .cell-code}\n# No loop needed!\n(1:10)^2\n```\n\n::: {.cell-output .cell-output-stdout}\n\n```\n [1]   1   4   9  16  25  36  49  64  81 100\n```\n\n\n:::\n:::\n\n::: {.cell}\n\n```{.r .cell-code}\n# Get the first 10 odd numbers, a common CS 101 loop problem on exams\n(1:20)[which((1:20 %% 2) == 1)]\n```\n\n::: 
{.cell-output .cell-output-stdout}\n\n```\n [1]  1  3  5  7  9 11 13 15 17 19\n```\n\n\n:::\n:::\n\n\n\n\n* So you should really try vectorization first, then use loops only when\nyou can't use vectorization.\n\n## Loop walkthrough\n\n* Let's walk through a complex but useful example where we can't use\nvectorization.\n* Load the cleaned measles dataset, and subset it so you only have MCV1 records.\n\n. . .\n\n\n\n\n::: {.cell}\n\n```{.r .cell-code}\nmeas <- readRDS(here::here(\"data\", \"measles_final.Rds\")) |>\n\tsubset(vaccine_antigen == \"MCV1\")\nstr(meas)\n```\n\n::: {.cell-output .cell-output-stdout}\n\n```\n'data.frame':\t7972 obs. of  7 variables:\n $ iso3c           : chr  \"AFG\" \"AFG\" \"AFG\" \"AFG\" ...\n $ time            : int  1980 1981 1982 1983 1984 1985 1986 1987 1988 1989 ...\n $ country         : chr  \"Afghanistan\" \"Afghanistan\" \"Afghanistan\" \"Afghanistan\" ...\n $ Cases           : int  2792 5166 2900 640 353 2012 1511 638 1154 492 ...\n $ vaccine_antigen : chr  \"MCV1\" \"MCV1\" \"MCV1\" \"MCV1\" ...\n $ vaccine_coverage: int  11 NA 8 9 14 14 14 31 34 22 ...\n $ total_pop       : chr  \"12486631\" \"11155195\" \"10088289\" \"9951449\" ...\n```\n\n\n:::\n:::\n\n\n\n\n## Loop walkthrough\n\n* First, make an empty `list`. This is where we'll store our results. Make it\nthe same length as the number of countries in the dataset.\n\n. . .\n\n\n\n\n::: {.cell}\n\n```{.r .cell-code}\nres <- vector(mode = \"list\", length = length(unique(meas$country)))\n```\n:::\n\n\n\n\n* This is called *preallocation* and it can make your loops much faster.\n\n## Loop walkthrough\n\n* Loop through every country in the dataset, and get the median, first and third\nquartiles, and range for each country. Store those summary statistics in a data frame.\n* What should the header look like?\n\n. . .\n\n\n\n\n\n\n::: {.cell}\n\n```{.r .cell-code}\ncountries <- unique(meas$country)\nfor (i in 1:length(countries)) {...}\n```\n:::\n\n\n\n\n. . 
.\n\n* Note that we use the **index** as the control variable. When you need to\ndo complex operations inside a loop, this is easier than the **for-each**\nconstruction we used earlier.\n\n## Loop walkthrough {.scrollable}\n\n* Now write out the body of the code. First we need to subset the data, to get\nonly the data for the current country.\n\n. . .\n\n\n\n\n::: {.cell}\n\n```{.r .cell-code}\nfor (i in 1:length(countries)) {\n\t# Get the data for the current country only\n\tcountry_data <- subset(meas, country == countries[i])\n}\n```\n:::\n\n\n\n\n. . .\n\n* Next we need to get the summary of the cases for that country.\n\n. . .\n\n\n\n\n::: {.cell}\n\n```{.r .cell-code}\nfor (i in 1:length(countries)) {\n\t# Get the data for the current country only\n\tcountry_data <- subset(meas, country == countries[i])\n\t\n\t# Get the summary statistics for this country\n\tcountry_cases <- country_data$Cases\n\tcountry_med <- median(country_cases, na.rm = TRUE)\n\tcountry_iqr <- IQR(country_cases, na.rm = TRUE)\n\tcountry_range <- range(country_cases, na.rm = TRUE)\n}\n```\n:::\n\n\n\n\n. . .\n\n* Next we save the summary statistics into a data frame.\n\n\n\n\n::: {.cell}\n\n```{.r .cell-code}\nfor (i in 1:length(countries)) {\n\t# Get the data for the current country only\n\tcountry_data <- subset(meas, country == countries[i])\n\t\n\t# Get the summary statistics for this country\n\tcountry_cases <- country_data$Cases\n\tcountry_quart <- quantile(\n\t\tcountry_cases, na.rm = TRUE, probs = c(0.25, 0.5, 0.75)\n\t)\n\tcountry_range <- range(country_cases, na.rm = TRUE)\n\t\n\t# Save the summary statistics into a data frame\n\tcountry_summary <- data.frame(\n\t\tcountry = countries[[i]],\n\t\tmin = country_range[[1]],\n\t\tQ1 = country_quart[[1]],\n\t\tmedian = country_quart[[2]],\n\t\tQ3 = country_quart[[3]],\n\t\tmax = country_range[[2]]\n\t)\n}\n```\n:::\n\n\n\n\n. . 
.\n\n* And finally, we save the data frame as the next element in our storage list.\n\n\n\n\n::: {.cell}\n\n```{.r .cell-code}\nfor (i in 1:length(countries)) {\n\t# Get the data for the current country only\n\tcountry_data <- subset(meas, country == countries[i])\n\t\n\t# Get the summary statistics for this country\n\tcountry_cases <- country_data$Cases\n\tcountry_quart <- quantile(\n\t\tcountry_cases, na.rm = TRUE, probs = c(0.25, 0.5, 0.75)\n\t)\n\tcountry_range <- range(country_cases, na.rm = TRUE)\n\t\n\t# Save the summary statistics into a data frame\n\tcountry_summary <- data.frame(\n\t\tcountry = countries[[i]],\n\t\tmin = country_range[[1]],\n\t\tQ1 = country_quart[[1]],\n\t\tmedian = country_quart[[2]],\n\t\tQ3 = country_quart[[3]],\n\t\tmax = country_range[[2]]\n\t)\n\t\n\t# Save the results to our container\n\tres[[i]] <- country_summary\n}\n```\n\n::: {.cell-output .cell-output-stderr}\n\n```\nWarning in min(x): no non-missing arguments to min; returning Inf\n```\n\n\n:::\n\n::: {.cell-output .cell-output-stderr}\n\n```\nWarning in max(x): no non-missing arguments to max; returning -Inf\n```\n\n\n:::\n\n::: {.cell-output .cell-output-stderr}\n\n```\nWarning in min(x): no non-missing arguments to min; returning Inf\n```\n\n\n:::\n\n::: {.cell-output .cell-output-stderr}\n\n```\nWarning in max(x): no non-missing arguments to max; returning -Inf\n```\n\n\n:::\n\n::: {.cell-output .cell-output-stderr}\n\n```\nWarning in min(x): no non-missing arguments to min; returning Inf\n```\n\n\n:::\n\n::: {.cell-output .cell-output-stderr}\n\n```\nWarning in max(x): no non-missing arguments to max; returning -Inf\n```\n\n\n:::\n:::\n\n\n\n\n. . 
.\n\n* Let's take a look at the results.\n\n\n\n\n::: {.cell}\n\n```{.r .cell-code}\nhead(res)\n```\n\n::: {.cell-output .cell-output-stdout}\n\n```\n[[1]]\n      country min   Q1 median   Q3   max\n1 Afghanistan 353 1154   2205 5166 31107\n\n[[2]]\n  country min  Q1 median    Q3   max\n1  Angola  29 700   3271 14474 30067\n\n[[3]]\n  country min Q1 median Q3    max\n1 Albania   0  1     12 29 136034\n\n[[4]]\n  country min Q1 median Q3 max\n1 Andorra   0  0      1  2   5\n\n[[5]]\n               country min    Q1 median   Q3  max\n1 United Arab Emirates  22 89.75    320 1128 2913\n\n[[6]]\n    country min Q1 median     Q3   max\n1 Argentina   0  0     17 4591.5 42093\n```\n\n\n:::\n:::\n\n\n\n\n* How do we deal with this to get it into a nice form?\n\n. . .\n\n* We can use a *vectorization* trick: the function `do.call()` seems like\nancient computer science magic. And it is. But it will actually help us a\nlot.\n\n\n\n\n::: {.cell}\n\n```{.r .cell-code}\nres_df <- do.call(rbind, res)\nhead(res_df)\n```\n\n::: {.cell-output-display}\n\n\n|country              | min|      Q1| median|      Q3|    max|\n|:--------------------|---:|-------:|------:|-------:|------:|\n|Afghanistan          | 353| 1154.00|   2205|  5166.0|  31107|\n|Angola               |  29|  700.00|   3271| 14474.0|  30067|\n|Albania              |   0|    1.00|     12|    29.0| 136034|\n|Andorra              |   0|    0.00|      1|     2.0|      5|\n|United Arab Emirates |  22|   89.75|    320|  1128.0|   2913|\n|Argentina            |   0|    0.00|     17|  4591.5|  42093|\n:::\n:::\n\n\n\n\n* It combined our data frames together! Let's take a look at the `rbind` and\n`do.call()` help packages to see what happened.\n\n. . 
.\n\n\n\n\n::: {.cell}\n\n```{.r .cell-code}\n?rbind\n```\n\n::: {.cell-output .cell-output-stdout}\n\n```\nCombine R Objects by Rows or Columns\n\nDescription:\n\n     Take a sequence of vector, matrix or data-frame arguments and\n     combine by _c_olumns or _r_ows, respectively.  These are generic\n     functions with methods for other R classes.\n\nUsage:\n\n     cbind(..., deparse.level = 1)\n     rbind(..., deparse.level = 1)\n     ## S3 method for class 'data.frame'\n     rbind(..., deparse.level = 1, make.row.names = TRUE,\n           stringsAsFactors = FALSE, factor.exclude = TRUE)\n     \nArguments:\n\n     ...: (generalized) vectors or matrices.  These can be given as\n          named arguments.  Other R objects may be coerced as\n          appropriate, or S4 methods may be used: see sections\n          'Details' and 'Value'.  (For the '\"data.frame\"' method of\n          'cbind' these can be further arguments to 'data.frame' such\n          as 'stringsAsFactors'.)\n\ndeparse.level: integer controlling the construction of labels in the\n          case of non-matrix-like arguments (for the default method):\n          'deparse.level = 0' constructs no labels;\n          the default 'deparse.level = 1' typically and 'deparse.level\n          = 2' always construct labels from the argument names, see the\n          'Value' section below.\n\nmake.row.names: (only for data frame method:) logical indicating if\n          unique and valid 'row.names' should be constructed from the\n          arguments.\n\nstringsAsFactors: logical, passed to 'as.data.frame'; only has an\n          effect when the '...' arguments contain a (non-'data.frame')\n          'character'.\n\nfactor.exclude: if the data frames contain factors, the default 'TRUE'\n          ensures that 'NA' levels of factors are kept, see PR#17562\n          and the 'Data frame methods'.  
In R versions up to 3.6.x,\n          'factor.exclude = NA' has been implicitly hardcoded (R <=\n          3.6.0) or the default (R = 3.6.x, x >= 1).\n\nDetails:\n\n     The functions 'cbind' and 'rbind' are S3 generic, with methods for\n     data frames.  The data frame method will be used if at least one\n     argument is a data frame and the rest are vectors or matrices.\n     There can be other methods; in particular, there is one for time\n     series objects.  See the section on 'Dispatch' for how the method\n     to be used is selected.  If some of the arguments are of an S4\n     class, i.e., 'isS4(.)' is true, S4 methods are sought also, and\n     the hidden 'cbind' / 'rbind' functions from package 'methods'\n     maybe called, which in turn build on 'cbind2' or 'rbind2',\n     respectively.  In that case, 'deparse.level' is obeyed, similarly\n     to the default method.\n\n     In the default method, all the vectors/matrices must be atomic\n     (see 'vector') or lists.  Expressions are not allowed.  Language\n     objects (such as formulae and calls) and pairlists will be coerced\n     to lists: other objects (such as names and external pointers) will\n     be included as elements in a list result.  Any classes the inputs\n     might have are discarded (in particular, factors are replaced by\n     their internal codes).\n\n     If there are several matrix arguments, they must all have the same\n     number of columns (or rows) and this will be the number of columns\n     (or rows) of the result.  If all the arguments are vectors, the\n     number of columns (rows) in the result is equal to the length of\n     the longest vector.  Values in shorter arguments are recycled to\n     achieve this length (with a 'warning' if they are recycled only\n     _fractionally_).\n\n     When the arguments consist of a mix of matrices and vectors the\n     number of columns (rows) of the result is determined by the number\n     of columns (rows) of the matrix arguments. 
 Any vectors have their\n     values recycled or subsetted to achieve this length.\n\n     For 'cbind' ('rbind'), vectors of zero length (including 'NULL')\n     are ignored unless the result would have zero rows (columns), for\n     S compatibility.  (Zero-extent matrices do not occur in S3 and are\n     not ignored in R.)\n\n     Matrices are restricted to less than 2^31 rows and columns even on\n     64-bit systems.  So input vectors have the same length\n     restriction: as from R 3.2.0 input matrices with more elements\n     (but meeting the row and column restrictions) are allowed.\n\nValue:\n\n     For the default method, a matrix combining the '...' arguments\n     column-wise or row-wise.  (Exception: if there are no inputs or\n     all the inputs are 'NULL', the value is 'NULL'.)\n\n     The type of a matrix result determined from the highest type of\n     any of the inputs in the hierarchy raw < logical < integer <\n     double < complex < character < list .\n\n     For 'cbind' ('rbind') the column (row) names are taken from the\n     'colnames' ('rownames') of the arguments if these are matrix-like.\n     Otherwise from the names of the arguments or where those are not\n     supplied and 'deparse.level > 0', by deparsing the expressions\n     given, for 'deparse.level = 1' only if that gives a sensible name\n     (a 'symbol', see 'is.symbol').\n\n     For 'cbind' row names are taken from the first argument with\n     appropriate names: rownames for a matrix, or names for a vector of\n     length the number of rows of the result.\n\n     For 'rbind' column names are taken from the first argument with\n     appropriate names: colnames for a matrix, or names for a vector of\n     length the number of columns of the result.\n\nData frame methods:\n\n     The 'cbind' data frame method is just a wrapper for\n     'data.frame(..., check.names = FALSE)'.  
This means that it will\n     split matrix columns in data frame arguments, and convert\n     character columns to factors unless 'stringsAsFactors = FALSE' is\n     specified.\n\n     The 'rbind' data frame method first drops all zero-column and\n     zero-row arguments.  (If that leaves none, it returns the first\n     argument with columns otherwise a zero-column zero-row data\n     frame.)  It then takes the classes of the columns from the first\n     data frame, and matches columns by name (rather than by position).\n     Factors have their levels expanded as necessary (in the order of\n     the levels of the level sets of the factors encountered) and the\n     result is an ordered factor if and only if all the components were\n     ordered factors.  Old-style categories (integer vectors with\n     levels) are promoted to factors.\n\n     Note that for result column 'j', 'factor(., exclude = X(j))' is\n     applied, where\n\n       X(j) := if(isTRUE(factor.exclude)) {\n                  if(!NA.lev[j]) NA # else NULL\n               } else factor.exclude\n     \n     where 'NA.lev[j]' is true iff any contributing data frame has had\n     a 'factor' in column 'j' with an explicit 'NA' level.\n\nDispatch:\n\n     The method dispatching is _not_ done via 'UseMethod()', but by\n     C-internal dispatching.  Therefore there is no need for, e.g.,\n     'rbind.default'.\n\n     The dispatch algorithm is described in the source file\n     ('.../src/main/bind.c') as\n\n       1. For each argument we get the list of possible class\n          memberships from the class attribute.\n\n       2. We inspect each class in turn to see if there is an\n          applicable method.\n\n       3. If we find a method, we use it.  
Otherwise, if there was an\n          S4 object among the arguments, we try S4 dispatch; otherwise,\n          we use the default code.\n\n     If you want to combine other objects with data frames, it may be\n     necessary to coerce them to data frames first.  (Note that this\n     algorithm can result in calling the data frame method if all the\n     arguments are either data frames or vectors, and this will result\n     in the coercion of character vectors to factors.)\n\nReferences:\n\n     Becker, R. A., Chambers, J. M. and Wilks, A. R. (1988) _The New S\n     Language_.  Wadsworth & Brooks/Cole.\n\nSee Also:\n\n     'c' to combine vectors (and lists) as vectors, 'data.frame' to\n     combine vectors and matrices as a data frame.\n\nExamples:\n\n     m <- cbind(1, 1:7) # the '1' (= shorter vector) is recycled\n     m\n     m <- cbind(m, 8:14)[, c(1, 3, 2)] # insert a column\n     m\n     cbind(1:7, diag(3)) # vector is subset -> warning\n     \n     cbind(0, rbind(1, 1:3))\n     cbind(I = 0, X = rbind(a = 1, b = 1:3))  # use some names\n     xx <- data.frame(I = rep(0,2))\n     cbind(xx, X = rbind(a = 1, b = 1:3))   # named differently\n     \n     cbind(0, matrix(1, nrow = 0, ncol = 4)) #> Warning (making sense)\n     dim(cbind(0, matrix(1, nrow = 2, ncol = 0))) #-> 2 x 1\n     \n     ## deparse.level\n     dd <- 10\n     rbind(1:4, c = 2, \"a++\" = 10, dd, deparse.level = 0) # middle 2 rownames\n     rbind(1:4, c = 2, \"a++\" = 10, dd, deparse.level = 1) # 3 rownames (default)\n     rbind(1:4, c = 2, \"a++\" = 10, dd, deparse.level = 2) # 4 rownames\n     \n     ## cheap row names:\n     b0 <- gl(3,4, labels=letters[1:3])\n     bf <- setNames(b0, paste0(\"o\", seq_along(b0)))\n     df  <- data.frame(a = 1, B = b0, f = gl(4,3))\n     df. 
<- data.frame(a = 1, B = bf, f = gl(4,3))\n     new <- data.frame(a = 8, B =\"B\", f = \"1\")\n     (df1  <- rbind(df , new))\n     (df.1 <- rbind(df., new))\n     stopifnot(identical(df1, rbind(df,  new, make.row.names=FALSE)),\n               identical(df1, rbind(df., new, make.row.names=FALSE)))\n```\n\n\n:::\n:::\n\n\n\n\n. . .\n\n\n\n\n::: {.cell}\n\n```{.r .cell-code}\n?do.call\n```\n\n::: {.cell-output .cell-output-stdout}\n\n```\nExecute a Function Call\n\nDescription:\n\n     'do.call' constructs and executes a function call from a name or a\n     function and a list of arguments to be passed to it.\n\nUsage:\n\n     do.call(what, args, quote = FALSE, envir = parent.frame())\n     \nArguments:\n\n    what: either a function or a non-empty character string naming the\n          function to be called.\n\n    args: a _list_ of arguments to the function call.  The 'names'\n          attribute of 'args' gives the argument names.\n\n   quote: a logical value indicating whether to quote the arguments.\n\n   envir: an environment within which to evaluate the call.  This will\n          be most useful if 'what' is a character string and the\n          arguments are symbols or quoted expressions.\n\nDetails:\n\n     If 'quote' is 'FALSE', the default, then the arguments are\n     evaluated (in the calling environment, not in 'envir').  If\n     'quote' is 'TRUE' then each argument is quoted (see 'quote') so\n     that the effect of argument evaluation is to remove the quotes -\n     leaving the original arguments unevaluated when the call is\n     constructed.\n\n     The behavior of some functions, such as 'substitute', will not be\n     the same for functions evaluated using 'do.call' as if they were\n     evaluated from the interpreter.  
The precise semantics are\n     currently undefined and subject to change.\n\nValue:\n\n     The result of the (evaluated) function call.\n\nWarning:\n\n     This should not be used to attempt to evade restrictions on the\n     use of '.Internal' and other non-API calls.\n\nReferences:\n\n     Becker, R. A., Chambers, J. M. and Wilks, A. R. (1988) _The New S\n     Language_.  Wadsworth & Brooks/Cole.\n\nSee Also:\n\n     'call' which creates an unevaluated call.\n\nExamples:\n\n     do.call(\"complex\", list(imaginary = 1:3))\n     \n     ## if we already have a list (e.g., a data frame)\n     ## we need c() to add further arguments\n     tmp <- expand.grid(letters[1:2], 1:3, c(\"+\", \"-\"))\n     do.call(\"paste\", c(tmp, sep = \"\"))\n     \n     do.call(paste, list(as.name(\"A\"), as.name(\"B\")), quote = TRUE)\n     \n     ## examples of where objects will be found.\n     A <- 2\n     f <- function(x) print(x^2)\n     env <- new.env()\n     assign(\"A\", 10, envir = env)\n     assign(\"f\", f, envir = env)\n     f <- function(x) print(x)\n     f(A)                                      # 2\n     do.call(\"f\", list(A))                     # 2\n     do.call(\"f\", list(A), envir = env)        # 4\n     do.call( f,  list(A), envir = env)        # 2\n     do.call(\"f\", list(quote(A)), envir = env) # 100\n     do.call( f,  list(quote(A)), envir = env) # 10\n     do.call(\"f\", list(as.name(\"A\")), envir = env) # 100\n     \n     eval(call(\"f\", A))                      # 2\n     eval(call(\"f\", quote(A)))               # 2\n     eval(call(\"f\", A), envir = env)         # 4\n     eval(call(\"f\", quote(A)), envir = env)  # 100\n```\n\n\n:::\n:::\n\n\n\n\n. . 
.\n\n* OK, so basically what happened is that\n\n\n\n\n::: {.cell}\n\n```{.r .cell-code}\ndo.call(rbind, list)\n```\n:::\n\n\n\n\n* Gets transformed into\n\n\n\n\n::: {.cell}\n\n```{.r .cell-code}\nrbind(list[[1]], list[[2]], list[[3]], ..., list[[length(list)]])\n```\n:::\n\n\n\n\n* That's vectorization magic!\n\n## You try it! (if we have time) {.smaller}\n\n* Use the code you wrote before to get the incidence per 1000 people on the\nentire measles data set (add a column for incidence to the full data).\n* Use the code `plot(NULL, NULL, ...)` to make a blank plot. You will need to\nset the `xlim` and `ylim` arguments to sensible values, and specify the axis\ntitles as \"Year\" and \"Incidence per 1000 people\".\n* Using a `for` loop and the `lines()` function, make a plot that shows all of\nthe incidence curves over time, overlapping on the plot.\n* HINT: use `col = adjustcolor(\"black\", alpha.f = 0.25)` to make the curves\ntransparent, so you can see the others.\n* BONUS PROBLEM: using the function `cumsum()`, make a plot of the cumulative\nincidence per 1000 people over time for all of the countries. 
(Dealing with\nthe NA's here is tricky!!)\n\n## Main problem solution\n\n\n\n\n::: {.cell}\n\n```{.r .cell-code}\nmeas$cases_per_thousand <- meas$Cases / as.numeric(meas$total_pop) * 1000\ncountries <- unique(meas$country)\n\nplot(\n\tNULL, NULL,\n\txlim = c(1980, 2022),\n\tylim = c(0, 50),\n\txlab = \"Year\",\n\tylab = \"Incidence per 1000 people\"\n)\n\nfor (i in 1:length(countries)) {\n\tcountry_data <- subset(meas, country == countries[[i]])\n\tlines(\n\t\tx = country_data$time,\n\t\ty = country_data$cases_per_thousand,\n\t\tcol = adjustcolor(\"black\", alpha.f = 0.25)\n\t)\n}\n```\n:::\n\n\n\n\n## Main problem solution\n\n\n\n\n::: {.cell}\n::: {.cell-output-display}\n![](ModuleXX-Iteration_files/figure-revealjs/unnamed-chunk-30-1.png){width=960}\n:::\n:::\n\n\n\n\n## Bonus problem solution\n\n\n\n\n::: {.cell}\n\n```{.r .cell-code}\n# First calculate the cumulative cases, treating NA as zeroes\ncumulative_cases <- ave(\n\tx = ifelse(is.na(meas$Cases), 0, meas$Cases),\n\tmeas$country,\n\tFUN = cumsum\n)\n\n# Now put the NAs back where they should be\nmeas$cumulative_cases <- cumulative_cases + (meas$Cases * 0)\n\nplot(\n\tNULL, NULL,\n\txlim = c(1980, 2022),\n\tylim = c(1, 6.2e6),\n\txlab = \"Year\",\n\tylab = \"Cumulative cases\"\n)\n\nfor (i in 1:length(countries)) {\n\tcountry_data <- subset(meas, country == countries[[i]])\n\tlines(\n\t\tx = country_data$time,\n\t\ty = country_data$cumulative_cases,\n\t\tcol = adjustcolor(\"black\", alpha.f = 0.25)\n\t)\n}\n\ntext(\n\tx = 2020,\n\ty = 6e6,\n\tlabels = \"China →\"\n)\n```\n:::\n\n\n\n\n## Bonus problem solution\n\n\n\n\n::: {.cell}\n::: {.cell-output-display}\n![](ModuleXX-Iteration_files/figure-revealjs/unnamed-chunk-32-1.png){width=960}\n:::\n:::\n\n\n\n\n## More practice on your own {.smaller}\n\n* Merge the `countries-regions.csv` data with the `measles_final.Rds` data.\nReshape the measles data so that `MCV1` and `MCV2` vaccine coverage are two\nseparate columns. 
Then use a loop to fit a Poisson regression model for each\ncontinent where `Cases` is the outcome, and `MCV1 coverage` and `MCV2 coverage`\nare the predictors. Discuss your findings, and try adding an interaction term.\n* Assess the impact of `age_months` as a confounder in the diphtheria serology\ndata. First, write code to transform `age_months` into age ranges for each\nyear. Then, using a loop, calculate the crude odds ratio for the effect of\nvaccination on infection for each of the age ranges. How does the odds ratio\nchange as age increases? Can you formalize this analysis by fitting a logistic\nregression model with `age_months` and vaccination as predictors?\n\n\n",
+    "markdown": "---\ntitle: \"Iteration in R\"\nformat:\n  revealjs:\n    toc: false\n---\n\n\n\n\n\n## Learning goals\n\n1. Replace repetitive code with a `for` loop\n1. Use vectorization to replace unnecessary loops\n\n## What is iteration?\n\n* Whenever you repeat something, that's iteration.\n* In `R`, this means running the same code multiple times in a row.\n\n\n::: {.cell}\n\n```{.r .cell-code}\ndata(\"penguins\", package = \"palmerpenguins\")\nfor (this_island in levels(penguins$island)) {\n\tisland_mean <-\n\t\tpenguins$bill_depth_mm[penguins$island == this_island] |>\n\t\tmean(na.rm = TRUE) |>\n\t\tround(digits = 2)\n\t\n\tcat(paste(\"The mean bill depth on\", this_island, \"Island was\", island_mean,\n\t\t\t\t\t\t\t\"mm.\\n\"))\n}\n```\n\n::: {.cell-output .cell-output-stdout}\n```\nThe mean bill depth on Biscoe Island was 15.87 mm.\nThe mean bill depth on Dream Island was 18.34 mm.\nThe mean bill depth on Torgersen Island was 18.43 mm.\n```\n:::\n:::\n\n\n## Parts of a loop\n\n\n::: {.cell}\n\n```{.r .cell-code  code-line-numbers=\"1,9\"}\nfor (this_island in levels(penguins$island)) {\n\tisland_mean <-\n\t\tpenguins$bill_depth_mm[penguins$island == this_island] |>\n\t\tmean(na.rm = TRUE) |>\n\t\tround(digits = 2)\n\t\n\tcat(paste(\"The mean bill depth on\", this_island, \"Island was\", island_mean,\n\t\t\t\t\t\t\t\"mm.\\n\"))\n}\n```\n:::\n\n\nThe **header** declares how many times we will repeat the same code. 
The header\ncontains a **control variable** that changes in each repetition and a\n**sequence** of values for the control variable to take.\n\n## Parts of a loop\n\n\n::: {.cell}\n\n```{.r .cell-code  code-line-numbers=\"2-8\"}\nfor (this_island in levels(penguins$island)) {\n\tisland_mean <-\n\t\tpenguins$bill_depth_mm[penguins$island == this_island] |>\n\t\tmean(na.rm = TRUE) |>\n\t\tround(digits = 2)\n\t\n\tcat(paste(\"The mean bill depth on\", this_island, \"Island was\", island_mean,\n\t\t\t\t\t\t\t\"mm.\\n\"))\n}\n```\n:::\n\n\nThe **body** of the loop contains code that will be repeated a number of times\nbased on the header instructions. In `R`, the body has to be surrounded by\ncurly braces.\n\n## Header parts\n\n\n::: {.cell}\n\n```{.r .cell-code}\nfor (this_island in levels(penguins$island)) {...}\n```\n:::\n\n\n* `for`: keyword that declares we are doing a for loop.\n* `(...)`: parentheses after `for` declare the control variable and sequence.\n* `this_island`: the control variable.\n* `in`: keyword that separates the control variable and sequence.\n* `levels(penguins$island)`: the sequence.\n* `{}`: curly braces will contain the body code.\n\n## Header parts\n\n\n::: {.cell}\n\n```{.r .cell-code}\nfor (this_island in levels(penguins$island)) {...}\n```\n:::\n\n\n* Since `levels(penguins$island)` evaluates to\n`c(\"Biscoe\", \"Dream\", \"Torgersen\")`, our loop will repeat 3 times.\n\n| Iteration | `this_island` |\n|-----------|---------------|\n| 1         | \"Biscoe\"      |\n| 2         | \"Dream\"       |\n| 3         | \"Torgersen\"   |\n\n* Everything inside of `{...}` will be repeated three times.\n\n## Loop iteration 1\n\n\n::: {.cell}\n\n```{.r .cell-code}\nisland_mean <-\n\tpenguins$bill_depth_mm[penguins$island == \"Biscoe\"] |>\n\tmean(na.rm = TRUE) |>\n\tround(digits = 2)\n\ncat(paste(\"The mean bill depth on\", \"Biscoe\", \"Island was\", island_mean,\n\t\t\t\t\t\"mm.\\n\"))\n```\n\n::: {.cell-output .cell-output-stdout}\n```\nThe mean 
bill depth on Biscoe Island was 15.87 mm.\n```\n:::\n:::\n\n\n## Loop iteration 2\n\n\n::: {.cell}\n\n```{.r .cell-code}\nisland_mean <-\n\tpenguins$bill_depth_mm[penguins$island == \"Dream\"] |>\n\tmean(na.rm = TRUE) |>\n\tround(digits = 2)\n\ncat(paste(\"The mean bill depth on\", \"Dream\", \"Island was\", island_mean,\n\t\t\t\t\t\"mm.\\n\"))\n```\n\n::: {.cell-output .cell-output-stdout}\n```\nThe mean bill depth on Dream Island was 18.34 mm.\n```\n:::\n:::\n\n\n## Loop iteration 3\n\n\n::: {.cell}\n\n```{.r .cell-code}\nisland_mean <-\n\tpenguins$bill_depth_mm[penguins$island == \"Torgersen\"] |>\n\tmean(na.rm = TRUE) |>\n\tround(digits = 2)\n\ncat(paste(\"The mean bill depth on\", \"Torgersen\", \"Island was\", island_mean,\n\t\t\t\t\t\"mm.\\n\"))\n```\n\n::: {.cell-output .cell-output-stdout}\n```\nThe mean bill depth on Torgersen Island was 18.43 mm.\n```\n:::\n:::\n\n\n## The loop structure automates this process for us so we don't have to copy and paste our code!\n\n\n::: {.cell}\n\n```{.r .cell-code}\nfor (this_island in levels(penguins$island)) {\n\tisland_mean <-\n\t\tpenguins$bill_depth_mm[penguins$island == this_island] |>\n\t\tmean(na.rm = TRUE) |>\n\t\tround(digits = 2)\n\t\n\tcat(paste(\"The mean bill depth on\", this_island, \"Island was\", island_mean,\n\t\t\t\t\t\t\t\"mm.\\n\"))\n}\n```\n\n::: {.cell-output .cell-output-stdout}\n```\nThe mean bill depth on Biscoe Island was 15.87 mm.\nThe mean bill depth on Dream Island was 18.34 mm.\nThe mean bill depth on Torgersen Island was 18.43 mm.\n```\n:::\n:::\n\n\n## Remember: write DRY code!\n\n* DRY = \"Don't Repeat Yourself\"\n* Instead of copying and pasting, write loops and functions.\n* Easier to debug and change in the future!\n\n. . .\n\n* Of course, we all copy and paste code sometimes. 
If you are running on a\ntight deadline or can't get a loop or function to work, you might need to.\n**DRY code is good, but working code is best!**\n\n## {#tweet-slide data-menu-title=\"Hadley tweet\" .center}\n\n\n::: {.cell}\n::: {.cell-output-display}\n![](../images/hadley-tweet.PNG)\n:::\n:::\n\n\n## You try it!\n\nWrite a loop that goes from 1 to 10, squares each of the numbers, and prints\nthe squared number.\n\n. . .\n\n\n::: {.cell}\n\n```{.r .cell-code}\nfor (i in 1:10) {\n\tcat(i ^ 2, \"\\n\")\n}\n```\n\n::: {.cell-output .cell-output-stdout}\n```\n1 \n4 \n9 \n16 \n25 \n36 \n49 \n64 \n81 \n100 \n```\n:::\n:::\n\n\n## Wait, did we need to do that? {.incremental}\n\n* Well, yes, because you need to practice loops!\n* But technically no, because we can use **vectorization**.\n* Almost all basic operations in R are **vectorized**: they work on a vector of\narguments all at the same time.\n\n## Wait, did we need to do that?\n\n* Well, yes, because you need to practice loops!\n* But technically no, because we can use **vectorization**.\n* Almost all basic operations in R are **vectorized**: they work on a vector of\narguments all at the same time.\n\n\n::: {.cell}\n\n```{.r .cell-code}\n# No loop needed!\n(1:10)^2\n```\n\n::: {.cell-output .cell-output-stdout}\n```\n [1]   1   4   9  16  25  36  49  64  81 100\n```\n:::\n:::\n\n\n## Wait, did we need to do that?\n\n* Well, yes, because you need to practice loops!\n* But technically no, because we can use **vectorization**.\n* Almost all basic operations in R are **vectorized**: they work on a vector of\narguments all at the same time.\n\n\n::: {.cell}\n\n```{.r .cell-code}\n# No loop needed!\n(1:10)^2\n```\n\n::: {.cell-output .cell-output-stdout}\n```\n [1]   1   4   9  16  25  36  49  64  81 100\n```\n:::\n:::\n\n::: {.cell}\n\n```{.r .cell-code}\n# Get the first 10 odd numbers, a common CS 101 loop problem on exams\n(1:20)[which((1:20 %% 2) == 1)]\n```\n\n::: {.cell-output .cell-output-stdout}\n```\n [1]  1 
 3  5  7  9 11 13 15 17 19\n```\n:::\n:::\n\n\n* So you should really try vectorization first, then use loops only when\nyou can't use vectorization.\n\n## Loop walkthrough\n\n* Let's walk through a complex but useful example where we can't use\nvectorization.\n* Load the cleaned measles dataset, and subset it so you only have MCV1 records.\n\n. . .\n\n\n::: {.cell}\n\n```{.r .cell-code}\nmeas <- readRDS(here::here(\"data\", \"measles_final.Rds\")) |>\n\tsubset(vaccine_antigen == \"MCV1\")\nstr(meas)\n```\n\n::: {.cell-output .cell-output-stdout}\n```\n'data.frame':\t7972 obs. of  7 variables:\n $ iso3c           : chr  \"AFG\" \"AFG\" \"AFG\" \"AFG\" ...\n $ time            : int  1980 1981 1982 1983 1984 1985 1986 1987 1988 1989 ...\n $ country         : chr  \"Afghanistan\" \"Afghanistan\" \"Afghanistan\" \"Afghanistan\" ...\n $ Cases           : int  2792 5166 2900 640 353 2012 1511 638 1154 492 ...\n $ vaccine_antigen : chr  \"MCV1\" \"MCV1\" \"MCV1\" \"MCV1\" ...\n $ vaccine_coverage: int  11 NA 8 9 14 14 14 31 34 22 ...\n $ total_pop       : chr  \"12486631\" \"11155195\" \"10088289\" \"9951449\" ...\n```\n:::\n:::\n\n\n## Loop walkthrough\n\n* First, make an empty `list`. This is where we'll store our results. Make it\nthe same length as the number of countries in the dataset.\n\n. . .\n\n\n::: {.cell}\n\n```{.r .cell-code}\nres <- vector(mode = \"list\", length = length(unique(meas$country)))\n```\n:::\n\n\n* This is called *preallocation* and it can make your loops much faster.\n\n## Loop walkthrough\n\n* Loop through every country in the dataset, and get the median, first and third\nquartiles, and range for each country. Store those summary statistics in a data frame.\n* What should the header look like?\n\n. . .\n\n\n\n\n::: {.cell}\n\n```{.r .cell-code}\ncountries <- unique(meas$country)\nfor (i in 1:length(countries)) {...}\n```\n:::\n\n\n. . .\n\n* Note that we use the **index** as the control variable. 
When you need to\ndo complex operations inside a loop, this is easier than the **for-each**\nconstruction we used earlier.\n\n## Loop walkthrough {.scrollable}\n\n* Now write out the body of the code. First we need to subset the data, to get\nonly the data for the current country.\n\n. . .\n\n\n::: {.cell}\n\n```{.r .cell-code}\nfor (i in 1:length(countries)) {\n\t# Get the data for the current country only\n\tcountry_data <- subset(meas, country == countries[i])\n}\n```\n:::\n\n\n. . .\n\n* Next we need to get the summary of the cases for that country.\n\n. . .\n\n\n::: {.cell}\n\n```{.r .cell-code}\nfor (i in 1:length(countries)) {\n\t# Get the data for the current country only\n\tcountry_data <- subset(meas, country == countries[i])\n\t\n\t# Get the summary statistics for this country\n\tcountry_cases <- country_data$Cases\n\tcountry_med <- median(country_cases, na.rm = TRUE)\n\tcountry_iqr <- IQR(country_cases, na.rm = TRUE)\n\tcountry_range <- range(country_cases, na.rm = TRUE)\n}\n```\n:::\n\n\n. . .\n\n* Next we save the summary statistics into a data frame.\n\n\n::: {.cell}\n\n```{.r .cell-code}\nfor (i in 1:length(countries)) {\n\t# Get the data for the current country only\n\tcountry_data <- subset(meas, country == countries[i])\n\t\n\t# Get the summary statistics for this country\n\tcountry_cases <- country_data$Cases\n\tcountry_quart <- quantile(\n\t\tcountry_cases, na.rm = TRUE, probs = c(0.25, 0.5, 0.75)\n\t)\n\tcountry_range <- range(country_cases, na.rm = TRUE)\n\t\n\t# Save the summary statistics into a data frame\n\tcountry_summary <- data.frame(\n\t\tcountry = countries[[i]],\n\t\tmin = country_range[[1]],\n\t\tQ1 = country_quart[[1]],\n\t\tmedian = country_quart[[2]],\n\t\tQ3 = country_quart[[3]],\n\t\tmax = country_range[[2]]\n\t)\n}\n```\n:::\n\n\n. . 
.\n\n* And finally, we save the data frame as the next element in our storage list.\n\n\n::: {.cell}\n\n```{.r .cell-code}\nfor (i in 1:length(countries)) {\n\t# Get the data for the current country only\n\tcountry_data <- subset(meas, country == countries[i])\n\t\n\t# Get the summary statistics for this country\n\tcountry_cases <- country_data$Cases\n\tcountry_quart <- quantile(\n\t\tcountry_cases, na.rm = TRUE, probs = c(0.25, 0.5, 0.75)\n\t)\n\tcountry_range <- range(country_cases, na.rm = TRUE)\n\t\n\t# Save the summary statistics into a data frame\n\tcountry_summary <- data.frame(\n\t\tcountry = countries[[i]],\n\t\tmin = country_range[[1]],\n\t\tQ1 = country_quart[[1]],\n\t\tmedian = country_quart[[2]],\n\t\tQ3 = country_quart[[3]],\n\t\tmax = country_range[[2]]\n\t)\n\t\n\t# Save the results to our container\n\tres[[i]] <- country_summary\n}\n```\n\n::: {.cell-output .cell-output-stderr}\n```\nWarning in min(x): no non-missing arguments to min; returning Inf\n```\n:::\n\n::: {.cell-output .cell-output-stderr}\n```\nWarning in max(x): no non-missing arguments to max; returning -Inf\n```\n:::\n\n::: {.cell-output .cell-output-stderr}\n```\nWarning in min(x): no non-missing arguments to min; returning Inf\n```\n:::\n\n::: {.cell-output .cell-output-stderr}\n```\nWarning in max(x): no non-missing arguments to max; returning -Inf\n```\n:::\n\n::: {.cell-output .cell-output-stderr}\n```\nWarning in min(x): no non-missing arguments to min; returning Inf\n```\n:::\n\n::: {.cell-output .cell-output-stderr}\n```\nWarning in max(x): no non-missing arguments to max; returning -Inf\n```\n:::\n:::\n\n\n. . 
.\n\n* Let's take a look at the results.\n\n\n::: {.cell}\n\n```{.r .cell-code}\nhead(res)\n```\n\n::: {.cell-output .cell-output-stdout}\n```\n[[1]]\n      country min   Q1 median   Q3   max\n1 Afghanistan 353 1154   2205 5166 31107\n\n[[2]]\n  country min  Q1 median    Q3   max\n1  Angola  29 700   3271 14474 30067\n\n[[3]]\n  country min Q1 median Q3    max\n1 Albania   0  1     12 29 136034\n\n[[4]]\n  country min Q1 median Q3 max\n1 Andorra   0  0      1  2   5\n\n[[5]]\n               country min    Q1 median   Q3  max\n1 United Arab Emirates  22 89.75    320 1128 2913\n\n[[6]]\n    country min Q1 median     Q3   max\n1 Argentina   0  0     17 4591.5 42093\n```\n:::\n:::\n\n\n* How do we deal with this to get it into a nice form?\n\n. . .\n\n* We can use a *vectorization* trick: the function `do.call()` seems like\nancient computer science magic. And it is. But it will actually help us a\nlot.\n\n\n::: {.cell}\n\n```{.r .cell-code}\nres_df <- do.call(rbind, res)\nhead(res_df)\n```\n\n::: {.cell-output-display}\n|country              | min|      Q1| median|      Q3|    max|\n|:--------------------|---:|-------:|------:|-------:|------:|\n|Afghanistan          | 353| 1154.00|   2205|  5166.0|  31107|\n|Angola               |  29|  700.00|   3271| 14474.0|  30067|\n|Albania              |   0|    1.00|     12|    29.0| 136034|\n|Andorra              |   0|    0.00|      1|     2.0|      5|\n|United Arab Emirates |  22|   89.75|    320|  1128.0|   2913|\n|Argentina            |   0|    0.00|     17|  4591.5|  42093|\n:::\n:::\n\n\n* It combined our data frames together! Let's take a look at the `rbind()` and\n`do.call()` help pages to see what happened.\n\n. . .\n\n\n::: {.cell}\n\n```{.r .cell-code}\n?rbind\n```\n\n::: {.cell-output .cell-output-stdout}\n```\nCombine R Objects by Rows or Columns\n\nDescription:\n\n     Take a sequence of vector, matrix or data-frame arguments and\n     combine by _c_olumns or _r_ows, respectively.  
These are generic\n     functions with methods for other R classes.\n\nUsage:\n\n     cbind(..., deparse.level = 1)\n     rbind(..., deparse.level = 1)\n     ## S3 method for class 'data.frame'\n     rbind(..., deparse.level = 1, make.row.names = TRUE,\n           stringsAsFactors = FALSE, factor.exclude = TRUE)\n     \nArguments:\n\n     ...: (generalized) vectors or matrices.  These can be given as\n          named arguments.  Other R objects may be coerced as\n          appropriate, or S4 methods may be used: see sections\n          'Details' and 'Value'.  (For the '\"data.frame\"' method of\n          'cbind' these can be further arguments to 'data.frame' such\n          as 'stringsAsFactors'.)\n\ndeparse.level: integer controlling the construction of labels in the\n          case of non-matrix-like arguments (for the default method):\n          'deparse.level = 0' constructs no labels;\n          the default 'deparse.level = 1' typically and 'deparse.level\n          = 2' always construct labels from the argument names, see the\n          'Value' section below.\n\nmake.row.names: (only for data frame method:) logical indicating if\n          unique and valid 'row.names' should be constructed from the\n          arguments.\n\nstringsAsFactors: logical, passed to 'as.data.frame'; only has an\n          effect when the '...' arguments contain a (non-'data.frame')\n          'character'.\n\nfactor.exclude: if the data frames contain factors, the default 'TRUE'\n          ensures that 'NA' levels of factors are kept, see PR#17562\n          and the 'Data frame methods'.  In R versions up to 3.6.x,\n          'factor.exclude = NA' has been implicitly hardcoded (R <=\n          3.6.0) or the default (R = 3.6.x, x >= 1).\n\nDetails:\n\n     The functions 'cbind' and 'rbind' are S3 generic, with methods for\n     data frames.  
The data frame method will be used if at least one\n     argument is a data frame and the rest are vectors or matrices.\n     There can be other methods; in particular, there is one for time\n     series objects.  See the section on 'Dispatch' for how the method\n     to be used is selected.  If some of the arguments are of an S4\n     class, i.e., 'isS4(.)' is true, S4 methods are sought also, and\n     the hidden 'cbind' / 'rbind' functions from package 'methods'\n     maybe called, which in turn build on 'cbind2' or 'rbind2',\n     respectively.  In that case, 'deparse.level' is obeyed, similarly\n     to the default method.\n\n     In the default method, all the vectors/matrices must be atomic\n     (see 'vector') or lists.  Expressions are not allowed.  Language\n     objects (such as formulae and calls) and pairlists will be coerced\n     to lists: other objects (such as names and external pointers) will\n     be included as elements in a list result.  Any classes the inputs\n     might have are discarded (in particular, factors are replaced by\n     their internal codes).\n\n     If there are several matrix arguments, they must all have the same\n     number of columns (or rows) and this will be the number of columns\n     (or rows) of the result.  If all the arguments are vectors, the\n     number of columns (rows) in the result is equal to the length of\n     the longest vector.  Values in shorter arguments are recycled to\n     achieve this length (with a 'warning' if they are recycled only\n     _fractionally_).\n\n     When the arguments consist of a mix of matrices and vectors the\n     number of columns (rows) of the result is determined by the number\n     of columns (rows) of the matrix arguments.  Any vectors have their\n     values recycled or subsetted to achieve this length.\n\n     For 'cbind' ('rbind'), vectors of zero length (including 'NULL')\n     are ignored unless the result would have zero rows (columns), for\n     S compatibility.  
(Zero-extent matrices do not occur in S3 and are\n     not ignored in R.)\n\n     Matrices are restricted to less than 2^31 rows and columns even on\n     64-bit systems.  So input vectors have the same length\n     restriction: as from R 3.2.0 input matrices with more elements\n     (but meeting the row and column restrictions) are allowed.\n\nValue:\n\n     For the default method, a matrix combining the '...' arguments\n     column-wise or row-wise.  (Exception: if there are no inputs or\n     all the inputs are 'NULL', the value is 'NULL'.)\n\n     The type of a matrix result determined from the highest type of\n     any of the inputs in the hierarchy raw < logical < integer <\n     double < complex < character < list .\n\n     For 'cbind' ('rbind') the column (row) names are taken from the\n     'colnames' ('rownames') of the arguments if these are matrix-like.\n     Otherwise from the names of the arguments or where those are not\n     supplied and 'deparse.level > 0', by deparsing the expressions\n     given, for 'deparse.level = 1' only if that gives a sensible name\n     (a 'symbol', see 'is.symbol').\n\n     For 'cbind' row names are taken from the first argument with\n     appropriate names: rownames for a matrix, or names for a vector of\n     length the number of rows of the result.\n\n     For 'rbind' column names are taken from the first argument with\n     appropriate names: colnames for a matrix, or names for a vector of\n     length the number of columns of the result.\n\nData frame methods:\n\n     The 'cbind' data frame method is just a wrapper for\n     'data.frame(..., check.names = FALSE)'.  This means that it will\n     split matrix columns in data frame arguments, and convert\n     character columns to factors unless 'stringsAsFactors = FALSE' is\n     specified.\n\n     The 'rbind' data frame method first drops all zero-column and\n     zero-row arguments.  
(If that leaves none, it returns the first\n     argument with columns otherwise a zero-column zero-row data\n     frame.)  It then takes the classes of the columns from the first\n     data frame, and matches columns by name (rather than by position).\n     Factors have their levels expanded as necessary (in the order of\n     the levels of the level sets of the factors encountered) and the\n     result is an ordered factor if and only if all the components were\n     ordered factors.  (The last point differs from S-PLUS.)  Old-style\n     categories (integer vectors with levels) are promoted to factors.\n\n     Note that for result column 'j', 'factor(., exclude = X(j))' is\n     applied, where\n\n       X(j) := if(isTRUE(factor.exclude)) {\n                  if(!NA.lev[j]) NA # else NULL\n               } else factor.exclude\n     \n     where 'NA.lev[j]' is true iff any contributing data frame has had\n     a 'factor' in column 'j' with an explicit 'NA' level.\n\nDispatch:\n\n     The method dispatching is _not_ done via 'UseMethod()', but by\n     C-internal dispatching.  Therefore there is no need for, e.g.,\n     'rbind.default'.\n\n     The dispatch algorithm is described in the source file\n     ('.../src/main/bind.c') as\n\n       1. For each argument we get the list of possible class\n          memberships from the class attribute.\n\n       2. We inspect each class in turn to see if there is an\n          applicable method.\n\n       3. If we find a method, we use it.  Otherwise, if there was an\n          S4 object among the arguments, we try S4 dispatch; otherwise,\n          we use the default code.\n\n     If you want to combine other objects with data frames, it may be\n     necessary to coerce them to data frames first.  
(Note that this\n     algorithm can result in calling the data frame method if all the\n     arguments are either data frames or vectors, and this will result\n     in the coercion of character vectors to factors.)\n\nReferences:\n\n     Becker, R. A., Chambers, J. M. and Wilks, A. R. (1988) _The New S\n     Language_.  Wadsworth & Brooks/Cole.\n\nSee Also:\n\n     'c' to combine vectors (and lists) as vectors, 'data.frame' to\n     combine vectors and matrices as a data frame.\n\nExamples:\n\n     m <- cbind(1, 1:7) # the '1' (= shorter vector) is recycled\n     m\n     m <- cbind(m, 8:14)[, c(1, 3, 2)] # insert a column\n     m\n     cbind(1:7, diag(3)) # vector is subset -> warning\n     \n     cbind(0, rbind(1, 1:3))\n     cbind(I = 0, X = rbind(a = 1, b = 1:3))  # use some names\n     xx <- data.frame(I = rep(0,2))\n     cbind(xx, X = rbind(a = 1, b = 1:3))   # named differently\n     \n     cbind(0, matrix(1, nrow = 0, ncol = 4)) #> Warning (making sense)\n     dim(cbind(0, matrix(1, nrow = 2, ncol = 0))) #-> 2 x 1\n     \n     ## deparse.level\n     dd <- 10\n     rbind(1:4, c = 2, \"a++\" = 10, dd, deparse.level = 0) # middle 2 rownames\n     rbind(1:4, c = 2, \"a++\" = 10, dd, deparse.level = 1) # 3 rownames (default)\n     rbind(1:4, c = 2, \"a++\" = 10, dd, deparse.level = 2) # 4 rownames\n     \n     ## cheap row names:\n     b0 <- gl(3,4, labels=letters[1:3])\n     bf <- setNames(b0, paste0(\"o\", seq_along(b0)))\n     df  <- data.frame(a = 1, B = b0, f = gl(4,3))\n     df. <- data.frame(a = 1, B = bf, f = gl(4,3))\n     new <- data.frame(a = 8, B =\"B\", f = \"1\")\n     (df1  <- rbind(df , new))\n     (df.1 <- rbind(df., new))\n     stopifnot(identical(df1, rbind(df,  new, make.row.names=FALSE)),\n               identical(df1, rbind(df., new, make.row.names=FALSE)))\n```\n:::\n:::\n\n\n. . 
.\n\n\n::: {.cell}\n\n```{.r .cell-code}\n?do.call\n```\n\n::: {.cell-output .cell-output-stdout}\n```\nExecute a Function Call\n\nDescription:\n\n     'do.call' constructs and executes a function call from a name or a\n     function and a list of arguments to be passed to it.\n\nUsage:\n\n     do.call(what, args, quote = FALSE, envir = parent.frame())\n     \nArguments:\n\n    what: either a function or a non-empty character string naming the\n          function to be called.\n\n    args: a _list_ of arguments to the function call.  The 'names'\n          attribute of 'args' gives the argument names.\n\n   quote: a logical value indicating whether to quote the arguments.\n\n   envir: an environment within which to evaluate the call.  This will\n          be most useful if 'what' is a character string and the\n          arguments are symbols or quoted expressions.\n\nDetails:\n\n     If 'quote' is 'FALSE', the default, then the arguments are\n     evaluated (in the calling environment, not in 'envir').  If\n     'quote' is 'TRUE' then each argument is quoted (see 'quote') so\n     that the effect of argument evaluation is to remove the quotes -\n     leaving the original arguments unevaluated when the call is\n     constructed.\n\n     The behavior of some functions, such as 'substitute', will not be\n     the same for functions evaluated using 'do.call' as if they were\n     evaluated from the interpreter.  The precise semantics are\n     currently undefined and subject to change.\n\nValue:\n\n     The result of the (evaluated) function call.\n\nWarning:\n\n     This should not be used to attempt to evade restrictions on the\n     use of '.Internal' and other non-API calls.\n\nReferences:\n\n     Becker, R. A., Chambers, J. M. and Wilks, A. R. (1988) _The New S\n     Language_.  
Wadsworth & Brooks/Cole.\n\nSee Also:\n\n     'call' which creates an unevaluated call.\n\nExamples:\n\n     do.call(\"complex\", list(imaginary = 1:3))\n     \n     ## if we already have a list (e.g., a data frame)\n     ## we need c() to add further arguments\n     tmp <- expand.grid(letters[1:2], 1:3, c(\"+\", \"-\"))\n     do.call(\"paste\", c(tmp, sep = \"\"))\n     \n     do.call(paste, list(as.name(\"A\"), as.name(\"B\")), quote = TRUE)\n     \n     ## examples of where objects will be found.\n     A <- 2\n     f <- function(x) print(x^2)\n     env <- new.env()\n     assign(\"A\", 10, envir = env)\n     assign(\"f\", f, envir = env)\n     f <- function(x) print(x)\n     f(A)                                      # 2\n     do.call(\"f\", list(A))                     # 2\n     do.call(\"f\", list(A), envir = env)        # 4\n     do.call( f,  list(A), envir = env)        # 2\n     do.call(\"f\", list(quote(A)), envir = env) # 100\n     do.call( f,  list(quote(A)), envir = env) # 10\n     do.call(\"f\", list(as.name(\"A\")), envir = env) # 100\n     \n     eval(call(\"f\", A))                      # 2\n     eval(call(\"f\", quote(A)))               # 2\n     eval(call(\"f\", A), envir = env)         # 4\n     eval(call(\"f\", quote(A)), envir = env)  # 100\n```\n:::\n:::\n\n\n. . .\n\n* OK, so basically what happened is that\n\n\n::: {.cell}\n\n```{.r .cell-code}\ndo.call(rbind, list)\n```\n:::\n\n\n* Gets transformed into\n\n\n::: {.cell}\n\n```{.r .cell-code}\nrbind(list[[1]], list[[2]], list[[3]], ..., list[[length(list)]])\n```\n:::\n\n\n* That's vectorization magic!\n\n## You try it! (if we have time) {.smaller}\n\n* Use the code you wrote before to get the incidence per 1000 people on the\nentire measles data set (add a column for incidence to the full data).\n* Use the code `plot(NULL, NULL, ...)` to make a blank plot. 
You will need to\nset the `xlim` and `ylim` arguments to sensible values, and specify the axis\ntitles as \"Year\" and \"Incidence per 1000 people\".\n* Using a `for` loop and the `lines()` function, make a plot that shows all of\nthe incidence curves over time, overlapping on the plot.\n* HINT: use `col = adjustcolor(\"black\", alpha.f = 0.25)` to make the curves\ntransparent, so you can see the others.\n* BONUS PROBLEM: using the function `cumsum()`, make a plot of the cumulative\nincidence per 1000 people over time for all of the countries. (Dealing with\nthe NA's here is tricky!!)\n\n## Main problem solution\n\n\n::: {.cell}\n\n```{.r .cell-code}\nmeas$cases_per_thousand <- meas$Cases / as.numeric(meas$total_pop) * 1000\ncountries <- unique(meas$country)\n\nplot(\n\tNULL, NULL,\n\txlim = c(1980, 2022),\n\tylim = c(0, 50),\n\txlab = \"Year\",\n\tylab = \"Incidence per 1000 people\"\n)\n\nfor (i in 1:length(countries)) {\n\tcountry_data <- subset(meas, country == countries[[i]])\n\tlines(\n\t\tx = country_data$time,\n\t\ty = country_data$cases_per_thousand,\n\t\tcol = adjustcolor(\"black\", alpha.f = 0.25)\n\t)\n}\n```\n:::\n\n\n## Main problem solution\n\n\n::: {.cell}\n::: {.cell-output-display}\n![](ModuleXX-Iteration_files/figure-revealjs/unnamed-chunk-30-1.png){width=960}\n:::\n:::\n\n\n## Bonus problem solution\n\n\n::: {.cell}\n\n```{.r .cell-code}\n# First calculate the cumulative cases, treating NA as zeroes\ncumulative_cases <- ave(\n\tx = ifelse(is.na(meas$Cases), 0, meas$Cases),\n\tmeas$country,\n\tFUN = cumsum\n)\n\n# Now put the NAs back where they should be\nmeas$cumulative_cases <- cumulative_cases + (meas$Cases * 0)\n\nplot(\n\tNULL, NULL,\n\txlim = c(1980, 2022),\n\tylim = c(1, 6.2e6),\n\txlab = \"Year\",\n\tylab = \"Cumulative cases per 1000 people\"\n)\n\nfor (i in 1:length(countries)) {\n\tcountry_data <- subset(meas, country == countries[[i]])\n\tlines(\n\t\tx = country_data$time,\n\t\ty = country_data$cumulative_cases,\n\t\tcol = 
adjustcolor(\"black\", alpha.f = 0.25)\n\t)\n}\n\ntext(\n\tx = 2020,\n\ty = 6e6,\n\tlabels = \"China →\"\n)\n```\n:::\n\n\n## Bonus problem solution\n\n\n::: {.cell}\n::: {.cell-output-display}\n![](ModuleXX-Iteration_files/figure-revealjs/unnamed-chunk-32-1.png){width=960}\n:::\n:::\n\n\n## More practice on your own {.smaller}\n\n* Merge the `countries-regions.csv` data with the `measles_final.Rds` data.\nReshape the measles data so that `MCV1` and `MCV2` vaccine coverage are two\nseparate columns. Then use a loop to fit a Poisson regression model for each\ncontinent where `Cases` is the outcome, and `MCV1 coverage` and `MCV2 coverage`\nare the predictors. Discuss your findings, and try adding an interaction term.\n* Assess the impact of `age_months` as a confounder in the Diphtheria serology\ndata. First, write code to transform `age_months` into age ranges for each\nyear. Then, using a loop, calculate the crude odds ratio for the effect of\nvaccination on infection for each of the age ranges. How does the odds ratio\nchange as age increases? Can you formalize this analysis by fitting a logistic\nregression model with `age_months` and vaccination as predictors?\n\n\n",
     "supporting": [
       "ModuleXX-Iteration_files"
     ],
diff --git a/_freeze/modules/ModuleXX-Iteration/figure-revealjs/unnamed-chunk-30-1.png b/_freeze/modules/ModuleXX-Iteration/figure-revealjs/unnamed-chunk-30-1.png
index 84a077e..d1c55ba 100644
Binary files a/_freeze/modules/ModuleXX-Iteration/figure-revealjs/unnamed-chunk-30-1.png and b/_freeze/modules/ModuleXX-Iteration/figure-revealjs/unnamed-chunk-30-1.png differ
diff --git a/_freeze/modules/ModuleXX-Iteration/figure-revealjs/unnamed-chunk-32-1.png b/_freeze/modules/ModuleXX-Iteration/figure-revealjs/unnamed-chunk-32-1.png
index 009fce5..66ca0eb 100644
Binary files a/_freeze/modules/ModuleXX-Iteration/figure-revealjs/unnamed-chunk-32-1.png and b/_freeze/modules/ModuleXX-Iteration/figure-revealjs/unnamed-chunk-32-1.png differ
diff --git a/archive/participants.xlsx b/archive/participants.xlsx
new file mode 100644
index 0000000..9ec8bb6
Binary files /dev/null and b/archive/participants.xlsx differ
diff --git a/docs/archive/CaseStudy01.html b/docs/archive/CaseStudy01.html
new file mode 100644
index 0000000..81d333a
--- /dev/null
+++ b/docs/archive/CaseStudy01.html
@@ -0,0 +1,1008 @@
+<!DOCTYPE html>
+<html lang="en"><head>
+<script src="../site_libs/clipboard/clipboard.min.js"></script>
+<script src="../site_libs/quarto-html/tabby.min.js"></script>
+<script src="../site_libs/quarto-html/popper.min.js"></script>
+<script src="../site_libs/quarto-html/tippy.umd.min.js"></script>
+<link href="../site_libs/quarto-html/tippy.css" rel="stylesheet">
+<link href="../site_libs/quarto-html/light-border.css" rel="stylesheet">
+<link href="../site_libs/quarto-html/quarto-html.min.css" rel="stylesheet" data-mode="light">
+<link href="../site_libs/quarto-html/quarto-syntax-highlighting.css" rel="stylesheet" id="quarto-text-highlighting-styles"><meta charset="utf-8">
+  <meta name="generator" content="quarto-1.3.353">
+
+  <meta name="author" content="Amy Winter">
+  <meta name="author" content="Zane Billings">
+  <title>SISMID Module NUMBER Materials (2025) - Algorithmic Thinking Case Study 1</title>
+  <meta name="apple-mobile-web-app-capable" content="yes">
+  <meta name="apple-mobile-web-app-status-bar-style" content="black-translucent">
+  <meta name="viewport" content="width=device-width, initial-scale=1.0, maximum-scale=1.0, user-scalable=no, minimal-ui">
+  <link rel="stylesheet" href="../site_libs/revealjs/dist/reset.css">
+  <link rel="stylesheet" href="../site_libs/revealjs/dist/reveal.css">
+  <style>
+    code{white-space: pre-wrap;}
+    span.smallcaps{font-variant: small-caps;}
+    div.columns{display: flex; gap: min(4vw, 1.5em);}
+    div.column{flex: auto; overflow-x: auto;}
+    div.hanging-indent{margin-left: 1.5em; text-indent: -1.5em;}
+    ul.task-list{list-style: none;}
+    ul.task-list li input[type="checkbox"] {
+      width: 0.8em;
+      margin: 0 0.8em 0.2em -1em; /* quarto-specific, see https://github.com/quarto-dev/quarto-cli/issues/4556 */ 
+      vertical-align: middle;
+    }
+    /* CSS for syntax highlighting */
+    pre > code.sourceCode { white-space: pre; position: relative; }
+    pre > code.sourceCode > span { display: inline-block; line-height: 1.25; }
+    pre > code.sourceCode > span:empty { height: 1.2em; }
+    .sourceCode { overflow: visible; }
+    code.sourceCode > span { color: inherit; text-decoration: inherit; }
+    div.sourceCode { margin: 1em 0; }
+    pre.sourceCode { margin: 0; }
+    @media screen {
+    div.sourceCode { overflow: auto; }
+    }
+    @media print {
+    pre > code.sourceCode { white-space: pre-wrap; }
+    pre > code.sourceCode > span { text-indent: -5em; padding-left: 5em; }
+    }
+    pre.numberSource code
+      { counter-reset: source-line 0; }
+    pre.numberSource code > span
+      { position: relative; left: -4em; counter-increment: source-line; }
+    pre.numberSource code > span > a:first-child::before
+      { content: counter(source-line);
+        position: relative; left: -1em; text-align: right; vertical-align: baseline;
+        border: none; display: inline-block;
+        -webkit-touch-callout: none; -webkit-user-select: none;
+        -khtml-user-select: none; -moz-user-select: none;
+        -ms-user-select: none; user-select: none;
+        padding: 0 4px; width: 4em;
+        color: #aaaaaa;
+      }
+    pre.numberSource { margin-left: 3em; border-left: 1px solid #aaaaaa;  padding-left: 4px; }
+    div.sourceCode
+      { color: #003b4f; background-color: #f1f3f5; }
+    @media screen {
+    pre > code.sourceCode > span > a:first-child::before { text-decoration: underline; }
+    }
+    code span { color: #003b4f; } /* Normal */
+    code span.al { color: #ad0000; } /* Alert */
+    code span.an { color: #5e5e5e; } /* Annotation */
+    code span.at { color: #657422; } /* Attribute */
+    code span.bn { color: #ad0000; } /* BaseN */
+    code span.bu { } /* BuiltIn */
+    code span.cf { color: #003b4f; } /* ControlFlow */
+    code span.ch { color: #20794d; } /* Char */
+    code span.cn { color: #8f5902; } /* Constant */
+    code span.co { color: #5e5e5e; } /* Comment */
+    code span.cv { color: #5e5e5e; font-style: italic; } /* CommentVar */
+    code span.do { color: #5e5e5e; font-style: italic; } /* Documentation */
+    code span.dt { color: #ad0000; } /* DataType */
+    code span.dv { color: #ad0000; } /* DecVal */
+    code span.er { color: #ad0000; } /* Error */
+    code span.ex { } /* Extension */
+    code span.fl { color: #ad0000; } /* Float */
+    code span.fu { color: #4758ab; } /* Function */
+    code span.im { color: #00769e; } /* Import */
+    code span.in { color: #5e5e5e; } /* Information */
+    code span.kw { color: #003b4f; } /* Keyword */
+    code span.op { color: #5e5e5e; } /* Operator */
+    code span.ot { color: #003b4f; } /* Other */
+    code span.pp { color: #ad0000; } /* Preprocessor */
+    code span.sc { color: #5e5e5e; } /* SpecialChar */
+    code span.ss { color: #20794d; } /* SpecialString */
+    code span.st { color: #20794d; } /* String */
+    code span.va { color: #111111; } /* Variable */
+    code span.vs { color: #20794d; } /* VerbatimString */
+    code span.wa { color: #5e5e5e; font-style: italic; } /* Warning */
+  </style>
+  <link rel="stylesheet" href="../site_libs/revealjs/dist/theme/quarto.css">
+  <link href="../site_libs/revealjs/plugin/quarto-line-highlight/line-highlight.css" rel="stylesheet">
+  <link href="../site_libs/revealjs/plugin/reveal-menu/menu.css" rel="stylesheet">
+  <link href="../site_libs/revealjs/plugin/reveal-menu/quarto-menu.css" rel="stylesheet">
+  <link href="../site_libs/revealjs/plugin/quarto-support/footer.css" rel="stylesheet">
+  <style type="text/css">
+
+  .callout {
+    margin-top: 1em;
+    margin-bottom: 1em;  
+    border-radius: .25rem;
+  }
+
+  .callout.callout-style-simple { 
+    padding: 0em 0.5em;
+    border-left: solid #acacac .3rem;
+    border-right: solid 1px silver;
+    border-top: solid 1px silver;
+    border-bottom: solid 1px silver;
+    display: flex;
+  }
+
+  .callout.callout-style-default {
+    border-left: solid #acacac .3rem;
+    border-right: solid 1px silver;
+    border-top: solid 1px silver;
+    border-bottom: solid 1px silver;
+  }
+
+  .callout .callout-body-container {
+    flex-grow: 1;
+  }
+
+  .callout.callout-style-simple .callout-body {
+    font-size: 1rem;
+    font-weight: 400;
+  }
+
+  .callout.callout-style-default .callout-body {
+    font-size: 0.9rem;
+    font-weight: 400;
+  }
+
+  .callout.callout-titled.callout-style-simple .callout-body {
+    margin-top: 0.2em;
+  }
+
+  .callout:not(.callout-titled) .callout-body {
+      display: flex;
+  }
+
+  .callout:not(.no-icon).callout-titled.callout-style-simple .callout-content {
+    padding-left: 1.6em;
+  }
+
+  .callout.callout-titled .callout-header {
+    padding-top: 0.2em;
+    margin-bottom: -0.2em;
+  }
+
+  .callout.callout-titled .callout-title  p {
+    margin-top: 0.5em;
+    margin-bottom: 0.5em;
+  }
+    
+  .callout.callout-titled.callout-style-simple .callout-content  p {
+    margin-top: 0;
+  }
+
+  .callout.callout-titled.callout-style-default .callout-content  p {
+    margin-top: 0.7em;
+  }
+
+  .callout.callout-style-simple div.callout-title {
+    border-bottom: none;
+    font-size: .9rem;
+    font-weight: 600;
+    opacity: 75%;
+  }
+
+  .callout.callout-style-default  div.callout-title {
+    border-bottom: none;
+    font-weight: 600;
+    opacity: 85%;
+    font-size: 0.9rem;
+    padding-left: 0.5em;
+    padding-right: 0.5em;
+  }
+
+  .callout.callout-style-default div.callout-content {
+    padding-left: 0.5em;
+    padding-right: 0.5em;
+  }
+
+  .callout.callout-style-simple .callout-icon::before {
+    height: 1rem;
+    width: 1rem;
+    display: inline-block;
+    content: "";
+    background-repeat: no-repeat;
+    background-size: 1rem 1rem;
+  }
+
+  .callout.callout-style-default .callout-icon::before {
+    height: 0.9rem;
+    width: 0.9rem;
+    display: inline-block;
+    content: "";
+    background-repeat: no-repeat;
+    background-size: 0.9rem 0.9rem;
+  }
+
+  .callout-title {
+    display: flex
+  }
+    
+  .callout-icon::before {
+    margin-top: 1rem;
+    padding-right: .5rem;
+  }
+
+  .callout.no-icon::before {
+    display: none !important;
+  }
+
+  .callout.callout-titled .callout-body > .callout-content > :last-child {
+    margin-bottom: 0.5rem;
+  }
+
+  .callout.callout-titled .callout-icon::before {
+    margin-top: .5rem;
+    padding-right: .5rem;
+  }
+
+  .callout:not(.callout-titled) .callout-icon::before {
+    margin-top: 1rem;
+    padding-right: .5rem;
+  }
+
+  /* Callout Types */
+
+  div.callout-note {
+    border-left-color: #4582ec !important;
+  }
+
+  div.callout-note .callout-icon::before {
+    background-image: url('data:image/png;base64,iVBORw0KGgoAAAANSUhEUgAAACAAAAAgCAYAAABzenr0AAAAAXNSR0IArs4c6QAAAERlWElmTU0AKgAAAAgAAYdpAAQAAAABAAAAGgAAAAAAA6ABAAMAAAABAAEAAKACAAQAAAABAAAAIKADAAQAAAABAAAAIAAAAACshmLzAAAEU0lEQVRYCcVXTWhcVRQ+586kSUMMxkyaElstCto2SIhitS5Ek8xUKV2poatCcVHtUlFQk8mbaaziwpWgglJwVaquitBOfhQXFlqlzSJpFSpIYyXNjBNiTCck7x2/8/LeNDOZxDuEkgOXe++553zfefee+/OYLOXFk3+1LLrRdiO81yNqZ6K9cG0P3MeFaMIQjXssE8Z1JzLO9ls20MBZX7oG8w9GxB0goaPrW5aNMp1yOZIa7Wv6o2ykpLtmAPs/vrG14Z+6d4jpbSKuhdcSyq9wGMPXjonwmESXrriLzFGOdDBLB8Y6MNYBu0dRokSygMA/mrun8MGFN3behm6VVAwg4WR3i6FvYK1T7MHo9BK7ydH+1uurECoouk5MPRyVSBrBHMYwVobG2aOXM07sWrn5qgB60rc6mcwIDJtQrnrEr44kmy+UO9r0u9O5/YbkS9juQckLed3DyW2XV/qWBBB3ptvI8EUY3I9p/67OW+g967TNr3Sotn3IuVlfMLVnsBwH4fsnebJvyGm5GeIUA3jljERmrv49SizPYuq+z7c2H/jlGC+Ghhupn/hcapqmcudB9jwJ/3jvnvu6vu5lVzF1fXyZuZZ7U8nRmVzytvT+H3kilYvH09mLWrQdwFSsFEsxFVs5fK7A0g8gMZjbif4ACpKbjv7gNGaD8bUrlk8x+KRflttr22JEMRUbTUwwDQScyzPgedQHZT0xnx7ujw2jfVfExwYHwOsDTjLdJ2ebmeQIlJ7neo41s/DrsL3kl+W2lWvAga0tR3zueGr6GL78M3ifH0rGXrBC2aAR8uYcIA5gwV8zIE8onoh8u0Fca/ciF7j1uOzEnqcIm59sEXoGc0+z6+H45V1CvAvHcD7THztu669cnp+L0okAeIc6zjbM/24LgGM1gZk7jnRu1aQWoU9sfUOuhrmtaPIO3YY1KLLWZaEO5TKUbMY5zx8W9UJ6elpLwKXbsaZ4EFl7B4bMtDv0iRipKoDQT2sNQI9b1utXFdYisi+wzZ/ri/1m7QfDgEuvgUUEIJPq3DhX/5DWNqIXDOweC2wvIR90Oq3lDpdMIgD2r0dXvGdsEW5H6x6HLRJYU7C69VefO1x8Gde1ZFSJLfWS1jbCnhtOPxmpfv2LXOA2Xk2tvnwKKPFuZ/oRmwBwqRQDcKNeVQkYcOjtWVBuM/JuYw5b6isojIkYxyYAFn5K7ZBF10fea52y8QltAg6jnMqNHFBmGkQ1j+U43HMi2xMar1Nv0zGsf1s8nUsmUtPOOrbFIR8bHFDMB5zL13Gmr/kGlCkUzedTzzmzsaJXhYawnA3UmARpiYj5ooJZiUoxFRtK3X6pgNPv+IZVPcnwbOl6f+aBaO1CNvPW9n9LmCp01nuSaTRF2YxHqZ8DYQT6WsXT+RD6eUztwYLZ8rM+rcPxamv1VQzFUkzFXvkiVrySGQgJNvXHJAxiU3/NwiC03rSf05VBaPtu/Z7/B8Yn/w7eguloAAAAAElFTkSuQmCC');
+  }
+
+  div.callout-note.callout-style-default .callout-title {
+    background-color: #dae6fb
+  }
+
+  div.callout-important {
+    border-left-color: #d9534f !important;
+  }
+
+  div.callout-important .callout-icon::before {
+    background-image: url('data:image/png;base64,iVBORw0KGgoAAAANSUhEUgAAACAAAAAgCAYAAABzenr0AAAAAXNSR0IArs4c6QAAAERlWElmTU0AKgAAAAgAAYdpAAQAAAABAAAAGgAAAAAAA6ABAAMAAAABAAEAAKACAAQAAAABAAAAIKADAAQAAAABAAAAIAAAAACshmLzAAAEKklEQVRYCcVXTWhcVRS+575MJym48A+hSRFr00ySRQhURRfd2HYjk2SSTokuBCkU2o0LoSKKraKIBTcuFCoidGFD08nkBzdREbpQ1EDNIv8qSGMFUboImMSZd4/f9zJv8ibJMC8xJQfO3HPPPef7zrvvvnvviIkpC9nsw0UttFunbUhpFzFtarSd6WJkStVMw5xyVqYTvkwfzuf/5FgtkVoB0729j1rjXwThS7Vio+Mo6DNnvLfahoZ+i/o32lULuJ3NNiz7q6+pyAUkJaFF6JwaM2lUJlV0MlnQn5aTRbEu0SEqHUa0A4AdiGuB1kFXRfVyg5d87+Dg4DL6m2TLAub60ilj7A1Ec4odSAc8X95sHh7+ZRPCFo6Fnp7HfU/fBng/hi10CjCnWnJjsxvDNxWw0NfV6Rv5GgP3I3jGWXumdTD/3cbEOP2ZbOZp69yniG3FQ9z1jD7bnBu9Fc2tKGC2q+uAJOQHBDRiZX1x36o7fWBs7J9ownbtO+n0/qWkvW7UPIfc37WgT6ZGR++EOJyeQDSb9UB+DZ1G6DdLDzyS+b/kBCYGsYgJbSQHuThGKRcw5xdeQf8YdNHsc6ePXrlSYMBuSIAFTGAtQo+VuALo4BX83N190NWZWbynBjhOHsmNfFWLeL6v+ynsA58zDvvAC8j5PkbOcXCMg2PZFk3q8MjI7WAG/Dp9AwP7jdGBOOQkAvlFUB+irtm16I1Zw9YBcpGTGXYmk3kQIC/Cds55l+iMI3jqhjAuaoe+am2Jw5GT3Nbz3CkE12NavmzN5+erJW7046n/CH1RO/RVa8lBLozXk9uqykkGAyRXLWlLv5jyp4RFsG5vGVzpDLnIjTWgnRy2Rr+tDKvRc7Y8AyZq10jj8DqXdnIRNtFZb+t/ZRtXcDiVnzpqx8mPcDWxgARUqx0W1QB9MeUZiNrV4qP+Ehc+BpNgATsTX8ozYKL2NtFYAHc84fG7ndxUPr+AR/iQSns7uSUufAymwDOb2+NjK27lEFocm/EE2WpyIy/Hi66MWuMKJn8RvxIcj87IM5Vh9663ziW36kR0HNenXuxmfaD8JC7tfKbrhFr7LiZCrMjrzTeGx+PmkosrkNzW94ObzwocJ7A1HokLolY+AvkTiD/q1H0cN48c5EL8Crkttsa/AXQVDmutfyku0E7jShx49XqV3MFK8IryDhYVbj7Sj2P2eBxwcXoe8T8idsKKPRcnZw1b+slFTubwUwhktrfnAt7J++jwQtLZcm3sr9LQrjRzz6cfMv9aLvgmnAGvpoaGLxM4mAEaLV7iAzQ3oU0IvD5x9ix3yF2RAAuYAOO2f7PEFWCXZ4C9Pb2UsgDeVnFSpbFK7/IWu7TPTvBqzbGdCHOJQSxiEjt6IyZmxQyEJHv6xyQsYk//moVFsN2zP6fRImjfq7/n/wFDguUQFNEwugAAAABJRU5ErkJggg==');
+  }
+
+  div.callout-important.callout-style-default .callout-title {
+    background-color: #f7dddc
+  }
+
+  div.callout-warning {
+    border-left-color: #f0ad4e !important;
+  }
+
+  div.callout-warning .callout-icon::before {
+    background-image: url('data:image/png;base64,iVBORw0KGgoAAAANSUhEUgAAACAAAAAgCAYAAABzenr0AAAAAXNSR0IArs4c6QAAAERlWElmTU0AKgAAAAgAAYdpAAQAAAABAAAAGgAAAAAAA6ABAAMAAAABAAEAAKACAAQAAAABAAAAIKADAAQAAAABAAAAIAAAAACshmLzAAAETklEQVRYCeVWW2gcVRg+58yaTUnizqbipZeX4uWhBEniBaoUX1Ioze52t7sRq6APio9V9MEaoWlVsFasRq0gltaAPuxms8lu0gcviE/FFOstVbSIxgcv6SU7EZqmdc7v9+9mJtNks51NTUH84ed889/PP+cmxP+d5FIbMJmNbpREu4WUkiTtCicKny0l1pIKmBzovF2S+hIJHX8iEu3hZJ5lNZGqyRrGSIQpq15AzF28jgpeY6yk6GVdrfFqdrD6Iw+QlB8g0YS2g7dyQmXM/IDhBhT0UCiRf59lfqmmDvzRt6kByV/m4JjtzuaujMUM2c5Z2d6JdKrRb3K2q6mA+oYVz8JnDdKPmmNthzkAk/lN63sYPgevrguc72aZX/L9C6x09GYyxBgCX4NlvyGUHOKELlm5rXeR1kchuChJt4SSwyddZRXgvwMGvYo4QSlk3/zkHD8UHxwVJA6zjZZqP8v8kK8OWLnIZtLyCAJagYC4rTGW/9Pqj92N/c+LUaAj27movwbi19tk/whRCIE7Q9vyI6yvRpftAKVTdUjOW40X3h5OXsKCdmFcx0xlLJoSuQngnrJe7Kcjm4OMq9FlC7CMmScQANuNvjfP3PjGXDBaUQmbp296S5L4DrpbrHN1T87ZVEZVCzg1FF0Ft+dKrlLukI+/c9ENo+TvlTDbYFvuKPtQ9+l052rXrgKoWkDAFnvh0wTOmYn8R5f4k/jN/fZiCM1tQx9jQQ4ANhqG4hiL0qIFTGViG9DKB7GYzgubnpofgYRwO+DFjh0Zin2m4b/97EDkXkc+f6xYAPX0KK2I/7fUQuwzuwo/L3AkcjugPNixC8cHf0FyPjWlItmLxWw4Ou9YsQCr5fijMGoD/zpdRy95HRysyXA74MWOnscpO4j2y3HAVisw85hX5+AFBRSHt4ShfLFkIMXTqyKFc46xdzQM6XbAi702a7sy04J0+feReMFKp5q9esYLCqAZYw/k14E/xcLLsFElaornTuJB0svMuJINy8xkIYuL+xPAlWRceH6+HX7THJ0djLUom46zREu7tTkxwmf/FdOZ/sh6Q8qvEAiHpm4PJ4a/doJe0gH1t+aHRgCzOvBvJedEK5OFE5jpm4AGP2a8Dxe3gGJ/pAutug9Gp6he92CsSsWBaEcxGx0FHytmIpuqGkOpldqNYQK8cSoXvd+xLxXADw0kf6UkJNFtdo5MOgaLjiQOQHcn+A6h5NuL2s0qsC2LOM75PcF3yr5STuBSAcGG+meA14K/CI21HcS4LBT6tv0QAh8Dr5l93AhZzG5ZJ4VxAqdZUEl9z7WJ4aN+svMvwHHL21UKTd1mqvChH7/Za5xzXBBKrUcB0TQ+Ulgkfbi/H/YT5EptrGzsEK7tR1B7ln9BBwckYfMiuSqklSznIuoIIOM42MQO+QnduCoFCI0bpkzjCjddHPN/F+2Yu+sd9bKNpVwHhbS3LluK/0zgfwD0xYI5dXuzlQAAAABJRU5ErkJggg==');
+  }
+
+  div.callout-warning.callout-style-default .callout-title {
+    background-color: #fcefdc
+  }
+
+  div.callout-tip {
+    border-left-color: #02b875 !important;
+  }
+
+  div.callout-tip .callout-icon::before {
+    background-image: url('data:image/png;base64,iVBORw0KGgoAAAANSUhEUgAAACAAAAAgCAYAAABzenr0AAAAAXNSR0IArs4c6QAAAERlWElmTU0AKgAAAAgAAYdpAAQAAAABAAAAGgAAAAAAA6ABAAMAAAABAAEAAKACAAQAAAABAAAAIKADAAQAAAABAAAAIAAAAACshmLzAAADr0lEQVRYCe1XTWgTQRj9ZjZV8a9SPIkKgj8I1bMHsUWrqYLVg4Ue6v9BwZOxSYsIerFao7UiUryIqJcqgtpimhbBXoSCVxUFe9CTiogUrUp2Pt+3aUI2u5vdNh4dmMzOzHvvezuz8xNFM0mjnbXaNu1MvFWRXkXEyE6aYOYJpdW4IXuA4r0fo8qqSMDBU0v1HJUgVieAXxzCsdE/YJTdFcVIZQNMyhruOMJKXYFoLfIfIvVIMWdsrd+Rpd86ZmyzzjJmLStqRn0v8lzkb4rVIXvnpScOJuAn2ACC65FkPzEdEy4TPWRLJ2h7z4cArXzzaOdKlbOvKKX25Wl00jSnrwVxAg3o4dRxhO13RBSdNvH0xSARv3adTXbBdTf64IWO2vH0LT+cv4GR1DJt+DUItaQogeBX/chhbTBxEiZ6gftlDNXTrvT7co4ub5A6gp9HIcHvzTa46OS5fBeP87Qm0fQkr4FsYgVQ7Qg+ZayaDg9jhg1GkWj8RG6lkeSacrrHgDaxdoBiZPg+NXV/KifMuB6//JmYH4CntVEHy/keA6x4h4CU5oFy8GzrBS18cLJMXcljAKB6INjWsRcuZBWVaS3GDrqB7rdapVIeA+isQ57Eev9eCqzqOa81CY05VLd6SamW2wA2H3SiTbnbSxmzfp7WtKZkqy4mdyAlGx7ennghYf8voqp9cLSgKdqNfa6RdRsAAkPwRuJZNbpByn+RrJi1RXTwdi8RQF6ymDwGMAtZ6TVE+4uoKh+MYkcLsT0Hk8eAienbiGdjJHZTpmNjlbFJNKDVAp2fJlYju6IreQxQ08UJDNYdoLSl6AadO+fFuCQqVMB1NJwPm69T04Wv5WhfcWyfXQB+wXRs1pt+nCknRa0LVzSA/2B+a9+zQJadb7IyyV24YAxKp2Jqs3emZTuNnKxsah+uabKbMk7CbTgJx/zIgQYErIeTKRQ9yD9wxVof5YolPHqaWo7TD6tJlh7jQnK5z2n3+fGdggIOx2kaa2YI9QWarc5Ce1ipNWMKeSG4DysFF52KBmTNMmn5HqCFkwy34rDg05gDwgH3bBi+sgFhN/e8QvRn8kbamCOhgrZ9GJhFDgfcMHzFb6BAtjKpFhzTjwv1KCVuxHvCbsSiEz4CANnj84cwHdFXAbAOJ4LTSAawGWFn5tDhLMYz6nWeU2wJfIhmIJBefcd/A5FWQWGgrWzyORZ3Q6HuV+Jf0Bj+BTX69fm1zWgK7By1YTXchFDORywnfQ7GpzOo6S+qECrsx2ifVQAAAABJRU5ErkJggg==');
+  }
+
+  div.callout-tip.callout-style-default .callout-title {
+    background-color: #ccf1e3
+  }
+
+  div.callout-caution {
+    border-left-color: #fd7e14 !important;
+  }
+
+  div.callout-caution .callout-icon::before {
+    background-image: url('data:image/png;base64,iVBORw0KGgoAAAANSUhEUgAAACAAAAAgCAYAAABzenr0AAAAAXNSR0IArs4c6QAAAERlWElmTU0AKgAAAAgAAYdpAAQAAAABAAAAGgAAAAAAA6ABAAMAAAABAAEAAKACAAQAAAABAAAAIKADAAQAAAABAAAAIAAAAACshmLzAAACV0lEQVRYCdVWzWoUQRCuqp2ICBLJXgITZL1EfQDBW/bkzUMUD7klD+ATSHBEfAIfQO+iXsWDxJsHL96EHAwhgzlkg8nBg25XWb0zIb0zs9muYYWkoKeru+vn664fBqElyZNuyh167NXJ8Ut8McjbmEraKHkd7uAnAFku+VWdb3reSmRV8PKSLfZ0Gjn3a6Xlcq9YGb6tADjn+lUfTXtVmaZ1KwBIvFI11rRXlWlatwIAAv2asaa9mlB9wwygiDX26qaw1yYPzFXg2N1GgG0FMF8Oj+VIx7E/03lHx8UhvYyNZLN7BwSPgekXXLribw7w5/c8EF+DBK5idvDVYtEEwMeYefjjLAdEyQ3M9nfOkgnPTEkYU+sxMq0BxNR6jExrAI31H1rzvLEfRIdgcv1XEdj6QTQAS2wtstEALLG1yEZ3QhH6oDX7ExBSFEkFINXH98NTrme5IOaaA7kIfiu2L8A3qhH9zRbukdCqdsA98TdElyeMe5BI8Rs2xHRIsoTSSVFfCFCWGPn9XHb4cdobRIWABNf0add9jakDjQJpJ1bTXOJXnnRXHRf+dNL1ZV1MBRCXhMbaHqGI1JkKIL7+i8uffuP6wVQAzO7+qVEbF6NbS0LJureYcWXUUhH66nLR5rYmva+2tjRFtojkM2aD76HEGAD3tPtKM309FJg5j/K682ywcWJ3PASCcycH/22u+Bh7Aa0ehM2Fu4z0SAE81HF9RkB21c5bEn4Dzw+/qNOyXr3DCTQDMBOdhi4nAgiFDGCinIa2owCEChUwD8qzd03PG+qdW/4fDzjUMcE1ZpIAAAAASUVORK5CYII=');
+  }
+
+  div.callout-caution.callout-style-default .callout-title {
+    background-color: #ffe5d0
+  }
+
+  </style>
+  <style type="text/css">
+    .reveal div.sourceCode {
+      margin: 0;
+      overflow: auto;
+    }
+    .reveal div.hanging-indent {
+      margin-left: 1em;
+      text-indent: -1em;
+    }
+    .reveal .slide:not(.center) {
+      height: 100%;
+    }
+    .reveal .slide.scrollable {
+      overflow-y: auto;
+    }
+    .reveal .footnotes {
+      height: 100%;
+      overflow-y: auto;
+    }
+    .reveal .slide .absolute {
+      position: absolute;
+      display: block;
+    }
+    .reveal .footnotes ol {
+      counter-reset: ol;
+      list-style-type: none; 
+      margin-left: 0;
+    }
+    .reveal .footnotes ol li:before {
+      counter-increment: ol;
+      content: counter(ol) ". "; 
+    }
+    .reveal .footnotes ol li > p:first-child {
+      display: inline-block;
+    }
+    .reveal .slide ul,
+    .reveal .slide ol {
+      margin-bottom: 0.5em;
+    }
+    .reveal .slide ul li,
+    .reveal .slide ol li {
+      margin-top: 0.4em;
+      margin-bottom: 0.2em;
+    }
+    .reveal .slide ul[role="tablist"] li {
+      margin-bottom: 0;
+    }
+    .reveal .slide ul li > *:first-child,
+    .reveal .slide ol li > *:first-child {
+      margin-block-start: 0;
+    }
+    .reveal .slide ul li > *:last-child,
+    .reveal .slide ol li > *:last-child {
+      margin-block-end: 0;
+    }
+    .reveal .slide .columns:nth-child(3) {
+      margin-block-start: 0.8em;
+    }
+    .reveal blockquote {
+      box-shadow: none;
+    }
+    .reveal .tippy-content>* {
+      margin-top: 0.2em;
+      margin-bottom: 0.7em;
+    }
+    .reveal .tippy-content>*:last-child {
+      margin-bottom: 0.2em;
+    }
+    .reveal .slide > img.stretch.quarto-figure-center,
+    .reveal .slide > img.r-stretch.quarto-figure-center {
+      display: block;
+      margin-left: auto;
+      margin-right: auto; 
+    }
+    .reveal .slide > img.stretch.quarto-figure-left,
+    .reveal .slide > img.r-stretch.quarto-figure-left  {
+      display: block;
+      margin-left: 0;
+      margin-right: auto; 
+    }
+    .reveal .slide > img.stretch.quarto-figure-right,
+    .reveal .slide > img.r-stretch.quarto-figure-right  {
+      display: block;
+      margin-left: auto;
+      margin-right: 0; 
+    }
+  </style>
+</head>
+<body class="quarto-light">
+  <div class="reveal">
+    <div class="slides">
+
+<section id="title-slide" class="quarto-title-block center">
+  <h1 class="title">Algorithmic Thinking Case Study 1</h1>
+  <p class="subtitle">SISMID 2024 – Introduction to R</p>
+
+<div class="quarto-title-authors">
+<div class="quarto-title-author">
+<div class="quarto-title-author-name">
+<a href="https://publichealth.uga.edu/faculty-member/amy-k-winter/">Amy Winter</a> <a href="https://orcid.org/0000-0003-2737-7003" class="quarto-title-author-orcid"> <img src="data:image/png;base64,iVBORw0KGgoAAAANSUhEUgAAABAAAAAQCAYAAAAf8/9hAAAAGXRFWHRTb2Z0d2FyZQBBZG9iZSBJbWFnZVJlYWR5ccllPAAAA2ZpVFh0WE1MOmNvbS5hZG9iZS54bXAAAAAAADw/eHBhY2tldCBiZWdpbj0i77u/IiBpZD0iVzVNME1wQ2VoaUh6cmVTek5UY3prYzlkIj8+IDx4OnhtcG1ldGEgeG1sbnM6eD0iYWRvYmU6bnM6bWV0YS8iIHg6eG1wdGs9IkFkb2JlIFhNUCBDb3JlIDUuMC1jMDYwIDYxLjEzNDc3NywgMjAxMC8wMi8xMi0xNzozMjowMCAgICAgICAgIj4gPHJkZjpSREYgeG1sbnM6cmRmPSJodHRwOi8vd3d3LnczLm9yZy8xOTk5LzAyLzIyLXJkZi1zeW50YXgtbnMjIj4gPHJkZjpEZXNjcmlwdGlvbiByZGY6YWJvdXQ9IiIgeG1sbnM6eG1wTU09Imh0dHA6Ly9ucy5hZG9iZS5jb20veGFwLzEuMC9tbS8iIHhtbG5zOnN0UmVmPSJodHRwOi8vbnMuYWRvYmUuY29tL3hhcC8xLjAvc1R5cGUvUmVzb3VyY2VSZWYjIiB4bWxuczp4bXA9Imh0dHA6Ly9ucy5hZG9iZS5jb20veGFwLzEuMC8iIHhtcE1NOk9yaWdpbmFsRG9jdW1lbnRJRD0ieG1wLmRpZDo1N0NEMjA4MDI1MjA2ODExOTk0QzkzNTEzRjZEQTg1NyIgeG1wTU06RG9jdW1lbnRJRD0ieG1wLmRpZDozM0NDOEJGNEZGNTcxMUUxODdBOEVCODg2RjdCQ0QwOSIgeG1wTU06SW5zdGFuY2VJRD0ieG1wLmlpZDozM0NDOEJGM0ZGNTcxMUUxODdBOEVCODg2RjdCQ0QwOSIgeG1wOkNyZWF0b3JUb29sPSJBZG9iZSBQaG90b3Nob3AgQ1M1IE1hY2ludG9zaCI+IDx4bXBNTTpEZXJpdmVkRnJvbSBzdFJlZjppbnN0YW5jZUlEPSJ4bXAuaWlkOkZDN0YxMTc0MDcyMDY4MTE5NUZFRDc5MUM2MUUwNEREIiBzdFJlZjpkb2N1bWVudElEPSJ4bXAuZGlkOjU3Q0QyMDgwMjUyMDY4MTE5OTRDOTM1MTNGNkRBODU3Ii8+IDwvcmRmOkRlc2NyaXB0aW9uPiA8L3JkZjpSREY+IDwveDp4bXBtZXRhPiA8P3hwYWNrZXQgZW5kPSJyIj8+84NovQAAAR1JREFUeNpiZEADy85ZJgCpeCB2QJM6AMQLo4yOL0AWZETSqACk1gOxAQN+cAGIA4EGPQBxmJA0nwdpjjQ8xqArmczw5tMHXAaALDgP1QMxAGqzAAPxQACqh4ER6uf5MBlkm0X4EGayMfMw/Pr7Bd2gRBZogMFBrv01hisv5jLsv9nLAPIOMnjy8RDDyYctyAbFM2EJbRQw+aAWw/LzVgx7b+cwCHKqMhjJFCBLOzAR6+lXX84xnHjYyqAo5IUizkRCwIENQQckGSDGY4TVgAPEaraQr2a4/24bSuoExcJCfAEJihXkWDj3ZAKy9EJGaEo8T0QSxkjSwORsCAuDQCD+QILmD1A9kECEZgxDaEZhICIzGcIyEyOl2RkgwAAhkmC+eAm0TAAAAABJRU5ErkJggg=="></a>
+</div>
+</div>
+<div class="quarto-title-author">
+<div class="quarto-title-author-name">
+<a href="https://wzbillings.com/">Zane Billings</a> <a href="https://orcid.org/0000-0002-0184-6134" class="quarto-title-author-orcid"> <img src="data:image/png;base64,iVBORw0KGgoAAAANSUhEUgAAABAAAAAQCAYAAAAf8/9hAAAAGXRFWHRTb2Z0d2FyZQBBZG9iZSBJbWFnZVJlYWR5ccllPAAAA2ZpVFh0WE1MOmNvbS5hZG9iZS54bXAAAAAAADw/eHBhY2tldCBiZWdpbj0i77u/IiBpZD0iVzVNME1wQ2VoaUh6cmVTek5UY3prYzlkIj8+IDx4OnhtcG1ldGEgeG1sbnM6eD0iYWRvYmU6bnM6bWV0YS8iIHg6eG1wdGs9IkFkb2JlIFhNUCBDb3JlIDUuMC1jMDYwIDYxLjEzNDc3NywgMjAxMC8wMi8xMi0xNzozMjowMCAgICAgICAgIj4gPHJkZjpSREYgeG1sbnM6cmRmPSJodHRwOi8vd3d3LnczLm9yZy8xOTk5LzAyLzIyLXJkZi1zeW50YXgtbnMjIj4gPHJkZjpEZXNjcmlwdGlvbiByZGY6YWJvdXQ9IiIgeG1sbnM6eG1wTU09Imh0dHA6Ly9ucy5hZG9iZS5jb20veGFwLzEuMC9tbS8iIHhtbG5zOnN0UmVmPSJodHRwOi8vbnMuYWRvYmUuY29tL3hhcC8xLjAvc1R5cGUvUmVzb3VyY2VSZWYjIiB4bWxuczp4bXA9Imh0dHA6Ly9ucy5hZG9iZS5jb20veGFwLzEuMC8iIHhtcE1NOk9yaWdpbmFsRG9jdW1lbnRJRD0ieG1wLmRpZDo1N0NEMjA4MDI1MjA2ODExOTk0QzkzNTEzRjZEQTg1NyIgeG1wTU06RG9jdW1lbnRJRD0ieG1wLmRpZDozM0NDOEJGNEZGNTcxMUUxODdBOEVCODg2RjdCQ0QwOSIgeG1wTU06SW5zdGFuY2VJRD0ieG1wLmlpZDozM0NDOEJGM0ZGNTcxMUUxODdBOEVCODg2RjdCQ0QwOSIgeG1wOkNyZWF0b3JUb29sPSJBZG9iZSBQaG90b3Nob3AgQ1M1IE1hY2ludG9zaCI+IDx4bXBNTTpEZXJpdmVkRnJvbSBzdFJlZjppbnN0YW5jZUlEPSJ4bXAuaWlkOkZDN0YxMTc0MDcyMDY4MTE5NUZFRDc5MUM2MUUwNEREIiBzdFJlZjpkb2N1bWVudElEPSJ4bXAuZGlkOjU3Q0QyMDgwMjUyMDY4MTE5OTRDOTM1MTNGNkRBODU3Ii8+IDwvcmRmOkRlc2NyaXB0aW9uPiA8L3JkZjpSREY+IDwveDp4bXBtZXRhPiA8P3hwYWNrZXQgZW5kPSJyIj8+84NovQAAAR1JREFUeNpiZEADy85ZJgCpeCB2QJM6AMQLo4yOL0AWZETSqACk1gOxAQN+cAGIA4EGPQBxmJA0nwdpjjQ8xqArmczw5tMHXAaALDgP1QMxAGqzAAPxQACqh4ER6uf5MBlkm0X4EGayMfMw/Pr7Bd2gRBZogMFBrv01hisv5jLsv9nLAPIOMnjy8RDDyYctyAbFM2EJbRQw+aAWw/LzVgx7b+cwCHKqMhjJFCBLOzAR6+lXX84xnHjYyqAo5IUizkRCwIENQQckGSDGY4TVgAPEaraQr2a4/24bSuoExcJCfAEJihXkWDj3ZAKy9EJGaEo8T0QSxkjSwORsCAuDQCD+QILmD1A9kECEZgxDaEZhICIzGcIyEyOl2RkgwAAhkmC+eAm0TAAAAABJRU5ErkJggg=="></a>
+</div>
+</div>
+</div>
+
+</section>
+<section id="learning-goals" class="slide level2">
+<h2>Learning goals</h2>
+<ul>
+<li>Use logical operators, subsetting functions, and math calculations in R.</li>
+<li>Translate human-understandable problem descriptions into instructions that R can understand.</li>
+</ul>
+</section>
+<section>
+<section id="remember-r-always-does-exactly-what-you-tell-it-to-do" class="title-slide slide level1 center">
+<h1>Remember, R always does EXACTLY what you tell it to do!</h1>
+
+</section>
+<section id="instructions" class="slide level2">
+<h2>Instructions</h2>
+<ul>
+<li>Make a new R script for this case study, and save it to your code folder.</li>
+<li>We’ll use the diphtheria serosample data from Exercise 1 for this case study. Load it into R and use the functions we’ve learned to look at it.</li>
+</ul>
+</section>
+<section id="instructions-1" class="slide level2">
+<h2>Instructions</h2>
+<ul>
+<li>Make a new R script for this case study, and save it to your code folder.</li>
+<li>We’ll use the diphtheria serosample data from Exercise 1 for this case study. Load it into R and use the functions we’ve learned to look at it.</li>
+<li>The <code>str()</code> of your dataset should look like this.</li>
+</ul>
+<div class="cell">
+<div class="cell-output cell-output-stdout">
+<pre><code>tibble [250 × 5] (S3: tbl_df/tbl/data.frame)
+ $ age_months  : num [1:250] 15 44 103 88 88 118 85 19 78 112 ...
+ $ group       : chr [1:250] "urban" "rural" "urban" "urban" ...
+ $ DP_antibody : num [1:250] 0.481 0.657 1.368 1.218 0.333 ...
+ $ DP_infection: num [1:250] 1 1 1 1 1 1 1 1 1 1 ...
+ $ DP_vacc     : num [1:250] 0 1 1 1 1 1 1 1 1 1 ...</code></pre>
+</div>
+</div>
+</section>
+<section id="q1-was-the-overall-prevalence-higher-in-urban-or-rural-areas" class="slide level2">
+<h2>Q1: Was the overall prevalence higher in urban or rural areas?</h2>
+<div>
+<ol type="1">
+<li class="fragment">How do we calculate the prevalence from the data?</li>
+<li class="fragment">How do we calculate the prevalence separately for urban and rural areas?</li>
+<li class="fragment">How do we determine which prevalence is higher and if the difference is meaningful?</li>
+</ol>
+</div>
+</section>
+<section id="q1-how-do-we-calculate-the-prevalence-from-the-data" class="slide level2">
+<h2>Q1: How do we calculate the prevalence from the data?</h2>
+<div>
+<ul>
+<li class="fragment">The variable <code>DP_infection</code> in our dataset is binary / dichotomous.</li>
+<li class="fragment">The prevalence is the number or percentage of people who have the disease at a given time or over a given period.</li>
+<li class="fragment">The average of a binary variable gives the prevalence!</li>
+</ul>
+</div>
+<div class="fragment">
+<div class="cell">
+<div class="sourceCode cell-code" id="cb2"><pre class="sourceCode numberSource r number-lines code-with-copy"><code class="sourceCode r"><span id="cb2-1"><a href="#cb2-1"></a><span class="fu">mean</span>(diph<span class="sc">$</span>DP_infection)</span></code><button title="Copy to Clipboard" class="code-copy-button"><i class="bi"></i></button></pre></div>
+<div class="cell-output cell-output-stdout">
+<pre><code>[1] 0.8</code></pre>
+</div>
+</div>
+</div>
+</section>
+<section id="q1-how-do-we-calculate-the-prevalence-separately-for-urban-and-rural-areas" class="slide level2">
+<h2>Q1: How do we calculate the prevalence separately for urban and rural areas?</h2>
+<div class="fragment">
+<div class="cell">
+<div class="sourceCode cell-code" id="cb4"><pre class="sourceCode numberSource r number-lines code-with-copy"><code class="sourceCode r"><span id="cb4-1"><a href="#cb4-1"></a><span class="fu">mean</span>(diph[diph<span class="sc">$</span>group <span class="sc">==</span> <span class="st">"urban"</span>, ]<span class="sc">$</span>DP_infection)</span></code><button title="Copy to Clipboard" class="code-copy-button"><i class="bi"></i></button></pre></div>
+<div class="cell-output cell-output-stdout">
+<pre><code>[1] 0.8235294</code></pre>
+</div>
+<div class="sourceCode cell-code" id="cb6"><pre class="sourceCode numberSource r number-lines code-with-copy"><code class="sourceCode r"><span id="cb6-1"><a href="#cb6-1"></a><span class="fu">mean</span>(diph[diph<span class="sc">$</span>group <span class="sc">==</span> <span class="st">"rural"</span>, ]<span class="sc">$</span>DP_infection)</span></code><button title="Copy to Clipboard" class="code-copy-button"><i class="bi"></i></button></pre></div>
+<div class="cell-output cell-output-stdout">
+<pre><code>[1] 0.778626</code></pre>
+</div>
+</div>
+</div>
+<div class="fragment">
+<ul>
+<li>There are many ways you could write this code! You can use <code>subset()</code> or you can write the indices many ways.</li>
+<li><code>tbl_df</code> objects (tibbles, such as those returned by <code>haven</code>) follow different <code>[</code> and <code>[[</code> rules than a base R data frame.</li>
+</ul>
+</div>
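+<p>As a rough illustration of "many ways", here are three equivalent versions of the urban calculation. The data frame <code>toy</code> is a hypothetical stand-in that only borrows the column names of the course dataset.</p>
+<div class="cell">
+<div class="sourceCode cell-code"><pre class="sourceCode r"><code class="sourceCode r"># Hypothetical toy data with the same column names as the course dataset
+toy &lt;- data.frame(
+    group = c("urban", "rural", "urban", "rural"),
+    DP_infection = c(1, 0, 0, 1)
+)
+# Subset the rows first, then pull out the column
+p1 &lt;- mean(toy[toy$group == "urban", ]$DP_infection)
+# Pull out the column first, then subset the vector
+p2 &lt;- mean(toy$DP_infection[toy$group == "urban"])
+# Let subset() do the row filtering
+p3 &lt;- mean(subset(toy, group == "urban")$DP_infection)
+c(p1, p2, p3)</code></pre></div>
+</div>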
+</section>
+<section id="q1-how-do-we-calculate-the-prevalence-separately-for-urban-and-rural-areas-1" class="slide level2">
+<h2>Q1: How do we calculate the prevalence separately for urban and rural areas?</h2>
+<ul>
+<li>One easy way is to use the <code>aggregate()</code> function.</li>
+</ul>
+<div class="cell">
+<div class="sourceCode cell-code" id="cb8"><pre class="sourceCode numberSource r number-lines code-with-copy"><code class="sourceCode r"><span id="cb8-1"><a href="#cb8-1"></a><span class="fu">aggregate</span>(DP_infection <span class="sc">~</span> group, <span class="at">data =</span> diph, <span class="at">FUN =</span> mean)</span></code><button title="Copy to Clipboard" class="code-copy-button"><i class="bi"></i></button></pre></div>
+<div class="cell-output cell-output-stdout">
+<pre><code>  group DP_infection
+1 rural    0.7786260
+2 urban    0.8235294</code></pre>
+</div>
+</div>
+</section>
+<section id="q1-how-do-we-determine-which-prevalence-is-higher-and-if-the-difference-is-meaningful" class="slide level2">
+<h2>Q1: How do we determine which prevalence is higher and if the difference is meaningful?</h2>
+<div>
+<ul>
+<li class="fragment">We probably need to include a confidence interval in our calculation.</li>
+<li class="fragment">This is actually not so easy without more advanced tools that we will learn in upcoming modules.</li>
+<li class="fragment">Right now the best options are to do it by hand or google a function.</li>
+</ul>
+</div>
+</section>
+<section id="q1-by-hand" class="slide level2">
+<h2>Q1: By hand</h2>
+<div class="cell">
+<div class="sourceCode cell-code" id="cb10"><pre class="sourceCode numberSource r number-lines code-with-copy"><code class="sourceCode r"><span id="cb10-1"><a href="#cb10-1"></a>p_urban <span class="ot">&lt;-</span> <span class="fu">mean</span>(diph[diph<span class="sc">$</span>group <span class="sc">==</span> <span class="st">"urban"</span>, ]<span class="sc">$</span>DP_infection)</span>
+<span id="cb10-2"><a href="#cb10-2"></a>p_rural <span class="ot">&lt;-</span> <span class="fu">mean</span>(diph[diph<span class="sc">$</span>group <span class="sc">==</span> <span class="st">"rural"</span>, ]<span class="sc">$</span>DP_infection)</span>
+<span id="cb10-3"><a href="#cb10-3"></a>se_urban <span class="ot">&lt;-</span> <span class="fu">sqrt</span>(p_urban <span class="sc">*</span> (<span class="dv">1</span> <span class="sc">-</span> p_urban) <span class="sc">/</span> <span class="fu">nrow</span>(diph[diph<span class="sc">$</span>group <span class="sc">==</span> <span class="st">"urban"</span>, ]))</span>
+<span id="cb10-4"><a href="#cb10-4"></a>se_rural <span class="ot">&lt;-</span> <span class="fu">sqrt</span>(p_rural <span class="sc">*</span> (<span class="dv">1</span> <span class="sc">-</span> p_rural) <span class="sc">/</span> <span class="fu">nrow</span>(diph[diph<span class="sc">$</span>group <span class="sc">==</span> <span class="st">"rural"</span>, ])) </span>
+<span id="cb10-5"><a href="#cb10-5"></a></span>
+<span id="cb10-6"><a href="#cb10-6"></a>result_urban <span class="ot">&lt;-</span> <span class="fu">paste0</span>(</span>
+<span id="cb10-7"><a href="#cb10-7"></a>    <span class="st">"Urban: "</span>, <span class="fu">round</span>(p_urban, <span class="dv">2</span>), <span class="st">"; 95% CI: ("</span>,</span>
+<span id="cb10-8"><a href="#cb10-8"></a>    <span class="fu">round</span>(p_urban <span class="sc">-</span> <span class="fl">1.96</span> <span class="sc">*</span> se_urban, <span class="dv">2</span>), <span class="st">", "</span>,</span>
+<span id="cb10-9"><a href="#cb10-9"></a>    <span class="fu">round</span>(p_urban <span class="sc">+</span> <span class="fl">1.96</span> <span class="sc">*</span> se_urban, <span class="dv">2</span>), <span class="st">")"</span></span>
+<span id="cb10-10"><a href="#cb10-10"></a>)</span>
+<span id="cb10-11"><a href="#cb10-11"></a></span>
+<span id="cb10-12"><a href="#cb10-12"></a>result_rural <span class="ot">&lt;-</span> <span class="fu">paste0</span>(</span>
+<span id="cb10-13"><a href="#cb10-13"></a>    <span class="st">"Rural: "</span>, <span class="fu">round</span>(p_rural, <span class="dv">2</span>), <span class="st">"; 95% CI: ("</span>,</span>
+<span id="cb10-14"><a href="#cb10-14"></a>    <span class="fu">round</span>(p_rural <span class="sc">-</span> <span class="fl">1.96</span> <span class="sc">*</span> se_rural, <span class="dv">2</span>), <span class="st">", "</span>,</span>
+<span id="cb10-15"><a href="#cb10-15"></a>    <span class="fu">round</span>(p_rural <span class="sc">+</span> <span class="fl">1.96</span> <span class="sc">*</span> se_rural, <span class="dv">2</span>), <span class="st">")"</span></span>
+<span id="cb10-16"><a href="#cb10-16"></a>)</span>
+<span id="cb10-17"><a href="#cb10-17"></a></span>
+<span id="cb10-18"><a href="#cb10-18"></a><span class="fu">cat</span>(result_urban, result_rural, <span class="at">sep =</span> <span class="st">"</span><span class="sc">\n</span><span class="st">"</span>)</span></code><button title="Copy to Clipboard" class="code-copy-button"><i class="bi"></i></button></pre></div>
+<div class="cell-output cell-output-stdout">
+<pre><code>Urban: 0.82; 95% CI: (0.76, 0.89)
+Rural: 0.78; 95% CI: (0.71, 0.85)</code></pre>
+</div>
+</div>
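+<p>One possible way to cut the repetition above is to wrap the calculation in a function. This is only a sketch: <code>wald_ci()</code> is a hypothetical helper name, and the toy vector <code>x</code> stands in for one group's <code>DP_infection</code> values.</p>
+<div class="cell">
+<div class="sourceCode cell-code"><pre class="sourceCode r"><code class="sourceCode r"># Hypothetical helper: normal-approximation (Wald) CI for the mean of a 0/1 vector
+wald_ci &lt;- function(x, z = 1.96) {
+    p &lt;- mean(x)
+    se &lt;- sqrt(p * (1 - p) / length(x))
+    c(estimate = p, lower = p - z * se, upper = p + z * se)
+}
+# Toy 0/1 vector standing in for one group's infection indicator
+x &lt;- c(1, 1, 0, 1, 0, 1, 1, 1, 0, 1)
+round(wald_ci(x), 2)</code></pre></div>
+</div>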
+</section>
+<section id="q1-by-hand-1" class="slide level2">
+<h2>Q1: By hand</h2>
+<ul>
+<li>We can see that the 95% CIs overlap, so the groups are probably not that different. <strong>To be sure, we need to do a two-sample test! But this is not a statistics class.</strong></li>
+<li>Some people will tell you that coding like this is “bad”. <strong>But ‘bad’ code that gives you answers is better than broken code!</strong> We will learn techniques for writing this with less work and less repetition in upcoming modules.</li>
+</ul>
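+<p>As a rough sketch of the two-sample test mentioned above, base R's <code>prop.test()</code> compares two proportions. The counts below are made up for illustration and are not taken from the diphtheria data.</p>
+<div class="cell">
+<div class="sourceCode cell-code"><pre class="sourceCode r"><code class="sourceCode r"># Hypothetical counts: number infected and group size for two groups
+infected &lt;- c(40, 35)
+totals &lt;- c(50, 50)
+# Two-sample test for equality of proportions
+res &lt;- prop.test(x = infected, n = totals)
+res$p.value</code></pre></div>
+</div>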
+</section>
+<section id="q1-googling-a-package" class="slide level2">
+<h2>Q1: Googling a package</h2>
+<div class="fragment">
+<div class="cell">
+<div class="sourceCode cell-code" id="cb12"><pre class="sourceCode numberSource r number-lines code-with-copy"><code class="sourceCode r"><span id="cb12-1"><a href="#cb12-1"></a><span class="co"># install.packages("DescTools")</span></span>
+<span id="cb12-2"><a href="#cb12-2"></a><span class="fu">library</span>(DescTools)</span>
+<span id="cb12-3"><a href="#cb12-3"></a></span>
+<span id="cb12-4"><a href="#cb12-4"></a><span class="fu">aggregate</span>(DP_infection <span class="sc">~</span> group, <span class="at">data =</span> diph, <span class="at">FUN =</span> DescTools<span class="sc">::</span>MeanCI)</span></code><button title="Copy to Clipboard" class="code-copy-button"><i class="bi"></i></button></pre></div>
+<div class="cell-output cell-output-stdout">
+<pre><code>  group DP_infection.mean DP_infection.lwr.ci DP_infection.upr.ci
+1 rural         0.7786260           0.7065872           0.8506647
+2 urban         0.8235294           0.7540334           0.8930254</code></pre>
+</div>
+</div>
+</div>
+</section>
+<section id="you-try-it" class="slide level2">
+<h2>You try it!</h2>
+<ul>
+<li>Using any of the approaches you can think of, answer this question!</li>
+<li><strong>How many children under 5 were vaccinated? In children under 5, did vaccination lower the prevalence of infection?</strong></li>
+</ul>
+</section>
+<section id="you-try-it-1" class="slide level2">
+<h2>You try it!</h2>
+<div class="cell">
+<div class="sourceCode cell-code" id="cb14"><pre class="sourceCode numberSource r number-lines code-with-copy"><code class="sourceCode r"><span id="cb14-1"><a href="#cb14-1"></a><span class="co"># How many children under 5 were vaccinated</span></span>
+<span id="cb14-2"><a href="#cb14-2"></a><span class="fu">sum</span>(diph<span class="sc">$</span>DP_vacc[diph<span class="sc">$</span>age_months <span class="sc">&lt;</span> <span class="dv">60</span>])</span></code><button title="Copy to Clipboard" class="code-copy-button"><i class="bi"></i></button></pre></div>
+<div class="cell-output cell-output-stdout">
+<pre><code>[1] 91</code></pre>
+</div>
+<div class="sourceCode cell-code" id="cb16"><pre class="sourceCode numberSource r number-lines code-with-copy"><code class="sourceCode r"><span id="cb16-1"><a href="#cb16-1"></a><span class="co"># Prevalence in both vaccine groups for children under 5</span></span>
+<span id="cb16-2"><a href="#cb16-2"></a><span class="fu">aggregate</span>(</span>
+<span id="cb16-3"><a href="#cb16-3"></a>    DP_infection <span class="sc">~</span> DP_vacc,</span>
+<span id="cb16-4"><a href="#cb16-4"></a>    <span class="at">data =</span> <span class="fu">subset</span>(diph, age_months <span class="sc">&lt;</span> <span class="dv">60</span>),</span>
+<span id="cb16-5"><a href="#cb16-5"></a>    <span class="at">FUN =</span> DescTools<span class="sc">::</span>MeanCI</span>
+<span id="cb16-6"><a href="#cb16-6"></a>)</span></code><button title="Copy to Clipboard" class="code-copy-button"><i class="bi"></i></button></pre></div>
+<div class="cell-output cell-output-stdout">
+<pre><code>  DP_vacc DP_infection.mean DP_infection.lwr.ci DP_infection.upr.ci
+1       0         0.4285714           0.1977457           0.6593972
+2       1         0.6373626           0.5366845           0.7380407</code></pre>
+</div>
+</div>
+<p>It appears that prevalence was HIGHER in the vaccinated group. That is counterintuitive, but the unvaccinated group is too small (note its wide confidence interval) to be sure.</p>
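+<p>A quick way to check the group sizes behind a comparison like this is <code>table()</code>. The sketch below uses a hypothetical data frame <code>toy</code> that borrows the course column names; the values are made up.</p>
+<div class="cell">
+<div class="sourceCode cell-code"><pre class="sourceCode r"><code class="sourceCode r"># Hypothetical toy data with the same column names as the course dataset
+toy &lt;- data.frame(
+    age_months = c(12, 30, 48, 70, 20, 55),
+    DP_vacc = c(1, 1, 0, 1, 1, 0)
+)
+# Number of children under 5 (under 60 months) in each vaccination group
+counts &lt;- table(toy$DP_vacc[toy$age_months &lt; 60])
+counts</code></pre></div>
+</div>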
+</section>
+<section id="congratulations-for-finishing-the-first-case-study" class="slide level2">
+<h2>Congratulations on finishing the first case study!</h2>
+<ul>
+<li>What R functions and skills did you practice?</li>
+<li>What other questions could you answer about the same dataset with the skills you know now?</li>
+</ul>
+
+<div class="footer footer-default">
+
+</div>
+</section></section>
+    </div>
+  </div>
+
+  <script>window.backupDefine = window.define; window.define = undefined;</script>
+  <script src="../site_libs/revealjs/dist/reveal.js"></script>
+  <!-- reveal.js plugins -->
+  <script src="../site_libs/revealjs/plugin/quarto-line-highlight/line-highlight.js"></script>
+  <script src="../site_libs/revealjs/plugin/pdf-export/pdfexport.js"></script>
+  <script src="../site_libs/revealjs/plugin/reveal-menu/menu.js"></script>
+  <script src="../site_libs/revealjs/plugin/reveal-menu/quarto-menu.js"></script>
+  <script src="../site_libs/revealjs/plugin/quarto-support/support.js"></script>
+  
+
+  <script src="../site_libs/revealjs/plugin/notes/notes.js"></script>
+  <script src="../site_libs/revealjs/plugin/search/search.js"></script>
+  <script src="../site_libs/revealjs/plugin/zoom/zoom.js"></script>
+  <script src="../site_libs/revealjs/plugin/math/math.js"></script>
+  <script>window.define = window.backupDefine; window.backupDefine = undefined;</script>
+
+  <script>
+
+      // Full list of configuration options available at:
+      // https://revealjs.com/config/
+      Reveal.initialize({
+'controlsAuto': true,
+'previewLinksAuto': false,
+'smaller': false,
+'pdfSeparateFragments': false,
+'autoAnimateEasing': "ease",
+'autoAnimateDuration': 1,
+'autoAnimateUnmatched': true,
+'menu': {"side":"left","useTextContentForMissingTitles":true,"markers":false,"loadIcons":false,"custom":[{"title":"Tools","icon":"<i class=\"fas fa-gear\"></i>","content":"<ul class=\"slide-menu-items\">\n<li class=\"slide-tool-item active\" data-item=\"0\"><a href=\"#\" onclick=\"RevealMenuToolHandlers.fullscreen(event)\"><kbd>f</kbd> Fullscreen</a></li>\n<li class=\"slide-tool-item\" data-item=\"1\"><a href=\"#\" onclick=\"RevealMenuToolHandlers.speakerMode(event)\"><kbd>s</kbd> Speaker View</a></li>\n<li class=\"slide-tool-item\" data-item=\"2\"><a href=\"#\" onclick=\"RevealMenuToolHandlers.overview(event)\"><kbd>o</kbd> Slide Overview</a></li>\n<li class=\"slide-tool-item\" data-item=\"3\"><a href=\"#\" onclick=\"RevealMenuToolHandlers.togglePdfExport(event)\"><kbd>e</kbd> PDF Export Mode</a></li>\n<li class=\"slide-tool-item\" data-item=\"4\"><a href=\"#\" onclick=\"RevealMenuToolHandlers.keyboardHelp(event)\"><kbd>?</kbd> Keyboard Help</a></li>\n</ul>"}],"openButton":true},
+'smaller': false,
+ 
+        // Display controls in the bottom right corner
+        controls: false,
+
+        // Help the user learn the controls by providing hints, for example by
+        // bouncing the down arrow when they first encounter a vertical slide
+        controlsTutorial: false,
+
+        // Determines where controls appear, "edges" or "bottom-right"
+        controlsLayout: 'edges',
+
+        // Visibility rule for backwards navigation arrows; "faded", "hidden"
+        // or "visible"
+        controlsBackArrows: 'faded',
+
+        // Display a presentation progress bar
+        progress: true,
+
+        // Display the page number of the current slide
+        slideNumber: false,
+
+        // 'all', 'print', or 'speaker'
+        showSlideNumber: 'all',
+
+        // Add the current slide number to the URL hash so that reloading the
+        // page/copying the URL will return you to the same slide
+        hash: true,
+
+        // Start with 1 for the hash rather than 0
+        hashOneBasedIndex: false,
+
+        // Flags if we should monitor the hash and change slides accordingly
+        respondToHashChanges: true,
+
+        // Push each slide change to the browser history
+        history: true,
+
+        // Enable keyboard shortcuts for navigation
+        keyboard: true,
+
+        // Enable the slide overview mode
+        overview: true,
+
+        // Disables the default reveal.js slide layout (scaling and centering)
+        // so that you can use custom CSS layout
+        disableLayout: false,
+
+        // Vertical centering of slides
+        center: false,
+
+        // Enables touch navigation on devices with touch input
+        touch: true,
+
+        // Loop the presentation
+        loop: false,
+
+        // Change the presentation direction to be RTL
+        rtl: false,
+
+        // see https://revealjs.com/vertical-slides/#navigation-mode
+        navigationMode: 'linear',
+
+        // Randomizes the order of slides each time the presentation loads
+        shuffle: false,
+
+        // Turns fragments on and off globally
+        fragments: true,
+
+        // Flags whether to include the current fragment in the URL,
+        // so that reloading brings you to the same fragment position
+        fragmentInURL: false,
+
+        // Flags if the presentation is running in an embedded mode,
+        // i.e. contained within a limited portion of the screen
+        embedded: false,
+
+        // Flags if we should show a help overlay when the questionmark
+        // key is pressed
+        help: true,
+
+        // Flags if it should be possible to pause the presentation (blackout)
+        pause: true,
+
+        // Flags if speaker notes should be visible to all viewers
+        showNotes: false,
+
+        // Global override for autoplaying embedded media (null/true/false)
+        autoPlayMedia: null,
+
+        // Global override for preloading lazy-loaded iframes (null/true/false)
+        preloadIframes: null,
+
+        // Number of milliseconds between automatically proceeding to the
+        // next slide, disabled when set to 0, this value can be overwritten
+        // by using a data-autoslide attribute on your slides
+        autoSlide: 0,
+
+        // Stop auto-sliding after user input
+        autoSlideStoppable: true,
+
+        // Use this method for navigation when auto-sliding
+        autoSlideMethod: null,
+
+        // Specify the average time in seconds that you think you will spend
+        // presenting each slide. This is used to show a pacing timer in the
+        // speaker view
+        defaultTiming: null,
+
+        // Enable slide navigation via mouse wheel
+        mouseWheel: false,
+
+        // The display mode that will be used to show slides
+        display: 'block',
+
+        // Hide cursor if inactive
+        hideInactiveCursor: true,
+
+        // Time before the cursor is hidden (in ms)
+        hideCursorTime: 5000,
+
+        // Opens links in an iframe preview overlay
+        previewLinks: false,
+
+        // Transition style (none/fade/slide/convex/concave/zoom)
+        transition: 'none',
+
+        // Transition speed (default/fast/slow)
+        transitionSpeed: 'default',
+
+        // Transition style for full page slide backgrounds
+        // (none/fade/slide/convex/concave/zoom)
+        backgroundTransition: 'none',
+
+        // Number of slides away from the current that are visible
+        viewDistance: 3,
+
+        // Number of slides away from the current that are visible on mobile
+        // devices. It is advisable to set this to a lower number than
+        // viewDistance in order to save resources.
+        mobileViewDistance: 2,
+
+        // The "normal" size of the presentation, aspect ratio will be preserved
+        // when the presentation is scaled to fit different resolutions. Can be
+        // specified using percentage units.
+        width: 1050,
+
+        height: 700,
+
+        // Factor of the display size that should remain empty around the content
+        margin: 0.1,
+
+        math: {
+          mathjax: 'https://cdnjs.cloudflare.com/ajax/libs/mathjax/2.7.0/MathJax.js',
+          config: 'TeX-AMS_HTML-full',
+          tex2jax: {
+            inlineMath: [['\\(','\\)']],
+            displayMath: [['\\[','\\]']],
+            balanceBraces: true,
+            processEscapes: false,
+            processRefs: true,
+            processEnvironments: true,
+            preview: 'TeX',
+            skipTags: ['script','noscript','style','textarea','pre','code'],
+            ignoreClass: 'tex2jax_ignore',
+            processClass: 'tex2jax_process'
+          },
+        },
+
+        // reveal.js plugins
+        plugins: [QuartoLineHighlight, PdfExport, RevealMenu, QuartoSupport,
+
+          RevealMath,
+          RevealNotes,
+          RevealSearch,
+          RevealZoom
+        ]
+      });
+    </script>
+    
+    <script>
+      // htmlwidgets need to know to resize themselves when slides are shown/hidden.
+      // Fire the "slideenter" event (handled by htmlwidgets.js) when the current
+      // slide changes (different for each slide format).
+      (function () {
+        // dispatch for htmlwidgets
+        function fireSlideEnter() {
+          const event = window.document.createEvent("Event");
+          event.initEvent("slideenter", true, true);
+          window.document.dispatchEvent(event);
+        }
+
+        function fireSlideChanged(previousSlide, currentSlide) {
+          fireSlideEnter();
+
+          // dispatch for shiny
+          if (window.jQuery) {
+            if (previousSlide) {
+              window.jQuery(previousSlide).trigger("hidden");
+            }
+            if (currentSlide) {
+              window.jQuery(currentSlide).trigger("shown");
+            }
+          }
+        }
+
+        // hookup for slidy
+        if (window.w3c_slidy) {
+          window.w3c_slidy.add_observer(function (slide_num) {
+            // slide_num starts at position 1
+            fireSlideChanged(null, w3c_slidy.slides[slide_num - 1]);
+          });
+        }
+
+      })();
+    </script>
+
+    <script id="quarto-html-after-body" type="application/javascript">
+    window.document.addEventListener("DOMContentLoaded", function (event) {
+      const toggleBodyColorMode = (bsSheetEl) => {
+        const mode = bsSheetEl.getAttribute("data-mode");
+        const bodyEl = window.document.querySelector("body");
+        if (mode === "dark") {
+          bodyEl.classList.add("quarto-dark");
+          bodyEl.classList.remove("quarto-light");
+        } else {
+          bodyEl.classList.add("quarto-light");
+          bodyEl.classList.remove("quarto-dark");
+        }
+      }
+      const toggleBodyColorPrimary = () => {
+        const bsSheetEl = window.document.querySelector("link#quarto-bootstrap");
+        if (bsSheetEl) {
+          toggleBodyColorMode(bsSheetEl);
+        }
+      }
+      toggleBodyColorPrimary();  
+      const tabsets =  window.document.querySelectorAll(".panel-tabset-tabby")
+      tabsets.forEach(function(tabset) {
+        const tabby = new Tabby('#' + tabset.id);
+      });
+      const isCodeAnnotation = (el) => {
+        for (const clz of el.classList) {
+          if (clz.startsWith('code-annotation-')) {                     
+            return true;
+          }
+        }
+        return false;
+      }
+      const clipboard = new window.ClipboardJS('.code-copy-button', {
+        text: function(trigger) {
+          const codeEl = trigger.previousElementSibling.cloneNode(true);
+          for (const childEl of codeEl.children) {
+            if (isCodeAnnotation(childEl)) {
+              childEl.remove();
+            }
+          }
+          return codeEl.innerText;
+        }
+      });
+      clipboard.on('success', function(e) {
+        // button target
+        const button = e.trigger;
+        // don't keep focus
+        button.blur();
+        // flash "checked"
+        button.classList.add('code-copy-button-checked');
+        var currentTitle = button.getAttribute("title");
+        button.setAttribute("title", "Copied!");
+        let tooltip;
+        if (window.bootstrap) {
+          button.setAttribute("data-bs-toggle", "tooltip");
+          button.setAttribute("data-bs-placement", "left");
+          button.setAttribute("data-bs-title", "Copied!");
+          tooltip = new bootstrap.Tooltip(button, 
+            { trigger: "manual", 
+              customClass: "code-copy-button-tooltip",
+              offset: [0, -8]});
+          tooltip.show();    
+        }
+        setTimeout(function() {
+          if (tooltip) {
+            tooltip.hide();
+            button.removeAttribute("data-bs-title");
+            button.removeAttribute("data-bs-toggle");
+            button.removeAttribute("data-bs-placement");
+          }
+          button.setAttribute("title", currentTitle);
+          button.classList.remove('code-copy-button-checked');
+        }, 1000);
+        // clear code selection
+        e.clearSelection();
+      });
+      function tippyHover(el, contentFn) {
+        const config = {
+          allowHTML: true,
+          content: contentFn,
+          maxWidth: 500,
+          delay: 100,
+          arrow: false,
+          appendTo: function(el) {
+              return el.closest('section.slide') || el.parentElement;
+          },
+          interactive: true,
+          interactiveBorder: 10,
+          theme: 'light-border',
+          placement: 'bottom-start'
+        };
+          config['offset'] = [0,0];
+          config['maxWidth'] = 700;
+        window.tippy(el, config); 
+      }
+      const noterefs = window.document.querySelectorAll('a[role="doc-noteref"]');
+      for (var i=0; i<noterefs.length; i++) {
+        const ref = noterefs[i];
+        tippyHover(ref, function() {
+          // use id or data attribute instead here
+          let href = ref.getAttribute('data-footnote-href') || ref.getAttribute('href');
+          try { href = new URL(href).hash; } catch {}
+          const id = href.replace(/^#\/?/, "");
+          const note = window.document.getElementById(id);
+          return note.innerHTML;
+        });
+      }
+      const findCites = (el) => {
+        const parentEl = el.parentElement;
+        if (parentEl) {
+          const cites = parentEl.dataset.cites;
+          if (cites) {
+            return {
+              el,
+              cites: cites.split(' ')
+            };
+          } else {
+            return findCites(el.parentElement)
+          }
+        } else {
+          return undefined;
+        }
+      };
+      var bibliorefs = window.document.querySelectorAll('a[role="doc-biblioref"]');
+      for (var i=0; i<bibliorefs.length; i++) {
+        const ref = bibliorefs[i];
+        const citeInfo = findCites(ref);
+        if (citeInfo) {
+          tippyHover(citeInfo.el, function() {
+            var popup = window.document.createElement('div');
+            citeInfo.cites.forEach(function(cite) {
+              var citeDiv = window.document.createElement('div');
+              citeDiv.classList.add('hanging-indent');
+              citeDiv.classList.add('csl-entry');
+              var biblioDiv = window.document.getElementById('ref-' + cite);
+              if (biblioDiv) {
+                citeDiv.innerHTML = biblioDiv.innerHTML;
+              }
+              popup.appendChild(citeDiv);
+            });
+            return popup.innerHTML;
+          });
+        }
+      }
+    });
+    </script>
+    
+
+</body></html>
\ No newline at end of file
diff --git a/docs/modules/Module00-Welcome.html b/docs/modules/Module00-Welcome.html
index 27ae87d..1601c48 100644
--- a/docs/modules/Module00-Welcome.html
+++ b/docs/modules/Module00-Welcome.html
@@ -8,11 +8,11 @@
 <link href="../site_libs/quarto-html/light-border.css" rel="stylesheet">
 <link href="../site_libs/quarto-html/quarto-html.min.css" rel="stylesheet" data-mode="light">
 <link href="../site_libs/quarto-html/quarto-syntax-highlighting.css" rel="stylesheet" id="quarto-text-highlighting-styles"><meta charset="utf-8">
-  <meta name="generator" content="quarto-1.5.54">
+  <meta name="generator" content="quarto-1.3.353">
 
   <meta name="author" content="Amy Winter">
   <meta name="author" content="Zane Billings">
-  <title>SISMID Module NUMBER Materials (2025) – Welcome to SISMID Workshop: Introduction to R</title>
+  <title>SISMID Module NUMBER Materials (2025) - Welcome to SISMID Workshop: Introduction to R</title>
   <meta name="apple-mobile-web-app-capable" content="yes">
   <meta name="apple-mobile-web-app-status-bar-style" content="black-translucent">
   <meta name="viewport" content="width=device-width, initial-scale=1.0, maximum-scale=1.0, user-scalable=no, minimal-ui">
@@ -157,8 +157,7 @@
   }
 
   .callout.callout-titled .callout-body > .callout-content > :last-child {
-    padding-bottom: 0.5rem;
-    margin-bottom: 0;
+    margin-bottom: 0.5rem;
   }
 
   .callout.callout-titled .callout-icon::before {
@@ -343,35 +342,16 @@ <h1 class="title">Welcome to SISMID Workshop: Introduction to R</h1>
 </div>
 </div>
 
-</section><section id="TOC">
-<nav role="doc-toc"> 
-<h2 id="toc-title">Page Items</h2>
-<ul>
-<li><a href="#/welcome-to-sismid-workshop-introduction-to-r" id="/toc-welcome-to-sismid-workshop-introduction-to-r">Welcome to SISMID Workshop: Introduction to R!</a></li>
-<li><a href="#/introductions" id="/toc-introductions">Introductions</a></li>
-<li><a href="#/what-is-r" id="/toc-what-is-r">What is R?</a></li>
-<li><a href="#/what-is-r-1" id="/toc-what-is-r-1">What is R?</a></li>
-<li><a href="#/what-is-r-2" id="/toc-what-is-r-2">What is R?</a></li>
-<li><a href="#/what-is-r-3" id="/toc-what-is-r-3">What is R?</a></li>
-<li><a href="#/why-r" id="/toc-why-r">Why R?</a></li>
-<li><a href="#/why-not-r" id="/toc-why-not-r">Why not R?</a></li>
-<li><a href="#/is-r-difficult" id="/toc-is-r-difficult">Is R DIfficult?</a></li>
-<li><a href="#/overall-workshop-objectives" id="/toc-overall-workshop-objectives">Overall Workshop Objectives</a></li>
-<li><a href="#/this-workshop-differs-from-introduction-to-tidyverse" id="/toc-this-workshop-differs-from-introduction-to-tidyverse">This workshop differs from “Introduction to Tidyverse”</a></li>
-<li><a href="#/workshop-overview" id="/toc-workshop-overview">Workshop Overview</a></li>
-<li><a href="#/reproducibility" id="/toc-reproducibility">Reproducibility</a></li>
-<li><a href="#/good-coding-techniques" id="/toc-good-coding-techniques">Good coding techniques</a></li>
-<li><a href="#/thinking-algorithmically" id="/toc-thinking-algorithmically">Thinking algorithmically</a></li>
-<li><a href="#/useful-free-resources" id="/toc-useful-free-resources">Useful (+ Free) Resources</a></li>
-<li><a href="#/useful-free-resources-1" id="/toc-useful-free-resources-1">Useful (+Free) Resources</a></li>
-<li><a href="#/installing-r" id="/toc-installing-r">Installing R</a></li>
-</ul>
-</nav>
 </section>
 <section id="welcome-to-sismid-workshop-introduction-to-r" class="slide level2">
 <h2>Welcome to SISMID Workshop: Introduction to R!</h2>
-<p><strong>Amy Winter (she/her)</strong> Assistant Professor, Department of Epidemiology and Biostatistics Email: awinter@uga.edu</p>
-<p><strong>Zane Billings (he/him)</strong> PhD Candidate, Department of Epidemiology and Biostatistics Email: Wesley.Billings@uga.edu</p>
+<p><strong>Amy Winter (she/her)</strong></p>
+<p>Assistant Professor, Department of Epidemiology and Biostatistics</p>
+<p>Email: awinter@uga.edu</p>
+<p><br></p>
+<p><strong>Zane Billings (he/him)</strong></p>
+<p>PhD Candidate, Department of Epidemiology and Biostatistics</p>
+<p>Email: Wesley.Billings@uga.edu</p>
 </section>
 <section id="introductions" class="slide level2">
 <h2>Introductions</h2>
@@ -399,7 +379,7 @@ <h2>What is R?</h2>
 <li><p>R is both <a href="https://en.wikipedia.org/wiki/Open_source">open source</a> and <a href="https://en.wikipedia.org/wiki/Open-source_software_development">open development</a></p></li>
 </ul>
 
-<img data-src="https://www.r-project.org/logo/Rlogo.png" class="quarto-figure quarto-figure-center r-stretch" style="width:20.0%" alt="R logo"></section>
+<img data-src="https://www.r-project.org/logo/Rlogo.png" style="width:20.0%" alt="R logo" class="r-stretch quarto-figure-center"></section>
 <section id="what-is-r-2" class="slide level2">
 <h2>What is R?</h2>
 <ul>
@@ -414,7 +394,7 @@ <h2>What is R?</h2>
 <h2>What is R?</h2>
 <ul>
 <li>Program: R is a clear and accessible programming tool</li>
-<li>Transform: R is made up of a collection of libraries designed specifically for data science</li>
+<li>Transform: R is made up of a collection of packages/libraries designed specifically for statistical computing</li>
 <li>Discover: Investigate the data, refine your hypothesis and analyze them</li>
 <li>Model: R provides a wide array of tools to capture the right model for your data</li>
 <li>Communicate: Integrate codes, graphs, and outputs to a report with R Markdown or build Shiny apps to share with the world</li>
@@ -439,26 +419,26 @@ <h2>Why not R?</h2>
 </ul>
 </section>
 <section id="is-r-difficult" class="slide level2">
-<h2>Is R DIfficult?</h2>
+<h2>Is R Difficult?</h2>
 <ul>
-<li>Short answer – It has a steep learning curve.</li>
-<li>Years ago, R was a difficult language to master. The language was confusing and not as structured as the other programming tools.</li>
+<li>Short answer – It has a steep learning curve, like all programming languages</li>
+<li>Years ago, R was a difficult language to master.</li>
 <li>Hadley Wickham developed a collection of packages called tidyverse. Data manipulation became trivial and intuitive. Creating a graph was not so difficult anymore.</li>
 </ul>
 </section>
-<section id="overall-workshop-objectives" class="slide level2 scrollable">
+<section id="overall-workshop-objectives" class="slide level2">
 <h2>Overall Workshop Objectives</h2>
 <p>By the end of this workshop, you should be able to</p>
 <ol type="1">
 <li>start a new project, read in data, and conduct basic data manipulation, analysis, and visualization</li>
 <li>know how to use and find packages/functions that we did not specifically learn in class</li>
-<li>troubleshoot errors (xxzane? – not included right now)</li>
+<li>troubleshoot errors</li>
 </ol>
 </section>
 <section id="this-workshop-differs-from-introduction-to-tidyverse" class="slide level2">
 <h2>This workshop differs from “Introduction to Tidyverse”</h2>
 <p>We will focus this class on using <strong>Base R</strong> functions and packages, i.e., those pre-installed with R and the basis for most other functions and packages! If you know Base R, then you will be better equipped to use all the other useful/pretty packages that exist.</p>
-<p>the Tidyverse is one set of useful/pretty packages, designed to can make your code more <strong>intuitive</strong> as compared to the original older Base R. <strong>Tidyverse advantages</strong>:</p>
+<p>The Tidyverse is one set of useful/pretty packages, designed to make your code more <strong>intuitive</strong> as compared to the original older Base R. <strong>Tidyverse advantages</strong>:</p>
 <ul>
 <li><strong>consistent structure</strong> - making it easier to learn how to use different packages</li>
 <li>particularly good for <strong>wrangling</strong> (manipulating, cleaning, joining) data<br>
@@ -466,22 +446,27 @@ <h2>This workshop differs from “Introduction to Tidyverse”</h2>
 <li>more flexible for <strong>visualizing</strong> data</li>
 </ul>
 
-<img data-src="https://tidyverse.tidyverse.org/logo.png" class="quarto-figure quarto-figure-center r-stretch" style="width:10.0%" alt="Tidyverse hex sticker"></section>
+<img data-src="https://tidyverse.tidyverse.org/logo.png" style="width:10.0%" alt="Tidyverse hex sticker" class="r-stretch quarto-figure-center"></section>
 <section id="workshop-overview" class="slide level2">
 <h2>Workshop Overview</h2>
-<p>14 lecture blocks that will each: - Start with learning objectives - End with summary slides - Include mini-exercise(s) or a full exercise</p>
-<p>Themes that will show up throughout the workshop: - Reproducibility - Good coding techniques - Thinking algorithmically - <a href="https://link.springer.com/content/pdf/bbm%3A978-1-4419-1318-0%2F1.pdf">Basic terms / R jargon</a></p>
+<p>14 lecture blocks that will each:</p>
+<ul>
+<li>Start with learning objectives</li>
+<li>End with summary slides</li>
+<li>Include mini-exercise(s) or a full exercise</li>
+</ul>
+<p>Themes that will show up throughout the workshop:</p>
+<ul>
+<li>Reproducibility</li>
+<li>Good coding techniques</li>
+<li>Thinking algorithmically</li>
+<li><a href="https://link.springer.com/content/pdf/bbm%3A978-1-4419-1318-0%2F1.pdf">Basic terms / R jargon</a></li>
+</ul>
 </section>
 <section id="reproducibility" class="slide level2">
 <h2>Reproducibility</h2>
 <p>xxzane slides</p>
 </section>
-<section id="good-coding-techniques" class="slide level2">
-<h2>Good coding techniques</h2>
-</section>
-<section id="thinking-algorithmically" class="slide level2">
-<h2>Thinking algorithmically</h2>
-</section>
 <section id="useful-free-resources" class="slide level2">
 <h2>Useful (+ Free) Resources</h2>
 <p><strong>Want more?</strong></p>
@@ -515,10 +500,8 @@ <h2>Installing R</h2>
 <li><a href="https://www.rstudio.com/products/rstudio/download/">Install RStudio</a></li>
 </ul>
 
-<div class="quarto-auto-generated-content">
 <div class="footer footer-default">
 
-</div>
 </div>
 </section>
     </div>
@@ -547,6 +530,7 @@ <h2>Installing R</h2>
       Reveal.initialize({
 'controlsAuto': true,
 'previewLinksAuto': false,
+'smaller': true,
 'pdfSeparateFragments': false,
 'autoAnimateEasing': "ease",
 'autoAnimateDuration': 1,
@@ -801,7 +785,18 @@ <h2>Installing R</h2>
         }
         return false;
       }
-      const onCopySuccess = function(e) {
+      const clipboard = new window.ClipboardJS('.code-copy-button', {
+        text: function(trigger) {
+          const codeEl = trigger.previousElementSibling.cloneNode(true);
+          for (const childEl of codeEl.children) {
+            if (isCodeAnnotation(childEl)) {
+              childEl.remove();
+            }
+          }
+          return codeEl.innerText;
+        }
+      });
+      clipboard.on('success', function(e) {
         // button target
         const button = e.trigger;
         // don't keep focus
@@ -833,50 +828,11 @@ <h2>Installing R</h2>
         }, 1000);
         // clear code selection
         e.clearSelection();
-      }
-      const getTextToCopy = function(trigger) {
-          const codeEl = trigger.previousElementSibling.cloneNode(true);
-          for (const childEl of codeEl.children) {
-            if (isCodeAnnotation(childEl)) {
-              childEl.remove();
-            }
-          }
-          return codeEl.innerText;
-      }
-      const clipboard = new window.ClipboardJS('.code-copy-button:not([data-in-quarto-modal])', {
-        text: getTextToCopy
       });
-      clipboard.on('success', onCopySuccess);
-      if (window.document.getElementById('quarto-embedded-source-code-modal')) {
-        // For code content inside modals, clipBoardJS needs to be initialized with a container option
-        // TODO: Check when it could be a function (https://github.com/zenorocha/clipboard.js/issues/860)
-        const clipboardModal = new window.ClipboardJS('.code-copy-button[data-in-quarto-modal]', {
-          text: getTextToCopy,
-          container: window.document.getElementById('quarto-embedded-source-code-modal')
-        });
-        clipboardModal.on('success', onCopySuccess);
-      }
-        var localhostRegex = new RegExp(/^(?:http|https):\/\/localhost\:?[0-9]*\//);
-        var mailtoRegex = new RegExp(/^mailto:/);
-          var filterRegex = new RegExp('/' + window.location.host + '/');
-        var isInternal = (href) => {
-            return filterRegex.test(href) || localhostRegex.test(href) || mailtoRegex.test(href);
-        }
-        // Inspect non-navigation links and adorn them if external
-     	var links = window.document.querySelectorAll('a[href]:not(.nav-link):not(.navbar-brand):not(.toc-action):not(.sidebar-link):not(.sidebar-item-toggle):not(.pagination-link):not(.no-external):not([aria-hidden]):not(.dropdown-item):not(.quarto-navigation-tool):not(.about-link)');
-        for (var i=0; i<links.length; i++) {
-          const link = links[i];
-          if (!isInternal(link.href)) {
-            // undo the damage that might have been done by quarto-nav.js in the case of
-            // links that we want to consider external
-            if (link.dataset.originalHref !== undefined) {
-              link.href = link.dataset.originalHref;
-            }
-          }
-        }
-      function tippyHover(el, contentFn, onTriggerFn, onUntriggerFn) {
+      function tippyHover(el, contentFn) {
         const config = {
           allowHTML: true,
+          content: contentFn,
           maxWidth: 500,
           delay: 100,
           arrow: false,
@@ -886,17 +842,8 @@ <h2>Installing R</h2>
           interactive: true,
           interactiveBorder: 10,
           theme: 'light-border',
-          placement: 'bottom-start',
+          placement: 'bottom-start'
         };
-        if (contentFn) {
-          config.content = contentFn;
-        }
-        if (onTriggerFn) {
-          config.onTrigger = onTriggerFn;
-        }
-        if (onUntriggerFn) {
-          config.onUntrigger = onUntriggerFn;
-        }
           config['offset'] = [0,0];
           config['maxWidth'] = 700;
         window.tippy(el, config); 
@@ -910,11 +857,7 @@ <h2>Installing R</h2>
           try { href = new URL(href).hash; } catch {}
           const id = href.replace(/^#\/?/, "");
           const note = window.document.getElementById(id);
-          if (note) {
-            return note.innerHTML;
-          } else {
-            return "";
-          }
+          return note.innerHTML;
         });
       }
       const findCites = (el) => {
diff --git a/docs/modules/Module01-Intro.html b/docs/modules/Module01-Intro.html
index a580ca7..ccc4114 100644
--- a/docs/modules/Module01-Intro.html
+++ b/docs/modules/Module01-Intro.html
@@ -8,11 +8,11 @@
 <link href="../site_libs/quarto-html/light-border.css" rel="stylesheet">
 <link href="../site_libs/quarto-html/quarto-html.min.css" rel="stylesheet" data-mode="light">
 <link href="../site_libs/quarto-html/quarto-syntax-highlighting.css" rel="stylesheet" id="quarto-text-highlighting-styles"><meta charset="utf-8">
-  <meta name="generator" content="quarto-1.5.54">
+  <meta name="generator" content="quarto-1.3.353">
 
   <meta name="author" content="Amy Winter">
   <meta name="author" content="Zane Billings">
-  <title>SISMID Module NUMBER Materials (2025) – Module 1: Introduction to RStudio and R Basics</title>
+  <title>SISMID Module NUMBER Materials (2025) - Module 1: Introduction to RStudio and R Basics</title>
   <meta name="apple-mobile-web-app-capable" content="yes">
   <meta name="apple-mobile-web-app-status-bar-style" content="black-translucent">
   <meta name="viewport" content="width=device-width, initial-scale=1.0, maximum-scale=1.0, user-scalable=no, minimal-ui">
@@ -32,7 +32,7 @@
     }
     /* CSS for syntax highlighting */
     pre > code.sourceCode { white-space: pre; position: relative; }
-    pre > code.sourceCode > span { line-height: 1.25; }
+    pre > code.sourceCode > span { display: inline-block; line-height: 1.25; }
     pre > code.sourceCode > span:empty { height: 1.2em; }
     .sourceCode { overflow: visible; }
     code.sourceCode > span { color: inherit; text-decoration: inherit; }
@@ -43,7 +43,7 @@
     }
     @media print {
     pre > code.sourceCode { white-space: pre-wrap; }
-    pre > code.sourceCode > span { display: inline-block; text-indent: -5em; padding-left: 5em; }
+    pre > code.sourceCode > span { text-indent: -5em; padding-left: 5em; }
     }
     pre.numberSource code
       { counter-reset: source-line 0; }
@@ -71,7 +71,7 @@
     code span.at { color: #657422; } /* Attribute */
     code span.bn { color: #ad0000; } /* BaseN */
     code span.bu { } /* BuiltIn */
-    code span.cf { color: #003b4f; font-weight: bold; } /* ControlFlow */
+    code span.cf { color: #003b4f; } /* ControlFlow */
     code span.ch { color: #20794d; } /* Char */
     code span.cn { color: #8f5902; } /* Constant */
     code span.co { color: #5e5e5e; } /* Comment */
@@ -85,7 +85,7 @@
     code span.fu { color: #4758ab; } /* Function */
     code span.im { color: #00769e; } /* Import */
     code span.in { color: #5e5e5e; } /* Information */
-    code span.kw { color: #003b4f; font-weight: bold; } /* Keyword */
+    code span.kw { color: #003b4f; } /* Keyword */
     code span.op { color: #5e5e5e; } /* Operator */
     code span.ot { color: #003b4f; } /* Other */
     code span.pp { color: #ad0000; } /* Preprocessor */
@@ -222,8 +222,7 @@
   }
 
   .callout.callout-titled .callout-body > .callout-content > :last-child {
-    padding-bottom: 0.5rem;
-    margin-bottom: 0;
+    margin-bottom: 0.5rem;
   }
 
   .callout.callout-titled .callout-icon::before {
@@ -408,58 +407,17 @@ <h1 class="title">Module 1: Introduction to RStudio and R Basics</h1>
 </div>
 </div>
 
-</section><section id="TOC">
-<nav role="doc-toc"> 
-<h2 id="toc-title">Page Items</h2>
-<ul>
-<li><a href="#/learning-objectives" id="/toc-learning-objectives">Learning Objectives</a></li>
-<li><a href="#/working-with-r-rstudio" id="/toc-working-with-r-rstudio">Working with R – RStudio</a></li>
-<li><a href="#/rstudio" id="/toc-rstudio">RStudio</a></li>
-<li><a href="#/rstudio-1" id="/toc-rstudio-1">RStudio</a></li>
-<li><a href="#/getting-the-editor" id="/toc-getting-the-editor">Getting the editor</a></li>
-<li><a href="#/working-with-r-in-rstudio---2-major-panes" id="/toc-working-with-r-in-rstudio---2-major-panes">Working with R in RStudio - 2 major panes:</a></li>
-<li><a href="#/source-editor" id="/toc-source-editor">Source / Editor</a></li>
-<li><a href="#/r-console" id="/toc-r-console">R Console</a></li>
-<li><a href="#/rstudio-2" id="/toc-rstudio-2">RStudio</a></li>
-<li><a href="#/rstudio-layout" id="/toc-rstudio-layout">RStudio Layout</a></li>
-<li><a href="#/workspaceenvironment" id="/toc-workspaceenvironment">Workspace/Environment</a></li>
-<li><a href="#/workspacehistory" id="/toc-workspacehistory">Workspace/History</a></li>
-<li><a href="#/workspaceother-panes" id="/toc-workspaceother-panes">Workspace/Other Panes</a></li>
-<li><a href="#/getting-started" id="/toc-getting-started">Getting Started</a></li>
-<li><a href="#/explaining-output-on-slides" id="/toc-explaining-output-on-slides">Explaining output on slides</a></li>
-<li><a href="#/r-as-a-calculator" id="/toc-r-as-a-calculator">R as a calculator</a></li>
-<li><a href="#/r-as-a-calculator-1" id="/toc-r-as-a-calculator-1">R as a calculator</a></li>
-<li><a href="#/execute-run-code" id="/toc-execute-run-code">Execute / Run Code</a></li>
-<li><a href="#/mini-exercise" id="/toc-mini-exercise">Mini exercise</a></li>
-<li><a href="#/commenting-in-scripts" id="/toc-commenting-in-scripts">Commenting in Scripts</a></li>
-<li><a href="#/commenting-an-r-script-header" id="/toc-commenting-an-r-script-header">Commenting an R Script header</a></li>
-<li><a href="#/commenting-to-create-sections" id="/toc-commenting-to-create-sections">Commenting to create sections</a></li>
-<li><a href="#/commenting-to-explain-code" id="/toc-commenting-to-explain-code">Commenting to explain code</a></li>
-<li><a href="#/commenting-to-explain-code-1" id="/toc-commenting-to-explain-code-1">Commenting to explain code</a></li>
-<li><a href="#/object---basic-terms" id="/toc-object---basic-terms">Object - Basic terms</a></li>
-<li><a href="#/objects" id="/toc-objects">Objects</a></li>
-<li><a href="#/mini-exercise-1" id="/toc-mini-exercise-1">Mini Exercise</a></li>
-<li><a href="#/objects-1" id="/toc-objects-1">Objects</a></li>
-<li><a href="#/assignment---good-coding" id="/toc-assignment---good-coding">Assignment - Good coding</a></li>
-<li><a href="#/lists" id="/toc-lists">Lists</a></li>
-<li><a href="#/useful-r-studio-shortcuts" id="/toc-useful-r-studio-shortcuts">Useful R Studio Shortcuts</a></li>
-<li><a href="#/rstudio-helps-with-tab-completion" id="/toc-rstudio-helps-with-tab-completion">RStudio helps with “tab completion”</a></li>
-<li><a href="#/summary" id="/toc-summary">Summary</a></li>
-<li><a href="#/mini-exercise-2" id="/toc-mini-exercise-2">Mini Exercise</a></li>
-<li><a href="#/acknowledgements" id="/toc-acknowledgements">Acknowledgements</a></li>
-</ul>
-</nav>
 </section>
 <section id="learning-objectives" class="slide level2">
 <h2>Learning Objectives</h2>
 <p>After module 1, you should be able to…</p>
 <ul>
 <li>Create and save an R script</li>
-<li>Describe the utility and differences b/w the console and an R script</li>
-<li>Modify R Studio windows</li>
+<li>Describe the utility and differences b/w the Console and the Source panes</li>
+<li>Modify RStudio panes</li>
 <li>Create objects</li>
 <li>Describe the difference b/w character, numeric, list, and matrix objects</li>
-<li>Reference objects in the RStudio Global Environment</li>
+<li>Reference objects in the RStudio Environment pane</li>
 <li>Use basic arithmetic operators in R</li>
 <li>Use comments within an R script to create header, sections, and make notes</li>
 </ul>
@@ -472,11 +430,11 @@ <h2>Working with R – RStudio</h2>
 <li>Makes things easier</li>
 <li>Is NOT a dropdown statistical tool (such as Stata)
 <ul>
-<li>See <a href="https://cran.r-project.org/web/packages/Rcmdr/index.html">Rcmdr</a> or <a href="http://vnijs.github.io/radiant/">Radiant</a></li>
+<li>See <a href="https://www.jamovi.org/">jamovi</a> or also <a href="https://cran.r-project.org/web/packages/Rcmdr/index.html">Rcmdr</a>, <a href="http://vnijs.github.io/radiant/">Radiant</a></li>
 </ul></li>
 </ul>
 
-<img data-src="https://d33wubrfki0l68.cloudfront.net/62bcc8535a06077094ca3c29c383e37ad7334311/a263f/assets/img/logo.svg" class="quarto-figure quarto-figure-center r-stretch" style="width:30.0%" alt="RStudio logo"></section>
+<img data-src="https://d33wubrfki0l68.cloudfront.net/62bcc8535a06077094ca3c29c383e37ad7334311/a263f/assets/img/logo.svg" style="width:30.0%" alt="RStudio logo" class="r-stretch quarto-figure-center"></section>
 <section id="rstudio" class="slide level2">
 <h2>RStudio</h2>
 <p>Easier working with R</p>
@@ -495,20 +453,23 @@ <h2>RStudio</h2>
 <section id="rstudio-1" class="slide level2">
 <h2>RStudio</h2>
 
-<img data-src="https://ayeimanolr.files.wordpress.com/2013/04/r-rstudio-1-1.png?w=640&amp;h=382" class="quarto-figure quarto-figure-center r-stretch" style="width:80.0%" alt="RStudio"></section>
+<img data-src="https://ayeimanolr.files.wordpress.com/2013/04/r-rstudio-1-1.png?w=640&amp;h=382" style="width:80.0%" alt="RStudio" class="r-stretch quarto-figure-center"></section>
 <section id="getting-the-editor" class="slide level2">
 <h2>Getting the editor</h2>
 
 <img data-src="images/both.png" style="width:90.0%" class="r-stretch"></section>
-<section id="working-with-r-in-rstudio---2-major-panes" class="slide level2 scrollable">
+<section id="working-with-r-in-rstudio---2-major-panes" class="slide level2">
 <h2>Working with R in RStudio - 2 major panes:</h2>
 <ol type="1">
-<li>The <strong>Source/Editor</strong>: “Analysis” Script + Interactive Exploration
+<li>The <strong>Source/Editor</strong>: xxamy</li>
+</ol>
 <ul>
+<li>“Analysis” Script</li>
 <li>Static copy of what you did (reproducibility)</li>
 <li>Top by default</li>
-</ul></li>
-<li>The <strong>R Console</strong>: “interprets” whatever you type
+</ul>
+<ol start="2" type="1">
+<li><p>The <strong>R Console</strong>: “interprets” whatever you type:</p>
 <ul>
 <li>Calculator</li>
 <li>Try things out interactively, then add to your editor</li>
@@ -533,18 +494,18 @@ <h2>R Console</h2>
 <li>Code is <strong>not saved</strong></li>
 </ul>
 
-<img data-src="images/rstudio_console.png" class="quarto-figure quarto-figure-center r-stretch" style="width:60.0%"></section>
+<img data-src="images/rstudio_console.png" style="width:60.0%" class="r-stretch quarto-figure-center"></section>
 <section id="rstudio-2" class="slide level2">
 <h2>RStudio</h2>
 <p>Useful RStudio “cheat sheet”: <a href="https://github.com/rstudio/cheatsheets/blob/main/rstudio-ide.pdf" class="uri">https://github.com/rstudio/cheatsheets/blob/main/rstudio-ide.pdf</a></p>
 
-<img data-src="images/rstudio_sheet.png" class="quarto-figure quarto-figure-center r-stretch" style="width:65.0%" alt="RStudio"></section>
+<img data-src="images/rstudio_sheet.png" style="width:65.0%" alt="RStudio" class="r-stretch quarto-figure-center"></section>
 <section id="rstudio-layout" class="slide level2">
 <h2>RStudio Layout</h2>
 <p>If RStudio doesn’t look the way you want (or like our RStudio), then do:</p>
-<p>RStudio –&gt; View –&gt; Panes –&gt; Pane Layout</p>
+<p>In the RStudio menu bar, go to View –&gt; Panes –&gt; Pane Layout</p>
 
-<img data-src="images/pane_layout.png" class="quarto-figure quarto-figure-center r-stretch" width="500"></section>
+<img data-src="images/pane_layout.png" width="500" class="r-stretch quarto-figure-center"></section>
 <section id="workspaceenvironment" class="slide level2">
 <h2>Workspace/Environment</h2>
 <ul>
@@ -557,7 +518,7 @@ <h2>Workspace/Environment</h2>
 <h2>Workspace/History</h2>
 <ul>
 <li>Shows previous commands. Good to look at for debugging, but <strong>don’t rely</strong> on it.</li>
-<li>Also type the “up” key in the Console to scroll through previous commands</li>
+<li>You can also press the “up” and “down” keys in the Console to scroll through previous commands</li>
 </ul>
 </section>
 <section id="workspaceother-panes" class="slide level2">
@@ -573,35 +534,36 @@ <h2>Workspace/Other Panes</h2>
 <section id="getting-started" class="slide level2">
 <h2>Getting Started</h2>
 <ul>
-<li>File –&gt; New File –&gt; R Script</li>
+<li>In the RStudio menu bar, go to File –&gt; New File –&gt; R Script</li>
 <li>Save the blank R script as Module1.R</li>
 </ul>
 </section>
 <section id="explaining-output-on-slides" class="slide level2">
 <h2>Explaining output on slides</h2>
-<p>In slides, a command (we’ll also call them code or a code chunk) will look like this</p>
+<p>In slides, the R command/code will be in a box, and directly after it will be the output of the code, starting with <code>[1]</code></p>
 <div class="cell">
-<div class="sourceCode cell-code" id="cb1"><pre class="sourceCode numberSource r number-lines code-with-copy"><code class="sourceCode r"><span id="cb1-1"><a></a><span class="fu">print</span>(<span class="st">"I'm code"</span>)</span></code><button title="Copy to Clipboard" class="code-copy-button"><i class="bi"></i></button></pre></div>
+<div class="sourceCode cell-code" id="cb1"><pre class="sourceCode numberSource r number-lines code-with-copy"><code class="sourceCode r"><span id="cb1-1"><a href="#cb1-1"></a><span class="fu">print</span>(<span class="st">"I'm code"</span>)</span></code><button title="Copy to Clipboard" class="code-copy-button"><i class="bi"></i></button></pre></div>
 <div class="cell-output cell-output-stdout">
 <pre><code>[1] "I'm code"</code></pre>
 </div>
 </div>
-<p>And then directly after it, will be the output of the code.<br>
-So <code>print("I'm code")</code> is the code chunk and <code>[1] "I'm code"</code> is the output.</p>
+<p>So <code>print("I'm code")</code> is the command and <code>[1] "I'm code"</code> is the output.</p>
+<p><br></p>
+<p>Commands/code and output written as inline text will be in a blue typewriter font. For example, <code>code</code></p>
 </section>
 <section id="r-as-a-calculator" class="slide level2">
 <h2>R as a calculator</h2>
 <p>You can do basic arithmetic in R, which I surprisingly use all the time.</p>
 <div class="cell">
-<div class="sourceCode cell-code" id="cb3"><pre class="sourceCode numberSource r number-lines code-with-copy"><code class="sourceCode r"><span id="cb3-1"><a></a><span class="dv">2</span> <span class="sc">+</span> <span class="dv">2</span></span></code><button title="Copy to Clipboard" class="code-copy-button"><i class="bi"></i></button></pre></div>
+<div class="sourceCode cell-code" id="cb3"><pre class="sourceCode numberSource r number-lines code-with-copy"><code class="sourceCode r"><span id="cb3-1"><a href="#cb3-1"></a><span class="dv">2</span> <span class="sc">+</span> <span class="dv">2</span></span></code><button title="Copy to Clipboard" class="code-copy-button"><i class="bi"></i></button></pre></div>
 <div class="cell-output cell-output-stdout">
 <pre><code>[1] 4</code></pre>
 </div>
-<div class="sourceCode cell-code" id="cb5"><pre class="sourceCode numberSource r number-lines code-with-copy"><code class="sourceCode r"><span id="cb5-1"><a></a><span class="dv">2</span> <span class="sc">*</span> <span class="dv">4</span></span></code><button title="Copy to Clipboard" class="code-copy-button"><i class="bi"></i></button></pre></div>
+<div class="sourceCode cell-code" id="cb5"><pre class="sourceCode numberSource r number-lines code-with-copy"><code class="sourceCode r"><span id="cb5-1"><a href="#cb5-1"></a><span class="dv">2</span> <span class="sc">*</span> <span class="dv">4</span></span></code><button title="Copy to Clipboard" class="code-copy-button"><i class="bi"></i></button></pre></div>
 <div class="cell-output cell-output-stdout">
 <pre><code>[1] 8</code></pre>
 </div>
-<div class="sourceCode cell-code" id="cb7"><pre class="sourceCode numberSource r number-lines code-with-copy"><code class="sourceCode r"><span id="cb7-1"><a></a><span class="dv">2</span><span class="sc">^</span><span class="dv">3</span></span></code><button title="Copy to Clipboard" class="code-copy-button"><i class="bi"></i></button></pre></div>
+<div class="sourceCode cell-code" id="cb7"><pre class="sourceCode numberSource r number-lines code-with-copy"><code class="sourceCode r"><span id="cb7-1"><a href="#cb7-1"></a><span class="dv">2</span><span class="sc">^</span><span class="dv">3</span></span></code><button title="Copy to Clipboard" class="code-copy-button"><i class="bi"></i></button></pre></div>
 <div class="cell-output cell-output-stdout">
 <pre><code>[1] 8</code></pre>
 </div>
@@ -611,18 +573,18 @@ <h2>R as a calculator</h2>
 <h2>R as a calculator</h2>
 <ul>
 <li>The R console is a full calculator</li>
-<li>Try to play around with it:
+<li>Arithmetic operators:
 <ul>
-<li>+, -, /, * are add, subtract, divide and multiply</li>
-<li>^ or ** is power</li>
-<li>parentheses – ( and ) – work with order of operations</li>
-<li>%% finds the remainder</li>
+<li><code>+</code>, <code>-</code>, <code>/</code>, <code>*</code> are add, subtract, divide and multiply</li>
+<li><code>^</code> or <code>**</code> is power</li>
+<li>parentheses – <code>(</code> and <code>)</code> – work with order of operations</li>
+<li><code>%%</code> finds the remainder</li>
 </ul></li>
 </ul>
 </section>
-<section id="execute-run-code" class="slide level2 scrollable">
+<section id="execute-run-code" class="slide level2">
 <h2>Execute / Run Code</h2>
-<p>To execute or run a line of code, you just put your cursor on line of code and then:</p>
+<p>To execute or run a line of code (i.e., command), you just put your cursor on the command and then:</p>
 <ol type="1">
 <li>Press Run (which you will find at the top of your window)</li>
 </ol>
@@ -630,13 +592,13 @@ <h2>Execute / Run Code</h2>
 <ol start="2" type="1">
 <li>Press <code>Cmd + Return</code> (macOS) OR <code>Ctrl + Enter</code> (Windows).</li>
 </ol>
-<p>To execute or run multiple lines of code, you just need to highlight the code you want to run and then follow option 1 or 2.</p>
+<p>To execute or run multiple lines of code, you need to highlight the code you want to run and then follow option 1 or 2.</p>
 </section>
 <section id="mini-exercise" class="slide level2">
 <h2>Mini exercise</h2>
 <p>Execute <code>5+4</code> from your .R file, and then find the answer 9 in the Console.</p>
 </section>
-<section id="commenting-in-scripts" class="slide level2 scrollable">
+<section id="commenting-in-scripts" class="slide level2">
 <h2>Commenting in Scripts</h2>
 <p>The syntax <code>#</code> creates a comment, which means anything to the right of <code>#</code> will not be executed / run</p>
 <p>Commenting is useful to:</p>
@@ -650,49 +612,46 @@ <h2>Commenting in Scripts</h2>
 <h2>Commenting an R Script header</h2>
 <p>Add a comment header to Module1.R. This is the one I typically use, but you may have your own preference. The goal is that you are consistent so that future you / collaborators can make sense of your code.</p>
 <div class="cell">
-<div class="sourceCode cell-code" id="cb9"><pre class="sourceCode numberSource r number-lines code-with-copy"><code class="sourceCode r"><span id="cb9-1"><a></a><span class="do">### Title: Module 1</span></span>
-<span id="cb9-2"><a></a><span class="do">### Author: Amy Winter </span></span>
-<span id="cb9-3"><a></a><span class="do">### Objective: Mini Exercise - Developing first R Script</span></span>
-<span id="cb9-4"><a></a><span class="do">### Date: 15 July 2024</span></span></code><button title="Copy to Clipboard" class="code-copy-button"><i class="bi"></i></button></pre></div>
+<div class="sourceCode cell-code" id="cb9"><pre class="sourceCode numberSource r number-lines code-with-copy"><code class="sourceCode r"><span id="cb9-1"><a href="#cb9-1"></a><span class="do">### Title: Module 1</span></span>
+<span id="cb9-2"><a href="#cb9-2"></a><span class="do">### Author: Amy Winter </span></span>
+<span id="cb9-3"><a href="#cb9-3"></a><span class="do">### Objective: Mini Exercise - Developing first R Script</span></span>
+<span id="cb9-4"><a href="#cb9-4"></a><span class="do">### Date: 15 July 2024</span></span></code><button title="Copy to Clipboard" class="code-copy-button"><i class="bi"></i></button></pre></div>
 </div>
 </section>
 <section id="commenting-to-create-sections" class="slide level2">
 <h2>Commenting to create sections</h2>
-<p>You can also create sections within your code by ending a comment with 4 hash marks. <strong>This is very useful for creating an outline of your R Script.</strong> The “Outline” can be found in the top right of the your source window.</p>
+<p>You can also create sections within your code by ending a comment with 4 hash marks. <strong>This is very useful for creating an outline of your R Script.</strong> The “Outline” can be found in the top right of your Source pane.</p>
 <div class="cell">
-<div class="sourceCode cell-code" id="cb10"><pre class="sourceCode numberSource r number-lines code-with-copy"><code class="sourceCode r"><span id="cb10-1"><a></a><span class="co"># Section 1 Header ####</span></span>
-<span id="cb10-2"><a></a><span class="do">## Section 2 Sub-header ####</span></span>
-<span id="cb10-3"><a></a><span class="do">### Section 3 Sub-sub-header ####</span></span>
-<span id="cb10-4"><a></a><span class="do">#### Section 4 Sub-sub-sub-header ####</span></span></code><button title="Copy to Clipboard" class="code-copy-button"><i class="bi"></i></button></pre></div>
+<div class="sourceCode cell-code" id="cb10"><pre class="sourceCode numberSource r number-lines code-with-copy"><code class="sourceCode r"><span id="cb10-1"><a href="#cb10-1"></a><span class="co"># Section 1 Header ####</span></span>
+<span id="cb10-2"><a href="#cb10-2"></a><span class="do">## Section 2 Sub-header ####</span></span>
+<span id="cb10-3"><a href="#cb10-3"></a><span class="do">### Section 3 Sub-sub-header ####</span></span>
+<span id="cb10-4"><a href="#cb10-4"></a><span class="do">#### Section 4 Sub-sub-sub-header ####</span></span></code><button title="Copy to Clipboard" class="code-copy-button"><i class="bi"></i></button></pre></div>
 </div>
 
 <img data-src="images/outline.png" style="width:90.0%" class="r-stretch"></section>
 <section id="commenting-to-explain-code" class="slide level2">
 <h2>Commenting to explain code</h2>
 <div class="cell">
-<div class="sourceCode cell-code" id="cb11"><pre class="sourceCode numberSource r number-lines code-with-copy"><code class="sourceCode r"><span id="cb11-1"><a></a><span class="do">## this # is still a comment</span></span>
-<span id="cb11-2"><a></a><span class="do">### you can use many #'s as you want</span></span>
-<span id="cb11-3"><a></a></span>
-<span id="cb11-4"><a></a><span class="co"># sometimes you have a really long comment,</span></span>
-<span id="cb11-5"><a></a><span class="co">#    like explaining what you are doing</span></span>
-<span id="cb11-6"><a></a><span class="co">#    for a step in analysis. </span></span>
-<span id="cb11-7"><a></a><span class="co"># Take it to another line</span></span></code><button title="Copy to Clipboard" class="code-copy-button"><i class="bi"></i></button></pre></div>
+<div class="sourceCode cell-code" id="cb11"><pre class="sourceCode numberSource r number-lines code-with-copy"><code class="sourceCode r"><span id="cb11-1"><a href="#cb11-1"></a><span class="do">## this # is still a comment</span></span>
+<span id="cb11-2"><a href="#cb11-2"></a><span class="do">### you can use as many #'s as you want</span></span>
+<span id="cb11-3"><a href="#cb11-3"></a></span>
+<span id="cb11-4"><a href="#cb11-4"></a><span class="co"># sometimes you have a really long comment,</span></span>
+<span id="cb11-5"><a href="#cb11-5"></a><span class="co">#    like explaining what you are doing</span></span>
+<span id="cb11-6"><a href="#cb11-6"></a><span class="co">#    for a step in analysis. </span></span>
+<span id="cb11-7"><a href="#cb11-7"></a><span class="co"># Take it to another line</span></span></code><button title="Copy to Clipboard" class="code-copy-button"><i class="bi"></i></button></pre></div>
 </div>
-</section>
-<section id="commenting-to-explain-code-1" class="slide level2">
-<h2>Commenting to explain code</h2>
 <p>I tend to use:</p>
 <ul>
-<li>One hash tag with a space to describe what is happening in the following few lines of code</li>
-<li>One hastag with no space after a command to list specifics</li>
+<li>One hash mark with a space to describe what is happening in the following few lines of code</li>
+<li>One hash mark with no space after a command to list specifics</li>
 </ul>
 <div class="cell">
-<div class="sourceCode cell-code" id="cb12"><pre class="sourceCode numberSource r number-lines code-with-copy"><code class="sourceCode r"><span id="cb12-1"><a></a><span class="co"># Practicing my arithmetic</span></span>
-<span id="cb12-2"><a></a><span class="dv">5</span><span class="sc">+</span><span class="dv">2</span></span>
-<span id="cb12-3"><a></a><span class="dv">3</span><span class="sc">*</span><span class="dv">5</span></span>
-<span id="cb12-4"><a></a><span class="dv">9</span><span class="sc">/</span><span class="dv">8</span></span>
-<span id="cb12-5"><a></a></span>
-<span id="cb12-6"><a></a><span class="dv">5</span><span class="sc">+</span><span class="dv">2</span> <span class="co">#5 plus 2 </span></span></code><button title="Copy to Clipboard" class="code-copy-button"><i class="bi"></i></button></pre></div>
+<div class="sourceCode cell-code" id="cb12"><pre class="sourceCode numberSource r number-lines code-with-copy"><code class="sourceCode r"><span id="cb12-1"><a href="#cb12-1"></a><span class="co"># Practicing my arithmetic</span></span>
+<span id="cb12-2"><a href="#cb12-2"></a><span class="dv">5</span><span class="sc">+</span><span class="dv">2</span></span>
+<span id="cb12-3"><a href="#cb12-3"></a><span class="dv">3</span><span class="sc">*</span><span class="dv">5</span></span>
+<span id="cb12-4"><a href="#cb12-4"></a><span class="dv">9</span><span class="sc">/</span><span class="dv">8</span></span>
+<span id="cb12-5"><a href="#cb12-5"></a></span>
+<span id="cb12-6"><a href="#cb12-6"></a><span class="dv">5</span><span class="sc">+</span><span class="dv">2</span> <span class="co">#5 plus 2 </span></span></code><button title="Copy to Clipboard" class="code-copy-button"><i class="bi"></i></button></pre></div>
 </div>
 </section>
 <section id="object---basic-terms" class="slide level2">
@@ -713,15 +672,15 @@ <h2>Objects</h2>
 <ul>
 <li>You can create objects from within the R environment and from files on your computer</li>
 <li>R uses <code>&lt;-</code> to assign values to an object name</li>
-<li>Note: Object names are case-sensitive, i.e.&nbsp;X and x are different</li>
+<li>Note: Object names are case-sensitive, i.e.&nbsp;<code>X</code> and <code>x</code> are different</li>
 <li>Here are examples of creating five different objects:</li>
 </ul>
 <div class="cell">
-<div class="sourceCode cell-code" id="cb13"><pre class="sourceCode numberSource r number-lines code-with-copy"><code class="sourceCode r"><span id="cb13-1"><a></a>number.object <span class="ot">&lt;-</span> <span class="dv">3</span></span>
-<span id="cb13-2"><a></a>character.object <span class="ot">&lt;-</span> <span class="st">"blue"</span></span>
-<span id="cb13-3"><a></a>vector.object1 <span class="ot">&lt;-</span> <span class="fu">c</span>(<span class="dv">2</span>,<span class="dv">3</span>,<span class="dv">4</span>,<span class="dv">5</span>)</span>
-<span id="cb13-4"><a></a>vector.object2 <span class="ot">&lt;-</span> <span class="fu">c</span>(<span class="st">"blue"</span>, <span class="st">"red"</span>, <span class="st">"yellow"</span>)</span>
-<span id="cb13-5"><a></a>matrix.object <span class="ot">&lt;-</span> <span class="fu">matrix</span>(<span class="at">data=</span>vector.object1, <span class="at">nrow=</span><span class="dv">2</span>, <span class="at">ncol=</span><span class="dv">2</span>, <span class="at">byrow=</span><span class="cn">TRUE</span>)</span></code><button title="Copy to Clipboard" class="code-copy-button"><i class="bi"></i></button></pre></div>
+<div class="sourceCode cell-code" id="cb13"><pre class="sourceCode numberSource r number-lines code-with-copy"><code class="sourceCode r"><span id="cb13-1"><a href="#cb13-1"></a>number.object <span class="ot">&lt;-</span> <span class="dv">3</span></span>
+<span id="cb13-2"><a href="#cb13-2"></a>character.object <span class="ot">&lt;-</span> <span class="st">"blue"</span></span>
+<span id="cb13-3"><a href="#cb13-3"></a>vector.object1 <span class="ot">&lt;-</span> <span class="fu">c</span>(<span class="dv">2</span>,<span class="dv">3</span>,<span class="dv">4</span>,<span class="dv">5</span>)</span>
+<span id="cb13-4"><a href="#cb13-4"></a>vector.object2 <span class="ot">&lt;-</span> <span class="fu">c</span>(<span class="st">"blue"</span>, <span class="st">"red"</span>, <span class="st">"yellow"</span>)</span>
+<span id="cb13-5"><a href="#cb13-5"></a>matrix.object <span class="ot">&lt;-</span> <span class="fu">matrix</span>(<span class="at">data=</span>vector.object1, <span class="at">nrow=</span><span class="dv">2</span>, <span class="at">ncol=</span><span class="dv">2</span>, <span class="at">byrow=</span><span class="cn">TRUE</span>)</span></code><button title="Copy to Clipboard" class="code-copy-button"><i class="bi"></i></button></pre></div>
 </div>
 <p>Note, <code>c()</code> and <code>matrix()</code> are functions, which we will talk more about in module 2.</p>
 </section>
@@ -729,26 +688,26 @@ <h2>Objects</h2>
 <h2>Mini Exercise</h2>
 <p>Try creating one or two of these objects in your R script</p>
 <div class="cell">
-<div class="sourceCode cell-code" id="cb14"><pre class="sourceCode numberSource r number-lines code-with-copy"><code class="sourceCode r"><span id="cb14-1"><a></a>number.object <span class="ot">&lt;-</span> <span class="dv">3</span></span>
-<span id="cb14-2"><a></a>character.object <span class="ot">&lt;-</span> <span class="st">"blue"</span></span>
-<span id="cb14-3"><a></a>vector.object1 <span class="ot">&lt;-</span> <span class="fu">c</span>(<span class="dv">2</span>,<span class="dv">3</span>,<span class="dv">4</span>,<span class="dv">5</span>)</span>
-<span id="cb14-4"><a></a>vector.object2 <span class="ot">&lt;-</span> <span class="fu">c</span>(<span class="st">"blue"</span>, <span class="st">"red"</span>, <span class="st">"yellow"</span>)</span>
-<span id="cb14-5"><a></a>matrix.object <span class="ot">&lt;-</span> <span class="fu">matrix</span>(<span class="at">data=</span>vector.object1, <span class="at">nrow=</span><span class="dv">2</span>, <span class="at">ncol=</span><span class="dv">2</span>, <span class="at">byrow=</span><span class="cn">TRUE</span>)</span></code><button title="Copy to Clipboard" class="code-copy-button"><i class="bi"></i></button></pre></div>
+<div class="sourceCode cell-code" id="cb14"><pre class="sourceCode numberSource r number-lines code-with-copy"><code class="sourceCode r"><span id="cb14-1"><a href="#cb14-1"></a>number.object <span class="ot">&lt;-</span> <span class="dv">3</span></span>
+<span id="cb14-2"><a href="#cb14-2"></a>character.object <span class="ot">&lt;-</span> <span class="st">"blue"</span></span>
+<span id="cb14-3"><a href="#cb14-3"></a>vector.object1 <span class="ot">&lt;-</span> <span class="fu">c</span>(<span class="dv">2</span>,<span class="dv">3</span>,<span class="dv">4</span>,<span class="dv">5</span>)</span>
+<span id="cb14-4"><a href="#cb14-4"></a>vector.object2 <span class="ot">&lt;-</span> <span class="fu">c</span>(<span class="st">"blue"</span>, <span class="st">"red"</span>, <span class="st">"yellow"</span>)</span>
+<span id="cb14-5"><a href="#cb14-5"></a>matrix.object <span class="ot">&lt;-</span> <span class="fu">matrix</span>(<span class="at">data=</span>vector.object1, <span class="at">nrow=</span><span class="dv">2</span>, <span class="at">ncol=</span><span class="dv">2</span>, <span class="at">byrow=</span><span class="cn">TRUE</span>)</span></code><button title="Copy to Clipboard" class="code-copy-button"><i class="bi"></i></button></pre></div>
 </div>
 </section>
 <section id="objects-1" class="slide level2">
 <h2>Objects</h2>
 <p>Note, you can find these objects now in the Global Environment.</p>
 
-<img data-src="images/global_env.png" style="width:90.0%" class="r-stretch"><p>Also, you can call them anytime (i.e, see them in the Console) by executing (running) the object. For example,</p>
+<img data-src="images/global_env.png" style="width:90.0%" class="r-stretch"><p>Also, you can print them anytime (i.e., see them in the Console) by executing (running) the object. For example,</p>
 <div class="cell">
-<div class="sourceCode cell-code" id="cb15"><pre class="sourceCode numberSource r number-lines code-with-copy"><code class="sourceCode r"><span id="cb15-1"><a></a>character.object</span></code><button title="Copy to Clipboard" class="code-copy-button"><i class="bi"></i></button></pre></div>
+<div class="sourceCode cell-code" id="cb15"><pre class="sourceCode numberSource r number-lines code-with-copy"><code class="sourceCode r"><span id="cb15-1"><a href="#cb15-1"></a>character.object</span></code><button title="Copy to Clipboard" class="code-copy-button"><i class="bi"></i></button></pre></div>
 <div class="cell-output cell-output-stdout">
 <pre><code>[1] "blue"</code></pre>
 </div>
 </div>
 <div class="cell">
-<div class="sourceCode cell-code" id="cb17"><pre class="sourceCode numberSource r number-lines code-with-copy"><code class="sourceCode r"><span id="cb17-1"><a></a>matrix.object</span></code><button title="Copy to Clipboard" class="code-copy-button"><i class="bi"></i></button></pre></div>
+<div class="sourceCode cell-code" id="cb17"><pre class="sourceCode numberSource r number-lines code-with-copy"><code class="sourceCode r"><span id="cb17-1"><a href="#cb17-1"></a>matrix.object</span></code><button title="Copy to Clipboard" class="code-copy-button"><i class="bi"></i></button></pre></div>
 <div class="cell-output cell-output-stdout">
 <pre><code>     [,1] [,2]
 [1,]    2    3
@@ -756,16 +715,18 @@ <h2>Objects</h2>
 </div>
 </div>
 </section>
-<section id="assignment---good-coding" class="slide level2">
-<h2>Assignment - Good coding</h2>
-<p><code>=</code> and <code>&lt;-</code> can both be used for assignment, but <code>&lt;-</code> is better coding practice, because <code>==</code> is a logical operator. We will talk about this more, later.</p>
+<section>
+<section id="object-names-and-assingment---good-coding" class="title-slide slide level1 center">
+<h1>Object names and assignment - Good coding</h1>
+<p>xxzane</p>
+<p><code>=</code> and <code>&lt;-</code> can both be used for assignment, but <code>&lt;-</code> is better coding practice: <code>=</code> does not work for assignment in every context, and <code>&lt;-</code> is easier to tell apart from the logical operator <code>==</code>. We will talk about this more later.</p>
 </section>
 <section id="lists" class="slide level2">
 <h2>Lists</h2>
 <p>A list is a special data class that can hold vectors, strings, matrices, models, and even other lists.</p>
 <div class="cell">
-<div class="sourceCode cell-code" id="cb19"><pre class="sourceCode numberSource r number-lines code-with-copy"><code class="sourceCode r"><span id="cb19-1"><a></a>list.object <span class="ot">&lt;-</span> <span class="fu">list</span>(number.object, vector.object2, matrix.object)</span>
-<span id="cb19-2"><a></a>list.object</span></code><button title="Copy to Clipboard" class="code-copy-button"><i class="bi"></i></button></pre></div>
+<div class="sourceCode cell-code" id="cb19"><pre class="sourceCode numberSource r number-lines code-with-copy"><code class="sourceCode r"><span id="cb19-1"><a href="#cb19-1"></a>list.object <span class="ot">&lt;-</span> <span class="fu">list</span>(number.object, vector.object2, matrix.object)</span>
+<span id="cb19-2"><a href="#cb19-2"></a>list.object</span></code><button title="Copy to Clipboard" class="code-copy-button"><i class="bi"></i></button></pre></div>
 <div class="cell-output cell-output-stdout">
 <pre><code>[[1]]
 [1] 3
@@ -804,33 +765,31 @@ <h2>Summary</h2>
 <li>The Editor is for static code like R Scripts</li>
 <li>The Console is for testing code that can’t be saved</li>
 <li>Commenting is your new best friend</li>
-<li>In R we create objects that can be viewed in the Environment panel and called anytime</li>
+<li>In R we create objects that can be viewed in the Environment pane and used anytime</li>
 <li>An object is something that can be worked with in R</li>
 <li>Use <code>&lt;-</code> syntax to create objects</li>
 </ul>
 </section>
-<section id="mini-exercise-2" class="slide level2 scrollable">
+<section id="mini-exercise-2" class="slide level2">
 <h2>Mini Exercise</h2>
 <ol type="1">
 <li>Create a new number object and name it <code>my.object</code></li>
 <li>Create a vector of 4 numbers and name it <code>my.vector</code> using the <code>c()</code> function</li>
-<li>Add <code>my.object</code> and <code>my.vector</code> together use arithmatic operator</li>
+<li>Add <code>my.object</code> and <code>my.vector</code> together using an arithmetic operator</li>
 </ol>
 </section>
 <section id="acknowledgements" class="slide level2">
 <h2>Acknowledgements</h2>
-<p>These are the materials I looked through, modified, or extracted to complete this module’s lecture.</p>
+<p>These are the materials we looked through, modified, or extracted to complete this module’s lecture.</p>
 <ul>
 <li><a href="https://jhudatascience.org/intro_to_r/">“Introduction to R for Public Health Researchers” Johns Hopkins University</a></li>
 <li>Some RStudio snapshots were pulled from <a href="http://ayeimanol-r.net/2013/04/21/289/" class="uri">http://ayeimanol-r.net/2013/04/21/289/</a></li>
 </ul>
 
-<div class="quarto-auto-generated-content">
 <div class="footer footer-default">
 
 </div>
-</div>
-</section>
+</section></section>
     </div>
   </div>
 
@@ -857,6 +816,7 @@ <h2>Acknowledgements</h2>
       Reveal.initialize({
 'controlsAuto': true,
 'previewLinksAuto': false,
+'smaller': true,
 'pdfSeparateFragments': false,
 'autoAnimateEasing': "ease",
 'autoAnimateDuration': 1,
@@ -1111,7 +1071,18 @@ <h2>Acknowledgements</h2>
         }
         return false;
       }
-      const onCopySuccess = function(e) {
+      const clipboard = new window.ClipboardJS('.code-copy-button', {
+        text: function(trigger) {
+          const codeEl = trigger.previousElementSibling.cloneNode(true);
+          for (const childEl of codeEl.children) {
+            if (isCodeAnnotation(childEl)) {
+              childEl.remove();
+            }
+          }
+          return codeEl.innerText;
+        }
+      });
+      clipboard.on('success', function(e) {
         // button target
         const button = e.trigger;
         // don't keep focus
@@ -1143,50 +1114,11 @@ <h2>Acknowledgements</h2>
         }, 1000);
         // clear code selection
         e.clearSelection();
-      }
-      const getTextToCopy = function(trigger) {
-          const codeEl = trigger.previousElementSibling.cloneNode(true);
-          for (const childEl of codeEl.children) {
-            if (isCodeAnnotation(childEl)) {
-              childEl.remove();
-            }
-          }
-          return codeEl.innerText;
-      }
-      const clipboard = new window.ClipboardJS('.code-copy-button:not([data-in-quarto-modal])', {
-        text: getTextToCopy
       });
-      clipboard.on('success', onCopySuccess);
-      if (window.document.getElementById('quarto-embedded-source-code-modal')) {
-        // For code content inside modals, clipBoardJS needs to be initialized with a container option
-        // TODO: Check when it could be a function (https://github.com/zenorocha/clipboard.js/issues/860)
-        const clipboardModal = new window.ClipboardJS('.code-copy-button[data-in-quarto-modal]', {
-          text: getTextToCopy,
-          container: window.document.getElementById('quarto-embedded-source-code-modal')
-        });
-        clipboardModal.on('success', onCopySuccess);
-      }
-        var localhostRegex = new RegExp(/^(?:http|https):\/\/localhost\:?[0-9]*\//);
-        var mailtoRegex = new RegExp(/^mailto:/);
-          var filterRegex = new RegExp('/' + window.location.host + '/');
-        var isInternal = (href) => {
-            return filterRegex.test(href) || localhostRegex.test(href) || mailtoRegex.test(href);
-        }
-        // Inspect non-navigation links and adorn them if external
-     	var links = window.document.querySelectorAll('a[href]:not(.nav-link):not(.navbar-brand):not(.toc-action):not(.sidebar-link):not(.sidebar-item-toggle):not(.pagination-link):not(.no-external):not([aria-hidden]):not(.dropdown-item):not(.quarto-navigation-tool):not(.about-link)');
-        for (var i=0; i<links.length; i++) {
-          const link = links[i];
-          if (!isInternal(link.href)) {
-            // undo the damage that might have been done by quarto-nav.js in the case of
-            // links that we want to consider external
-            if (link.dataset.originalHref !== undefined) {
-              link.href = link.dataset.originalHref;
-            }
-          }
-        }
-      function tippyHover(el, contentFn, onTriggerFn, onUntriggerFn) {
+      function tippyHover(el, contentFn) {
         const config = {
           allowHTML: true,
+          content: contentFn,
           maxWidth: 500,
           delay: 100,
           arrow: false,
@@ -1196,17 +1128,8 @@ <h2>Acknowledgements</h2>
           interactive: true,
           interactiveBorder: 10,
           theme: 'light-border',
-          placement: 'bottom-start',
+          placement: 'bottom-start'
         };
-        if (contentFn) {
-          config.content = contentFn;
-        }
-        if (onTriggerFn) {
-          config.onTrigger = onTriggerFn;
-        }
-        if (onUntriggerFn) {
-          config.onUntrigger = onUntriggerFn;
-        }
           config['offset'] = [0,0];
           config['maxWidth'] = 700;
         window.tippy(el, config); 
@@ -1220,11 +1143,7 @@ <h2>Acknowledgements</h2>
           try { href = new URL(href).hash; } catch {}
           const id = href.replace(/^#\/?/, "");
           const note = window.document.getElementById(id);
-          if (note) {
-            return note.innerHTML;
-          } else {
-            return "";
-          }
+          return note.innerHTML;
         });
       }
       const findCites = (el) => {
diff --git a/docs/modules/Module02-Functions.html b/docs/modules/Module02-Functions.html
index cde85c0..cc0cd42 100644
--- a/docs/modules/Module02-Functions.html
+++ b/docs/modules/Module02-Functions.html
@@ -8,11 +8,11 @@
 <link href="../site_libs/quarto-html/light-border.css" rel="stylesheet">
 <link href="../site_libs/quarto-html/quarto-html.min.css" rel="stylesheet" data-mode="light">
 <link href="../site_libs/quarto-html/quarto-syntax-highlighting.css" rel="stylesheet" id="quarto-text-highlighting-styles"><meta charset="utf-8">
-  <meta name="generator" content="quarto-1.5.54">
+  <meta name="generator" content="quarto-1.3.353">
 
   <meta name="author" content="Amy Winter">
   <meta name="author" content="Zane Billings">
-  <title>SISMID Module NUMBER Materials (2025) – Module 2: Functions</title>
+  <title>SISMID Module NUMBER Materials (2025) - Module 2: Functions</title>
   <meta name="apple-mobile-web-app-capable" content="yes">
   <meta name="apple-mobile-web-app-status-bar-style" content="black-translucent">
   <meta name="viewport" content="width=device-width, initial-scale=1.0, maximum-scale=1.0, user-scalable=no, minimal-ui">
@@ -32,7 +32,7 @@
     }
     /* CSS for syntax highlighting */
     pre > code.sourceCode { white-space: pre; position: relative; }
-    pre > code.sourceCode > span { line-height: 1.25; }
+    pre > code.sourceCode > span { display: inline-block; line-height: 1.25; }
     pre > code.sourceCode > span:empty { height: 1.2em; }
     .sourceCode { overflow: visible; }
     code.sourceCode > span { color: inherit; text-decoration: inherit; }
@@ -43,7 +43,7 @@
     }
     @media print {
     pre > code.sourceCode { white-space: pre-wrap; }
-    pre > code.sourceCode > span { display: inline-block; text-indent: -5em; padding-left: 5em; }
+    pre > code.sourceCode > span { text-indent: -5em; padding-left: 5em; }
     }
     pre.numberSource code
       { counter-reset: source-line 0; }
@@ -71,7 +71,7 @@
     code span.at { color: #657422; } /* Attribute */
     code span.bn { color: #ad0000; } /* BaseN */
     code span.bu { } /* BuiltIn */
-    code span.cf { color: #003b4f; font-weight: bold; } /* ControlFlow */
+    code span.cf { color: #003b4f; } /* ControlFlow */
     code span.ch { color: #20794d; } /* Char */
     code span.cn { color: #8f5902; } /* Constant */
     code span.co { color: #5e5e5e; } /* Comment */
@@ -85,7 +85,7 @@
     code span.fu { color: #4758ab; } /* Function */
     code span.im { color: #00769e; } /* Import */
     code span.in { color: #5e5e5e; } /* Information */
-    code span.kw { color: #003b4f; font-weight: bold; } /* Keyword */
+    code span.kw { color: #003b4f; } /* Keyword */
     code span.op { color: #5e5e5e; } /* Operator */
     code span.ot { color: #003b4f; } /* Other */
     code span.pp { color: #ad0000; } /* Preprocessor */
@@ -222,8 +222,7 @@
   }
 
   .callout.callout-titled .callout-body > .callout-content > :last-child {
-    padding-bottom: 0.5rem;
-    margin-bottom: 0;
+    margin-bottom: 0.5rem;
   }
 
   .callout.callout-titled .callout-icon::before {
@@ -408,31 +407,6 @@ <h1 class="title">Module 2: Functions</h1>
 </div>
 </div>
 
-</section><section id="TOC">
-<nav role="doc-toc"> 
-<h2 id="toc-title">Page Items</h2>
-<ul>
-<li><a href="#/learning-objectives" id="/toc-learning-objectives">Learning Objectives</a></li>
-<li><a href="#/function---basic-term" id="/toc-function---basic-term">Function - Basic term</a></li>
-<li><a href="#/function" id="/toc-function">Function</a></li>
-<li><a href="#/arguments---basic-term" id="/toc-arguments---basic-term">Arguments - Basic term</a></li>
-<li><a href="#/arguments" id="/toc-arguments">Arguments</a></li>
-<li><a href="#/example" id="/toc-example">Example</a></li>
-<li><a href="#/sure-that-is-easy-enough-but-how-do-you-know" id="/toc-sure-that-is-easy-enough-but-how-do-you-know">Sure that is easy enough, but how do you know</a></li>
-<li><a href="#/seeking-help-for-using-functions" id="/toc-seeking-help-for-using-functions">Seeking help for using functions</a></li>
-<li><a href="#/how-to-specify-arguments" id="/toc-how-to-specify-arguments">How to specify arguments</a></li>
-<li><a href="#/package---basic-term" id="/toc-package---basic-term">Package - Basic term</a></li>
-<li><a href="#/packages" id="/toc-packages">Packages</a></li>
-<li><a href="#/additional-packages" id="/toc-additional-packages">Additional Packages</a></li>
-<li><a href="#/installing-and-calling-packages" id="/toc-installing-and-calling-packages"><strong>Installing</strong> and calling packages</a></li>
-<li><a href="#/installing-and-calling-packages-1" id="/toc-installing-and-calling-packages-1">Installing and <strong>calling</strong> packages</a></li>
-<li><a href="#/mini-exercise" id="/toc-mini-exercise">Mini Exercise</a></li>
-<li><a href="#/functions-from-module-1" id="/toc-functions-from-module-1">Functions from Module 1</a></li>
-<li><a href="#/functions-from-module-1-1" id="/toc-functions-from-module-1-1">Functions from Module 1</a></li>
-<li><a href="#/summary" id="/toc-summary">Summary</a></li>
-<li><a href="#/acknowledgements" id="/toc-acknowledgements">Acknowledgements</a></li>
-</ul>
-</nav>
 </section>
 <section id="learning-objectives" class="slide level2">
 <h2>Learning Objectives</h2>
@@ -446,10 +420,10 @@ <h2>Learning Objectives</h2>
 </section>
 <section id="function---basic-term" class="slide level2">
 <h2>Function - Basic term</h2>
-<p><strong>Function</strong> - Functions are “self contained” modules of code that accomplish specific tasks. Functions usually take in some sort of object (e.g., vector, list), process it, and return a result. You can write your own, use functions that come directly from installing R (i.e., Base R functions), or use functions from external packages.</p>
+<p><strong>Function</strong> - Functions are “self contained” modules of code that <strong>accomplish specific tasks</strong>. Functions usually take in some sort of object (e.g., vector, list), process it, and return a result. You can write your own, use functions that come directly from installing R (i.e., Base R functions), or use functions from external packages.</p>
 <p>A function might help you add numbers together, create a plot, or organize your data. In fact, we have already used three functions in Module 1: <code>c()</code>, <code>matrix()</code>, and <code>list()</code>. Here is another one, <code>sum()</code>:</p>
 <div class="cell">
-<div class="sourceCode cell-code" id="cb1"><pre class="sourceCode numberSource r number-lines code-with-copy"><code class="sourceCode r"><span id="cb1-1"><a></a><span class="fu">sum</span>(<span class="dv">1</span>, <span class="dv">20234</span>)</span></code><button title="Copy to Clipboard" class="code-copy-button"><i class="bi"></i></button></pre></div>
+<div class="sourceCode cell-code" id="cb1"><pre class="sourceCode numberSource r number-lines code-with-copy"><code class="sourceCode r"><span id="cb1-1"><a href="#cb1-1"></a><span class="fu">sum</span>(<span class="dv">1</span>, <span class="dv">20234</span>)</span></code><button title="Copy to Clipboard" class="code-copy-button"><i class="bi"></i></button></pre></div>
 <div class="cell-output cell-output-stdout">
 <pre><code>[1] 20235</code></pre>
 </div>
@@ -457,19 +431,19 @@ <h2>Function - Basic term</h2>
 </section>
 <section id="function" class="slide level2">
 <h2>Function</h2>
-<p>The general usage for a function is the name of the function followed by parentheses. Within the parentheses are <strong>arguments</strong>.</p>
+<p>The general usage for a function is the name of the function followed by parentheses (i.e., the function signature). Within the parentheses are <strong>arguments</strong>.</p>
 <div class="cell">
-<div class="sourceCode cell-code" id="cb3"><pre class="sourceCode numberSource r number-lines code-with-copy"><code class="sourceCode r"><span id="cb3-1"><a></a><span class="fu">function_name</span>(argument1, argument2, ...)</span></code><button title="Copy to Clipboard" class="code-copy-button"><i class="bi"></i></button></pre></div>
+<div class="sourceCode cell-code" id="cb3"><pre class="sourceCode numberSource r number-lines code-with-copy"><code class="sourceCode r"><span id="cb3-1"><a href="#cb3-1"></a><span class="fu">function_name</span>(argument1, argument2, ...)</span></code><button title="Copy to Clipboard" class="code-copy-button"><i class="bi"></i></button></pre></div>
 </div>
 </section>
-<section id="arguments---basic-term" class="slide level2 scrollable">
+<section id="arguments---basic-term" class="slide level2">
 <h2>Arguments - Basic term</h2>
 <p><strong>Arguments</strong> are what you pass to the function and can include:</p>
 <ol type="1">
 <li>the physical object on which the function carries out a task (e.g., can be data such as a number 1 or 20234)</li>
 </ol>
 <div class="cell">
-<div class="sourceCode cell-code" id="cb4"><pre class="sourceCode numberSource r number-lines code-with-copy"><code class="sourceCode r"><span id="cb4-1"><a></a><span class="fu">sum</span>(<span class="dv">1</span>, <span class="dv">20234</span>)</span></code><button title="Copy to Clipboard" class="code-copy-button"><i class="bi"></i></button></pre></div>
+<div class="sourceCode cell-code" id="cb4"><pre class="sourceCode numberSource r number-lines code-with-copy"><code class="sourceCode r"><span id="cb4-1"><a href="#cb4-1"></a><span class="fu">sum</span>(<span class="dv">1</span>, <span class="dv">20234</span>)</span></code><button title="Copy to Clipboard" class="code-copy-button"><i class="bi"></i></button></pre></div>
 <div class="cell-output cell-output-stdout">
 <pre><code>[1] 20235</code></pre>
 </div>
@@ -478,15 +452,15 @@ <h2>Arguments - Basic term</h2>
 <li>options that alter the way the function operates (e.g., such as the <code>base</code> argument in the function <code>log()</code>)</li>
 </ol>
 <div class="cell">
-<div class="sourceCode cell-code" id="cb6"><pre class="sourceCode numberSource r number-lines code-with-copy"><code class="sourceCode r"><span id="cb6-1"><a></a><span class="fu">log</span>(<span class="dv">10</span>, <span class="at">base =</span> <span class="dv">10</span>)</span></code><button title="Copy to Clipboard" class="code-copy-button"><i class="bi"></i></button></pre></div>
+<div class="sourceCode cell-code" id="cb6"><pre class="sourceCode numberSource r number-lines code-with-copy"><code class="sourceCode r"><span id="cb6-1"><a href="#cb6-1"></a><span class="fu">log</span>(<span class="dv">10</span>, <span class="at">base =</span> <span class="dv">10</span>)</span></code><button title="Copy to Clipboard" class="code-copy-button"><i class="bi"></i></button></pre></div>
 <div class="cell-output cell-output-stdout">
 <pre><code>[1] 1</code></pre>
 </div>
-<div class="sourceCode cell-code" id="cb8"><pre class="sourceCode numberSource r number-lines code-with-copy"><code class="sourceCode r"><span id="cb8-1"><a></a><span class="fu">log</span>(<span class="dv">10</span>, <span class="at">base =</span> <span class="dv">2</span>)</span></code><button title="Copy to Clipboard" class="code-copy-button"><i class="bi"></i></button></pre></div>
+<div class="sourceCode cell-code" id="cb8"><pre class="sourceCode numberSource r number-lines code-with-copy"><code class="sourceCode r"><span id="cb8-1"><a href="#cb8-1"></a><span class="fu">log</span>(<span class="dv">10</span>, <span class="at">base =</span> <span class="dv">2</span>)</span></code><button title="Copy to Clipboard" class="code-copy-button"><i class="bi"></i></button></pre></div>
 <div class="cell-output cell-output-stdout">
 <pre><code>[1] 3.321928</code></pre>
 </div>
-<div class="sourceCode cell-code" id="cb10"><pre class="sourceCode numberSource r number-lines code-with-copy"><code class="sourceCode r"><span id="cb10-1"><a></a><span class="fu">log</span>(<span class="dv">10</span>, <span class="at">base=</span><span class="fu">exp</span>(<span class="dv">1</span>))</span></code><button title="Copy to Clipboard" class="code-copy-button"><i class="bi"></i></button></pre></div>
+<div class="sourceCode cell-code" id="cb10"><pre class="sourceCode numberSource r number-lines code-with-copy"><code class="sourceCode r"><span id="cb10-1"><a href="#cb10-1"></a><span class="fu">log</span>(<span class="dv">10</span>, <span class="at">base=</span><span class="fu">exp</span>(<span class="dv">1</span>))</span></code><button title="Copy to Clipboard" class="code-copy-button"><i class="bi"></i></button></pre></div>
 <div class="cell-output cell-output-stdout">
 <pre><code>[1] 2.302585</code></pre>
 </div>
@@ -504,7 +478,7 @@ <h2>Arguments</h2>
 <h2>Example</h2>
 <p>What is the default in the <code>base</code> argument of the <code>log()</code> function?</p>
 <div class="cell">
-<div class="sourceCode cell-code" id="cb12"><pre class="sourceCode numberSource r number-lines code-with-copy"><code class="sourceCode r"><span id="cb12-1"><a></a><span class="fu">log</span>(<span class="dv">10</span>)</span></code><button title="Copy to Clipboard" class="code-copy-button"><i class="bi"></i></button></pre></div>
+<div class="sourceCode cell-code" id="cb12"><pre class="sourceCode numberSource r number-lines code-with-copy"><code class="sourceCode r"><span id="cb12-1"><a href="#cb12-1"></a><span class="fu">log</span>(<span class="dv">10</span>)</span></code><button title="Copy to Clipboard" class="code-copy-button"><i class="bi"></i></button></pre></div>
 <div class="cell-output cell-output-stdout">
 <pre><code>[1] 2.302585</code></pre>
 </div>
@@ -522,7 +496,7 @@ <h2>Sure that is easy enough, but how do you know</h2>
 <h2>Seeking help for using functions</h2>
 <p>The best way of finding out this information is to use the <code>?</code> followed by the name of the function. Doing this will open up the help manual in the bottom RStudio Help panel. It provides a description of the function, usage, arguments, details, and examples. Let's look at the help file for the function <code>log()</code>.</p>
 <div class="cell">
-<div class="sourceCode cell-code" id="cb14"><pre class="sourceCode numberSource r number-lines code-with-copy"><code class="sourceCode r"><span id="cb14-1"><a></a>?log</span></code><button title="Copy to Clipboard" class="code-copy-button"><i class="bi"></i></button></pre></div>
+<div class="sourceCode cell-code" id="cb14"><pre class="sourceCode numberSource r number-lines code-with-copy"><code class="sourceCode r"><span id="cb14-1"><a href="#cb14-1"></a>?log</span></code><button title="Copy to Clipboard" class="code-copy-button"><i class="bi"></i></button></pre></div>
 </div>
 <pre><code>Registered S3 method overwritten by 'printr':
   method                from     
@@ -611,7 +585,7 @@ <h2>Seeking help for using functions</h2>
  cbind(deparse.level=2, # to get nice column names
        x, log(1+x), log1p(x), exp(x)-1, expm1(x))</code></pre>
 </section>
-<section id="how-to-specify-arguments" class="slide level2 scrollable">
+<section id="how-to-specify-arguments" class="slide level2">
 <h2>How to specify arguments</h2>
 <ol type="1">
 <li>Arguments are separated with a comma</li>
@@ -619,19 +593,19 @@ <h2>How to specify arguments</h2>
 </ol>
 
 <img data-src="images/log_args.png" style="width:70.0%" class="r-stretch"><div class="cell">
-<div class="sourceCode cell-code" id="cb26"><pre class="sourceCode numberSource r number-lines code-with-copy"><code class="sourceCode r"><span id="cb26-1"><a></a><span class="fu">log</span>(<span class="dv">10</span>, <span class="dv">2</span>)</span></code><button title="Copy to Clipboard" class="code-copy-button"><i class="bi"></i></button></pre></div>
+<div class="sourceCode cell-code" id="cb26"><pre class="sourceCode numberSource r number-lines code-with-copy"><code class="sourceCode r"><span id="cb26-1"><a href="#cb26-1"></a><span class="fu">log</span>(<span class="dv">10</span>, <span class="dv">2</span>)</span></code><button title="Copy to Clipboard" class="code-copy-button"><i class="bi"></i></button></pre></div>
 <div class="cell-output cell-output-stdout">
 <pre><code>[1] 3.321928</code></pre>
 </div>
-<div class="sourceCode cell-code" id="cb28"><pre class="sourceCode numberSource r number-lines code-with-copy"><code class="sourceCode r"><span id="cb28-1"><a></a><span class="fu">log</span>(<span class="at">base=</span><span class="dv">2</span>, <span class="at">x=</span><span class="dv">10</span>)</span></code><button title="Copy to Clipboard" class="code-copy-button"><i class="bi"></i></button></pre></div>
+<div class="sourceCode cell-code" id="cb28"><pre class="sourceCode numberSource r number-lines code-with-copy"><code class="sourceCode r"><span id="cb28-1"><a href="#cb28-1"></a><span class="fu">log</span>(<span class="at">base=</span><span class="dv">2</span>, <span class="at">x=</span><span class="dv">10</span>)</span></code><button title="Copy to Clipboard" class="code-copy-button"><i class="bi"></i></button></pre></div>
 <div class="cell-output cell-output-stdout">
 <pre><code>[1] 3.321928</code></pre>
 </div>
-<div class="sourceCode cell-code" id="cb30"><pre class="sourceCode numberSource r number-lines code-with-copy"><code class="sourceCode r"><span id="cb30-1"><a></a><span class="fu">log</span>(<span class="at">x=</span><span class="dv">10</span>, <span class="dv">2</span>)</span></code><button title="Copy to Clipboard" class="code-copy-button"><i class="bi"></i></button></pre></div>
+<div class="sourceCode cell-code" id="cb30"><pre class="sourceCode numberSource r number-lines code-with-copy"><code class="sourceCode r"><span id="cb30-1"><a href="#cb30-1"></a><span class="fu">log</span>(<span class="at">x=</span><span class="dv">10</span>, <span class="dv">2</span>)</span></code><button title="Copy to Clipboard" class="code-copy-button"><i class="bi"></i></button></pre></div>
 <div class="cell-output cell-output-stdout">
 <pre><code>[1] 3.321928</code></pre>
 </div>
-<div class="sourceCode cell-code" id="cb32"><pre class="sourceCode numberSource r number-lines code-with-copy"><code class="sourceCode r"><span id="cb32-1"><a></a><span class="fu">log</span>(<span class="dv">10</span>, <span class="at">base=</span><span class="dv">2</span>)</span></code><button title="Copy to Clipboard" class="code-copy-button"><i class="bi"></i></button></pre></div>
+<div class="sourceCode cell-code" id="cb32"><pre class="sourceCode numberSource r number-lines code-with-copy"><code class="sourceCode r"><span id="cb32-1"><a href="#cb32-1"></a><span class="fu">log</span>(<span class="dv">10</span>, <span class="at">base=</span><span class="dv">2</span>)</span></code><button title="Copy to Clipboard" class="code-copy-button"><i class="bi"></i></button></pre></div>
 <div class="cell-output cell-output-stdout">
 <pre><code>[1] 3.321928</code></pre>
 </div>
@@ -639,7 +613,7 @@ <h2>How to specify arguments</h2>
 </section>
 <section id="package---basic-term" class="slide level2">
 <h2>Package - Basic term</h2>
-<p>When you download R, it has a “base” set of functions, that are associated with a “base” set of packages including: ‘base’, ‘datasets’, ‘graphics’, ‘grDevices’, ‘methods’, ‘stats’, ‘methods’ (typically just referred to as <strong>Base R</strong>).</p>
+<p>When you download R, it has a “base” set of functions, that are associated with a “base” set of packages including: ‘base’, ‘datasets’, ‘graphics’, ‘grDevices’, ‘methods’, ‘stats’ (typically just referred to as <strong>Base R</strong>).</p>
 <ul>
 <li>e.g., the <code>log()</code> function comes from the ‘base’ package</li>
 </ul>
@@ -648,12 +622,12 @@ <h2>Package - Basic term</h2>
 </section>
 <section id="packages" class="slide level2">
 <h2>Packages</h2>
-<p>The Packages window in RStudio can help you identify what have been installed (listed), and which one have been called (check mark).</p>
+<p>The Packages pane in RStudio can help you identify which packages have been installed (listed) and which ones have been attached (check mark).</p>
 <p>Let's go look at the Packages pane, find the <code>base</code> package, and find the <code>log()</code> function. Clicking on it automatically loads the help file that we looked at earlier using <code>?log</code>.</p>
 </section>
 <section id="additional-packages" class="slide level2">
 <h2>Additional Packages</h2>
-<p>You can install additional packages for your uses from <a href="https://cran.r-project.org/">CRAN</a> or <a href="https://github.com/">GitHub</a>. These additional packages are written by RStudio or R users/developers (like us)</p>
+<p>You can install additional packages for your use from <a href="https://cran.r-project.org/">CRAN</a> or <a href="https://github.com/">GitHub</a>. These additional packages are written by RStudio or R users/developers (like us).</p>
 <ul>
 <li>Not all packages available on CRAN or GitHub are trustworthy</li>
 <li>RStudio (the company) makes a lot of great packages</li>
@@ -661,28 +635,28 @@ <h2>Additional Packages</h2>
 <li>How to <a href="https://simplystatistics.org/posts/2015-11-06-how-i-decide-when-to-trust-an-r-package/#:~:text=The%20first%20thing%20I%20do,I%20immediately%20trust%20the%20package.">trust</a> an R package</li>
 </ul>
 </section>
-<section id="installing-and-calling-packages" class="slide level2 scrollable">
-<h2><strong>Installing</strong> and calling packages</h2>
-<p>To use the bundle or “package” of code (and or possibly data) from a package, you need to install and also call the package.</p>
+<section id="installing-and-attaching-packages" class="slide level2">
+<h2><strong>Installing</strong> and attaching packages</h2>
+<p>To use the bundle or “package” of code (and possibly data) from a package, you need to both install and attach the package.</p>
 <p>To install a package you can</p>
 <ol type="1">
-<li>go to Tools —&gt; Install Packages in the RStudio header</li>
+<li>go to the Tools menu in the RStudio menu bar -&gt; Install Packages</li>
 </ol>
 <p>OR</p>
 <ol start="2" type="1">
 <li>use the following code:</li>
 </ol>
 <div class="cell">
-<div class="sourceCode cell-code" id="cb34"><pre class="sourceCode numberSource r number-lines code-with-copy"><code class="sourceCode r"><span id="cb34-1"><a></a><span class="fu">install.packages</span>(package_name)</span></code><button title="Copy to Clipboard" class="code-copy-button"><i class="bi"></i></button></pre></div>
+<div class="sourceCode cell-code" id="cb34"><pre class="sourceCode numberSource r number-lines code-with-copy"><code class="sourceCode r"><span id="cb34-1"><a href="#cb34-1"></a><span class="fu">install.packages</span>(<span class="st">"package_name"</span>)</span></code><button title="Copy to Clipboard" class="code-copy-button"><i class="bi"></i></button></pre></div>
 </div>
 </section>
-<section id="installing-and-calling-packages-1" class="slide level2">
-<h2>Installing and <strong>calling</strong> packages</h2>
-<p>To call (i.e., be able to use the package) you can use the following code:</p>
+<section id="installing-and-attaching-packages-1" class="slide level2">
+<h2>Installing and <strong>attaching</strong> packages</h2>
+<p>To attach (i.e., be able to use the package) you can use the following code:</p>
 <div class="cell">
-<div class="sourceCode cell-code" id="cb35"><pre class="sourceCode numberSource r number-lines code-with-copy"><code class="sourceCode r"><span id="cb35-1"><a></a><span class="fu">library</span>(package_name)</span></code><button title="Copy to Clipboard" class="code-copy-button"><i class="bi"></i></button></pre></div>
+<div class="sourceCode cell-code" id="cb35"><pre class="sourceCode numberSource r number-lines code-with-copy"><code class="sourceCode r"><span id="cb35-1"><a href="#cb35-1"></a><span class="fu">require</span>(package_name) <span class="co">#library(package_name) also works</span></span></code><button title="Copy to Clipboard" class="code-copy-button"><i class="bi"></i></button></pre></div>
 </div>
-<p>More on installing and calling packages later…</p>
+<p>More on installing and attaching packages later…</p>
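+<p>A minimal sketch of the install-then-attach pattern, assuming the package name is stored in a variable called <code>pkg</code> (a hypothetical name used only for illustration). It uses <code>stats</code>, which ships with R, so nothing is downloaded when the example runs; swap in the package you actually need.</p>
+<div class="cell">
+<div class="sourceCode cell-code"><pre class="sourceCode r"><code class="sourceCode r"># 'pkg' is an illustrative variable; "stats" comes with every R installation
+pkg &lt;- "stats"
+
+# Install only if the package is not already available on this computer
+if (!requireNamespace(pkg, quietly = TRUE)) {
+  install.packages(pkg)
+}
+
+# Attach the package; character.only = TRUE tells library() that pkg
+# holds the package name as a string
+library(pkg, character.only = TRUE)</code></pre></div>
+</div>
+<p>One difference worth knowing: <code>library()</code> stops with an error if the package is not installed, whereas <code>require()</code> returns <code>FALSE</code> with a warning and lets the script continue.</p>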
 </section>
 <section id="mini-exercise" class="slide level2">
 <h2>Mini Exercise</h2>
@@ -690,9 +664,9 @@ <h2>Mini Exercise</h2>
 </section>
 <section id="functions-from-module-1" class="slide level2">
 <h2>Functions from Module 1</h2>
-<p>The combine function <code>c()</code> collects/combines/joins single R objects into a vector of R objects. It is mostly used for creating vectors of numbers, character strings, and other data types.</p>
+<p>The combine function <code>c()</code> concatenates/collects/combines single R objects into a vector of R objects. It is mostly used for creating vectors of numbers, character strings, and other data types.</p>
 <div class="cell">
-<div class="sourceCode cell-code" id="cb36"><pre class="sourceCode numberSource r number-lines code-with-copy"><code class="sourceCode r"><span id="cb36-1"><a></a>?c</span></code><button title="Copy to Clipboard" class="code-copy-button"><i class="bi"></i></button></pre></div>
+<div class="sourceCode cell-code" id="cb36"><pre class="sourceCode numberSource r number-lines code-with-copy"><code class="sourceCode r"><span id="cb36-1"><a href="#cb36-1"></a>?c</span></code><button title="Copy to Clipboard" class="code-copy-button"><i class="bi"></i></button></pre></div>
 </div>
 <div class="cell">
 <div class="cell-output cell-output-stdout">
@@ -798,12 +772,133 @@ <h2>Functions from Module 1</h2>
 <h2>Functions from Module 1</h2>
 <p>The <code>matrix()</code> function creates a matrix from the given set of values.</p>
 <div class="cell">
-<div class="sourceCode cell-code" id="cb38"><pre class="sourceCode numberSource r number-lines code-with-copy"><code class="sourceCode r"><span id="cb38-1"><a></a>?matrix</span></code><button title="Copy to Clipboard" class="code-copy-button"><i class="bi"></i></button></pre></div>
+<div class="sourceCode cell-code" id="cb38"><pre class="sourceCode numberSource r number-lines code-with-copy"><code class="sourceCode r"><span id="cb38-1"><a href="#cb38-1"></a>?matrix</span></code><button title="Copy to Clipboard" class="code-copy-button"><i class="bi"></i></button></pre></div>
 </div>
-<p>xxamy - doesn’t seem to work - may need to paste in a screen shot figure</p>
 <div class="cell">
 <div class="cell-output cell-output-stdout">
-<pre><code>No documentation for 'matix' in specified packages and libraries</code></pre>
+<pre><code>Matrices
+
+Description:
+
+     'matrix' creates a matrix from the given set of values.
+
+     'as.matrix' attempts to turn its argument into a matrix.
+
+     'is.matrix' tests if its argument is a (strict) matrix.
+
+Usage:
+
+     matrix(data = NA, nrow = 1, ncol = 1, byrow = FALSE,
+            dimnames = NULL)
+     
+     as.matrix(x, ...)
+     ## S3 method for class 'data.frame'
+     as.matrix(x, rownames.force = NA, ...)
+     
+     is.matrix(x)
+     
+Arguments:
+
+    data: an optional data vector (including a list or 'expression'
+          vector).  Non-atomic classed R objects are coerced by
+          'as.vector' and all attributes discarded.
+
+    nrow: the desired number of rows.
+
+    ncol: the desired number of columns.
+
+   byrow: logical. If 'FALSE' (the default) the matrix is filled by
+          columns, otherwise the matrix is filled by rows.
+
+dimnames: A 'dimnames' attribute for the matrix: 'NULL' or a 'list' of
+          length 2 giving the row and column names respectively.  An
+          empty list is treated as 'NULL', and a list of length one as
+          row names.  The list can be named, and the list names will be
+          used as names for the dimensions.
+
+       x: an R object.
+
+     ...: additional arguments to be passed to or from methods.
+
+rownames.force: logical indicating if the resulting matrix should have
+          character (rather than 'NULL') 'rownames'.  The default,
+          'NA', uses 'NULL' rownames if the data frame has 'automatic'
+          row.names or for a zero-row data frame.
+
+Details:
+
+     If one of 'nrow' or 'ncol' is not given, an attempt is made to
+     infer it from the length of 'data' and the other parameter.  If
+     neither is given, a one-column matrix is returned.
+
+     If there are too few elements in 'data' to fill the matrix, then
+     the elements in 'data' are recycled.  If 'data' has length zero,
+     'NA' of an appropriate type is used for atomic vectors ('0' for
+     raw vectors) and 'NULL' for lists.
+
+     'is.matrix' returns 'TRUE' if 'x' is a vector and has a '"dim"'
+     attribute of length 2 and 'FALSE' otherwise.  Note that a
+     'data.frame' is *not* a matrix by this test.  The function is
+     generic: you can write methods to handle specific classes of
+     objects, see InternalMethods.
+
+     'as.matrix' is a generic function.  The method for data frames
+     will return a character matrix if there is only atomic columns and
+     any non-(numeric/logical/complex) column, applying 'as.vector' to
+     factors and 'format' to other non-character columns.  Otherwise,
+     the usual coercion hierarchy (logical &lt; integer &lt; double &lt;
+     complex) will be used, e.g., all-logical data frames will be
+     coerced to a logical matrix, mixed logical-integer will give a
+     integer matrix, etc.
+
+     The default method for 'as.matrix' calls 'as.vector(x)', and hence
+     e.g. coerces factors to character vectors.
+
+     When coercing a vector, it produces a one-column matrix, and
+     promotes the names (if any) of the vector to the rownames of the
+     matrix.
+
+     'is.matrix' is a primitive function.
+
+     The 'print' method for a matrix gives a rectangular layout with
+     dimnames or indices.  For a list matrix, the entries of length not
+     one are printed in the form 'integer,7' indicating the type and
+     length.
+
+Note:
+
+     If you just want to convert a vector to a matrix, something like
+
+       dim(x) &lt;- c(nx, ny)
+       dimnames(x) &lt;- list(row_names, col_names)
+     
+     will avoid duplicating 'x' _and_ preserve 'class(x)' which may be
+     useful, e.g., for 'Date' objects.
+
+References:
+
+     Becker, R. A., Chambers, J. M. and Wilks, A. R. (1988) _The New S
+     Language_.  Wadsworth &amp; Brooks/Cole.
+
+See Also:
+
+     'data.matrix', which attempts to convert to a numeric matrix.
+
+     A matrix is the special case of a two-dimensional 'array'.
+     'inherits(m, "array")' is true for a 'matrix' 'm'.
+
+Examples:
+
+     is.matrix(as.matrix(1:10))
+     !is.matrix(warpbreaks)  # data.frame, NOT matrix!
+     warpbreaks[1:10,]
+     as.matrix(warpbreaks[1:10,])  # using as.matrix.data.frame(.) method
+     
+     ## Example of setting row and column names
+     mdat &lt;- matrix(c(1,2,3, 11,12,13), nrow = 2, ncol = 3, byrow = TRUE,
+                    dimnames = list(c("row1", "row2"),
+                                    c("C.1", "C.2", "C.3")))
+     mdat</code></pre>
 </div>
 </div>
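+<p>As a small illustration of the <code>byrow</code> argument described in the help file above, the sketch below builds two matrices from the same six values. The object names <code>m_by_col</code> and <code>m_by_row</code> are made up for this example.</p>
+<div class="cell">
+<div class="sourceCode cell-code"><pre class="sourceCode r"><code class="sourceCode r"># Default (byrow = FALSE): values fill the matrix column by column,
+# so the first row is 1 3 5 and the second row is 2 4 6
+m_by_col &lt;- matrix(1:6, nrow = 2)
+
+# byrow = TRUE: values fill the matrix row by row,
+# so the first row is 1 2 3 and the second row is 4 5 6
+m_by_row &lt;- matrix(1:6, nrow = 2, byrow = TRUE)
+
+m_by_col
+m_by_row</code></pre></div>
+</div>
+<p>Only <code>nrow</code> is given here; as the help file notes, R infers <code>ncol</code> from the length of the data.</p>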
 </section>
@@ -813,21 +908,19 @@ <h2>Summary</h2>
 <li>Functions are “self contained” modules of code that accomplish specific tasks.</li>
 <li>Arguments are what you pass to functions (e.g., objects on which you carry out the task or options for how to carry out the task)</li>
 <li>Arguments may include defaults that the author of the function specified as being “good enough in standard cases”, but that can be changed.</li>
-<li>An R Package is a bundle or “package” of code (and or possibly data) that can be used by installing it once and calling it (using <code>library()</code>) each time R/Rstudio is opened</li>
+<li>An R Package is a bundle or “package” of code (and possibly data) that can be used by installing it once and attaching it (using <code>library()</code>) each time R/RStudio is opened</li>
 <li>The Help window in RStudio is useful for getting more information about functions and packages</li>
 </ul>
 </section>
 <section id="acknowledgements" class="slide level2">
 <h2>Acknowledgements</h2>
-<p>These are the materials I looked through, modified, or extracted to complete this module’s lecture.</p>
+<p>These are the materials we looked through, modified, or extracted to complete this module’s lecture.</p>
 <ul>
 <li><a href="https://hbctraining.github.io/Intro-to-R/lessons/03_introR-functions-and-arguments.html#:~:text=A%20key%20feature%20of%20R,it%2C%20and%20return%20a%20result.">“Introduction to R - ARCHIVED” from Harvard Chan Bioinformatics Core (HBC)</a></li>
 </ul>
 
-<div class="quarto-auto-generated-content">
 <div class="footer footer-default">
 
-</div>
 </div>
 </section>
     </div>
@@ -856,6 +949,7 @@ <h2>Acknowledgements</h2>
       Reveal.initialize({
 'controlsAuto': true,
 'previewLinksAuto': false,
+'smaller': true,
 'pdfSeparateFragments': false,
 'autoAnimateEasing': "ease",
 'autoAnimateDuration': 1,
@@ -1110,7 +1204,18 @@ <h2>Acknowledgements</h2>
         }
         return false;
       }
-      const onCopySuccess = function(e) {
+      const clipboard = new window.ClipboardJS('.code-copy-button', {
+        text: function(trigger) {
+          const codeEl = trigger.previousElementSibling.cloneNode(true);
+          for (const childEl of codeEl.children) {
+            if (isCodeAnnotation(childEl)) {
+              childEl.remove();
+            }
+          }
+          return codeEl.innerText;
+        }
+      });
+      clipboard.on('success', function(e) {
         // button target
         const button = e.trigger;
         // don't keep focus
@@ -1142,50 +1247,11 @@ <h2>Acknowledgements</h2>
         }, 1000);
         // clear code selection
         e.clearSelection();
-      }
-      const getTextToCopy = function(trigger) {
-          const codeEl = trigger.previousElementSibling.cloneNode(true);
-          for (const childEl of codeEl.children) {
-            if (isCodeAnnotation(childEl)) {
-              childEl.remove();
-            }
-          }
-          return codeEl.innerText;
-      }
-      const clipboard = new window.ClipboardJS('.code-copy-button:not([data-in-quarto-modal])', {
-        text: getTextToCopy
       });
-      clipboard.on('success', onCopySuccess);
-      if (window.document.getElementById('quarto-embedded-source-code-modal')) {
-        // For code content inside modals, clipBoardJS needs to be initialized with a container option
-        // TODO: Check when it could be a function (https://github.com/zenorocha/clipboard.js/issues/860)
-        const clipboardModal = new window.ClipboardJS('.code-copy-button[data-in-quarto-modal]', {
-          text: getTextToCopy,
-          container: window.document.getElementById('quarto-embedded-source-code-modal')
-        });
-        clipboardModal.on('success', onCopySuccess);
-      }
-        var localhostRegex = new RegExp(/^(?:http|https):\/\/localhost\:?[0-9]*\//);
-        var mailtoRegex = new RegExp(/^mailto:/);
-          var filterRegex = new RegExp('/' + window.location.host + '/');
-        var isInternal = (href) => {
-            return filterRegex.test(href) || localhostRegex.test(href) || mailtoRegex.test(href);
-        }
-        // Inspect non-navigation links and adorn them if external
-     	var links = window.document.querySelectorAll('a[href]:not(.nav-link):not(.navbar-brand):not(.toc-action):not(.sidebar-link):not(.sidebar-item-toggle):not(.pagination-link):not(.no-external):not([aria-hidden]):not(.dropdown-item):not(.quarto-navigation-tool):not(.about-link)');
-        for (var i=0; i<links.length; i++) {
-          const link = links[i];
-          if (!isInternal(link.href)) {
-            // undo the damage that might have been done by quarto-nav.js in the case of
-            // links that we want to consider external
-            if (link.dataset.originalHref !== undefined) {
-              link.href = link.dataset.originalHref;
-            }
-          }
-        }
-      function tippyHover(el, contentFn, onTriggerFn, onUntriggerFn) {
+      function tippyHover(el, contentFn) {
         const config = {
           allowHTML: true,
+          content: contentFn,
           maxWidth: 500,
           delay: 100,
           arrow: false,
@@ -1195,17 +1261,8 @@ <h2>Acknowledgements</h2>
           interactive: true,
           interactiveBorder: 10,
           theme: 'light-border',
-          placement: 'bottom-start',
+          placement: 'bottom-start'
         };
-        if (contentFn) {
-          config.content = contentFn;
-        }
-        if (onTriggerFn) {
-          config.onTrigger = onTriggerFn;
-        }
-        if (onUntriggerFn) {
-          config.onUntrigger = onUntriggerFn;
-        }
           config['offset'] = [0,0];
           config['maxWidth'] = 700;
         window.tippy(el, config); 
@@ -1219,11 +1276,7 @@ <h2>Acknowledgements</h2>
           try { href = new URL(href).hash; } catch {}
           const id = href.replace(/^#\/?/, "");
           const note = window.document.getElementById(id);
-          if (note) {
-            return note.innerHTML;
-          } else {
-            return "";
-          }
+          return note.innerHTML;
         });
       }
       const findCites = (el) => {
diff --git a/docs/modules/Module03-WorkingDirectories.html b/docs/modules/Module03-WorkingDirectories.html
index 617e48f..ae3ee6b 100644
--- a/docs/modules/Module03-WorkingDirectories.html
+++ b/docs/modules/Module03-WorkingDirectories.html
@@ -8,11 +8,11 @@
 <link href="../site_libs/quarto-html/light-border.css" rel="stylesheet">
 <link href="../site_libs/quarto-html/quarto-html.min.css" rel="stylesheet" data-mode="light">
 <link href="../site_libs/quarto-html/quarto-syntax-highlighting.css" rel="stylesheet" id="quarto-text-highlighting-styles"><meta charset="utf-8">
-  <meta name="generator" content="quarto-1.5.54">
+  <meta name="generator" content="quarto-1.3.353">
 
   <meta name="author" content="Amy Winter">
   <meta name="author" content="Zane Billings">
-  <title>SISMID Module NUMBER Materials (2025) – Module 3: Working Directories</title>
+  <title>SISMID Module NUMBER Materials (2025) - Module 3: Working Directories</title>
   <meta name="apple-mobile-web-app-capable" content="yes">
   <meta name="apple-mobile-web-app-status-bar-style" content="black-translucent">
   <meta name="viewport" content="width=device-width, initial-scale=1.0, maximum-scale=1.0, user-scalable=no, minimal-ui">
@@ -32,7 +32,7 @@
     }
     /* CSS for syntax highlighting */
     pre > code.sourceCode { white-space: pre; position: relative; }
-    pre > code.sourceCode > span { line-height: 1.25; }
+    pre > code.sourceCode > span { display: inline-block; line-height: 1.25; }
     pre > code.sourceCode > span:empty { height: 1.2em; }
     .sourceCode { overflow: visible; }
     code.sourceCode > span { color: inherit; text-decoration: inherit; }
@@ -43,7 +43,7 @@
     }
     @media print {
     pre > code.sourceCode { white-space: pre-wrap; }
-    pre > code.sourceCode > span { display: inline-block; text-indent: -5em; padding-left: 5em; }
+    pre > code.sourceCode > span { text-indent: -5em; padding-left: 5em; }
     }
     pre.numberSource code
       { counter-reset: source-line 0; }
@@ -71,7 +71,7 @@
     code span.at { color: #657422; } /* Attribute */
     code span.bn { color: #ad0000; } /* BaseN */
     code span.bu { } /* BuiltIn */
-    code span.cf { color: #003b4f; font-weight: bold; } /* ControlFlow */
+    code span.cf { color: #003b4f; } /* ControlFlow */
     code span.ch { color: #20794d; } /* Char */
     code span.cn { color: #8f5902; } /* Constant */
     code span.co { color: #5e5e5e; } /* Comment */
@@ -85,7 +85,7 @@
     code span.fu { color: #4758ab; } /* Function */
     code span.im { color: #00769e; } /* Import */
     code span.in { color: #5e5e5e; } /* Information */
-    code span.kw { color: #003b4f; font-weight: bold; } /* Keyword */
+    code span.kw { color: #003b4f; } /* Keyword */
     code span.op { color: #5e5e5e; } /* Operator */
     code span.ot { color: #003b4f; } /* Other */
     code span.pp { color: #ad0000; } /* Preprocessor */
@@ -222,8 +222,7 @@
   }
 
   .callout.callout-titled .callout-body > .callout-content > :last-child {
-    padding-bottom: 0.5rem;
-    margin-bottom: 0;
+    margin-bottom: 0.5rem;
   }
 
   .callout.callout-titled .callout-icon::before {
@@ -408,29 +407,12 @@ <h1 class="title">Module 3: Working Directories</h1>
 </div>
 </div>
 
-</section><section id="TOC">
-<nav role="doc-toc"> 
-<h2 id="toc-title">Page Items</h2>
-<ul>
-<li><a href="#/learning-objectives" id="/toc-learning-objectives">Learning Objectives</a></li>
-<li><a href="#/file-structure" id="/toc-file-structure">File Structure</a></li>
-<li><a href="#/working-directory-basic-term" id="/toc-working-directory-basic-term">Working Directory – Basic term</a></li>
-<li><a href="#/getting-and-setting-the-working-directory-using-code" id="/toc-getting-and-setting-the-working-directory-using-code">Getting and setting the working directory using code</a></li>
-<li><a href="#/setting-a-working-directory" id="/toc-setting-a-working-directory">Setting a working directory</a></li>
-<li><a href="#/absolute-vs.-relative-paths" id="/toc-absolute-vs.-relative-paths">Absolute vs.&nbsp;relative paths</a></li>
-<li><a href="#/relative-path" id="/toc-relative-path">Relative path</a></li>
-<li><a href="#/setting-the-working-directory-using-your-cursor" id="/toc-setting-the-working-directory-using-your-cursor">Setting the working directory using your cursor</a></li>
-<li><a href="#/setting-the-working-directory" id="/toc-setting-the-working-directory">Setting the Working Directory</a></li>
-<li><a href="#/summary" id="/toc-summary">Summary</a></li>
-<li><a href="#/acknowledgements" id="/toc-acknowledgements">Acknowledgements</a></li>
-</ul>
-</nav>
 </section>
 <section id="learning-objectives" class="slide level2">
 <h2>Learning Objectives</h2>
 <p>After module 3, you should be able to…</p>
 <ul>
-<li>Understand your own systems file structure and the purpose of the working directory</li>
+<li>Understand your own system’s file structure and the purpose of the working directory</li>
 <li>Determine the working directory</li>
 <li>Change the working directory</li>
 </ul>
@@ -443,16 +425,16 @@ <h2>File Structure</h2>
 <h2>Working Directory – Basic term</h2>
 <ul>
 <li>R “looks” for files on your computer relative to the “working” directory</li>
-<li>For example, if you want to load data into R or save a figure, you will need to tell R where/store the file</li>
+<li>For example, if you want to load data into R or save a figure, you will need to tell R where to look for or store the file</li>
 <li>Many people recommend not setting a directory in the scripts, rather assume you’re in the directory the script is in</li>
 </ul>
 </section>
 <section id="getting-and-setting-the-working-directory-using-code" class="slide level2">
 <h2>Getting and setting the working directory using code</h2>
 <div class="cell">
-<div class="sourceCode cell-code" id="cb1"><pre class="sourceCode numberSource r number-lines code-with-copy"><code class="sourceCode r"><span id="cb1-1"><a></a><span class="do">## get the working directory</span></span>
-<span id="cb1-2"><a></a><span class="fu">getwd</span>()</span>
-<span id="cb1-3"><a></a><span class="fu">setwd</span>(<span class="st">"~/"</span>) </span></code><button title="Copy to Clipboard" class="code-copy-button"><i class="bi"></i></button></pre></div>
+<div class="sourceCode cell-code" id="cb1"><pre class="sourceCode numberSource r number-lines code-with-copy"><code class="sourceCode r"><span id="cb1-1"><a href="#cb1-1"></a><span class="do">## get the working directory</span></span>
+<span id="cb1-2"><a href="#cb1-2"></a><span class="fu">getwd</span>()</span>
+<span id="cb1-3"><a href="#cb1-3"></a><span class="fu">setwd</span>(<span class="st">"~/"</span>) </span></code><button title="Copy to Clipboard" class="code-copy-button"><i class="bi"></i></button></pre></div>
 </div>
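+<p>A hedged sketch of how the two functions fit together, assuming you want to change directory temporarily and then return to where you started. It uses R's session temporary directory so that it runs on any computer; <code>old_wd</code> and <code>my_data.csv</code> are hypothetical names.</p>
+<div class="cell">
+<div class="sourceCode cell-code"><pre class="sourceCode r"><code class="sourceCode r"># setwd() invisibly returns the directory you were in before the change,
+# so it can be saved and restored later
+old_wd &lt;- setwd(tempdir())
+
+# Confirm where R is now looking for files
+getwd()
+
+# A relative path is resolved against the working directory;
+# "my_data.csv" is a made-up file name for illustration
+file.exists("my_data.csv")
+
+# Go back to the original working directory
+setwd(old_wd)</code></pre></div>
+</div>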
 </section>
 <section id="setting-a-working-directory" class="slide level2">
@@ -488,13 +470,13 @@ <h2>Setting the working directory using your cursor</h2>
 <p>Remember above “Many people recommend not setting a directory in the scripts, rather assume you’re in the directory the script is in.” To do so, go to Session –&gt; Set Working Directory –&gt; To Source File Location</p>
 <p>RStudio will show the code in the Console for the action you took with your cursor. This is a good way to learn about your file system and how to set a correct working directory!</p>
 <div class="cell">
-<div class="sourceCode cell-code" id="cb2"><pre class="sourceCode numberSource r number-lines code-with-copy"><code class="sourceCode r"><span id="cb2-1"><a></a><span class="fu">setwd</span>(<span class="st">"~/Dropbox/Git/SISMID-2024"</span>)</span></code><button title="Copy to Clipboard" class="code-copy-button"><i class="bi"></i></button></pre></div>
+<div class="sourceCode cell-code" id="cb2"><pre class="sourceCode numberSource r number-lines code-with-copy"><code class="sourceCode r"><span id="cb2-1"><a href="#cb2-1"></a><span class="fu">setwd</span>(<span class="st">"~/Dropbox/Git/SISMID-2024"</span>)</span></code><button title="Copy to Clipboard" class="code-copy-button"><i class="bi"></i></button></pre></div>
 </div>
 </section>
 <section id="setting-the-working-directory" class="slide level2">
 <h2>Setting the Working Directory</h2>
-<p>If you have not yet saved a “source” file, it will set working directory to the default location. See RStudio -&gt; Preferences -&gt; General for default location.</p>
-<p>To change the working directory to another location, go to Session –&gt; Set Working Directory –&gt; Choose Directory`</p>
+<p>If you have not yet saved a “source” file, it will set the working directory to the default location. To find the default location, go to the Tools menu in the Menu Bar -&gt; Global Options -&gt; General.</p>
+<p>To change the working directory to another location, find the Session menu in the Menu Bar –&gt; Set Working Directory –&gt; Choose Directory</p>
 <p>Again, RStudio will show the code in the Console for the action you took with your cursor.</p>
 </section>
 <section id="summary" class="slide level2">
@@ -503,7 +485,7 @@ <h2>Summary</h2>
 <li>R “looks” for files on your computer relative to the “working” directory</li>
 <li>Absolute path points to the same location in a file system - it is specific to your system and your system alone</li>
 <li>Relative path points is based on the current working directory</li>
-<li>Two functions, <code>setwd()</code> and <code>getwd()</code>, are your new best friends.</li>
+<li>Two functions, <code>setwd()</code> and <code>getwd()</code>, are useful for identifying and changing the working directory.</li>
 </ul>
 </section>
 <section id="acknowledgements" class="slide level2">
@@ -513,10 +495,8 @@ <h2>Acknowledgements</h2>
 <li><a href="https://jhudatascience.org/intro_to_r/">“Introduction to R for Public Health Researchers” Johns Hopkins University</a></li>
 </ul>
 
-<div class="quarto-auto-generated-content">
 <div class="footer footer-default">
 
-</div>
 </div>
 </section>
     </div>
@@ -545,6 +525,7 @@ <h2>Acknowledgements</h2>
       Reveal.initialize({
 'controlsAuto': true,
 'previewLinksAuto': false,
+'smaller': true,
 'pdfSeparateFragments': false,
 'autoAnimateEasing': "ease",
 'autoAnimateDuration': 1,
@@ -799,7 +780,18 @@ <h2>Acknowledgements</h2>
         }
         return false;
       }
-      const onCopySuccess = function(e) {
+      const clipboard = new window.ClipboardJS('.code-copy-button', {
+        text: function(trigger) {
+          const codeEl = trigger.previousElementSibling.cloneNode(true);
+          for (const childEl of codeEl.children) {
+            if (isCodeAnnotation(childEl)) {
+              childEl.remove();
+            }
+          }
+          return codeEl.innerText;
+        }
+      });
+      clipboard.on('success', function(e) {
         // button target
         const button = e.trigger;
         // don't keep focus
@@ -831,50 +823,11 @@ <h2>Acknowledgements</h2>
         }, 1000);
         // clear code selection
         e.clearSelection();
-      }
-      const getTextToCopy = function(trigger) {
-          const codeEl = trigger.previousElementSibling.cloneNode(true);
-          for (const childEl of codeEl.children) {
-            if (isCodeAnnotation(childEl)) {
-              childEl.remove();
-            }
-          }
-          return codeEl.innerText;
-      }
-      const clipboard = new window.ClipboardJS('.code-copy-button:not([data-in-quarto-modal])', {
-        text: getTextToCopy
       });
-      clipboard.on('success', onCopySuccess);
-      if (window.document.getElementById('quarto-embedded-source-code-modal')) {
-        // For code content inside modals, clipBoardJS needs to be initialized with a container option
-        // TODO: Check when it could be a function (https://github.com/zenorocha/clipboard.js/issues/860)
-        const clipboardModal = new window.ClipboardJS('.code-copy-button[data-in-quarto-modal]', {
-          text: getTextToCopy,
-          container: window.document.getElementById('quarto-embedded-source-code-modal')
-        });
-        clipboardModal.on('success', onCopySuccess);
-      }
-        var localhostRegex = new RegExp(/^(?:http|https):\/\/localhost\:?[0-9]*\//);
-        var mailtoRegex = new RegExp(/^mailto:/);
-          var filterRegex = new RegExp('/' + window.location.host + '/');
-        var isInternal = (href) => {
-            return filterRegex.test(href) || localhostRegex.test(href) || mailtoRegex.test(href);
-        }
-        // Inspect non-navigation links and adorn them if external
-     	var links = window.document.querySelectorAll('a[href]:not(.nav-link):not(.navbar-brand):not(.toc-action):not(.sidebar-link):not(.sidebar-item-toggle):not(.pagination-link):not(.no-external):not([aria-hidden]):not(.dropdown-item):not(.quarto-navigation-tool):not(.about-link)');
-        for (var i=0; i<links.length; i++) {
-          const link = links[i];
-          if (!isInternal(link.href)) {
-            // undo the damage that might have been done by quarto-nav.js in the case of
-            // links that we want to consider external
-            if (link.dataset.originalHref !== undefined) {
-              link.href = link.dataset.originalHref;
-            }
-          }
-        }
-      function tippyHover(el, contentFn, onTriggerFn, onUntriggerFn) {
+      function tippyHover(el, contentFn) {
         const config = {
           allowHTML: true,
+          content: contentFn,
           maxWidth: 500,
           delay: 100,
           arrow: false,
@@ -884,17 +837,8 @@ <h2>Acknowledgements</h2>
           interactive: true,
           interactiveBorder: 10,
           theme: 'light-border',
-          placement: 'bottom-start',
+          placement: 'bottom-start'
         };
-        if (contentFn) {
-          config.content = contentFn;
-        }
-        if (onTriggerFn) {
-          config.onTrigger = onTriggerFn;
-        }
-        if (onUntriggerFn) {
-          config.onUntrigger = onUntriggerFn;
-        }
           config['offset'] = [0,0];
           config['maxWidth'] = 700;
         window.tippy(el, config); 
@@ -908,11 +852,7 @@ <h2>Acknowledgements</h2>
           try { href = new URL(href).hash; } catch {}
           const id = href.replace(/^#\/?/, "");
           const note = window.document.getElementById(id);
-          if (note) {
-            return note.innerHTML;
-          } else {
-            return "";
-          }
+          return note.innerHTML;
         });
       }
       const findCites = (el) => {
diff --git a/docs/modules/Module04-RProject.html b/docs/modules/Module04-RProject.html
index b580b3d..212660a 100644
--- a/docs/modules/Module04-RProject.html
+++ b/docs/modules/Module04-RProject.html
@@ -8,11 +8,11 @@
 <link href="../site_libs/quarto-html/light-border.css" rel="stylesheet">
 <link href="../site_libs/quarto-html/quarto-html.min.css" rel="stylesheet" data-mode="light">
 <link href="../site_libs/quarto-html/quarto-syntax-highlighting.css" rel="stylesheet" id="quarto-text-highlighting-styles"><meta charset="utf-8">
-  <meta name="generator" content="quarto-1.5.54">
+  <meta name="generator" content="quarto-1.3.353">
 
   <meta name="author" content="Amy Winter">
   <meta name="author" content="Zane Billings">
-  <title>SISMID Module NUMBER Materials (2025) – Module 4: R Project</title>
+  <title>SISMID Module NUMBER Materials (2025) - Module 4: R Project</title>
   <meta name="apple-mobile-web-app-capable" content="yes">
   <meta name="apple-mobile-web-app-status-bar-style" content="black-translucent">
   <meta name="viewport" content="width=device-width, initial-scale=1.0, maximum-scale=1.0, user-scalable=no, minimal-ui">
@@ -157,8 +157,7 @@
   }
 
   .callout.callout-titled .callout-body > .callout-content > :last-child {
-    padding-bottom: 0.5rem;
-    margin-bottom: 0;
+    margin-bottom: 0.5rem;
   }
 
   .callout.callout-titled .callout-icon::before {
@@ -343,21 +342,6 @@ <h1 class="title">Module 4: R Project</h1>
 </div>
 </div>
 
-</section><section id="TOC">
-<nav role="doc-toc"> 
-<h2 id="toc-title">Page Items</h2>
-<ul>
-<li><a href="#/learning-objectives" id="/toc-learning-objectives">Learning Objectives</a></li>
-<li><a href="#/rstudio-project" id="/toc-rstudio-project">RStudio Project</a></li>
-<li><a href="#/rstudio-project-creation" id="/toc-rstudio-project-creation">RStudio Project Creation</a></li>
-<li><a href="#/rstudio-project-organization" id="/toc-rstudio-project-organization">RStudio Project Organization</a></li>
-<li><a href="#/some-things-to-notice-in-an-r-project" id="/toc-some-things-to-notice-in-an-r-project">Some things to notice in an R Project</a></li>
-<li><a href="#/r-project---common-issues" id="/toc-r-project---common-issues">R Project - Common issues</a></li>
-<li><a href="#/summary" id="/toc-summary">Summary</a></li>
-<li><a href="#/mini-exercise" id="/toc-mini-exercise">Mini Exercise</a></li>
-<li><a href="#/acknowledgements" id="/toc-acknowledgements">Acknowledgements</a></li>
-</ul>
-</nav>
 </section>
 <section id="learning-objectives" class="slide level2">
 <h2>Learning Objectives</h2>
@@ -365,11 +349,11 @@ <h2>Learning Objectives</h2>
 <ul>
 <li>Create an R Project</li>
 <li>Check you are in the desired R Project</li>
-<li>Reference the Files window in RStudio</li>
+<li>Reference the Files pane in RStudio</li>
 <li>Describe “good” R Project organization</li>
 </ul>
 </section>
-<section id="rstudio-project" class="slide level2 scrollable">
+<section id="rstudio-project" class="slide level2">
 <h2>RStudio Project</h2>
 <p>RStudio “Project” is one highly recommended strategy to build organized and reproducible code in R.</p>
 <ol type="1">
@@ -381,8 +365,8 @@ <h2>RStudio Project</h2>
 <section id="rstudio-project-creation" class="slide level2">
 <h2>RStudio Project Creation</h2>
 <p>Let’s create a new RStudio Project.</p>
-<p>Go to File –&gt; New Project –&gt; New Directory –&gt; New Project</p>
-<p>Call your Project “IntroToR_RProject”</p>
+<p>Find the File Menu in the Menu Bar –&gt; New Project –&gt; New Directory –&gt; New Project</p>
+<p>Name your Project “IntroToR_RProject”</p>
 </section>
 <section id="rstudio-project-organization" class="slide level2">
 <h2>RStudio Project Organization</h2>
@@ -394,36 +378,38 @@ <h2>RStudio Project Organization</h2>
 <li>output</li>
 <li>figures</li>
 </ul>
-<p>We will be working from this directory for the remainder of the Workshop. Take a moment to move any R scripts you have already created to the ‘code’ sub-directories.</p>
+<p>We will be working from this directory for the remainder of the Workshop. Take a moment to move any R scripts you have already created to the ‘code’ sub-directory.</p>
 </section>
-<section id="some-things-to-notice-in-an-r-project" class="slide level2 scrollable">
+<section id="some-things-to-notice-in-an-r-project" class="slide level2">
 <h2>Some things to notice in an R Project</h2>
 <ol type="1">
-<li>The name of the R Project will be shown at the top of the RStudio application</li>
+<li>The name of the R Project will be shown at the top of the RStudio Window</li>
 <li>If you check the working directory using <code>getwd()</code> you will find the working directory is set to the location where the R Project was saved.</li>
-<li>The Files window in RStudio is also set to the location where the R Project was saved, making it easy to navigate to sub-directories directly from RStudio.</li>
+<li>The Files pane in RStudio is also set to the location where the R Project was saved, making it easy to navigate to sub-directories directly from RStudio.</li>
 </ol>
 </section>
 <section id="r-project---common-issues" class="slide level2">
 <h2>R Project - Common issues</h2>
 <p>If you simply open RStudio, it will not automatically open your R Project. As a result, when you say run a function to import data using the relative path based on your working directory, it won’t be able to find the data.</p>
-<p>To open a previously created R Project, you need to open the R Project (i.e., SISMID_IntroToR_RProject.RProj)</p>
+<p>To open a previously created R Project, you need to open the R Project (i.e., double click on SISMID_IntroToR_RProject.RProj)</p>
 </section>
 <section id="summary" class="slide level2">
 <h2>Summary</h2>
 <ul>
 <li>R Projects are really helpful for lots of reasons, including to improve the reproducibility of your work</li>
 <li>Consistently set up your R Project’s sub-directories so that you can easily navigate the project</li>
+<li>If you get an error that a file can’t be found, make sure you correctly opened the R Project by looking for the Project name at the top of the RStudio application window.</li>
 </ul>
 </section>
-<section id="mini-exercise" class="slide level2 scrollable">
+<section id="mini-exercise" class="slide level2">
 <h2>Mini Exercise</h2>
 <ol type="1">
 <li>Close R Studio</li>
-<li>Reopen you R Project</li>
+<li>Reopen your R Project</li>
 <li>Check that you are actually in the R Project</li>
 <li>Create a new R script and save it in your ‘code’ subdirectory</li>
-<li>Create a vector of numbers and then get a summary statistics of that vector (e.g., sum, mean, median)</li>
+<li>Create a vector of numbers</li>
+<li>Create a vector of character values</li>
 <li>Add comment(s) to your R script to explain your code.</li>
 </ol>
 </section>
@@ -431,10 +417,8 @@ <h2>Mini Exercise</h2>
 <h2>Acknowledgements</h2>
 <p>These are the materials we looked through, modified, or extracted to complete this module’s lecture.</p>
 
-<div class="quarto-auto-generated-content">
 <div class="footer footer-default">
 
-</div>
 </div>
 </section>
     </div>
@@ -463,6 +447,7 @@ <h2>Acknowledgements</h2>
       Reveal.initialize({
 'controlsAuto': true,
 'previewLinksAuto': false,
+'smaller': true,
 'pdfSeparateFragments': false,
 'autoAnimateEasing': "ease",
 'autoAnimateDuration': 1,
@@ -679,7 +664,18 @@ <h2>Acknowledgements</h2>
         }
         return false;
       }
-      const onCopySuccess = function(e) {
+      const clipboard = new window.ClipboardJS('.code-copy-button', {
+        text: function(trigger) {
+          const codeEl = trigger.previousElementSibling.cloneNode(true);
+          for (const childEl of codeEl.children) {
+            if (isCodeAnnotation(childEl)) {
+              childEl.remove();
+            }
+          }
+          return codeEl.innerText;
+        }
+      });
+      clipboard.on('success', function(e) {
         // button target
         const button = e.trigger;
         // don't keep focus
@@ -711,50 +707,11 @@ <h2>Acknowledgements</h2>
         }, 1000);
         // clear code selection
         e.clearSelection();
-      }
-      const getTextToCopy = function(trigger) {
-          const codeEl = trigger.previousElementSibling.cloneNode(true);
-          for (const childEl of codeEl.children) {
-            if (isCodeAnnotation(childEl)) {
-              childEl.remove();
-            }
-          }
-          return codeEl.innerText;
-      }
-      const clipboard = new window.ClipboardJS('.code-copy-button:not([data-in-quarto-modal])', {
-        text: getTextToCopy
       });
-      clipboard.on('success', onCopySuccess);
-      if (window.document.getElementById('quarto-embedded-source-code-modal')) {
-        // For code content inside modals, clipBoardJS needs to be initialized with a container option
-        // TODO: Check when it could be a function (https://github.com/zenorocha/clipboard.js/issues/860)
-        const clipboardModal = new window.ClipboardJS('.code-copy-button[data-in-quarto-modal]', {
-          text: getTextToCopy,
-          container: window.document.getElementById('quarto-embedded-source-code-modal')
-        });
-        clipboardModal.on('success', onCopySuccess);
-      }
-        var localhostRegex = new RegExp(/^(?:http|https):\/\/localhost\:?[0-9]*\//);
-        var mailtoRegex = new RegExp(/^mailto:/);
-          var filterRegex = new RegExp('/' + window.location.host + '/');
-        var isInternal = (href) => {
-            return filterRegex.test(href) || localhostRegex.test(href) || mailtoRegex.test(href);
-        }
-        // Inspect non-navigation links and adorn them if external
-     	var links = window.document.querySelectorAll('a[href]:not(.nav-link):not(.navbar-brand):not(.toc-action):not(.sidebar-link):not(.sidebar-item-toggle):not(.pagination-link):not(.no-external):not([aria-hidden]):not(.dropdown-item):not(.quarto-navigation-tool):not(.about-link)');
-        for (var i=0; i<links.length; i++) {
-          const link = links[i];
-          if (!isInternal(link.href)) {
-            // undo the damage that might have been done by quarto-nav.js in the case of
-            // links that we want to consider external
-            if (link.dataset.originalHref !== undefined) {
-              link.href = link.dataset.originalHref;
-            }
-          }
-        }
-      function tippyHover(el, contentFn, onTriggerFn, onUntriggerFn) {
+      function tippyHover(el, contentFn) {
         const config = {
           allowHTML: true,
+          content: contentFn,
           maxWidth: 500,
           delay: 100,
           arrow: false,
@@ -764,17 +721,8 @@ <h2>Acknowledgements</h2>
           interactive: true,
           interactiveBorder: 10,
           theme: 'light-border',
-          placement: 'bottom-start',
+          placement: 'bottom-start'
         };
-        if (contentFn) {
-          config.content = contentFn;
-        }
-        if (onTriggerFn) {
-          config.onTrigger = onTriggerFn;
-        }
-        if (onUntriggerFn) {
-          config.onUntrigger = onUntriggerFn;
-        }
           config['offset'] = [0,0];
           config['maxWidth'] = 700;
         window.tippy(el, config); 
@@ -788,11 +736,7 @@ <h2>Acknowledgements</h2>
           try { href = new URL(href).hash; } catch {}
           const id = href.replace(/^#\/?/, "");
           const note = window.document.getElementById(id);
-          if (note) {
-            return note.innerHTML;
-          } else {
-            return "";
-          }
+          return note.innerHTML;
         });
       }
       const findCites = (el) => {
diff --git a/docs/modules/Module05-DataImportExport.html b/docs/modules/Module05-DataImportExport.html
index a8eac77..abf6541 100644
--- a/docs/modules/Module05-DataImportExport.html
+++ b/docs/modules/Module05-DataImportExport.html
@@ -8,11 +8,11 @@
 <link href="../site_libs/quarto-html/light-border.css" rel="stylesheet">
 <link href="../site_libs/quarto-html/quarto-html.min.css" rel="stylesheet" data-mode="light">
 <link href="../site_libs/quarto-html/quarto-syntax-highlighting.css" rel="stylesheet" id="quarto-text-highlighting-styles"><meta charset="utf-8">
-  <meta name="generator" content="quarto-1.5.54">
+  <meta name="generator" content="quarto-1.3.353">
 
   <meta name="author" content="Amy Winter">
   <meta name="author" content="Zane Billings">
-  <title>SISMID Module NUMBER Materials (2025) – Module 5: Data Import and Export</title>
+  <title>SISMID Module NUMBER Materials (2025) - Module 5: Data Import and Export</title>
   <meta name="apple-mobile-web-app-capable" content="yes">
   <meta name="apple-mobile-web-app-status-bar-style" content="black-translucent">
   <meta name="viewport" content="width=device-width, initial-scale=1.0, maximum-scale=1.0, user-scalable=no, minimal-ui">
@@ -32,7 +32,7 @@
     }
     /* CSS for syntax highlighting */
     pre > code.sourceCode { white-space: pre; position: relative; }
-    pre > code.sourceCode > span { line-height: 1.25; }
+    pre > code.sourceCode > span { display: inline-block; line-height: 1.25; }
     pre > code.sourceCode > span:empty { height: 1.2em; }
     .sourceCode { overflow: visible; }
     code.sourceCode > span { color: inherit; text-decoration: inherit; }
@@ -43,7 +43,7 @@
     }
     @media print {
     pre > code.sourceCode { white-space: pre-wrap; }
-    pre > code.sourceCode > span { display: inline-block; text-indent: -5em; padding-left: 5em; }
+    pre > code.sourceCode > span { text-indent: -5em; padding-left: 5em; }
     }
     pre.numberSource code
       { counter-reset: source-line 0; }
@@ -71,7 +71,7 @@
     code span.at { color: #657422; } /* Attribute */
     code span.bn { color: #ad0000; } /* BaseN */
     code span.bu { } /* BuiltIn */
-    code span.cf { color: #003b4f; font-weight: bold; } /* ControlFlow */
+    code span.cf { color: #003b4f; } /* ControlFlow */
     code span.ch { color: #20794d; } /* Char */
     code span.cn { color: #8f5902; } /* Constant */
     code span.co { color: #5e5e5e; } /* Comment */
@@ -85,7 +85,7 @@
     code span.fu { color: #4758ab; } /* Function */
     code span.im { color: #00769e; } /* Import */
     code span.in { color: #5e5e5e; } /* Information */
-    code span.kw { color: #003b4f; font-weight: bold; } /* Keyword */
+    code span.kw { color: #003b4f; } /* Keyword */
     code span.op { color: #5e5e5e; } /* Operator */
     code span.ot { color: #003b4f; } /* Other */
     code span.pp { color: #ad0000; } /* Preprocessor */
@@ -222,8 +222,7 @@
   }
 
   .callout.callout-titled .callout-body > .callout-content > :last-child {
-    padding-bottom: 0.5rem;
-    margin-bottom: 0;
+    margin-bottom: 0.5rem;
   }
 
   .callout.callout-titled .callout-icon::before {
@@ -408,55 +407,22 @@ <h1 class="title">Module 5: Data Import and Export</h1>
 </div>
 </div>
 
-</section><section id="TOC">
-<nav role="doc-toc"> 
-<h2 id="toc-title">Page Items</h2>
-<ul>
-<li><a href="#/learning-objectives" id="/toc-learning-objectives">Learning Objectives</a></li>
-<li><a href="#/import-read-data" id="/toc-import-read-data">Import (read) Data</a></li>
-<li><a href="#/mini-exercise" id="/toc-mini-exercise">Mini exercise</a></li>
-<li><a href="#/import-delimited-data" id="/toc-import-delimited-data">Import delimited data</a></li>
-<li><a href="#/import-.csv-files" id="/toc-import-.csv-files">Import .csv files</a></li>
-<li><a href="#/mini-exercise-1" id="/toc-mini-exercise-1">Mini Exercise</a></li>
-<li><a href="#/import-.csv-files-1" id="/toc-import-.csv-files-1">Import .csv files</a></li>
-<li><a href="#/import-.txt-files" id="/toc-import-.txt-files">Import .txt files</a></li>
-<li><a href="#/import-.txt-files-1" id="/toc-import-.txt-files-1">Import .txt files</a></li>
-<li><a href="#/what-if-we-have-a-.xlsx-file---what-do-we-do" id="/toc-what-if-we-have-a-.xlsx-file---what-do-we-do">What if we have a .xlsx file - what do we do?</a></li>
-<li><a href="#/internet-search" id="/toc-internet-search">1. Internet Search</a></li>
-<li><a href="#/find-and-vet-function-and-package-you-want" id="/toc-find-and-vet-function-and-package-you-want">2. Find and vet function and package you want</a></li>
-<li><a href="#/install-package" id="/toc-install-package">3. Install Package</a></li>
-<li><a href="#/call-package" id="/toc-call-package">4. Call Package</a></li>
-<li><a href="#/use-function" id="/toc-use-function">5. Use Function</a></li>
-<li><a href="#/use-function-1" id="/toc-use-function-1">5. Use Function</a></li>
-<li><a href="#/mini-exercise-2" id="/toc-mini-exercise-2">Mini exercise</a></li>
-<li><a href="#/installing-and-calling-packages---common-confusion" id="/toc-installing-and-calling-packages---common-confusion">Installing and calling packages - Common confusion</a></li>
-<li><a href="#/common-error" id="/toc-common-error">Common Error</a></li>
-<li><a href="#/export-write-data" id="/toc-export-write-data">Export (write) Data</a></li>
-<li><a href="#/export-delimited-data" id="/toc-export-delimited-data">Export delimited data</a></li>
-<li><a href="#/export-delimited-data-1" id="/toc-export-delimited-data-1">Export delimited data</a></li>
-<li><a href="#/r-.rds-and-.rdardata-files" id="/toc-r-.rds-and-.rdardata-files">R .rds and .rda/RData files</a></li>
-<li><a href="#/rds-binary-file" id="/toc-rds-binary-file">.rds binary file</a></li>
-<li><a href="#/rdardata-files" id="/toc-rdardata-files">.rda/RData files</a></li>
-<li><a href="#/summary" id="/toc-summary">Summary</a></li>
-<li><a href="#/acknowledgements" id="/toc-acknowledgements">Acknowledgements</a></li>
-</ul>
-</nav>
 </section>
 <section id="learning-objectives" class="slide level2">
 <h2>Learning Objectives</h2>
 <p>After module 5, you should be able to…</p>
 <ul>
 <li>Use Base R functions to load data</li>
-<li>Install and call external R Packages to extend R’s functionality</li>
-<li>Install any type of data into R</li>
-<li>Find loaded data in the Global Environment window of RStudio</li>
+<li>Install and attach external R Packages to extend R’s functionality</li>
+<li>Load any type of data into R</li>
+<li>Find loaded data in the Environment pane of RStudio</li>
 <li>Reading and writing R .Rds and .Rda/.RData files</li>
 </ul>
 </section>
 <section id="import-read-data" class="slide level2">
 <h2>Import (read) Data</h2>
 <ul>
-<li>Importing or ‘Reading in’ data is the first step of any real project/analysis</li>
+<li>Importing or ‘Reading in’ data is the first step of any real project / data analysis</li>
 <li>R can read almost any file format, especially with external, non-Base R, packages</li>
 <li>We are going to focus on simple delimited files first.
 <ul>
@@ -466,19 +432,20 @@ <h2>Import (read) Data</h2>
 </ul>
 <p>A delimited file is a sequential file with column delimiters. Each delimited file is a stream of records, which consists of fields that are ordered by column. Each record contains fields for one row. Within each row, individual fields are separated by column <strong>delimiters</strong> (IBM.com definition)</p>
 </section>
-<section id="mini-exercise" class="slide level2 scrollable">
+<section id="mini-exercise" class="slide level2">
 <h2>Mini exercise</h2>
 <ol type="1">
 <li><p>Download Module 5 data from the website and save the data to your data subdirectory – specifically <code>SISMID_IntroToR_RProject/data</code></p></li>
-<li><p>Open the data files in a text editor application and familiarize you self with the data.</p></li>
-<li><p>Determine the delminiter of the two ‘.txt’ files</p></li>
+<li><p>Open the ‘.csv’ and ‘.txt’ data files in a text editor application (e.g., Notepad on Windows or TextEdit on Mac) and familiarize yourself with the data.</p></li>
+<li><p>Open the ‘.xlsx’ data file in Excel and familiarize yourself with the data. If you use a Mac, <strong>do not</strong> open it in Numbers, which can corrupt the file. If you do not have Excel, you can upload it to Google Sheets.</p></li>
+<li><p>Determine the delimiter of the two ‘.txt’ files</p></li>
 </ol>
 </section>
 <section id="import-delimited-data" class="slide level2">
 <h2>Import delimited data</h2>
 <p>Within the Base R ‘util’ package we can find a handful of useful functions including <code>read.csv()</code> and <code>read.delim()</code> to importing data.</p>
 <div class="cell">
-<div class="sourceCode cell-code" id="cb1"><pre class="sourceCode numberSource r number-lines code-with-copy"><code class="sourceCode r"><span id="cb1-1"><a></a>?read.csv</span></code><button title="Copy to Clipboard" class="code-copy-button"><i class="bi"></i></button></pre></div>
+<div class="sourceCode cell-code" id="cb1"><pre class="sourceCode numberSource r number-lines code-with-copy"><code class="sourceCode r"><span id="cb1-1"><a href="#cb1-1"></a>?read.csv</span></code><button title="Copy to Clipboard" class="code-copy-button"><i class="bi"></i></button></pre></div>
 </div>
 <div class="cell">
 <div class="cell-output cell-output-stderr">
@@ -838,55 +805,54 @@ <h2>Import delimited data</h2>
 </section>
 <section id="import-.csv-files" class="slide level2">
 <h2>Import .csv files</h2>
-<p>Reminder</p>
+<p>Function signature reminder</p>
 <pre><code>read.csv(file, header = TRUE, sep = ",", quote = "\"",
          dec = ".", fill = TRUE, comment.char = "", ...)</code></pre>
-<p><code>file</code> is the first argument and is the path to your file, in quotes</p>
-<pre><code>-       can be path in your local computer -- absolute file path or relative file path 
--       can be path to a file on a website</code></pre>
+<pre><code>    -       `file` is the first argument and is the path to your file, in quotes 
+    
+            -       can be path in your local computer -- absolute file path or relative file path 
+            -       can be path to a file on a website</code></pre>
 </section>
 <section id="mini-exercise-1" class="slide level2">
-<h2>Mini Exercise</h2>
+<h2>Mini exercise</h2>
 <p>If your R Project is not already open, open it so we take advantage of it setting a useful working directory for us in order to import data.</p>
 </section>
 <section id="import-.csv-files-1" class="slide level2">
 <h2>Import .csv files</h2>
 <p>Lets import a new data file</p>
 <div class="cell">
-<div class="sourceCode cell-code" id="cb6"><pre class="sourceCode numberSource r number-lines code-with-copy"><code class="sourceCode r"><span id="cb6-1"><a></a><span class="do">## Examples</span></span>
-<span id="cb6-2"><a></a>df <span class="ot">&lt;-</span> <span class="fu">read.csv</span>(<span class="at">file =</span> <span class="st">"data/serodata.csv"</span>) <span class="co">#relative path</span></span>
-<span id="cb6-3"><a></a>df <span class="ot">&lt;-</span> <span class="fu">read.csv</span>(<span class="at">file =</span> <span class="st">"~/Dropbox/Git/SISMID-2024/modules/data/serodata.csv"</span>) <span class="co">#absolute path starting from my home directory</span></span></code><button title="Copy to Clipboard" class="code-copy-button"><i class="bi"></i></button></pre></div>
+<div class="sourceCode cell-code" id="cb6"><pre class="sourceCode numberSource r number-lines code-with-copy"><code class="sourceCode r"><span id="cb6-1"><a href="#cb6-1"></a><span class="do">## Examples</span></span>
+<span id="cb6-2"><a href="#cb6-2"></a>df <span class="ot">&lt;-</span> <span class="fu">read.csv</span>(<span class="at">file =</span> <span class="st">"data/serodata.csv"</span>) <span class="co">#relative path</span></span></code><button title="Copy to Clipboard" class="code-copy-button"><i class="bi"></i></button></pre></div>
 </div>
 <p>Note #1, I assigned the data frame to an object called <code>df</code>. I could have called the data anything, but in order to use the data (i.e., as an object we can find in the Environment), I need to assign it as an object.</p>
-<p>Note #2, Look to the Environment window, you will see the <code>df</code> object ready to be used.</p>
+<p>Note #2, Look to the Environment pane, you will see the <code>df</code> object ready to be used.</p>
 </section>
 <section id="import-.txt-files" class="slide level2">
 <h2>Import .txt files</h2>
 <p><code>read.csv()</code> is a special case of <code>read.delim()</code> – a general function to read a delimited file into a data frame</p>
+<p>Function signature reminder</p>
 <pre><code>read.delim(file, header = TRUE, sep = "\t", quote = "\"",
            dec = ".", fill = TRUE, comment.char = "", ...)</code></pre>
-<ul>
-<li><code>file</code> is the path to your file, in quotes</li>
-<li><code>delim</code> is what separates the fields within a record. The default for csv is comma</li>
-</ul>
+<pre><code>    - `file` is the path to your file, in quotes 
+    - `sep` is what separates the fields within a record. The default for read.delim() is a tab; the default for read.csv() is a comma</code></pre>
 </section>
 <section id="import-.txt-files-1" class="slide level2">
 <h2>Import .txt files</h2>
-<p>Lets first import ‘serodata1.txt’ which uses a tab delminiter and ‘serodata2.txt’ which uses a semicolon delminiter.</p>
+<p>Let’s first import ‘serodata1.txt’, which uses a tab delimiter, and ‘serodata2.txt’, which uses a semicolon delimiter.</p>
 <div class="cell">
-<div class="sourceCode cell-code" id="cb8"><pre class="sourceCode numberSource r number-lines code-with-copy"><code class="sourceCode r"><span id="cb8-1"><a></a><span class="do">## Examples</span></span>
-<span id="cb8-2"><a></a>df <span class="ot">&lt;-</span> <span class="fu">read.delim</span>(<span class="at">file =</span> <span class="st">"data/serodata.txt"</span>, <span class="at">sep =</span> <span class="st">"</span><span class="sc">\t</span><span class="st">"</span>)</span>
-<span id="cb8-3"><a></a>df <span class="ot">&lt;-</span> <span class="fu">read.delim</span>(<span class="at">file =</span> <span class="st">"data/serodata.txt"</span>, <span class="at">sep =</span> <span class="st">";"</span>)</span></code><button title="Copy to Clipboard" class="code-copy-button"><i class="bi"></i></button></pre></div>
+<div class="sourceCode cell-code" id="cb9"><pre class="sourceCode numberSource r number-lines code-with-copy"><code class="sourceCode r"><span id="cb9-1"><a href="#cb9-1"></a><span class="do">## Examples</span></span>
+<span id="cb9-2"><a href="#cb9-2"></a>df <span class="ot">&lt;-</span> <span class="fu">read.delim</span>(<span class="at">file =</span> <span class="st">"data/serodata1.txt"</span>, <span class="at">sep =</span> <span class="st">"</span><span class="sc">\t</span><span class="st">"</span>)</span>
+<span id="cb9-3"><a href="#cb9-3"></a>df <span class="ot">&lt;-</span> <span class="fu">read.delim</span>(<span class="at">file =</span> <span class="st">"data/serodata2.txt"</span>, <span class="at">sep =</span> <span class="st">";"</span>)</span></code><button title="Copy to Clipboard" class="code-copy-button"><i class="bi"></i></button></pre></div>
 </div>
-<p>The data is now successfully read into your R workspace, <strong>many times actually.</strong> Notice, that each time we imported the data we assigned the data to the <code>df</code> object, meaning we replaced it each time we reassinged the <code>df</code> object.</p>
+<p>The dataset is now successfully read into your R workspace, <strong>many times actually.</strong> Notice that each time we imported the data we assigned it to the <code>df</code> object, meaning we replaced its contents each time we reassigned the <code>df</code> object.</p>
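+<p>A minimal, self-contained sketch of the same idea, assuming made-up toy data written to temporary files (the object names and values below are hypothetical, not the workshop data):</p>
+<pre><code># Hypothetical toy data, not the workshop serodata files
+toy &lt;- data.frame(id = 1:3, age = c(2, 5, 9))
+
+# Write the same data with two different delimiters
+tab_file &lt;- tempfile(fileext = ".txt")
+semi_file &lt;- tempfile(fileext = ".txt")
+write.table(toy, file = tab_file, sep = "\t", row.names = FALSE)
+write.table(toy, file = semi_file, sep = ";", row.names = FALSE)
+
+# Read each file back, telling read.delim() which delimiter to expect
+df_tab &lt;- read.delim(file = tab_file, sep = "\t")
+df_semi &lt;- read.delim(file = semi_file, sep = ";")
+
+# Compare the two imported data frames
+identical(df_tab, df_semi)</code></pre>
+<p>Using separate object names here (<code>df_tab</code>, <code>df_semi</code>) keeps both imports available instead of overwriting one with the other.</p>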
 </section>
-<section id="what-if-we-have-a-.xlsx-file---what-do-we-do" class="slide level2 scrollable">
+<section id="what-if-we-have-a-.xlsx-file---what-do-we-do" class="slide level2">
 <h2>What if we have a .xlsx file - what do we do?</h2>
 <ol type="1">
 <li>Google / Ask ChatGPT</li>
 <li>Find and vet function and package you want</li>
 <li>Install package</li>
-<li>Call package</li>
+<li>Attach package</li>
 <li>Use function</li>
 </ol>
 </section>
@@ -894,25 +860,13 @@ <h2>What if we have a .xlsx file - what do we do?</h2>
 <h2>1. Internet Search</h2>
 <div class="cell">
 <div class="cell-output-display">
-<div>
-<figure>
 <p><img data-src="images/ChatGPT.png" style="width:100.0%"></p>
-</figure>
-</div>
 </div>
 <div class="cell-output-display">
-<div>
-<figure>
 <p><img data-src="images/GoogleSearch.png" style="width:100.0%"></p>
-</figure>
-</div>
 </div>
 <div class="cell-output-display">
-<div>
-<figure>
 <p><img data-src="images/StackOverflow.png" style="width:100.0%"></p>
-</figure>
-</div>
 </div>
 </div>
 </section>
@@ -920,7 +874,7 @@ <h2>1. Internet Search</h2>
 <h2>2. Find and vet function and package you want</h2>
 <p>I am getting a consistent message to use the <code>read_excel()</code> function found in the <code>readxl</code> package. This package was developed by Hadley Wickham, who we know is reputable. Also, you can check that the data were read in correctly, because this is a straightforward task.</p>
 </section>
-<section id="install-package" class="slide level2 scrollable">
+<section id="install-package" class="slide level2">
 <h2>3. Install Package</h2>
 <p>To use the bundle or “package” of code (and possibly data) from a package, you need to install and also attach the package.</p>
 <p>To install a package you can</p>
@@ -932,29 +886,28 @@ <h2>3. Install Package</h2>
 <li>use the following code:</li>
 </ol>
 <div class="cell">
-<div class="sourceCode cell-code" id="cb9"><pre class="sourceCode numberSource r number-lines code-with-copy"><code class="sourceCode r"><span id="cb9-1"><a></a><span class="fu">install.packages</span>(<span class="st">"package_name"</span>)</span></code><button title="Copy to Clipboard" class="code-copy-button"><i class="bi"></i></button></pre></div>
+<div class="sourceCode cell-code" id="cb10"><pre class="sourceCode numberSource r number-lines code-with-copy"><code class="sourceCode r"><span id="cb10-1"><a href="#cb10-1"></a><span class="fu">install.packages</span>(<span class="st">"package_name"</span>)</span></code><button title="Copy to Clipboard" class="code-copy-button"><i class="bi"></i></button></pre></div>
 </div>
 <p>Therefore,</p>
 <div class="cell">
-<div class="sourceCode cell-code" id="cb10"><pre class="sourceCode numberSource r number-lines code-with-copy"><code class="sourceCode r"><span id="cb10-1"><a></a><span class="fu">install.packages</span>(<span class="st">"readxl"</span>)</span></code><button title="Copy to Clipboard" class="code-copy-button"><i class="bi"></i></button></pre></div>
+<div class="sourceCode cell-code" id="cb11"><pre class="sourceCode numberSource r number-lines code-with-copy"><code class="sourceCode r"><span id="cb11-1"><a href="#cb11-1"></a><span class="fu">install.packages</span>(<span class="st">"readxl"</span>)</span></code><button title="Copy to Clipboard" class="code-copy-button"><i class="bi"></i></button></pre></div>
 </div>
 </section>
-<section id="call-package" class="slide level2">
-<h2>4. Call Package</h2>
-<p>Reminder – Installing and calling packages</p>
-<p>To call (i.e., be able to use the package) you can use the following code:</p>
+<section id="attach-package" class="slide level2">
+<h2>4. Attach Package</h2>
+<p>Reminder - To attach (i.e., be able to use the package) you can use the following code:</p>
 <div class="cell">
-<div class="sourceCode cell-code" id="cb11"><pre class="sourceCode numberSource r number-lines code-with-copy"><code class="sourceCode r"><span id="cb11-1"><a></a><span class="fu">library</span>(package_name)</span></code><button title="Copy to Clipboard" class="code-copy-button"><i class="bi"></i></button></pre></div>
+<div class="sourceCode cell-code" id="cb12"><pre class="sourceCode numberSource r number-lines code-with-copy"><code class="sourceCode r"><span id="cb12-1"><a href="#cb12-1"></a><span class="fu">require</span>(package_name)</span></code><button title="Copy to Clipboard" class="code-copy-button"><i class="bi"></i></button></pre></div>
 </div>
 <p>Therefore,</p>
 <div class="cell">
-<div class="sourceCode cell-code" id="cb12"><pre class="sourceCode numberSource r number-lines code-with-copy"><code class="sourceCode r"><span id="cb12-1"><a></a><span class="fu">library</span>(readxl)</span></code><button title="Copy to Clipboard" class="code-copy-button"><i class="bi"></i></button></pre></div>
+<div class="sourceCode cell-code" id="cb13"><pre class="sourceCode numberSource r number-lines code-with-copy"><code class="sourceCode r"><span id="cb13-1"><a href="#cb13-1"></a><span class="fu">require</span>(readxl)</span></code><button title="Copy to Clipboard" class="code-copy-button"><i class="bi"></i></button></pre></div>
 </div>
 </section>
 <section id="use-function" class="slide level2">
 <h2>5. Use Function</h2>
 <div class="cell">
-<div class="sourceCode cell-code" id="cb13"><pre class="sourceCode numberSource r number-lines code-with-copy"><code class="sourceCode r"><span id="cb13-1"><a></a>?read_excel</span></code><button title="Copy to Clipboard" class="code-copy-button"><i class="bi"></i></button></pre></div>
+<div class="sourceCode cell-code" id="cb14"><pre class="sourceCode numberSource r number-lines code-with-copy"><code class="sourceCode r"><span id="cb14-1"><a href="#cb14-1"></a>?read_excel</span></code><button title="Copy to Clipboard" class="code-copy-button"><i class="bi"></i></button></pre></div>
 </div>
 <p>Read xls and xlsx files</p>
 <p>Description:</p>
@@ -1100,7 +1053,7 @@ <h2>5. Use Function</h2>
 </section>
 <section id="use-function-1" class="slide level2">
 <h2>5. Use Function</h2>
-<p>Reminder</p>
+<p>Reminder of function signature</p>
 <pre><code>read_excel(
   path,
   sheet = NULL,
@@ -1117,12 +1070,11 @@ <h2>5. Use Function</h2>
 )</code></pre>
 <p>Let’s practice</p>
 <div class="cell">
-<div class="sourceCode cell-code" id="cb23"><pre class="sourceCode numberSource r number-lines code-with-copy"><code class="sourceCode r"><span id="cb23-1"><a></a>df <span class="ot">&lt;-</span> <span class="fu">read_excel</span>(<span class="at">path =</span> <span class="st">"data/serodata.xlsx"</span>, <span class="at">sheet =</span> <span class="st">"Data"</span>)</span></code><button title="Copy to Clipboard" class="code-copy-button"><i class="bi"></i></button></pre></div>
+<div class="sourceCode cell-code" id="cb24"><pre class="sourceCode numberSource r number-lines code-with-copy"><code class="sourceCode r"><span id="cb24-1"><a href="#cb24-1"></a>df <span class="ot">&lt;-</span> <span class="fu">read_excel</span>(<span class="at">path =</span> <span class="st">"data/serodata.xlsx"</span>, <span class="at">sheet =</span> <span class="st">"Data"</span>)</span></code><button title="Copy to Clipboard" class="code-copy-button"><i class="bi"></i></button></pre></div>
 </div>
 </section>
-<section id="mini-exercise-2" class="slide level2 scrollable">
-<h2>Mini exercise</h2>
-<p>Lets make some mistakes</p>
+<section id="lets-make-some-mistakes" class="slide level2">
+<h2>Let’s make some mistakes</h2>
 <ol type="1">
 <li><p>What if we read in the data without assigning it to an object (i.e., <code>read_excel(path = "data/serodata.xlsx", sheet = "Data")</code>)?</p></li>
 <li><p>What if we forget to specify the sheet argument (i.e., <code>dd &lt;- read_excel(path = "data/serodata.xlsx")</code>)?</p></li>
@@ -1130,26 +1082,28 @@ <h2>Mini exercise</h2>
 </section>
 <section id="installing-and-calling-packages---common-confusion" class="slide level2">
 <h2>Installing and calling packages - Common confusion</h2>
-<p>You only need to install a package once (unless you update R), but you will need to call or load a package each time you want to use it.</p>
+<p><br></p>
+<p>You only need to install a package once (unless you update R or want to update the package), but you will need to call or load a package each time you want to use it.</p>
+<p><br></p>
 <p>The exception to this rule is the “base” set of packages (i.e., <strong>Base R</strong>), which are installed automatically when you install R and are automatically attached whenever you open R or RStudio.</p>
 </section>
 <section id="common-error" class="slide level2">
 <h2>Common Error</h2>
-<p>Be prepared to see the error</p>
+<p>Be prepared to see this error</p>
 <div class="cell">
-<div class="sourceCode cell-code" id="cb24"><pre class="sourceCode numberSource r number-lines code-with-copy"><code class="sourceCode r"><span id="cb24-1"><a></a>Error<span class="sc">:</span> could not find <span class="cf">function</span> <span class="st">"some_function"</span></span></code><button title="Copy to Clipboard" class="code-copy-button"><i class="bi"></i></button></pre></div>
+<div class="sourceCode cell-code" id="cb25"><pre class="sourceCode numberSource r number-lines code-with-copy"><code class="sourceCode r"><span id="cb25-1"><a href="#cb25-1"></a>Error<span class="sc">:</span> could not find <span class="cf">function</span> <span class="st">"some_function_name"</span></span></code><button title="Copy to Clipboard" class="code-copy-button"><i class="bi"></i></button></pre></div>
 </div>
-<p>This usually mean that either</p>
+<p>This usually means that either</p>
 <ul>
 <li>you called the function by the wrong name</li>
 <li>you have not installed a package that contains the function</li>
-<li>you have installed a package but you forgot to call it (i.e., <code>library(package_name)</code>) – <strong>most likely</strong></li>
+<li>you have installed a package but you forgot to attach it (i.e., <code>require(package_name)</code>) – <strong>most likely</strong></li>
 </ul>
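+<p>A small sketch of how this error arises, using a deliberately non-existent function name (<code>some_function_name</code> is hypothetical), followed by one way to check whether a package is installed before attaching it:</p>
+<pre><code># Calling a function R cannot find raises an error; tryCatch() captures the message
+msg &lt;- tryCatch(
+  some_function_name(1),
+  error = function(e) conditionMessage(e)
+)
+print(msg)
+
+# requireNamespace() returns TRUE if the package is installed, FALSE otherwise
+installed &lt;- requireNamespace("readxl", quietly = TRUE)
+print(installed)</code></pre>
+<p>If <code>installed</code> is <code>FALSE</code>, the package needs to be installed; if it is <code>TRUE</code> and the function still cannot be found, the package most likely has not been attached.</p>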
 </section>
 <section id="export-write-data" class="slide level2">
 <h2>Export (write) Data</h2>
 <ul>
-<li>Exporting or ‘Writing out’ data allows you to save modified files to future use or sharing</li>
+<li>Exporting or ‘Writing out’ data allows you to save modified files for future use or sharing</li>
 <li>R can write almost any file format, especially with external, non-Base R, packages</li>
 <li>We are going to focus again on writing delimited files</li>
 </ul>
@@ -1367,17 +1321,18 @@ <h2>Export delimited data</h2>
 </section>
 <section id="export-delimited-data-1" class="slide level2">
 <h2>Export delimited data</h2>
+<p>Let’s practice exporting the data as three files with three different delimiters (comma, tab, semicolon)</p>
 <div class="cell">
-<div class="sourceCode cell-code" id="cb26"><pre class="sourceCode numberSource r number-lines code-with-copy"><code class="sourceCode r"><span id="cb26-1"><a></a><span class="fu">write.csv</span>(df, <span class="at">file=</span><span class="st">"data/serodata_new.csv"</span>, <span class="at">row.names =</span> <span class="cn">FALSE</span>) <span class="co">#comma delimited</span></span>
-<span id="cb26-2"><a></a><span class="fu">write.table</span>(df, <span class="at">file=</span><span class="st">"data/serodata1_new.txt"</span>, <span class="at">sep=</span><span class="st">"</span><span class="sc">\t</span><span class="st">"</span>, <span class="at">row.names =</span> <span class="cn">FALSE</span>) <span class="co">#tab delimited</span></span>
-<span id="cb26-3"><a></a><span class="fu">write.table</span>(df, <span class="at">file=</span><span class="st">"data/serodata2_new.txt"</span>, <span class="at">sep=</span><span class="st">";"</span>, <span class="at">row.names =</span> <span class="cn">FALSE</span>) <span class="co">#semicolon delimited</span></span></code><button title="Copy to Clipboard" class="code-copy-button"><i class="bi"></i></button></pre></div>
+<div class="sourceCode cell-code" id="cb27"><pre class="sourceCode numberSource r number-lines code-with-copy"><code class="sourceCode r"><span id="cb27-1"><a href="#cb27-1"></a><span class="fu">write.csv</span>(df, <span class="at">file=</span><span class="st">"data/serodata_new.csv"</span>, <span class="at">row.names =</span> <span class="cn">FALSE</span>) <span class="co">#comma delimited</span></span>
+<span id="cb27-2"><a href="#cb27-2"></a><span class="fu">write.table</span>(df, <span class="at">file=</span><span class="st">"data/serodata1_new.txt"</span>, <span class="at">sep=</span><span class="st">"</span><span class="sc">\t</span><span class="st">"</span>, <span class="at">row.names =</span> <span class="cn">FALSE</span>) <span class="co">#tab delimited</span></span>
+<span id="cb27-3"><a href="#cb27-3"></a><span class="fu">write.table</span>(df, <span class="at">file=</span><span class="st">"data/serodata2_new.txt"</span>, <span class="at">sep=</span><span class="st">";"</span>, <span class="at">row.names =</span> <span class="cn">FALSE</span>) <span class="co">#semicolon delimited</span></span></code><button title="Copy to Clipboard" class="code-copy-button"><i class="bi"></i></button></pre></div>
 </div>
 <p>Note, I wrote the data to new file names. Even though we didn’t change the data at all in this module, it is good practice to keep raw data raw, and not to write over it.</p>
 </section>
 <section id="r-.rds-and-.rdardata-files" class="slide level2">
 <h2>R .rds and .rda/RData files</h2>
 <p>There are two file extensions worth discussing.</p>
-<p>R has two native data formats—Rdata (sometimes shortened to Rda) and Rds. These formats are used when R objects are saved for later use. Rdata is used to save multiple R objects, while Rds is used to save a single R object.</p>
+<p>R has two native data formats—‘Rdata’ (sometimes shortened to ‘Rda’) and ‘Rds’. These formats are used when R objects are saved for later use. ‘Rdata’ is used to save multiple R objects, while ‘Rds’ is used to save a single R object. ‘Rds’ is fast to write/read and is very small.</p>
 </section>
 <section id="rds-binary-file" class="slide level2">
 <h2>.rds binary file</h2>
@@ -1390,20 +1345,21 @@ <h2>.rds binary file</h2>
 <section id="rdardata-files" class="slide level2">
 <h2>.rda/RData files</h2>
 <p>The Base R functions <code>save()</code> and <code>load()</code> can be used to save and load multiple R objects.</p>
-<p><code>save()</code> writes an external representation of R objects to the specified file, and can by loaded back into the environment using <code>load()</code>. A nice feature about using <code>save</code> and <code>load</code> is that the R object is directly imported into the environment and you don’t have to assign it to an object. The files can be saved as <code>.RData</code> or <code>.rda</code> files.</p>
+<p><code>save()</code> writes an external representation of R objects to the specified file, which can be loaded back into the environment using <code>load()</code>. A nice feature of <code>save()</code> and <code>load()</code> is that the R objects are imported directly into the environment under their original names, so you don’t have to assign them to anything. The files can be saved as <code>.RData</code> or <code>.Rda</code> files.</p>
+<p>Function signature</p>
 <pre><code>save(object1, object2, file = "filename.RData")
 load("filename.RData")</code></pre>
-<p>Note, that when you read .RData files you don’t need to assign it to an abjecct. It simply reads in the objects as they were saved. Therefore, <code>load("filename.RData")</code> will read in <code>object1</code> and <code>object2</code> directly into the Global Environment.</p>
+<p>Note that you separate the objects you want to save with commas.</p>
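+<p>A minimal round-trip sketch, assuming two small made-up objects and a temporary file (the values and the file location are hypothetical and used only for illustration):</p>
+<pre><code># Hypothetical objects used only for illustration
+object1 &lt;- c(1, 2, 3)
+object2 &lt;- "hello"
+
+# Save both objects into a single .RData file
+rdata_file &lt;- tempfile(fileext = ".RData")
+save(object1, object2, file = rdata_file)
+
+# Remove them from the environment, then load them back
+rm(object1, object2)
+load(rdata_file)
+
+# Both objects are restored under their original names
+print(object1)
+print(object2)</code></pre>
+<p>Because <code>load()</code> restores objects under their saved names, it will silently overwrite any existing objects with the same names in your environment.</p>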
 </section>
 <section id="summary" class="slide level2">
 <h2>Summary</h2>
 <ul>
-<li>Importing or ‘Reading in’ data is the first step of any real project/analysis</li>
-<li>The Base R ‘util’ package we can find a handful of useful functions including <code>read.csv()</code> and <code>read.delim()</code> to importing/reading data or <code>write.csv()</code> and <code>write.table()</code> for exporti/writing data</li>
-<li>When importing data (exception is object from .RData), you must assign it to an object, otherwise it cannot be called/used</li>
-<li>Properly read data can be found in the Environment window of RStudio</li>
-<li>You only need to install a package once (unless you update R), but you will need to call or load a package each time you want to use it.</li>
-<li>To complete a tasek you don’t know how to do (e.g., reading in an excel data file) use the following steps: 1. Google / Ask ChatGPT, 2. Find and vet function and package you want, 3. Install package, 4. Call package, 5. Use function</li>
+<li>Importing or ‘Reading in’ data is the first step of any real project / data analysis</li>
+<li>The Base R ‘utils’ package has useful functions, including <code>read.csv()</code> and <code>read.delim()</code> for importing/reading data and <code>write.csv()</code> and <code>write.table()</code> for exporting/writing data</li>
+<li>When importing data (exception is object from .RData), you must assign it to an object, otherwise it cannot be used</li>
+<li>If data are imported correctly, they can be found in the Environment pane of RStudio</li>
+<li>You only need to install a package once (unless you update R or the package), but you will need to attach a package each time you want to use it.</li>
+<li>To complete a task you don’t know how to do (e.g., reading in an excel data file) use the following steps: 1. Google / Ask ChatGPT, 2. Find and vet function and package you want, 3. Install package, 4. Attach package, 5. Use function</li>
 </ul>
 </section>
 <section id="acknowledgements" class="slide level2">
@@ -1413,10 +1369,8 @@ <h2>Acknowledgements</h2>
 <li><a href="https://jhudatascience.org/intro_to_r/">“Introduction to R for Public Health Researchers” Johns Hopkins University</a></li>
 </ul>
 
-<div class="quarto-auto-generated-content">
 <div class="footer footer-default">
 
-</div>
 </div>
 </section>
     </div>
@@ -1445,6 +1399,7 @@ <h2>Acknowledgements</h2>
       Reveal.initialize({
 'controlsAuto': true,
 'previewLinksAuto': false,
+'smaller': true,
 'pdfSeparateFragments': false,
 'autoAnimateEasing': "ease",
 'autoAnimateDuration': 1,
@@ -1699,7 +1654,18 @@ <h2>Acknowledgements</h2>
         }
         return false;
       }
-      const onCopySuccess = function(e) {
+      const clipboard = new window.ClipboardJS('.code-copy-button', {
+        text: function(trigger) {
+          const codeEl = trigger.previousElementSibling.cloneNode(true);
+          for (const childEl of codeEl.children) {
+            if (isCodeAnnotation(childEl)) {
+              childEl.remove();
+            }
+          }
+          return codeEl.innerText;
+        }
+      });
+      clipboard.on('success', function(e) {
         // button target
         const button = e.trigger;
         // don't keep focus
@@ -1731,50 +1697,11 @@ <h2>Acknowledgements</h2>
         }, 1000);
         // clear code selection
         e.clearSelection();
-      }
-      const getTextToCopy = function(trigger) {
-          const codeEl = trigger.previousElementSibling.cloneNode(true);
-          for (const childEl of codeEl.children) {
-            if (isCodeAnnotation(childEl)) {
-              childEl.remove();
-            }
-          }
-          return codeEl.innerText;
-      }
-      const clipboard = new window.ClipboardJS('.code-copy-button:not([data-in-quarto-modal])', {
-        text: getTextToCopy
       });
-      clipboard.on('success', onCopySuccess);
-      if (window.document.getElementById('quarto-embedded-source-code-modal')) {
-        // For code content inside modals, clipBoardJS needs to be initialized with a container option
-        // TODO: Check when it could be a function (https://github.com/zenorocha/clipboard.js/issues/860)
-        const clipboardModal = new window.ClipboardJS('.code-copy-button[data-in-quarto-modal]', {
-          text: getTextToCopy,
-          container: window.document.getElementById('quarto-embedded-source-code-modal')
-        });
-        clipboardModal.on('success', onCopySuccess);
-      }
-        var localhostRegex = new RegExp(/^(?:http|https):\/\/localhost\:?[0-9]*\//);
-        var mailtoRegex = new RegExp(/^mailto:/);
-          var filterRegex = new RegExp('/' + window.location.host + '/');
-        var isInternal = (href) => {
-            return filterRegex.test(href) || localhostRegex.test(href) || mailtoRegex.test(href);
-        }
-        // Inspect non-navigation links and adorn them if external
-     	var links = window.document.querySelectorAll('a[href]:not(.nav-link):not(.navbar-brand):not(.toc-action):not(.sidebar-link):not(.sidebar-item-toggle):not(.pagination-link):not(.no-external):not([aria-hidden]):not(.dropdown-item):not(.quarto-navigation-tool):not(.about-link)');
-        for (var i=0; i<links.length; i++) {
-          const link = links[i];
-          if (!isInternal(link.href)) {
-            // undo the damage that might have been done by quarto-nav.js in the case of
-            // links that we want to consider external
-            if (link.dataset.originalHref !== undefined) {
-              link.href = link.dataset.originalHref;
-            }
-          }
-        }
-      function tippyHover(el, contentFn, onTriggerFn, onUntriggerFn) {
+      function tippyHover(el, contentFn) {
         const config = {
           allowHTML: true,
+          content: contentFn,
           maxWidth: 500,
           delay: 100,
           arrow: false,
@@ -1784,17 +1711,8 @@ <h2>Acknowledgements</h2>
           interactive: true,
           interactiveBorder: 10,
           theme: 'light-border',
-          placement: 'bottom-start',
+          placement: 'bottom-start'
         };
-        if (contentFn) {
-          config.content = contentFn;
-        }
-        if (onTriggerFn) {
-          config.onTrigger = onTriggerFn;
-        }
-        if (onUntriggerFn) {
-          config.onUntrigger = onUntriggerFn;
-        }
           config['offset'] = [0,0];
           config['maxWidth'] = 700;
         window.tippy(el, config); 
@@ -1808,11 +1726,7 @@ <h2>Acknowledgements</h2>
           try { href = new URL(href).hash; } catch {}
           const id = href.replace(/^#\/?/, "");
           const note = window.document.getElementById(id);
-          if (note) {
-            return note.innerHTML;
-          } else {
-            return "";
-          }
+          return note.innerHTML;
         });
       }
       const findCites = (el) => {
diff --git a/docs/modules/Module06-DataSubset.html b/docs/modules/Module06-DataSubset.html
index eba0b1f..d141d1a 100644
--- a/docs/modules/Module06-DataSubset.html
+++ b/docs/modules/Module06-DataSubset.html
@@ -483,11 +483,11 @@ <h2>Quick summary of data</h2>
 </section>
 <section id="description-of-data" class="slide level2">
 <h2>Description of data</h2>
-<p>This is data based on a simulated pathogen X IgG antibody serological survey. The rows represent individuals. Variables include IgG concentrations in IU/mL, age in years, gender, and residence based on slum characterization. We will use this dataset for lectures throughout the Workshop.</p>
+<p>This is data based on a simulated pathogen X IgG antibody serological survey. The rows represent individuals. Variables include IgG concentrations in IU/mL, age in years, gender, and residence based on slum characterization. We will use this dataset for modules throughout the Workshop.</p>
 </section>
 <section id="view-the-data-as-a-whole-dataframe" class="slide level2">
 <h2>View the data as a whole dataframe</h2>
-<p>The <code>View()</code> function, one of the few Base R functions with a capital letter can be used to open a new tab in the Console and view the data as you would in excel.</p>
+<p>The <code>View()</code> function, one of the few Base R functions with a capital letter, can be used to open a new tab in RStudio and view the data as you would in Excel.</p>
 <div class="cell">
 <div class="sourceCode cell-code" id="cb14"><pre class="sourceCode numberSource r number-lines code-with-copy"><code class="sourceCode r"><span id="cb14-1"><a href="#cb14-1"></a><span class="fu">View</span>(df)</span></code><button title="Copy to Clipboard" class="code-copy-button"><i class="bi"></i></button></pre></div>
 </div>
@@ -495,12 +495,13 @@ <h2>View the data as a whole dataframe</h2>
 <img data-src="images/ViewTab.png" style="width:100.0%" class="r-stretch"></section>
 <section id="view-the-data-as-a-whole-dataframe-1" class="slide level2">
 <h2>View the data as a whole dataframe</h2>
-<p>You can also open a new tab of the data by clicking on the data icon beside the object in the Environment window.</p>
+<p>You can also open a new tab of the data by clicking on the data icon beside the object in the Environment pane.</p>
 
-<img data-src="images/View.png" style="width:90.0%" class="r-stretch"></section>
+<img data-src="images/View.png" style="width:90.0%" class="r-stretch"><p>You can also hold down <code>Cmd</code> or <code>CTRL</code> and click on the name of a data frame in your code.</p>
+</section>
 <section id="indexing" class="slide level2">
 <h2>Indexing</h2>
-<p>R contains several constructs which allow access to individual elements or subsets through indexing operations. Indexing can be used both to extract part of an object and to replace parts of an object (or to add parts). There are three basic indexing syntax: <code>[ ]</code>, <code>[[ ]]</code> and <code>$</code>.</p>
+<p>R contains several operators which allow access to individual elements or subsets through indexing. Indexing can be used both to extract part of an object and to replace parts of an object (or to add parts). There are three basic indexing operators: <code>[</code>, <code>[[</code> and <code>$</code>.</p>
 <div class="cell">
 <div class="sourceCode cell-code" id="cb15"><pre class="sourceCode numberSource r number-lines code-with-copy"><code class="sourceCode r"><span id="cb15-1"><a href="#cb15-1"></a>x[i] <span class="co">#if x is a vector</span></span>
 <span id="cb15-2"><a href="#cb15-2"></a>x[i, j] <span class="co">#if x is a matrix/data frame</span></span>
@@ -511,7 +512,7 @@ <h2>Indexing</h2>
 </section>
 <section id="vectors-and-multi-dimensional-objects" class="slide level2">
 <h2>Vectors and multi-dimensional objects</h2>
-<p>To index a vector, <code>vector[i]</code> select the ith element. To index a multi-dimensional objects such as a matrix, <code>matrix[i, j]</code> selects the element in row i and column j, where as in a three dimensional <code>array[k, i, i, j]</code> selects the element in matrix k, row i, and column j.</p>
+<p>To index a vector, <code>vector[i]</code> selects the ith element. To index a multi-dimensional object such as a matrix, <code>matrix[i, j]</code> selects the element in row i and column j, whereas in a three-dimensional array, <code>array[i, j, k]</code> selects the element in row i, column j, and matrix k.</p>
 <p>Let’s practice by first creating the same objects as we did in Module 1.</p>
 <div class="cell">
 <div class="sourceCode cell-code" id="cb16"><pre class="sourceCode numberSource r number-lines code-with-copy"><code class="sourceCode r"><span id="cb16-1"><a href="#cb16-1"></a>number.object <span class="ot">&lt;-</span> <span class="dv">3</span></span>
@@ -533,7 +534,7 @@ <h2>Vectors and multi-dimensional objects</h2>
 [2,]    4    5</code></pre>
 </div>
 </div>
-<p>Finally, let’s use indexing to pull our elements of the objects.</p>
+<p>Finally, let’s use indexing to pull out elements of the objects.</p>
 <div class="cell">
 <div class="sourceCode cell-code" id="cb21"><pre class="sourceCode numberSource r number-lines code-with-copy"><code class="sourceCode r"><span id="cb21-1"><a href="#cb21-1"></a>vector.object1[<span class="dv">2</span>] <span class="co">#pulling the second element</span></span></code><button title="Copy to Clipboard" class="code-copy-button"><i class="bi"></i></button></pre></div>
 <div class="cell-output cell-output-stdout">
@@ -574,158 +575,37 @@ <h2>List objects</h2>
 [2,]    4    5</code></pre>
 </div>
 </div>
-</section>
-<section id="for-indexing" class="slide level2">
-<h2>$ for indexing</h2>
-<p><code>$</code> allows only a literal character string or a symbol as the index.</p>
+<p>What happens if we use a single square bracket?</p>
 <div class="cell">
-<div class="sourceCode cell-code" id="cb29"><pre class="sourceCode numberSource r number-lines code-with-copy"><code class="sourceCode r"><span id="cb29-1"><a href="#cb29-1"></a>df<span class="sc">$</span>IgG_concentration</span></code><button title="Copy to Clipboard" class="code-copy-button"><i class="bi"></i></button></pre></div>
+<div class="sourceCode cell-code" id="cb29"><pre class="sourceCode numberSource r number-lines code-with-copy"><code class="sourceCode r"><span id="cb29-1"><a href="#cb29-1"></a>list.object[<span class="dv">3</span>]</span></code><button title="Copy to Clipboard" class="code-copy-button"><i class="bi"></i></button></pre></div>
 <div class="cell-output cell-output-stdout">
-<pre><code>  [1] 3.176895e-01 3.436823e+00 3.000000e-01 1.432363e+02 4.476534e-01
-  [6] 2.527076e-02 6.101083e-01 3.000000e-01 2.916968e+00 1.649819e+00
- [11] 4.574007e+00 1.583904e+02           NA 1.065068e+02 1.113870e+02
- [16] 4.144893e+01 3.000000e-01 2.527076e-01 8.159247e+01 1.825342e+02
- [21] 4.244656e+01 1.193493e+02 3.000000e-01 3.000000e-01 9.025271e-01
- [26] 3.501805e-01 3.000000e-01 1.227437e+00 1.702055e+02 3.000000e-01
- [31] 4.801444e-01 2.527076e-02 3.000000e-01 5.776173e-02 4.801444e-01
- [36] 3.826715e-01 3.000000e-01 4.048558e+02 3.000000e-01 5.451264e-01
- [41] 3.000000e-01 5.590753e+01 2.202166e-01 1.709760e+02 1.227437e+00
- [46] 4.567527e+02 4.838480e+01 1.227437e-01 1.877256e-01 3.000000e-01
- [51] 3.501805e-01 3.339350e+00 3.000000e-01 5.451264e-01           NA
- [56] 2.104693e+00           NA 3.826715e-01 3.926366e+01 1.129964e+00
- [61] 3.501805e+00 7.542808e+01 4.800475e+01 1.000000e+00 4.068884e+01
- [66] 3.000000e-01 4.377672e+01 1.193493e+02 6.977740e+01 1.373288e+02
- [71] 1.642979e+02           NA 1.542808e+02 6.033058e-01 2.809917e-01
- [76] 1.966942e+00 2.041322e+00 2.115702e+00 4.663043e+02 3.000000e-01
- [81] 1.500796e+02 1.543790e+02 2.561983e-01 1.596338e+02 1.732484e+02
- [86] 4.641304e+02 3.736364e+01 1.572452e+02 3.000000e-01 3.000000e-01
- [91] 8.264463e-02 6.776859e-01 7.272727e-01 2.066116e-01 1.966942e+00
- [96] 3.000000e-01 3.000000e-01 2.809917e-01 8.016529e-01 1.818182e-01
-[101] 1.818182e-01 8.264463e-02 3.422727e+01 8.743506e+00 3.000000e-01
-[106] 1.641720e+02 4.049587e-01 1.001592e+02 4.489130e+02 1.101911e+02
-[111] 4.440909e+01 1.288217e+02 2.840909e+01 1.003981e+02 8.512397e-01
-[116] 1.322314e-01 1.297521e+00 1.570248e-01 1.966942e+00 1.536624e+02
-[121] 3.000000e-01 3.000000e-01 1.074380e+00 1.099174e+00 3.057851e-01
-[126] 3.000000e-01 5.785124e-02 4.391304e+02 6.130435e+02 1.074380e-01
-[131] 7.125796e+01 4.222727e+01 1.620223e+02 3.750000e+01 1.534236e+02
-[136] 6.239130e+02 5.521739e+02 5.785124e-02 6.547945e-01 8.767123e-02
-[141] 3.000000e-01 2.849315e+00 3.835616e-02 2.849315e-01 4.649315e+00
-[146] 1.369863e-01 3.589041e-01 1.049315e+00 4.668998e+01 1.473510e+02
-[151] 4.589744e+01 2.109589e-01 1.741722e+02 2.496503e+01 1.850993e+02
-[156] 1.863014e-01 1.863014e-01 4.589744e+01 1.942881e+02 5.079646e+02
-[161] 8.767123e-01 2.750685e+00 1.503311e+02 3.000000e-01 3.095890e-01
-[166] 3.000000e-01 6.371681e+02 6.054795e-01 1.955298e+02 1.786424e+02
-[171] 1.120861e+02 1.331954e+02 2.159292e+02 5.628319e+02 1.900662e+02
-[176] 6.547945e-01 1.665753e+00 1.739238e+02 9.991722e+01 9.321192e+01
-[181] 8.767123e-02           NA 6.794521e-01 5.808219e-01 1.369863e-01
-[186] 2.060274e+00 1.610099e+02 4.082192e-01 8.273973e-01 4.601770e+02
-[191] 1.389073e+02 3.867133e+01 9.260274e-01 5.918874e+01 1.870861e+02
-[196] 4.328767e-01 6.301370e-02 3.000000e-01 1.548013e+02 5.819536e+01
-[201] 1.724338e+02 1.932401e+01 2.164420e+00 9.757412e-01 1.509434e-01
-[206] 1.509434e-01 7.766571e+01 4.319563e+01 1.752022e-01 3.094775e+01
-[211] 1.266846e-01 2.919806e+01 9.545455e+00 2.735115e+01 1.314841e+02
-[216] 3.643985e+01 1.498559e+02 9.363636e+00 2.479784e-01 5.390836e-02
-[221] 8.787062e-01 1.994609e-01 3.000000e-01 3.000000e-01 5.390836e-03
-[226] 4.177898e-01 3.000000e-01 2.479784e-01 2.964960e-02 2.964960e-01
-[231] 5.148248e+00 1.994609e-01 3.000000e-01 1.779539e+02 3.290210e+02
-[236] 3.000000e-01 1.809798e+02 4.905660e-01 1.266846e-01 1.543948e+02
-[241] 1.379683e+02 6.153846e+02 1.474784e+02 3.000000e-01 1.024259e+00
-[246] 4.444056e+02 3.000000e-01 2.504043e+00 3.000000e-01 3.000000e-01
-[251] 7.816712e-02 3.000000e-01 5.390836e-02 1.494236e+02 5.972622e+01
-[256] 6.361186e-01 1.837896e+02 1.320809e+02 1.571906e-01 1.520231e+02
-[261] 3.000000e-01 3.000000e-01 1.823699e+02 3.000000e-01 2.173913e+00
-[266] 2.142202e+01 3.000000e-01 3.408027e+00 4.155963e+01 9.698997e-02
-[271] 1.238532e+01 9.528926e+00 1.916185e+02 1.060201e+00 3.679104e+02
-[276] 4.288991e+01 9.971098e+01 3.000000e-01 1.208092e+02 3.000000e-01
-[281] 6.688963e-03 2.505017e+00 1.481605e+00 3.000000e-01 5.183946e-01
-[286] 3.000000e-01 1.872910e-01 3.678930e-01 3.000000e-01 4.529851e+02
-[291] 3.169725e+01 3.000000e-01 4.922018e+01 2.548507e+02 1.661850e+02
-[296] 9.164179e+02 3.678930e-01 1.236994e+02 6.705202e+01 3.834862e+01
-[301] 1.963211e+00 3.000000e-01 2.474916e-01 3.000000e-01 2.173913e-01
-[306] 8.193980e-01 2.444816e+00 3.000000e-01 1.571906e-01 1.849711e+02
-[311] 6.119403e+02 3.000000e-01 4.280936e-01 9.698997e-02 3.678930e-02
-[316] 4.832090e+02 1.390173e+02 3.000000e-01 6.555970e+02 1.526012e+02
-[321] 3.000000e-01 7.222222e-01 7.724426e+01 3.000000e-01 6.111111e-01
-[326] 1.555556e+00 3.055556e-01 1.500000e+00 1.470772e+02 1.694444e+00
-[331] 3.138298e+02 1.414405e+02 1.990605e+02 4.212766e+02 3.000000e-01
-[336] 3.000000e-01 6.478723e+02 3.000000e-01 2.222222e+00 3.000000e-01
-[341] 2.055556e+00 2.777778e-02 8.333333e-02 1.032359e+02 1.611111e+00
-[346] 8.333333e-02 2.333333e+00 5.755319e+02 1.686848e+02 1.111111e-01
-[351] 3.000000e-01 8.372340e+02 3.000000e-01 3.784504e+01 3.819149e+02
-[356] 5.555556e-02 3.000000e+02 1.855950e+02 1.944444e-01 3.000000e-01
-[361] 5.555556e-02 1.138889e+00 4.254237e+01 3.000000e-01 3.000000e-01
-[366] 3.000000e-01 3.000000e-01 3.138298e+02 1.235908e+02 4.159574e+02
-[371] 3.009685e+01 1.567850e+02 1.367432e+02 3.731235e+01 9.164927e+01
-[376] 2.936170e+02 8.820459e+01 1.035491e+02 7.379958e+01 3.000000e-01
-[381] 1.718750e+02 2.128527e+00 1.253918e+00 2.382445e-01 4.639498e-01
-[386] 1.253918e-01 1.253918e-01 3.000000e-01 1.000000e+00 1.570043e+02
-[391] 4.344086e+02 2.184953e+00 1.507837e+00 3.228840e-01 4.588024e+01
-[396] 1.660560e+02 3.000000e-01 3.043011e+02 2.612903e+02 1.621767e+02
-[401] 3.228840e-01 4.639498e-01 2.495298e+00 3.257053e+00 3.793103e-01
-[406]           NA 6.896552e-02 3.000000e-01 1.423197e+00 3.000000e-01
-[411] 3.000000e-01 1.786638e+02 3.279570e+02           NA 1.903017e+02
-[416] 1.654095e+02 4.639498e-01 1.815733e+02 1.366771e+00 1.536050e-01
-[421] 1.306587e+01 2.129032e+02 1.925647e+02 3.000000e-01 1.028213e+00
-[426] 3.793103e-01 8.025078e-01 4.860215e+02 3.000000e-01 2.100313e-01
-[431] 2.767665e+01 1.592476e+00 9.717868e-02 1.028213e+00 3.793103e-01
-[436] 1.292026e+02 4.425150e+01 3.193548e+02 1.860991e+02 6.614420e-01
-[441] 5.203762e-01 1.330819e+02 1.673491e+02 3.000000e-01 1.117457e+02
-[446] 3.045509e+01 3.000000e-01 8.280255e-02 3.000000e-01 1.200637e+00
-[451] 1.687898e-01 7.367273e+02 8.280255e-02 5.127389e-01 1.974522e-01
-[456] 7.993631e-01 3.000000e-01 3.298182e+02 9.736842e+01 3.000000e-01
-[461] 3.000000e-01 4.214545e+02 3.000000e-01 2.578182e+02 2.261147e-01
-[466] 3.000000e-01 1.883901e+02 9.458204e+01 3.000000e-01 3.000000e-01
-[471] 7.707006e-01 5.032727e+02 1.544586e+00 1.431115e+02 3.000000e-01
-[476] 1.458599e+00 1.247678e+02           NA 4.334545e+02 3.000000e-01
-[481] 6.156364e+02 9.574303e+01 1.928019e+02 1.888545e+02 1.598297e+02
-[486] 5.127389e-01 1.171053e+02           NA 2.547771e-02 1.707430e+02
-[491] 3.000000e-01 1.869969e+02 4.731481e+01 1.988390e+02 3.000000e-01
-[496] 8.808050e+01 2.003185e+00 3.000000e-01 3.509259e+01 9.365325e+01
-[501] 3.000000e-01 3.736111e+01 1.674923e+02 8.808050e+01 1.656347e+02
-[506] 3.722222e+01 6.756364e+02 3.000000e-01 1.698142e+02 1.628483e+02
-[511] 5.985130e-01 1.903346e+00 3.000000e-01 3.000000e-01 8.996283e-01
-[516] 3.977695e-01 3.000000e-01 3.000000e-01 3.000000e-01 3.000000e-01
-[521] 7.446809e+02 6.095745e+02 1.427445e+02 3.000000e-01 2.973978e-02
-[526] 3.977695e-01 4.095745e+02 4.595745e+02 3.000000e-01 1.976341e+02
-[531] 3.776596e+02 1.777603e+02 4.312268e-01 6.765957e+02 7.978723e+02
-[536] 9.665427e-02 1.879338e+02 4.358670e+01 3.000000e-01 3.000000e-01
-[541] 2.638955e+01 3.180523e+01 1.746845e+02 1.876972e+02 1.044164e+02
-[546] 1.202681e+02 1.630915e+02 1.276025e+02 8.880126e+01 3.563830e+02
-[551] 2.212766e+02 1.969121e+01 3.755319e+02 1.214511e+02 1.034700e+02
-[556] 3.000000e-01 3.643123e-01 6.319703e-02 3.000000e-01 3.000000e-01
-[561] 3.000000e-01 3.000000e-01 3.000000e-01 3.000000e-01 3.000000e-01
-[566] 3.000000e-01 1.664038e+02 2.946809e+02 4.391924e+01 1.874606e+02
-[571] 1.143533e+02 1.600158e+02 1.635688e-01 8.809148e+01 1.337539e+02
-[576] 1.985804e+02 1.578864e+02 3.000000e-01 3.000000e-01 1.953642e-01
-[581] 1.119205e+00 2.523636e+02 3.000000e-01 4.844371e+00 3.000000e-01
-[586] 1.492553e+02 1.993617e+02 2.847682e-01 3.145695e-01 3.000000e-01
-[591] 3.406429e+01 6.595745e+01 3.000000e-01 2.174545e+02           NA
-[596] 5.957447e+01 7.236364e+02 3.000000e-01 3.000000e-01 3.000000e-01
-[601] 2.676364e+02 1.891489e+02 3.036364e+02 3.000000e-01 3.000000e-01
-[606] 3.000000e-01 3.000000e-01 3.000000e-01 1.447020e+00 2.130909e+02
-[611] 1.357616e-01 3.000000e-01 3.000000e-01 5.534545e+02 1.891489e+02
-[616] 7.202128e+01 3.250287e+01 1.655629e-02 3.123636e+02 3.000000e-01
-[621] 7.138298e+01 3.000000e-01 6.946809e+01 4.012629e+01 1.629787e+02
-[626] 1.508511e+02 1.655629e-02 3.000000e-01 4.635762e-02 3.000000e-01
-[631] 3.000000e-01 3.000000e-01 1.942553e+02 3.690909e+02 3.000000e-01
-[636] 3.000000e-01 2.847682e+00 1.435106e+02 3.000000e-01 4.752009e+01
-[641] 2.621125e+01 1.055319e+02 3.000000e-01 1.149007e+00 2.927273e+02
-[646] 3.000000e-01 3.000000e-01 4.839265e+01 3.000000e-01 3.000000e-01
-[651] 2.251656e-01</code></pre>
-</div>
-</div>
-<p>Note, if you have spaces in your variable name, you will need to use back ticks <code>variable name</code> after the <code>$</code>. This is a good reason to not create variables / column names with spaces.</p>
+<pre><code>[[1]]
+     [,1] [,2]
+[1,]    2    3
+[2,]    4    5</code></pre>
+</div>
+</div>
+<p>The <code>[[</code> operator is called the “extract” operator and returns the element itself from the list. The <code>[</code> operator is called the “subset” operator and returns a subset of the list, which is still a list.</p>
+</section>
+<section id="for-indexing-for-data-frame" class="slide level2">
+<h2>$ for indexing with data frames</h2>
+<p><code>$</code> allows only a literal character string or a symbol as the index. For a data frame it extracts a variable.</p>
+<div class="cell">
+<div class="sourceCode cell-code" id="cb31"><pre class="sourceCode numberSource r number-lines code-with-copy"><code class="sourceCode r"><span id="cb31-1"><a href="#cb31-1"></a>df<span class="sc">$</span>IgG_concentration</span></code><button title="Copy to Clipboard" class="code-copy-button"><i class="bi"></i></button></pre></div>
+</div>
+<p>Note: if your variable name contains spaces, you will need to wrap it in backticks after the <code>$</code>, for example <code>df$`variable name`</code>. This is a good reason not to create variable or column names with spaces.</p>
 </section>
 <section id="for-indexing-with-lists" class="slide level2">
 <h2>$ for indexing with lists</h2>
+<p><code>$</code> allows only a literal character string or a symbol as the index. For a list it extracts a named element.</p>
 <p>List elements can be named</p>
 <div class="cell">
-<div class="sourceCode cell-code" id="cb31"><pre class="sourceCode numberSource r number-lines code-with-copy"><code class="sourceCode r"><span id="cb31-1"><a href="#cb31-1"></a>list.object.named <span class="ot">&lt;-</span> <span class="fu">list</span>(</span>
-<span id="cb31-2"><a href="#cb31-2"></a>  <span class="at">emory =</span> number.object,</span>
-<span id="cb31-3"><a href="#cb31-3"></a>  <span class="at">uga =</span> vector.object2,</span>
-<span id="cb31-4"><a href="#cb31-4"></a>  <span class="at">gsu =</span> matrix.object</span>
-<span id="cb31-5"><a href="#cb31-5"></a>)</span>
-<span id="cb31-6"><a href="#cb31-6"></a>list.object.named</span></code><button title="Copy to Clipboard" class="code-copy-button"><i class="bi"></i></button></pre></div>
+<div class="sourceCode cell-code" id="cb32"><pre class="sourceCode numberSource r number-lines code-with-copy"><code class="sourceCode r"><span id="cb32-1"><a href="#cb32-1"></a>list.object.named <span class="ot">&lt;-</span> <span class="fu">list</span>(</span>
+<span id="cb32-2"><a href="#cb32-2"></a>  <span class="at">emory =</span> number.object,</span>
+<span id="cb32-3"><a href="#cb32-3"></a>  <span class="at">uga =</span> vector.object2,</span>
+<span id="cb32-4"><a href="#cb32-4"></a>  <span class="at">gsu =</span> matrix.object</span>
+<span id="cb32-5"><a href="#cb32-5"></a>)</span>
+<span id="cb32-6"><a href="#cb32-6"></a>list.object.named</span></code><button title="Copy to Clipboard" class="code-copy-button"><i class="bi"></i></button></pre></div>
 <div class="cell-output cell-output-stdout">
 <pre><code>$emory
 [1] 3
@@ -739,13 +619,13 @@ <h2>$ for indexing with lists</h2>
 [2,]    4    5</code></pre>
 </div>
 </div>
-<p>If list elements are named, than you can reference data from list using <code>$</code> or using double square brackets, <code>[[ ]]</code></p>
+<p>If list elements are named, then you can reference data from the list using <code>$</code> or using double square brackets, <code>[[</code>.</p>
 <div class="cell">
-<div class="sourceCode cell-code" id="cb33"><pre class="sourceCode numberSource r number-lines code-with-copy"><code class="sourceCode r"><span id="cb33-1"><a href="#cb33-1"></a>list.object.named<span class="sc">$</span>uga </span></code><button title="Copy to Clipboard" class="code-copy-button"><i class="bi"></i></button></pre></div>
+<div class="sourceCode cell-code" id="cb34"><pre class="sourceCode numberSource r number-lines code-with-copy"><code class="sourceCode r"><span id="cb34-1"><a href="#cb34-1"></a>list.object.named<span class="sc">$</span>uga </span></code><button title="Copy to Clipboard" class="code-copy-button"><i class="bi"></i></button></pre></div>
 <div class="cell-output cell-output-stdout">
 <pre><code>[1] "blue"   "red"    "yellow"</code></pre>
 </div>
-<div class="sourceCode cell-code" id="cb35"><pre class="sourceCode numberSource r number-lines code-with-copy"><code class="sourceCode r"><span id="cb35-1"><a href="#cb35-1"></a>list.object.named[[<span class="st">"uga"</span>]] </span></code><button title="Copy to Clipboard" class="code-copy-button"><i class="bi"></i></button></pre></div>
+<div class="sourceCode cell-code" id="cb36"><pre class="sourceCode numberSource r number-lines code-with-copy"><code class="sourceCode r"><span id="cb36-1"><a href="#cb36-1"></a>list.object.named[[<span class="st">"uga"</span>]] </span></code><button title="Copy to Clipboard" class="code-copy-button"><i class="bi"></i></button></pre></div>
 <div class="cell-output cell-output-stdout">
 <pre><code>[1] "blue"   "red"    "yellow"</code></pre>
 </div>
@@ -755,1633 +635,716 @@ <h2>$ for indexing with lists</h2>
 <h2>Using indexing to rename columns</h2>
 <p>As mentioned above, indexing can be used both to extract part of an object and to replace parts of an object (or to add parts).</p>
 <div class="cell">
-<div class="sourceCode cell-code" id="cb37"><pre class="sourceCode numberSource r number-lines code-with-copy"><code class="sourceCode r"><span id="cb37-1"><a href="#cb37-1"></a><span class="fu">colnames</span>(df) <span class="co"># just prints</span></span></code><button title="Copy to Clipboard" class="code-copy-button"><i class="bi"></i></button></pre></div>
+<div class="sourceCode cell-code" id="cb38"><pre class="sourceCode numberSource r number-lines code-with-copy"><code class="sourceCode r"><span id="cb38-1"><a href="#cb38-1"></a><span class="fu">colnames</span>(df) </span></code><button title="Copy to Clipboard" class="code-copy-button"><i class="bi"></i></button></pre></div>
 <div class="cell-output cell-output-stdout">
 <pre><code>[1] "observation_id"    "IgG_concentration" "age"              
 [4] "gender"            "slum"             </code></pre>
 </div>
-<div class="sourceCode cell-code" id="cb39"><pre class="sourceCode numberSource r number-lines code-with-copy"><code class="sourceCode r"><span id="cb39-1"><a href="#cb39-1"></a><span class="fu">colnames</span>(df)[<span class="dv">1</span><span class="sc">:</span><span class="dv">2</span>] <span class="ot">&lt;-</span> <span class="fu">c</span>(<span class="st">"IgG_concentration_mIU/mL"</span>, <span class="st">"age_year"</span>) <span class="co"># reassigns</span></span>
-<span id="cb39-2"><a href="#cb39-2"></a><span class="fu">colnames</span>(df)</span></code><button title="Copy to Clipboard" class="code-copy-button"><i class="bi"></i></button></pre></div>
+<div class="sourceCode cell-code" id="cb40"><pre class="sourceCode numberSource r number-lines code-with-copy"><code class="sourceCode r"><span id="cb40-1"><a href="#cb40-1"></a><span class="fu">colnames</span>(df)[<span class="dv">2</span><span class="sc">:</span><span class="dv">3</span>] <span class="ot">&lt;-</span> <span class="fu">c</span>(<span class="st">"IgG_concentration_IU/mL"</span>, <span class="st">"age_year"</span>) <span class="co"># reassigns</span></span>
+<span id="cb40-2"><a href="#cb40-2"></a><span class="fu">colnames</span>(df)</span></code><button title="Copy to Clipboard" class="code-copy-button"><i class="bi"></i></button></pre></div>
 <div class="cell-output cell-output-stdout">
-<pre><code>[1] "IgG_concentration_mIU/mL" "age_year"                
-[3] "age"                      "gender"                  
-[5] "slum"                    </code></pre>
+<pre><code>[1] "observation_id"          "IgG_concentration_IU/mL"
+[3] "age_year"                "gender"                 
+[5] "slum"                   </code></pre>
 </div>
-<div class="sourceCode cell-code" id="cb41"><pre class="sourceCode numberSource r number-lines code-with-copy"><code class="sourceCode r"><span id="cb41-1"><a href="#cb41-1"></a><span class="fu">colnames</span>(df)[<span class="dv">1</span><span class="sc">:</span><span class="dv">2</span>] <span class="ot">&lt;-</span> <span class="fu">c</span>(<span class="st">"IgG_concentration"</span>, <span class="st">"age"</span>) <span class="co">#reset</span></span></code><button title="Copy to Clipboard" class="code-copy-button"><i class="bi"></i></button></pre></div>
+</div>
+<p>For the sake of the module, I am going to reassign them back to the original variable names.</p>
+<div class="cell">
+<div class="sourceCode cell-code" id="cb42"><pre class="sourceCode numberSource r number-lines code-with-copy"><code class="sourceCode r"><span id="cb42-1"><a href="#cb42-1"></a><span class="fu">colnames</span>(df)[<span class="dv">2</span><span class="sc">:</span><span class="dv">3</span>] <span class="ot">&lt;-</span> <span class="fu">c</span>(<span class="st">"IgG_concentration"</span>, <span class="st">"age"</span>) <span class="co">#reset</span></span></code><button title="Copy to Clipboard" class="code-copy-button"><i class="bi"></i></button></pre></div>
 </div>
 </section>
 <section id="using-indexing-to-subset-by-columns" class="slide level2">
 <h2>Using indexing to subset by columns</h2>
-<p>We can also subset a data frames and matrices (2-dimensional objects) using the bracket <code>[ row , column ]</code>. We can subset by columns and pull the <code>x</code> column using the index of the column or the column name.</p>
-<p>For example, here I am pulling the 3nd column, which has the variable name <code>age</code></p>
+<p>We can also subset data frames and matrices (2-dimensional objects) using the bracket <code>[ row , column ]</code>. We can subset by columns and pull the <code>x</code> column using the index of the column or the column name. Leaving the row or the column position blank selects all rows or all columns, respectively.</p>
+<p>For example, here I am pulling the 3rd column, which has the variable name <code>age</code>, for all rows.</p>
 <div class="cell">
-<div class="sourceCode cell-code" id="cb42"><pre class="sourceCode numberSource r number-lines code-with-copy"><code class="sourceCode r"><span id="cb42-1"><a href="#cb42-1"></a>df[ , <span class="st">"age"</span>] <span class="co">#same as df[ , 3]</span></span></code><button title="Copy to Clipboard" class="code-copy-button"><i class="bi"></i></button></pre></div>
-<div class="cell-output cell-output-stdout">
-<pre><code>  [1] 3.176895e-01 3.436823e+00 3.000000e-01 1.432363e+02 4.476534e-01
-  [6] 2.527076e-02 6.101083e-01 3.000000e-01 2.916968e+00 1.649819e+00
- [11] 4.574007e+00 1.583904e+02           NA 1.065068e+02 1.113870e+02
- [16] 4.144893e+01 3.000000e-01 2.527076e-01 8.159247e+01 1.825342e+02
- [21] 4.244656e+01 1.193493e+02 3.000000e-01 3.000000e-01 9.025271e-01
- [26] 3.501805e-01 3.000000e-01 1.227437e+00 1.702055e+02 3.000000e-01
- [31] 4.801444e-01 2.527076e-02 3.000000e-01 5.776173e-02 4.801444e-01
- [36] 3.826715e-01 3.000000e-01 4.048558e+02 3.000000e-01 5.451264e-01
- [41] 3.000000e-01 5.590753e+01 2.202166e-01 1.709760e+02 1.227437e+00
- [46] 4.567527e+02 4.838480e+01 1.227437e-01 1.877256e-01 3.000000e-01
- [51] 3.501805e-01 3.339350e+00 3.000000e-01 5.451264e-01           NA
- [56] 2.104693e+00           NA 3.826715e-01 3.926366e+01 1.129964e+00
- [61] 3.501805e+00 7.542808e+01 4.800475e+01 1.000000e+00 4.068884e+01
- [66] 3.000000e-01 4.377672e+01 1.193493e+02 6.977740e+01 1.373288e+02
- [71] 1.642979e+02           NA 1.542808e+02 6.033058e-01 2.809917e-01
- [76] 1.966942e+00 2.041322e+00 2.115702e+00 4.663043e+02 3.000000e-01
- [81] 1.500796e+02 1.543790e+02 2.561983e-01 1.596338e+02 1.732484e+02
- [86] 4.641304e+02 3.736364e+01 1.572452e+02 3.000000e-01 3.000000e-01
- [91] 8.264463e-02 6.776859e-01 7.272727e-01 2.066116e-01 1.966942e+00
- [96] 3.000000e-01 3.000000e-01 2.809917e-01 8.016529e-01 1.818182e-01
-[101] 1.818182e-01 8.264463e-02 3.422727e+01 8.743506e+00 3.000000e-01
-[106] 1.641720e+02 4.049587e-01 1.001592e+02 4.489130e+02 1.101911e+02
-[111] 4.440909e+01 1.288217e+02 2.840909e+01 1.003981e+02 8.512397e-01
-[116] 1.322314e-01 1.297521e+00 1.570248e-01 1.966942e+00 1.536624e+02
-[121] 3.000000e-01 3.000000e-01 1.074380e+00 1.099174e+00 3.057851e-01
-[126] 3.000000e-01 5.785124e-02 4.391304e+02 6.130435e+02 1.074380e-01
-[131] 7.125796e+01 4.222727e+01 1.620223e+02 3.750000e+01 1.534236e+02
-[136] 6.239130e+02 5.521739e+02 5.785124e-02 6.547945e-01 8.767123e-02
-[141] 3.000000e-01 2.849315e+00 3.835616e-02 2.849315e-01 4.649315e+00
-[146] 1.369863e-01 3.589041e-01 1.049315e+00 4.668998e+01 1.473510e+02
-[151] 4.589744e+01 2.109589e-01 1.741722e+02 2.496503e+01 1.850993e+02
-[156] 1.863014e-01 1.863014e-01 4.589744e+01 1.942881e+02 5.079646e+02
-[161] 8.767123e-01 2.750685e+00 1.503311e+02 3.000000e-01 3.095890e-01
-[166] 3.000000e-01 6.371681e+02 6.054795e-01 1.955298e+02 1.786424e+02
-[171] 1.120861e+02 1.331954e+02 2.159292e+02 5.628319e+02 1.900662e+02
-[176] 6.547945e-01 1.665753e+00 1.739238e+02 9.991722e+01 9.321192e+01
-[181] 8.767123e-02           NA 6.794521e-01 5.808219e-01 1.369863e-01
-[186] 2.060274e+00 1.610099e+02 4.082192e-01 8.273973e-01 4.601770e+02
-[191] 1.389073e+02 3.867133e+01 9.260274e-01 5.918874e+01 1.870861e+02
-[196] 4.328767e-01 6.301370e-02 3.000000e-01 1.548013e+02 5.819536e+01
-[201] 1.724338e+02 1.932401e+01 2.164420e+00 9.757412e-01 1.509434e-01
-[206] 1.509434e-01 7.766571e+01 4.319563e+01 1.752022e-01 3.094775e+01
-[211] 1.266846e-01 2.919806e+01 9.545455e+00 2.735115e+01 1.314841e+02
-[216] 3.643985e+01 1.498559e+02 9.363636e+00 2.479784e-01 5.390836e-02
-[221] 8.787062e-01 1.994609e-01 3.000000e-01 3.000000e-01 5.390836e-03
-[226] 4.177898e-01 3.000000e-01 2.479784e-01 2.964960e-02 2.964960e-01
-[231] 5.148248e+00 1.994609e-01 3.000000e-01 1.779539e+02 3.290210e+02
-[236] 3.000000e-01 1.809798e+02 4.905660e-01 1.266846e-01 1.543948e+02
-[241] 1.379683e+02 6.153846e+02 1.474784e+02 3.000000e-01 1.024259e+00
-[246] 4.444056e+02 3.000000e-01 2.504043e+00 3.000000e-01 3.000000e-01
-[251] 7.816712e-02 3.000000e-01 5.390836e-02 1.494236e+02 5.972622e+01
-[256] 6.361186e-01 1.837896e+02 1.320809e+02 1.571906e-01 1.520231e+02
-[261] 3.000000e-01 3.000000e-01 1.823699e+02 3.000000e-01 2.173913e+00
-[266] 2.142202e+01 3.000000e-01 3.408027e+00 4.155963e+01 9.698997e-02
-[271] 1.238532e+01 9.528926e+00 1.916185e+02 1.060201e+00 3.679104e+02
-[276] 4.288991e+01 9.971098e+01 3.000000e-01 1.208092e+02 3.000000e-01
-[281] 6.688963e-03 2.505017e+00 1.481605e+00 3.000000e-01 5.183946e-01
-[286] 3.000000e-01 1.872910e-01 3.678930e-01 3.000000e-01 4.529851e+02
-[291] 3.169725e+01 3.000000e-01 4.922018e+01 2.548507e+02 1.661850e+02
-[296] 9.164179e+02 3.678930e-01 1.236994e+02 6.705202e+01 3.834862e+01
-[301] 1.963211e+00 3.000000e-01 2.474916e-01 3.000000e-01 2.173913e-01
-[306] 8.193980e-01 2.444816e+00 3.000000e-01 1.571906e-01 1.849711e+02
-[311] 6.119403e+02 3.000000e-01 4.280936e-01 9.698997e-02 3.678930e-02
-[316] 4.832090e+02 1.390173e+02 3.000000e-01 6.555970e+02 1.526012e+02
-[321] 3.000000e-01 7.222222e-01 7.724426e+01 3.000000e-01 6.111111e-01
-[326] 1.555556e+00 3.055556e-01 1.500000e+00 1.470772e+02 1.694444e+00
-[331] 3.138298e+02 1.414405e+02 1.990605e+02 4.212766e+02 3.000000e-01
-[336] 3.000000e-01 6.478723e+02 3.000000e-01 2.222222e+00 3.000000e-01
-[341] 2.055556e+00 2.777778e-02 8.333333e-02 1.032359e+02 1.611111e+00
-[346] 8.333333e-02 2.333333e+00 5.755319e+02 1.686848e+02 1.111111e-01
-[351] 3.000000e-01 8.372340e+02 3.000000e-01 3.784504e+01 3.819149e+02
-[356] 5.555556e-02 3.000000e+02 1.855950e+02 1.944444e-01 3.000000e-01
-[361] 5.555556e-02 1.138889e+00 4.254237e+01 3.000000e-01 3.000000e-01
-[366] 3.000000e-01 3.000000e-01 3.138298e+02 1.235908e+02 4.159574e+02
-[371] 3.009685e+01 1.567850e+02 1.367432e+02 3.731235e+01 9.164927e+01
-[376] 2.936170e+02 8.820459e+01 1.035491e+02 7.379958e+01 3.000000e-01
-[381] 1.718750e+02 2.128527e+00 1.253918e+00 2.382445e-01 4.639498e-01
-[386] 1.253918e-01 1.253918e-01 3.000000e-01 1.000000e+00 1.570043e+02
-[391] 4.344086e+02 2.184953e+00 1.507837e+00 3.228840e-01 4.588024e+01
-[396] 1.660560e+02 3.000000e-01 3.043011e+02 2.612903e+02 1.621767e+02
-[401] 3.228840e-01 4.639498e-01 2.495298e+00 3.257053e+00 3.793103e-01
-[406]           NA 6.896552e-02 3.000000e-01 1.423197e+00 3.000000e-01
-[411] 3.000000e-01 1.786638e+02 3.279570e+02           NA 1.903017e+02
-[416] 1.654095e+02 4.639498e-01 1.815733e+02 1.366771e+00 1.536050e-01
-[421] 1.306587e+01 2.129032e+02 1.925647e+02 3.000000e-01 1.028213e+00
-[426] 3.793103e-01 8.025078e-01 4.860215e+02 3.000000e-01 2.100313e-01
-[431] 2.767665e+01 1.592476e+00 9.717868e-02 1.028213e+00 3.793103e-01
-[436] 1.292026e+02 4.425150e+01 3.193548e+02 1.860991e+02 6.614420e-01
-[441] 5.203762e-01 1.330819e+02 1.673491e+02 3.000000e-01 1.117457e+02
-[446] 3.045509e+01 3.000000e-01 8.280255e-02 3.000000e-01 1.200637e+00
-[451] 1.687898e-01 7.367273e+02 8.280255e-02 5.127389e-01 1.974522e-01
-[456] 7.993631e-01 3.000000e-01 3.298182e+02 9.736842e+01 3.000000e-01
-[461] 3.000000e-01 4.214545e+02 3.000000e-01 2.578182e+02 2.261147e-01
-[466] 3.000000e-01 1.883901e+02 9.458204e+01 3.000000e-01 3.000000e-01
-[471] 7.707006e-01 5.032727e+02 1.544586e+00 1.431115e+02 3.000000e-01
-[476] 1.458599e+00 1.247678e+02           NA 4.334545e+02 3.000000e-01
-[481] 6.156364e+02 9.574303e+01 1.928019e+02 1.888545e+02 1.598297e+02
-[486] 5.127389e-01 1.171053e+02           NA 2.547771e-02 1.707430e+02
-[491] 3.000000e-01 1.869969e+02 4.731481e+01 1.988390e+02 3.000000e-01
-[496] 8.808050e+01 2.003185e+00 3.000000e-01 3.509259e+01 9.365325e+01
-[501] 3.000000e-01 3.736111e+01 1.674923e+02 8.808050e+01 1.656347e+02
-[506] 3.722222e+01 6.756364e+02 3.000000e-01 1.698142e+02 1.628483e+02
-[511] 5.985130e-01 1.903346e+00 3.000000e-01 3.000000e-01 8.996283e-01
-[516] 3.977695e-01 3.000000e-01 3.000000e-01 3.000000e-01 3.000000e-01
-[521] 7.446809e+02 6.095745e+02 1.427445e+02 3.000000e-01 2.973978e-02
-[526] 3.977695e-01 4.095745e+02 4.595745e+02 3.000000e-01 1.976341e+02
-[531] 3.776596e+02 1.777603e+02 4.312268e-01 6.765957e+02 7.978723e+02
-[536] 9.665427e-02 1.879338e+02 4.358670e+01 3.000000e-01 3.000000e-01
-[541] 2.638955e+01 3.180523e+01 1.746845e+02 1.876972e+02 1.044164e+02
-[546] 1.202681e+02 1.630915e+02 1.276025e+02 8.880126e+01 3.563830e+02
-[551] 2.212766e+02 1.969121e+01 3.755319e+02 1.214511e+02 1.034700e+02
-[556] 3.000000e-01 3.643123e-01 6.319703e-02 3.000000e-01 3.000000e-01
-[561] 3.000000e-01 3.000000e-01 3.000000e-01 3.000000e-01 3.000000e-01
-[566] 3.000000e-01 1.664038e+02 2.946809e+02 4.391924e+01 1.874606e+02
-[571] 1.143533e+02 1.600158e+02 1.635688e-01 8.809148e+01 1.337539e+02
-[576] 1.985804e+02 1.578864e+02 3.000000e-01 3.000000e-01 1.953642e-01
-[581] 1.119205e+00 2.523636e+02 3.000000e-01 4.844371e+00 3.000000e-01
-[586] 1.492553e+02 1.993617e+02 2.847682e-01 3.145695e-01 3.000000e-01
-[591] 3.406429e+01 6.595745e+01 3.000000e-01 2.174545e+02           NA
-[596] 5.957447e+01 7.236364e+02 3.000000e-01 3.000000e-01 3.000000e-01
-[601] 2.676364e+02 1.891489e+02 3.036364e+02 3.000000e-01 3.000000e-01
-[606] 3.000000e-01 3.000000e-01 3.000000e-01 1.447020e+00 2.130909e+02
-[611] 1.357616e-01 3.000000e-01 3.000000e-01 5.534545e+02 1.891489e+02
-[616] 7.202128e+01 3.250287e+01 1.655629e-02 3.123636e+02 3.000000e-01
-[621] 7.138298e+01 3.000000e-01 6.946809e+01 4.012629e+01 1.629787e+02
-[626] 1.508511e+02 1.655629e-02 3.000000e-01 4.635762e-02 3.000000e-01
-[631] 3.000000e-01 3.000000e-01 1.942553e+02 3.690909e+02 3.000000e-01
-[636] 3.000000e-01 2.847682e+00 1.435106e+02 3.000000e-01 4.752009e+01
-[641] 2.621125e+01 1.055319e+02 3.000000e-01 1.149007e+00 2.927273e+02
-[646] 3.000000e-01 3.000000e-01 4.839265e+01 3.000000e-01 3.000000e-01
-[651] 2.251656e-01</code></pre>
-</div>
-</div>
-<p>We can select multiple columns using multiple column names:</p>
+<div class="sourceCode cell-code" id="cb43"><pre class="sourceCode numberSource r number-lines code-with-copy"><code class="sourceCode r"><span id="cb43-1"><a href="#cb43-1"></a>df[ , <span class="st">"age"</span>] <span class="co">#same as df[ , 3]</span></span></code><button title="Copy to Clipboard" class="code-copy-button"><i class="bi"></i></button></pre></div>
+</div>
+<p>We can select multiple columns using a vector of column names. Again, this selects these variables for all of the rows.</p>
 <div class="cell">
 <div class="sourceCode cell-code" id="cb44"><pre class="sourceCode numberSource r number-lines code-with-copy"><code class="sourceCode r"><span id="cb44-1"><a href="#cb44-1"></a>df[, <span class="fu">c</span>(<span class="st">"age"</span>, <span class="st">"gender"</span>)] <span class="co">#same as df[ , c(3,4)]</span></span></code><button title="Copy to Clipboard" class="code-copy-button"><i class="bi"></i></button></pre></div>
 <div class="cell-output cell-output-stdout">
-<pre><code>             age gender
-1   3.176895e-01 Female
-2   3.436823e+00 Female
-3   3.000000e-01   Male
-4   1.432363e+02   Male
-5   4.476534e-01   Male
-6   2.527076e-02   Male
-7   6.101083e-01 Female
-8   3.000000e-01 Female
-9   2.916968e+00   Male
-10  1.649819e+00   Male
-11  4.574007e+00   Male
-12  1.583904e+02 Female
-13            NA   Male
-14  1.065068e+02   Male
-15  1.113870e+02   Male
-16  4.144893e+01   Male
-17  3.000000e-01   Male
-18  2.527076e-01 Female
-19  8.159247e+01 Female
-20  1.825342e+02   Male
-21  4.244656e+01   Male
-22  1.193493e+02 Female
-23  3.000000e-01   Male
-24  3.000000e-01 Female
-25  9.025271e-01 Female
-26  3.501805e-01   Male
-27  3.000000e-01   Male
-28  1.227437e+00 Female
-29  1.702055e+02 Female
-30  3.000000e-01 Female
-31  4.801444e-01   Male
-32  2.527076e-02   Male
-33  3.000000e-01 Female
-34  5.776173e-02   Male
-35  4.801444e-01 Female
-36  3.826715e-01 Female
-37  3.000000e-01   Male
-38  4.048558e+02   Male
-39  3.000000e-01   Male
-40  5.451264e-01   Male
-41  3.000000e-01 Female
-42  5.590753e+01   Male
-43  2.202166e-01 Female
-44  1.709760e+02   Male
-45  1.227437e+00   Male
-46  4.567527e+02   Male
-47  4.838480e+01   Male
-48  1.227437e-01 Female
-49  1.877256e-01 Female
-50  3.000000e-01 Female
-51  3.501805e-01   Male
-52  3.339350e+00   Male
-53  3.000000e-01 Female
-54  5.451264e-01 Female
-55            NA   Male
-56  2.104693e+00   Male
-57            NA   Male
-58  3.826715e-01 Female
-59  3.926366e+01 Female
-60  1.129964e+00   Male
-61  3.501805e+00 Female
-62  7.542808e+01 Female
-63  4.800475e+01 Female
-64  1.000000e+00   Male
-65  4.068884e+01   Male
-66  3.000000e-01 Female
-67  4.377672e+01 Female
-68  1.193493e+02   Male
-69  6.977740e+01   Male
-70  1.373288e+02 Female
-71  1.642979e+02   Male
-72            NA Female
-73  1.542808e+02   Male
-74  6.033058e-01   Male
-75  2.809917e-01   Male
-76  1.966942e+00   Male
-77  2.041322e+00   Male
-78  2.115702e+00 Female
-79  4.663043e+02   Male
-80  3.000000e-01   Male
-81  1.500796e+02   Male
-82  1.543790e+02 Female
-83  2.561983e-01 Female
-84  1.596338e+02   Male
-85  1.732484e+02 Female
-86  4.641304e+02 Female
-87  3.736364e+01   Male
-88  1.572452e+02 Female
-89  3.000000e-01   Male
-90  3.000000e-01   Male
-91  8.264463e-02   Male
-92  6.776859e-01 Female
-93  7.272727e-01   Male
-94  2.066116e-01 Female
-95  1.966942e+00   Male
-96  3.000000e-01   Male
-97  3.000000e-01   Male
-98  2.809917e-01 Female
-99  8.016529e-01 Female
-100 1.818182e-01 Female
-101 1.818182e-01   Male
-102 8.264463e-02 Female
-103 3.422727e+01 Female
-104 8.743506e+00   Male
-105 3.000000e-01   Male
-106 1.641720e+02 Female
-107 4.049587e-01   Male
-108 1.001592e+02   Male
-109 4.489130e+02 Female
-110 1.101911e+02 Female
-111 4.440909e+01   Male
-112 1.288217e+02 Female
-113 2.840909e+01   Male
-114 1.003981e+02 Female
-115 8.512397e-01 Female
-116 1.322314e-01   Male
-117 1.297521e+00 Female
-118 1.570248e-01   Male
-119 1.966942e+00 Female
-120 1.536624e+02   Male
-121 3.000000e-01 Female
-122 3.000000e-01 Female
-123 1.074380e+00   Male
-124 1.099174e+00 Female
-125 3.057851e-01 Female
-126 3.000000e-01 Female
-127 5.785124e-02 Female
-128 4.391304e+02 Female
-129 6.130435e+02 Female
-130 1.074380e-01   Male
-131 7.125796e+01   Male
-132 4.222727e+01   Male
-133 1.620223e+02 Female
-134 3.750000e+01 Female
-135 1.534236e+02 Female
-136 6.239130e+02 Female
-137 5.521739e+02   Male
-138 5.785124e-02 Female
-139 6.547945e-01 Female
-140 8.767123e-02 Female
-141 3.000000e-01   Male
-142 2.849315e+00 Female
-143 3.835616e-02   Male
-144 2.849315e-01   Male
-145 4.649315e+00   Male
-146 1.369863e-01 Female
-147 3.589041e-01   Male
-148 1.049315e+00   Male
-149 4.668998e+01 Female
-150 1.473510e+02 Female
-151 4.589744e+01   Male
-152 2.109589e-01   Male
-153 1.741722e+02 Female
-154 2.496503e+01 Female
-155 1.850993e+02   Male
-156 1.863014e-01   Male
-157 1.863014e-01   Male
-158 4.589744e+01 Female
-159 1.942881e+02 Female
-160 5.079646e+02 Female
-161 8.767123e-01   Male
-162 2.750685e+00   Male
-163 1.503311e+02 Female
-164 3.000000e-01   Male
-165 3.095890e-01   Male
-166 3.000000e-01   Male
-167 6.371681e+02 Female
-168 6.054795e-01 Female
-169 1.955298e+02 Female
-170 1.786424e+02   Male
-171 1.120861e+02 Female
-172 1.331954e+02   Male
-173 2.159292e+02   Male
-174 5.628319e+02   Male
-175 1.900662e+02 Female
-176 6.547945e-01   Male
-177 1.665753e+00   Male
-178 1.739238e+02   Male
-179 9.991722e+01   Male
-180 9.321192e+01   Male
-181 8.767123e-02 Female
-182           NA   Male
-183 6.794521e-01 Female
-184 5.808219e-01   Male
-185 1.369863e-01 Female
-186 2.060274e+00 Female
-187 1.610099e+02   Male
-188 4.082192e-01 Female
-189 8.273973e-01   Male
-190 4.601770e+02 Female
-191 1.389073e+02 Female
-192 3.867133e+01 Female
-193 9.260274e-01 Female
-194 5.918874e+01 Female
-195 1.870861e+02 Female
-196 4.328767e-01   Male
-197 6.301370e-02   Male
-198 3.000000e-01 Female
-199 1.548013e+02   Male
-200 5.819536e+01 Female
-201 1.724338e+02 Female
-202 1.932401e+01 Female
-203 2.164420e+00 Female
-204 9.757412e-01 Female
-205 1.509434e-01   Male
-206 1.509434e-01 Female
-207 7.766571e+01   Male
-208 4.319563e+01 Female
-209 1.752022e-01   Male
-210 3.094775e+01 Female
-211 1.266846e-01   Male
-212 2.919806e+01   Male
-213 9.545455e+00 Female
-214 2.735115e+01 Female
-215 1.314841e+02 Female
-216 3.643985e+01   Male
-217 1.498559e+02 Female
-218 9.363636e+00 Female
-219 2.479784e-01   Male
-220 5.390836e-02 Female
-221 8.787062e-01 Female
-222 1.994609e-01   Male
-223 3.000000e-01 Female
-224 3.000000e-01   Male
-225 5.390836e-03 Female
-226 4.177898e-01 Female
-227 3.000000e-01 Female
-228 2.479784e-01   Male
-229 2.964960e-02   Male
-230 2.964960e-01   Male
-231 5.148248e+00 Female
-232 1.994609e-01   Male
-233 3.000000e-01   Male
-234 1.779539e+02   Male
-235 3.290210e+02 Female
-236 3.000000e-01   Male
-237 1.809798e+02 Female
-238 4.905660e-01   Male
-239 1.266846e-01   Male
-240 1.543948e+02 Female
-241 1.379683e+02 Female
-242 6.153846e+02   Male
-243 1.474784e+02   Male
-244 3.000000e-01 Female
-245 1.024259e+00   Male
-246 4.444056e+02 Female
-247 3.000000e-01   Male
-248 2.504043e+00 Female
-249 3.000000e-01 Female
-250 3.000000e-01 Female
-251 7.816712e-02 Female
-252 3.000000e-01 Female
-253 5.390836e-02   Male
-254 1.494236e+02 Female
-255 5.972622e+01   Male
-256 6.361186e-01 Female
-257 1.837896e+02 Female
-258 1.320809e+02 Female
-259 1.571906e-01   Male
-260 1.520231e+02   Male
-261 3.000000e-01 Female
-262 3.000000e-01 Female
-263 1.823699e+02   Male
-264 3.000000e-01   Male
-265 2.173913e+00   Male
-266 2.142202e+01   Male
-267 3.000000e-01 Female
-268 3.408027e+00   Male
-269 4.155963e+01   Male
-270 9.698997e-02   Male
-271 1.238532e+01 Female
-272 9.528926e+00   Male
-273 1.916185e+02 Female
-274 1.060201e+00   Male
-275 3.679104e+02 Female
-276 4.288991e+01   Male
-277 9.971098e+01   Male
-278 3.000000e-01   Male
-279 1.208092e+02   Male
-280 3.000000e-01   Male
-281 6.688963e-03 Female
-282 2.505017e+00 Female
-283 1.481605e+00   Male
-284 3.000000e-01 Female
-285 5.183946e-01 Female
-286 3.000000e-01 Female
-287 1.872910e-01   Male
-288 3.678930e-01 Female
-289 3.000000e-01   Male
-290 4.529851e+02 Female
-291 3.169725e+01 Female
-292 3.000000e-01   Male
-293 4.922018e+01   Male
-294 2.548507e+02   Male
-295 1.661850e+02   Male
-296 9.164179e+02   Male
-297 3.678930e-01 Female
-298 1.236994e+02   Male
-299 6.705202e+01   Male
-300 3.834862e+01   Male
-301 1.963211e+00 Female
-302 3.000000e-01   Male
-303 2.474916e-01   Male
-304 3.000000e-01 Female
-305 2.173913e-01   Male
-306 8.193980e-01   Male
-307 2.444816e+00 Female
-308 3.000000e-01   Male
-309 1.571906e-01 Female
-310 1.849711e+02   Male
-311 6.119403e+02 Female
-312 3.000000e-01 Female
-313 4.280936e-01 Female
-314 9.698997e-02   Male
-315 3.678930e-02 Female
-316 4.832090e+02   Male
-317 1.390173e+02 Female
-318 3.000000e-01   Male
-319 6.555970e+02 Female
-320 1.526012e+02 Female
-321 3.000000e-01 Female
-322 7.222222e-01   Male
-323 7.724426e+01   Male
-324 3.000000e-01   Male
-325 6.111111e-01 Female
-326 1.555556e+00 Female
-327 3.055556e-01   Male
-328 1.500000e+00   Male
-329 1.470772e+02   Male
-330 1.694444e+00 Female
-331 3.138298e+02 Female
-332 1.414405e+02 Female
-333 1.990605e+02 Female
-334 4.212766e+02   Male
-335 3.000000e-01   Male
-336 3.000000e-01   Male
-337 6.478723e+02   Male
-338 3.000000e-01   Male
-339 2.222222e+00 Female
-340 3.000000e-01   Male
-341 2.055556e+00   Male
-342 2.777778e-02 Female
-343 8.333333e-02   Male
-344 1.032359e+02 Female
-345 1.611111e+00 Female
-346 8.333333e-02 Female
-347 2.333333e+00 Female
-348 5.755319e+02   Male
-349 1.686848e+02 Female
-350 1.111111e-01   Male
-351 3.000000e-01   Male
-352 8.372340e+02 Female
-353 3.000000e-01   Male
-354 3.784504e+01   Male
-355 3.819149e+02   Male
-356 5.555556e-02 Female
-357 3.000000e+02 Female
-358 1.855950e+02   Male
-359 1.944444e-01 Female
-360 3.000000e-01   Male
-361 5.555556e-02 Female
-362 1.138889e+00   Male
-363 4.254237e+01 Female
-364 3.000000e-01   Male
-365 3.000000e-01   Male
-366 3.000000e-01 Female
-367 3.000000e-01 Female
-368 3.138298e+02 Female
-369 1.235908e+02   Male
-370 4.159574e+02   Male
-371 3.009685e+01 Female
-372 1.567850e+02 Female
-373 1.367432e+02 Female
-374 3.731235e+01 Female
-375 9.164927e+01   Male
-376 2.936170e+02 Female
-377 8.820459e+01 Female
-378 1.035491e+02   Male
-379 7.379958e+01 Female
-380 3.000000e-01   Male
-381 1.718750e+02   Male
-382 2.128527e+00   Male
-383 1.253918e+00 Female
-384 2.382445e-01   Male
-385 4.639498e-01 Female
-386 1.253918e-01   Male
-387 1.253918e-01   Male
-388 3.000000e-01 Female
-389 1.000000e+00   Male
-390 1.570043e+02   Male
-391 4.344086e+02 Female
-392 2.184953e+00   Male
-393 1.507837e+00 Female
-394 3.228840e-01 Female
-395 4.588024e+01   Male
-396 1.660560e+02   Male
-397 3.000000e-01   Male
-398 3.043011e+02   Male
-399 2.612903e+02 Female
-400 1.621767e+02   Male
-401 3.228840e-01   Male
-402 4.639498e-01 Female
-403 2.495298e+00 Female
-404 3.257053e+00 Female
-405 3.793103e-01 Female
-406           NA   Male
-407 6.896552e-02 Female
-408 3.000000e-01   Male
-409 1.423197e+00 Female
-410 3.000000e-01 Female
-411 3.000000e-01 Female
-412 1.786638e+02   Male
-413 3.279570e+02   Male
-414           NA Female
-415 1.903017e+02   Male
-416 1.654095e+02 Female
-417 4.639498e-01 Female
-418 1.815733e+02   Male
-419 1.366771e+00   Male
-420 1.536050e-01 Female
-421 1.306587e+01   Male
-422 2.129032e+02 Female
-423 1.925647e+02   Male
-424 3.000000e-01 Female
-425 1.028213e+00 Female
-426 3.793103e-01 Female
-427 8.025078e-01 Female
-428 4.860215e+02 Female
-429 3.000000e-01 Female
-430 2.100313e-01   Male
-431 2.767665e+01 Female
-432 1.592476e+00   Male
-433 9.717868e-02 Female
-434 1.028213e+00 Female
-435 3.793103e-01   Male
-436 1.292026e+02   Male
-437 4.425150e+01 Female
-438 3.193548e+02 Female
-439 1.860991e+02 Female
-440 6.614420e-01 Female
-441 5.203762e-01   Male
-442 1.330819e+02   Male
-443 1.673491e+02 Female
-444 3.000000e-01   Male
-445 1.117457e+02   Male
-446 3.045509e+01 Female
-447 3.000000e-01   Male
-448 8.280255e-02 Female
-449 3.000000e-01 Female
-450 1.200637e+00 Female
-451 1.687898e-01   Male
-452 7.367273e+02 Female
-453 8.280255e-02   Male
-454 5.127389e-01   Male
-455 1.974522e-01   Male
-456 7.993631e-01 Female
-457 3.000000e-01   Male
-458 3.298182e+02   Male
-459 9.736842e+01 Female
-460 3.000000e-01 Female
-461 3.000000e-01 Female
-462 4.214545e+02 Female
-463 3.000000e-01   Male
-464 2.578182e+02 Female
-465 2.261147e-01   Male
-466 3.000000e-01 Female
-467 1.883901e+02   Male
-468 9.458204e+01 Female
-469 3.000000e-01 Female
-470 3.000000e-01   Male
-471 7.707006e-01 Female
-472 5.032727e+02   Male
-473 1.544586e+00 Female
-474 1.431115e+02 Female
-475 3.000000e-01   Male
-476 1.458599e+00   Male
-477 1.247678e+02 Female
-478           NA Female
-479 4.334545e+02   Male
-480 3.000000e-01 Female
-481 6.156364e+02 Female
-482 9.574303e+01   Male
-483 1.928019e+02   Male
-484 1.888545e+02   Male
-485 1.598297e+02 Female
-486 5.127389e-01   Male
-487 1.171053e+02 Female
-488           NA   Male
-489 2.547771e-02 Female
-490 1.707430e+02 Female
-491 3.000000e-01   Male
-492 1.869969e+02   Male
-493 4.731481e+01   Male
-494 1.988390e+02 Female
-495 3.000000e-01   Male
-496 8.808050e+01   Male
-497 2.003185e+00 Female
-498 3.000000e-01   Male
-499 3.509259e+01 Female
-500 9.365325e+01 Female
-501 3.000000e-01   Male
-502 3.736111e+01 Female
-503 1.674923e+02 Female
-504 8.808050e+01   Male
-505 1.656347e+02 Female
-506 3.722222e+01 Female
-507 6.756364e+02 Female
-508 3.000000e-01   Male
-509 1.698142e+02   Male
-510 1.628483e+02 Female
-511 5.985130e-01   Male
-512 1.903346e+00 Female
-513 3.000000e-01   Male
-514 3.000000e-01   Male
-515 8.996283e-01   Male
-516 3.977695e-01 Female
-517 3.000000e-01   Male
-518 3.000000e-01   Male
-519 3.000000e-01   Male
-520 3.000000e-01 Female
-521 7.446809e+02   Male
-522 6.095745e+02 Female
-523 1.427445e+02   Male
-524 3.000000e-01 Female
-525 2.973978e-02   Male
-526 3.977695e-01 Female
-527 4.095745e+02 Female
-528 4.595745e+02   Male
-529 3.000000e-01 Female
-530 1.976341e+02 Female
-531 3.776596e+02 Female
-532 1.777603e+02 Female
-533 4.312268e-01   Male
-534 6.765957e+02 Female
-535 7.978723e+02   Male
-536 9.665427e-02   Male
-537 1.879338e+02   Male
-538 4.358670e+01 Female
-539 3.000000e-01 Female
-540 3.000000e-01   Male
-541 2.638955e+01   Male
-542 3.180523e+01 Female
-543 1.746845e+02   Male
-544 1.876972e+02   Male
-545 1.044164e+02   Male
-546 1.202681e+02   Male
-547 1.630915e+02 Female
-548 1.276025e+02 Female
-549 8.880126e+01   Male
-550 3.563830e+02   Male
-551 2.212766e+02   Male
-552 1.969121e+01 Female
-553 3.755319e+02 Female
-554 1.214511e+02   Male
-555 1.034700e+02 Female
-556 3.000000e-01 Female
-557 3.643123e-01 Female
-558 6.319703e-02 Female
-559 3.000000e-01   Male
-560 3.000000e-01   Male
-561 3.000000e-01 Female
-562 3.000000e-01 Female
-563 3.000000e-01   Male
-564 3.000000e-01   Male
-565 3.000000e-01 Female
-566 3.000000e-01   Male
-567 1.664038e+02 Female
-568 2.946809e+02 Female
-569 4.391924e+01   Male
-570 1.874606e+02 Female
-571 1.143533e+02   Male
-572 1.600158e+02   Male
-573 1.635688e-01   Male
-574 8.809148e+01 Female
-575 1.337539e+02   Male
-576 1.985804e+02   Male
-577 1.578864e+02 Female
-578 3.000000e-01 Female
-579 3.000000e-01   Male
-580 1.953642e-01 Female
-581 1.119205e+00   Male
-582 2.523636e+02   Male
-583 3.000000e-01   Male
-584 4.844371e+00 Female
-585 3.000000e-01   Male
-586 1.492553e+02 Female
-587 1.993617e+02   Male
-588 2.847682e-01 Female
-589 3.145695e-01 Female
-590 3.000000e-01   Male
-591 3.406429e+01 Female
-592 6.595745e+01   Male
-593 3.000000e-01   Male
-594 2.174545e+02   Male
-595           NA Female
-596 5.957447e+01 Female
-597 7.236364e+02 Female
-598 3.000000e-01   Male
-599 3.000000e-01 Female
-600 3.000000e-01   Male
-601 2.676364e+02   Male
-602 1.891489e+02   Male
-603 3.036364e+02 Female
-604 3.000000e-01 Female
-605 3.000000e-01   Male
-606 3.000000e-01   Male
-607 3.000000e-01 Female
-608 3.000000e-01   Male
-609 1.447020e+00   Male
-610 2.130909e+02 Female
-611 1.357616e-01 Female
-612 3.000000e-01 Female
-613 3.000000e-01 Female
-614 5.534545e+02 Female
-615 1.891489e+02 Female
-616 7.202128e+01 Female
-617 3.250287e+01   Male
-618 1.655629e-02   Male
-619 3.123636e+02   Male
-620 3.000000e-01   Male
-621 7.138298e+01   Male
-622 3.000000e-01 Female
-623 6.946809e+01 Female
-624 4.012629e+01   Male
-625 1.629787e+02 Female
-626 1.508511e+02 Female
-627 1.655629e-02   Male
-628 3.000000e-01   Male
-629 4.635762e-02   Male
-630 3.000000e-01 Female
-631 3.000000e-01 Female
-632 3.000000e-01   Male
-633 1.942553e+02   Male
-634 3.690909e+02   Male
-635 3.000000e-01 Female
-636 3.000000e-01 Female
-637 2.847682e+00   Male
-638 1.435106e+02 Female
-639 3.000000e-01   Male
-640 4.752009e+01 Female
-641 2.621125e+01 Female
-642 1.055319e+02 Female
-643 3.000000e-01 Female
-644 1.149007e+00   Male
-645 2.927273e+02 Female
-646 3.000000e-01 Female
-647 3.000000e-01 Female
-648 4.839265e+01   Male
-649 3.000000e-01   Male
-650 3.000000e-01 Female
-651 2.251656e-01 Female</code></pre>
+<pre><code>    age gender
+1     2 Female
+2     4 Female
+3     4   Male
+4     4   Male
+5     1   Male
+6     4   Male
+7     4 Female
+8    NA Female
+9     4   Male
+10    2   Male
+11    3   Male
+12   15 Female
+13    8   Male
+14   12   Male
+15   15   Male
+16    9   Male
+17    8   Male
+18    7 Female
+19   11 Female
+20   10   Male
+21    8   Male
+22   11 Female
+23    2   Male
+24    2 Female
+25    3 Female
+26    5   Male
+27    1   Male
+28    3 Female
+29    5 Female
+30    5 Female
+31    3   Male
+32    1   Male
+33    4 Female
+34    3   Male
+35    2 Female
+36   11 Female
+37    7   Male
+38    8   Male
+39    6   Male
+40    6   Male
+41   11 Female
+42   10   Male
+43    6 Female
+44   12   Male
+45   11   Male
+46   10   Male
+47   11   Male
+48   13 Female
+49    3 Female
+50    4 Female
+51    3   Male
+52    1   Male
+53    2 Female
+54    2 Female
+55    4   Male
+56    2   Male
+57    2   Male
+58    3 Female
+59    3 Female
+60    4   Male
+61    1 Female
+62   13 Female
+63   13 Female
+64    6   Male
+65   13   Male
+66    5 Female
+67   13 Female
+68   14   Male
+69   13   Male
+70    8 Female
+71    7   Male
+72    6 Female
+73   13   Male
+74    3   Male
+75    4   Male
+76    2   Male
+77   NA   Male
+78    5 Female
+79    3   Male
+80    3   Male
+81   14   Male
+82   11 Female
+83    7 Female
+84    7   Male
+85   11 Female
+86    9 Female
+87   14   Male
+88   13 Female
+89    1   Male
+90    1   Male
+91    4   Male
+92    1 Female
+93    2   Male
+94    3 Female
+95    2   Male
+96    1   Male
+97    2   Male
+98    2 Female
+99    4 Female
+100   5 Female
+101   5   Male
+102   6 Female
+103  14 Female
+104  14   Male
+105  10   Male
+106   6 Female
+107   6   Male
+108   8   Male
+109   6 Female
+110  12 Female
+111  12   Male
+112  14 Female
+113  15   Male
+114  12 Female
+115   4 Female
+116   4   Male
+117   3 Female
+118  NA   Male
+119   2 Female
+120   3   Male
+121  NA Female
+122   3 Female
+123   3   Male
+124   2 Female
+125   4 Female
+126  10 Female
+127   7 Female
+128  11 Female
+129   6 Female
+130  11   Male
+131   9   Male
+132   6   Male
+133  13 Female
+134  10 Female
+135   6 Female
+136  11 Female
+137   7   Male
+138   6 Female
+139   4 Female
+140   4 Female
+141   4   Male
+142   4 Female
+143   4   Male
+144   4   Male
+145   3   Male
+146   4 Female
+147   3   Male
+148   3   Male
+149  13 Female
+150   7 Female
+151  10   Male
+152   6   Male
+153  10 Female
+154  12 Female
+155  10   Male
+156  10   Male
+157  13   Male
+158  13 Female
+159   5 Female
+160   3 Female
+161   4   Male
+162   1   Male
+163   3 Female
+164   4   Male
+165   4   Male
+166   1   Male
+167   5 Female
+168   6 Female
+169  14 Female
+170   6   Male
+171  13 Female
+172   9   Male
+173  11   Male
+174  10   Male
+175   5 Female
+176  14   Male
+177   7   Male
+178  10   Male
+179   6   Male
+180   5   Male
+181   3 Female
+182   4   Male
+183   2 Female
+184   3   Male
+185   3 Female
+186   2 Female
+187   3   Male
+188   5 Female
+189   2   Male
+190   3 Female
+191  14 Female
+192   9 Female
+193  14 Female
+194   9 Female
+195   8 Female
+196   7   Male
+197  13   Male
+198   8 Female
+199   6   Male
+200  12 Female
+201  14 Female
+202  15 Female
+203   2 Female
+204   4 Female
+205   3   Male
+206   3 Female
+207   3   Male
+208   4 Female
+209   3   Male
+210  14 Female
+211   8   Male
+212   7   Male
+213  14 Female
+214  13 Female
+215  13 Female
+216   7   Male
+217   8 Female
+218  10 Female
+219   9   Male
+220   9 Female
+221   3 Female
+222   4   Male
+223   4 Female
+224   4   Male
+225   2 Female
+226   1 Female
+227   3 Female
+228   2   Male
+229   3   Male
+230   5   Male
+231   2 Female
+232   2   Male
+233   9   Male
+234  13   Male
+235  10 Female
+236   6   Male
+237  13 Female
+238  11   Male
+239  10   Male
+240   8 Female
+241   9 Female
+242  10   Male
+243  14   Male
+244   1 Female
+245   2   Male
+246   3 Female
+247   2   Male
+248   3 Female
+249   2 Female
+250   3 Female
+251   5 Female
+252  10 Female
+253   7   Male
+254  13 Female
+255  15   Male
+256  11 Female
+257  10 Female
+258   3 Female
+259   2   Male
+260   3   Male
+261   3 Female
+262   3 Female
+263   4   Male
+264   3   Male
+265   2   Male
+266   4   Male
+267   2 Female
+268   8   Male
+269  11   Male
+270   6   Male
+271  14 Female
+272  14   Male
+273   5 Female
+274   5   Male
+275  10 Female
+276  13   Male
+277   6   Male
+278   5   Male
+279  12   Male
+280   2   Male
+281   3 Female
+282   1 Female
+283   1   Male
+284   1 Female
+285   2 Female
+286   5 Female
+287   5   Male
+288   4 Female
+289   2   Male
+290  NA Female
+291   6 Female
+292   8   Male
+293  15   Male
+294  11   Male
+295  14   Male
+296   6   Male
+297  10 Female
+298  12   Male
+299  14   Male
+300  10   Male
+301   1 Female
+302   3   Male
+303   2   Male
+304   3 Female
+305   4   Male
+306   3   Male
+307   4 Female
+308   4   Male
+309   1 Female
+310   7   Male
+311  11 Female
+312   7 Female
+313   5 Female
+314  10   Male
+315   9 Female
+316  13   Male
+317  11 Female
+318  13   Male
+319   9 Female
+320  15 Female
+321   7 Female
+322   4   Male
+323   1   Male
+324   1   Male
+325   2 Female
+326   2 Female
+327   3   Male
+328   2   Male
+329   3   Male
+330   4 Female
+331   7 Female
+332  11 Female
+333  10 Female
+334   5   Male
+335   8   Male
+336  15   Male
+337  14   Male
+338   2   Male
+339   2 Female
+340   2   Male
+341   5   Male
+342   4 Female
+343   3   Male
+344   5 Female
+345   4 Female
+346   2 Female
+347   1 Female
+348   7   Male
+349   8 Female
+350  NA   Male
+351   9   Male
+352   8 Female
+353   5   Male
+354  14   Male
+355  14   Male
+356   7 Female
+357  13 Female
+358   2   Male
+359   1 Female
+360   1   Male
+361   4 Female
+362   3   Male
+363   4 Female
+364   3   Male
+365   1   Male
+366   5 Female
+367   4 Female
+368   4 Female
+369   4   Male
+370  11   Male
+371  15 Female
+372  12 Female
+373  11 Female
+374   8 Female
+375  13   Male
+376  10 Female
+377  10 Female
+378  15   Male
+379   8 Female
+380  14   Male
+381   4   Male
+382   1   Male
+383   5 Female
+384   2   Male
+385   2 Female
+386   4   Male
+387   4   Male
+388   2 Female
+389   3   Male
+390  11   Male
+391  10 Female
+392   6   Male
+393  12 Female
+394  10 Female
+395   8   Male
+396   8   Male
+397  13   Male
+398  10   Male
+399  13 Female
+400  10   Male
+401   2   Male
+402   4 Female
+403   3 Female
+404   2 Female
+405   1 Female
+406   3   Male
+407   3 Female
+408   4   Male
+409   5 Female
+410   5 Female
+411   1 Female
+412  11   Male
+413   6   Male
+414  14 Female
+415   8   Male
+416   8 Female
+417   9 Female
+418   7   Male
+419   6   Male
+420  12 Female
+421   8   Male
+422  11 Female
+423  14   Male
+424   3 Female
+425   1 Female
+426   5 Female
+427   2 Female
+428   3 Female
+429   4 Female
+430   2   Male
+431   3 Female
+432   4   Male
+433   1 Female
+434   7 Female
+435  10   Male
+436  11   Male
+437   7 Female
+438  10 Female
+439  14 Female
+440   7 Female
+441  11   Male
+442  12   Male
+443  10 Female
+444   6   Male
+445  13   Male
+446   8 Female
+447   2   Male
+448   3 Female
+449   1 Female
+450   2 Female
+451  NA   Male
+452  NA Female
+453   4   Male
+454   4   Male
+455   1   Male
+456   2 Female
+457   2   Male
+458  12   Male
+459  12 Female
+460   8 Female
+461  14 Female
+462  13 Female
+463   6   Male
+464  11 Female
+465  11   Male
+466  10 Female
+467  12   Male
+468  14 Female
+469  11 Female
+470   1   Male
+471   2 Female
+472   3   Male
+473   3 Female
+474   5 Female
+475   3   Male
+476   1   Male
+477   4 Female
+478   4 Female
+479   4   Male
+480   2 Female
+481   5 Female
+482   7   Male
+483   8   Male
+484  10   Male
+485   6 Female
+486   7   Male
+487  10 Female
+488   6   Male
+489   6 Female
+490  15 Female
+491   5   Male
+492   3   Male
+493   5   Male
+494   3 Female
+495   5   Male
+496   5   Male
+497   1 Female
+498   1   Male
+499   7 Female
+500  14 Female
+501   9   Male
+502  10 Female
+503  10 Female
+504  11   Male
+505  11 Female
+506  12 Female
+507  11 Female
+508  12   Male
+509  12   Male
+510  10 Female
+511   1   Male
+512   2 Female
+513   4   Male
+514   2   Male
+515   3   Male
+516   3 Female
+517   2   Male
+518   4   Male
+519   3   Male
+520   1 Female
+521   4   Male
+522  12 Female
+523   6   Male
+524   7 Female
+525   7   Male
+526  13 Female
+527   8 Female
+528   7   Male
+529   8 Female
+530   8 Female
+531  11 Female
+532  14 Female
+533   3   Male
+534   2 Female
+535   2   Male
+536   3   Male
+537   2   Male
+538   2 Female
+539   3 Female
+540   2   Male
+541   5   Male
+542  10 Female
+543  14   Male
+544   9   Male
+545   6   Male
+546   7   Male
+547  14 Female
+548   7 Female
+549   7   Male
+550   9   Male
+551  14   Male
+552  10 Female
+553  13 Female
+554   5   Male
+555   4 Female
+556   4 Female
+557   5 Female
+558   4 Female
+559   4   Male
+560   4   Male
+561   3 Female
+562   1 Female
+563   4   Male
+564   1   Male
+565   1 Female
+566   7   Male
+567  13 Female
+568  10 Female
+569  14   Male
+570  12 Female
+571  14   Male
+572   8   Male
+573   7   Male
+574  11 Female
+575   8   Male
+576  12   Male
+577   9 Female
+578   5 Female
+579   4   Male
+580   3 Female
+581   2   Male
+582   2   Male
+583   3   Male
+584   4 Female
+585   4   Male
+586   4 Female
+587   5   Male
+588   3 Female
+589   6 Female
+590   3   Male
+591  11 Female
+592  11   Male
+593   7   Male
+594   8   Male
+595   6 Female
+596  10 Female
+597   8 Female
+598   8   Male
+599   9 Female
+600   8   Male
+601  13   Male
+602  11   Male
+603   8 Female
+604   2 Female
+605   4   Male
+606   2   Male
+607   2 Female
+608   4   Male
+609   2   Male
+610   4 Female
+611   2 Female
+612   4 Female
+613   1 Female
+614   4 Female
+615  12 Female
+616   7 Female
+617  11   Male
+618   6   Male
+619   8   Male
+620  14   Male
+621  11   Male
+622   7 Female
+623  14 Female
+624   6   Male
+625  13 Female
+626  13 Female
+627   3   Male
+628   1   Male
+629   3   Male
+630   1 Female
+631   1 Female
+632   2   Male
+633   4   Male
+634   4   Male
+635   2 Female
+636   4 Female
+637   5   Male
+638   3 Female
+639   3   Male
+640   6 Female
+641  11 Female
+642   9 Female
+643   7 Female
+644   8   Male
+645  NA Female
+646   8 Female
+647  14 Female
+648  10   Male
+649  10   Male
+650  11 Female
+651  13 Female</code></pre>
 </div>
 </div>
 <p>We can also remove selected columns using indexing, or by simply assigning <code>NULL</code> to the column.</p>
 <div class="cell">
 <div class="sourceCode cell-code" id="cb46"><pre class="sourceCode numberSource r number-lines code-with-copy"><code class="sourceCode r"><span id="cb46-1"><a href="#cb46-1"></a>df[, <span class="sc">-</span><span class="dv">5</span>] <span class="co">#remove column 5, "slum" variable</span></span></code><button title="Copy to Clipboard" class="code-copy-button"><i class="bi"></i></button></pre></div>
-<div class="cell-output cell-output-stdout">
-<pre><code>    IgG_concentration          age age.1 gender
-1                5772 3.176895e-01     2 Female
-2                8095 3.436823e+00     4 Female
-3                9784 3.000000e-01     4   Male
-4                9338 1.432363e+02     4   Male
-5                6369 4.476534e-01     1   Male
-6                6885 2.527076e-02     4   Male
-7                6252 6.101083e-01     4 Female
-8                8913 3.000000e-01    NA Female
-9                7332 2.916968e+00     4   Male
-10               6941 1.649819e+00     2   Male
-11               5104 4.574007e+00     3   Male
-12               9078 1.583904e+02    15 Female
-13               9960           NA     8   Male
-14               9651 1.065068e+02    12   Male
-15               9229 1.113870e+02    15   Male
-16               5210 4.144893e+01     9   Male
-17               5105 3.000000e-01     8   Male
-18               7607 2.527076e-01     7 Female
-19               7582 8.159247e+01    11 Female
-20               8179 1.825342e+02    10   Male
-21               5660 4.244656e+01     8   Male
-22               6696 1.193493e+02    11 Female
-23               7842 3.000000e-01     2   Male
-24               6578 3.000000e-01     2 Female
-25               9619 9.025271e-01     3 Female
-26               9838 3.501805e-01     5   Male
-27               6935 3.000000e-01     1   Male
-28               5885 1.227437e+00     3 Female
-29               9657 1.702055e+02     5 Female
-30               9146 3.000000e-01     5 Female
-31               7056 4.801444e-01     3   Male
-32               9144 2.527076e-02     1   Male
-33               8696 3.000000e-01     4 Female
-34               7042 5.776173e-02     3   Male
-35               5278 4.801444e-01     2 Female
-36               6541 3.826715e-01    11 Female
-37               6070 3.000000e-01     7   Male
-38               5490 4.048558e+02     8   Male
-39               6527 3.000000e-01     6   Male
-40               5389 5.451264e-01     6   Male
-41               9003 3.000000e-01    11 Female
-42               6682 5.590753e+01    10   Male
-43               7844 2.202166e-01     6 Female
-44               8257 1.709760e+02    12   Male
-45               7767 1.227437e+00    11   Male
-46               8391 4.567527e+02    10   Male
-47               8317 4.838480e+01    11   Male
-48               7397 1.227437e-01    13 Female
-49               8495 1.877256e-01     3 Female
-50               8093 3.000000e-01     4 Female
-51               7375 3.501805e-01     3   Male
-52               5255 3.339350e+00     1   Male
-53               8445 3.000000e-01     2 Female
-54               8959 5.451264e-01     2 Female
-55               8400           NA     4   Male
-56               7420 2.104693e+00     2   Male
-57               5206           NA     2   Male
-58               7431 3.826715e-01     3 Female
-59               7230 3.926366e+01     3 Female
-60               8208 1.129964e+00     4   Male
-61               8538 3.501805e+00     1 Female
-62               6125 7.542808e+01    13 Female
-63               5767 4.800475e+01    13 Female
-64               5487 1.000000e+00     6   Male
-65               5539 4.068884e+01    13   Male
-66               5759 3.000000e-01     5 Female
-67               6845 4.377672e+01    13 Female
-68               7170 1.193493e+02    14   Male
-69               6588 6.977740e+01    13   Male
-70               7939 1.373288e+02     8 Female
-71               5006 1.642979e+02     7   Male
-72               9180           NA     6 Female
-73               9638 1.542808e+02    13   Male
-74               7781 6.033058e-01     3   Male
-75               6932 2.809917e-01     4   Male
-76               8120 1.966942e+00     2   Male
-77               9292 2.041322e+00    NA   Male
-78               9228 2.115702e+00     5 Female
-79               8185 4.663043e+02     3   Male
-80               6797 3.000000e-01     3   Male
-81               5970 1.500796e+02    14   Male
-82               7219 1.543790e+02    11 Female
-83               6870 2.561983e-01     7 Female
-84               7653 1.596338e+02     7   Male
-85               8824 1.732484e+02    11 Female
-86               8311 4.641304e+02     9 Female
-87               9458 3.736364e+01    14   Male
-88               8275 1.572452e+02    13 Female
-89               6786 3.000000e-01     1   Male
-90               6595 3.000000e-01     1   Male
-91               5264 8.264463e-02     4   Male
-92               9188 6.776859e-01     1 Female
-93               6611 7.272727e-01     2   Male
-94               6840 2.066116e-01     3 Female
-95               5663 1.966942e+00     2   Male
-96               9611 3.000000e-01     1   Male
-97               7717 3.000000e-01     2   Male
-98               8374 2.809917e-01     2 Female
-99               5134 8.016529e-01     4 Female
-100              8122 1.818182e-01     5 Female
-101              6192 1.818182e-01     5   Male
-102              9668 8.264463e-02     6 Female
-103              9577 3.422727e+01    14 Female
-104              6403 8.743506e+00    14   Male
-105              9464 3.000000e-01    10   Male
-106              8157 1.641720e+02     6 Female
-107              9451 4.049587e-01     6   Male
-108              6615 1.001592e+02     8   Male
-109              9074 4.489130e+02     6 Female
-110              7479 1.101911e+02    12 Female
-111              8946 4.440909e+01    12   Male
-112              5296 1.288217e+02    14 Female
-113              6238 2.840909e+01    15   Male
-114              6303 1.003981e+02    12 Female
-115              6662 8.512397e-01     4 Female
-116              6251 1.322314e-01     4   Male
-117              9110 1.297521e+00     3 Female
-118              8480 1.570248e-01    NA   Male
-119              5229 1.966942e+00     2 Female
-120              9173 1.536624e+02     3   Male
-121              9896 3.000000e-01    NA Female
-122              5057 3.000000e-01     3 Female
-123              7732 1.074380e+00     3   Male
-124              6882 1.099174e+00     2 Female
-125              9587 3.057851e-01     4 Female
-126              9930 3.000000e-01    10 Female
-127              6960 5.785124e-02     7 Female
-128              6335 4.391304e+02    11 Female
-129              6286 6.130435e+02     6 Female
-130              9035 1.074380e-01    11   Male
-131              5720 7.125796e+01     9   Male
-132              7368 4.222727e+01     6   Male
-133              5170 1.620223e+02    13 Female
-134              6691 3.750000e+01    10 Female
-135              6173 1.534236e+02     6 Female
-136              8170 6.239130e+02    11 Female
-137              9637 5.521739e+02     7   Male
-138              9482 5.785124e-02     6 Female
-139              7880 6.547945e-01     4 Female
-140              6307 8.767123e-02     4 Female
-141              8822 3.000000e-01     4   Male
-142              8190 2.849315e+00     4 Female
-143              7554 3.835616e-02     4   Male
-144              6519 2.849315e-01     4   Male
-145              9764 4.649315e+00     3   Male
-146              8792 1.369863e-01     4 Female
-147              6721 3.589041e-01     3   Male
-148              9042 1.049315e+00     3   Male
-149              7407 4.668998e+01    13 Female
-150              7229 1.473510e+02     7 Female
-151              7532 4.589744e+01    10   Male
-152              6516 2.109589e-01     6   Male
-153              7941 1.741722e+02    10 Female
-154              8124 2.496503e+01    12 Female
-155              7869 1.850993e+02    10   Male
-156              5647 1.863014e-01    10   Male
-157              9120 1.863014e-01    13   Male
-158              6608 4.589744e+01    13 Female
-159              8635 1.942881e+02     5 Female
-160              9341 5.079646e+02     3 Female
-161              9982 8.767123e-01     4   Male
-162              6976 2.750685e+00     1   Male
-163              6008 1.503311e+02     3 Female
-164              5432 3.000000e-01     4   Male
-165              5749 3.095890e-01     4   Male
-166              6428 3.000000e-01     1   Male
-167              5947 6.371681e+02     5 Female
-168              6027 6.054795e-01     6 Female
-169              5064 1.955298e+02    14 Female
-170              5861 1.786424e+02     6   Male
-171              6702 1.120861e+02    13 Female
-172              7851 1.331954e+02     9   Male
-173              8310 2.159292e+02    11   Male
-174              5897 5.628319e+02    10   Male
-175              9249 1.900662e+02     5 Female
-176              9163 6.547945e-01    14   Male
-177              6550 1.665753e+00     7   Male
-178              5859 1.739238e+02    10   Male
-179              5607 9.991722e+01     6   Male
-180              8746 9.321192e+01     5   Male
-181              5274 8.767123e-02     3 Female
-182              9412           NA     4   Male
-183              5691 6.794521e-01     2 Female
-184              9016 5.808219e-01     3   Male
-185              9128 1.369863e-01     3 Female
-186              8539 2.060274e+00     2 Female
-187              5703 1.610099e+02     3   Male
-188              9573 4.082192e-01     5 Female
-189              5852 8.273973e-01     2   Male
-190              5971 4.601770e+02     3 Female
-191              7015 1.389073e+02    14 Female
-192              8221 3.867133e+01     9 Female
-193              6752 9.260274e-01    14 Female
-194              7436 5.918874e+01     9 Female
-195              6869 1.870861e+02     8 Female
-196              8947 4.328767e-01     7   Male
-197              7360 6.301370e-02    13   Male
-198              7494 3.000000e-01     8 Female
-199              8243 1.548013e+02     6   Male
-200              6176 5.819536e+01    12 Female
-201              6818 1.724338e+02    14 Female
-202              8083 1.932401e+01    15 Female
-203              6711 2.164420e+00     2 Female
-204              8890 9.757412e-01     4 Female
-205              5576 1.509434e-01     3   Male
-206              8396 1.509434e-01     3 Female
-207              5986 7.766571e+01     3   Male
-208              9758 4.319563e+01     4 Female
-209              5444 1.752022e-01     3   Male
-210              6394 3.094775e+01    14 Female
-211              5694 1.266846e-01     8   Male
-212              9604 2.919806e+01     7   Male
-213              7895 9.545455e+00    14 Female
-214              5141 2.735115e+01    13 Female
-215              8034 1.314841e+02    13 Female
-216              6566 3.643985e+01     7   Male
-217              6827 1.498559e+02     8 Female
-218              7400 9.363636e+00    10 Female
-219              9094 2.479784e-01     9   Male
-220              9474 5.390836e-02     9 Female
-221              7984 8.787062e-01     3 Female
-222              9524 1.994609e-01     4   Male
-223              9598 3.000000e-01     4 Female
-224              9664 3.000000e-01     4   Male
-225              9910 5.390836e-03     2 Female
-226              9216 4.177898e-01     1 Female
-227              9706 3.000000e-01     3 Female
-228              5320 2.479784e-01     2   Male
-229              5256 2.964960e-02     3   Male
-230              9006 2.964960e-01     5   Male
-231              6413 5.148248e+00     2 Female
-232              8717 1.994609e-01     2   Male
-233              9873 3.000000e-01     9   Male
-234              6699 1.779539e+02    13   Male
-235              8228 3.290210e+02    10 Female
-236              6494 3.000000e-01     6   Male
-237              9294 1.809798e+02    13 Female
-238              7680 4.905660e-01    11   Male
-239              7534 1.266846e-01    10   Male
-240              9920 1.543948e+02     8 Female
-241              9814 1.379683e+02     9 Female
-242              5363 6.153846e+02    10   Male
-243              5842 1.474784e+02    14   Male
-244              7992 3.000000e-01     1 Female
-245              5565 1.024259e+00     2   Male
-246              5258 4.444056e+02     3 Female
-247              8200 3.000000e-01     2   Male
-248              8795 2.504043e+00     3 Female
-249              7676 3.000000e-01     2 Female
-250              7029 3.000000e-01     3 Female
-251              7535 7.816712e-02     5 Female
-252              5026 3.000000e-01    10 Female
-253              8630 5.390836e-02     7   Male
-254              6989 1.494236e+02    13 Female
-255              8454 5.972622e+01    15   Male
-256              9741 6.361186e-01    11 Female
-257              6418 1.837896e+02    10 Female
-258              9922 1.320809e+02     3 Female
-259              8504 1.571906e-01     2   Male
-260              6491 1.520231e+02     3   Male
-261              6002 3.000000e-01     3 Female
-262              7127 3.000000e-01     3 Female
-263              8540 1.823699e+02     4   Male
-264              7115 3.000000e-01     3   Male
-265              7268 2.173913e+00     2   Male
-266              8279 2.142202e+01     4   Male
-267              8880 3.000000e-01     2 Female
-268              8076 3.408027e+00     8   Male
-269              6250 4.155963e+01    11   Male
-270              8542 9.698997e-02     6   Male
-271              5393 1.238532e+01    14 Female
-272              9197 9.528926e+00    14   Male
-273              6651 1.916185e+02     5 Female
-274              7473 1.060201e+00     5   Male
-275              6589 3.679104e+02    10 Female
-276              6867 4.288991e+01    13   Male
-277              5413 9.971098e+01     6   Male
-278              6765 3.000000e-01     5   Male
-279              8933 1.208092e+02    12   Male
-280              6294 3.000000e-01     2   Male
-281              8688 6.688963e-03     3 Female
-282              8108 2.505017e+00     1 Female
-283              6926 1.481605e+00     1   Male
-284              5880 3.000000e-01     1 Female
-285              5529 5.183946e-01     2 Female
-286              8963 3.000000e-01     5 Female
-287              9594 1.872910e-01     5   Male
-288              8075 3.678930e-01     4 Female
-289              5680 3.000000e-01     2   Male
-290              5617 4.529851e+02    NA Female
-291              5080 3.169725e+01     6 Female
-292              7719 3.000000e-01     8   Male
-293              6780 4.922018e+01    15   Male
-294              8768 2.548507e+02    11   Male
-295              7031 1.661850e+02    14   Male
-296              7740 9.164179e+02     6   Male
-297              8855 3.678930e-01    10 Female
-298              7241 1.236994e+02    12   Male
-299              8156 6.705202e+01    14   Male
-300              7333 3.834862e+01    10   Male
-301              6906 1.963211e+00     1 Female
-302              9511 3.000000e-01     3   Male
-303              9336 2.474916e-01     2   Male
-304              6644 3.000000e-01     3 Female
-305              5554 2.173913e-01     4   Male
-306              8094 8.193980e-01     3   Male
-307              8836 2.444816e+00     4 Female
-308              7147 3.000000e-01     4   Male
-309              7745 1.571906e-01     1 Female
-310              9345 1.849711e+02     7   Male
-311              5606 6.119403e+02    11 Female
-312              9766 3.000000e-01     7 Female
-313              6666 4.280936e-01     5 Female
-314              9965 9.698997e-02    10   Male
-315              7927 3.678930e-02     9 Female
-316              6266 4.832090e+02    13   Male
-317              9487 1.390173e+02    11 Female
-318              7089 3.000000e-01    13   Male
-319              5731 6.555970e+02     9 Female
-320              7962 1.526012e+02    15 Female
-321              9532 3.000000e-01     7 Female
-322              6687 7.222222e-01     4   Male
-323              6570 7.724426e+01     1   Male
-324              5781 3.000000e-01     1   Male
-325              8935 6.111111e-01     2 Female
-326              5780 1.555556e+00     2 Female
-327              9029 3.055556e-01     3   Male
-328              5668 1.500000e+00     2   Male
-329              8203 1.470772e+02     3   Male
-330              7381 1.694444e+00     4 Female
-331              7734 3.138298e+02     7 Female
-332              7257 1.414405e+02    11 Female
-333              8418 1.990605e+02    10 Female
-334              8259 4.212766e+02     5   Male
-335              5587 3.000000e-01     8   Male
-336              8499 3.000000e-01    15   Male
-337              7897 6.478723e+02    14   Male
-338              8300 3.000000e-01     2   Male
-339              9691 2.222222e+00     2 Female
-340              5873 3.000000e-01     2   Male
-341              6690 2.055556e+00     5   Male
-342              9970 2.777778e-02     4 Female
-343              8978 8.333333e-02     3   Male
-344              6181 1.032359e+02     5 Female
-345              8218 1.611111e+00     4 Female
-346              5387 8.333333e-02     2 Female
-347              7850 2.333333e+00     1 Female
-348              7326 5.755319e+02     7   Male
-349              8448 1.686848e+02     8 Female
-350              7264 1.111111e-01    NA   Male
-351              8361 3.000000e-01     9   Male
-352              7497 8.372340e+02     8 Female
-353              5559 3.000000e-01     5   Male
-354              7321 3.784504e+01    14   Male
-355              8372 3.819149e+02    14   Male
-356              5030 5.555556e-02     7 Female
-357              6936 3.000000e+02    13 Female
-358              9628 1.855950e+02     2   Male
-359              8558 1.944444e-01     1 Female
-360              7840 3.000000e-01     1   Male
-361              5100 5.555556e-02     4 Female
-362              8244 1.138889e+00     3   Male
-363              9115 4.254237e+01     4 Female
-364              5489 3.000000e-01     3   Male
-365              5766 3.000000e-01     1   Male
-366              5024 3.000000e-01     5 Female
-367              8599 3.000000e-01     4 Female
-368              8895 3.138298e+02     4 Female
-369              7708 1.235908e+02     4   Male
-370              7646 4.159574e+02    11   Male
-371              6640 3.009685e+01    15 Female
-372              8958 1.567850e+02    12 Female
-373              6477 1.367432e+02    11 Female
-374              7910 3.731235e+01     8 Female
-375              7829 9.164927e+01    13   Male
-376              7503 2.936170e+02    10 Female
-377              5209 8.820459e+01    10 Female
-378              6763 1.035491e+02    15   Male
-379              8976 7.379958e+01     8 Female
-380              9223 3.000000e-01    14   Male
-381              7692 1.718750e+02     4   Male
-382              7453 2.128527e+00     1   Male
-383              9775 1.253918e+00     5 Female
-384              9662 2.382445e-01     2   Male
-385              8733 4.639498e-01     2 Female
-386              5695 1.253918e-01     4   Male
-387              7714 1.253918e-01     4   Male
-388              9224 3.000000e-01     2 Female
-389              7635 1.000000e+00     3   Male
-390              7176 1.570043e+02    11   Male
-391              6102 4.344086e+02    10 Female
-392              7817 2.184953e+00     6   Male
-393              9719 1.507837e+00    12 Female
-394              9740 3.228840e-01    10 Female
-395              9528 4.588024e+01     8   Male
-396              7142 1.660560e+02     8   Male
-397              5689 3.000000e-01    13   Male
-398              5439 3.043011e+02    10   Male
-399              6718 2.612903e+02    13 Female
-400              6569 1.621767e+02    10   Male
-401              9444 3.228840e-01     2   Male
-402              6964 4.639498e-01     4 Female
-403              6420 2.495298e+00     3 Female
-404              9189 3.257053e+00     2 Female
-405              9368 3.793103e-01     1 Female
-406              6360           NA     3   Male
-407              8196 6.896552e-02     3 Female
-408              8297 3.000000e-01     4   Male
-409              6674 1.423197e+00     5 Female
-410              5269 3.000000e-01     5 Female
-411              6599 3.000000e-01     1 Female
-412              7713 1.786638e+02    11   Male
-413              8644 3.279570e+02     6   Male
-414              9680           NA    14 Female
-415              6305 1.903017e+02     8   Male
-416              8493 1.654095e+02     8 Female
-417              5297 4.639498e-01     9 Female
-418              7723 1.815733e+02     7   Male
-419              7510 1.366771e+00     6   Male
-420              5102 1.536050e-01    12 Female
-421              7816 1.306587e+01     8   Male
-422              5143 2.129032e+02    11 Female
-423              7414 1.925647e+02    14   Male
-424              5127 3.000000e-01     3 Female
-425              5830 1.028213e+00     1 Female
-426              8929 3.793103e-01     5 Female
-427              7993 8.025078e-01     2 Female
-428              8092 4.860215e+02     3 Female
-429              9750 3.000000e-01     4 Female
-430              6660 2.100313e-01     2   Male
-431              8054 2.767665e+01     3 Female
-432              6086 1.592476e+00     4   Male
-433              6878 9.717868e-02     1 Female
-434              8125 1.028213e+00     7 Female
-435              9500 3.793103e-01    10   Male
-436              8105 1.292026e+02    11   Male
-437              9593 4.425150e+01     7 Female
-438              5202 3.193548e+02    10 Female
-439              7207 1.860991e+02    14 Female
-440              5518 6.614420e-01     7 Female
-441              9820 5.203762e-01    11   Male
-442              6958 1.330819e+02    12   Male
-443              9445 1.673491e+02    10 Female
-444              8774 3.000000e-01     6   Male
-445              9614 1.117457e+02    13   Male
-446              9810 3.045509e+01     8 Female
-447              7271 3.000000e-01     2   Male
-448              8031 8.280255e-02     3 Female
-449              7232 3.000000e-01     1 Female
-450              7452 1.200637e+00     2 Female
-451              5921 1.687898e-01    NA   Male
-452              8136 7.367273e+02    NA Female
-453              6605 8.280255e-02     4   Male
-454              5125 5.127389e-01     4   Male
-455              5911 1.974522e-01     1   Male
-456              9644 7.993631e-01     2 Female
-457              5760 3.000000e-01     2   Male
-458              7055 3.298182e+02    12   Male
-459              9064 9.736842e+01    12 Female
-460              6925 3.000000e-01     8 Female
-461              7757 3.000000e-01    14 Female
-462              8527 4.214545e+02    13 Female
-463              8521 3.000000e-01     6   Male
-464              6260 2.578182e+02    11 Female
-465              9578 2.261147e-01    11   Male
-466              9570 3.000000e-01    10 Female
-467              6246 1.883901e+02    12   Male
-468              9622 9.458204e+01    14 Female
-469              7661 3.000000e-01    11 Female
-470              9374 3.000000e-01     1   Male
-471              8446 7.707006e-01     2 Female
-472              8332 5.032727e+02     3   Male
-473              8008 1.544586e+00     3 Female
-474              9365 1.431115e+02     5 Female
-475              9819 3.000000e-01     3   Male
-476              5173 1.458599e+00     1   Male
-477              6722 1.247678e+02     4 Female
-478              7668           NA     4 Female
-479              8980 4.334545e+02     4   Male
-480              5204 3.000000e-01     2 Female
-481              6412 6.156364e+02     5 Female
-482              6404 9.574303e+01     7   Male
-483              5693 1.928019e+02     8   Male
-484              8100 1.888545e+02    10   Male
-485              9760 1.598297e+02     6 Female
-486              6377 5.127389e-01     7   Male
-487              6012 1.171053e+02    10 Female
-488              6224           NA     6   Male
-489              6561 2.547771e-02     6 Female
-490              8475 1.707430e+02    15 Female
-491              6629 3.000000e-01     5   Male
-492              7200 1.869969e+02     3   Male
-493              9453 4.731481e+01     5   Male
-494              6449 1.988390e+02     3 Female
-495              9452 3.000000e-01     5   Male
-496              7162 8.808050e+01     5   Male
-497              8962 2.003185e+00     1 Female
-498              7328 3.000000e-01     1   Male
-499              9097 3.509259e+01     7 Female
-500              9131 9.365325e+01    14 Female
-501              7280 3.000000e-01     9   Male
-502              5783 3.736111e+01    10 Female
-503              9895 1.674923e+02    10 Female
-504              7986 8.808050e+01    11   Male
-505              7146 1.656347e+02    11 Female
-506              8671 3.722222e+01    12 Female
-507              5273 6.756364e+02    11 Female
-508              5063 3.000000e-01    12   Male
-509              6729 1.698142e+02    12   Male
-510              9085 1.628483e+02    10 Female
-511              9929 5.985130e-01     1   Male
-512              8479 1.903346e+00     2 Female
-513              7395 3.000000e-01     4   Male
-514              6374 3.000000e-01     2   Male
-515              7878 8.996283e-01     3   Male
-516              9603 3.977695e-01     3 Female
-517              7994 3.000000e-01     2   Male
-518              5277 3.000000e-01     4   Male
-519              5054 3.000000e-01     3   Male
-520              5440 3.000000e-01     1 Female
-521              6551 7.446809e+02     4   Male
-522              5281 6.095745e+02    12 Female
-523              7145 1.427445e+02     6   Male
-524              5275 3.000000e-01     7 Female
-525              9542 2.973978e-02     7   Male
-526              9371 3.977695e-01    13 Female
-527              5598 4.095745e+02     8 Female
-528              7148 4.595745e+02     7   Male
-529              5624 3.000000e-01     8 Female
-530              6998 1.976341e+02     8 Female
-531              9286 3.776596e+02    11 Female
-532              7589 1.777603e+02    14 Female
-533              7095 4.312268e-01     3   Male
-534              5455 6.765957e+02     2 Female
-535              6257 7.978723e+02     2   Male
-536              8627 9.665427e-02     3   Male
-537              9786 1.879338e+02     2   Male
-538              8176 4.358670e+01     2 Female
-539              9198 3.000000e-01     3 Female
-540              6586 3.000000e-01     2   Male
-541              8850 2.638955e+01     5   Male
-542              9560 3.180523e+01    10 Female
-543              7144 1.746845e+02    14   Male
-544              8230 1.876972e+02     9   Male
-545              7559 1.044164e+02     6   Male
-546              5312 1.202681e+02     7   Male
-547              6560 1.630915e+02    14 Female
-548              6091 1.276025e+02     7 Female
-549              5578 8.880126e+01     7   Male
-550              5837 3.563830e+02     9   Male
-551              8347 2.212766e+02    14   Male
-552              6453 1.969121e+01    10 Female
-553              5758 3.755319e+02    13 Female
-554              5569 1.214511e+02     5   Male
-555              8766 1.034700e+02     4 Female
-556              8002 3.000000e-01     4 Female
-557              7839 3.643123e-01     5 Female
-558              5434 6.319703e-02     4 Female
-559              7636 3.000000e-01     4   Male
-560              6164 3.000000e-01     4   Male
-561              9243 3.000000e-01     3 Female
-562              5872 3.000000e-01     1 Female
-563              8079 3.000000e-01     4   Male
-564              9762 3.000000e-01     1   Male
-565              9476 3.000000e-01     1 Female
-566              8345 3.000000e-01     7   Male
-567              8128 1.664038e+02    13 Female
-568              7956 2.946809e+02    10 Female
-569              8677 4.391924e+01    14   Male
-570              5881 1.874606e+02    12 Female
-571              7498 1.143533e+02    14   Male
-572              8134 1.600158e+02     8   Male
-573              7748 1.635688e-01     7   Male
-574              7990 8.809148e+01    11 Female
-575              6184 1.337539e+02     8   Male
-576              6339 1.985804e+02    12   Male
-577              5113 1.578864e+02     9 Female
-578              9449 3.000000e-01     5 Female
-579              8110 3.000000e-01     4   Male
-580              9307 1.953642e-01     3 Female
-581              5555 1.119205e+00     2   Male
-582              9152 2.523636e+02     2   Male
-583              7969 3.000000e-01     3   Male
-584              6116 4.844371e+00     4 Female
-585              8294 3.000000e-01     4   Male
-586              8938 1.492553e+02     4 Female
-587              9539 1.993617e+02     5   Male
-588              9470 2.847682e-01     3 Female
-589              6677 3.145695e-01     6 Female
-590              8752 3.000000e-01     3   Male
-591              5574 3.406429e+01    11 Female
-592              5989 6.595745e+01    11   Male
-593              9813 3.000000e-01     7   Male
-594              6150 2.174545e+02     8   Male
-595              5730           NA     6 Female
-596              8038 5.957447e+01    10 Female
-597              5964 7.236364e+02     8 Female
-598              9043 3.000000e-01     8   Male
-599              5095 3.000000e-01     9 Female
-600              8922 3.000000e-01     8   Male
-601              5469 2.676364e+02    13   Male
-602              6726 1.891489e+02    11   Male
-603              7495 3.036364e+02     8 Female
-604              8159 3.000000e-01     2 Female
-605              6709 3.000000e-01     4   Male
-606              5855 3.000000e-01     2   Male
-607              6058 3.000000e-01     2 Female
-608              7292 3.000000e-01     4   Male
-609              6437 1.447020e+00     2   Male
-610              9326 2.130909e+02     4 Female
-611              8222 1.357616e-01     2 Female
-612              6789 3.000000e-01     4 Female
-613              6348 3.000000e-01     1 Female
-614              5958 5.534545e+02     4 Female
-615              9211 1.891489e+02    12 Female
-616              9450 7.202128e+01     7 Female
-617              6540 3.250287e+01    11   Male
-618              8796 1.655629e-02     6   Male
-619              7971 3.123636e+02     8   Male
-620              7549 3.000000e-01    14   Male
-621              9799 7.138298e+01    11   Male
-622              7013 3.000000e-01     7 Female
-623              5599 6.946809e+01    14 Female
-624              8601 4.012629e+01     6   Male
-625              7383 1.629787e+02    13 Female
-626              6656 1.508511e+02    13 Female
-627              5641 1.655629e-02     3   Male
-628              6222 3.000000e-01     1   Male
-629              7674 4.635762e-02     3   Male
-630              5293 3.000000e-01     1 Female
-631              6715 3.000000e-01     1 Female
-632              7057 3.000000e-01     2   Male
-633              7072 1.942553e+02     4   Male
-634              6380 3.690909e+02     4   Male
-635              6762 3.000000e-01     2 Female
-636              5799 3.000000e-01     4 Female
-637              6681 2.847682e+00     5   Male
-638              8755 1.435106e+02     3 Female
-639              6896 3.000000e-01     3   Male
-640              5945 4.752009e+01     6 Female
-641              5035 2.621125e+01    11 Female
-642              6776 1.055319e+02     9 Female
-643              7863 3.000000e-01     7 Female
-644              9836 1.149007e+00     8   Male
-645              7860 2.927273e+02    NA Female
-646              5248 3.000000e-01     8 Female
-647              5677 3.000000e-01    14 Female
-648              9576 4.839265e+01    10   Male
-649              5824 3.000000e-01    10   Male
-650              9184 3.000000e-01    11 Female
-651              5397 2.251656e-01    13 Female</code></pre>
-</div>
 </div>
 <div class="cell">
-<div class="sourceCode cell-code" id="cb48"><pre class="sourceCode numberSource r number-lines code-with-copy"><code class="sourceCode r"><span id="cb48-1"><a href="#cb48-1"></a>df<span class="sc">$</span>slum <span class="ot">&lt;-</span> <span class="cn">NULL</span> <span class="co"># this is the same as above</span></span></code><button title="Copy to Clipboard" class="code-copy-button"><i class="bi"></i></button></pre></div>
+<div class="sourceCode cell-code" id="cb47"><pre class="sourceCode numberSource r number-lines code-with-copy"><code class="sourceCode r"><span id="cb47-1"><a href="#cb47-1"></a>df<span class="sc">$</span>slum <span class="ot">&lt;-</span> <span class="cn">NULL</span> <span class="co"># this is the same as above</span></span></code><button title="Copy to Clipboard" class="code-copy-button"><i class="bi"></i></button></pre></div>
 </div>
-<p>We can also grab the <code>age</code> column using the <code>$</code> operator.</p>
+<p>We can also grab the <code>age</code> column using the <code>$</code> operator. Again, this selects the variable for all of the rows.</p>
 <div class="cell">
-<div class="sourceCode cell-code" id="cb49"><pre class="sourceCode numberSource r number-lines code-with-copy"><code class="sourceCode r"><span id="cb49-1"><a href="#cb49-1"></a>df<span class="sc">$</span>age</span></code><button title="Copy to Clipboard" class="code-copy-button"><i class="bi"></i></button></pre></div>
-<div class="cell-output cell-output-stdout">
-<pre><code>  [1] 3.176895e-01 3.436823e+00 3.000000e-01 1.432363e+02 4.476534e-01
-  [6] 2.527076e-02 6.101083e-01 3.000000e-01 2.916968e+00 1.649819e+00
- [11] 4.574007e+00 1.583904e+02           NA 1.065068e+02 1.113870e+02
- [16] 4.144893e+01 3.000000e-01 2.527076e-01 8.159247e+01 1.825342e+02
- [21] 4.244656e+01 1.193493e+02 3.000000e-01 3.000000e-01 9.025271e-01
- [26] 3.501805e-01 3.000000e-01 1.227437e+00 1.702055e+02 3.000000e-01
- [31] 4.801444e-01 2.527076e-02 3.000000e-01 5.776173e-02 4.801444e-01
- [36] 3.826715e-01 3.000000e-01 4.048558e+02 3.000000e-01 5.451264e-01
- [41] 3.000000e-01 5.590753e+01 2.202166e-01 1.709760e+02 1.227437e+00
- [46] 4.567527e+02 4.838480e+01 1.227437e-01 1.877256e-01 3.000000e-01
- [51] 3.501805e-01 3.339350e+00 3.000000e-01 5.451264e-01           NA
- [56] 2.104693e+00           NA 3.826715e-01 3.926366e+01 1.129964e+00
- [61] 3.501805e+00 7.542808e+01 4.800475e+01 1.000000e+00 4.068884e+01
- [66] 3.000000e-01 4.377672e+01 1.193493e+02 6.977740e+01 1.373288e+02
- [71] 1.642979e+02           NA 1.542808e+02 6.033058e-01 2.809917e-01
- [76] 1.966942e+00 2.041322e+00 2.115702e+00 4.663043e+02 3.000000e-01
- [81] 1.500796e+02 1.543790e+02 2.561983e-01 1.596338e+02 1.732484e+02
- [86] 4.641304e+02 3.736364e+01 1.572452e+02 3.000000e-01 3.000000e-01
- [91] 8.264463e-02 6.776859e-01 7.272727e-01 2.066116e-01 1.966942e+00
- [96] 3.000000e-01 3.000000e-01 2.809917e-01 8.016529e-01 1.818182e-01
-[101] 1.818182e-01 8.264463e-02 3.422727e+01 8.743506e+00 3.000000e-01
-[106] 1.641720e+02 4.049587e-01 1.001592e+02 4.489130e+02 1.101911e+02
-[111] 4.440909e+01 1.288217e+02 2.840909e+01 1.003981e+02 8.512397e-01
-[116] 1.322314e-01 1.297521e+00 1.570248e-01 1.966942e+00 1.536624e+02
-[121] 3.000000e-01 3.000000e-01 1.074380e+00 1.099174e+00 3.057851e-01
-[126] 3.000000e-01 5.785124e-02 4.391304e+02 6.130435e+02 1.074380e-01
-[131] 7.125796e+01 4.222727e+01 1.620223e+02 3.750000e+01 1.534236e+02
-[136] 6.239130e+02 5.521739e+02 5.785124e-02 6.547945e-01 8.767123e-02
-[141] 3.000000e-01 2.849315e+00 3.835616e-02 2.849315e-01 4.649315e+00
-[146] 1.369863e-01 3.589041e-01 1.049315e+00 4.668998e+01 1.473510e+02
-[151] 4.589744e+01 2.109589e-01 1.741722e+02 2.496503e+01 1.850993e+02
-[156] 1.863014e-01 1.863014e-01 4.589744e+01 1.942881e+02 5.079646e+02
-[161] 8.767123e-01 2.750685e+00 1.503311e+02 3.000000e-01 3.095890e-01
-[166] 3.000000e-01 6.371681e+02 6.054795e-01 1.955298e+02 1.786424e+02
-[171] 1.120861e+02 1.331954e+02 2.159292e+02 5.628319e+02 1.900662e+02
-[176] 6.547945e-01 1.665753e+00 1.739238e+02 9.991722e+01 9.321192e+01
-[181] 8.767123e-02           NA 6.794521e-01 5.808219e-01 1.369863e-01
-[186] 2.060274e+00 1.610099e+02 4.082192e-01 8.273973e-01 4.601770e+02
-[191] 1.389073e+02 3.867133e+01 9.260274e-01 5.918874e+01 1.870861e+02
-[196] 4.328767e-01 6.301370e-02 3.000000e-01 1.548013e+02 5.819536e+01
-[201] 1.724338e+02 1.932401e+01 2.164420e+00 9.757412e-01 1.509434e-01
-[206] 1.509434e-01 7.766571e+01 4.319563e+01 1.752022e-01 3.094775e+01
-[211] 1.266846e-01 2.919806e+01 9.545455e+00 2.735115e+01 1.314841e+02
-[216] 3.643985e+01 1.498559e+02 9.363636e+00 2.479784e-01 5.390836e-02
-[221] 8.787062e-01 1.994609e-01 3.000000e-01 3.000000e-01 5.390836e-03
-[226] 4.177898e-01 3.000000e-01 2.479784e-01 2.964960e-02 2.964960e-01
-[231] 5.148248e+00 1.994609e-01 3.000000e-01 1.779539e+02 3.290210e+02
-[236] 3.000000e-01 1.809798e+02 4.905660e-01 1.266846e-01 1.543948e+02
-[241] 1.379683e+02 6.153846e+02 1.474784e+02 3.000000e-01 1.024259e+00
-[246] 4.444056e+02 3.000000e-01 2.504043e+00 3.000000e-01 3.000000e-01
-[251] 7.816712e-02 3.000000e-01 5.390836e-02 1.494236e+02 5.972622e+01
-[256] 6.361186e-01 1.837896e+02 1.320809e+02 1.571906e-01 1.520231e+02
-[261] 3.000000e-01 3.000000e-01 1.823699e+02 3.000000e-01 2.173913e+00
-[266] 2.142202e+01 3.000000e-01 3.408027e+00 4.155963e+01 9.698997e-02
-[271] 1.238532e+01 9.528926e+00 1.916185e+02 1.060201e+00 3.679104e+02
-[276] 4.288991e+01 9.971098e+01 3.000000e-01 1.208092e+02 3.000000e-01
-[281] 6.688963e-03 2.505017e+00 1.481605e+00 3.000000e-01 5.183946e-01
-[286] 3.000000e-01 1.872910e-01 3.678930e-01 3.000000e-01 4.529851e+02
-[291] 3.169725e+01 3.000000e-01 4.922018e+01 2.548507e+02 1.661850e+02
-[296] 9.164179e+02 3.678930e-01 1.236994e+02 6.705202e+01 3.834862e+01
-[301] 1.963211e+00 3.000000e-01 2.474916e-01 3.000000e-01 2.173913e-01
-[306] 8.193980e-01 2.444816e+00 3.000000e-01 1.571906e-01 1.849711e+02
-[311] 6.119403e+02 3.000000e-01 4.280936e-01 9.698997e-02 3.678930e-02
-[316] 4.832090e+02 1.390173e+02 3.000000e-01 6.555970e+02 1.526012e+02
-[321] 3.000000e-01 7.222222e-01 7.724426e+01 3.000000e-01 6.111111e-01
-[326] 1.555556e+00 3.055556e-01 1.500000e+00 1.470772e+02 1.694444e+00
-[331] 3.138298e+02 1.414405e+02 1.990605e+02 4.212766e+02 3.000000e-01
-[336] 3.000000e-01 6.478723e+02 3.000000e-01 2.222222e+00 3.000000e-01
-[341] 2.055556e+00 2.777778e-02 8.333333e-02 1.032359e+02 1.611111e+00
-[346] 8.333333e-02 2.333333e+00 5.755319e+02 1.686848e+02 1.111111e-01
-[351] 3.000000e-01 8.372340e+02 3.000000e-01 3.784504e+01 3.819149e+02
-[356] 5.555556e-02 3.000000e+02 1.855950e+02 1.944444e-01 3.000000e-01
-[361] 5.555556e-02 1.138889e+00 4.254237e+01 3.000000e-01 3.000000e-01
-[366] 3.000000e-01 3.000000e-01 3.138298e+02 1.235908e+02 4.159574e+02
-[371] 3.009685e+01 1.567850e+02 1.367432e+02 3.731235e+01 9.164927e+01
-[376] 2.936170e+02 8.820459e+01 1.035491e+02 7.379958e+01 3.000000e-01
-[381] 1.718750e+02 2.128527e+00 1.253918e+00 2.382445e-01 4.639498e-01
-[386] 1.253918e-01 1.253918e-01 3.000000e-01 1.000000e+00 1.570043e+02
-[391] 4.344086e+02 2.184953e+00 1.507837e+00 3.228840e-01 4.588024e+01
-[396] 1.660560e+02 3.000000e-01 3.043011e+02 2.612903e+02 1.621767e+02
-[401] 3.228840e-01 4.639498e-01 2.495298e+00 3.257053e+00 3.793103e-01
-[406]           NA 6.896552e-02 3.000000e-01 1.423197e+00 3.000000e-01
-[411] 3.000000e-01 1.786638e+02 3.279570e+02           NA 1.903017e+02
-[416] 1.654095e+02 4.639498e-01 1.815733e+02 1.366771e+00 1.536050e-01
-[421] 1.306587e+01 2.129032e+02 1.925647e+02 3.000000e-01 1.028213e+00
-[426] 3.793103e-01 8.025078e-01 4.860215e+02 3.000000e-01 2.100313e-01
-[431] 2.767665e+01 1.592476e+00 9.717868e-02 1.028213e+00 3.793103e-01
-[436] 1.292026e+02 4.425150e+01 3.193548e+02 1.860991e+02 6.614420e-01
-[441] 5.203762e-01 1.330819e+02 1.673491e+02 3.000000e-01 1.117457e+02
-[446] 3.045509e+01 3.000000e-01 8.280255e-02 3.000000e-01 1.200637e+00
-[451] 1.687898e-01 7.367273e+02 8.280255e-02 5.127389e-01 1.974522e-01
-[456] 7.993631e-01 3.000000e-01 3.298182e+02 9.736842e+01 3.000000e-01
-[461] 3.000000e-01 4.214545e+02 3.000000e-01 2.578182e+02 2.261147e-01
-[466] 3.000000e-01 1.883901e+02 9.458204e+01 3.000000e-01 3.000000e-01
-[471] 7.707006e-01 5.032727e+02 1.544586e+00 1.431115e+02 3.000000e-01
-[476] 1.458599e+00 1.247678e+02           NA 4.334545e+02 3.000000e-01
-[481] 6.156364e+02 9.574303e+01 1.928019e+02 1.888545e+02 1.598297e+02
-[486] 5.127389e-01 1.171053e+02           NA 2.547771e-02 1.707430e+02
-[491] 3.000000e-01 1.869969e+02 4.731481e+01 1.988390e+02 3.000000e-01
-[496] 8.808050e+01 2.003185e+00 3.000000e-01 3.509259e+01 9.365325e+01
-[501] 3.000000e-01 3.736111e+01 1.674923e+02 8.808050e+01 1.656347e+02
-[506] 3.722222e+01 6.756364e+02 3.000000e-01 1.698142e+02 1.628483e+02
-[511] 5.985130e-01 1.903346e+00 3.000000e-01 3.000000e-01 8.996283e-01
-[516] 3.977695e-01 3.000000e-01 3.000000e-01 3.000000e-01 3.000000e-01
-[521] 7.446809e+02 6.095745e+02 1.427445e+02 3.000000e-01 2.973978e-02
-[526] 3.977695e-01 4.095745e+02 4.595745e+02 3.000000e-01 1.976341e+02
-[531] 3.776596e+02 1.777603e+02 4.312268e-01 6.765957e+02 7.978723e+02
-[536] 9.665427e-02 1.879338e+02 4.358670e+01 3.000000e-01 3.000000e-01
-[541] 2.638955e+01 3.180523e+01 1.746845e+02 1.876972e+02 1.044164e+02
-[546] 1.202681e+02 1.630915e+02 1.276025e+02 8.880126e+01 3.563830e+02
-[551] 2.212766e+02 1.969121e+01 3.755319e+02 1.214511e+02 1.034700e+02
-[556] 3.000000e-01 3.643123e-01 6.319703e-02 3.000000e-01 3.000000e-01
-[561] 3.000000e-01 3.000000e-01 3.000000e-01 3.000000e-01 3.000000e-01
-[566] 3.000000e-01 1.664038e+02 2.946809e+02 4.391924e+01 1.874606e+02
-[571] 1.143533e+02 1.600158e+02 1.635688e-01 8.809148e+01 1.337539e+02
-[576] 1.985804e+02 1.578864e+02 3.000000e-01 3.000000e-01 1.953642e-01
-[581] 1.119205e+00 2.523636e+02 3.000000e-01 4.844371e+00 3.000000e-01
-[586] 1.492553e+02 1.993617e+02 2.847682e-01 3.145695e-01 3.000000e-01
-[591] 3.406429e+01 6.595745e+01 3.000000e-01 2.174545e+02           NA
-[596] 5.957447e+01 7.236364e+02 3.000000e-01 3.000000e-01 3.000000e-01
-[601] 2.676364e+02 1.891489e+02 3.036364e+02 3.000000e-01 3.000000e-01
-[606] 3.000000e-01 3.000000e-01 3.000000e-01 1.447020e+00 2.130909e+02
-[611] 1.357616e-01 3.000000e-01 3.000000e-01 5.534545e+02 1.891489e+02
-[616] 7.202128e+01 3.250287e+01 1.655629e-02 3.123636e+02 3.000000e-01
-[621] 7.138298e+01 3.000000e-01 6.946809e+01 4.012629e+01 1.629787e+02
-[626] 1.508511e+02 1.655629e-02 3.000000e-01 4.635762e-02 3.000000e-01
-[631] 3.000000e-01 3.000000e-01 1.942553e+02 3.690909e+02 3.000000e-01
-[636] 3.000000e-01 2.847682e+00 1.435106e+02 3.000000e-01 4.752009e+01
-[641] 2.621125e+01 1.055319e+02 3.000000e-01 1.149007e+00 2.927273e+02
-[646] 3.000000e-01 3.000000e-01 4.839265e+01 3.000000e-01 3.000000e-01
-[651] 2.251656e-01</code></pre>
-</div>
+<div class="sourceCode cell-code" id="cb48"><pre class="sourceCode numberSource r number-lines code-with-copy"><code class="sourceCode r"><span id="cb48-1"><a href="#cb48-1"></a>df<span class="sc">$</span>age</span></code><button title="Copy to Clipboard" class="code-copy-button"><i class="bi"></i></button></pre></div>
 </div>
 </section>
 <section id="using-indexing-to-subset-by-rows" class="slide level2">
 <h2>Using indexing to subset by rows</h2>
 <p>We can use indexing to also subset by rows. For example, here we pull the 100th observation/row.</p>
 <div class="cell">
-<div class="sourceCode cell-code" id="cb51"><pre class="sourceCode numberSource r number-lines code-with-copy"><code class="sourceCode r"><span id="cb51-1"><a href="#cb51-1"></a>df[<span class="dv">100</span>,] </span></code><button title="Copy to Clipboard" class="code-copy-button"><i class="bi"></i></button></pre></div>
+<div class="sourceCode cell-code" id="cb49"><pre class="sourceCode numberSource r number-lines code-with-copy"><code class="sourceCode r"><span id="cb49-1"><a href="#cb49-1"></a>df[<span class="dv">100</span>,] </span></code><button title="Copy to Clipboard" class="code-copy-button"><i class="bi"></i></button></pre></div>
 <div class="cell-output cell-output-stdout">
-<pre><code>    IgG_concentration       age age gender     slum
-100              8122 0.1818182   5 Female Non slum</code></pre>
+<pre><code>    observation_id IgG_concentration age gender     slum
+100           8122         0.1818182   5 Female Non slum</code></pre>
 </div>
 </div>
 <p>And, here we pull the <code>age</code> of the 100th observation/row.</p>
 <div class="cell">
-<div class="sourceCode cell-code" id="cb53"><pre class="sourceCode numberSource r number-lines code-with-copy"><code class="sourceCode r"><span id="cb53-1"><a href="#cb53-1"></a>df[<span class="dv">100</span>,<span class="st">"age"</span>] </span></code><button title="Copy to Clipboard" class="code-copy-button"><i class="bi"></i></button></pre></div>
+<div class="sourceCode cell-code" id="cb51"><pre class="sourceCode numberSource r number-lines code-with-copy"><code class="sourceCode r"><span id="cb51-1"><a href="#cb51-1"></a>df[<span class="dv">100</span>,<span class="st">"age"</span>] </span></code><button title="Copy to Clipboard" class="code-copy-button"><i class="bi"></i></button></pre></div>
 <div class="cell-output cell-output-stdout">
-<pre><code>[1] 0.1818182</code></pre>
+<pre><code>[1] 5</code></pre>
 </div>
 </div>
 </section>
@@ -2424,8 +1387,8 @@ <h2>Logical operators</h2>
 </tr>
 <tr class="even">
 <td><code>!=</code></td>
-<td>not equal to</td>
-<td style="text-align: right;"></td>
+<td></td>
+<td style="text-align: right;">not equal to</td>
 </tr>
 <tr class="odd">
 <td><code>x&amp;y</code></td>
@@ -2454,26 +1417,26 @@ <h2>Logical operators</h2>
 <h2>Logical operators examples</h2>
 <p>Let’s practice. First, here is a reminder of what the number.object contains.</p>
 <div class="cell">
-<div class="sourceCode cell-code" id="cb55"><pre class="sourceCode numberSource r number-lines code-with-copy"><code class="sourceCode r"><span id="cb55-1"><a href="#cb55-1"></a>number.object</span></code><button title="Copy to Clipboard" class="code-copy-button"><i class="bi"></i></button></pre></div>
+<div class="sourceCode cell-code" id="cb53"><pre class="sourceCode numberSource r number-lines code-with-copy"><code class="sourceCode r"><span id="cb53-1"><a href="#cb53-1"></a>number.object</span></code><button title="Copy to Clipboard" class="code-copy-button"><i class="bi"></i></button></pre></div>
 <div class="cell-output cell-output-stdout">
 <pre><code>[1] 3</code></pre>
 </div>
 </div>
 <p>Now, we will use logical operators to evaluate the object.</p>
 <div class="cell">
-<div class="sourceCode cell-code" id="cb57"><pre class="sourceCode numberSource r number-lines code-with-copy"><code class="sourceCode r"><span id="cb57-1"><a href="#cb57-1"></a>number.object<span class="sc">&lt;</span><span class="dv">4</span></span></code><button title="Copy to Clipboard" class="code-copy-button"><i class="bi"></i></button></pre></div>
+<div class="sourceCode cell-code" id="cb55"><pre class="sourceCode numberSource r number-lines code-with-copy"><code class="sourceCode r"><span id="cb55-1"><a href="#cb55-1"></a>number.object<span class="sc">&lt;</span><span class="dv">4</span></span></code><button title="Copy to Clipboard" class="code-copy-button"><i class="bi"></i></button></pre></div>
 <div class="cell-output cell-output-stdout">
 <pre><code>[1] TRUE</code></pre>
 </div>
-<div class="sourceCode cell-code" id="cb59"><pre class="sourceCode numberSource r number-lines code-with-copy"><code class="sourceCode r"><span id="cb59-1"><a href="#cb59-1"></a>number.object<span class="sc">&gt;=</span><span class="dv">3</span></span></code><button title="Copy to Clipboard" class="code-copy-button"><i class="bi"></i></button></pre></div>
+<div class="sourceCode cell-code" id="cb57"><pre class="sourceCode numberSource r number-lines code-with-copy"><code class="sourceCode r"><span id="cb57-1"><a href="#cb57-1"></a>number.object<span class="sc">&gt;=</span><span class="dv">3</span></span></code><button title="Copy to Clipboard" class="code-copy-button"><i class="bi"></i></button></pre></div>
 <div class="cell-output cell-output-stdout">
 <pre><code>[1] TRUE</code></pre>
 </div>
-<div class="sourceCode cell-code" id="cb61"><pre class="sourceCode numberSource r number-lines code-with-copy"><code class="sourceCode r"><span id="cb61-1"><a href="#cb61-1"></a>number.object<span class="sc">!=</span><span class="dv">5</span></span></code><button title="Copy to Clipboard" class="code-copy-button"><i class="bi"></i></button></pre></div>
+<div class="sourceCode cell-code" id="cb59"><pre class="sourceCode numberSource r number-lines code-with-copy"><code class="sourceCode r"><span id="cb59-1"><a href="#cb59-1"></a>number.object<span class="sc">!=</span><span class="dv">5</span></span></code><button title="Copy to Clipboard" class="code-copy-button"><i class="bi"></i></button></pre></div>
 <div class="cell-output cell-output-stdout">
 <pre><code>[1] TRUE</code></pre>
 </div>
-<div class="sourceCode cell-code" id="cb63"><pre class="sourceCode numberSource r number-lines code-with-copy"><code class="sourceCode r"><span id="cb63-1"><a href="#cb63-1"></a>number.object <span class="sc">%in%</span> <span class="fu">c</span>(<span class="dv">6</span>,<span class="dv">7</span>,<span class="dv">2</span>)</span></code><button title="Copy to Clipboard" class="code-copy-button"><i class="bi"></i></button></pre></div>
+<div class="sourceCode cell-code" id="cb61"><pre class="sourceCode numberSource r number-lines code-with-copy"><code class="sourceCode r"><span id="cb61-1"><a href="#cb61-1"></a>number.object <span class="sc">%in%</span> <span class="fu">c</span>(<span class="dv">6</span>,<span class="dv">7</span>,<span class="dv">2</span>)</span></code><button title="Copy to Clipboard" class="code-copy-button"><i class="bi"></i></button></pre></div>
 <div class="cell-output cell-output-stdout">
 <pre><code>[1] FALSE</code></pre>
 </div>
@@ -2485,41 +1448,54 @@ <h2>Using indexing and logical operators to rename columns</h2>
 <li>We can assign the column names from data frame <code>df</code> to an object <code>cn</code>, modify <code>cn</code> directly using indexing and logical operators, and finally reassign the column names, <code>cn</code>, back to the data frame <code>df</code>:</li>
 </ol>
 <div class="cell">
-<div class="sourceCode cell-code" id="cb65"><pre class="sourceCode numberSource r number-lines code-with-copy"><code class="sourceCode r"><span id="cb65-1"><a href="#cb65-1"></a>cn <span class="ot">&lt;-</span> <span class="fu">colnames</span>(df)</span>
-<span id="cb65-2"><a href="#cb65-2"></a>cn</span></code><button title="Copy to Clipboard" class="code-copy-button"><i class="bi"></i></button></pre></div>
+<div class="sourceCode cell-code" id="cb63"><pre class="sourceCode numberSource r number-lines code-with-copy"><code class="sourceCode r"><span id="cb63-1"><a href="#cb63-1"></a>cn <span class="ot">&lt;-</span> <span class="fu">colnames</span>(df)</span>
+<span id="cb63-2"><a href="#cb63-2"></a>cn</span></code><button title="Copy to Clipboard" class="code-copy-button"><i class="bi"></i></button></pre></div>
 <div class="cell-output cell-output-stdout">
-<pre><code>[1] "IgG_concentration" "age"               "age"              
+<pre><code>[1] "observation_id"    "IgG_concentration" "age"              
 [4] "gender"            "slum"             </code></pre>
 </div>
+<div class="sourceCode cell-code" id="cb65"><pre class="sourceCode numberSource r number-lines code-with-copy"><code class="sourceCode r"><span id="cb65-1"><a href="#cb65-1"></a>cn<span class="sc">==</span><span class="st">"IgG_concentration"</span></span></code><button title="Copy to Clipboard" class="code-copy-button"><i class="bi"></i></button></pre></div>
+<div class="cell-output cell-output-stdout">
+<pre><code>[1] FALSE  TRUE FALSE FALSE FALSE</code></pre>
+</div>
 <div class="sourceCode cell-code" id="cb67"><pre class="sourceCode numberSource r number-lines code-with-copy"><code class="sourceCode r"><span id="cb67-1"><a href="#cb67-1"></a>cn[cn<span class="sc">==</span><span class="st">"IgG_concentration"</span>] <span class="ot">&lt;-</span><span class="st">"IgG_concentration_mIU"</span> <span class="co">#rename cn to "IgG_concentration_mIU" when cn is "IgG_concentration"</span></span>
-<span id="cb67-2"><a href="#cb67-2"></a><span class="fu">colnames</span>(df) <span class="ot">&lt;-</span> cn</span></code><button title="Copy to Clipboard" class="code-copy-button"><i class="bi"></i></button></pre></div>
+<span id="cb67-2"><a href="#cb67-2"></a><span class="fu">colnames</span>(df) <span class="ot">&lt;-</span> cn</span>
+<span id="cb67-3"><a href="#cb67-3"></a><span class="fu">colnames</span>(df)</span></code><button title="Copy to Clipboard" class="code-copy-button"><i class="bi"></i></button></pre></div>
+<div class="cell-output cell-output-stdout">
+<pre><code>[1] "observation_id"        "IgG_concentration_mIU" "age"                  
+[4] "gender"                "slum"                 </code></pre>
+</div>
 </div>
 <p>Note, I am resetting the column name back to the original name for the sake of the rest of the module.</p>
 <div class="cell">
-<div class="sourceCode cell-code" id="cb68"><pre class="sourceCode numberSource r number-lines code-with-copy"><code class="sourceCode r"><span id="cb68-1"><a href="#cb68-1"></a><span class="fu">colnames</span>(df)[<span class="fu">colnames</span>(df)<span class="sc">==</span><span class="st">"IgG_concentration_mIU"</span>] <span class="ot">&lt;-</span> <span class="st">"IgG_concentration"</span> <span class="co">#reset</span></span></code><button title="Copy to Clipboard" class="code-copy-button"><i class="bi"></i></button></pre></div>
+<div class="sourceCode cell-code" id="cb69"><pre class="sourceCode numberSource r number-lines code-with-copy"><code class="sourceCode r"><span id="cb69-1"><a href="#cb69-1"></a><span class="fu">colnames</span>(df)[<span class="fu">colnames</span>(df)<span class="sc">==</span><span class="st">"IgG_concentration_mIU"</span>] <span class="ot">&lt;-</span> <span class="st">"IgG_concentration"</span> <span class="co">#reset</span></span></code><button title="Copy to Clipboard" class="code-copy-button"><i class="bi"></i></button></pre></div>
 </div>
 </section>
 <section id="using-indexing-and-logical-operators-to-subset-data" class="slide level2">
 <h2>Using indexing and logical operators to subset data</h2>
 <p>In this example, we subset by rows, pull only observations with an age of less than or equal to 10, and save the subset data to <code>df_lte10</code>. Note that the logical expression <code>df$age&lt;=10</code> goes before the comma because I want to subset by rows (the first dimension).</p>
 <div class="cell">
-<div class="sourceCode cell-code" id="cb69"><pre class="sourceCode numberSource r number-lines code-with-copy"><code class="sourceCode r"><span id="cb69-1"><a href="#cb69-1"></a>df_lte10 <span class="ot">&lt;-</span> df[df<span class="sc">$</span>age<span class="sc">&lt;=</span><span class="dv">10</span>, ]</span></code><button title="Copy to Clipboard" class="code-copy-button"><i class="bi"></i></button></pre></div>
-</div>
-<p>In this example, we subset by rows and pull only observations with an age of less than or equal to 5 OR greater than 10.</p>
-<div class="cell">
-<div class="sourceCode cell-code" id="cb70"><pre class="sourceCode numberSource r number-lines code-with-copy"><code class="sourceCode r"><span id="cb70-1"><a href="#cb70-1"></a>df_lte5_gt10 <span class="ot">&lt;-</span> df[df<span class="sc">$</span>age<span class="sc">&lt;=</span><span class="dv">5</span> <span class="sc">|</span> df<span class="sc">$</span>age<span class="sc">&gt;</span><span class="dv">10</span>, ]</span></code><button title="Copy to Clipboard" class="code-copy-button"><i class="bi"></i></button></pre></div>
+<div class="sourceCode cell-code" id="cb70"><pre class="sourceCode numberSource r number-lines code-with-copy"><code class="sourceCode r"><span id="cb70-1"><a href="#cb70-1"></a>df_lte10 <span class="ot">&lt;-</span> df[df<span class="sc">$</span>age<span class="sc">&lt;=</span><span class="dv">10</span>, ]</span></code><button title="Copy to Clipboard" class="code-copy-button"><i class="bi"></i></button></pre></div>
 </div>
 <p>Let’s check that my subset worked using the <code>summary()</code> function.</p>
 <div class="cell">
 <div class="sourceCode cell-code" id="cb71"><pre class="sourceCode numberSource r number-lines code-with-copy"><code class="sourceCode r"><span id="cb71-1"><a href="#cb71-1"></a><span class="fu">summary</span>(df_lte10<span class="sc">$</span>age)</span></code><button title="Copy to Clipboard" class="code-copy-button"><i class="bi"></i></button></pre></div>
 <div class="cell-output cell-output-stdout">
-<pre><code>    Min.  1st Qu.   Median     Mean  3rd Qu.     Max.     NA's 
-0.005391 0.300000 0.300000 0.724742 0.640788 9.545455       10 </code></pre>
+<pre><code>   Min. 1st Qu.  Median    Mean 3rd Qu.    Max.    NA's 
+    1.0     3.0     4.0     4.8     7.0    10.0       9 </code></pre>
 </div>
-<div class="sourceCode cell-code" id="cb73"><pre class="sourceCode numberSource r number-lines code-with-copy"><code class="sourceCode r"><span id="cb73-1"><a href="#cb73-1"></a><span class="fu">summary</span>(df_lte5_gt10<span class="sc">$</span>age)</span></code><button title="Copy to Clipboard" class="code-copy-button"><i class="bi"></i></button></pre></div>
+</div>
+<p><br></p>
+<p>In the next example, we subset by rows and pull only observations with an age of less than or equal to 5 OR greater than 10.</p>
+<div class="cell">
+<div class="sourceCode cell-code" id="cb73"><pre class="sourceCode numberSource r number-lines code-with-copy"><code class="sourceCode r"><span id="cb73-1"><a href="#cb73-1"></a>df_lte5_gt10 <span class="ot">&lt;-</span> df[df<span class="sc">$</span>age<span class="sc">&lt;=</span><span class="dv">5</span> <span class="sc">|</span> df<span class="sc">$</span>age<span class="sc">&gt;</span><span class="dv">10</span>, ]</span></code><button title="Copy to Clipboard" class="code-copy-button"><i class="bi"></i></button></pre></div>
+</div>
+<p>Let’s check that my subset worked using the <code>summary()</code> function.</p>
+<div class="cell">
+<div class="sourceCode cell-code" id="cb74"><pre class="sourceCode numberSource r number-lines code-with-copy"><code class="sourceCode r"><span id="cb74-1"><a href="#cb74-1"></a><span class="fu">summary</span>(df_lte5_gt10<span class="sc">$</span>age)</span></code><button title="Copy to Clipboard" class="code-copy-button"><i class="bi"></i></button></pre></div>
 <div class="cell-output cell-output-stdout">
-<pre><code>    Min.  1st Qu.   Median     Mean  3rd Qu.     Max.     NA's 
-  0.0054   0.3000   1.6018  87.9886 142.8362 916.4179       10 </code></pre>
+<pre><code>   Min. 1st Qu.  Median    Mean 3rd Qu.    Max.    NA's 
+   1.00    2.50    4.00    6.08   11.00   15.00       9 </code></pre>
 </div>
 </div>
 </section>
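<p>The <code>NA's</code> count in the summaries above comes from how logical indexing treats missing values: a condition that evaluates to <code>NA</code> returns a row filled with <code>NA</code> rather than dropping it. The following is a minimal sketch on a made-up data frame (<code>toy</code> and its values are assumptions for illustration, not the course dataset) that compares plain logical indexing with wrapping the condition in <code>which()</code>.</p>

```r
# Hypothetical toy data; one age is missing
toy <- data.frame(
  id  = 1:6,
  age = c(3, 12, NA, 7, 5, 15)
)

# Plain logical indexing: the NA condition produces an all-NA row
by_logical <- toy[toy$age <= 5 | toy$age > 10, ]
nrow(by_logical)              # counts the matching rows plus the NA row
sum(is.na(by_logical$age))    # how many of those rows are NA

# which() keeps only the positions where the condition is TRUE,
# so rows with a missing age are dropped
by_which <- toy[which(toy$age <= 5 | toy$age > 10), ]
nrow(by_which)
```

<p>Whether to keep or drop the rows with a missing age depends on the analysis; the point is only that the two forms are not interchangeable when <code>NA</code>s are present.</p>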
@@ -2528,10 +1504,12 @@ <h2>Missing values</h2>
 <p>Missing data need to be carefully described and dealt with in data analysis. Understanding the different types of missing data and how you can identify them, is the first step to data cleaning.</p>
 <p>Types of “missing” values:</p>
 <ul>
-<li><code>NA</code> - general missing data</li>
+<li><code>NA</code> - stands for “<strong>N</strong>ot <strong>A</strong>vailable”, general missing data</li>
 <li><code>NaN</code> - stands for “<strong>N</strong>ot <strong>a</strong> <strong>N</strong>umber”, happens when you do 0/0.</li>
 <li><code>Inf</code> and <code>-Inf</code> - Infinity, happens when you divide a positive number (or negative number) by 0.</li>
 <li>blank space - sometimes when data is read in, there is a blank space left</li>
+<li>an empty string (e.g., <code>""</code>)</li>
+<li><code>NULL</code> - undefined value that represents something that does not exist</li>
 </ul>
 </section>
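<p>As a rough illustration of the list above, the sketch below builds small vectors containing each kind of “missing” value and checks which base R functions flag them. The object names (<code>x</code>, <code>s</code>, <code>nothing</code>) are made up for this example; recoding blanks to <code>NA</code> is one possible cleaning choice, not a required step.</p>

```r
# Numeric vector with NA, NaN (0/0), Inf (1/0) and -Inf (-1/0)
x <- c(1, NA, 0/0, 1/0, -1/0)
is.na(x)        # TRUE for both NA and NaN
is.nan(x)       # TRUE only for NaN
is.infinite(x)  # TRUE for Inf and -Inf

# Character vector with an empty string, a blank space, and a true NA
s <- c("a", "", " ", NA)
is.na(s)          # only the true NA is flagged
trimws(s) == ""   # flags the empty string and the blank space

# One option: recode empty/blank strings to NA so is.na() finds them
s[!is.na(s) & trimws(s) == ""] <- NA
sum(is.na(s))

# NULL has length zero and is tested with is.null(), not is.na()
nothing <- NULL
length(nothing)
is.null(nothing)
```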
 <section id="logical-operators-to-help-identify-and-missing-data" class="slide level2">
@@ -2540,45 +1518,50 @@ <h2>Logical operators to help identify and missing data</h2>
 <thead>
 <tr class="header">
 <th>operator</th>
-<th>operator option</th>
-<th style="text-align: right;">description</th>
+<th>description</th>
+<th style="text-align: right;"></th>
 </tr>
 </thead>
 <tbody>
 <tr class="odd">
 <td><code>is.na</code></td>
-<td></td>
-<td style="text-align: right;">is NAN or NA</td>
+<td>is NaN or NA</td>
+<td style="text-align: right;"></td>
 </tr>
 <tr class="even">
 <td><code>is.nan</code></td>
-<td></td>
-<td style="text-align: right;">is NAN</td>
+<td>is NaN</td>
+<td style="text-align: right;"></td>
 </tr>
 <tr class="odd">
 <td><code>!is.na</code></td>
-<td></td>
-<td style="text-align: right;">is not NAN or NA</td>
+<td>is not NaN or NA</td>
+<td style="text-align: right;"></td>
 </tr>
 <tr class="even">
 <td><code>!is.nan</code></td>
-<td></td>
-<td style="text-align: right;">is not NAN</td>
+<td>is not NaN</td>
+<td style="text-align: right;"></td>
 </tr>
 <tr class="odd">
 <td><code>is.infinite</code></td>
-<td></td>
-<td style="text-align: right;">is infinite</td>
+<td>is infinite</td>
+<td style="text-align: right;"></td>
 </tr>
 <tr class="even">
 <td><code>any</code></td>
-<td></td>
-<td style="text-align: right;">are any TRUE</td>
+<td>are any TRUE</td>
+<td style="text-align: right;"></td>
 </tr>
 <tr class="odd">
+<td><code>all</code></td>
+<td>all are TRUE</td>
+<td style="text-align: right;"></td>
+</tr>
+<tr class="even">
 <td><code>which</code></td>
-<td></td>
-<td style="text-align: right;">which are TRUE</td>
+<td>which are TRUE</td>
+<td style="text-align: right;"></td>
 </tr>
 </tbody>
 </table>
@@ -2586,20 +1569,20 @@ <h2>Logical operators to help identify and missing data</h2>
 <section id="more-logical-operators-examples" class="slide level2">
 <h2>More logical operators examples</h2>
 <div class="cell">
-<div class="sourceCode cell-code" id="cb75"><pre class="sourceCode numberSource r number-lines code-with-copy"><code class="sourceCode r"><span id="cb75-1"><a href="#cb75-1"></a>test <span class="ot">&lt;-</span> <span class="fu">c</span>(<span class="dv">0</span>,<span class="cn">NA</span>, <span class="sc">-</span><span class="dv">1</span>)<span class="sc">/</span><span class="dv">0</span></span>
-<span id="cb75-2"><a href="#cb75-2"></a>test</span></code><button title="Copy to Clipboard" class="code-copy-button"><i class="bi"></i></button></pre></div>
+<div class="sourceCode cell-code" id="cb76"><pre class="sourceCode numberSource r number-lines code-with-copy"><code class="sourceCode r"><span id="cb76-1"><a href="#cb76-1"></a>test <span class="ot">&lt;-</span> <span class="fu">c</span>(<span class="dv">0</span>,<span class="cn">NA</span>, <span class="sc">-</span><span class="dv">1</span>)<span class="sc">/</span><span class="dv">0</span></span>
+<span id="cb76-2"><a href="#cb76-2"></a>test</span></code><button title="Copy to Clipboard" class="code-copy-button"><i class="bi"></i></button></pre></div>
 <div class="cell-output cell-output-stdout">
 <pre><code>[1]  NaN   NA -Inf</code></pre>
 </div>
-<div class="sourceCode cell-code" id="cb77"><pre class="sourceCode numberSource r number-lines code-with-copy"><code class="sourceCode r"><span id="cb77-1"><a href="#cb77-1"></a><span class="fu">is.na</span>(test)</span></code><button title="Copy to Clipboard" class="code-copy-button"><i class="bi"></i></button></pre></div>
+<div class="sourceCode cell-code" id="cb78"><pre class="sourceCode numberSource r number-lines code-with-copy"><code class="sourceCode r"><span id="cb78-1"><a href="#cb78-1"></a><span class="fu">is.na</span>(test)</span></code><button title="Copy to Clipboard" class="code-copy-button"><i class="bi"></i></button></pre></div>
 <div class="cell-output cell-output-stdout">
 <pre><code>[1]  TRUE  TRUE FALSE</code></pre>
 </div>
-<div class="sourceCode cell-code" id="cb79"><pre class="sourceCode numberSource r number-lines code-with-copy"><code class="sourceCode r"><span id="cb79-1"><a href="#cb79-1"></a><span class="fu">is.nan</span>(test)</span></code><button title="Copy to Clipboard" class="code-copy-button"><i class="bi"></i></button></pre></div>
+<div class="sourceCode cell-code" id="cb80"><pre class="sourceCode numberSource r number-lines code-with-copy"><code class="sourceCode r"><span id="cb80-1"><a href="#cb80-1"></a><span class="fu">is.nan</span>(test)</span></code><button title="Copy to Clipboard" class="code-copy-button"><i class="bi"></i></button></pre></div>
 <div class="cell-output cell-output-stdout">
 <pre><code>[1]  TRUE FALSE FALSE</code></pre>
 </div>
-<div class="sourceCode cell-code" id="cb81"><pre class="sourceCode numberSource r number-lines code-with-copy"><code class="sourceCode r"><span id="cb81-1"><a href="#cb81-1"></a><span class="fu">is.infinite</span>(test)</span></code><button title="Copy to Clipboard" class="code-copy-button"><i class="bi"></i></button></pre></div>
+<div class="sourceCode cell-code" id="cb82"><pre class="sourceCode numberSource r number-lines code-with-copy"><code class="sourceCode r"><span id="cb82-1"><a href="#cb82-1"></a><span class="fu">is.infinite</span>(test)</span></code><button title="Copy to Clipboard" class="code-copy-button"><i class="bi"></i></button></pre></div>
 <div class="cell-output cell-output-stdout">
 <pre><code>[1] FALSE FALSE  TRUE</code></pre>
 </div>
@@ -2609,22 +1592,22 @@ <h2>More logical operators examples</h2>
 <h2>More logical operators examples</h2>
 <p><code>any(is.na(x))</code> means do we have any <code>NA</code>’s in the object <code>x</code>?</p>
 <div class="cell">
-<div class="sourceCode cell-code" id="cb83"><pre class="sourceCode numberSource r number-lines code-with-copy"><code class="sourceCode r"><span id="cb83-1"><a href="#cb83-1"></a><span class="fu">any</span>(<span class="fu">is.na</span>(df<span class="sc">$</span>IgG_concentration)) <span class="co"># are there any NAs - YES/TRUE</span></span></code><button title="Copy to Clipboard" class="code-copy-button"><i class="bi"></i></button></pre></div>
+<div class="sourceCode cell-code" id="cb84"><pre class="sourceCode numberSource r number-lines code-with-copy"><code class="sourceCode r"><span id="cb84-1"><a href="#cb84-1"></a><span class="fu">any</span>(<span class="fu">is.na</span>(df<span class="sc">$</span>IgG_concentration)) <span class="co"># are there any NAs - YES/TRUE</span></span></code><button title="Copy to Clipboard" class="code-copy-button"><i class="bi"></i></button></pre></div>
 <div class="cell-output cell-output-stdout">
-<pre><code>[1] FALSE</code></pre>
+<pre><code>[1] TRUE</code></pre>
 </div>
-<div class="sourceCode cell-code" id="cb85"><pre class="sourceCode numberSource r number-lines code-with-copy"><code class="sourceCode r"><span id="cb85-1"><a href="#cb85-1"></a><span class="fu">any</span>(<span class="fu">is.na</span>(df<span class="sc">$</span>slum)) <span class="co"># are there any NAs- NO/FALSE</span></span></code><button title="Copy to Clipboard" class="code-copy-button"><i class="bi"></i></button></pre></div>
+<div class="sourceCode cell-code" id="cb86"><pre class="sourceCode numberSource r number-lines code-with-copy"><code class="sourceCode r"><span id="cb86-1"><a href="#cb86-1"></a><span class="fu">any</span>(<span class="fu">is.na</span>(df<span class="sc">$</span>slum)) <span class="co"># are there any NAs- NO/FALSE</span></span></code><button title="Copy to Clipboard" class="code-copy-button"><i class="bi"></i></button></pre></div>
 <div class="cell-output cell-output-stdout">
 <pre><code>[1] FALSE</code></pre>
 </div>
 </div>
 <p><code>which(is.na(x))</code> means which of the elements in object <code>x</code> are <code>NA</code>’s?</p>
 <div class="cell">
-<div class="sourceCode cell-code" id="cb87"><pre class="sourceCode numberSource r number-lines code-with-copy"><code class="sourceCode r"><span id="cb87-1"><a href="#cb87-1"></a><span class="fu">which</span>(<span class="fu">is.na</span>(df<span class="sc">$</span>IgG_concentration)) </span></code><button title="Copy to Clipboard" class="code-copy-button"><i class="bi"></i></button></pre></div>
+<div class="sourceCode cell-code" id="cb88"><pre class="sourceCode numberSource r number-lines code-with-copy"><code class="sourceCode r"><span id="cb88-1"><a href="#cb88-1"></a><span class="fu">which</span>(<span class="fu">is.na</span>(df<span class="sc">$</span>IgG_concentration)) </span></code><button title="Copy to Clipboard" class="code-copy-button"><i class="bi"></i></button></pre></div>
 <div class="cell-output cell-output-stdout">
-<pre><code>integer(0)</code></pre>
+<pre><code> [1]  13  55  57  72 182 406 414 478 488 595</code></pre>
 </div>
-<div class="sourceCode cell-code" id="cb89"><pre class="sourceCode numberSource r number-lines code-with-copy"><code class="sourceCode r"><span id="cb89-1"><a href="#cb89-1"></a><span class="fu">which</span>(<span class="fu">is.na</span>(df<span class="sc">$</span>slum)) </span></code><button title="Copy to Clipboard" class="code-copy-button"><i class="bi"></i></button></pre></div>
+<div class="sourceCode cell-code" id="cb90"><pre class="sourceCode numberSource r number-lines code-with-copy"><code class="sourceCode r"><span id="cb90-1"><a href="#cb90-1"></a><span class="fu">which</span>(<span class="fu">is.na</span>(df<span class="sc">$</span>slum)) </span></code><button title="Copy to Clipboard" class="code-copy-button"><i class="bi"></i></button></pre></div>
 <div class="cell-output cell-output-stdout">
 <pre><code>integer(0)</code></pre>
 </div>
@@ -2634,7 +1617,7 @@ <h2>More logical operators examples</h2>
 <h2><code>subset()</code> function</h2>
 <p>The Base R <code>subset()</code> function is a slightly easier way to select variables and observations.</p>
 <div class="cell">
-<div class="sourceCode cell-code" id="cb91"><pre class="sourceCode numberSource r number-lines code-with-copy"><code class="sourceCode r"><span id="cb91-1"><a href="#cb91-1"></a>?subset</span></code><button title="Copy to Clipboard" class="code-copy-button"><i class="bi"></i></button></pre></div>
+<div class="sourceCode cell-code" id="cb92"><pre class="sourceCode numberSource r number-lines code-with-copy"><code class="sourceCode r"><span id="cb92-1"><a href="#cb92-1"></a>?subset</span></code><button title="Copy to Clipboard" class="code-copy-button"><i class="bi"></i></button></pre></div>
 </div>
 <pre><code>Registered S3 method overwritten by 'printr':
   method                from     
@@ -2720,15 +1703,15 @@ <h2><code>subset()</code> function</h2>
 <h2>Subsetting using the <code>subset()</code> function</h2>
 <p>Here are a few examples using the <code>subset()</code> function</p>
 <div class="cell">
-<div class="sourceCode cell-code" id="cb103"><pre class="sourceCode numberSource r number-lines code-with-copy"><code class="sourceCode r"><span id="cb103-1"><a href="#cb103-1"></a>df_lte10_v2 <span class="ot">&lt;-</span> <span class="fu">subset</span>(df, df<span class="sc">$</span>age<span class="sc">&lt;=</span><span class="dv">10</span>, <span class="at">select=</span><span class="fu">c</span>(IgG_concentration, age))</span>
-<span id="cb103-2"><a href="#cb103-2"></a>df_lt5_f <span class="ot">&lt;-</span> <span class="fu">subset</span>(df, df<span class="sc">$</span>age<span class="sc">&lt;=</span><span class="dv">5</span> <span class="sc">&amp;</span> gender<span class="sc">==</span><span class="st">"Female"</span>, <span class="at">select=</span><span class="fu">c</span>(IgG_concentration, slum))</span></code><button title="Copy to Clipboard" class="code-copy-button"><i class="bi"></i></button></pre></div>
+<div class="sourceCode cell-code" id="cb104"><pre class="sourceCode numberSource r number-lines code-with-copy"><code class="sourceCode r"><span id="cb104-1"><a href="#cb104-1"></a>df_lte10_v2 <span class="ot">&lt;-</span> <span class="fu">subset</span>(df, df<span class="sc">$</span>age<span class="sc">&lt;=</span><span class="dv">10</span>, <span class="at">select=</span><span class="fu">c</span>(IgG_concentration, age))</span>
+<span id="cb104-2"><a href="#cb104-2"></a>df_lt5_f <span class="ot">&lt;-</span> <span class="fu">subset</span>(df, df<span class="sc">$</span>age<span class="sc">&lt;=</span><span class="dv">5</span> <span class="sc">&amp;</span> gender<span class="sc">==</span><span class="st">"Female"</span>, <span class="at">select=</span><span class="fu">c</span>(IgG_concentration, slum))</span></code><button title="Copy to Clipboard" class="code-copy-button"><i class="bi"></i></button></pre></div>
 </div>
 </section>
 <section id="subset-function-vs-logical-operators" class="slide level2">
 <h2><code>subset()</code> function vs logical operators</h2>
 <p><code>subset()</code> automatically removes NAs, which is a different behavior from doing logical operations on NAs.</p>
 <div class="cell">
-<div class="sourceCode cell-code" id="cb104"><pre class="sourceCode numberSource r number-lines code-with-copy"><code class="sourceCode r"><span id="cb104-1"><a href="#cb104-1"></a><span class="fu">summary</span>(df_lte10<span class="sc">$</span>age)</span></code><button title="Copy to Clipboard" class="code-copy-button"><i class="bi"></i></button></pre></div>
+<div class="sourceCode cell-code" id="cb105"><pre class="sourceCode numberSource r number-lines code-with-copy"><code class="sourceCode r"><span id="cb105-1"><a href="#cb105-1"></a><span class="fu">summary</span>(df_lte10<span class="sc">$</span>age) <span class="co">#created with indexing</span></span></code><button title="Copy to Clipboard" class="code-copy-button"><i class="bi"></i></button></pre></div>
 <div class="cell-output-display">
 <table>
 <thead>
@@ -2744,18 +1727,18 @@ <h2><code>subset()</code> function vs logical operators</h2>
 </thead>
 <tbody>
 <tr class="odd">
-<td style="text-align: right;">0.0053908</td>
-<td style="text-align: right;">0.3</td>
-<td style="text-align: right;">0.3</td>
-<td style="text-align: right;">0.7247421</td>
-<td style="text-align: right;">0.6407876</td>
-<td style="text-align: right;">9.545454</td>
+<td style="text-align: right;">1</td>
+<td style="text-align: right;">3</td>
+<td style="text-align: right;">4</td>
+<td style="text-align: right;">4.8</td>
+<td style="text-align: right;">7</td>
 <td style="text-align: right;">10</td>
+<td style="text-align: right;">9</td>
 </tr>
 </tbody>
 </table>
 </div>
-<div class="sourceCode cell-code" id="cb105"><pre class="sourceCode numberSource r number-lines code-with-copy"><code class="sourceCode r"><span id="cb105-1"><a href="#cb105-1"></a><span class="fu">summary</span>(df_lte10_v2<span class="sc">$</span>age)</span></code><button title="Copy to Clipboard" class="code-copy-button"><i class="bi"></i></button></pre></div>
+<div class="sourceCode cell-code" id="cb106"><pre class="sourceCode numberSource r number-lines code-with-copy"><code class="sourceCode r"><span id="cb106-1"><a href="#cb106-1"></a><span class="fu">summary</span>(df_lte10_v2<span class="sc">$</span>age) <span class="co">#created with the subset function</span></span></code><button title="Copy to Clipboard" class="code-copy-button"><i class="bi"></i></button></pre></div>
 <div class="cell-output-display">
 <table>
 <thead>
@@ -2770,12 +1753,12 @@ <h2><code>subset()</code> function vs logical operators</h2>
 </thead>
 <tbody>
 <tr class="odd">
-<td style="text-align: right;">0.0053908</td>
-<td style="text-align: right;">0.3</td>
-<td style="text-align: right;">0.3</td>
-<td style="text-align: right;">0.7247421</td>
-<td style="text-align: right;">0.6407876</td>
-<td style="text-align: right;">9.545454</td>
+<td style="text-align: right;">1</td>
+<td style="text-align: right;">3</td>
+<td style="text-align: right;">4</td>
+<td style="text-align: right;">4.8</td>
+<td style="text-align: right;">7</td>
+<td style="text-align: right;">10</td>
 </tr>
 </tbody>
 </table>
@@ -2783,24 +1766,24 @@ <h2><code>subset()</code> function vs logical operators</h2>
 </div>
 <p>We can also see this by looking at the number of rows in each dataset.</p>
 <div class="cell">
-<div class="sourceCode cell-code" id="cb106"><pre class="sourceCode numberSource r number-lines code-with-copy"><code class="sourceCode r"><span id="cb106-1"><a href="#cb106-1"></a><span class="fu">nrow</span>(df_lte10)</span></code><button title="Copy to Clipboard" class="code-copy-button"><i class="bi"></i></button></pre></div>
+<div class="sourceCode cell-code" id="cb107"><pre class="sourceCode numberSource r number-lines code-with-copy"><code class="sourceCode r"><span id="cb107-1"><a href="#cb107-1"></a><span class="fu">nrow</span>(df_lte10)</span></code><button title="Copy to Clipboard" class="code-copy-button"><i class="bi"></i></button></pre></div>
 <div class="cell-output cell-output-stdout">
-<pre><code>[1] 370</code></pre>
+<pre><code>[1] 504</code></pre>
 </div>
-<div class="sourceCode cell-code" id="cb108"><pre class="sourceCode numberSource r number-lines code-with-copy"><code class="sourceCode r"><span id="cb108-1"><a href="#cb108-1"></a><span class="fu">nrow</span>(df_lte10_v2)</span></code><button title="Copy to Clipboard" class="code-copy-button"><i class="bi"></i></button></pre></div>
+<div class="sourceCode cell-code" id="cb109"><pre class="sourceCode numberSource r number-lines code-with-copy"><code class="sourceCode r"><span id="cb109-1"><a href="#cb109-1"></a><span class="fu">nrow</span>(df_lte10_v2)</span></code><button title="Copy to Clipboard" class="code-copy-button"><i class="bi"></i></button></pre></div>
 <div class="cell-output cell-output-stdout">
-<pre><code>[1] 360</code></pre>
+<pre><code>[1] 495</code></pre>
 </div>
 </div>
 </section>
 <section id="summary" class="slide level2">
 <h2>Summary</h2>
 <ul>
-<li><code>colnames()</code>, <code>str()</code> and <code>summary()</code>functions from Base R are great functions to assess the data type and some summary statistics</li>
-<li>There are three basic indexing syntax: <code>[ ]</code>, <code>[[ ]]</code> and <code>$</code></li>
+<li>The <code>colnames()</code>, <code>str()</code> and <code>summary()</code> functions from Base R let you assess the data type and some summary statistics</li>
+<li>There are three basic indexing operators: <code>[</code>, <code>[[</code> and <code>$</code></li>
 <li>Indexing can be used to extract part of an object (e.g., subset data) and to replace parts of an object (e.g., rename variables / columns)</li>
 <li>Logical operators can be evaluated on object(s) in order to return a binary response of TRUE/FALSE, and are useful for decision rules for indexing</li>
-<li>There are 5 “types” of missing values, the most common being “NA”</li>
+<li>There are 7 “types” of missing values, the most common being “NA”</li>
 <li>Logical operators meant to determine missing values are very helpful for data cleaning</li>
 <li>The Base R <code>subset()</code> function is a slightly easier way to select variables and observations.</li>
 </ul>
diff --git a/docs/modules/Module07-VarCreationClassesSummaries.html b/docs/modules/Module07-VarCreationClassesSummaries.html
index 5d1113b..76173e9 100644
--- a/docs/modules/Module07-VarCreationClassesSummaries.html
+++ b/docs/modules/Module07-VarCreationClassesSummaries.html
@@ -8,7 +8,7 @@
 <link href="../site_libs/quarto-html/light-border.css" rel="stylesheet">
 <link href="../site_libs/quarto-html/quarto-html.min.css" rel="stylesheet" data-mode="light">
 <link href="../site_libs/quarto-html/quarto-syntax-highlighting.css" rel="stylesheet" id="quarto-text-highlighting-styles"><meta charset="utf-8">
-  <meta name="generator" content="quarto-1.3.433">
+  <meta name="generator" content="quarto-1.3.353">
 
   <meta name="author" content="Amy Winter">
   <meta name="author" content="Zane Billings">
@@ -407,43 +407,6 @@ <h1 class="title">Module 7: Variable Creation, Classes, and Summaries</h1>
 </div>
 </div>
 
-</section><section id="TOC">
-<nav role="doc-toc"> 
-<h2 id="toc-title">Page Items</h2>
-<ul>
-<li><a href="#/learning-objectives" id="/toc-learning-objectives">Learning Objectives</a></li>
-<li><a href="#/import-data-for-this-module" id="/toc-import-data-for-this-module">Import data for this module</a></li>
-<li><a href="#/adding-new-columns" id="/toc-adding-new-columns">Adding new columns</a></li>
-<li><a href="#/creating-conditional-variables" id="/toc-creating-conditional-variables">Creating conditional variables</a></li>
-<li><a href="#/ifelse-example" id="/toc-ifelse-example"><code>ifelse</code> example</a></li>
-<li><a href="#/nesting-ifelse-statements-example" id="/toc-nesting-ifelse-statements-example">Nesting <code>ifelse</code> statements example</a></li>
-<li><a href="#/data-classes" id="/toc-data-classes">Data Classes</a>
-<ul>
-<li><a href="#/overview---data-classes" id="/toc-overview---data-classes">Overview - Data Classes</a></li>
-<li><a href="#/class-function" id="/toc-class-function"><code>class()</code> function</a></li>
-<li><a href="#/one-dimensional-data-types" id="/toc-one-dimensional-data-types">One dimensional data types</a></li>
-<li><a href="#/character-and-numeric" id="/toc-character-and-numeric">Character and numeric</a></li>
-<li><a href="#/numeric-subclasses" id="/toc-numeric-subclasses">Numeric Subclasses</a></li>
-<li><a href="#/logical" id="/toc-logical">Logical</a></li>
-<li><a href="#/other-useful-functions-for-evaluatingsetting-classes" id="/toc-other-useful-functions-for-evaluatingsetting-classes">Other useful functions for evaluating/setting classes</a></li>
-<li><a href="#/examples-is.class_namex" id="/toc-examples-is.class_namex">Examples <code>is.CLASS_NAME(x)</code></a></li>
-<li><a href="#/examples-as.class_namex" id="/toc-examples-as.class_namex">Examples <code>as.CLASS_NAME(x)</code></a></li>
-<li><a href="#/factors" id="/toc-factors">Factors</a></li>
-<li><a href="#/reference-groups" id="/toc-reference-groups">Reference Groups</a></li>
-<li><a href="#/changing-factor-reference" id="/toc-changing-factor-reference">Changing factor reference</a></li>
-<li><a href="#/changing-factor-reference-examples" id="/toc-changing-factor-reference-examples">Changing factor reference examples</a></li>
-<li><a href="#/two-dimensional-data-classes" id="/toc-two-dimensional-data-classes">Two-dimensional data classes</a></li>
-<li><a href="#/matrices" id="/toc-matrices">Matrices</a></li>
-<li><a href="#/data-frame" id="/toc-data-frame">Data Frame</a></li>
-<li><a href="#/numeric-variable-data-summary" id="/toc-numeric-variable-data-summary">Numeric variable data summary</a></li>
-<li><a href="#/numeric-variable-data-summary-examples" id="/toc-numeric-variable-data-summary-examples">Numeric variable data summary examples</a></li>
-<li><a href="#/character-variable-data-summaries" id="/toc-character-variable-data-summaries">Character Variable Data Summaries</a></li>
-<li><a href="#/character-variable-data-summary-examples" id="/toc-character-variable-data-summary-examples">Character variable data summary examples</a></li>
-<li><a href="#/summary" id="/toc-summary">Summary</a></li>
-<li><a href="#/acknowledgements" id="/toc-acknowledgements">Acknowledgements</a></li>
-</ul></li>
-</ul>
-</nav>
 </section>
 <section id="learning-objectives" class="slide level2">
 <h2>Learning Objectives</h2>
@@ -628,7 +591,7 @@ <h2>Import data for this module</h2>
 </section>
 <section id="adding-new-columns" class="slide level2">
 <h2>Adding new columns</h2>
-<p>You can add a new column, called <code>newcol</code> to <code>df</code>, using the <code>$</code> operator:</p>
+<p>You can add a new column, called <code>log_IgG</code> to <code>df</code>, using the <code>$</code> operator:</p>
 <div class="cell">
 <div class="sourceCode cell-code" id="cb13"><pre class="sourceCode numberSource r number-lines code-with-copy"><code class="sourceCode r"><span id="cb13-1"><a href="#cb13-1"></a>df<span class="sc">$</span>log_IgG <span class="ot">&lt;-</span> <span class="fu">log</span>(df<span class="sc">$</span>IgG_concentration)</span>
 <span id="cb13-2"><a href="#cb13-2"></a><span class="fu">head</span>(df,<span class="dv">3</span>)</span></code><button title="Copy to Clipboard" class="code-copy-button"><i class="bi"></i></button></pre></div>
@@ -673,6 +636,161 @@ <h2>Adding new columns</h2>
 </table>
 </div>
 </div>
+<p>Note my use of an underscore in the variable name rather than a space. This is good coding practice and makes calling variables much less prone to error.</p>
+</section>
+<section id="adding-new-columns-1" class="slide level2">
+<h2>Adding new columns</h2>
+<p>We can also add a new column using the <code>transform()</code> function:</p>
+<div class="cell">
+<div class="cell-output cell-output-stdout">
+<pre><code>Transform an Object, for Example a Data Frame
+
+Description:
+
+     'transform' is a generic function, which-at least currently-only
+     does anything useful with data frames.  'transform.default'
+     converts its first argument to a data frame if possible and calls
+     'transform.data.frame'.
+
+Usage:
+
+     transform(`_data`, ...)
+     
+Arguments:
+
+   _data: The object to be transformed
+
+     ...: Further arguments of the form 'tag=value'
+
+Details:
+
+     The '...' arguments to 'transform.data.frame' are tagged vector
+     expressions, which are evaluated in the data frame '_data'.  The
+     tags are matched against 'names(_data)', and for those that match,
+     the value replace the corresponding variable in '_data', and the
+     others are appended to '_data'.
+
+Value:
+
+     The modified value of '_data'.
+
+Warning:
+
+     This is a convenience function intended for use interactively.
+     For programming it is better to use the standard subsetting
+     arithmetic functions, and in particular the non-standard
+     evaluation of argument 'transform' can have unanticipated
+     consequences.
+
+Note:
+
+     If some of the values are not vectors of the appropriate length,
+     you deserve whatever you get!
+
+Author(s):
+
+     Peter Dalgaard
+
+See Also:
+
+     'within' for a more flexible approach, 'subset', 'list',
+     'data.frame'
+
+Examples:
+
+     transform(airquality, Ozone = -Ozone)
+     transform(airquality, new = -Ozone, Temp = (Temp-32)/1.8)
+     
+     attach(airquality)
+     transform(Ozone, logOzone = log(Ozone)) # marginally interesting ...
+     detach(airquality)</code></pre>
+</div>
+</div>
+<p>For example, adding a binary column for seropositivity called <code>seropos</code>:</p>
+<div class="cell">
+<div class="sourceCode cell-code" id="cb15"><pre class="sourceCode numberSource r number-lines code-with-copy"><code class="sourceCode r"><span id="cb15-1"><a href="#cb15-1"></a>df <span class="ot">&lt;-</span> <span class="fu">transform</span>(df, <span class="at">seropos =</span> IgG_concentration <span class="sc">&gt;=</span> <span class="dv">10</span>)</span>
+<span id="cb15-2"><a href="#cb15-2"></a><span class="fu">head</span>(df)</span></code><button title="Copy to Clipboard" class="code-copy-button"><i class="bi"></i></button></pre></div>
+<div class="cell-output-display">
+<table>
+<colgroup>
+<col style="width: 20%">
+<col style="width: 25%">
+<col style="width: 5%">
+<col style="width: 9%">
+<col style="width: 12%">
+<col style="width: 15%">
+<col style="width: 11%">
+</colgroup>
+<thead>
+<tr class="header">
+<th style="text-align: right;">observation_id</th>
+<th style="text-align: right;">IgG_concentration</th>
+<th style="text-align: right;">age</th>
+<th style="text-align: left;">gender</th>
+<th style="text-align: left;">slum</th>
+<th style="text-align: right;">log_IgG</th>
+<th style="text-align: left;">seropos</th>
+</tr>
+</thead>
+<tbody>
+<tr class="odd">
+<td style="text-align: right;">5772</td>
+<td style="text-align: right;">0.3176895</td>
+<td style="text-align: right;">2</td>
+<td style="text-align: left;">Female</td>
+<td style="text-align: left;">Non slum</td>
+<td style="text-align: right;">-1.1466807</td>
+<td style="text-align: left;">FALSE</td>
+</tr>
+<tr class="even">
+<td style="text-align: right;">8095</td>
+<td style="text-align: right;">3.4368231</td>
+<td style="text-align: right;">4</td>
+<td style="text-align: left;">Female</td>
+<td style="text-align: left;">Non slum</td>
+<td style="text-align: right;">1.2345475</td>
+<td style="text-align: left;">FALSE</td>
+</tr>
+<tr class="odd">
+<td style="text-align: right;">9784</td>
+<td style="text-align: right;">0.3000000</td>
+<td style="text-align: right;">4</td>
+<td style="text-align: left;">Male</td>
+<td style="text-align: left;">Non slum</td>
+<td style="text-align: right;">-1.2039728</td>
+<td style="text-align: left;">FALSE</td>
+</tr>
+<tr class="even">
+<td style="text-align: right;">9338</td>
+<td style="text-align: right;">143.2363014</td>
+<td style="text-align: right;">4</td>
+<td style="text-align: left;">Male</td>
+<td style="text-align: left;">Non slum</td>
+<td style="text-align: right;">4.9644957</td>
+<td style="text-align: left;">TRUE</td>
+</tr>
+<tr class="odd">
+<td style="text-align: right;">6369</td>
+<td style="text-align: right;">0.4476534</td>
+<td style="text-align: right;">1</td>
+<td style="text-align: left;">Male</td>
+<td style="text-align: left;">Non slum</td>
+<td style="text-align: right;">-0.8037359</td>
+<td style="text-align: left;">FALSE</td>
+</tr>
+<tr class="even">
+<td style="text-align: right;">6885</td>
+<td style="text-align: right;">0.0252708</td>
+<td style="text-align: right;">4</td>
+<td style="text-align: left;">Male</td>
+<td style="text-align: left;">Non slum</td>
+<td style="text-align: right;">-3.6781074</td>
+<td style="text-align: left;">FALSE</td>
+</tr>
+</tbody>
+</table>
+</div>
+</div>
 </section>
 <section id="creating-conditional-variables" class="slide level2">
 <h2>Creating conditional variables</h2>
@@ -768,18 +886,19 @@ <h2>Creating conditional variables</h2>
 <h2><code>ifelse</code> example</h2>
 <p>As a reminder, the first three arguments of the <code>ifelse()</code> function are <code>ifelse(test, yes, no)</code>.</p>
 <div class="cell">
-<div class="sourceCode cell-code" id="cb23"><pre class="sourceCode numberSource r number-lines code-with-copy"><code class="sourceCode r"><span id="cb23-1"><a href="#cb23-1"></a>df<span class="sc">$</span>age_group <span class="ot">&lt;-</span> <span class="fu">ifelse</span>(df<span class="sc">$</span>age <span class="sc">&lt;=</span> <span class="dv">5</span>, <span class="st">"young"</span>, <span class="st">"old"</span>)</span>
-<span id="cb23-2"><a href="#cb23-2"></a><span class="fu">head</span>(df)</span></code><button title="Copy to Clipboard" class="code-copy-button"><i class="bi"></i></button></pre></div>
+<div class="sourceCode cell-code" id="cb25"><pre class="sourceCode numberSource r number-lines code-with-copy"><code class="sourceCode r"><span id="cb25-1"><a href="#cb25-1"></a>df<span class="sc">$</span>age_group <span class="ot">&lt;-</span> <span class="fu">ifelse</span>(df<span class="sc">$</span>age <span class="sc">&lt;=</span> <span class="dv">5</span>, <span class="st">"young"</span>, <span class="st">"old"</span>)</span>
+<span id="cb25-2"><a href="#cb25-2"></a><span class="fu">head</span>(df)</span></code><button title="Copy to Clipboard" class="code-copy-button"><i class="bi"></i></button></pre></div>
 <div class="cell-output-display">
 <table>
 <colgroup>
-<col style="width: 20%">
-<col style="width: 24%">
-<col style="width: 5%">
+<col style="width: 18%">
+<col style="width: 21%">
+<col style="width: 4%">
+<col style="width: 8%">
+<col style="width: 10%">
+<col style="width: 13%">
 <col style="width: 9%">
 <col style="width: 12%">
-<col style="width: 14%">
-<col style="width: 13%">
 </colgroup>
 <thead>
 <tr class="header">
@@ -789,6 +908,7 @@ <h2><code>ifelse</code> example</h2>
 <th style="text-align: left;">gender</th>
 <th style="text-align: left;">slum</th>
 <th style="text-align: right;">log_IgG</th>
+<th style="text-align: left;">seropos</th>
 <th style="text-align: left;">age_group</th>
 </tr>
 </thead>
@@ -800,6 +920,7 @@ <h2><code>ifelse</code> example</h2>
 <td style="text-align: left;">Female</td>
 <td style="text-align: left;">Non slum</td>
 <td style="text-align: right;">-1.1466807</td>
+<td style="text-align: left;">FALSE</td>
 <td style="text-align: left;">young</td>
 </tr>
 <tr class="even">
@@ -809,6 +930,7 @@ <h2><code>ifelse</code> example</h2>
 <td style="text-align: left;">Female</td>
 <td style="text-align: left;">Non slum</td>
 <td style="text-align: right;">1.2345475</td>
+<td style="text-align: left;">FALSE</td>
 <td style="text-align: left;">young</td>
 </tr>
 <tr class="odd">
@@ -818,6 +940,7 @@ <h2><code>ifelse</code> example</h2>
 <td style="text-align: left;">Male</td>
 <td style="text-align: left;">Non slum</td>
 <td style="text-align: right;">-1.2039728</td>
+<td style="text-align: left;">FALSE</td>
 <td style="text-align: left;">young</td>
 </tr>
 <tr class="even">
@@ -827,6 +950,7 @@ <h2><code>ifelse</code> example</h2>
 <td style="text-align: left;">Male</td>
 <td style="text-align: left;">Non slum</td>
 <td style="text-align: right;">4.9644957</td>
+<td style="text-align: left;">TRUE</td>
 <td style="text-align: left;">young</td>
 </tr>
 <tr class="odd">
@@ -836,6 +960,7 @@ <h2><code>ifelse</code> example</h2>
 <td style="text-align: left;">Male</td>
 <td style="text-align: left;">Non slum</td>
 <td style="text-align: right;">-0.8037359</td>
+<td style="text-align: left;">FALSE</td>
 <td style="text-align: left;">young</td>
 </tr>
 <tr class="even">
@@ -845,101 +970,321 @@ <h2><code>ifelse</code> example</h2>
 <td style="text-align: left;">Male</td>
 <td style="text-align: left;">Non slum</td>
 <td style="text-align: right;">-3.6781074</td>
+<td style="text-align: left;">FALSE</td>
 <td style="text-align: left;">young</td>
 </tr>
 </tbody>
 </table>
 </div>
 </div>
+<p>Let’s delve into what is actually happening, with a focus on the NA values in the <code>age</code> variable.</p>
+<div class="cell">
+<div class="sourceCode cell-code" id="cb26"><pre class="sourceCode numberSource r number-lines code-with-copy"><code class="sourceCode r"><span id="cb26-1"><a href="#cb26-1"></a>df<span class="sc">$</span>age <span class="sc">&lt;=</span> <span class="dv">5</span></span></code><button title="Copy to Clipboard" class="code-copy-button"><i class="bi"></i></button></pre></div>
+<div class="cell-output cell-output-stdout">
+<pre><code>  [1]  TRUE  TRUE  TRUE  TRUE  TRUE  TRUE  TRUE    NA  TRUE  TRUE  TRUE FALSE
+ [13] FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE  TRUE  TRUE
+ [25]  TRUE  TRUE  TRUE  TRUE  TRUE  TRUE  TRUE  TRUE  TRUE  TRUE  TRUE FALSE
+ [37] FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE
+ [49]  TRUE  TRUE  TRUE  TRUE  TRUE  TRUE  TRUE  TRUE  TRUE  TRUE  TRUE  TRUE
+ [61]  TRUE FALSE FALSE FALSE FALSE  TRUE FALSE FALSE FALSE FALSE FALSE FALSE
+ [73] FALSE  TRUE  TRUE  TRUE    NA  TRUE  TRUE  TRUE FALSE FALSE FALSE FALSE
+ [85] FALSE FALSE FALSE FALSE  TRUE  TRUE  TRUE  TRUE  TRUE  TRUE  TRUE  TRUE
+ [97]  TRUE  TRUE  TRUE  TRUE  TRUE FALSE FALSE FALSE FALSE FALSE FALSE FALSE
+[109] FALSE FALSE FALSE FALSE FALSE FALSE  TRUE  TRUE  TRUE    NA  TRUE  TRUE
+[121]    NA  TRUE  TRUE  TRUE  TRUE FALSE FALSE FALSE FALSE FALSE FALSE FALSE
+[133] FALSE FALSE FALSE FALSE FALSE FALSE  TRUE  TRUE  TRUE  TRUE  TRUE  TRUE
+[145]  TRUE  TRUE  TRUE  TRUE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE
+[157] FALSE FALSE  TRUE  TRUE  TRUE  TRUE  TRUE  TRUE  TRUE  TRUE  TRUE FALSE
+[169] FALSE FALSE FALSE FALSE FALSE FALSE  TRUE FALSE FALSE FALSE FALSE  TRUE
+[181]  TRUE  TRUE  TRUE  TRUE  TRUE  TRUE  TRUE  TRUE  TRUE  TRUE FALSE FALSE
+[193] FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE  TRUE  TRUE
+[205]  TRUE  TRUE  TRUE  TRUE  TRUE FALSE FALSE FALSE FALSE FALSE FALSE FALSE
+[217] FALSE FALSE FALSE FALSE  TRUE  TRUE  TRUE  TRUE  TRUE  TRUE  TRUE  TRUE
+[229]  TRUE  TRUE  TRUE  TRUE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE
+[241] FALSE FALSE FALSE  TRUE  TRUE  TRUE  TRUE  TRUE  TRUE  TRUE  TRUE FALSE
+[253] FALSE FALSE FALSE FALSE FALSE  TRUE  TRUE  TRUE  TRUE  TRUE  TRUE  TRUE
+[265]  TRUE  TRUE  TRUE FALSE FALSE FALSE FALSE FALSE  TRUE  TRUE FALSE FALSE
+[277] FALSE  TRUE FALSE  TRUE  TRUE  TRUE  TRUE  TRUE  TRUE  TRUE  TRUE  TRUE
+[289]  TRUE    NA FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE
+[301]  TRUE  TRUE  TRUE  TRUE  TRUE  TRUE  TRUE  TRUE  TRUE FALSE FALSE FALSE
+[313]  TRUE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE  TRUE  TRUE  TRUE
+[325]  TRUE  TRUE  TRUE  TRUE  TRUE  TRUE FALSE FALSE FALSE  TRUE FALSE FALSE
+[337] FALSE  TRUE  TRUE  TRUE  TRUE  TRUE  TRUE  TRUE  TRUE  TRUE  TRUE FALSE
+[349] FALSE    NA FALSE FALSE  TRUE FALSE FALSE FALSE FALSE  TRUE  TRUE  TRUE
+[361]  TRUE  TRUE  TRUE  TRUE  TRUE  TRUE  TRUE  TRUE  TRUE FALSE FALSE FALSE
+[373] FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE  TRUE  TRUE  TRUE  TRUE
+[385]  TRUE  TRUE  TRUE  TRUE  TRUE FALSE FALSE FALSE FALSE FALSE FALSE FALSE
+[397] FALSE FALSE FALSE FALSE  TRUE  TRUE  TRUE  TRUE  TRUE  TRUE  TRUE  TRUE
+[409]  TRUE  TRUE  TRUE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE
+[421] FALSE FALSE FALSE  TRUE  TRUE  TRUE  TRUE  TRUE  TRUE  TRUE  TRUE  TRUE
+[433]  TRUE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE
+[445] FALSE FALSE  TRUE  TRUE  TRUE  TRUE    NA    NA  TRUE  TRUE  TRUE  TRUE
+[457]  TRUE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE
+[469] FALSE  TRUE  TRUE  TRUE  TRUE  TRUE  TRUE  TRUE  TRUE  TRUE  TRUE  TRUE
+[481]  TRUE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE  TRUE  TRUE
+[493]  TRUE  TRUE  TRUE  TRUE  TRUE  TRUE FALSE FALSE FALSE FALSE FALSE FALSE
+[505] FALSE FALSE FALSE FALSE FALSE FALSE  TRUE  TRUE  TRUE  TRUE  TRUE  TRUE
+[517]  TRUE  TRUE  TRUE  TRUE  TRUE FALSE FALSE FALSE FALSE FALSE FALSE FALSE
+[529] FALSE FALSE FALSE FALSE  TRUE  TRUE  TRUE  TRUE  TRUE  TRUE  TRUE  TRUE
+[541]  TRUE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE
+[553] FALSE  TRUE  TRUE  TRUE  TRUE  TRUE  TRUE  TRUE  TRUE  TRUE  TRUE  TRUE
+[565]  TRUE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE
+[577] FALSE  TRUE  TRUE  TRUE  TRUE  TRUE  TRUE  TRUE  TRUE  TRUE  TRUE  TRUE
+[589] FALSE  TRUE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE
+[601] FALSE FALSE FALSE  TRUE  TRUE  TRUE  TRUE  TRUE  TRUE  TRUE  TRUE  TRUE
+[613]  TRUE  TRUE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE
+[625] FALSE FALSE  TRUE  TRUE  TRUE  TRUE  TRUE  TRUE  TRUE  TRUE  TRUE  TRUE
+[637]  TRUE  TRUE  TRUE FALSE FALSE FALSE FALSE FALSE    NA FALSE FALSE FALSE
+[649] FALSE FALSE FALSE</code></pre>
+</div>
+<div class="sourceCode cell-code" id="cb28"><pre class="sourceCode numberSource r number-lines code-with-copy"><code class="sourceCode r"><span id="cb28-1"><a href="#cb28-1"></a><span class="fu">table</span>(df<span class="sc">$</span>age, df<span class="sc">$</span>age_group, <span class="at">useNA=</span><span class="st">"always"</span>, <span class="at">dnn=</span><span class="fu">list</span>(<span class="st">"age"</span>, <span class="st">""</span>))</span></code><button title="Copy to Clipboard" class="code-copy-button"><i class="bi"></i></button></pre></div>
+<div class="cell-output-display">
+<table>
+<thead>
+<tr class="header">
+<th style="text-align: left;">age/</th>
+<th style="text-align: right;">old</th>
+<th style="text-align: right;">young</th>
+<th style="text-align: right;">NA</th>
+</tr>
+</thead>
+<tbody>
+<tr class="odd">
+<td style="text-align: left;">1</td>
+<td style="text-align: right;">0</td>
+<td style="text-align: right;">44</td>
+<td style="text-align: right;">0</td>
+</tr>
+<tr class="even">
+<td style="text-align: left;">2</td>
+<td style="text-align: right;">0</td>
+<td style="text-align: right;">72</td>
+<td style="text-align: right;">0</td>
+</tr>
+<tr class="odd">
+<td style="text-align: left;">3</td>
+<td style="text-align: right;">0</td>
+<td style="text-align: right;">79</td>
+<td style="text-align: right;">0</td>
+</tr>
+<tr class="even">
+<td style="text-align: left;">4</td>
+<td style="text-align: right;">0</td>
+<td style="text-align: right;">80</td>
+<td style="text-align: right;">0</td>
+</tr>
+<tr class="odd">
+<td style="text-align: left;">5</td>
+<td style="text-align: right;">0</td>
+<td style="text-align: right;">41</td>
+<td style="text-align: right;">0</td>
+</tr>
+<tr class="even">
+<td style="text-align: left;">6</td>
+<td style="text-align: right;">38</td>
+<td style="text-align: right;">0</td>
+<td style="text-align: right;">0</td>
+</tr>
+<tr class="odd">
+<td style="text-align: left;">7</td>
+<td style="text-align: right;">38</td>
+<td style="text-align: right;">0</td>
+<td style="text-align: right;">0</td>
+</tr>
+<tr class="even">
+<td style="text-align: left;">8</td>
+<td style="text-align: right;">39</td>
+<td style="text-align: right;">0</td>
+<td style="text-align: right;">0</td>
+</tr>
+<tr class="odd">
+<td style="text-align: left;">9</td>
+<td style="text-align: right;">20</td>
+<td style="text-align: right;">0</td>
+<td style="text-align: right;">0</td>
+</tr>
+<tr class="even">
+<td style="text-align: left;">10</td>
+<td style="text-align: right;">44</td>
+<td style="text-align: right;">0</td>
+<td style="text-align: right;">0</td>
+</tr>
+<tr class="odd">
+<td style="text-align: left;">11</td>
+<td style="text-align: right;">41</td>
+<td style="text-align: right;">0</td>
+<td style="text-align: right;">0</td>
+</tr>
+<tr class="even">
+<td style="text-align: left;">12</td>
+<td style="text-align: right;">23</td>
+<td style="text-align: right;">0</td>
+<td style="text-align: right;">0</td>
+</tr>
+<tr class="odd">
+<td style="text-align: left;">13</td>
+<td style="text-align: right;">35</td>
+<td style="text-align: right;">0</td>
+<td style="text-align: right;">0</td>
+</tr>
+<tr class="even">
+<td style="text-align: left;">14</td>
+<td style="text-align: right;">37</td>
+<td style="text-align: right;">0</td>
+<td style="text-align: right;">0</td>
+</tr>
+<tr class="odd">
+<td style="text-align: left;">15</td>
+<td style="text-align: right;">11</td>
+<td style="text-align: right;">0</td>
+<td style="text-align: right;">0</td>
+</tr>
+<tr class="even">
+<td style="text-align: left;">NA</td>
+<td style="text-align: right;">0</td>
+<td style="text-align: right;">0</td>
+<td style="text-align: right;">9</td>
+</tr>
+</tbody>
+</table>
+</div>
+</div>
 </section>
 <section id="nesting-ifelse-statements-example" class="slide level2">
 <h2>Nesting <code>ifelse</code> statements example</h2>
 <div class="cell">
-<div class="sourceCode cell-code" id="cb24"><pre class="sourceCode numberSource r number-lines code-with-copy"><code class="sourceCode r"><span id="cb24-1"><a href="#cb24-1"></a>df<span class="sc">$</span>age_group <span class="ot">&lt;-</span> <span class="fu">ifelse</span>(df<span class="sc">$</span>age <span class="sc">&lt;=</span> <span class="dv">5</span>, <span class="st">"young"</span>, </span>
-<span id="cb24-2"><a href="#cb24-2"></a>                       <span class="fu">ifelse</span>(df<span class="sc">$</span>age<span class="sc">&lt;=</span><span class="dv">10</span> <span class="sc">&amp;</span> df<span class="sc">$</span>age<span class="sc">&gt;</span><span class="dv">5</span>, <span class="st">"middle"</span>, </span>
-<span id="cb24-3"><a href="#cb24-3"></a>                              <span class="fu">ifelse</span>(df<span class="sc">$</span>age<span class="sc">&gt;</span><span class="dv">10</span>, <span class="st">"old"</span>, <span class="cn">NA</span>)))</span>
-<span id="cb24-4"><a href="#cb24-4"></a><span class="fu">head</span>(df)</span></code><button title="Copy to Clipboard" class="code-copy-button"><i class="bi"></i></button></pre></div>
+<div class="sourceCode cell-code" id="cb29"><pre class="sourceCode numberSource r number-lines code-with-copy"><code class="sourceCode r"><span id="cb29-1"><a href="#cb29-1"></a>df<span class="sc">$</span>age_group <span class="ot">&lt;-</span> <span class="fu">ifelse</span>(df<span class="sc">$</span>age <span class="sc">&lt;=</span> <span class="dv">5</span>, <span class="st">"young"</span>, </span>
+<span id="cb29-2"><a href="#cb29-2"></a>                       <span class="fu">ifelse</span>(df<span class="sc">$</span>age<span class="sc">&lt;=</span><span class="dv">10</span> <span class="sc">&amp;</span> df<span class="sc">$</span>age<span class="sc">&gt;</span><span class="dv">5</span>, <span class="st">"middle"</span>, <span class="st">"old"</span>))</span>
+<span id="cb29-3"><a href="#cb29-3"></a><span class="fu">table</span>(df<span class="sc">$</span>age, df<span class="sc">$</span>age_group, <span class="at">useNA=</span><span class="st">"always"</span>, <span class="at">dnn=</span><span class="fu">list</span>(<span class="st">"age"</span>, <span class="st">""</span>))</span></code><button title="Copy to Clipboard" class="code-copy-button"><i class="bi"></i></button></pre></div>
 <div class="cell-output-display">
 <table>
-<colgroup>
-<col style="width: 20%">
-<col style="width: 24%">
-<col style="width: 5%">
-<col style="width: 9%">
-<col style="width: 12%">
-<col style="width: 14%">
-<col style="width: 13%">
-</colgroup>
 <thead>
 <tr class="header">
-<th style="text-align: right;">observation_id</th>
-<th style="text-align: right;">IgG_concentration</th>
-<th style="text-align: right;">age</th>
-<th style="text-align: left;">gender</th>
-<th style="text-align: left;">slum</th>
-<th style="text-align: right;">log_IgG</th>
-<th style="text-align: left;">age_group</th>
+<th style="text-align: left;">age/</th>
+<th style="text-align: right;">middle</th>
+<th style="text-align: right;">old</th>
+<th style="text-align: right;">young</th>
+<th style="text-align: right;">NA</th>
 </tr>
 </thead>
 <tbody>
 <tr class="odd">
-<td style="text-align: right;">5772</td>
-<td style="text-align: right;">0.3176895</td>
-<td style="text-align: right;">2</td>
-<td style="text-align: left;">Female</td>
-<td style="text-align: left;">Non slum</td>
-<td style="text-align: right;">-1.1466807</td>
-<td style="text-align: left;">young</td>
+<td style="text-align: left;">1</td>
+<td style="text-align: right;">0</td>
+<td style="text-align: right;">0</td>
+<td style="text-align: right;">44</td>
+<td style="text-align: right;">0</td>
 </tr>
 <tr class="even">
-<td style="text-align: right;">8095</td>
-<td style="text-align: right;">3.4368231</td>
-<td style="text-align: right;">4</td>
-<td style="text-align: left;">Female</td>
-<td style="text-align: left;">Non slum</td>
-<td style="text-align: right;">1.2345475</td>
-<td style="text-align: left;">young</td>
+<td style="text-align: left;">2</td>
+<td style="text-align: right;">0</td>
+<td style="text-align: right;">0</td>
+<td style="text-align: right;">72</td>
+<td style="text-align: right;">0</td>
 </tr>
 <tr class="odd">
-<td style="text-align: right;">9784</td>
-<td style="text-align: right;">0.3000000</td>
-<td style="text-align: right;">4</td>
-<td style="text-align: left;">Male</td>
-<td style="text-align: left;">Non slum</td>
-<td style="text-align: right;">-1.2039728</td>
-<td style="text-align: left;">young</td>
+<td style="text-align: left;">3</td>
+<td style="text-align: right;">0</td>
+<td style="text-align: right;">0</td>
+<td style="text-align: right;">79</td>
+<td style="text-align: right;">0</td>
 </tr>
 <tr class="even">
-<td style="text-align: right;">9338</td>
-<td style="text-align: right;">143.2363014</td>
-<td style="text-align: right;">4</td>
-<td style="text-align: left;">Male</td>
-<td style="text-align: left;">Non slum</td>
-<td style="text-align: right;">4.9644957</td>
-<td style="text-align: left;">young</td>
+<td style="text-align: left;">4</td>
+<td style="text-align: right;">0</td>
+<td style="text-align: right;">0</td>
+<td style="text-align: right;">80</td>
+<td style="text-align: right;">0</td>
 </tr>
 <tr class="odd">
-<td style="text-align: right;">6369</td>
-<td style="text-align: right;">0.4476534</td>
-<td style="text-align: right;">1</td>
-<td style="text-align: left;">Male</td>
-<td style="text-align: left;">Non slum</td>
-<td style="text-align: right;">-0.8037359</td>
-<td style="text-align: left;">young</td>
+<td style="text-align: left;">5</td>
+<td style="text-align: right;">0</td>
+<td style="text-align: right;">0</td>
+<td style="text-align: right;">41</td>
+<td style="text-align: right;">0</td>
 </tr>
 <tr class="even">
-<td style="text-align: right;">6885</td>
-<td style="text-align: right;">0.0252708</td>
-<td style="text-align: right;">4</td>
-<td style="text-align: left;">Male</td>
-<td style="text-align: left;">Non slum</td>
-<td style="text-align: right;">-3.6781074</td>
-<td style="text-align: left;">young</td>
+<td style="text-align: left;">6</td>
+<td style="text-align: right;">38</td>
+<td style="text-align: right;">0</td>
+<td style="text-align: right;">0</td>
+<td style="text-align: right;">0</td>
+</tr>
+<tr class="odd">
+<td style="text-align: left;">7</td>
+<td style="text-align: right;">38</td>
+<td style="text-align: right;">0</td>
+<td style="text-align: right;">0</td>
+<td style="text-align: right;">0</td>
+</tr>
+<tr class="even">
+<td style="text-align: left;">8</td>
+<td style="text-align: right;">39</td>
+<td style="text-align: right;">0</td>
+<td style="text-align: right;">0</td>
+<td style="text-align: right;">0</td>
+</tr>
+<tr class="odd">
+<td style="text-align: left;">9</td>
+<td style="text-align: right;">20</td>
+<td style="text-align: right;">0</td>
+<td style="text-align: right;">0</td>
+<td style="text-align: right;">0</td>
+</tr>
+<tr class="even">
+<td style="text-align: left;">10</td>
+<td style="text-align: right;">44</td>
+<td style="text-align: right;">0</td>
+<td style="text-align: right;">0</td>
+<td style="text-align: right;">0</td>
+</tr>
+<tr class="odd">
+<td style="text-align: left;">11</td>
+<td style="text-align: right;">0</td>
+<td style="text-align: right;">41</td>
+<td style="text-align: right;">0</td>
+<td style="text-align: right;">0</td>
+</tr>
+<tr class="even">
+<td style="text-align: left;">12</td>
+<td style="text-align: right;">0</td>
+<td style="text-align: right;">23</td>
+<td style="text-align: right;">0</td>
+<td style="text-align: right;">0</td>
+</tr>
+<tr class="odd">
+<td style="text-align: left;">13</td>
+<td style="text-align: right;">0</td>
+<td style="text-align: right;">35</td>
+<td style="text-align: right;">0</td>
+<td style="text-align: right;">0</td>
+</tr>
+<tr class="even">
+<td style="text-align: left;">14</td>
+<td style="text-align: right;">0</td>
+<td style="text-align: right;">37</td>
+<td style="text-align: right;">0</td>
+<td style="text-align: right;">0</td>
+</tr>
+<tr class="odd">
+<td style="text-align: left;">15</td>
+<td style="text-align: right;">0</td>
+<td style="text-align: right;">11</td>
+<td style="text-align: right;">0</td>
+<td style="text-align: right;">0</td>
+</tr>
+<tr class="even">
+<td style="text-align: left;">NA</td>
+<td style="text-align: right;">0</td>
+<td style="text-align: right;">0</td>
+<td style="text-align: right;">0</td>
+<td style="text-align: right;">9</td>
 </tr>
 </tbody>
 </table>
 </div>
 </div>
+<p>Note that it puts the variable levels in alphabetical order; we will show how to change this later.</p>
 </section>
 <section>
 <section id="data-classes" class="title-slide slide level1 center">
@@ -958,173 +1303,19 @@ <h2>Overview - Data Classes</h2>
 <h2><code>class()</code> function</h2>
 <p>The <code>class()</code> function allows you to evaluate the class of an object.</p>
 <div class="cell">
-<div class="sourceCode cell-code" id="cb25"><pre class="sourceCode numberSource r number-lines code-with-copy"><code class="sourceCode r"><span id="cb25-1"><a href="#cb25-1"></a><span class="fu">class</span>(df<span class="sc">$</span>IgG_concentration)</span></code><button title="Copy to Clipboard" class="code-copy-button"><i class="bi"></i></button></pre></div>
+<div class="sourceCode cell-code" id="cb30"><pre class="sourceCode numberSource r number-lines code-with-copy"><code class="sourceCode r"><span id="cb30-1"><a href="#cb30-1"></a><span class="fu">class</span>(df<span class="sc">$</span>IgG_concentration)</span></code><button title="Copy to Clipboard" class="code-copy-button"><i class="bi"></i></button></pre></div>
 <div class="cell-output cell-output-stdout">
 <pre><code>[1] "numeric"</code></pre>
 </div>
-<div class="sourceCode cell-code" id="cb27"><pre class="sourceCode numberSource r number-lines code-with-copy"><code class="sourceCode r"><span id="cb27-1"><a href="#cb27-1"></a><span class="fu">class</span>(df<span class="sc">$</span>age)</span></code><button title="Copy to Clipboard" class="code-copy-button"><i class="bi"></i></button></pre></div>
+<div class="sourceCode cell-code" id="cb32"><pre class="sourceCode numberSource r number-lines code-with-copy"><code class="sourceCode r"><span id="cb32-1"><a href="#cb32-1"></a><span class="fu">class</span>(df<span class="sc">$</span>age)</span></code><button title="Copy to Clipboard" class="code-copy-button"><i class="bi"></i></button></pre></div>
 <div class="cell-output cell-output-stdout">
 <pre><code>[1] "integer"</code></pre>
 </div>
-<div class="sourceCode cell-code" id="cb29"><pre class="sourceCode numberSource r number-lines code-with-copy"><code class="sourceCode r"><span id="cb29-1"><a href="#cb29-1"></a><span class="fu">class</span>(df<span class="sc">$</span>gender)</span></code><button title="Copy to Clipboard" class="code-copy-button"><i class="bi"></i></button></pre></div>
+<div class="sourceCode cell-code" id="cb34"><pre class="sourceCode numberSource r number-lines code-with-copy"><code class="sourceCode r"><span id="cb34-1"><a href="#cb34-1"></a><span class="fu">class</span>(df<span class="sc">$</span>gender)</span></code><button title="Copy to Clipboard" class="code-copy-button"><i class="bi"></i></button></pre></div>
 <div class="cell-output cell-output-stdout">
 <pre><code>[1] "character"</code></pre>
 </div>
 </div>
-<p>Return the First or Last Parts of an Object</p>
-<p>Description:</p>
-<pre><code> Returns the first or last parts of a vector, matrix, table, data
- frame or function.  Since 'head()' and 'tail()' are generic
- functions, they may also have been extended to other classes.</code></pre>
-<p>Usage:</p>
-<pre><code> head(x, ...)
- ## Default S3 method:
- head(x, n = 6L, ...)
- 
- ## S3 method for class 'matrix'
- head(x, n = 6L, ...) # is exported as head.matrix()
- ## NB: The methods for 'data.frame' and 'array'  are identical to the 'matrix' one
- 
- ## S3 method for class 'ftable'
- head(x, n = 6L, ...)
- ## S3 method for class 'function'
- head(x, n = 6L, ...)
- 
- 
- tail(x, ...)
- ## Default S3 method:
- tail(x, n = 6L, keepnums = FALSE, addrownums, ...)
- ## S3 method for class 'matrix'
- tail(x, n = 6L, keepnums = TRUE, addrownums, ...) # exported as tail.matrix()
- ## NB: The methods for 'data.frame', 'array', and 'table'
- ##     are identical to the  'matrix'  one
- 
- ## S3 method for class 'ftable'
- tail(x, n = 6L, keepnums = FALSE, addrownums, ...)
- ## S3 method for class 'function'
- tail(x, n = 6L, ...)
- </code></pre>
-<p>Arguments:</p>
-<pre><code>   x: an object
-
-   n: an integer vector of length up to 'dim(x)' (or 1, for
-      non-dimensioned objects).  A 'logical' is silently coerced to
-      integer.  Values specify the indices to be selected in the
-      corresponding dimension (or along the length) of the object.
-      A positive value of 'n[i]' includes the first/last 'n[i]'
-      indices in that dimension, while a negative value excludes
-      the last/first 'abs(n[i])', including all remaining indices.
-      'NA' or non-specified values (when 'length(n) &lt;
-      length(dim(x))') select all indices in that dimension. Must
-      contain at least one non-missing value.</code></pre>
-<p>keepnums: in each dimension, if no names in that dimension are present, create them using the indices included in that dimension. Ignored if ‘dim(x)’ is ‘NULL’ or its length 1.</p>
-<p>addrownums: deprecated - ‘keepnums’ should be used instead. Taken as the value of ‘keepnums’ if it is explicitly set when ‘keepnums’ is not.</p>
-<pre><code> ...: arguments to be passed to or from other methods.</code></pre>
-<p>Details:</p>
-<pre><code> For vector/array based objects, 'head()' ('tail()') returns a
- subset of the same dimensionality as 'x', usually of the same
- class. For historical reasons, by default they select the first
- (last) 6 indices in the first dimension ("rows") or along the
- length of a non-dimensioned vector, and the full extent (all
- indices) in any remaining dimensions. 'head.matrix()' and
- 'tail.matrix()' are exported.
-
- The default and array(/matrix) methods for 'head()' and 'tail()'
- are quite general. They will work as is for any class which has a
- 'dim()' method, a 'length()' method (only required if 'dim()'
- returns 'NULL'), and a '[' method (that accepts the 'drop'
- argument and can subset in all dimensions in the dimensioned
- case).
-
- For functions, the lines of the deparsed function are returned as
- character strings.
-
- When 'x' is an array(/matrix) of dimensionality two and more,
- 'tail()' will add dimnames similar to how they would appear in a
- full printing of 'x' for all dimensions 'k' where 'n[k]' is
- specified and non-missing and 'dimnames(x)[[k]]' (or 'dimnames(x)'
- itself) is 'NULL'.  Specifically, the form of the added dimnames
- will vary for different dimensions as follows:
-
- 'k=1' (rows): '"[n,]"' (right justified with whitespace padding)
-
- 'k=2' (columns): '"[,n]"' (with _no_ whitespace padding)
-
- 'k&gt;2' (higher dims): '"n"', i.e., the indices as _character_
-      values
-
- Setting 'keepnums = FALSE' suppresses this behaviour.
-
- As 'data.frame' subsetting ('indexing') keeps 'attributes', so do
- the 'head()' and 'tail()' methods for data frames.</code></pre>
-<p>Value:</p>
-<pre><code> An object (usually) like 'x' but generally smaller.  Hence, for
- 'array's, the result corresponds to 'x[.., drop=FALSE]'.  For
- 'ftable' objects 'x', a transformed 'format(x)'.</code></pre>
-<p>Note:</p>
-<pre><code> For array inputs the output of 'tail' when 'keepnums' is 'TRUE',
- any dimnames vectors added for dimensions '&gt;2' are the original
- numeric indices in that dimension _as character vectors_.  This
- means that, e.g., for 3-dimensional array 'arr', 'tail(arr,
- c(2,2,-1))[ , , 2]' and 'tail(arr, c(2,2,-1))[ , , "2"]' may both
- be valid but have completely different meanings.</code></pre>
-<p>Author(s):</p>
-<pre><code> Patrick Burns, improved and corrected by R-Core. Negative argument
- added by Vincent Goulet.  Multi-dimension support added by Gabriel
- Becker.</code></pre>
-<p>Examples:</p>
-<pre><code> head(letters)
- head(letters, n = -6L)
- 
- head(freeny.x, n = 10L)
- head(freeny.y)
- 
- head(iris3)
- head(iris3, c(6L, 2L))
- head(iris3, c(6L, -1L, 2L))
- 
- tail(letters)
- tail(letters, n = -6L)
- 
- tail(freeny.x)
- ## the bottom-right "corner" :
- tail(freeny.x, n = c(4, 2))
- tail(freeny.y)
- 
- tail(iris3)
- tail(iris3, c(6L, 2L))
- tail(iris3, c(6L, -1L, 2L))
- 
- ## iris with dimnames stripped
- a3d &lt;- iris3 ; dimnames(a3d) &lt;- NULL
- tail(a3d, c(6, -1, 2)) # keepnums = TRUE is default here!
- tail(a3d, c(6, -1, 2), keepnums = FALSE)
- 
- ## data frame w/ a (non-standard) attribute:
- treeS &lt;- structure(trees, foo = "bar")
- (n &lt;- nrow(treeS))
- stopifnot(exprs = { # attribute is kept
-     identical(htS &lt;- head(treeS), treeS[1:6, ])
-     identical(attr(htS, "foo") , "bar")
-     identical(tlS &lt;- tail(treeS), treeS[(n-5):n, ])
-     ## BUT if I use "useAttrib(.)", this is *not* ok, when n is of length 2:
-     ## --- because [i,j]-indexing of data frames *also* drops "other" attributes ..
-     identical(tail(treeS, 3:2), treeS[(n-2):n, 2:3] )
- })
- 
- tail(library) # last lines of function
- 
- head(stats::ftable(Titanic))
- 
- ## 1d-array (with named dim) :
- a1 &lt;- array(1:7, 7); names(dim(a1)) &lt;- "O2"
- stopifnot(exprs = {
-   identical( tail(a1, 10), a1)
-   identical( head(a1, 10), a1)
-   identical( head(a1, 1), a1 [1 , drop=FALSE] ) # was a1[1] in R &lt;= 3.6.x
-   identical( tail(a1, 2), a1[6:7])
-   identical( tail(a1, 1), a1 [7 , drop=FALSE] ) # was a1[7] in R &lt;= 3.6.x
- })</code></pre>
 </section>
 <section id="one-dimensional-data-types" class="slide level2">
 <h2>One dimensional data types</h2>
@@ -1144,19 +1335,19 @@ <h2>Character and numeric</h2>
 <p>This can also be a bit tricky.</p>
 <p>If only one character in the whole vector, the class is assumed to be character</p>
 <div class="cell">
-<div class="sourceCode cell-code" id="cb40"><pre class="sourceCode numberSource r number-lines code-with-copy"><code class="sourceCode r"><span id="cb40-1"><a href="#cb40-1"></a><span class="fu">class</span>(<span class="fu">c</span>(<span class="dv">1</span>, <span class="dv">2</span>, <span class="st">"tree"</span>)) </span></code><button title="Copy to Clipboard" class="code-copy-button"><i class="bi"></i></button></pre></div>
+<div class="sourceCode cell-code" id="cb36"><pre class="sourceCode numberSource r number-lines code-with-copy"><code class="sourceCode r"><span id="cb36-1"><a href="#cb36-1"></a><span class="fu">class</span>(<span class="fu">c</span>(<span class="dv">1</span>, <span class="dv">2</span>, <span class="st">"tree"</span>)) </span></code><button title="Copy to Clipboard" class="code-copy-button"><i class="bi"></i></button></pre></div>
 <div class="cell-output cell-output-stdout">
 <pre><code>[1] "character"</code></pre>
 </div>
 </div>
 <p>Here because integers are in quotations, it is read as a character class by R.</p>
 <div class="cell">
-<div class="sourceCode cell-code" id="cb42"><pre class="sourceCode numberSource r number-lines code-with-copy"><code class="sourceCode r"><span id="cb42-1"><a href="#cb42-1"></a><span class="fu">class</span>(<span class="fu">c</span>(<span class="st">"1"</span>, <span class="st">"4"</span>, <span class="st">"7"</span>)) </span></code><button title="Copy to Clipboard" class="code-copy-button"><i class="bi"></i></button></pre></div>
+<div class="sourceCode cell-code" id="cb38"><pre class="sourceCode numberSource r number-lines code-with-copy"><code class="sourceCode r"><span id="cb38-1"><a href="#cb38-1"></a><span class="fu">class</span>(<span class="fu">c</span>(<span class="st">"1"</span>, <span class="st">"4"</span>, <span class="st">"7"</span>)) </span></code><button title="Copy to Clipboard" class="code-copy-button"><i class="bi"></i></button></pre></div>
 <div class="cell-output cell-output-stdout">
 <pre><code>[1] "character"</code></pre>
 </div>
 </div>
-<p>Note, this is the first time we have shown you nested functions. Here, instead of creating a new vector object (e.g., <code>x &lt;- c("1", "4", "7")</code>) and then feeding the vector object <code>x</code> into the first argument of the <code>class()</code> function (e.g., <code>class(x)</code>), we combined the two steps and directly fed a vector object into the class function.</p>
+<p>Note that instead of creating a new vector object (e.g., <code>x &lt;- c("1", "4", "7")</code>) and then feeding the vector object <code>x</code> into the first argument of the <code>class()</code> function (e.g., <code>class(x)</code>), we combined the two steps and fed a vector object directly into the <code>class()</code> function.</p>
 </section>
 <section id="numeric-subclasses" class="slide level2">
 <h2>Numeric Subclasses</h2>
@@ -1167,19 +1358,19 @@ <h2>Numeric Subclasses</h2>
 </ol>
 <p><code>typeof()</code> identifies the vector type (double, integer, logical, or character), whereas <code>class()</code> identifies the root class. The difference between the two will be more clear when we look at two dimensional classes below.</p>
 <div class="cell">
-<div class="sourceCode cell-code" id="cb44"><pre class="sourceCode numberSource r number-lines code-with-copy"><code class="sourceCode r"><span id="cb44-1"><a href="#cb44-1"></a><span class="fu">class</span>(df<span class="sc">$</span>IgG_concentration)</span></code><button title="Copy to Clipboard" class="code-copy-button"><i class="bi"></i></button></pre></div>
+<div class="sourceCode cell-code" id="cb40"><pre class="sourceCode numberSource r number-lines code-with-copy"><code class="sourceCode r"><span id="cb40-1"><a href="#cb40-1"></a><span class="fu">class</span>(df<span class="sc">$</span>IgG_concentration)</span></code><button title="Copy to Clipboard" class="code-copy-button"><i class="bi"></i></button></pre></div>
 <div class="cell-output cell-output-stdout">
 <pre><code>[1] "numeric"</code></pre>
 </div>
-<div class="sourceCode cell-code" id="cb46"><pre class="sourceCode numberSource r number-lines code-with-copy"><code class="sourceCode r"><span id="cb46-1"><a href="#cb46-1"></a><span class="fu">class</span>(df<span class="sc">$</span>age)</span></code><button title="Copy to Clipboard" class="code-copy-button"><i class="bi"></i></button></pre></div>
+<div class="sourceCode cell-code" id="cb42"><pre class="sourceCode numberSource r number-lines code-with-copy"><code class="sourceCode r"><span id="cb42-1"><a href="#cb42-1"></a><span class="fu">class</span>(df<span class="sc">$</span>age)</span></code><button title="Copy to Clipboard" class="code-copy-button"><i class="bi"></i></button></pre></div>
 <div class="cell-output cell-output-stdout">
 <pre><code>[1] "integer"</code></pre>
 </div>
-<div class="sourceCode cell-code" id="cb48"><pre class="sourceCode numberSource r number-lines code-with-copy"><code class="sourceCode r"><span id="cb48-1"><a href="#cb48-1"></a><span class="fu">typeof</span>(df<span class="sc">$</span>IgG_concentration)</span></code><button title="Copy to Clipboard" class="code-copy-button"><i class="bi"></i></button></pre></div>
+<div class="sourceCode cell-code" id="cb44"><pre class="sourceCode numberSource r number-lines code-with-copy"><code class="sourceCode r"><span id="cb44-1"><a href="#cb44-1"></a><span class="fu">typeof</span>(df<span class="sc">$</span>IgG_concentration)</span></code><button title="Copy to Clipboard" class="code-copy-button"><i class="bi"></i></button></pre></div>
 <div class="cell-output cell-output-stdout">
 <pre><code>[1] "double"</code></pre>
 </div>
-<div class="sourceCode cell-code" id="cb50"><pre class="sourceCode numberSource r number-lines code-with-copy"><code class="sourceCode r"><span id="cb50-1"><a href="#cb50-1"></a><span class="fu">typeof</span>(df<span class="sc">$</span>age)</span></code><button title="Copy to Clipboard" class="code-copy-button"><i class="bi"></i></button></pre></div>
+<div class="sourceCode cell-code" id="cb46"><pre class="sourceCode numberSource r number-lines code-with-copy"><code class="sourceCode r"><span id="cb46-1"><a href="#cb46-1"></a><span class="fu">typeof</span>(df<span class="sc">$</span>age)</span></code><button title="Copy to Clipboard" class="code-copy-button"><i class="bi"></i></button></pre></div>
 <div class="cell-output cell-output-stdout">
 <pre><code>[1] "integer"</code></pre>
 </div>
@@ -1187,35 +1378,35 @@ <h2>Numeric Subclasses</h2>
 </section>
 <section id="logical" class="slide level2">
 <h2>Logical</h2>
-<p>Reminder <code>logical</code> is a type that only has two possible elements: <code>TRUE</code> and <code>FALSE</code>.</p>
+<p>Reminder: <code>logical</code> is a type that has only three possible elements: <code>TRUE</code>, <code>FALSE</code>, and <code>NA</code>.</p>
 <div class="cell">
-<div class="sourceCode cell-code" id="cb52"><pre class="sourceCode numberSource r number-lines code-with-copy"><code class="sourceCode r"><span id="cb52-1"><a href="#cb52-1"></a><span class="fu">class</span>(<span class="fu">c</span>(<span class="cn">TRUE</span>, <span class="cn">FALSE</span>, <span class="cn">TRUE</span>, <span class="cn">TRUE</span>, <span class="cn">FALSE</span>))</span></code><button title="Copy to Clipboard" class="code-copy-button"><i class="bi"></i></button></pre></div>
+<div class="sourceCode cell-code" id="cb48"><pre class="sourceCode numberSource r number-lines code-with-copy"><code class="sourceCode r"><span id="cb48-1"><a href="#cb48-1"></a><span class="fu">class</span>(<span class="fu">c</span>(<span class="cn">TRUE</span>, <span class="cn">FALSE</span>, <span class="cn">TRUE</span>, <span class="cn">TRUE</span>, <span class="cn">FALSE</span>))</span></code><button title="Copy to Clipboard" class="code-copy-button"><i class="bi"></i></button></pre></div>
 <div class="cell-output cell-output-stdout">
 <pre><code>[1] "logical"</code></pre>
 </div>
 </div>
-<p>Note that <code>logical</code> elements are NOT in quotes. Putting R special classes (e.g., <code>NA</code> or <code>FALSE</code>) in quotations turns them into character value.</p>
+<p>Note that when creating a <code>logical</code> object, <code>TRUE</code> and <code>FALSE</code> are NOT in quotes. Putting R special values (e.g., <code>NA</code> or <code>FALSE</code>) in quotation marks turns them into character values.</p>
 </section>
 <section id="other-useful-functions-for-evaluatingsetting-classes" class="slide level2">
 <h2>Other useful functions for evaluating/setting classes</h2>
 <p>There are two useful functions associated with practically all R classes:</p>
 <ul>
 <li><code>is.CLASS_NAME(x)</code> to <strong>logically check</strong> whether or not <code>x</code> is of certain class. For example, <code>is.integer</code> or <code>is.character</code> or <code>is.numeric</code></li>
-<li><code>as.CLASS_NAME(x)</code> to <strong>coerce between classes</strong> <code>x</code> from current <code>x</code> class into a certain class. For example, <code>as.integer</code> or <code>as.character</code> or <code>as.numeric</code>. This is particularly useful is maybe integer variable was read in as a character variable, or when you need to change a character variable to a factor variable (more on this later).</li>
+<li><code>as.CLASS_NAME(x)</code> to <strong>coerce between classes</strong>, converting <code>x</code> from its current class into another class. For example, <code>as.integer</code> or <code>as.character</code> or <code>as.numeric</code>. This is particularly useful if, say, an integer variable was read in as a character variable, or when you need to change a character variable to a factor variable (more on this later).</li>
 </ul>
 </section>
 <section id="examples-is.class_namex" class="slide level2">
 <h2>Examples <code>is.CLASS_NAME(x)</code></h2>
 <div class="cell">
-<div class="sourceCode cell-code" id="cb54"><pre class="sourceCode numberSource r number-lines code-with-copy"><code class="sourceCode r"><span id="cb54-1"><a href="#cb54-1"></a><span class="fu">is.numeric</span>(df<span class="sc">$</span>IgG_concentration)</span></code><button title="Copy to Clipboard" class="code-copy-button"><i class="bi"></i></button></pre></div>
+<div class="sourceCode cell-code" id="cb50"><pre class="sourceCode numberSource r number-lines code-with-copy"><code class="sourceCode r"><span id="cb50-1"><a href="#cb50-1"></a><span class="fu">is.numeric</span>(df<span class="sc">$</span>IgG_concentration)</span></code><button title="Copy to Clipboard" class="code-copy-button"><i class="bi"></i></button></pre></div>
 <div class="cell-output cell-output-stdout">
 <pre><code>[1] TRUE</code></pre>
 </div>
-<div class="sourceCode cell-code" id="cb56"><pre class="sourceCode numberSource r number-lines code-with-copy"><code class="sourceCode r"><span id="cb56-1"><a href="#cb56-1"></a><span class="fu">is.character</span>(df<span class="sc">$</span>age)</span></code><button title="Copy to Clipboard" class="code-copy-button"><i class="bi"></i></button></pre></div>
+<div class="sourceCode cell-code" id="cb52"><pre class="sourceCode numberSource r number-lines code-with-copy"><code class="sourceCode r"><span id="cb52-1"><a href="#cb52-1"></a><span class="fu">is.character</span>(df<span class="sc">$</span>age)</span></code><button title="Copy to Clipboard" class="code-copy-button"><i class="bi"></i></button></pre></div>
 <div class="cell-output cell-output-stdout">
 <pre><code>[1] FALSE</code></pre>
 </div>
-<div class="sourceCode cell-code" id="cb58"><pre class="sourceCode numberSource r number-lines code-with-copy"><code class="sourceCode r"><span id="cb58-1"><a href="#cb58-1"></a><span class="fu">is.character</span>(df<span class="sc">$</span>gender)</span></code><button title="Copy to Clipboard" class="code-copy-button"><i class="bi"></i></button></pre></div>
+<div class="sourceCode cell-code" id="cb54"><pre class="sourceCode numberSource r number-lines code-with-copy"><code class="sourceCode r"><span id="cb54-1"><a href="#cb54-1"></a><span class="fu">is.character</span>(df<span class="sc">$</span>gender)</span></code><button title="Copy to Clipboard" class="code-copy-button"><i class="bi"></i></button></pre></div>
 <div class="cell-output cell-output-stdout">
 <pre><code>[1] TRUE</code></pre>
 </div>
@@ -1225,29 +1416,29 @@ <h2>Examples <code>is.CLASS_NAME(x)</code></h2>
 <h2>Examples <code>as.CLASS_NAME(x)</code></h2>
 <p>In some cases, coercing is seamless</p>
 <div class="cell">
-<div class="sourceCode cell-code" id="cb60"><pre class="sourceCode numberSource r number-lines code-with-copy"><code class="sourceCode r"><span id="cb60-1"><a href="#cb60-1"></a><span class="fu">as.character</span>(<span class="fu">c</span>(<span class="dv">1</span>, <span class="dv">4</span>, <span class="dv">7</span>))</span></code><button title="Copy to Clipboard" class="code-copy-button"><i class="bi"></i></button></pre></div>
+<div class="sourceCode cell-code" id="cb56"><pre class="sourceCode numberSource r number-lines code-with-copy"><code class="sourceCode r"><span id="cb56-1"><a href="#cb56-1"></a><span class="fu">as.character</span>(<span class="fu">c</span>(<span class="dv">1</span>, <span class="dv">4</span>, <span class="dv">7</span>))</span></code><button title="Copy to Clipboard" class="code-copy-button"><i class="bi"></i></button></pre></div>
 <div class="cell-output cell-output-stdout">
 <pre><code>[1] "1" "4" "7"</code></pre>
 </div>
-<div class="sourceCode cell-code" id="cb62"><pre class="sourceCode numberSource r number-lines code-with-copy"><code class="sourceCode r"><span id="cb62-1"><a href="#cb62-1"></a><span class="fu">as.numeric</span>(<span class="fu">c</span>(<span class="st">"1"</span>, <span class="st">"4"</span>, <span class="st">"7"</span>))</span></code><button title="Copy to Clipboard" class="code-copy-button"><i class="bi"></i></button></pre></div>
+<div class="sourceCode cell-code" id="cb58"><pre class="sourceCode numberSource r number-lines code-with-copy"><code class="sourceCode r"><span id="cb58-1"><a href="#cb58-1"></a><span class="fu">as.numeric</span>(<span class="fu">c</span>(<span class="st">"1"</span>, <span class="st">"4"</span>, <span class="st">"7"</span>))</span></code><button title="Copy to Clipboard" class="code-copy-button"><i class="bi"></i></button></pre></div>
 <div class="cell-output cell-output-stdout">
 <pre><code>[1] 1 4 7</code></pre>
 </div>
-<div class="sourceCode cell-code" id="cb64"><pre class="sourceCode numberSource r number-lines code-with-copy"><code class="sourceCode r"><span id="cb64-1"><a href="#cb64-1"></a><span class="fu">as.logical</span>(<span class="fu">c</span>(<span class="st">"TRUE"</span>, <span class="st">"FALSE"</span>, <span class="st">"FALSE"</span>))</span></code><button title="Copy to Clipboard" class="code-copy-button"><i class="bi"></i></button></pre></div>
+<div class="sourceCode cell-code" id="cb60"><pre class="sourceCode numberSource r number-lines code-with-copy"><code class="sourceCode r"><span id="cb60-1"><a href="#cb60-1"></a><span class="fu">as.logical</span>(<span class="fu">c</span>(<span class="st">"TRUE"</span>, <span class="st">"FALSE"</span>, <span class="st">"FALSE"</span>))</span></code><button title="Copy to Clipboard" class="code-copy-button"><i class="bi"></i></button></pre></div>
 <div class="cell-output cell-output-stdout">
 <pre><code>[1]  TRUE FALSE FALSE</code></pre>
 </div>
 </div>
-<p>In some cases the coercing is not possible; if executed, will return <code>NA</code> (an R constant representing “<strong>N</strong>ot <strong>A</strong>vailable” i.e.&nbsp;missing value)</p>
+<p>In some cases coercion is not possible; if attempted, the result is <code>NA</code>.</p>
 <div class="cell">
-<div class="sourceCode cell-code" id="cb66"><pre class="sourceCode numberSource r number-lines code-with-copy"><code class="sourceCode r"><span id="cb66-1"><a href="#cb66-1"></a><span class="fu">as.numeric</span>(<span class="fu">c</span>(<span class="st">"1"</span>, <span class="st">"4"</span>, <span class="st">"7a"</span>))</span></code><button title="Copy to Clipboard" class="code-copy-button"><i class="bi"></i></button></pre></div>
+<div class="sourceCode cell-code" id="cb62"><pre class="sourceCode numberSource r number-lines code-with-copy"><code class="sourceCode r"><span id="cb62-1"><a href="#cb62-1"></a><span class="fu">as.numeric</span>(<span class="fu">c</span>(<span class="st">"1"</span>, <span class="st">"4"</span>, <span class="st">"7a"</span>))</span></code><button title="Copy to Clipboard" class="code-copy-button"><i class="bi"></i></button></pre></div>
 <div class="cell-output cell-output-stderr">
 <pre><code>Warning: NAs introduced by coercion</code></pre>
 </div>
 <div class="cell-output cell-output-stdout">
 <pre><code>[1]  1  4 NA</code></pre>
 </div>
-<div class="sourceCode cell-code" id="cb69"><pre class="sourceCode numberSource r number-lines code-with-copy"><code class="sourceCode r"><span id="cb69-1"><a href="#cb69-1"></a><span class="fu">as.logical</span>(<span class="fu">c</span>(<span class="st">"TRUE"</span>, <span class="st">"FALSE"</span>, <span class="st">"UNKNOWN"</span>))</span></code><button title="Copy to Clipboard" class="code-copy-button"><i class="bi"></i></button></pre></div>
+<div class="sourceCode cell-code" id="cb65"><pre class="sourceCode numberSource r number-lines code-with-copy"><code class="sourceCode r"><span id="cb65-1"><a href="#cb65-1"></a><span class="fu">as.logical</span>(<span class="fu">c</span>(<span class="st">"TRUE"</span>, <span class="st">"FALSE"</span>, <span class="st">"UNKNOWN"</span>))</span></code><button title="Copy to Clipboard" class="code-copy-button"><i class="bi"></i></button></pre></div>
 <div class="cell-output cell-output-stdout">
 <pre><code>[1]  TRUE FALSE    NA</code></pre>
 </div>
@@ -1257,21 +1448,22 @@ <h2>Examples <code>as.CLASS_NAME(x)</code></h2>
 <h2>Factors</h2>
 <p>A <code>factor</code> is a special <code>character</code> vector where the elements have pre-defined groups or ‘levels’. You can think of these as qualitative or categorical variables. Use the <code>factor()</code> function to create factors from character values.</p>
 <div class="cell">
-<div class="sourceCode cell-code" id="cb71"><pre class="sourceCode numberSource r number-lines code-with-copy"><code class="sourceCode r"><span id="cb71-1"><a href="#cb71-1"></a><span class="fu">class</span>(df<span class="sc">$</span>age_group)</span></code><button title="Copy to Clipboard" class="code-copy-button"><i class="bi"></i></button></pre></div>
+<div class="sourceCode cell-code" id="cb67"><pre class="sourceCode numberSource r number-lines code-with-copy"><code class="sourceCode r"><span id="cb67-1"><a href="#cb67-1"></a><span class="fu">class</span>(df<span class="sc">$</span>age_group)</span></code><button title="Copy to Clipboard" class="code-copy-button"><i class="bi"></i></button></pre></div>
 <div class="cell-output cell-output-stdout">
 <pre><code>[1] "character"</code></pre>
 </div>
-<div class="sourceCode cell-code" id="cb73"><pre class="sourceCode numberSource r number-lines code-with-copy"><code class="sourceCode r"><span id="cb73-1"><a href="#cb73-1"></a>df<span class="sc">$</span>age_group_factor <span class="ot">&lt;-</span> <span class="fu">factor</span>(df<span class="sc">$</span>age_group)</span>
-<span id="cb73-2"><a href="#cb73-2"></a><span class="fu">class</span>(df<span class="sc">$</span>age_group_factor)</span></code><button title="Copy to Clipboard" class="code-copy-button"><i class="bi"></i></button></pre></div>
+<div class="sourceCode cell-code" id="cb69"><pre class="sourceCode numberSource r number-lines code-with-copy"><code class="sourceCode r"><span id="cb69-1"><a href="#cb69-1"></a>df<span class="sc">$</span>age_group_factor <span class="ot">&lt;-</span> <span class="fu">factor</span>(df<span class="sc">$</span>age_group)</span>
+<span id="cb69-2"><a href="#cb69-2"></a><span class="fu">class</span>(df<span class="sc">$</span>age_group_factor)</span></code><button title="Copy to Clipboard" class="code-copy-button"><i class="bi"></i></button></pre></div>
 <div class="cell-output cell-output-stdout">
 <pre><code>[1] "factor"</code></pre>
 </div>
-<div class="sourceCode cell-code" id="cb75"><pre class="sourceCode numberSource r number-lines code-with-copy"><code class="sourceCode r"><span id="cb75-1"><a href="#cb75-1"></a><span class="fu">levels</span>(df<span class="sc">$</span>age_group_factor)</span></code><button title="Copy to Clipboard" class="code-copy-button"><i class="bi"></i></button></pre></div>
+<div class="sourceCode cell-code" id="cb71"><pre class="sourceCode numberSource r number-lines code-with-copy"><code class="sourceCode r"><span id="cb71-1"><a href="#cb71-1"></a><span class="fu">levels</span>(df<span class="sc">$</span>age_group_factor)</span></code><button title="Copy to Clipboard" class="code-copy-button"><i class="bi"></i></button></pre></div>
 <div class="cell-output cell-output-stdout">
 <pre><code>[1] "middle" "old"    "young" </code></pre>
 </div>
 </div>
-<p>Note that levels are, by default, set to <strong>alphanumerical</strong> order! And, the first is always the “reference” group. However, we often prefer a different reference group.</p>
+<p>Note 1: levels are, by default, set to <strong>alphanumerical</strong> order, and the first level is always the “reference” group. However, we often prefer a different reference group.</p>
+<p>Note 2: we can also make ordered factors using <code>factor(..., ordered=TRUE)</code>, but we won’t talk more about that.</p>
 </section>
 <section id="reference-groups" class="slide level2">
 <h2>Reference Groups</h2>
@@ -1281,7 +1473,12 @@ <h2>Reference Groups</h2>
 </section>
 <section id="changing-factor-reference" class="slide level2">
 <h2>Changing factor reference</h2>
-<p>Changing the reference group of a factor variable. - If the object is already a factor then use <code>relevel()</code> function and the <code>ref</code> argument to specify the reference. - If the object is a character then use <code>factor()</code> function and <code>levels</code> argument to specify the order of the values, the first being the reference.</p>
+<p>Changing the reference group of a factor variable.</p>
+<ul>
+<li>If the object is already a factor, use the <code>relevel()</code> function and its <code>ref</code> argument to specify the reference.</li>
+<li>If the object is a character, use the <code>factor()</code> function and its <code>levels</code> argument to specify the order of the values, the first being the reference.</li>
+</ul>
+<p>Let’s look at the <code>relevel()</code> help file</p>
 <p>Reorder Levels of Factor</p>
 <p>Description:</p>
 <pre><code> The levels of a factor are re-ordered so that the level specified
@@ -1307,6 +1504,8 @@ <h2>Changing factor reference</h2>
 <p>Examples:</p>
 <pre><code> warpbreaks$tension &lt;- relevel(warpbreaks$tension, ref = "M")
  summary(lm(breaks ~ wool + tension, data = warpbreaks))</code></pre>
+<p><br></p>
+<p>Let’s look at the <code>factor()</code> help file</p>
 <p>Factors</p>
 <p>Description:</p>
 <pre><code> The function 'factor' is used to encode a vector as a factor (the
@@ -1529,16 +1728,16 @@ <h2>Changing factor reference</h2>
 <section id="changing-factor-reference-examples" class="slide level2">
 <h2>Changing factor reference examples</h2>
 <div class="cell">
-<div class="sourceCode cell-code" id="cb96"><pre class="sourceCode numberSource r number-lines code-with-copy"><code class="sourceCode r"><span id="cb96-1"><a href="#cb96-1"></a>df<span class="sc">$</span>age_group_factor <span class="ot">&lt;-</span> <span class="fu">relevel</span>(df<span class="sc">$</span>age_group_factor, <span class="at">ref=</span><span class="st">"young"</span>)</span>
-<span id="cb96-2"><a href="#cb96-2"></a><span class="fu">levels</span>(df<span class="sc">$</span>age_group_factor)</span></code><button title="Copy to Clipboard" class="code-copy-button"><i class="bi"></i></button></pre></div>
+<div class="sourceCode cell-code" id="cb92"><pre class="sourceCode numberSource r number-lines code-with-copy"><code class="sourceCode r"><span id="cb92-1"><a href="#cb92-1"></a>df<span class="sc">$</span>age_group_factor <span class="ot">&lt;-</span> <span class="fu">relevel</span>(df<span class="sc">$</span>age_group_factor, <span class="at">ref=</span><span class="st">"young"</span>)</span>
+<span id="cb92-2"><a href="#cb92-2"></a><span class="fu">levels</span>(df<span class="sc">$</span>age_group_factor)</span></code><button title="Copy to Clipboard" class="code-copy-button"><i class="bi"></i></button></pre></div>
 <div class="cell-output cell-output-stdout">
 <pre><code>[1] "young"  "middle" "old"   </code></pre>
 </div>
 </div>
 <p>OR</p>
 <div class="cell">
-<div class="sourceCode cell-code" id="cb98"><pre class="sourceCode numberSource r number-lines code-with-copy"><code class="sourceCode r"><span id="cb98-1"><a href="#cb98-1"></a>df<span class="sc">$</span>age_group_factor <span class="ot">&lt;-</span> <span class="fu">factor</span>(df<span class="sc">$</span>age_group, <span class="at">levels=</span><span class="fu">c</span>(<span class="st">"young"</span>, <span class="st">"middle"</span>, <span class="st">"old"</span>))</span>
-<span id="cb98-2"><a href="#cb98-2"></a><span class="fu">levels</span>(df<span class="sc">$</span>age_group_factor)</span></code><button title="Copy to Clipboard" class="code-copy-button"><i class="bi"></i></button></pre></div>
+<div class="sourceCode cell-code" id="cb94"><pre class="sourceCode numberSource r number-lines code-with-copy"><code class="sourceCode r"><span id="cb94-1"><a href="#cb94-1"></a>df<span class="sc">$</span>age_group_factor <span class="ot">&lt;-</span> <span class="fu">factor</span>(df<span class="sc">$</span>age_group, <span class="at">levels=</span><span class="fu">c</span>(<span class="st">"young"</span>, <span class="st">"middle"</span>, <span class="st">"old"</span>))</span>
+<span id="cb94-2"><a href="#cb94-2"></a><span class="fu">levels</span>(df<span class="sc">$</span>age_group_factor)</span></code><button title="Copy to Clipboard" class="code-copy-button"><i class="bi"></i></button></pre></div>
 <div class="cell-output cell-output-stdout">
 <pre><code>[1] "young"  "middle" "old"   </code></pre>
 </div>
@@ -1559,7 +1758,7 @@ <h2>Matrices</h2>
 <p><code>as.matrix()</code> creates a matrix from a data frame (where all values are the same class).</p>
 <p>You can also create a matrix from scratch using <code>matrix()</code>. Use <code>?matrix</code> to see the arguments.</p>
 <div class="cell">
-<div class="sourceCode cell-code" id="cb100"><pre class="sourceCode numberSource r number-lines code-with-copy"><code class="sourceCode r"><span id="cb100-1"><a href="#cb100-1"></a><span class="fu">matrix</span>(<span class="dv">1</span><span class="sc">:</span><span class="dv">6</span>, <span class="at">ncol =</span> <span class="dv">2</span>) </span></code><button title="Copy to Clipboard" class="code-copy-button"><i class="bi"></i></button></pre></div>
+<div class="sourceCode cell-code" id="cb96"><pre class="sourceCode numberSource r number-lines code-with-copy"><code class="sourceCode r"><span id="cb96-1"><a href="#cb96-1"></a><span class="fu">matrix</span>(<span class="at">data=</span><span class="dv">1</span><span class="sc">:</span><span class="dv">6</span>, <span class="at">ncol =</span> <span class="dv">2</span>) </span></code><button title="Copy to Clipboard" class="code-copy-button"><i class="bi"></i></button></pre></div>
 <div class="cell-output-display">
 <table>
 <tbody>
@@ -1578,7 +1777,7 @@ <h2>Matrices</h2>
 </tbody>
 </table>
 </div>
-<div class="sourceCode cell-code" id="cb101"><pre class="sourceCode numberSource r number-lines code-with-copy"><code class="sourceCode r"><span id="cb101-1"><a href="#cb101-1"></a><span class="fu">matrix</span>(<span class="dv">1</span><span class="sc">:</span><span class="dv">6</span>, <span class="at">ncol=</span><span class="dv">2</span>, <span class="at">byrow=</span><span class="cn">TRUE</span>) </span></code><button title="Copy to Clipboard" class="code-copy-button"><i class="bi"></i></button></pre></div>
+<div class="sourceCode cell-code" id="cb97"><pre class="sourceCode numberSource r number-lines code-with-copy"><code class="sourceCode r"><span id="cb97-1"><a href="#cb97-1"></a><span class="fu">matrix</span>(<span class="at">data=</span><span class="dv">1</span><span class="sc">:</span><span class="dv">6</span>, <span class="at">ncol=</span><span class="dv">2</span>, <span class="at">byrow=</span><span class="cn">TRUE</span>) </span></code><button title="Copy to Clipboard" class="code-copy-button"><i class="bi"></i></button></pre></div>
 <div class="cell-output-display">
 <table>
 <tbody>
@@ -1598,13 +1797,13 @@ <h2>Matrices</h2>
 </table>
 </div>
 </div>
-<p>Notice, the first matrix filled in numbers 1-6 by columns first and then rows because default <code>byrow</code> argument is FALSE. In the second matrix, we changed the argument <code>byrow</code> to <code>TRUE</code>, and now numbers 1-6 are filled by rows first and then columns.</p>
+<p>Note that the first matrix was filled with the numbers 1-6 by column, because the default value of the <code>byrow</code> argument is <code>FALSE</code>. In the second matrix, we set <code>byrow</code> to <code>TRUE</code>, so the numbers 1-6 are filled by row.</p>
 </section>
 <section id="data-frame" class="slide level2">
-<h2>Data Frame</h2>
-<p>You can transform an existing matrix into data frames and tibble using <code>as.data.frame()</code>.</p>
+<h2>Data frame</h2>
+<p>You can transform an existing matrix into a data frame using <code>as.data.frame()</code>.</p>
 <div class="cell">
-<div class="sourceCode cell-code" id="cb102"><pre class="sourceCode numberSource r number-lines code-with-copy"><code class="sourceCode r"><span id="cb102-1"><a href="#cb102-1"></a><span class="fu">as.data.frame</span>(<span class="fu">matrix</span>(<span class="dv">1</span><span class="sc">:</span><span class="dv">6</span>, <span class="at">ncol =</span> <span class="dv">2</span>) ) </span></code><button title="Copy to Clipboard" class="code-copy-button"><i class="bi"></i></button></pre></div>
+<div class="sourceCode cell-code" id="cb98"><pre class="sourceCode numberSource r number-lines code-with-copy"><code class="sourceCode r"><span id="cb98-1"><a href="#cb98-1"></a><span class="fu">as.data.frame</span>(<span class="fu">matrix</span>(<span class="dv">1</span><span class="sc">:</span><span class="dv">6</span>, <span class="at">ncol =</span> <span class="dv">2</span>) ) </span></code><button title="Copy to Clipboard" class="code-copy-button"><i class="bi"></i></button></pre></div>
 <div class="cell-output-display">
 <table>
 <thead>
@@ -1633,8 +1832,18 @@ <h2>Data Frame</h2>
 </section>
 <section id="numeric-variable-data-summary" class="slide level2">
 <h2>Numeric variable data summary</h2>
-<p>Data summarization on numeric vectors/variables: - <code>mean()</code>: takes the mean of x - <code>sd()</code>: takes the standard deviation of x - <code>median()</code>: takes the median of x - <code>quantile()</code>: displays sample quantiles of x. Default is min, IQR, max - <code>range()</code>: displays the range. Same as <code>c(min(), max())</code> - <code>sum()</code>: sum of x - <code>max()</code>: maximum value in x - <code>min()</code>: minimum value in x</p>
-<p>Note, <strong>all have the </strong> <code>na.rm =</code> <strong>argument for missing data</strong></p>
+<p>Data summarization on numeric vectors/variables:</p>
+<ul>
+<li><code>mean()</code>: takes the mean of x</li>
+<li><code>sd()</code>: takes the standard deviation of x</li>
+<li><code>median()</code>: takes the median of x</li>
+<li><code>quantile()</code>: displays sample quantiles of x. Default is min, quartiles (25%, 50%, 75%), max</li>
+<li><code>range()</code>: displays the range. Same as <code>c(min(x), max(x))</code></li>
+<li><code>sum()</code>: sum of x</li>
+<li><code>max()</code>: maximum value in x</li>
+<li><code>min()</code>: minimum value in x</li>
+</ul>
+<p>Note, <strong>all have the </strong> <code>na.rm</code> <strong>argument for missing data</strong></p>
 <p>Arithmetic Mean</p>
 <p>Description:</p>
 <pre><code> Generic function for the (trimmed) arithmetic mean.</code></pre>
@@ -1677,19 +1886,20 @@ <h2>Numeric variable data summary</h2>
 <section id="numeric-variable-data-summary-examples" class="slide level2">
 <h2>Numeric variable data summary examples</h2>
 <div class="cell">
-<div class="sourceCode cell-code" id="cb111"><pre class="sourceCode numberSource r number-lines code-with-copy"><code class="sourceCode r"><span id="cb111-1"><a href="#cb111-1"></a><span class="fu">summary</span>(df)</span></code><button title="Copy to Clipboard" class="code-copy-button"><i class="bi"></i></button></pre></div>
+<div class="sourceCode cell-code" id="cb107"><pre class="sourceCode numberSource r number-lines code-with-copy"><code class="sourceCode r"><span id="cb107-1"><a href="#cb107-1"></a><span class="fu">summary</span>(df)</span></code><button title="Copy to Clipboard" class="code-copy-button"><i class="bi"></i></button></pre></div>
 <div class="cell-output-display">
 <table style="width:100%;">
 <colgroup>
 <col style="width: 2%">
+<col style="width: 10%">
+<col style="width: 12%">
+<col style="width: 10%">
 <col style="width: 11%">
-<col style="width: 13%">
 <col style="width: 11%">
-<col style="width: 12%">
-<col style="width: 12%">
+<col style="width: 10%">
+<col style="width: 9%">
+<col style="width: 11%">
 <col style="width: 11%">
-<col style="width: 12%">
-<col style="width: 12%">
 </colgroup>
 <thead>
 <tr class="header">
@@ -1700,6 +1910,7 @@ <h2>Numeric variable data summary examples</h2>
 <th style="text-align: left;">gender</th>
 <th style="text-align: left;">slum</th>
 <th style="text-align: left;">log_IgG</th>
+<th style="text-align: left;">seropos</th>
 <th style="text-align: left;">age_group</th>
 <th style="text-align: left;">age_group_factor</th>
 </tr>
@@ -1713,6 +1924,7 @@ <h2>Numeric variable data summary examples</h2>
 <td style="text-align: left;">Length:651</td>
 <td style="text-align: left;">Length:651</td>
 <td style="text-align: left;">Min. :-5.2231</td>
+<td style="text-align: left;">Mode :logical</td>
 <td style="text-align: left;">Length:651</td>
 <td style="text-align: left;">young :316</td>
 </tr>
@@ -1724,6 +1936,7 @@ <h2>Numeric variable data summary examples</h2>
 <td style="text-align: left;">Class :character</td>
 <td style="text-align: left;">Class :character</td>
 <td style="text-align: left;">1st Qu.:-1.2040</td>
+<td style="text-align: left;">FALSE:360</td>
 <td style="text-align: left;">Class :character</td>
 <td style="text-align: left;">middle:179</td>
 </tr>
@@ -1735,6 +1948,7 @@ <h2>Numeric variable data summary examples</h2>
 <td style="text-align: left;">Mode :character</td>
 <td style="text-align: left;">Mode :character</td>
 <td style="text-align: left;">Median : 0.5103</td>
+<td style="text-align: left;">TRUE :281</td>
 <td style="text-align: left;">Mode :character</td>
 <td style="text-align: left;">old :147</td>
 </tr>
@@ -1746,6 +1960,7 @@ <h2>Numeric variable data summary examples</h2>
 <td style="text-align: left;">NA</td>
 <td style="text-align: left;">NA</td>
 <td style="text-align: left;">Mean : 1.6074</td>
+<td style="text-align: left;">NA’s :10</td>
 <td style="text-align: left;">NA</td>
 <td style="text-align: left;">NA’s : 9</td>
 </tr>
@@ -1759,6 +1974,7 @@ <h2>Numeric variable data summary examples</h2>
 <td style="text-align: left;">3rd Qu.: 4.9519</td>
 <td style="text-align: left;">NA</td>
 <td style="text-align: left;">NA</td>
+<td style="text-align: left;">NA</td>
 </tr>
 <tr class="even">
 <td style="text-align: left;"></td>
@@ -1770,6 +1986,7 @@ <h2>Numeric variable data summary examples</h2>
 <td style="text-align: left;">Max. : 6.8205</td>
 <td style="text-align: left;">NA</td>
 <td style="text-align: left;">NA</td>
+<td style="text-align: left;">NA</td>
 </tr>
 <tr class="odd">
 <td style="text-align: left;"></td>
@@ -1781,27 +1998,28 @@ <h2>Numeric variable data summary examples</h2>
 <td style="text-align: left;">NA’s :10</td>
 <td style="text-align: left;">NA</td>
 <td style="text-align: left;">NA</td>
+<td style="text-align: left;">NA</td>
 </tr>
 </tbody>
 </table>
 </div>
-<div class="sourceCode cell-code" id="cb112"><pre class="sourceCode numberSource r number-lines code-with-copy"><code class="sourceCode r"><span id="cb112-1"><a href="#cb112-1"></a><span class="fu">range</span>(df<span class="sc">$</span>age)</span></code><button title="Copy to Clipboard" class="code-copy-button"><i class="bi"></i></button></pre></div>
+<div class="sourceCode cell-code" id="cb108"><pre class="sourceCode numberSource r number-lines code-with-copy"><code class="sourceCode r"><span id="cb108-1"><a href="#cb108-1"></a><span class="fu">range</span>(df<span class="sc">$</span>age)</span></code><button title="Copy to Clipboard" class="code-copy-button"><i class="bi"></i></button></pre></div>
 <div class="cell-output cell-output-stdout">
 <pre><code>[1] NA NA</code></pre>
 </div>
-<div class="sourceCode cell-code" id="cb114"><pre class="sourceCode numberSource r number-lines code-with-copy"><code class="sourceCode r"><span id="cb114-1"><a href="#cb114-1"></a><span class="fu">range</span>(df<span class="sc">$</span>age, <span class="at">na.rm=</span><span class="cn">TRUE</span>)</span></code><button title="Copy to Clipboard" class="code-copy-button"><i class="bi"></i></button></pre></div>
+<div class="sourceCode cell-code" id="cb110"><pre class="sourceCode numberSource r number-lines code-with-copy"><code class="sourceCode r"><span id="cb110-1"><a href="#cb110-1"></a><span class="fu">range</span>(df<span class="sc">$</span>age, <span class="at">na.rm=</span><span class="cn">TRUE</span>)</span></code><button title="Copy to Clipboard" class="code-copy-button"><i class="bi"></i></button></pre></div>
 <div class="cell-output cell-output-stdout">
 <pre><code>[1]  1 15</code></pre>
 </div>
-<div class="sourceCode cell-code" id="cb116"><pre class="sourceCode numberSource r number-lines code-with-copy"><code class="sourceCode r"><span id="cb116-1"><a href="#cb116-1"></a><span class="fu">median</span>(df<span class="sc">$</span>IgG_concentration, <span class="at">na.rm=</span><span class="cn">TRUE</span>)</span></code><button title="Copy to Clipboard" class="code-copy-button"><i class="bi"></i></button></pre></div>
+<div class="sourceCode cell-code" id="cb112"><pre class="sourceCode numberSource r number-lines code-with-copy"><code class="sourceCode r"><span id="cb112-1"><a href="#cb112-1"></a><span class="fu">median</span>(df<span class="sc">$</span>IgG_concentration, <span class="at">na.rm=</span><span class="cn">TRUE</span>)</span></code><button title="Copy to Clipboard" class="code-copy-button"><i class="bi"></i></button></pre></div>
 <div class="cell-output cell-output-stdout">
 <pre><code>[1] 1.665753</code></pre>
 </div>
 </div>
 </section>
 <section id="character-variable-data-summaries" class="slide level2">
-<h2>Character Variable Data Summaries</h2>
-<p>Data summarization on character or factor vectors/variables * <code>table()</code></p>
+<h2>Character variable data summaries</h2>
+<p>Data summarization on character or factor vectors/variables using <code>table()</code></p>
 <p>Cross Tabulation and Table Creation</p>
 <p>Description:</p>
 <pre><code> 'table' uses cross-classifying factors to build a contingency
@@ -1965,7 +2183,7 @@ <h2>Character Variable Data Summaries</h2>
 <h2>Character variable data summary examples</h2>
 <p>Number of observations in each category</p>
 <div class="cell">
-<div class="sourceCode cell-code" id="cb128"><pre class="sourceCode numberSource r number-lines code-with-copy"><code class="sourceCode r"><span id="cb128-1"><a href="#cb128-1"></a><span class="fu">table</span>(df<span class="sc">$</span>gender)</span></code><button title="Copy to Clipboard" class="code-copy-button"><i class="bi"></i></button></pre></div>
+<div class="sourceCode cell-code" id="cb124"><pre class="sourceCode numberSource r number-lines code-with-copy"><code class="sourceCode r"><span id="cb124-1"><a href="#cb124-1"></a><span class="fu">table</span>(df<span class="sc">$</span>gender)</span></code><button title="Copy to Clipboard" class="code-copy-button"><i class="bi"></i></button></pre></div>
 <div class="cell-output-display">
 <table>
 <thead>
@@ -1982,7 +2200,7 @@ <h2>Character variable data summary examples</h2>
 </tbody>
 </table>
 </div>
-<div class="sourceCode cell-code" id="cb129"><pre class="sourceCode numberSource r number-lines code-with-copy"><code class="sourceCode r"><span id="cb129-1"><a href="#cb129-1"></a><span class="fu">table</span>(df<span class="sc">$</span>gender, <span class="at">useNA=</span><span class="st">"always"</span>)</span></code><button title="Copy to Clipboard" class="code-copy-button"><i class="bi"></i></button></pre></div>
+<div class="sourceCode cell-code" id="cb125"><pre class="sourceCode numberSource r number-lines code-with-copy"><code class="sourceCode r"><span id="cb125-1"><a href="#cb125-1"></a><span class="fu">table</span>(df<span class="sc">$</span>gender, <span class="at">useNA=</span><span class="st">"always"</span>)</span></code><button title="Copy to Clipboard" class="code-copy-button"><i class="bi"></i></button></pre></div>
 <div class="cell-output-display">
 <table>
 <thead>
@@ -2001,7 +2219,7 @@ <h2>Character variable data summary examples</h2>
 </tbody>
 </table>
 </div>
-<div class="sourceCode cell-code" id="cb130"><pre class="sourceCode numberSource r number-lines code-with-copy"><code class="sourceCode r"><span id="cb130-1"><a href="#cb130-1"></a><span class="fu">table</span>(df<span class="sc">$</span>age_group, <span class="at">useNA=</span><span class="st">"always"</span>)</span></code><button title="Copy to Clipboard" class="code-copy-button"><i class="bi"></i></button></pre></div>
+<div class="sourceCode cell-code" id="cb126"><pre class="sourceCode numberSource r number-lines code-with-copy"><code class="sourceCode r"><span id="cb126-1"><a href="#cb126-1"></a><span class="fu">table</span>(df<span class="sc">$</span>age_group, <span class="at">useNA=</span><span class="st">"always"</span>)</span></code><button title="Copy to Clipboard" class="code-copy-button"><i class="bi"></i></button></pre></div>
 <div class="cell-output-display">
 <table>
 <thead>
@@ -2023,9 +2241,8 @@ <h2>Character variable data summary examples</h2>
 </table>
 </div>
 </div>
-<p>Percent of observations in each category (xxzane - better way in base r?)</p>
 <div class="cell">
-<div class="sourceCode cell-code" id="cb131"><pre class="sourceCode numberSource r number-lines code-with-copy"><code class="sourceCode r"><span id="cb131-1"><a href="#cb131-1"></a><span class="fu">table</span>(df<span class="sc">$</span>gender)<span class="sc">/</span><span class="fu">nrow</span>(df) <span class="co">#if no NA values</span></span></code><button title="Copy to Clipboard" class="code-copy-button"><i class="bi"></i></button></pre></div>
+<div class="sourceCode cell-code" id="cb127"><pre class="sourceCode numberSource r number-lines code-with-copy"><code class="sourceCode r"><span id="cb127-1"><a href="#cb127-1"></a><span class="fu">table</span>(df<span class="sc">$</span>gender)<span class="sc">/</span><span class="fu">nrow</span>(df) <span class="co">#if no NA values</span></span></code><button title="Copy to Clipboard" class="code-copy-button"><i class="bi"></i></button></pre></div>
 <div class="cell-output-display">
 <table>
 <thead>
@@ -2042,7 +2259,7 @@ <h2>Character variable data summary examples</h2>
 </tbody>
 </table>
 </div>
-<div class="sourceCode cell-code" id="cb132"><pre class="sourceCode numberSource r number-lines code-with-copy"><code class="sourceCode r"><span id="cb132-1"><a href="#cb132-1"></a><span class="fu">table</span>(df<span class="sc">$</span>age_group)<span class="sc">/</span><span class="fu">nrow</span>(df[<span class="sc">!</span><span class="fu">is.na</span>(df<span class="sc">$</span>age_group),]) <span class="co">#if there are NA values</span></span></code><button title="Copy to Clipboard" class="code-copy-button"><i class="bi"></i></button></pre></div>
+<div class="sourceCode cell-code" id="cb128"><pre class="sourceCode numberSource r number-lines code-with-copy"><code class="sourceCode r"><span id="cb128-1"><a href="#cb128-1"></a><span class="fu">table</span>(df<span class="sc">$</span>age_group)<span class="sc">/</span><span class="fu">nrow</span>(df[<span class="sc">!</span><span class="fu">is.na</span>(df<span class="sc">$</span>age_group),]) <span class="co">#if there are NA values</span></span></code><button title="Copy to Clipboard" class="code-copy-button"><i class="bi"></i></button></pre></div>
 <div class="cell-output-display">
 <table>
 <thead>
@@ -2061,7 +2278,7 @@ <h2>Character variable data summary examples</h2>
 </tbody>
 </table>
 </div>
-<div class="sourceCode cell-code" id="cb133"><pre class="sourceCode numberSource r number-lines code-with-copy"><code class="sourceCode r"><span id="cb133-1"><a href="#cb133-1"></a><span class="fu">table</span>(df<span class="sc">$</span>age_group)<span class="sc">/</span><span class="fu">nrow</span>(<span class="fu">subset</span>(df, <span class="sc">!</span><span class="fu">is.na</span>(df<span class="sc">$</span>age_group),)) <span class="co">#if there are NA values</span></span></code><button title="Copy to Clipboard" class="code-copy-button"><i class="bi"></i></button></pre></div>
+<div class="sourceCode cell-code" id="cb129"><pre class="sourceCode numberSource r number-lines code-with-copy"><code class="sourceCode r"><span id="cb129-1"><a href="#cb129-1"></a><span class="fu">table</span>(df<span class="sc">$</span>age_group)<span class="sc">/</span><span class="fu">nrow</span>(<span class="fu">subset</span>(df, <span class="sc">!</span><span class="fu">is.na</span>(df<span class="sc">$</span>age_group),)) <span class="co">#if there are NA values</span></span></code><button title="Copy to Clipboard" class="code-copy-button"><i class="bi"></i></button></pre></div>
 <div class="cell-output-display">
 <table>
 <thead>
@@ -2091,12 +2308,12 @@ <h2>Summary</h2>
 <li><code>is.CLASS_NAME(x)</code> can be used to test the class of an object x</li>
 <li><code>as.CLASS_NAME(x)</code> can be used to change the class of an object x</li>
 <li>Factors are a special character class that has levels</li>
-<li>…</li>
+<li>…xxamy complete</li>
 </ul>
 </section>
 <section id="acknowledgements" class="slide level2">
 <h2>Acknowledgements</h2>
-<p>These are the materials I looked through, modified, or extracted to complete this module’s lecture.</p>
+<p>These are the materials we looked through, modified, or extracted to complete this module’s lecture.</p>
 <ul>
 <li><a href="https://jhudatascience.org/intro_to_r/">“Introduction to R for Public Health Researchers” Johns Hopkins University</a></li>
 </ul>
diff --git a/docs/modules/Module08-DataMergeReshape.html b/docs/modules/Module08-DataMergeReshape.html
new file mode 100644
index 0000000..9b0bcad
--- /dev/null
+++ b/docs/modules/Module08-DataMergeReshape.html
@@ -0,0 +1,1587 @@
+<!DOCTYPE html>
+<html lang="en"><head>
+<script src="../site_libs/clipboard/clipboard.min.js"></script>
+<script src="../site_libs/quarto-html/tabby.min.js"></script>
+<script src="../site_libs/quarto-html/popper.min.js"></script>
+<script src="../site_libs/quarto-html/tippy.umd.min.js"></script>
+<link href="../site_libs/quarto-html/tippy.css" rel="stylesheet">
+<link href="../site_libs/quarto-html/light-border.css" rel="stylesheet">
+<link href="../site_libs/quarto-html/quarto-html.min.css" rel="stylesheet" data-mode="light">
+<link href="../site_libs/quarto-html/quarto-syntax-highlighting.css" rel="stylesheet" id="quarto-text-highlighting-styles"><meta charset="utf-8">
+  <meta name="generator" content="quarto-1.3.353">
+
+  <meta name="author" content="Amy Winter">
+  <meta name="author" content="Zane Billings">
+  <title>SISMID Module NUMBER Materials (2025) - Module 8: Data Merging and Reshaping</title>
+  <meta name="apple-mobile-web-app-capable" content="yes">
+  <meta name="apple-mobile-web-app-status-bar-style" content="black-translucent">
+  <meta name="viewport" content="width=device-width, initial-scale=1.0, maximum-scale=1.0, user-scalable=no, minimal-ui">
+  <link rel="stylesheet" href="../site_libs/revealjs/dist/reset.css">
+  <link rel="stylesheet" href="../site_libs/revealjs/dist/reveal.css">
+  <style>
+    code{white-space: pre-wrap;}
+    span.smallcaps{font-variant: small-caps;}
+    div.columns{display: flex; gap: min(4vw, 1.5em);}
+    div.column{flex: auto; overflow-x: auto;}
+    div.hanging-indent{margin-left: 1.5em; text-indent: -1.5em;}
+    ul.task-list{list-style: none;}
+    ul.task-list li input[type="checkbox"] {
+      width: 0.8em;
+      margin: 0 0.8em 0.2em -1em; /* quarto-specific, see https://github.com/quarto-dev/quarto-cli/issues/4556 */ 
+      vertical-align: middle;
+    }
+    /* CSS for syntax highlighting */
+    pre > code.sourceCode { white-space: pre; position: relative; }
+    pre > code.sourceCode > span { display: inline-block; line-height: 1.25; }
+    pre > code.sourceCode > span:empty { height: 1.2em; }
+    .sourceCode { overflow: visible; }
+    code.sourceCode > span { color: inherit; text-decoration: inherit; }
+    div.sourceCode { margin: 1em 0; }
+    pre.sourceCode { margin: 0; }
+    @media screen {
+    div.sourceCode { overflow: auto; }
+    }
+    @media print {
+    pre > code.sourceCode { white-space: pre-wrap; }
+    pre > code.sourceCode > span { text-indent: -5em; padding-left: 5em; }
+    }
+    pre.numberSource code
+      { counter-reset: source-line 0; }
+    pre.numberSource code > span
+      { position: relative; left: -4em; counter-increment: source-line; }
+    pre.numberSource code > span > a:first-child::before
+      { content: counter(source-line);
+        position: relative; left: -1em; text-align: right; vertical-align: baseline;
+        border: none; display: inline-block;
+        -webkit-touch-callout: none; -webkit-user-select: none;
+        -khtml-user-select: none; -moz-user-select: none;
+        -ms-user-select: none; user-select: none;
+        padding: 0 4px; width: 4em;
+        color: #aaaaaa;
+      }
+    pre.numberSource { margin-left: 3em; border-left: 1px solid #aaaaaa;  padding-left: 4px; }
+    div.sourceCode
+      { color: #003b4f; background-color: #f1f3f5; }
+    @media screen {
+    pre > code.sourceCode > span > a:first-child::before { text-decoration: underline; }
+    }
+    code span { color: #003b4f; } /* Normal */
+    code span.al { color: #ad0000; } /* Alert */
+    code span.an { color: #5e5e5e; } /* Annotation */
+    code span.at { color: #657422; } /* Attribute */
+    code span.bn { color: #ad0000; } /* BaseN */
+    code span.bu { } /* BuiltIn */
+    code span.cf { color: #003b4f; } /* ControlFlow */
+    code span.ch { color: #20794d; } /* Char */
+    code span.cn { color: #8f5902; } /* Constant */
+    code span.co { color: #5e5e5e; } /* Comment */
+    code span.cv { color: #5e5e5e; font-style: italic; } /* CommentVar */
+    code span.do { color: #5e5e5e; font-style: italic; } /* Documentation */
+    code span.dt { color: #ad0000; } /* DataType */
+    code span.dv { color: #ad0000; } /* DecVal */
+    code span.er { color: #ad0000; } /* Error */
+    code span.ex { } /* Extension */
+    code span.fl { color: #ad0000; } /* Float */
+    code span.fu { color: #4758ab; } /* Function */
+    code span.im { color: #00769e; } /* Import */
+    code span.in { color: #5e5e5e; } /* Information */
+    code span.kw { color: #003b4f; } /* Keyword */
+    code span.op { color: #5e5e5e; } /* Operator */
+    code span.ot { color: #003b4f; } /* Other */
+    code span.pp { color: #ad0000; } /* Preprocessor */
+    code span.sc { color: #5e5e5e; } /* SpecialChar */
+    code span.ss { color: #20794d; } /* SpecialString */
+    code span.st { color: #20794d; } /* String */
+    code span.va { color: #111111; } /* Variable */
+    code span.vs { color: #20794d; } /* VerbatimString */
+    code span.wa { color: #5e5e5e; font-style: italic; } /* Warning */
+  </style>
+  <link rel="stylesheet" href="../site_libs/revealjs/dist/theme/quarto.css">
+  <link href="../site_libs/revealjs/plugin/quarto-line-highlight/line-highlight.css" rel="stylesheet">
+  <link href="../site_libs/revealjs/plugin/reveal-menu/menu.css" rel="stylesheet">
+  <link href="../site_libs/revealjs/plugin/reveal-menu/quarto-menu.css" rel="stylesheet">
+  <link href="../site_libs/revealjs/plugin/quarto-support/footer.css" rel="stylesheet">
+  <style type="text/css">
+
+  .callout {
+    margin-top: 1em;
+    margin-bottom: 1em;  
+    border-radius: .25rem;
+  }
+
+  .callout.callout-style-simple { 
+    padding: 0em 0.5em;
+    border-left: solid #acacac .3rem;
+    border-right: solid 1px silver;
+    border-top: solid 1px silver;
+    border-bottom: solid 1px silver;
+    display: flex;
+  }
+
+  .callout.callout-style-default {
+    border-left: solid #acacac .3rem;
+    border-right: solid 1px silver;
+    border-top: solid 1px silver;
+    border-bottom: solid 1px silver;
+  }
+
+  .callout .callout-body-container {
+    flex-grow: 1;
+  }
+
+  .callout.callout-style-simple .callout-body {
+    font-size: 1rem;
+    font-weight: 400;
+  }
+
+  .callout.callout-style-default .callout-body {
+    font-size: 0.9rem;
+    font-weight: 400;
+  }
+
+  .callout.callout-titled.callout-style-simple .callout-body {
+    margin-top: 0.2em;
+  }
+
+  .callout:not(.callout-titled) .callout-body {
+      display: flex;
+  }
+
+  .callout:not(.no-icon).callout-titled.callout-style-simple .callout-content {
+    padding-left: 1.6em;
+  }
+
+  .callout.callout-titled .callout-header {
+    padding-top: 0.2em;
+    margin-bottom: -0.2em;
+  }
+
+  .callout.callout-titled .callout-title  p {
+    margin-top: 0.5em;
+    margin-bottom: 0.5em;
+  }
+    
+  .callout.callout-titled.callout-style-simple .callout-content  p {
+    margin-top: 0;
+  }
+
+  .callout.callout-titled.callout-style-default .callout-content  p {
+    margin-top: 0.7em;
+  }
+
+  .callout.callout-style-simple div.callout-title {
+    border-bottom: none;
+    font-size: .9rem;
+    font-weight: 600;
+    opacity: 75%;
+  }
+
+  .callout.callout-style-default  div.callout-title {
+    border-bottom: none;
+    font-weight: 600;
+    opacity: 85%;
+    font-size: 0.9rem;
+    padding-left: 0.5em;
+    padding-right: 0.5em;
+  }
+
+  .callout.callout-style-default div.callout-content {
+    padding-left: 0.5em;
+    padding-right: 0.5em;
+  }
+
+  .callout.callout-style-simple .callout-icon::before {
+    height: 1rem;
+    width: 1rem;
+    display: inline-block;
+    content: "";
+    background-repeat: no-repeat;
+    background-size: 1rem 1rem;
+  }
+
+  .callout.callout-style-default .callout-icon::before {
+    height: 0.9rem;
+    width: 0.9rem;
+    display: inline-block;
+    content: "";
+    background-repeat: no-repeat;
+    background-size: 0.9rem 0.9rem;
+  }
+
+  .callout-title {
+    display: flex
+  }
+    
+  .callout-icon::before {
+    margin-top: 1rem;
+    padding-right: .5rem;
+  }
+
+  .callout.no-icon::before {
+    display: none !important;
+  }
+
+  .callout.callout-titled .callout-body > .callout-content > :last-child {
+    margin-bottom: 0.5rem;
+  }
+
+  .callout.callout-titled .callout-icon::before {
+    margin-top: .5rem;
+    padding-right: .5rem;
+  }
+
+  .callout:not(.callout-titled) .callout-icon::before {
+    margin-top: 1rem;
+    padding-right: .5rem;
+  }
+
+  /* Callout Types */
+
+  div.callout-note {
+    border-left-color: #4582ec !important;
+  }
+
+  div.callout-note .callout-icon::before {
+    background-image: url('data:image/png;base64,iVBORw0KGgoAAAANSUhEUgAAACAAAAAgCAYAAABzenr0AAAAAXNSR0IArs4c6QAAAERlWElmTU0AKgAAAAgAAYdpAAQAAAABAAAAGgAAAAAAA6ABAAMAAAABAAEAAKACAAQAAAABAAAAIKADAAQAAAABAAAAIAAAAACshmLzAAAEU0lEQVRYCcVXTWhcVRQ+586kSUMMxkyaElstCto2SIhitS5Ek8xUKV2poatCcVHtUlFQk8mbaaziwpWgglJwVaquitBOfhQXFlqlzSJpFSpIYyXNjBNiTCck7x2/8/LeNDOZxDuEkgOXe++553zfefee+/OYLOXFk3+1LLrRdiO81yNqZ6K9cG0P3MeFaMIQjXssE8Z1JzLO9ls20MBZX7oG8w9GxB0goaPrW5aNMp1yOZIa7Wv6o2ykpLtmAPs/vrG14Z+6d4jpbSKuhdcSyq9wGMPXjonwmESXrriLzFGOdDBLB8Y6MNYBu0dRokSygMA/mrun8MGFN3behm6VVAwg4WR3i6FvYK1T7MHo9BK7ydH+1uurECoouk5MPRyVSBrBHMYwVobG2aOXM07sWrn5qgB60rc6mcwIDJtQrnrEr44kmy+UO9r0u9O5/YbkS9juQckLed3DyW2XV/qWBBB3ptvI8EUY3I9p/67OW+g967TNr3Sotn3IuVlfMLVnsBwH4fsnebJvyGm5GeIUA3jljERmrv49SizPYuq+z7c2H/jlGC+Ghhupn/hcapqmcudB9jwJ/3jvnvu6vu5lVzF1fXyZuZZ7U8nRmVzytvT+H3kilYvH09mLWrQdwFSsFEsxFVs5fK7A0g8gMZjbif4ACpKbjv7gNGaD8bUrlk8x+KRflttr22JEMRUbTUwwDQScyzPgedQHZT0xnx7ujw2jfVfExwYHwOsDTjLdJ2ebmeQIlJ7neo41s/DrsL3kl+W2lWvAga0tR3zueGr6GL78M3ifH0rGXrBC2aAR8uYcIA5gwV8zIE8onoh8u0Fca/ciF7j1uOzEnqcIm59sEXoGc0+z6+H45V1CvAvHcD7THztu669cnp+L0okAeIc6zjbM/24LgGM1gZk7jnRu1aQWoU9sfUOuhrmtaPIO3YY1KLLWZaEO5TKUbMY5zx8W9UJ6elpLwKXbsaZ4EFl7B4bMtDv0iRipKoDQT2sNQI9b1utXFdYisi+wzZ/ri/1m7QfDgEuvgUUEIJPq3DhX/5DWNqIXDOweC2wvIR90Oq3lDpdMIgD2r0dXvGdsEW5H6x6HLRJYU7C69VefO1x8Gde1ZFSJLfWS1jbCnhtOPxmpfv2LXOA2Xk2tvnwKKPFuZ/oRmwBwqRQDcKNeVQkYcOjtWVBuM/JuYw5b6isojIkYxyYAFn5K7ZBF10fea52y8QltAg6jnMqNHFBmGkQ1j+U43HMi2xMar1Nv0zGsf1s8nUsmUtPOOrbFIR8bHFDMB5zL13Gmr/kGlCkUzedTzzmzsaJXhYawnA3UmARpiYj5ooJZiUoxFRtK3X6pgNPv+IZVPcnwbOl6f+aBaO1CNvPW9n9LmCp01nuSaTRF2YxHqZ8DYQT6WsXT+RD6eUztwYLZ8rM+rcPxamv1VQzFUkzFXvkiVrySGQgJNvXHJAxiU3/NwiC03rSf05VBaPtu/Z7/B8Yn/w7eguloAAAAAElFTkSuQmCC');
+  }
+
+  div.callout-note.callout-style-default .callout-title {
+    background-color: #dae6fb
+  }
+
+  div.callout-important {
+    border-left-color: #d9534f !important;
+  }
+
+  div.callout-important .callout-icon::before {
+    background-image: url('data:image/png;base64,iVBORw0KGgoAAAANSUhEUgAAACAAAAAgCAYAAABzenr0AAAAAXNSR0IArs4c6QAAAERlWElmTU0AKgAAAAgAAYdpAAQAAAABAAAAGgAAAAAAA6ABAAMAAAABAAEAAKACAAQAAAABAAAAIKADAAQAAAABAAAAIAAAAACshmLzAAAEKklEQVRYCcVXTWhcVRS+575MJym48A+hSRFr00ySRQhURRfd2HYjk2SSTokuBCkU2o0LoSKKraKIBTcuFCoidGFD08nkBzdREbpQ1EDNIv8qSGMFUboImMSZd4/f9zJv8ibJMC8xJQfO3HPPPef7zrvvvnvviIkpC9nsw0UttFunbUhpFzFtarSd6WJkStVMw5xyVqYTvkwfzuf/5FgtkVoB0729j1rjXwThS7Vio+Mo6DNnvLfahoZ+i/o32lULuJ3NNiz7q6+pyAUkJaFF6JwaM2lUJlV0MlnQn5aTRbEu0SEqHUa0A4AdiGuB1kFXRfVyg5d87+Dg4DL6m2TLAub60ilj7A1Ec4odSAc8X95sHh7+ZRPCFo6Fnp7HfU/fBng/hi10CjCnWnJjsxvDNxWw0NfV6Rv5GgP3I3jGWXumdTD/3cbEOP2ZbOZp69yniG3FQ9z1jD7bnBu9Fc2tKGC2q+uAJOQHBDRiZX1x36o7fWBs7J9ownbtO+n0/qWkvW7UPIfc37WgT6ZGR++EOJyeQDSb9UB+DZ1G6DdLDzyS+b/kBCYGsYgJbSQHuThGKRcw5xdeQf8YdNHsc6ePXrlSYMBuSIAFTGAtQo+VuALo4BX83N190NWZWbynBjhOHsmNfFWLeL6v+ynsA58zDvvAC8j5PkbOcXCMg2PZFk3q8MjI7WAG/Dp9AwP7jdGBOOQkAvlFUB+irtm16I1Zw9YBcpGTGXYmk3kQIC/Cds55l+iMI3jqhjAuaoe+am2Jw5GT3Nbz3CkE12NavmzN5+erJW7046n/CH1RO/RVa8lBLozXk9uqykkGAyRXLWlLv5jyp4RFsG5vGVzpDLnIjTWgnRy2Rr+tDKvRc7Y8AyZq10jj8DqXdnIRNtFZb+t/ZRtXcDiVnzpqx8mPcDWxgARUqx0W1QB9MeUZiNrV4qP+Ehc+BpNgATsTX8ozYKL2NtFYAHc84fG7ndxUPr+AR/iQSns7uSUufAymwDOb2+NjK27lEFocm/EE2WpyIy/Hi66MWuMKJn8RvxIcj87IM5Vh9663ziW36kR0HNenXuxmfaD8JC7tfKbrhFr7LiZCrMjrzTeGx+PmkosrkNzW94ObzwocJ7A1HokLolY+AvkTiD/q1H0cN48c5EL8Crkttsa/AXQVDmutfyku0E7jShx49XqV3MFK8IryDhYVbj7Sj2P2eBxwcXoe8T8idsKKPRcnZw1b+slFTubwUwhktrfnAt7J++jwQtLZcm3sr9LQrjRzz6cfMv9aLvgmnAGvpoaGLxM4mAEaLV7iAzQ3oU0IvD5x9ix3yF2RAAuYAOO2f7PEFWCXZ4C9Pb2UsgDeVnFSpbFK7/IWu7TPTvBqzbGdCHOJQSxiEjt6IyZmxQyEJHv6xyQsYk//moVFsN2zP6fRImjfq7/n/wFDguUQFNEwugAAAABJRU5ErkJggg==');
+  }
+
+  div.callout-important.callout-style-default .callout-title {
+    background-color: #f7dddc
+  }
+
+  div.callout-warning {
+    border-left-color: #f0ad4e !important;
+  }
+
+  div.callout-warning .callout-icon::before {
+    background-image: url('data:image/png;base64,iVBORw0KGgoAAAANSUhEUgAAACAAAAAgCAYAAABzenr0AAAAAXNSR0IArs4c6QAAAERlWElmTU0AKgAAAAgAAYdpAAQAAAABAAAAGgAAAAAAA6ABAAMAAAABAAEAAKACAAQAAAABAAAAIKADAAQAAAABAAAAIAAAAACshmLzAAAETklEQVRYCeVWW2gcVRg+58yaTUnizqbipZeX4uWhBEniBaoUX1Ioze52t7sRq6APio9V9MEaoWlVsFasRq0gltaAPuxms8lu0gcviE/FFOstVbSIxgcv6SU7EZqmdc7v9+9mJtNks51NTUH84ed889/PP+cmxP+d5FIbMJmNbpREu4WUkiTtCicKny0l1pIKmBzovF2S+hIJHX8iEu3hZJ5lNZGqyRrGSIQpq15AzF28jgpeY6yk6GVdrfFqdrD6Iw+QlB8g0YS2g7dyQmXM/IDhBhT0UCiRf59lfqmmDvzRt6kByV/m4JjtzuaujMUM2c5Z2d6JdKrRb3K2q6mA+oYVz8JnDdKPmmNthzkAk/lN63sYPgevrguc72aZX/L9C6x09GYyxBgCX4NlvyGUHOKELlm5rXeR1kchuChJt4SSwyddZRXgvwMGvYo4QSlk3/zkHD8UHxwVJA6zjZZqP8v8kK8OWLnIZtLyCAJagYC4rTGW/9Pqj92N/c+LUaAj27movwbi19tk/whRCIE7Q9vyI6yvRpftAKVTdUjOW40X3h5OXsKCdmFcx0xlLJoSuQngnrJe7Kcjm4OMq9FlC7CMmScQANuNvjfP3PjGXDBaUQmbp296S5L4DrpbrHN1T87ZVEZVCzg1FF0Ft+dKrlLukI+/c9ENo+TvlTDbYFvuKPtQ9+l052rXrgKoWkDAFnvh0wTOmYn8R5f4k/jN/fZiCM1tQx9jQQ4ANhqG4hiL0qIFTGViG9DKB7GYzgubnpofgYRwO+DFjh0Zin2m4b/97EDkXkc+f6xYAPX0KK2I/7fUQuwzuwo/L3AkcjugPNixC8cHf0FyPjWlItmLxWw4Ou9YsQCr5fijMGoD/zpdRy95HRysyXA74MWOnscpO4j2y3HAVisw85hX5+AFBRSHt4ShfLFkIMXTqyKFc46xdzQM6XbAi702a7sy04J0+feReMFKp5q9esYLCqAZYw/k14E/xcLLsFElaornTuJB0svMuJINy8xkIYuL+xPAlWRceH6+HX7THJ0djLUom46zREu7tTkxwmf/FdOZ/sh6Q8qvEAiHpm4PJ4a/doJe0gH1t+aHRgCzOvBvJedEK5OFE5jpm4AGP2a8Dxe3gGJ/pAutug9Gp6he92CsSsWBaEcxGx0FHytmIpuqGkOpldqNYQK8cSoXvd+xLxXADw0kf6UkJNFtdo5MOgaLjiQOQHcn+A6h5NuL2s0qsC2LOM75PcF3yr5STuBSAcGG+meA14K/CI21HcS4LBT6tv0QAh8Dr5l93AhZzG5ZJ4VxAqdZUEl9z7WJ4aN+svMvwHHL21UKTd1mqvChH7/Za5xzXBBKrUcB0TQ+Ulgkfbi/H/YT5EptrGzsEK7tR1B7ln9BBwckYfMiuSqklSznIuoIIOM42MQO+QnduCoFCI0bpkzjCjddHPN/F+2Yu+sd9bKNpVwHhbS3LluK/0zgfwD0xYI5dXuzlQAAAABJRU5ErkJggg==');
+  }
+
+  div.callout-warning.callout-style-default .callout-title {
+    background-color: #fcefdc
+  }
+
+  div.callout-tip {
+    border-left-color: #02b875 !important;
+  }
+
+  div.callout-tip .callout-icon::before {
+    background-image: url('data:image/png;base64,iVBORw0KGgoAAAANSUhEUgAAACAAAAAgCAYAAABzenr0AAAAAXNSR0IArs4c6QAAAERlWElmTU0AKgAAAAgAAYdpAAQAAAABAAAAGgAAAAAAA6ABAAMAAAABAAEAAKACAAQAAAABAAAAIKADAAQAAAABAAAAIAAAAACshmLzAAADr0lEQVRYCe1XTWgTQRj9ZjZV8a9SPIkKgj8I1bMHsUWrqYLVg4Ue6v9BwZOxSYsIerFao7UiUryIqJcqgtpimhbBXoSCVxUFe9CTiogUrUp2Pt+3aUI2u5vdNh4dmMzOzHvvezuz8xNFM0mjnbXaNu1MvFWRXkXEyE6aYOYJpdW4IXuA4r0fo8qqSMDBU0v1HJUgVieAXxzCsdE/YJTdFcVIZQNMyhruOMJKXYFoLfIfIvVIMWdsrd+Rpd86ZmyzzjJmLStqRn0v8lzkb4rVIXvnpScOJuAn2ACC65FkPzEdEy4TPWRLJ2h7z4cArXzzaOdKlbOvKKX25Wl00jSnrwVxAg3o4dRxhO13RBSdNvH0xSARv3adTXbBdTf64IWO2vH0LT+cv4GR1DJt+DUItaQogeBX/chhbTBxEiZ6gftlDNXTrvT7co4ub5A6gp9HIcHvzTa46OS5fBeP87Qm0fQkr4FsYgVQ7Qg+ZayaDg9jhg1GkWj8RG6lkeSacrrHgDaxdoBiZPg+NXV/KifMuB6//JmYH4CntVEHy/keA6x4h4CU5oFy8GzrBS18cLJMXcljAKB6INjWsRcuZBWVaS3GDrqB7rdapVIeA+isQ57Eev9eCqzqOa81CY05VLd6SamW2wA2H3SiTbnbSxmzfp7WtKZkqy4mdyAlGx7ennghYf8voqp9cLSgKdqNfa6RdRsAAkPwRuJZNbpByn+RrJi1RXTwdi8RQF6ymDwGMAtZ6TVE+4uoKh+MYkcLsT0Hk8eAienbiGdjJHZTpmNjlbFJNKDVAp2fJlYju6IreQxQ08UJDNYdoLSl6AadO+fFuCQqVMB1NJwPm69T04Wv5WhfcWyfXQB+wXRs1pt+nCknRa0LVzSA/2B+a9+zQJadb7IyyV24YAxKp2Jqs3emZTuNnKxsah+uabKbMk7CbTgJx/zIgQYErIeTKRQ9yD9wxVof5YolPHqaWo7TD6tJlh7jQnK5z2n3+fGdggIOx2kaa2YI9QWarc5Ce1ipNWMKeSG4DysFF52KBmTNMmn5HqCFkwy34rDg05gDwgH3bBi+sgFhN/e8QvRn8kbamCOhgrZ9GJhFDgfcMHzFb6BAtjKpFhzTjwv1KCVuxHvCbsSiEz4CANnj84cwHdFXAbAOJ4LTSAawGWFn5tDhLMYz6nWeU2wJfIhmIJBefcd/A5FWQWGgrWzyORZ3Q6HuV+Jf0Bj+BTX69fm1zWgK7By1YTXchFDORywnfQ7GpzOo6S+qECrsx2ifVQAAAABJRU5ErkJggg==');
+  }
+
+  div.callout-tip.callout-style-default .callout-title {
+    background-color: #ccf1e3
+  }
+
+  div.callout-caution {
+    border-left-color: #fd7e14 !important;
+  }
+
+  div.callout-caution .callout-icon::before {
+    background-image: url('data:image/png;base64,iVBORw0KGgoAAAANSUhEUgAAACAAAAAgCAYAAABzenr0AAAAAXNSR0IArs4c6QAAAERlWElmTU0AKgAAAAgAAYdpAAQAAAABAAAAGgAAAAAAA6ABAAMAAAABAAEAAKACAAQAAAABAAAAIKADAAQAAAABAAAAIAAAAACshmLzAAACV0lEQVRYCdVWzWoUQRCuqp2ICBLJXgITZL1EfQDBW/bkzUMUD7klD+ATSHBEfAIfQO+iXsWDxJsHL96EHAwhgzlkg8nBg25XWb0zIb0zs9muYYWkoKeru+vn664fBqElyZNuyh167NXJ8Ut8McjbmEraKHkd7uAnAFku+VWdb3reSmRV8PKSLfZ0Gjn3a6Xlcq9YGb6tADjn+lUfTXtVmaZ1KwBIvFI11rRXlWlatwIAAv2asaa9mlB9wwygiDX26qaw1yYPzFXg2N1GgG0FMF8Oj+VIx7E/03lHx8UhvYyNZLN7BwSPgekXXLribw7w5/c8EF+DBK5idvDVYtEEwMeYefjjLAdEyQ3M9nfOkgnPTEkYU+sxMq0BxNR6jExrAI31H1rzvLEfRIdgcv1XEdj6QTQAS2wtstEALLG1yEZ3QhH6oDX7ExBSFEkFINXH98NTrme5IOaaA7kIfiu2L8A3qhH9zRbukdCqdsA98TdElyeMe5BI8Rs2xHRIsoTSSVFfCFCWGPn9XHb4cdobRIWABNf0add9jakDjQJpJ1bTXOJXnnRXHRf+dNL1ZV1MBRCXhMbaHqGI1JkKIL7+i8uffuP6wVQAzO7+qVEbF6NbS0LJureYcWXUUhH66nLR5rYmva+2tjRFtojkM2aD76HEGAD3tPtKM309FJg5j/K682ywcWJ3PASCcycH/22u+Bh7Aa0ehM2Fu4z0SAE81HF9RkB21c5bEn4Dzw+/qNOyXr3DCTQDMBOdhi4nAgiFDGCinIa2owCEChUwD8qzd03PG+qdW/4fDzjUMcE1ZpIAAAAASUVORK5CYII=');
+  }
+
+  div.callout-caution.callout-style-default .callout-title {
+    background-color: #ffe5d0
+  }
+
+  </style>
+  <style type="text/css">
+    .reveal div.sourceCode {
+      margin: 0;
+      overflow: auto;
+    }
+    .reveal div.hanging-indent {
+      margin-left: 1em;
+      text-indent: -1em;
+    }
+    .reveal .slide:not(.center) {
+      height: 100%;
+      overflow-y: auto;
+    }
+    .reveal .slide.scrollable {
+      overflow-y: auto;
+    }
+    .reveal .footnotes {
+      height: 100%;
+      overflow-y: auto;
+    }
+    .reveal .slide .absolute {
+      position: absolute;
+      display: block;
+    }
+    .reveal .footnotes ol {
+      counter-reset: ol;
+      list-style-type: none; 
+      margin-left: 0;
+    }
+    .reveal .footnotes ol li:before {
+      counter-increment: ol;
+      content: counter(ol) ". "; 
+    }
+    .reveal .footnotes ol li > p:first-child {
+      display: inline-block;
+    }
+    .reveal .slide ul,
+    .reveal .slide ol {
+      margin-bottom: 0.5em;
+    }
+    .reveal .slide ul li,
+    .reveal .slide ol li {
+      margin-top: 0.4em;
+      margin-bottom: 0.2em;
+    }
+    .reveal .slide ul[role="tablist"] li {
+      margin-bottom: 0;
+    }
+    .reveal .slide ul li > *:first-child,
+    .reveal .slide ol li > *:first-child {
+      margin-block-start: 0;
+    }
+    .reveal .slide ul li > *:last-child,
+    .reveal .slide ol li > *:last-child {
+      margin-block-end: 0;
+    }
+    .reveal .slide .columns:nth-child(3) {
+      margin-block-start: 0.8em;
+    }
+    .reveal blockquote {
+      box-shadow: none;
+    }
+    .reveal .tippy-content>* {
+      margin-top: 0.2em;
+      margin-bottom: 0.7em;
+    }
+    .reveal .tippy-content>*:last-child {
+      margin-bottom: 0.2em;
+    }
+    .reveal .slide > img.stretch.quarto-figure-center,
+    .reveal .slide > img.r-stretch.quarto-figure-center {
+      display: block;
+      margin-left: auto;
+      margin-right: auto; 
+    }
+    .reveal .slide > img.stretch.quarto-figure-left,
+    .reveal .slide > img.r-stretch.quarto-figure-left  {
+      display: block;
+      margin-left: 0;
+      margin-right: auto; 
+    }
+    .reveal .slide > img.stretch.quarto-figure-right,
+    .reveal .slide > img.r-stretch.quarto-figure-right  {
+      display: block;
+      margin-left: auto;
+      margin-right: 0; 
+    }
+  </style>
+</head>
+<body class="quarto-light">
+  <div class="reveal">
+    <div class="slides">
+
+<section id="title-slide" class="quarto-title-block center">
+  <h1 class="title">Module 8: Data Merging and Reshaping</h1>
+
+<div class="quarto-title-authors">
+<div class="quarto-title-author">
+<div class="quarto-title-author-name">
+<a href="https://publichealth.uga.edu/faculty-member/amy-k-winter/">Amy Winter</a> <a href="https://orcid.org/0000-0003-2737-7003" class="quarto-title-author-orcid"> <img src="data:image/png;base64,iVBORw0KGgoAAAANSUhEUgAAABAAAAAQCAYAAAAf8/9hAAAAGXRFWHRTb2Z0d2FyZQBBZG9iZSBJbWFnZVJlYWR5ccllPAAAA2ZpVFh0WE1MOmNvbS5hZG9iZS54bXAAAAAAADw/eHBhY2tldCBiZWdpbj0i77u/IiBpZD0iVzVNME1wQ2VoaUh6cmVTek5UY3prYzlkIj8+IDx4OnhtcG1ldGEgeG1sbnM6eD0iYWRvYmU6bnM6bWV0YS8iIHg6eG1wdGs9IkFkb2JlIFhNUCBDb3JlIDUuMC1jMDYwIDYxLjEzNDc3NywgMjAxMC8wMi8xMi0xNzozMjowMCAgICAgICAgIj4gPHJkZjpSREYgeG1sbnM6cmRmPSJodHRwOi8vd3d3LnczLm9yZy8xOTk5LzAyLzIyLXJkZi1zeW50YXgtbnMjIj4gPHJkZjpEZXNjcmlwdGlvbiByZGY6YWJvdXQ9IiIgeG1sbnM6eG1wTU09Imh0dHA6Ly9ucy5hZG9iZS5jb20veGFwLzEuMC9tbS8iIHhtbG5zOnN0UmVmPSJodHRwOi8vbnMuYWRvYmUuY29tL3hhcC8xLjAvc1R5cGUvUmVzb3VyY2VSZWYjIiB4bWxuczp4bXA9Imh0dHA6Ly9ucy5hZG9iZS5jb20veGFwLzEuMC8iIHhtcE1NOk9yaWdpbmFsRG9jdW1lbnRJRD0ieG1wLmRpZDo1N0NEMjA4MDI1MjA2ODExOTk0QzkzNTEzRjZEQTg1NyIgeG1wTU06RG9jdW1lbnRJRD0ieG1wLmRpZDozM0NDOEJGNEZGNTcxMUUxODdBOEVCODg2RjdCQ0QwOSIgeG1wTU06SW5zdGFuY2VJRD0ieG1wLmlpZDozM0NDOEJGM0ZGNTcxMUUxODdBOEVCODg2RjdCQ0QwOSIgeG1wOkNyZWF0b3JUb29sPSJBZG9iZSBQaG90b3Nob3AgQ1M1IE1hY2ludG9zaCI+IDx4bXBNTTpEZXJpdmVkRnJvbSBzdFJlZjppbnN0YW5jZUlEPSJ4bXAuaWlkOkZDN0YxMTc0MDcyMDY4MTE5NUZFRDc5MUM2MUUwNEREIiBzdFJlZjpkb2N1bWVudElEPSJ4bXAuZGlkOjU3Q0QyMDgwMjUyMDY4MTE5OTRDOTM1MTNGNkRBODU3Ii8+IDwvcmRmOkRlc2NyaXB0aW9uPiA8L3JkZjpSREY+IDwveDp4bXBtZXRhPiA8P3hwYWNrZXQgZW5kPSJyIj8+84NovQAAAR1JREFUeNpiZEADy85ZJgCpeCB2QJM6AMQLo4yOL0AWZETSqACk1gOxAQN+cAGIA4EGPQBxmJA0nwdpjjQ8xqArmczw5tMHXAaALDgP1QMxAGqzAAPxQACqh4ER6uf5MBlkm0X4EGayMfMw/Pr7Bd2gRBZogMFBrv01hisv5jLsv9nLAPIOMnjy8RDDyYctyAbFM2EJbRQw+aAWw/LzVgx7b+cwCHKqMhjJFCBLOzAR6+lXX84xnHjYyqAo5IUizkRCwIENQQckGSDGY4TVgAPEaraQr2a4/24bSuoExcJCfAEJihXkWDj3ZAKy9EJGaEo8T0QSxkjSwORsCAuDQCD+QILmD1A9kECEZgxDaEZhICIzGcIyEyOl2RkgwAAhkmC+eAm0TAAAAABJRU5ErkJggg=="></a>
+</div>
+</div>
+<div class="quarto-title-author">
+<div class="quarto-title-author-name">
+<a href="https://wzbillings.com/">Zane Billings</a> <a href="https://orcid.org/0000-0002-0184-6134" class="quarto-title-author-orcid"> <img src="data:image/png;base64,iVBORw0KGgoAAAANSUhEUgAAABAAAAAQCAYAAAAf8/9hAAAAGXRFWHRTb2Z0d2FyZQBBZG9iZSBJbWFnZVJlYWR5ccllPAAAA2ZpVFh0WE1MOmNvbS5hZG9iZS54bXAAAAAAADw/eHBhY2tldCBiZWdpbj0i77u/IiBpZD0iVzVNME1wQ2VoaUh6cmVTek5UY3prYzlkIj8+IDx4OnhtcG1ldGEgeG1sbnM6eD0iYWRvYmU6bnM6bWV0YS8iIHg6eG1wdGs9IkFkb2JlIFhNUCBDb3JlIDUuMC1jMDYwIDYxLjEzNDc3NywgMjAxMC8wMi8xMi0xNzozMjowMCAgICAgICAgIj4gPHJkZjpSREYgeG1sbnM6cmRmPSJodHRwOi8vd3d3LnczLm9yZy8xOTk5LzAyLzIyLXJkZi1zeW50YXgtbnMjIj4gPHJkZjpEZXNjcmlwdGlvbiByZGY6YWJvdXQ9IiIgeG1sbnM6eG1wTU09Imh0dHA6Ly9ucy5hZG9iZS5jb20veGFwLzEuMC9tbS8iIHhtbG5zOnN0UmVmPSJodHRwOi8vbnMuYWRvYmUuY29tL3hhcC8xLjAvc1R5cGUvUmVzb3VyY2VSZWYjIiB4bWxuczp4bXA9Imh0dHA6Ly9ucy5hZG9iZS5jb20veGFwLzEuMC8iIHhtcE1NOk9yaWdpbmFsRG9jdW1lbnRJRD0ieG1wLmRpZDo1N0NEMjA4MDI1MjA2ODExOTk0QzkzNTEzRjZEQTg1NyIgeG1wTU06RG9jdW1lbnRJRD0ieG1wLmRpZDozM0NDOEJGNEZGNTcxMUUxODdBOEVCODg2RjdCQ0QwOSIgeG1wTU06SW5zdGFuY2VJRD0ieG1wLmlpZDozM0NDOEJGM0ZGNTcxMUUxODdBOEVCODg2RjdCQ0QwOSIgeG1wOkNyZWF0b3JUb29sPSJBZG9iZSBQaG90b3Nob3AgQ1M1IE1hY2ludG9zaCI+IDx4bXBNTTpEZXJpdmVkRnJvbSBzdFJlZjppbnN0YW5jZUlEPSJ4bXAuaWlkOkZDN0YxMTc0MDcyMDY4MTE5NUZFRDc5MUM2MUUwNEREIiBzdFJlZjpkb2N1bWVudElEPSJ4bXAuZGlkOjU3Q0QyMDgwMjUyMDY4MTE5OTRDOTM1MTNGNkRBODU3Ii8+IDwvcmRmOkRlc2NyaXB0aW9uPiA8L3JkZjpSREY+IDwveDp4bXBtZXRhPiA8P3hwYWNrZXQgZW5kPSJyIj8+84NovQAAAR1JREFUeNpiZEADy85ZJgCpeCB2QJM6AMQLo4yOL0AWZETSqACk1gOxAQN+cAGIA4EGPQBxmJA0nwdpjjQ8xqArmczw5tMHXAaALDgP1QMxAGqzAAPxQACqh4ER6uf5MBlkm0X4EGayMfMw/Pr7Bd2gRBZogMFBrv01hisv5jLsv9nLAPIOMnjy8RDDyYctyAbFM2EJbRQw+aAWw/LzVgx7b+cwCHKqMhjJFCBLOzAR6+lXX84xnHjYyqAo5IUizkRCwIENQQckGSDGY4TVgAPEaraQr2a4/24bSuoExcJCfAEJihXkWDj3ZAKy9EJGaEo8T0QSxkjSwORsCAuDQCD+QILmD1A9kECEZgxDaEZhICIzGcIyEyOl2RkgwAAhkmC+eAm0TAAAAABJRU5ErkJggg=="></a>
+</div>
+</div>
+</div>
+
+</section>
+<section id="learning-objectives" class="slide level2">
+<h2>Learning Objectives</h2>
+<p>After module 8, you should be able to…</p>
+<ul>
+<li>Merge/join data together</li>
+<li>Reshape data from wide to long</li>
+<li>Reshape data from long to wide</li>
+</ul>
+</section>
+<section id="joining-types" class="slide level2">
+<h2>Joining types</h2>
+<p>Pay close attention to the number of rows in your data set before and after a join; an unexpected change in row count is often the first sign that something has gone wrong. The number of rows you should expect depends on the type of merge:</p>
+<ul>
+<li>1:1 merge (one-to-one merge) – Simplest merge (sometimes things go wrong)</li>
+<li>1:m merge (one-to-many merge) – More complex (things often go wrong)
+<ul>
+<li>The “one” means that in one dataset each value of the merging variable (e.g., id) appears exactly once; the “many” means that in the other dataset a value of the merging variable can appear multiple times</li>
+</ul></li>
+<li>m:m merge (many-to-many merge) – Danger zone (can be unpredictable)</li>
+</ul>
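+<p>As a minimal sketch of the row-count check described above, you might compare <code>nrow()</code> before and after a one-to-many merge. The data frames <code>people</code> and <code>visits</code> and their columns are made up for illustration and are not part of the course data.</p>
+<div class="sourceCode"><pre class="sourceCode r"><code class="sourceCode r"># Hypothetical example data: one row per person, several visits per person
+people &lt;- data.frame(id = c(1, 2, 3), group = c("urban", "rural", "urban"))
+visits &lt;- data.frame(id = c(1, 1, 2, 3, 3), visit = c(1, 2, 1, 1, 2))
+
+# 1:m merge: each row of `people` can match several rows of `visits`
+merged &lt;- merge(people, visits, by = "id")
+
+# Compare row counts before and after the join
+nrow(people)
+nrow(visits)
+nrow(merged)
+
+# If every id in `visits` appears exactly once in `people`, the result
+# should have as many rows as the "many" dataset
+nrow(merged) == nrow(visits)</code></pre></div>
+<p>If the last line returned <code>FALSE</code>, that would suggest unmatched or duplicated ids worth investigating before going further.</p>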
+</section>
+<section id="one-to-one-merge" class="slide level2">
+<h2>one-to-one merge</h2>
+<ul>
+<li>This means that each row of data represents a unique unit of analysis (identified by, e.g., an id variable) that also exists in another dataset</li>
+<li>The other dataset will likely have variables that don’t exist in the current dataset (that’s why you are trying to merge it in)</li>
+<li>Each value of the merging variable (e.g., id) is represented a single time in each dataset</li>
+<li>You should try to structure your data so that a 1:1 merge or 1:m merge is possible so that fewer things can go wrong.</li>
+</ul>
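+<p>One way to check whether a 1:1 merge is possible is to confirm that the merging variable has no repeated values in either dataset. A short sketch, assuming two hypothetical data frames <code>a</code> and <code>b</code> that share an <code>id</code> column (these names are illustrative only):</p>
+<div class="sourceCode"><pre class="sourceCode r"><code class="sourceCode r"># Hypothetical example data
+a &lt;- data.frame(id = c(1, 2, 3), age = c(5, 7, 9))
+b &lt;- data.frame(id = c(1, 2, 2), titer = c(0.3, 0.5, 0.6))
+
+# anyDuplicated() returns 0 when every value is unique
+anyDuplicated(a$id) == 0  # TRUE: each id appears once in `a`
+anyDuplicated(b$id) == 0  # FALSE: id 2 appears twice in `b`, so this would be a 1:m merge</code></pre></div>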
+</section>
+<section id="merge-function" class="slide level2">
+<h2><code>merge()</code> function</h2>
+<p>We will use the <code>merge()</code> function to conduct a one-to-one merge.</p>
+<p>Merge Two Data Frames</p>
+<p>Description:</p>
+<pre><code> Merge two data frames by common columns or row names, or do other
+ versions of database _join_ operations.</code></pre>
+<p>Usage:</p>
+<pre><code> merge(x, y, ...)
+ 
+ ## Default S3 method:
+ merge(x, y, ...)
+ 
+ ## S3 method for class 'data.frame'
+ merge(x, y, by = intersect(names(x), names(y)),
+       by.x = by, by.y = by, all = FALSE, all.x = all, all.y = all,
+       sort = TRUE, suffixes = c(".x",".y"), no.dups = TRUE,
+       incomparables = NULL, ...)
+ </code></pre>
+<p>Arguments:</p>
+<pre><code>x, y: data frames, or objects to be coerced to one.</code></pre>
+<p>by, by.x, by.y: specifications of the columns used for merging. See ‘Details’.</p>
+<pre><code> all: logical; 'all = L' is shorthand for 'all.x = L' and 'all.y =
+      L', where 'L' is either 'TRUE' or 'FALSE'.</code></pre>
+<p>all.x: logical; if ‘TRUE’, then extra rows will be added to the output, one for each row in ‘x’ that has no matching row in ‘y’. These rows will have ‘NA’s in those columns that are usually filled with values from ’y’. The default is ‘FALSE’, so that only rows with data from both ‘x’ and ‘y’ are included in the output.</p>
+<p>all.y: logical; analogous to ‘all.x’.</p>
+<pre><code>sort: logical.  Should the result be sorted on the 'by' columns?</code></pre>
+<p>suffixes: a character vector of length 2 specifying the suffixes to be used for making unique the names of columns in the result which are not used for merging (appearing in ‘by’ etc).</p>
+<p>no.dups: logical indicating that ‘suffixes’ are appended in more cases to avoid duplicated column names in the result. This was implicitly false before R version 3.5.0.</p>
+<p>incomparables: values which cannot be matched. See ‘match’. This is intended to be used for merging on one column, so these are incomparable values of that column.</p>
+<pre><code> ...: arguments to be passed to or from methods.</code></pre>
+<p>Details:</p>
+<pre><code> 'merge' is a generic function whose principal method is for data
+ frames: the default method coerces its arguments to data frames
+ and calls the '"data.frame"' method.
+
+ By default the data frames are merged on the columns with names
+ they both have, but separate specifications of the columns can be
+ given by 'by.x' and 'by.y'.  The rows in the two data frames that
+ match on the specified columns are extracted, and joined together.
+ If there is more than one match, all possible matches contribute
+ one row each.  For the precise meaning of 'match', see 'match'.
+
+ Columns to merge on can be specified by name, number or by a
+ logical vector: the name '"row.names"' or the number '0' specifies
+ the row names.  If specified by name it must correspond uniquely
+ to a named column in the input.
+
+ If 'by' or both 'by.x' and 'by.y' are of length 0 (a length zero
+ vector or 'NULL'), the result, 'r', is the _Cartesian product_ of
+ 'x' and 'y', i.e., 'dim(r) = c(nrow(x)*nrow(y), ncol(x) +
+ ncol(y))'.
+
+ If 'all.x' is true, all the non matching cases of 'x' are appended
+ to the result as well, with 'NA' filled in the corresponding
+ columns of 'y'; analogously for 'all.y'.
+
+ If the columns in the data frames not used in merging have any
+ common names, these have 'suffixes' ('".x"' and '".y"' by default)
+ appended to try to make the names of the result unique.  If this
+ is not possible, an error is thrown.
+
+ If a 'by.x' column name matches one of 'y', and if 'no.dups' is
+ true (as by default), the y version gets suffixed as well,
+ avoiding duplicate column names in the result.
+
+ The complexity of the algorithm used is proportional to the length
+ of the answer.
+
+ In SQL database terminology, the default value of 'all = FALSE'
+ gives a _natural join_, a special case of an _inner join_.
+ Specifying 'all.x = TRUE' gives a _left (outer) join_, 'all.y =
+ TRUE' a _right (outer) join_, and both ('all = TRUE') a _(full)
+ outer join_.  DBMSes do not match 'NULL' records, equivalent to
+ 'incomparables = NA' in R.</code></pre>
+<p>Value:</p>
+<pre><code> A data frame.  The rows are by default lexicographically sorted on
+ the common columns, but for 'sort = FALSE' are in an unspecified
+ order.  The columns are the common columns followed by the
+ remaining columns in 'x' and then those in 'y'.  If the matching
+ involved row names, an extra character column called 'Row.names'
+ is added at the left, and in all cases the result has 'automatic'
+ row names.</code></pre>
+<p>Note:</p>
+<pre><code> This is intended to work with data frames with vector-like
+ columns: some aspects work with data frames containing matrices,
+ but not all.
+
+ Currently long vectors are not accepted for inputs, which are thus
+ restricted to less than 2^31 rows. That restriction also applies
+ to the result for 32-bit platforms.</code></pre>
+<p>See Also:</p>
+<pre><code> 'data.frame', 'by', 'cbind'.
+
+ 'dendrogram' for a class which has a 'merge' method.</code></pre>
+<p>Examples:</p>
+<pre><code> authors &lt;- data.frame(
+     ## I(*) : use character columns of names to get sensible sort order
+     surname = I(c("Tukey", "Venables", "Tierney", "Ripley", "McNeil")),
+     nationality = c("US", "Australia", "US", "UK", "Australia"),
+     deceased = c("yes", rep("no", 4)))
+ authorN &lt;- within(authors, { name &lt;- surname; rm(surname) })
+ books &lt;- data.frame(
+     name = I(c("Tukey", "Venables", "Tierney",
+              "Ripley", "Ripley", "McNeil", "R Core")),
+     title = c("Exploratory Data Analysis",
+               "Modern Applied Statistics ...",
+               "LISP-STAT",
+               "Spatial Statistics", "Stochastic Simulation",
+               "Interactive Data Analysis",
+               "An Introduction to R"),
+     other.author = c(NA, "Ripley", NA, NA, NA, NA,
+                      "Venables &amp; Smith"))
+ 
+ (m0 &lt;- merge(authorN, books))
+ (m1 &lt;- merge(authors, books, by.x = "surname", by.y = "name"))
+  m2 &lt;- merge(books, authors, by.x = "name", by.y = "surname")
+ stopifnot(exprs = {
+    identical(m0, m2[, names(m0)])
+    as.character(m1[, 1]) == as.character(m2[, 1])
+    all.equal(m1[, -1], m2[, -1][ names(m1)[-1] ])
+    identical(dim(merge(m1, m2, by = NULL)),
+              c(nrow(m1)*nrow(m2), ncol(m1)+ncol(m2)))
+ })
+ 
+ ## "R core" is missing from authors and appears only here :
+ merge(authors, books, by.x = "surname", by.y = "name", all = TRUE)
+ 
+ 
+ ## example of using 'incomparables'
+ x &lt;- data.frame(k1 = c(NA,NA,3,4,5), k2 = c(1,NA,NA,4,5), data = 1:5)
+ y &lt;- data.frame(k1 = c(NA,2,NA,4,5), k2 = c(NA,NA,3,4,5), data = 1:5)
+ merge(x, y, by = c("k1","k2")) # NA's match
+ merge(x, y, by = "k1") # NA's match, so 6 rows
+ merge(x, y, by = "k2", incomparables = NA) # 2 rows</code></pre>
+</section>
+<section id="lets-import-the-new-data-we-want-to-merge-and-take-a-look" class="slide level2">
+<h2>Let’s import the new data we want to merge and take a look</h2>
+<p>The new data set <code>serodata_new.csv</code> comes from a follow-up serological survey conducted four years later. At this follow-up, individuals were retested for IgG antibody concentrations, and their ages were recorded.</p>
+<div class="cell">
+<div class="sourceCode cell-code" id="cb13"><pre class="sourceCode numberSource r number-lines code-with-copy"><code class="sourceCode r"><span id="cb13-1"><a href="#cb13-1"></a>df_new <span class="ot">&lt;-</span> <span class="fu">read.csv</span>(<span class="st">"data/serodata_new.csv"</span>)</span>
+<span id="cb13-2"><a href="#cb13-2"></a><span class="fu">str</span>(df_new)</span></code><button title="Copy to Clipboard" class="code-copy-button"><i class="bi"></i></button></pre></div>
+<div class="cell-output cell-output-stdout">
+<pre><code>'data.frame':   636 obs. of  3 variables:
+ $ observation_id   : int  5772 8095 9784 9338 6369 6885 6252 8913 7332 6941 ...
+ $ IgG_concentration: num  0.261 2.981 0.282 136.638 0.381 ...
+ $ age              : int  6 8 8 8 5 8 8 NA 8 6 ...</code></pre>
+</div>
+<div class="sourceCode cell-code" id="cb15"><pre class="sourceCode numberSource r number-lines code-with-copy"><code class="sourceCode r"><span id="cb15-1"><a href="#cb15-1"></a><span class="fu">summary</span>(df_new)</span></code><button title="Copy to Clipboard" class="code-copy-button"><i class="bi"></i></button></pre></div>
+<div class="cell-output-display">
+<table>
+<thead>
+<tr class="header">
+<th style="text-align: left;"></th>
+<th style="text-align: left;">observation_id</th>
+<th style="text-align: left;">IgG_concentration</th>
+<th style="text-align: left;">age</th>
+</tr>
+</thead>
+<tbody>
+<tr class="odd">
+<td style="text-align: left;"></td>
+<td style="text-align: left;">Min. :5006</td>
+<td style="text-align: left;">Min. : 0.0051</td>
+<td style="text-align: left;">Min. : 5.00</td>
+</tr>
+<tr class="even">
+<td style="text-align: left;"></td>
+<td style="text-align: left;">1st Qu.:6328</td>
+<td style="text-align: left;">1st Qu.: 0.2751</td>
+<td style="text-align: left;">1st Qu.: 7.00</td>
+</tr>
+<tr class="odd">
+<td style="text-align: left;"></td>
+<td style="text-align: left;">Median :7494</td>
+<td style="text-align: left;">Median : 1.5477</td>
+<td style="text-align: left;">Median :10.00</td>
+</tr>
+<tr class="even">
+<td style="text-align: left;"></td>
+<td style="text-align: left;">Mean :7490</td>
+<td style="text-align: left;">Mean : 82.7684</td>
+<td style="text-align: left;">Mean :10.63</td>
+</tr>
+<tr class="odd">
+<td style="text-align: left;"></td>
+<td style="text-align: left;">3rd Qu.:8736</td>
+<td style="text-align: left;">3rd Qu.:129.6389</td>
+<td style="text-align: left;">3rd Qu.:14.00</td>
+</tr>
+<tr class="even">
+<td style="text-align: left;"></td>
+<td style="text-align: left;">Max. :9982</td>
+<td style="text-align: left;">Max. :950.6590</td>
+<td style="text-align: left;">Max. :19.00</td>
+</tr>
+<tr class="odd">
+<td style="text-align: left;"></td>
+<td style="text-align: left;">NA</td>
+<td style="text-align: left;">NA</td>
+<td style="text-align: left;">NA’s :9</td>
+</tr>
+</tbody>
+</table>
+</div>
+</div>
+</section>
+<section id="merge-the-new-data-with-the-original-data" class="slide level2">
+<h2>Merge the new data with the original data</h2>
+<p>Let’s load the old data as well and look for a variable, or variables, to merge by.</p>
+<div class="cell">
+<div class="sourceCode cell-code" id="cb16"><pre class="sourceCode numberSource r number-lines code-with-copy"><code class="sourceCode r"><span id="cb16-1"><a href="#cb16-1"></a>df <span class="ot">&lt;-</span> <span class="fu">read.csv</span>(<span class="st">"data/serodata.csv"</span>)</span>
+<span id="cb16-2"><a href="#cb16-2"></a><span class="fu">colnames</span>(df)</span></code><button title="Copy to Clipboard" class="code-copy-button"><i class="bi"></i></button></pre></div>
+<div class="cell-output cell-output-stdout">
+<pre><code>[1] "observation_id"    "IgG_concentration" "age"              
+[4] "gender"            "slum"             </code></pre>
+</div>
+</div>
+<p><code>observation_id</code> is the obvious variable to merge by. However, <code>IgG_concentration</code> and <code>age</code> have exactly the same names in both data sets. If we merge now, <code>merge()</code> keeps both copies and tells them apart with the suffixes <code>.x</code> and <code>.y</code>:</p>
+<div class="cell">
+<div class="sourceCode cell-code" id="cb18"><pre class="sourceCode numberSource r number-lines code-with-copy"><code class="sourceCode r"><span id="cb18-1"><a href="#cb18-1"></a><span class="fu">head</span>(<span class="fu">merge</span>(df, df_new, <span class="at">all.x=</span>T, <span class="at">all.y=</span>T, <span class="at">by=</span><span class="fu">c</span>(<span class="st">'observation_id'</span>)))</span></code><button title="Copy to Clipboard" class="code-copy-button"><i class="bi"></i></button></pre></div>
+<div class="cell-output-display">
+<table>
+<colgroup>
+<col style="width: 18%">
+<col style="width: 24%">
+<col style="width: 7%">
+<col style="width: 8%">
+<col style="width: 10%">
+<col style="width: 24%">
+<col style="width: 7%">
+</colgroup>
+<thead>
+<tr class="header">
+<th style="text-align: right;">observation_id</th>
+<th style="text-align: right;">IgG_concentration.x</th>
+<th style="text-align: right;">age.x</th>
+<th style="text-align: left;">gender</th>
+<th style="text-align: left;">slum</th>
+<th style="text-align: right;">IgG_concentration.y</th>
+<th style="text-align: right;">age.y</th>
+</tr>
+</thead>
+<tbody>
+<tr class="odd">
+<td style="text-align: right;">5006</td>
+<td style="text-align: right;">164.2979452</td>
+<td style="text-align: right;">7</td>
+<td style="text-align: left;">Male</td>
+<td style="text-align: left;">Non slum</td>
+<td style="text-align: right;">155.5811325</td>
+<td style="text-align: right;">11</td>
+</tr>
+<tr class="even">
+<td style="text-align: right;">5024</td>
+<td style="text-align: right;">0.3000000</td>
+<td style="text-align: right;">5</td>
+<td style="text-align: left;">Female</td>
+<td style="text-align: left;">Non slum</td>
+<td style="text-align: right;">0.2918605</td>
+<td style="text-align: right;">9</td>
+</tr>
+<tr class="odd">
+<td style="text-align: right;">5026</td>
+<td style="text-align: right;">0.3000000</td>
+<td style="text-align: right;">10</td>
+<td style="text-align: left;">Female</td>
+<td style="text-align: left;">Non slum</td>
+<td style="text-align: right;">0.2542945</td>
+<td style="text-align: right;">14</td>
+</tr>
+<tr class="even">
+<td style="text-align: right;">5030</td>
+<td style="text-align: right;">0.0555556</td>
+<td style="text-align: right;">7</td>
+<td style="text-align: left;">Female</td>
+<td style="text-align: left;">Non slum</td>
+<td style="text-align: right;">0.0533262</td>
+<td style="text-align: right;">11</td>
+</tr>
+<tr class="odd">
+<td style="text-align: right;">5035</td>
+<td style="text-align: right;">26.2112514</td>
+<td style="text-align: right;">11</td>
+<td style="text-align: left;">Female</td>
+<td style="text-align: left;">Non slum</td>
+<td style="text-align: right;">22.0159300</td>
+<td style="text-align: right;">15</td>
+</tr>
+<tr class="even">
+<td style="text-align: right;">5054</td>
+<td style="text-align: right;">0.3000000</td>
+<td style="text-align: right;">3</td>
+<td style="text-align: left;">Male</td>
+<td style="text-align: left;">Non slum</td>
+<td style="text-align: right;">0.2709671</td>
+<td style="text-align: right;">7</td>
+</tr>
+</tbody>
+</table>
+</div>
+</div>
+</section>
+<section id="merge-the-new-data-with-the-original-data-1" class="slide level2">
+<h2>Merge the new data with the original data</h2>
+<p>The first option is to rename the <code>IgG_concentration</code> and <code>age</code> variables before the merge, so that it is clear which is time point 1 and time point 2.</p>
+<div class="cell">
+<div class="sourceCode cell-code" id="cb19"><pre class="sourceCode numberSource r number-lines code-with-copy"><code class="sourceCode r"><span id="cb19-1"><a href="#cb19-1"></a>df<span class="sc">$</span>IgG_concentration_time1 <span class="ot">&lt;-</span> df<span class="sc">$</span>IgG_concentration</span>
+<span id="cb19-2"><a href="#cb19-2"></a>df<span class="sc">$</span>age_time1 <span class="ot">&lt;-</span> df<span class="sc">$</span>age</span>
+<span id="cb19-3"><a href="#cb19-3"></a>df<span class="sc">$</span>IgG_concentration <span class="ot">&lt;-</span> df<span class="sc">$</span>age <span class="ot">&lt;-</span> <span class="cn">NULL</span> <span class="co">#remove the original variables</span></span>
+<span id="cb19-4"><a href="#cb19-4"></a></span>
+<span id="cb19-5"><a href="#cb19-5"></a>df_new<span class="sc">$</span>IgG_concentration_time2 <span class="ot">&lt;-</span> df_new<span class="sc">$</span>IgG_concentration</span>
+<span id="cb19-6"><a href="#cb19-6"></a>df_new<span class="sc">$</span>age_time2 <span class="ot">&lt;-</span> df_new<span class="sc">$</span>age</span>
+<span id="cb19-7"><a href="#cb19-7"></a>df_new<span class="sc">$</span>IgG_concentration <span class="ot">&lt;-</span> df_new<span class="sc">$</span>age <span class="ot">&lt;-</span> <span class="cn">NULL</span> <span class="co">#remove the original variables</span></span></code><button title="Copy to Clipboard" class="code-copy-button"><i class="bi"></i></button></pre></div>
+</div>
+<p>Now, let’s merge.</p>
+<div class="cell">
+<div class="sourceCode cell-code" id="cb20"><pre class="sourceCode numberSource r number-lines code-with-copy"><code class="sourceCode r"><span id="cb20-1"><a href="#cb20-1"></a>df_all_wide <span class="ot">&lt;-</span> <span class="fu">merge</span>(df, df_new, <span class="at">all.x=</span>T, <span class="at">all.y=</span>T, <span class="at">by=</span><span class="fu">c</span>(<span class="st">'observation_id'</span>))</span>
+<span id="cb20-2"><a href="#cb20-2"></a><span class="fu">str</span>(df_all_wide)</span></code><button title="Copy to Clipboard" class="code-copy-button"><i class="bi"></i></button></pre></div>
+<div class="cell-output cell-output-stdout">
+<pre><code>'data.frame':   651 obs. of  7 variables:
+ $ observation_id         : int  5006 5024 5026 5030 5035 5054 5057 5063 5064 5080 ...
+ $ gender                 : chr  "Male" "Female" "Female" "Female" ...
+ $ slum                   : chr  "Non slum" "Non slum" "Non slum" "Non slum" ...
+ $ IgG_concentration_time1: num  164.2979 0.3 0.3 0.0556 26.2113 ...
+ $ age_time1              : int  7 5 10 7 11 3 3 12 14 6 ...
+ $ IgG_concentration_time2: num  155.5811 0.2919 0.2543 0.0533 22.0159 ...
+ $ age_time2              : int  11 9 14 11 15 7 7 16 18 10 ...</code></pre>
+</div>
+</div>
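+<p>As a hedged alternative to renaming by hand, the <code>suffixes</code> argument of <code>merge()</code> can label the clashing columns directly. The sketch below uses made-up toy data; the data frames <code>old</code> and <code>new</code> and their values are illustrative assumptions, not the course data.</p>
+<pre><code> # toy stand-ins for serodata.csv and serodata_new.csv (values are made up)
+ old &lt;- data.frame(observation_id = c(1, 2, 3),
+                   IgG_concentration = c(0.3, 1.2, 5.0),
+                   age = c(2, 4, 6))
+ new &lt;- data.frame(observation_id = c(2, 3, 4),
+                   IgG_concentration = c(1.0, 4.5, 0.2),
+                   age = c(8, 10, 7))
+ 
+ # full outer join; clashing names get "_time1" / "_time2" instead of ".x" / ".y"
+ wide &lt;- merge(old, new, by = "observation_id", all = TRUE,
+               suffixes = c("_time1", "_time2"))
+ colnames(wide)</code></pre>
+<p>This leaves the original data frames untouched, so they do not need to be read in again before trying another approach.</p>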
+</section>
+<section id="merge-the-new-data-with-the-original-data-2" class="slide level2">
+<h2>Merge the new data with the original data</h2>
+<p>The second option is to add a time variable to the two data sets and then merge by <code>observation_id</code>, <code>time</code>, <code>age</code>, and <code>IgG_concentration</code>. Note that I need to read in the data again because I removed the <code>IgG_concentration</code> and <code>age</code> variables.</p>
+<div class="cell">
+<div class="sourceCode cell-code" id="cb22"><pre class="sourceCode numberSource r number-lines code-with-copy"><code class="sourceCode r"><span id="cb22-1"><a href="#cb22-1"></a>df <span class="ot">&lt;-</span> <span class="fu">read.csv</span>(<span class="st">"data/serodata.csv"</span>)</span>
+<span id="cb22-2"><a href="#cb22-2"></a>df_new <span class="ot">&lt;-</span> <span class="fu">read.csv</span>(<span class="st">"data/serodata_new.csv"</span>)</span></code><button title="Copy to Clipboard" class="code-copy-button"><i class="bi"></i></button></pre></div>
+</div>
+<div class="cell">
+<div class="sourceCode cell-code" id="cb23"><pre class="sourceCode numberSource r number-lines code-with-copy"><code class="sourceCode r"><span id="cb23-1"><a href="#cb23-1"></a>df<span class="sc">$</span>time <span class="ot">&lt;-</span> <span class="dv">1</span> <span class="co">#you can put in one number and it will repeat it</span></span>
+<span id="cb23-2"><a href="#cb23-2"></a>df_new<span class="sc">$</span>time <span class="ot">&lt;-</span> <span class="dv">2</span></span>
+<span id="cb23-3"><a href="#cb23-3"></a><span class="fu">head</span>(df)</span></code><button title="Copy to Clipboard" class="code-copy-button"><i class="bi"></i></button></pre></div>
+<div class="cell-output-display">
+<table>
+<thead>
+<tr class="header">
+<th style="text-align: right;">observation_id</th>
+<th style="text-align: right;">IgG_concentration</th>
+<th style="text-align: right;">age</th>
+<th style="text-align: left;">gender</th>
+<th style="text-align: left;">slum</th>
+<th style="text-align: right;">time</th>
+</tr>
+</thead>
+<tbody>
+<tr class="odd">
+<td style="text-align: right;">5772</td>
+<td style="text-align: right;">0.3176895</td>
+<td style="text-align: right;">2</td>
+<td style="text-align: left;">Female</td>
+<td style="text-align: left;">Non slum</td>
+<td style="text-align: right;">1</td>
+</tr>
+<tr class="even">
+<td style="text-align: right;">8095</td>
+<td style="text-align: right;">3.4368231</td>
+<td style="text-align: right;">4</td>
+<td style="text-align: left;">Female</td>
+<td style="text-align: left;">Non slum</td>
+<td style="text-align: right;">1</td>
+</tr>
+<tr class="odd">
+<td style="text-align: right;">9784</td>
+<td style="text-align: right;">0.3000000</td>
+<td style="text-align: right;">4</td>
+<td style="text-align: left;">Male</td>
+<td style="text-align: left;">Non slum</td>
+<td style="text-align: right;">1</td>
+</tr>
+<tr class="even">
+<td style="text-align: right;">9338</td>
+<td style="text-align: right;">143.2363014</td>
+<td style="text-align: right;">4</td>
+<td style="text-align: left;">Male</td>
+<td style="text-align: left;">Non slum</td>
+<td style="text-align: right;">1</td>
+</tr>
+<tr class="odd">
+<td style="text-align: right;">6369</td>
+<td style="text-align: right;">0.4476534</td>
+<td style="text-align: right;">1</td>
+<td style="text-align: left;">Male</td>
+<td style="text-align: left;">Non slum</td>
+<td style="text-align: right;">1</td>
+</tr>
+<tr class="even">
+<td style="text-align: right;">6885</td>
+<td style="text-align: right;">0.0252708</td>
+<td style="text-align: right;">4</td>
+<td style="text-align: left;">Male</td>
+<td style="text-align: left;">Non slum</td>
+<td style="text-align: right;">1</td>
+</tr>
+</tbody>
+</table>
+</div>
+<div class="sourceCode cell-code" id="cb24"><pre class="sourceCode numberSource r number-lines code-with-copy"><code class="sourceCode r"><span id="cb24-1"><a href="#cb24-1"></a><span class="fu">head</span>(df_new)</span></code><button title="Copy to Clipboard" class="code-copy-button"><i class="bi"></i></button></pre></div>
+<div class="cell-output-display">
+<table>
+<thead>
+<tr class="header">
+<th style="text-align: right;">observation_id</th>
+<th style="text-align: right;">IgG_concentration</th>
+<th style="text-align: right;">age</th>
+<th style="text-align: right;">time</th>
+</tr>
+</thead>
+<tbody>
+<tr class="odd">
+<td style="text-align: right;">5772</td>
+<td style="text-align: right;">0.2612388</td>
+<td style="text-align: right;">6</td>
+<td style="text-align: right;">2</td>
+</tr>
+<tr class="even">
+<td style="text-align: right;">8095</td>
+<td style="text-align: right;">2.9809049</td>
+<td style="text-align: right;">8</td>
+<td style="text-align: right;">2</td>
+</tr>
+<tr class="odd">
+<td style="text-align: right;">9784</td>
+<td style="text-align: right;">0.2819489</td>
+<td style="text-align: right;">8</td>
+<td style="text-align: right;">2</td>
+</tr>
+<tr class="even">
+<td style="text-align: right;">9338</td>
+<td style="text-align: right;">136.6382260</td>
+<td style="text-align: right;">8</td>
+<td style="text-align: right;">2</td>
+</tr>
+<tr class="odd">
+<td style="text-align: right;">6369</td>
+<td style="text-align: right;">0.3810119</td>
+<td style="text-align: right;">5</td>
+<td style="text-align: right;">2</td>
+</tr>
+<tr class="even">
+<td style="text-align: right;">6885</td>
+<td style="text-align: right;">0.0245951</td>
+<td style="text-align: right;">8</td>
+<td style="text-align: right;">2</td>
+</tr>
+</tbody>
+</table>
+</div>
+</div>
+<p>Now, let’s merge. Note that “By default the data frames are merged on the columns with names they both have”; therefore, if I don’t specify the <code>by</code> argument, it will merge on all matching variables.</p>
+<div class="cell">
+<div class="sourceCode cell-code" id="cb25"><pre class="sourceCode numberSource r number-lines code-with-copy"><code class="sourceCode r"><span id="cb25-1"><a href="#cb25-1"></a>df_all_long <span class="ot">&lt;-</span> <span class="fu">merge</span>(df, df_new, <span class="at">all.x=</span>T, <span class="at">all.y=</span>T) </span>
+<span id="cb25-2"><a href="#cb25-2"></a><span class="fu">str</span>(df_all_long)</span></code><button title="Copy to Clipboard" class="code-copy-button"><i class="bi"></i></button></pre></div>
+<div class="cell-output cell-output-stdout">
+<pre><code>'data.frame':   1287 obs. of  6 variables:
+ $ observation_id   : int  5006 5006 5024 5024 5026 5026 5030 5030 5035 5035 ...
+ $ IgG_concentration: num  155.581 164.298 0.292 0.3 0.254 ...
+ $ age              : int  11 7 9 5 14 10 11 7 15 11 ...
+ $ time             : num  2 1 2 1 2 1 2 1 2 1 ...
+ $ gender           : chr  NA "Male" NA "Female" ...
+ $ slum             : chr  NA "Non slum" NA "Non slum" ...</code></pre>
+</div>
+</div>
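+<p>Note that there are 1287 rows, which is the sum of the number of rows of <code>df</code> (651 rows) and <code>df_new</code> (636 rows). Because <code>time</code> is one of the merge columns and differs between the two data sets, no rows match, so the full outer join keeps every row from both.</p>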
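+<p>A minimal sketch of the same behaviour on made-up toy data. The data frames <code>a</code> and <code>b</code> are hypothetical and are not part of the course data.</p>
+<pre><code> # two toy data sets that share every column name but differ in 'time'
+ a &lt;- data.frame(observation_id = 1:3, age = c(2, 4, 6), time = 1)
+ b &lt;- data.frame(observation_id = 2:3, age = c(6, 8), time = 2)
+ 
+ # no 'by' argument, so merge() uses observation_id, age, and time
+ ab &lt;- merge(a, b, all = TRUE)
+ 
+ # no row of 'a' matches a row of 'b', so the rows simply stack
+ nrow(ab) == nrow(a) + nrow(b)</code></pre>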
+</section>
+<section id="what-is-widelong-data" class="slide level2">
+<h2>What is wide/long data?</h2>
+<p>Above, we actually created a wide and long version of the data.</p>
+<p>Wide: has many columns</p>
+<ul>
+<li>multiple columns per individual, values spread across multiple columns</li>
+<li>easier for humans to read</li>
+</ul>
+<p>Long: has many rows</p>
+<ul>
+<li>column names become data</li>
+<li>multiple rows per observation, a single column contains the values</li>
+<li>easier for R to make plots &amp; do analysis</li>
+</ul>
+</section>
+<section id="reshape-function" class="slide level2">
+<h2><code>reshape()</code> function</h2>
+<p>The <code>reshape()</code> function allows you to toggle between wide and long data.</p>
+<p>Reshape Grouped Data</p>
+<p>Description:</p>
+<pre><code> This function reshapes a data frame between 'wide' format (with
+ repeated measurements in separate columns of the same row) and
+ 'long' format (with the repeated measurements in separate rows).</code></pre>
+<p>Usage:</p>
+<pre><code> reshape(data, varying = NULL, v.names = NULL, timevar = "time",
+         idvar = "id", ids = 1:NROW(data),
+         times = seq_along(varying[[1]]),
+         drop = NULL, direction, new.row.names = NULL,
+         sep = ".",
+         split = if (sep == "") {
+             list(regexp = "[A-Za-z][0-9]", include = TRUE)
+         } else {
+             list(regexp = sep, include = FALSE, fixed = TRUE)}
+         )
+ 
+ ### Typical usage for converting from long to wide format:
+ 
+ # reshape(data, direction = "wide",
+ #         idvar = "___", timevar = "___", # mandatory
+ #         v.names = c(___),    # time-varying variables
+ #         varying = list(___)) # auto-generated if missing
+ 
+ ### Typical usage for converting from wide to long format:
+ 
+ ### If names of wide-format variables are in a 'nice' format
+ 
+ # reshape(data, direction = "long",
+ #         varying = c(___), # vector 
+ #         sep)              # to help guess 'v.names' and 'times'
+ 
+ ### To specify long-format variable names explicitly
+ 
+ # reshape(data, direction = "long",
+ #         varying = ___,  # list / matrix / vector (use with care)
+ #         v.names = ___,  # vector of variable names in long format
+ #         timevar, times, # name / values of constructed time variable
+ #         idvar, ids)     # name / values of constructed id variable
+ </code></pre>
+<p>Arguments:</p>
+<pre><code>    data: a data frame
+
+ varying: names of sets of variables in the wide format that
+          correspond to single variables in long format
+          ('time-varying').  This is canonically a list of vectors of
+          variable names, but it can optionally be a matrix of names,
+          or a single vector of names.  In each case, when 'direction
+          = "long"', the names can be replaced by indices which are
+          interpreted as referring to 'names(data)'.  See 'Details'
+          for more details and options.
+
+ v.names: names of variables in the long format that correspond to
+          multiple variables in the wide format.  See 'Details'.
+
+ timevar: the variable in long format that differentiates multiple
+          records from the same group or individual.  If more than one
+          record matches, the first will be taken (with a warning).
+
+   idvar: Names of one or more variables in long format that identify
+          multiple records from the same group/individual.  These
+          variables may also be present in wide format.
+
+     ids: the values to use for a newly created 'idvar' variable in
+          long format.
+
+   times: the values to use for a newly created 'timevar' variable in
+          long format.  See 'Details'.
+
+    drop: a vector of names of variables to drop before reshaping.
+
+direction: character string, partially matched to either '"wide"' to
+          reshape to wide format, or '"long"' to reshape to long
+          format.
+
+new.row.names: character or 'NULL': a non-null value will be used for
+          the row names of the result.
+
+     sep: A character vector of length 1, indicating a separating
+          character in the variable names in the wide format.  This is
+          used for guessing 'v.names' and 'times' arguments based on
+          the names in 'varying'.  If 'sep == ""', the split is just
+          before the first numeral that follows an alphabetic
+          character.  This is also used to create variable names when
+          reshaping to wide format.
+
+   split: A list with three components, 'regexp', 'include', and
+          (optionally) 'fixed'.  This allows an extended interface to
+          variable name splitting.  See 'Details'.</code></pre>
+<p>Details:</p>
+<pre><code> Although 'reshape()' can be used in a variety of contexts, the
+ motivating application is data from longitudinal studies, and the
+ arguments of this function are named and described in those terms.
+ A longitudinal study is characterized by repeated measurements of
+ the same variable(s), e.g., height and weight, on each unit being
+ studied (e.g., individual persons) at different time points (which
+ are assumed to be the same for all units). These variables are
+ called time-varying variables. The study may include other
+ variables that are measured only once for each unit and do not
+ vary with time (e.g., gender and race); these are called
+ time-constant variables.
+
+ A 'wide' format representation of a longitudinal dataset will have
+ one record (row) for each unit, typically with some time-constant
+ variables that occupy single columns, and some time-varying
+ variables that occupy multiple columns (one column for each time
+ point).  A 'long' format representation of the same dataset will
+ have multiple records (rows) for each individual, with the
+ time-constant variables being constant across these records and
+ the time-varying variables varying across the records.  The 'long'
+ format dataset will have two additional variables: a 'time'
+ variable identifying which time point each record comes from, and
+ an 'id' variable showing which records refer to the same unit.
+
+ The type of conversion (long to wide or wide to long) is
+ determined by the 'direction' argument, which is mandatory unless
+ the 'data' argument is the result of a previous call to 'reshape'.
+ In that case, the operation can be reversed simply using
+ 'reshape(data)' (the other arguments are stored as attributes on
+ the data frame).
+
+ Conversion from long to wide format with 'direction = "wide"' is
+ the simpler operation, and is mainly useful in the context of
+ multivariate analysis where data is often expected as a
+ wide-format matrix. In this case, the time variable 'timevar' and
+ id variable 'idvar' must be specified. All other variables are
+ assumed to be time-varying, unless the time-varying variables are
+ explicitly specified via the 'v.names' argument.  A warning is
+ issued if time-constant variables are not actually constant.
+
+ Each time-varying variable is expanded into multiple variables in
+ the wide format.  The names of these expanded variables are
+ generated automatically, unless they are specified as the
+ 'varying' argument in the form of a list (or matrix) with one
+ component (or row) for each time-varying variable. If 'varying' is
+ a vector of names, it is implicitly converted into a matrix, with
+ one row for each time-varying variable. Use this option with care
+ if there are multiple time-varying variables, as the ordering (by
+ column, the default in the 'matrix' constructor) may be
+ unintuitive, whereas the explicit list or matrix form is
+ unambiguous.
+
+ Conversion from wide to long with 'direction = "long"' is the more
+ common operation as most (univariate) statistical modeling
+ functions expect data in the long format. In the simpler case
+ where there is only one time-varying variable, the corresponding
+ columns in the wide format input can be specified as the 'varying'
+ argument, which can be either a vector of column names or the
+ corresponding column indices. The name of the corresponding
+ variable in the long format output combining these columns can be
+ optionally specified as the 'v.names' argument, and the name of
+ the time variables as the 'timevar' argument. The values to use as
+ the time values corresponding to the different columns in the wide
+ format can be specified as the 'times' argument.  If 'v.names' is
+ unspecified, the function will attempt to guess 'v.names' and
+ 'times' from 'varying' (an explicitly specified 'times' argument
+ is unused in that case).  The default expects variable names like
+ 'x.1', 'x.2', where 'sep = "."' specifies to split at the dot and
+ drop it from the name.  To have alphabetic followed by numeric
+ times use 'sep = ""'.
+
+ Multiple time-varying variables can be specified in two ways,
+ either with 'varying' as an atomic vector as above, or as a list
+ (or a matrix). The first form is useful (and mandatory) if the
+ automatic variable name splitting as described above is used; this
+ requires the names of all time-varying variables to be suitably
+ formatted in the same manner, and 'v.names' to be unspecified. If
+ 'varying' is a list (with one component for each time-varying
+ variable) or a matrix (one row for each time-varying variable),
+ variable name splitting is not attempted, and 'v.names' and
+ 'times' will generally need to be specified, although they will
+ default to, respectively, the first variable name in each set, and
+ sequential times.
+
+ Also, guessing is not attempted if 'v.names' is given explicitly,
+ even if 'varying' is an atomic vector. In that case, the number of
+ time-varying variables is taken to be the length of 'v.names', and
+ 'varying' is implicitly converted into a matrix, with one row for
+ each time-varying variable. As in the case of long to wide
+ conversion, the matrix is filled up by column, so careful
+ attention needs to be paid to the order of variable names (or
+ indices) in 'varying', which is taken to be like 'x.1', 'y.1',
+ 'x.2', 'y.2' (i.e., variables corresponding to the same time point
+ need to be grouped together).
+
+ The 'split' argument should not usually be necessary.  The
+ 'split$regexp' component is passed to either 'strsplit' or
+ 'regexpr', where the latter is used if 'split$include' is 'TRUE',
+ in which case the splitting occurs after the first character of
+ the matched string.  In the 'strsplit' case, the separator is not
+ included in the result, and it is possible to specify fixed-string
+ matching using 'split$fixed'.</code></pre>
+<p>Value:</p>
+<pre><code> The reshaped data frame with added attributes to simplify
+ reshaping back to the original form.</code></pre>
+<p>See Also:</p>
+<pre><code> 'stack', 'aperm'; 'relist' for reshaping the result of 'unlist'.
+ 'xtabs' and 'as.data.frame.table' for creating contingency tables
+ and converting them back to data frames.</code></pre>
+<p>Examples:</p>
+<pre><code> summary(Indometh) # data in long format
+ 
+ ## long to wide (direction = "wide") requires idvar and timevar at a minimum
+ reshape(Indometh, direction = "wide", idvar = "Subject", timevar = "time")
+ 
+ ## can also explicitly specify name of combined variable
+ wide &lt;- reshape(Indometh, direction = "wide", idvar = "Subject",
+                 timevar = "time", v.names = "conc", sep= "_")
+ wide
+ 
+ ## reverse transformation
+ reshape(wide, direction = "long")
+ reshape(wide, idvar = "Subject", varying = list(2:12),
+         v.names = "conc", direction = "long")
+ 
+ ## times need not be numeric
+ df &lt;- data.frame(id = rep(1:4, rep(2,4)),
+                  visit = I(rep(c("Before","After"), 4)),
+                  x = rnorm(4), y = runif(4))
+ df
+ reshape(df, timevar = "visit", idvar = "id", direction = "wide")
+ ## warns that y is really varying
+ reshape(df, timevar = "visit", idvar = "id", direction = "wide", v.names = "x")
+ 
+ 
+ ##  unbalanced 'long' data leads to NA fill in 'wide' form
+ df2 &lt;- df[1:7, ]
+ df2
+ reshape(df2, timevar = "visit", idvar = "id", direction = "wide")
+ 
+ ## Alternative regular expressions for guessing names
+ df3 &lt;- data.frame(id = 1:4, age = c(40,50,60,50), dose1 = c(1,2,1,2),
+                   dose2 = c(2,1,2,1), dose4 = c(3,3,3,3))
+ reshape(df3, direction = "long", varying = 3:5, sep = "")
+ 
+ 
+ ## an example that isn't longitudinal data
+ state.x77 &lt;- as.data.frame(state.x77)
+ long &lt;- reshape(state.x77, idvar = "state", ids = row.names(state.x77),
+                 times = names(state.x77), timevar = "Characteristic",
+                 varying = list(names(state.x77)), direction = "long")
+ 
+ reshape(long, direction = "wide")
+ 
+ reshape(long, direction = "wide", new.row.names = unique(long$state))
+ 
+ ## multiple id variables
+ df3 &lt;- data.frame(school = rep(1:3, each = 4), class = rep(9:10, 6),
+                   time = rep(c(1,1,2,2), 3), score = rnorm(12))
+ wide &lt;- reshape(df3, idvar = c("school", "class"), direction = "wide")
+ wide
+ ## transform back
+ reshape(wide)</code></pre>
+</section>
+<section id="long-to-wide-data" class="slide level2">
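+<h2>Long to wide data</h2>
+<p>To go from long to wide, call <code>reshape()</code> with <code>direction = "wide"</code> and give the <code>idvar</code> and <code>timevar</code> arguments, which are mandatory in this direction.</p>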
+</section>
+<section id="wide-to-long-data" class="slide level2">
+<h2>wide to long data</h2>
+<p>xxzane - help</p>
+</section>
+<section id="lets-get-real" class="slide level2">
+<h2>Let’s get real</h2>
+<p>Use the <code>pivot_wider()</code> and <code>pivot_longer()</code> functions from the tidyr package!</p>
+</section>
+<section id="summary" class="slide level2">
+<h2>Summary</h2>
+<ul>
+<li>…</li>
+</ul>
+</section>
+<section id="acknowledgements" class="slide level2">
+<h2>Acknowledgements</h2>
+<p>These are the materials we looked through, modified, or extracted to complete this module’s lecture.</p>
+<ul>
+<li><a href="https://jhudatascience.org/intro_to_r/">“Introduction to R for Public Health Researchers” Johns Hopkins University</a></li>
+</ul>
+
+<div class="footer footer-default">
+
+</div>
+</section>
+    </div>
+  </div>
+
+  <script>window.backupDefine = window.define; window.define = undefined;</script>
+  <script src="../site_libs/revealjs/dist/reveal.js"></script>
+  <!-- reveal.js plugins -->
+  <script src="../site_libs/revealjs/plugin/quarto-line-highlight/line-highlight.js"></script>
+  <script src="../site_libs/revealjs/plugin/pdf-export/pdfexport.js"></script>
+  <script src="../site_libs/revealjs/plugin/reveal-menu/menu.js"></script>
+  <script src="../site_libs/revealjs/plugin/reveal-menu/quarto-menu.js"></script>
+  <script src="../site_libs/revealjs/plugin/quarto-support/support.js"></script>
+  
+
+  <script src="../site_libs/revealjs/plugin/notes/notes.js"></script>
+  <script src="../site_libs/revealjs/plugin/search/search.js"></script>
+  <script src="../site_libs/revealjs/plugin/zoom/zoom.js"></script>
+  <script src="../site_libs/revealjs/plugin/math/math.js"></script>
+  <script>window.define = window.backupDefine; window.backupDefine = undefined;</script>
+
+  <script>
+
+      // Full list of configuration options available at:
+      // https://revealjs.com/config/
+      Reveal.initialize({
+'controlsAuto': true,
+'previewLinksAuto': false,
+'smaller': true,
+'pdfSeparateFragments': false,
+'autoAnimateEasing': "ease",
+'autoAnimateDuration': 1,
+'autoAnimateUnmatched': true,
+'menu': {"side":"left","useTextContentForMissingTitles":true,"markers":false,"loadIcons":false,"custom":[{"title":"Tools","icon":"<i class=\"fas fa-gear\"></i>","content":"<ul class=\"slide-menu-items\">\n<li class=\"slide-tool-item active\" data-item=\"0\"><a href=\"#\" onclick=\"RevealMenuToolHandlers.fullscreen(event)\"><kbd>f</kbd> Fullscreen</a></li>\n<li class=\"slide-tool-item\" data-item=\"1\"><a href=\"#\" onclick=\"RevealMenuToolHandlers.speakerMode(event)\"><kbd>s</kbd> Speaker View</a></li>\n<li class=\"slide-tool-item\" data-item=\"2\"><a href=\"#\" onclick=\"RevealMenuToolHandlers.overview(event)\"><kbd>o</kbd> Slide Overview</a></li>\n<li class=\"slide-tool-item\" data-item=\"3\"><a href=\"#\" onclick=\"RevealMenuToolHandlers.togglePdfExport(event)\"><kbd>e</kbd> PDF Export Mode</a></li>\n<li class=\"slide-tool-item\" data-item=\"4\"><a href=\"#\" onclick=\"RevealMenuToolHandlers.keyboardHelp(event)\"><kbd>?</kbd> Keyboard Help</a></li>\n</ul>"}],"openButton":true},
+ 
+        // Display controls in the bottom right corner
+        controls: false,
+
+        // Help the user learn the controls by providing hints, for example by
+        // bouncing the down arrow when they first encounter a vertical slide
+        controlsTutorial: false,
+
+        // Determines where controls appear, "edges" or "bottom-right"
+        controlsLayout: 'edges',
+
+        // Visibility rule for backwards navigation arrows; "faded", "hidden"
+        // or "visible"
+        controlsBackArrows: 'faded',
+
+        // Display a presentation progress bar
+        progress: true,
+
+        // Display the page number of the current slide
+        slideNumber: false,
+
+        // 'all', 'print', or 'speaker'
+        showSlideNumber: 'all',
+
+        // Add the current slide number to the URL hash so that reloading the
+        // page/copying the URL will return you to the same slide
+        hash: true,
+
+        // Start with 1 for the hash rather than 0
+        hashOneBasedIndex: false,
+
+        // Flags if we should monitor the hash and change slides accordingly
+        respondToHashChanges: true,
+
+        // Push each slide change to the browser history
+        history: true,
+
+        // Enable keyboard shortcuts for navigation
+        keyboard: true,
+
+        // Enable the slide overview mode
+        overview: true,
+
+        // Disables the default reveal.js slide layout (scaling and centering)
+        // so that you can use custom CSS layout
+        disableLayout: false,
+
+        // Vertical centering of slides
+        center: false,
+
+        // Enables touch navigation on devices with touch input
+        touch: true,
+
+        // Loop the presentation
+        loop: false,
+
+        // Change the presentation direction to be RTL
+        rtl: false,
+
+        // see https://revealjs.com/vertical-slides/#navigation-mode
+        navigationMode: 'linear',
+
+        // Randomizes the order of slides each time the presentation loads
+        shuffle: false,
+
+        // Turns fragments on and off globally
+        fragments: true,
+
+        // Flags whether to include the current fragment in the URL,
+        // so that reloading brings you to the same fragment position
+        fragmentInURL: false,
+
+        // Flags if the presentation is running in an embedded mode,
+        // i.e. contained within a limited portion of the screen
+        embedded: false,
+
+        // Flags if we should show a help overlay when the questionmark
+        // key is pressed
+        help: true,
+
+        // Flags if it should be possible to pause the presentation (blackout)
+        pause: true,
+
+        // Flags if speaker notes should be visible to all viewers
+        showNotes: false,
+
+        // Global override for autoplaying embedded media (null/true/false)
+        autoPlayMedia: null,
+
+        // Global override for preloading lazy-loaded iframes (null/true/false)
+        preloadIframes: null,
+
+        // Number of milliseconds between automatically proceeding to the
+        // next slide, disabled when set to 0, this value can be overwritten
+        // by using a data-autoslide attribute on your slides
+        autoSlide: 0,
+
+        // Stop auto-sliding after user input
+        autoSlideStoppable: true,
+
+        // Use this method for navigation when auto-sliding
+        autoSlideMethod: null,
+
+        // Specify the average time in seconds that you think you will spend
+        // presenting each slide. This is used to show a pacing timer in the
+        // speaker view
+        defaultTiming: null,
+
+        // Enable slide navigation via mouse wheel
+        mouseWheel: false,
+
+        // The display mode that will be used to show slides
+        display: 'block',
+
+        // Hide cursor if inactive
+        hideInactiveCursor: true,
+
+        // Time before the cursor is hidden (in ms)
+        hideCursorTime: 5000,
+
+        // Opens links in an iframe preview overlay
+        previewLinks: false,
+
+        // Transition style (none/fade/slide/convex/concave/zoom)
+        transition: 'none',
+
+        // Transition speed (default/fast/slow)
+        transitionSpeed: 'default',
+
+        // Transition style for full page slide backgrounds
+        // (none/fade/slide/convex/concave/zoom)
+        backgroundTransition: 'none',
+
+        // Number of slides away from the current that are visible
+        viewDistance: 3,
+
+        // Number of slides away from the current that are visible on mobile
+        // devices. It is advisable to set this to a lower number than
+        // viewDistance in order to save resources.
+        mobileViewDistance: 2,
+
+        // The "normal" size of the presentation, aspect ratio will be preserved
+        // when the presentation is scaled to fit different resolutions. Can be
+        // specified using percentage units.
+        width: 1050,
+
+        height: 700,
+
+        // Factor of the display size that should remain empty around the content
+        margin: 0.1,
+
+        math: {
+          mathjax: 'https://cdnjs.cloudflare.com/ajax/libs/mathjax/2.7.0/MathJax.js',
+          config: 'TeX-AMS_HTML-full',
+          tex2jax: {
+            inlineMath: [['\\(','\\)']],
+            displayMath: [['\\[','\\]']],
+            balanceBraces: true,
+            processEscapes: false,
+            processRefs: true,
+            processEnvironments: true,
+            preview: 'TeX',
+            skipTags: ['script','noscript','style','textarea','pre','code'],
+            ignoreClass: 'tex2jax_ignore',
+            processClass: 'tex2jax_process'
+          },
+        },
+
+        // reveal.js plugins
+        plugins: [QuartoLineHighlight, PdfExport, RevealMenu, QuartoSupport,
+
+          RevealMath,
+          RevealNotes,
+          RevealSearch,
+          RevealZoom
+        ]
+      });
+    </script>
+    
+    <script>
+      // htmlwidgets need to know to resize themselves when slides are shown/hidden.
+      // Fire the "slideenter" event (handled by htmlwidgets.js) when the current
+      // slide changes (different for each slide format).
+      (function () {
+        // dispatch for htmlwidgets
+        function fireSlideEnter() {
+          const event = window.document.createEvent("Event");
+          event.initEvent("slideenter", true, true);
+          window.document.dispatchEvent(event);
+        }
+
+        function fireSlideChanged(previousSlide, currentSlide) {
+          fireSlideEnter();
+
+          // dispatch for shiny
+          if (window.jQuery) {
+            if (previousSlide) {
+              window.jQuery(previousSlide).trigger("hidden");
+            }
+            if (currentSlide) {
+              window.jQuery(currentSlide).trigger("shown");
+            }
+          }
+        }
+
+        // hookup for slidy
+        if (window.w3c_slidy) {
+          window.w3c_slidy.add_observer(function (slide_num) {
+            // slide_num starts at position 1
+            fireSlideChanged(null, w3c_slidy.slides[slide_num - 1]);
+          });
+        }
+
+      })();
+    </script>
+
+    <script id="quarto-html-after-body" type="application/javascript">
+    window.document.addEventListener("DOMContentLoaded", function (event) {
+      const toggleBodyColorMode = (bsSheetEl) => {
+        const mode = bsSheetEl.getAttribute("data-mode");
+        const bodyEl = window.document.querySelector("body");
+        if (mode === "dark") {
+          bodyEl.classList.add("quarto-dark");
+          bodyEl.classList.remove("quarto-light");
+        } else {
+          bodyEl.classList.add("quarto-light");
+          bodyEl.classList.remove("quarto-dark");
+        }
+      }
+      const toggleBodyColorPrimary = () => {
+        const bsSheetEl = window.document.querySelector("link#quarto-bootstrap");
+        if (bsSheetEl) {
+          toggleBodyColorMode(bsSheetEl);
+        }
+      }
+      toggleBodyColorPrimary();  
+      const tabsets =  window.document.querySelectorAll(".panel-tabset-tabby")
+      tabsets.forEach(function(tabset) {
+        const tabby = new Tabby('#' + tabset.id);
+      });
+      const isCodeAnnotation = (el) => {
+        for (const clz of el.classList) {
+          if (clz.startsWith('code-annotation-')) {                     
+            return true;
+          }
+        }
+        return false;
+      }
+      const clipboard = new window.ClipboardJS('.code-copy-button', {
+        text: function(trigger) {
+          const codeEl = trigger.previousElementSibling.cloneNode(true);
+          for (const childEl of codeEl.children) {
+            if (isCodeAnnotation(childEl)) {
+              childEl.remove();
+            }
+          }
+          return codeEl.innerText;
+        }
+      });
+      clipboard.on('success', function(e) {
+        // button target
+        const button = e.trigger;
+        // don't keep focus
+        button.blur();
+        // flash "checked"
+        button.classList.add('code-copy-button-checked');
+        var currentTitle = button.getAttribute("title");
+        button.setAttribute("title", "Copied!");
+        let tooltip;
+        if (window.bootstrap) {
+          button.setAttribute("data-bs-toggle", "tooltip");
+          button.setAttribute("data-bs-placement", "left");
+          button.setAttribute("data-bs-title", "Copied!");
+          tooltip = new bootstrap.Tooltip(button, 
+            { trigger: "manual", 
+              customClass: "code-copy-button-tooltip",
+              offset: [0, -8]});
+          tooltip.show();    
+        }
+        setTimeout(function() {
+          if (tooltip) {
+            tooltip.hide();
+            button.removeAttribute("data-bs-title");
+            button.removeAttribute("data-bs-toggle");
+            button.removeAttribute("data-bs-placement");
+          }
+          button.setAttribute("title", currentTitle);
+          button.classList.remove('code-copy-button-checked');
+        }, 1000);
+        // clear code selection
+        e.clearSelection();
+      });
+      function tippyHover(el, contentFn) {
+        const config = {
+          allowHTML: true,
+          content: contentFn,
+          maxWidth: 500,
+          delay: 100,
+          arrow: false,
+          appendTo: function(el) {
+              return el.closest('section.slide') || el.parentElement;
+          },
+          interactive: true,
+          interactiveBorder: 10,
+          theme: 'light-border',
+          placement: 'bottom-start'
+        };
+          config['offset'] = [0,0];
+          config['maxWidth'] = 700;
+        window.tippy(el, config); 
+      }
+      const noterefs = window.document.querySelectorAll('a[role="doc-noteref"]');
+      for (var i=0; i<noterefs.length; i++) {
+        const ref = noterefs[i];
+        tippyHover(ref, function() {
+          // use id or data attribute instead here
+          let href = ref.getAttribute('data-footnote-href') || ref.getAttribute('href');
+          try { href = new URL(href).hash; } catch {}
+          const id = href.replace(/^#\/?/, "");
+          const note = window.document.getElementById(id);
+          return note.innerHTML;
+        });
+      }
+      const findCites = (el) => {
+        const parentEl = el.parentElement;
+        if (parentEl) {
+          const cites = parentEl.dataset.cites;
+          if (cites) {
+            return {
+              el,
+              cites: cites.split(' ')
+            };
+          } else {
+            return findCites(el.parentElement)
+          }
+        } else {
+          return undefined;
+        }
+      };
+      var bibliorefs = window.document.querySelectorAll('a[role="doc-biblioref"]');
+      for (var i=0; i<bibliorefs.length; i++) {
+        const ref = bibliorefs[i];
+        const citeInfo = findCites(ref);
+        if (citeInfo) {
+          tippyHover(citeInfo.el, function() {
+            var popup = window.document.createElement('div');
+            citeInfo.cites.forEach(function(cite) {
+              var citeDiv = window.document.createElement('div');
+              citeDiv.classList.add('hanging-indent');
+              citeDiv.classList.add('csl-entry');
+              var biblioDiv = window.document.getElementById('ref-' + cite);
+              if (biblioDiv) {
+                citeDiv.innerHTML = biblioDiv.innerHTML;
+              }
+              popup.appendChild(citeDiv);
+            });
+            return popup.innerHTML;
+          });
+        }
+      }
+    });
+    </script>
+    
+
+</body></html>
\ No newline at end of file
diff --git a/docs/modules/Module09-DataAnalysis.html b/docs/modules/Module09-DataAnalysis.html
index aca507b..e7487aa 100644
--- a/docs/modules/Module09-DataAnalysis.html
+++ b/docs/modules/Module09-DataAnalysis.html
@@ -8,11 +8,11 @@
 <link href="../site_libs/quarto-html/light-border.css" rel="stylesheet">
 <link href="../site_libs/quarto-html/quarto-html.min.css" rel="stylesheet" data-mode="light">
 <link href="../site_libs/quarto-html/quarto-syntax-highlighting.css" rel="stylesheet" id="quarto-text-highlighting-styles"><meta charset="utf-8">
-  <meta name="generator" content="quarto-1.5.54">
+  <meta name="generator" content="quarto-1.3.353">
 
   <meta name="author" content="Amy Winter">
   <meta name="author" content="Zane Billings">
-  <title>SISMID Module NUMBER Materials (2025) – Module 9: Data Analysis</title>
+  <title>SISMID Module NUMBER Materials (2025) - Module 9: Data Analysis</title>
   <meta name="apple-mobile-web-app-capable" content="yes">
   <meta name="apple-mobile-web-app-status-bar-style" content="black-translucent">
   <meta name="viewport" content="width=device-width, initial-scale=1.0, maximum-scale=1.0, user-scalable=no, minimal-ui">
@@ -32,7 +32,7 @@
     }
     /* CSS for syntax highlighting */
     pre > code.sourceCode { white-space: pre; position: relative; }
-    pre > code.sourceCode > span { line-height: 1.25; }
+    pre > code.sourceCode > span { display: inline-block; line-height: 1.25; }
     pre > code.sourceCode > span:empty { height: 1.2em; }
     .sourceCode { overflow: visible; }
     code.sourceCode > span { color: inherit; text-decoration: inherit; }
@@ -43,7 +43,7 @@
     }
     @media print {
     pre > code.sourceCode { white-space: pre-wrap; }
-    pre > code.sourceCode > span { display: inline-block; text-indent: -5em; padding-left: 5em; }
+    pre > code.sourceCode > span { text-indent: -5em; padding-left: 5em; }
     }
     pre.numberSource code
       { counter-reset: source-line 0; }
@@ -71,7 +71,7 @@
     code span.at { color: #657422; } /* Attribute */
     code span.bn { color: #ad0000; } /* BaseN */
     code span.bu { } /* BuiltIn */
-    code span.cf { color: #003b4f; font-weight: bold; } /* ControlFlow */
+    code span.cf { color: #003b4f; } /* ControlFlow */
     code span.ch { color: #20794d; } /* Char */
     code span.cn { color: #8f5902; } /* Constant */
     code span.co { color: #5e5e5e; } /* Comment */
@@ -85,7 +85,7 @@
     code span.fu { color: #4758ab; } /* Function */
     code span.im { color: #00769e; } /* Import */
     code span.in { color: #5e5e5e; } /* Information */
-    code span.kw { color: #003b4f; font-weight: bold; } /* Keyword */
+    code span.kw { color: #003b4f; } /* Keyword */
     code span.op { color: #5e5e5e; } /* Operator */
     code span.ot { color: #003b4f; } /* Other */
     code span.pp { color: #ad0000; } /* Preprocessor */
@@ -222,8 +222,7 @@
   }
 
   .callout.callout-titled .callout-body > .callout-content > :last-child {
-    padding-bottom: 0.5rem;
-    margin-bottom: 0;
+    margin-bottom: 0.5rem;
   }
 
   .callout.callout-titled .callout-icon::before {
@@ -408,45 +407,22 @@ <h1 class="title">Module 9: Data Analysis</h1>
 </div>
 </div>
 
-</section><section id="TOC">
-<nav role="doc-toc"> 
-<h2 id="toc-title">Page Items</h2>
-<ul>
-<li><a href="#/learning-objectives" id="/toc-learning-objectives">Learning Objectives</a></li>
-<li><a href="#/import-data-for-this-module" id="/toc-import-data-for-this-module">Import data for this module</a></li>
-<li><a href="#/prep-data" id="/toc-prep-data">Prep data</a></li>
-<li><a href="#/variable-contingency-tables" id="/toc-variable-contingency-tables">2 variable contingency tables</a></li>
-<li><a href="#/chi-square-test" id="/toc-chi-square-test">Chi-Square test</a></li>
-<li><a href="#/chi-square-test-1" id="/toc-chi-square-test-1">Chi-Square test</a></li>
-<li><a href="#/correlation" id="/toc-correlation">Correlation</a></li>
-<li><a href="#/t-test" id="/toc-t-test">T-test</a></li>
-<li><a href="#/t-test-1" id="/toc-t-test-1">T-test</a></li>
-<li><a href="#/running-two-sample-t-test" id="/toc-running-two-sample-t-test">Running two-sample t-test</a></li>
-<li><a href="#/running-two-sample-t-test-1" id="/toc-running-two-sample-t-test-1">Running two-sample t-test</a></li>
-<li><a href="#/linear-regression-fit-in-r" id="/toc-linear-regression-fit-in-r">Linear regression fit in R</a></li>
-<li><a href="#/linear-regression-fit-in-r-1" id="/toc-linear-regression-fit-in-r-1">Linear regression fit in R</a></li>
-<li><a href="#/summary.glm" id="/toc-summary.glm"><code>summary.glm()</code></a></li>
-<li><a href="#/linear-regression-fit-in-r-2" id="/toc-linear-regression-fit-in-r-2">Linear regression fit in R</a></li>
-<li><a href="#/summary" id="/toc-summary">Summary</a></li>
-<li><a href="#/acknowledgements" id="/toc-acknowledgements">Acknowledgements</a></li>
-</ul>
-</nav>
 </section>
 <section id="learning-objectives" class="slide level2">
 <h2>Learning Objectives</h2>
 <p>After module 9, you should be able to…</p>
 <ul>
-<li><pre><code>  Descriptively assess association between two variables</code></pre></li>
-<li><pre><code>  Compute basic statistics </code></pre></li>
-<li><pre><code>  Fit a generalized linear model</code></pre></li>
+<li>Descriptively assess association between two variables</li>
+<li>Compute basic statistics</li>
+<li>Fit a generalized linear model</li>
 </ul>
 </section>
 <section id="import-data-for-this-module" class="slide level2">
 <h2>Import data for this module</h2>
 <p>Let’s read in our data (again) and take a quick look.</p>
 <div class="cell">
-<div class="sourceCode cell-code" id="cb4"><pre class="sourceCode numberSource r number-lines code-with-copy"><code class="sourceCode r"><span id="cb4-1"><a></a>df <span class="ot">&lt;-</span> <span class="fu">read.csv</span>(<span class="at">file =</span> <span class="st">"data/serodata.csv"</span>) <span class="co">#relative path</span></span>
-<span id="cb4-2"><a></a><span class="fu">head</span>(<span class="at">x=</span>df, <span class="at">n=</span><span class="dv">3</span>)</span></code><button title="Copy to Clipboard" class="code-copy-button"><i class="bi"></i></button></pre></div>
+<div class="sourceCode cell-code" id="cb1"><pre class="sourceCode numberSource r number-lines code-with-copy"><code class="sourceCode r"><span id="cb1-1"><a href="#cb1-1"></a>df <span class="ot">&lt;-</span> <span class="fu">read.csv</span>(<span class="at">file =</span> <span class="st">"data/serodata.csv"</span>) <span class="co">#relative path</span></span>
+<span id="cb1-2"><a href="#cb1-2"></a><span class="fu">head</span>(<span class="at">x=</span>df, <span class="at">n=</span><span class="dv">3</span>)</span></code><button title="Copy to Clipboard" class="code-copy-button"><i class="bi"></i></button></pre></div>
 <div class="cell-output cell-output-stdout">
 <pre><code>  observation_id IgG_concentration age gender     slum
 1           5772         0.3176895   2 Female Non slum
@@ -459,47 +435,186 @@ <h2>Import data for this module</h2>
 <h2>Prep data</h2>
 <p>Create <code>age_group</code>, a three-level factor variable.</p>
 <div class="cell">
-<div class="sourceCode cell-code" id="cb6"><pre class="sourceCode numberSource r number-lines code-with-copy"><code class="sourceCode r"><span id="cb6-1"><a></a>df<span class="sc">$</span>age_group <span class="ot">&lt;-</span> <span class="fu">ifelse</span>(df<span class="sc">$</span>age <span class="sc">&lt;=</span> <span class="dv">5</span>, <span class="st">"young"</span>, </span>
-<span id="cb6-2"><a></a>                       <span class="fu">ifelse</span>(df<span class="sc">$</span>age<span class="sc">&lt;=</span><span class="dv">10</span> <span class="sc">&amp;</span> df<span class="sc">$</span>age<span class="sc">&gt;</span><span class="dv">5</span>, <span class="st">"middle"</span>, </span>
-<span id="cb6-3"><a></a>                              <span class="fu">ifelse</span>(df<span class="sc">$</span>age<span class="sc">&gt;</span><span class="dv">10</span>, <span class="st">"old"</span>, <span class="cn">NA</span>)))</span>
-<span id="cb6-4"><a></a>df<span class="sc">$</span>age_group <span class="ot">&lt;-</span> <span class="fu">factor</span>(df<span class="sc">$</span>age_group, <span class="at">levels=</span><span class="fu">c</span>(<span class="st">"young"</span>, <span class="st">"middle"</span>, <span class="st">"old"</span>))</span></code><button title="Copy to Clipboard" class="code-copy-button"><i class="bi"></i></button></pre></div>
+<div class="sourceCode cell-code" id="cb3"><pre class="sourceCode numberSource r number-lines code-with-copy"><code class="sourceCode r"><span id="cb3-1"><a href="#cb3-1"></a>df<span class="sc">$</span>age_group <span class="ot">&lt;-</span> <span class="fu">ifelse</span>(df<span class="sc">$</span>age <span class="sc">&lt;=</span> <span class="dv">5</span>, <span class="st">"young"</span>, </span>
+<span id="cb3-2"><a href="#cb3-2"></a>                       <span class="fu">ifelse</span>(df<span class="sc">$</span>age<span class="sc">&lt;=</span><span class="dv">10</span> <span class="sc">&amp;</span> df<span class="sc">$</span>age<span class="sc">&gt;</span><span class="dv">5</span>, <span class="st">"middle"</span>, <span class="st">"old"</span>))</span>
+<span id="cb3-3"><a href="#cb3-3"></a>df<span class="sc">$</span>age_group <span class="ot">&lt;-</span> <span class="fu">factor</span>(df<span class="sc">$</span>age_group, <span class="at">levels=</span><span class="fu">c</span>(<span class="st">"young"</span>, <span class="st">"middle"</span>, <span class="st">"old"</span>))</span></code><button title="Copy to Clipboard" class="code-copy-button"><i class="bi"></i></button></pre></div>
 </div>
-<p>Create <code>seropos</code> binary variable representing seropositivity if antibody concentrations are &gt;10 mIUmL.</p>
+<p>Create <code>seropos</code>, a binary variable that marks a sample as seropositive when its antibody concentration is at least 10 IU/mL.</p>
 <div class="cell">
-<div class="sourceCode cell-code" id="cb7"><pre class="sourceCode numberSource r number-lines code-with-copy"><code class="sourceCode r"><span id="cb7-1"><a></a>df<span class="sc">$</span>seropos <span class="ot">&lt;-</span> <span class="fu">ifelse</span>(df<span class="sc">$</span>IgG_concentration<span class="sc">&lt;</span><span class="dv">10</span>, <span class="dv">0</span>, </span>
-<span id="cb7-2"><a></a>                                        <span class="fu">ifelse</span>(df<span class="sc">$</span>IgG_concentration<span class="sc">&gt;=</span><span class="dv">10</span>, <span class="dv">1</span>, <span class="cn">NA</span>))</span></code><button title="Copy to Clipboard" class="code-copy-button"><i class="bi"></i></button></pre></div>
+<div class="sourceCode cell-code" id="cb4"><pre class="sourceCode numberSource r number-lines code-with-copy"><code class="sourceCode r"><span id="cb4-1"><a href="#cb4-1"></a>df<span class="sc">$</span>seropos <span class="ot">&lt;-</span> <span class="fu">ifelse</span>(df<span class="sc">$</span>IgG_concentration<span class="sc">&lt;</span><span class="dv">10</span>, <span class="dv">0</span>, <span class="dv">1</span>)</span></code><button title="Copy to Clipboard" class="code-copy-button"><i class="bi"></i></button></pre></div>
 </div>
 </section>
 <section id="variable-contingency-tables" class="slide level2">
 <h2>2 variable contingency tables</h2>
 <p>We previously used <code>table()</code> to look at one variable; now we can generate frequency tables for two or more variables. To get cell percentages, the <code>prop.table()</code> function is useful.</p>
 <div class="cell">
-<div class="sourceCode cell-code" id="cb8"><pre class="sourceCode numberSource r number-lines code-with-copy"><code class="sourceCode r"><span id="cb8-1"><a></a>freq <span class="ot">&lt;-</span> <span class="fu">table</span>(df<span class="sc">$</span>age_group, df<span class="sc">$</span>seropo)</span>
-<span id="cb8-2"><a></a>prop <span class="ot">&lt;-</span> <span class="fu">prop.table</span>(freq)</span>
-<span id="cb8-3"><a></a>freq</span></code><button title="Copy to Clipboard" class="code-copy-button"><i class="bi"></i></button></pre></div>
-<div class="cell-output cell-output-stdout">
-<pre><code>        
-           0   1
-  young  254  57
-  middle  70 105
-  old     30 116</code></pre>
+<div class="sourceCode cell-code" id="cb5"><pre class="sourceCode numberSource r number-lines code-with-copy"><code class="sourceCode r"><span id="cb5-1"><a href="#cb5-1"></a>?prop.table</span></code><button title="Copy to Clipboard" class="code-copy-button"><i class="bi"></i></button></pre></div>
 </div>
-<div class="sourceCode cell-code" id="cb10"><pre class="sourceCode numberSource r number-lines code-with-copy"><code class="sourceCode r"><span id="cb10-1"><a></a>prop</span></code><button title="Copy to Clipboard" class="code-copy-button"><i class="bi"></i></button></pre></div>
+<div class="cell">
+<div class="sourceCode cell-code" id="cb6"><pre class="sourceCode numberSource r number-lines code-with-copy"><code class="sourceCode r"><span id="cb6-1"><a href="#cb6-1"></a><span class="fu">library</span>(printr)</span></code><button title="Copy to Clipboard" class="code-copy-button"><i class="bi"></i></button></pre></div>
+<div class="cell-output cell-output-stderr">
+<pre><code>Registered S3 method overwritten by 'printr':
+  method                from     
+  knit_print.data.frame rmarkdown</code></pre>
+</div>
+<div class="sourceCode cell-code" id="cb8"><pre class="sourceCode numberSource r number-lines code-with-copy"><code class="sourceCode r"><span id="cb8-1"><a href="#cb8-1"></a>?prop.table</span></code><button title="Copy to Clipboard" class="code-copy-button"><i class="bi"></i></button></pre></div>
 <div class="cell-output cell-output-stdout">
-<pre><code>        
-                  0          1
-  young  0.40189873 0.09018987
-  middle 0.11075949 0.16613924
-  old    0.04746835 0.18354430</code></pre>
+<pre><code>Express Table Entries as Fraction of Marginal Table
+
+Description:
+
+     Returns conditional proportions given 'margins', i.e. entries of
+     'x', divided by the appropriate marginal sums.
+
+Usage:
+
+     proportions(x, margin = NULL)
+     prop.table(x, margin = NULL)
+     
+Arguments:
+
+       x: table
+
+  margin: a vector giving the margins to split by.  E.g., for a matrix
+          '1' indicates rows, '2' indicates columns, 'c(1, 2)'
+          indicates rows and columns.  When 'x' has named dimnames, it
+          can be a character vector selecting dimension names.
+
+Value:
+
+     Table like 'x' expressed relative to 'margin'
+
+Note:
+
+     'prop.table' is an earlier name, retained for back-compatibility.
+
+Author(s):
+
+     Peter Dalgaard
+
+See Also:
+
+     'marginSums'. 'apply', 'sweep' are a more general mechanism for
+     sweeping out marginal statistics.
+
+Examples:
+
+     m &lt;- matrix(1:4, 2)
+     m
+     proportions(m, 1)
+     
+     DF &lt;- as.data.frame(UCBAdmissions)
+     tbl &lt;- xtabs(Freq ~ Gender + Admit, DF)
+     
+     proportions(tbl, "Gender")</code></pre>
+</div>
+</div>
+</section>
+<section id="variable-contingency-tables-1" class="slide level2">
+<h2>2 variable contingency tables</h2>
+<p>Let’s practice</p>
+<div class="cell">
+<div class="sourceCode cell-code" id="cb10"><pre class="sourceCode numberSource r number-lines code-with-copy"><code class="sourceCode r"><span id="cb10-1"><a href="#cb10-1"></a>freq <span class="ot">&lt;-</span> <span class="fu">table</span>(df<span class="sc">$</span>age_group, df<span class="sc">$</span>seropos)</span>
+<span id="cb10-2"><a href="#cb10-2"></a>freq</span></code><button title="Copy to Clipboard" class="code-copy-button"><i class="bi"></i></button></pre></div>
+<div class="cell-output-display">
+<table>
+<thead>
+<tr class="header">
+<th style="text-align: left;">/</th>
+<th style="text-align: right;">0</th>
+<th style="text-align: right;">1</th>
+</tr>
+</thead>
+<tbody>
+<tr class="odd">
+<td style="text-align: left;">young</td>
+<td style="text-align: right;">254</td>
+<td style="text-align: right;">57</td>
+</tr>
+<tr class="even">
+<td style="text-align: left;">middle</td>
+<td style="text-align: right;">70</td>
+<td style="text-align: right;">105</td>
+</tr>
+<tr class="odd">
+<td style="text-align: left;">old</td>
+<td style="text-align: right;">30</td>
+<td style="text-align: right;">116</td>
+</tr>
+</tbody>
+</table>
+</div>
+</div>
+<p>Now, let’s convert the counts to proportions.</p>
+<div class="cell">
+<div class="sourceCode cell-code" id="cb11"><pre class="sourceCode numberSource r number-lines code-with-copy"><code class="sourceCode r"><span id="cb11-1"><a href="#cb11-1"></a>prop.cell.percentages <span class="ot">&lt;-</span> <span class="fu">prop.table</span>(freq)</span>
+<span id="cb11-2"><a href="#cb11-2"></a>prop.cell.percentages</span></code><button title="Copy to Clipboard" class="code-copy-button"><i class="bi"></i></button></pre></div>
+<div class="cell-output-display">
+<table>
+<thead>
+<tr class="header">
+<th style="text-align: left;">/</th>
+<th style="text-align: right;">0</th>
+<th style="text-align: right;">1</th>
+</tr>
+</thead>
+<tbody>
+<tr class="odd">
+<td style="text-align: left;">young</td>
+<td style="text-align: right;">0.4018987</td>
+<td style="text-align: right;">0.0901899</td>
+</tr>
+<tr class="even">
+<td style="text-align: left;">middle</td>
+<td style="text-align: right;">0.1107595</td>
+<td style="text-align: right;">0.1661392</td>
+</tr>
+<tr class="odd">
+<td style="text-align: left;">old</td>
+<td style="text-align: right;">0.0474684</td>
+<td style="text-align: right;">0.1835443</td>
+</tr>
+</tbody>
+</table>
+</div>
+<div class="sourceCode cell-code" id="cb12"><pre class="sourceCode numberSource r number-lines code-with-copy"><code class="sourceCode r"><span id="cb12-1"><a href="#cb12-1"></a>prop.column.percentages <span class="ot">&lt;-</span> <span class="fu">prop.table</span>(freq, <span class="at">margin=</span><span class="dv">2</span>)</span>
+<span id="cb12-2"><a href="#cb12-2"></a>prop.column.percentages</span></code><button title="Copy to Clipboard" class="code-copy-button"><i class="bi"></i></button></pre></div>
+<div class="cell-output-display">
+<table>
+<thead>
+<tr class="header">
+<th style="text-align: left;">/</th>
+<th style="text-align: right;">0</th>
+<th style="text-align: right;">1</th>
+</tr>
+</thead>
+<tbody>
+<tr class="odd">
+<td style="text-align: left;">young</td>
+<td style="text-align: right;">0.7175141</td>
+<td style="text-align: right;">0.2050360</td>
+</tr>
+<tr class="even">
+<td style="text-align: left;">middle</td>
+<td style="text-align: right;">0.1977401</td>
+<td style="text-align: right;">0.3776978</td>
+</tr>
+<tr class="odd">
+<td style="text-align: left;">old</td>
+<td style="text-align: right;">0.0847458</td>
+<td style="text-align: right;">0.4172662</td>
+</tr>
+</tbody>
+</table>
 </div>
 </div>
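+<p>As a hedged sketch (not part of the original slides), row proportions can be obtained in the same way with <code>margin=1</code>. The matrix below re-enters the counts from the frequency table by hand so the snippet runs on its own; the object names <code>counts</code> and <code>row.prop</code> are illustrative assumptions.</p>
+<pre><code>counts &lt;- matrix(c(254, 70, 30, 57, 105, 116), nrow = 3,
+                 dimnames = list(c("young", "middle", "old"), c("0", "1")))
+row.prop &lt;- prop.table(counts, margin = 1) # divide each cell by its row total
+row.prop
+rowSums(row.prop) # every row should sum to 1</code></pre>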
 </section>
 <section id="chi-square-test" class="slide level2">
 <h2>Chi-Square test</h2>
 <p>The <code>chisq.test()</code> function from the <code>stats</code> package tests the independence of factor variables.</p>
-<pre><code>Registered S3 method overwritten by 'printr':
-  method                from     
-  knit_print.data.frame rmarkdown</code></pre>
+<div class="cell">
+<div class="sourceCode cell-code" id="cb13"><pre class="sourceCode numberSource r number-lines code-with-copy"><code class="sourceCode r"><span id="cb13-1"><a href="#cb13-1"></a>?chisq.test</span></code><button title="Copy to Clipboard" class="code-copy-button"><i class="bi"></i></button></pre></div>
+</div>
 <p>Pearson’s Chi-squared Test for Count Data</p>
 <p>Description:</p>
 <pre><code> 'chisq.test' performs chi-squared contingency table tests and
@@ -633,7 +748,7 @@ <h2>Chi-Square test</h2>
 <section id="chi-square-test-1" class="slide level2">
 <h2>Chi-Square test</h2>
 <div class="cell">
-<div class="sourceCode cell-code" id="cb24"><pre class="sourceCode numberSource r number-lines code-with-copy"><code class="sourceCode r"><span id="cb24-1"><a></a><span class="fu">chisq.test</span>(freq)</span></code><button title="Copy to Clipboard" class="code-copy-button"><i class="bi"></i></button></pre></div>
+<div class="sourceCode cell-code" id="cb25"><pre class="sourceCode numberSource r number-lines code-with-copy"><code class="sourceCode r"><span id="cb25-1"><a href="#cb25-1"></a><span class="fu">chisq.test</span>(freq)</span></code><button title="Copy to Clipboard" class="code-copy-button"><i class="bi"></i></button></pre></div>
 <div class="cell-output cell-output-stdout">
 <pre><code>
     Pearson's Chi-squared test
@@ -642,35 +757,58 @@ <h2>Chi-Square test</h2>
 X-squared = 175.85, df = 2, p-value &lt; 2.2e-16</code></pre>
 </div>
 </div>
-<p>We reject the null hypothesis that the proportion of seropositive individuals who are young (&lt;5yo) is the same for individuals who are middle (5-10yo) or old (&gt;10yo).</p>
+<p>We reject the null hypothesis that the proportion of seropositive individuals is the same in the young, middle, and old age groups.</p>
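+<p>A minimal sketch, assuming the counts shown in the frequency table earlier in this module: the object returned by <code>chisq.test()</code> is a list, so the test statistic, the p-value, and the expected counts under independence can be extracted by name. The names <code>counts</code> and <code>res</code> are illustrative and not from the original slides.</p>
+<pre><code>counts &lt;- matrix(c(254, 70, 30, 57, 105, 116), nrow = 3,
+                 dimnames = list(c("young", "middle", "old"), c("0", "1")))
+res &lt;- chisq.test(counts)
+res$statistic # the X-squared value
+res$parameter # degrees of freedom
+res$p.value
+res$expected  # counts expected if age group and seropositivity were independent</code></pre>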
 </section>
 <section id="correlation" class="slide level2">
 <h2>Correlation</h2>
 <p>First, we compute correlation by providing two vectors.</p>
 <p>Like other functions, if there are <code>NA</code>s, you get <code>NA</code> as the result. But if you specify <code>use = "complete.obs"</code>, the correlation is computed from only the complete (non-missing) observations.</p>
 <div class="cell">
-<div class="sourceCode cell-code" id="cb26"><pre class="sourceCode numberSource r number-lines code-with-copy"><code class="sourceCode r"><span id="cb26-1"><a></a><span class="fu">cor</span>(df<span class="sc">$</span>age, df<span class="sc">$</span>IgG_concentration, <span class="at">method=</span><span class="st">"pearson"</span>)</span></code><button title="Copy to Clipboard" class="code-copy-button"><i class="bi"></i></button></pre></div>
+<div class="sourceCode cell-code" id="cb27"><pre class="sourceCode numberSource r number-lines code-with-copy"><code class="sourceCode r"><span id="cb27-1"><a href="#cb27-1"></a><span class="fu">cor</span>(df<span class="sc">$</span>age, df<span class="sc">$</span>IgG_concentration, <span class="at">method=</span><span class="st">"pearson"</span>)</span></code><button title="Copy to Clipboard" class="code-copy-button"><i class="bi"></i></button></pre></div>
 <div class="cell-output cell-output-stdout">
 <pre><code>[1] NA</code></pre>
 </div>
-<div class="sourceCode cell-code" id="cb28"><pre class="sourceCode numberSource r number-lines code-with-copy"><code class="sourceCode r"><span id="cb28-1"><a></a><span class="fu">cor</span>(df<span class="sc">$</span>age, df<span class="sc">$</span>IgG_concentration, <span class="at">method=</span><span class="st">"pearson"</span>, <span class="at">use =</span> <span class="st">"complete.obs"</span>) <span class="co">#IF have missing data</span></span></code><button title="Copy to Clipboard" class="code-copy-button"><i class="bi"></i></button></pre></div>
+<div class="sourceCode cell-code" id="cb29"><pre class="sourceCode numberSource r number-lines code-with-copy"><code class="sourceCode r"><span id="cb29-1"><a href="#cb29-1"></a><span class="fu">cor</span>(df<span class="sc">$</span>age, df<span class="sc">$</span>IgG_concentration, <span class="at">method=</span><span class="st">"pearson"</span>, <span class="at">use =</span> <span class="st">"complete.obs"</span>) <span class="co">#IF have missing data</span></span></code><button title="Copy to Clipboard" class="code-copy-button"><i class="bi"></i></button></pre></div>
 <div class="cell-output cell-output-stdout">
 <pre><code>[1] 0.2604783</code></pre>
 </div>
 </div>
 <p>Small positive correlation between IgG concentration and age.</p>
 </section>
+<section id="correlation-confidence-interval" class="slide level2">
+<h2>Correlation confidence interval</h2>
+<p>The function <code>cor.test()</code> also gives you the confidence interval of the correlation statistic. Note, it uses complete observations by default.</p>
+<div class="cell">
+<div class="sourceCode cell-code" id="cb31"><pre class="sourceCode numberSource r number-lines code-with-copy"><code class="sourceCode r"><span id="cb31-1"><a href="#cb31-1"></a><span class="fu">cor.test</span>(df<span class="sc">$</span>age, df<span class="sc">$</span>IgG_concentration, <span class="at">method=</span><span class="st">"pearson"</span>)</span></code><button title="Copy to Clipboard" class="code-copy-button"><i class="bi"></i></button></pre></div>
+<div class="cell-output cell-output-stdout">
+<pre><code>
+    Pearson's product-moment correlation
+
+data:  df$age and df$IgG_concentration
+t = 6.7717, df = 630, p-value = 2.921e-11
+alternative hypothesis: true correlation is not equal to 0
+95 percent confidence interval:
+ 0.1862722 0.3317295
+sample estimates:
+      cor 
+0.2604783 </code></pre>
+</div>
+</div>
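+<p>A small sketch of pulling the estimate and the interval out of the result, using made-up vectors (<code>age_demo</code>, <code>igg_demo</code>) rather than the course data; all names here are assumptions for illustration.</p>
+<pre><code>age_demo &lt;- c(2, 4, 4, 1, 6, 8, 12, 15, NA, 9)
+igg_demo &lt;- c(0.3, 3.4, 0.5, 1.2, 20.4, 25.1, 150.0, NA, 80.2, 60.5)
+res &lt;- cor.test(age_demo, igg_demo, method = "pearson")
+res$estimate  # correlation computed from the 8 complete pairs
+res$conf.int  # 95 percent confidence interval
+res$parameter # degrees of freedom (complete pairs minus 2)</code></pre>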
+</section>
 <section id="t-test" class="slide level2">
 <h2>T-test</h2>
 <p>The commonly used t-tests are:</p>
 <ul>
 <li><strong>one-sample t-test</strong> – used to compare the mean of a variable in one group to the null hypothesis mean</li>
-<li><strong>two-sample t-test</strong> – used to test difference in means of a variable between two groups (null hypothesis - the group means are the <em>same</em>); if “two groups” are data of the <em>same</em> individuals collected at 2 time points, we say it is two-sample paired t-test</li>
+<li><strong>two-sample t-test</strong> – used to test the difference in means of a variable between two groups (null hypothesis: the group means are the <em>same</em>)</li>
 </ul>
 </section>
 <section id="t-test-1" class="slide level2">
 <h2>T-test</h2>
 <p>We can use the <code>t.test()</code> function from the <code>stats</code> package.</p>
+<div class="cell">
+<div class="sourceCode cell-code" id="cb33"><pre class="sourceCode numberSource r number-lines code-with-copy"><code class="sourceCode r"><span id="cb33-1"><a href="#cb33-1"></a>?t.test</span></code><button title="Copy to Clipboard" class="code-copy-button"><i class="bi"></i></button></pre></div>
+</div>
 <p>Student’s t-Test</p>
 <p>Description:</p>
 <pre><code> Performs one and two sample t-tests on vectors of data.</code></pre>
@@ -769,10 +907,10 @@ <h2>Running two-sample t-test</h2>
 <section id="running-two-sample-t-test-1" class="slide level2">
 <h2>Running two-sample t-test</h2>
 <div class="cell">
-<div class="sourceCode cell-code" id="cb40"><pre class="sourceCode numberSource r number-lines code-with-copy"><code class="sourceCode r"><span id="cb40-1"><a></a>IgG_young <span class="ot">&lt;-</span> df<span class="sc">$</span>IgG_concentration[df<span class="sc">$</span>age_group<span class="sc">==</span><span class="st">"young"</span>]</span>
-<span id="cb40-2"><a></a>IgG_old <span class="ot">&lt;-</span> df<span class="sc">$</span>IgG_concentration[df<span class="sc">$</span>age_group<span class="sc">==</span><span class="st">"old"</span>]</span>
-<span id="cb40-3"><a></a></span>
-<span id="cb40-4"><a></a><span class="fu">t.test</span>(IgG_young, IgG_old)</span></code><button title="Copy to Clipboard" class="code-copy-button"><i class="bi"></i></button></pre></div>
+<div class="sourceCode cell-code" id="cb44"><pre class="sourceCode numberSource r number-lines code-with-copy"><code class="sourceCode r"><span id="cb44-1"><a href="#cb44-1"></a>IgG_young <span class="ot">&lt;-</span> df<span class="sc">$</span>IgG_concentration[df<span class="sc">$</span>age_group<span class="sc">==</span><span class="st">"young"</span>]</span>
+<span id="cb44-2"><a href="#cb44-2"></a>IgG_old <span class="ot">&lt;-</span> df<span class="sc">$</span>IgG_concentration[df<span class="sc">$</span>age_group<span class="sc">==</span><span class="st">"old"</span>]</span>
+<span id="cb44-3"><a href="#cb44-3"></a></span>
+<span id="cb44-4"><a href="#cb44-4"></a><span class="fu">t.test</span>(IgG_young, IgG_old)</span></code><button title="Copy to Clipboard" class="code-copy-button"><i class="bi"></i></button></pre></div>
 <div class="cell-output cell-output-stdout">
 <pre><code>
     Welch Two Sample t-test
@@ -787,11 +925,14 @@ <h2>Running two-sample t-test</h2>
  45.05056 129.35454 </code></pre>
 </div>
 </div>
-<p>The mean IgG concenration of young and old is 45.05 and 129.35 mIU/mL, respectively. We reject null hypothesis that the difference in the mean IgG concentration of young and old is 0 mIU/mL.</p>
+<p>The mean IgG concentrations of the young and old groups are 45.05 and 129.35 IU/mL, respectively. We reject the null hypothesis that the difference in mean IgG concentration between young and old is 0 IU/mL.</p>
 </section>
 <section id="linear-regression-fit-in-r" class="slide level2">
 <h2>Linear regression fit in R</h2>
 <p>To fit regression models in R, we use the function <code>glm()</code> (Generalized Linear Model).</p>
+<div class="cell">
+<div class="sourceCode cell-code" id="cb46"><pre class="sourceCode numberSource r number-lines code-with-copy"><code class="sourceCode r"><span id="cb46-1"><a href="#cb46-1"></a>?glm</span></code><button title="Copy to Clipboard" class="code-copy-button"><i class="bi"></i></button></pre></div>
+</div>
 <p>Fitting Generalized Linear Models</p>
 <p>Description:</p>
 <pre><code> 'glm' is used to fit generalized linear models, specified by
@@ -1078,11 +1219,11 @@ <h2>Linear regression fit in R</h2>
 <ul>
 <li><code>formula</code> – model formula written using names of columns in our data</li>
 <li><code>data</code> – our data frame</li>
-<li><pre><code>  `family` -- error distribution and link function</code></pre></li>
+<li><code>family</code> – error distribution and link function</li>
 </ul>
 <div class="cell">
-<div class="sourceCode cell-code" id="cb62"><pre class="sourceCode numberSource r number-lines code-with-copy"><code class="sourceCode r"><span id="cb62-1"><a></a>fit1 <span class="ot">&lt;-</span> <span class="fu">glm</span>(IgG_concentration<span class="sc">~</span>age<span class="sc">+</span>gender<span class="sc">+</span>slum, <span class="at">data=</span>df, <span class="at">family=</span><span class="fu">gaussian</span>())</span>
-<span id="cb62-2"><a></a>fit2 <span class="ot">&lt;-</span> <span class="fu">glm</span>(seropos<span class="sc">~</span>age_group<span class="sc">+</span>gender<span class="sc">+</span>slum, <span class="at">data=</span>df, <span class="at">family =</span> <span class="fu">binomial</span>(<span class="at">link =</span> <span class="st">"logit"</span>))</span></code><button title="Copy to Clipboard" class="code-copy-button"><i class="bi"></i></button></pre></div>
+<div class="sourceCode cell-code" id="cb66"><pre class="sourceCode numberSource r number-lines code-with-copy"><code class="sourceCode r"><span id="cb66-1"><a href="#cb66-1"></a>fit1 <span class="ot">&lt;-</span> <span class="fu">glm</span>(IgG_concentration<span class="sc">~</span>age<span class="sc">+</span>gender<span class="sc">+</span>slum, <span class="at">data=</span>df, <span class="at">family=</span><span class="fu">gaussian</span>())</span>
+<span id="cb66-2"><a href="#cb66-2"></a>fit2 <span class="ot">&lt;-</span> <span class="fu">glm</span>(seropos<span class="sc">~</span>age_group<span class="sc">+</span>gender<span class="sc">+</span>slum, <span class="at">data=</span>df, <span class="at">family =</span> <span class="fu">binomial</span>(<span class="at">link =</span> <span class="st">"logit"</span>))</span></code><button title="Copy to Clipboard" class="code-copy-button"><i class="bi"></i></button></pre></div>
 </div>
 </section>
 <section id="summary.glm" class="slide level2">
@@ -1178,7 +1319,7 @@ <h2><code>summary.glm()</code></h2>
 <h2>Linear regression fit in R</h2>
 <p>Let’s look at the output…</p>
 <div class="cell">
-<div class="sourceCode cell-code" id="cb72"><pre class="sourceCode numberSource r number-lines code-with-copy"><code class="sourceCode r"><span id="cb72-1"><a></a><span class="fu">summary</span>(fit1)</span></code><button title="Copy to Clipboard" class="code-copy-button"><i class="bi"></i></button></pre></div>
+<div class="sourceCode cell-code" id="cb76"><pre class="sourceCode numberSource r number-lines code-with-copy"><code class="sourceCode r"><span id="cb76-1"><a href="#cb76-1"></a><span class="fu">summary</span>(fit1)</span></code><button title="Copy to Clipboard" class="code-copy-button"><i class="bi"></i></button></pre></div>
 <div class="cell-output cell-output-stdout">
 <pre><code>
 Call:
@@ -1204,7 +1345,7 @@ <h2>Linear regression fit in R</h2>
 
 Number of Fisher Scoring iterations: 2</code></pre>
 </div>
-<div class="sourceCode cell-code" id="cb74"><pre class="sourceCode numberSource r number-lines code-with-copy"><code class="sourceCode r"><span id="cb74-1"><a></a><span class="fu">summary</span>(fit2)</span></code><button title="Copy to Clipboard" class="code-copy-button"><i class="bi"></i></button></pre></div>
+<div class="sourceCode cell-code" id="cb78"><pre class="sourceCode numberSource r number-lines code-with-copy"><code class="sourceCode r"><span id="cb78-1"><a href="#cb78-1"></a><span class="fu">summary</span>(fit2)</span></code><button title="Copy to Clipboard" class="code-copy-button"><i class="bi"></i></button></pre></div>
 <div class="cell-output cell-output-stdout">
 <pre><code>
 Call:
@@ -1236,23 +1377,20 @@ <h2>Linear regression fit in R</h2>
 <section id="summary" class="slide level2">
 <h2>Summary</h2>
 <ul>
-<li><pre><code>  Use `cor()` to calculate correlation between two numeric vectors.</code></pre></li>
-<li><code>corrplot()</code> and <code>ggpairs()</code> is nice for a quick visualization of correlations</li>
-<li><code>t.test()</code> or <code>t_test()</code> tests the mean compared to null or difference in means between two groups</li>
+<li>Use <code>cor()</code> or <code>cor.test()</code> to calculate correlation between two numeric vectors.</li>
+<li><code>t.test()</code> tests a mean against a null value, or the difference in means between two groups</li>
 <li><pre><code>  ... xxamy more</code></pre></li>
 </ul>
 </section>
 <section id="acknowledgements" class="slide level2">
 <h2>Acknowledgements</h2>
-<p>These are the materials I looked through, modified, or extracted to complete this module’s lecture.</p>
+<p>These are the materials we looked through, modified, or extracted to complete this module’s lecture.</p>
 <ul>
 <li><a href="https://jhudatascience.org/intro_to_r/">“Introduction to R for Public Health Researchers” Johns Hopkins University</a></li>
 </ul>
 
-<div class="quarto-auto-generated-content">
 <div class="footer footer-default">
 
-</div>
 </div>
 </section>
     </div>
@@ -1281,6 +1419,7 @@ <h2>Acknowledgements</h2>
       Reveal.initialize({
 'controlsAuto': true,
 'previewLinksAuto': false,
+'smaller': true,
 'pdfSeparateFragments': false,
 'autoAnimateEasing': "ease",
 'autoAnimateDuration': 1,
@@ -1535,7 +1674,18 @@ <h2>Acknowledgements</h2>
         }
         return false;
       }
-      const onCopySuccess = function(e) {
+      const clipboard = new window.ClipboardJS('.code-copy-button', {
+        text: function(trigger) {
+          const codeEl = trigger.previousElementSibling.cloneNode(true);
+          for (const childEl of codeEl.children) {
+            if (isCodeAnnotation(childEl)) {
+              childEl.remove();
+            }
+          }
+          return codeEl.innerText;
+        }
+      });
+      clipboard.on('success', function(e) {
         // button target
         const button = e.trigger;
         // don't keep focus
@@ -1567,50 +1717,11 @@ <h2>Acknowledgements</h2>
         }, 1000);
         // clear code selection
         e.clearSelection();
-      }
-      const getTextToCopy = function(trigger) {
-          const codeEl = trigger.previousElementSibling.cloneNode(true);
-          for (const childEl of codeEl.children) {
-            if (isCodeAnnotation(childEl)) {
-              childEl.remove();
-            }
-          }
-          return codeEl.innerText;
-      }
-      const clipboard = new window.ClipboardJS('.code-copy-button:not([data-in-quarto-modal])', {
-        text: getTextToCopy
       });
-      clipboard.on('success', onCopySuccess);
-      if (window.document.getElementById('quarto-embedded-source-code-modal')) {
-        // For code content inside modals, clipBoardJS needs to be initialized with a container option
-        // TODO: Check when it could be a function (https://github.com/zenorocha/clipboard.js/issues/860)
-        const clipboardModal = new window.ClipboardJS('.code-copy-button[data-in-quarto-modal]', {
-          text: getTextToCopy,
-          container: window.document.getElementById('quarto-embedded-source-code-modal')
-        });
-        clipboardModal.on('success', onCopySuccess);
-      }
-        var localhostRegex = new RegExp(/^(?:http|https):\/\/localhost\:?[0-9]*\//);
-        var mailtoRegex = new RegExp(/^mailto:/);
-          var filterRegex = new RegExp('/' + window.location.host + '/');
-        var isInternal = (href) => {
-            return filterRegex.test(href) || localhostRegex.test(href) || mailtoRegex.test(href);
-        }
-        // Inspect non-navigation links and adorn them if external
-     	var links = window.document.querySelectorAll('a[href]:not(.nav-link):not(.navbar-brand):not(.toc-action):not(.sidebar-link):not(.sidebar-item-toggle):not(.pagination-link):not(.no-external):not([aria-hidden]):not(.dropdown-item):not(.quarto-navigation-tool):not(.about-link)');
-        for (var i=0; i<links.length; i++) {
-          const link = links[i];
-          if (!isInternal(link.href)) {
-            // undo the damage that might have been done by quarto-nav.js in the case of
-            // links that we want to consider external
-            if (link.dataset.originalHref !== undefined) {
-              link.href = link.dataset.originalHref;
-            }
-          }
-        }
-      function tippyHover(el, contentFn, onTriggerFn, onUntriggerFn) {
+      function tippyHover(el, contentFn) {
         const config = {
           allowHTML: true,
+          content: contentFn,
           maxWidth: 500,
           delay: 100,
           arrow: false,
@@ -1620,17 +1731,8 @@ <h2>Acknowledgements</h2>
           interactive: true,
           interactiveBorder: 10,
           theme: 'light-border',
-          placement: 'bottom-start',
+          placement: 'bottom-start'
         };
-        if (contentFn) {
-          config.content = contentFn;
-        }
-        if (onTriggerFn) {
-          config.onTrigger = onTriggerFn;
-        }
-        if (onUntriggerFn) {
-          config.onUntrigger = onUntriggerFn;
-        }
           config['offset'] = [0,0];
           config['maxWidth'] = 700;
         window.tippy(el, config); 
@@ -1644,11 +1746,7 @@ <h2>Acknowledgements</h2>
           try { href = new URL(href).hash; } catch {}
           const id = href.replace(/^#\/?/, "");
           const note = window.document.getElementById(id);
-          if (note) {
-            return note.innerHTML;
-          } else {
-            return "";
-          }
+          return note.innerHTML;
         });
       }
       const findCites = (el) => {
diff --git a/docs/modules/Module10-DataVisualization.html b/docs/modules/Module10-DataVisualization.html
index 8e82f1d..157bea4 100644
--- a/docs/modules/Module10-DataVisualization.html
+++ b/docs/modules/Module10-DataVisualization.html
@@ -8,11 +8,11 @@
 <link href="../site_libs/quarto-html/light-border.css" rel="stylesheet">
 <link href="../site_libs/quarto-html/quarto-html.min.css" rel="stylesheet" data-mode="light">
 <link href="../site_libs/quarto-html/quarto-syntax-highlighting.css" rel="stylesheet" id="quarto-text-highlighting-styles"><meta charset="utf-8">
-  <meta name="generator" content="quarto-1.5.54">
+  <meta name="generator" content="quarto-1.3.353">
 
   <meta name="author" content="Amy Winter">
   <meta name="author" content="Zane Billings">
-  <title>SISMID Module NUMBER Materials (2025) – Module 10: Data Visualization</title>
+  <title>SISMID Module NUMBER Materials (2025) - Module 10: Data Visualization</title>
   <meta name="apple-mobile-web-app-capable" content="yes">
   <meta name="apple-mobile-web-app-status-bar-style" content="black-translucent">
   <meta name="viewport" content="width=device-width, initial-scale=1.0, maximum-scale=1.0, user-scalable=no, minimal-ui">
@@ -32,7 +32,7 @@
     }
     /* CSS for syntax highlighting */
     pre > code.sourceCode { white-space: pre; position: relative; }
-    pre > code.sourceCode > span { line-height: 1.25; }
+    pre > code.sourceCode > span { display: inline-block; line-height: 1.25; }
     pre > code.sourceCode > span:empty { height: 1.2em; }
     .sourceCode { overflow: visible; }
     code.sourceCode > span { color: inherit; text-decoration: inherit; }
@@ -43,7 +43,7 @@
     }
     @media print {
     pre > code.sourceCode { white-space: pre-wrap; }
-    pre > code.sourceCode > span { display: inline-block; text-indent: -5em; padding-left: 5em; }
+    pre > code.sourceCode > span { text-indent: -5em; padding-left: 5em; }
     }
     pre.numberSource code
       { counter-reset: source-line 0; }
@@ -71,7 +71,7 @@
     code span.at { color: #657422; } /* Attribute */
     code span.bn { color: #ad0000; } /* BaseN */
     code span.bu { } /* BuiltIn */
-    code span.cf { color: #003b4f; font-weight: bold; } /* ControlFlow */
+    code span.cf { color: #003b4f; } /* ControlFlow */
     code span.ch { color: #20794d; } /* Char */
     code span.cn { color: #8f5902; } /* Constant */
     code span.co { color: #5e5e5e; } /* Comment */
@@ -85,7 +85,7 @@
     code span.fu { color: #4758ab; } /* Function */
     code span.im { color: #00769e; } /* Import */
     code span.in { color: #5e5e5e; } /* Information */
-    code span.kw { color: #003b4f; font-weight: bold; } /* Keyword */
+    code span.kw { color: #003b4f; } /* Keyword */
     code span.op { color: #5e5e5e; } /* Operator */
     code span.ot { color: #003b4f; } /* Other */
     code span.pp { color: #ad0000; } /* Preprocessor */
@@ -222,8 +222,7 @@
   }
 
   .callout.callout-titled .callout-body > .callout-content > :last-child {
-    padding-bottom: 0.5rem;
-    margin-bottom: 0;
+    margin-bottom: 0.5rem;
   }
 
   .callout.callout-titled .callout-icon::before {
@@ -408,36 +407,6 @@ <h1 class="title">Module 10: Data Visualization</h1>
 </div>
 </div>
 
-</section><section id="TOC">
-<nav role="doc-toc"> 
-<h2 id="toc-title">Page Items</h2>
-<ul>
-<li><a href="#/learning-objectives" id="/toc-learning-objectives">Learning Objectives</a></li>
-<li><a href="#/import-data-for-this-module" id="/toc-import-data-for-this-module">Import data for this module</a></li>
-<li><a href="#/prep-data" id="/toc-prep-data">Prep data</a></li>
-<li><a href="#/base-r-data-visualizattion-functions" id="/toc-base-r-data-visualizattion-functions">Base R data visualizattion functions</a></li>
-<li><a href="#/base-r-plotting" id="/toc-base-r-plotting">Base R Plotting</a></li>
-<li><a href="#/parameters" id="/toc-parameters">1. Parameters</a></li>
-<li><a href="#/lots-of-parameters-options" id="/toc-lots-of-parameters-options">Lots of parameters options</a></li>
-<li><a href="#/common-parameter-options" id="/toc-common-parameter-options">Common parameter options</a></li>
-<li><a href="#/common-parameter-options-1" id="/toc-common-parameter-options-1">Common parameter options</a></li>
-<li><a href="#/plot-attributes" id="/toc-plot-attributes">2. Plot Attributes</a></li>
-<li><a href="#/histogram-help-file" id="/toc-histogram-help-file"><code>histogram()</code> Help File</a></li>
-<li><a href="#/histogram-example" id="/toc-histogram-example"><code>histogram()</code> example</a></li>
-<li><a href="#/plot-help-file" id="/toc-plot-help-file"><code>plot()</code> Help File</a></li>
-<li><a href="#/plot-example" id="/toc-plot-example"><code>plot()</code> example</a></li>
-<li><a href="#/boxplot-help-file" id="/toc-boxplot-help-file"><code>boxplot()</code> Help File</a></li>
-<li><a href="#/boxplot-example" id="/toc-boxplot-example"><code>boxplot()</code> example</a></li>
-<li><a href="#/barplot-help-file" id="/toc-barplot-help-file"><code>barplot()</code> Help File</a></li>
-<li><a href="#/barplot-example" id="/toc-barplot-example"><code>barplot()</code> example</a></li>
-<li><a href="#/legend" id="/toc-legend">3. Legend!</a></li>
-<li><a href="#/add-legend-to-the-plot" id="/toc-add-legend-to-the-plot">Add legend to the plot</a></li>
-<li><a href="#/barplot-example-1" id="/toc-barplot-example-1"><code>barplot()</code> example</a></li>
-<li><a href="#/barplot-example-2" id="/toc-barplot-example-2"><code>barplot()</code> example</a></li>
-<li><a href="#/summary" id="/toc-summary">Summary</a></li>
-<li><a href="#/acknowledgements" id="/toc-acknowledgements">Acknowledgements</a></li>
-</ul>
-</nav>
 </section>
 <section id="learning-objectives" class="slide level2">
 <h2>Learning Objectives</h2>
@@ -450,8 +419,8 @@ <h2>Learning Objectives</h2>
 <h2>Import data for this module</h2>
 <p>Let’s read in our data (again) and take a quick look.</p>
 <div class="cell">
-<div class="sourceCode cell-code" id="cb1"><pre class="sourceCode numberSource r number-lines code-with-copy"><code class="sourceCode r"><span id="cb1-1"><a></a>df <span class="ot">&lt;-</span> <span class="fu">read.csv</span>(<span class="at">file =</span> <span class="st">"data/serodata.csv"</span>) <span class="co">#relative path</span></span>
-<span id="cb1-2"><a></a><span class="fu">head</span>(<span class="at">x=</span>df, <span class="at">n=</span><span class="dv">3</span>)</span></code><button title="Copy to Clipboard" class="code-copy-button"><i class="bi"></i></button></pre></div>
+<div class="sourceCode cell-code" id="cb1"><pre class="sourceCode numberSource r number-lines code-with-copy"><code class="sourceCode r"><span id="cb1-1"><a href="#cb1-1"></a>df <span class="ot">&lt;-</span> <span class="fu">read.csv</span>(<span class="at">file =</span> <span class="st">"data/serodata.csv"</span>) <span class="co">#relative path</span></span>
+<span id="cb1-2"><a href="#cb1-2"></a><span class="fu">head</span>(<span class="at">x=</span>df, <span class="at">n=</span><span class="dv">3</span>)</span></code><button title="Copy to Clipboard" class="code-copy-button"><i class="bi"></i></button></pre></div>
 <div class="cell-output cell-output-stdout">
 <pre><code>  observation_id IgG_concentration age gender     slum
 1           5772         0.3176895   2 Female Non slum
@@ -464,22 +433,20 @@ <h2>Import data for this module</h2>
 <h2>Prep data</h2>
 <p>Create the three-level factor variable <code>age_group</code>.</p>
 <div class="cell">
-<div class="sourceCode cell-code" id="cb3"><pre class="sourceCode numberSource r number-lines code-with-copy"><code class="sourceCode r"><span id="cb3-1"><a></a>df<span class="sc">$</span>age_group <span class="ot">&lt;-</span> <span class="fu">ifelse</span>(df<span class="sc">$</span>age <span class="sc">&lt;=</span> <span class="dv">5</span>, <span class="st">"young"</span>, </span>
-<span id="cb3-2"><a></a>                       <span class="fu">ifelse</span>(df<span class="sc">$</span>age<span class="sc">&lt;=</span><span class="dv">10</span> <span class="sc">&amp;</span> df<span class="sc">$</span>age<span class="sc">&gt;</span><span class="dv">5</span>, <span class="st">"middle"</span>, </span>
-<span id="cb3-3"><a></a>                              <span class="fu">ifelse</span>(df<span class="sc">$</span>age<span class="sc">&gt;</span><span class="dv">10</span>, <span class="st">"old"</span>, <span class="cn">NA</span>)))</span>
-<span id="cb3-4"><a></a>df<span class="sc">$</span>age_group <span class="ot">&lt;-</span> <span class="fu">factor</span>(df<span class="sc">$</span>age_group, <span class="at">levels=</span><span class="fu">c</span>(<span class="st">"young"</span>, <span class="st">"middle"</span>, <span class="st">"old"</span>))</span></code><button title="Copy to Clipboard" class="code-copy-button"><i class="bi"></i></button></pre></div>
+<div class="sourceCode cell-code" id="cb3"><pre class="sourceCode numberSource r number-lines code-with-copy"><code class="sourceCode r"><span id="cb3-1"><a href="#cb3-1"></a>df<span class="sc">$</span>age_group <span class="ot">&lt;-</span> <span class="fu">ifelse</span>(df<span class="sc">$</span>age <span class="sc">&lt;=</span> <span class="dv">5</span>, <span class="st">"young"</span>, </span>
+<span id="cb3-2"><a href="#cb3-2"></a>                       <span class="fu">ifelse</span>(df<span class="sc">$</span>age<span class="sc">&lt;=</span><span class="dv">10</span> <span class="sc">&amp;</span> df<span class="sc">$</span>age<span class="sc">&gt;</span><span class="dv">5</span>, <span class="st">"middle"</span>, <span class="st">"old"</span>)) </span>
+<span id="cb3-3"><a href="#cb3-3"></a>df<span class="sc">$</span>age_group <span class="ot">&lt;-</span> <span class="fu">factor</span>(df<span class="sc">$</span>age_group, <span class="at">levels=</span><span class="fu">c</span>(<span class="st">"young"</span>, <span class="st">"middle"</span>, <span class="st">"old"</span>))</span></code><button title="Copy to Clipboard" class="code-copy-button"><i class="bi"></i></button></pre></div>
 </div>
-<p>Create <code>seropos</code> binary variable representing seropositivity if antibody concentrations are &gt;10 mIUmL.</p>
+<p>Create a binary <code>seropos</code> variable that marks a sample as seropositive when its antibody concentration is at least 10 IU/mL (<code>&gt;=10</code>).</p>
 <div class="cell">
-<div class="sourceCode cell-code" id="cb4"><pre class="sourceCode numberSource r number-lines code-with-copy"><code class="sourceCode r"><span id="cb4-1"><a></a>df<span class="sc">$</span>seropos <span class="ot">&lt;-</span> <span class="fu">ifelse</span>(df<span class="sc">$</span>IgG_concentration<span class="sc">&lt;</span><span class="dv">10</span>, <span class="dv">0</span>, </span>
-<span id="cb4-2"><a></a>                                        <span class="fu">ifelse</span>(df<span class="sc">$</span>IgG_concentration<span class="sc">&gt;=</span><span class="dv">10</span>, <span class="dv">1</span>, <span class="cn">NA</span>))</span></code><button title="Copy to Clipboard" class="code-copy-button"><i class="bi"></i></button></pre></div>
+<div class="sourceCode cell-code" id="cb4"><pre class="sourceCode numberSource r number-lines code-with-copy"><code class="sourceCode r"><span id="cb4-1"><a href="#cb4-1"></a>df<span class="sc">$</span>seropos <span class="ot">&lt;-</span> <span class="fu">ifelse</span>(df<span class="sc">$</span>IgG_concentration<span class="sc">&lt;</span><span class="dv">10</span>, <span class="dv">0</span>, <span class="dv">1</span>)</span></code><button title="Copy to Clipboard" class="code-copy-button"><i class="bi"></i></button></pre></div>
 </div>
 </section>
 <section id="base-r-data-visualizattion-functions" class="slide level2">
 <h2>Base R data visualizattion functions</h2>
 <p>The Base R ‘graphics’ package has a ton of graphics options.</p>
 <div class="cell">
-<div class="sourceCode cell-code" id="cb5"><pre class="sourceCode numberSource r number-lines code-with-copy"><code class="sourceCode r"><span id="cb5-1"><a></a><span class="fu">library</span>(<span class="at">help =</span> <span class="st">"graphics"</span>)</span></code><button title="Copy to Clipboard" class="code-copy-button"><i class="bi"></i></button></pre></div>
+<div class="sourceCode cell-code" id="cb5"><pre class="sourceCode numberSource r number-lines code-with-copy"><code class="sourceCode r"><span id="cb5-1"><a href="#cb5-1"></a><span class="fu">help</span>(<span class="at">package =</span> <span class="st">"graphics"</span>)</span></code><button title="Copy to Clipboard" class="code-copy-button"><i class="bi"></i></button></pre></div>
 </div>
 <div class="cell">
 <div class="cell-output cell-output-stderr">
@@ -586,7 +553,7 @@ <h2>Base R data visualizattion functions</h2>
 </div>
 </div>
 </section>
-<section id="base-r-plotting" class="slide level2 scrollable">
+<section id="base-r-plotting" class="slide level2">
 <h2>Base R Plotting</h2>
 <p>To make a plot you often need to specify the following features:</p>
 <ol type="1">
@@ -598,15 +565,22 @@ <h2>Base R Plotting</h2>
 <section id="parameters" class="slide level2">
 <h2>1. Parameters</h2>
 <p>The parameter section fixes the settings for all your plots, basically the plot options. Adding attributes via <code>par()</code> before you call the plot creates ‘global’ settings for your plot.</p>
-<p>In the example below, we have set two commonly used optional attributes in the global plot settings. - The <code>mfrow</code> specifies that we have one row and two columns of plots — that is, two plots side by side. - The <code>mar</code> attribute is a vector of our margin widths, with the first value indicating the margin below the plot (5), the second indicating the margin to the left of the plot (5), the third, the top of the plot(4), and the fourth to the left (1).</p>
+<p>In the example below, we have set two commonly used optional attributes in the global plot settings.</p>
+<ul>
+<li>The <code>mfrow</code> specifies that we have one row and two columns of plots — that is, two plots side by side.</li>
+<li>The <code>mar</code> attribute is a vector of margin widths: the first value sets the margin below the plot (5), the second the margin to the left (5), the third the margin above (4), and the fourth the margin to the right (1).</li>
+</ul>
 <pre><code>par(mfrow = c(1,2), mar = c(5,5,4,1))</code></pre>
+</section>
+<section id="parameters-1" class="slide level2">
+<h2>1. Parameters</h2>
 
-<img data-src="images/par.png" style="width:70.0%" class="r-stretch"></section>
+<img data-src="images/par.png" class="r-stretch"></section>
 <section id="lots-of-parameters-options" class="slide level2">
 <h2>Lots of parameters options</h2>
 <p>However, there are many more parameter options that can be specified in the ‘global’ settings or specific to a certain plot option.</p>
 <div class="cell">
-<div class="sourceCode cell-code" id="cb9"><pre class="sourceCode numberSource r number-lines code-with-copy"><code class="sourceCode r"><span id="cb9-1"><a></a>?par</span></code><button title="Copy to Clipboard" class="code-copy-button"><i class="bi"></i></button></pre></div>
+<div class="sourceCode cell-code" id="cb9"><pre class="sourceCode numberSource r number-lines code-with-copy"><code class="sourceCode r"><span id="cb9-1"><a href="#cb9-1"></a>?par</span></code><button title="Copy to Clipboard" class="code-copy-button"><i class="bi"></i></button></pre></div>
 </div>
 <p>Set or Query Graphical Parameters</p>
 <p>Description:</p>
@@ -1266,7 +1240,7 @@ <h2>Lots of parameters options</h2>
 </section>
 <section id="common-parameter-options" class="slide level2">
 <h2>Common parameter options</h2>
-<p>Six useful parameter arguments help improve the readability of the plot:</p>
+<p>Eight useful parameter arguments help improve the readability of the plot:</p>
 <ul>
 <li><code>xlab</code>: specifies the x-axis label of the plot</li>
 <li><code>ylab</code>: specifies the y-axis label</li>
@@ -1297,7 +1271,7 @@ <h2>2. Plot Attributes</h2>
 <section id="histogram-help-file" class="slide level2">
 <h2><code>histogram()</code> Help File</h2>
 <div class="cell">
-<div class="sourceCode cell-code" id="cb22"><pre class="sourceCode numberSource r number-lines code-with-copy"><code class="sourceCode r"><span id="cb22-1"><a></a>?hist</span></code><button title="Copy to Clipboard" class="code-copy-button"><i class="bi"></i></button></pre></div>
+<div class="sourceCode cell-code" id="cb22"><pre class="sourceCode numberSource r number-lines code-with-copy"><code class="sourceCode r"><span id="cb22-1"><a href="#cb22-1"></a>?hist</span></code><button title="Copy to Clipboard" class="code-copy-button"><i class="bi"></i></button></pre></div>
 </div>
 <p>Histograms</p>
 <p>Description:</p>
@@ -1474,7 +1448,7 @@ <h2><code>histogram()</code> Help File</h2>
 </section>
 <section id="histogram-example" class="slide level2">
 <h2><code>histogram()</code> example</h2>
-<p>Reminder</p>
+<p>Reminder of the function signature:</p>
 <pre><code>hist(x, breaks = "Sturges",
      freq = NULL, probability = !freq,
      include.lowest = TRUE, right = TRUE, fuzz = 1e-7,
@@ -1486,33 +1460,25 @@ <h2><code>histogram()</code> example</h2>
      nclass = NULL, warn.unused = TRUE, ...)</code></pre>
 <p>Let’s practice</p>
 <div class="cell">
-<div class="sourceCode cell-code" id="cb38"><pre class="sourceCode numberSource r number-lines code-with-copy"><code class="sourceCode r"><span id="cb38-1"><a></a><span class="fu">hist</span>(df<span class="sc">$</span>age)</span></code><button title="Copy to Clipboard" class="code-copy-button"><i class="bi"></i></button></pre></div>
+<div class="sourceCode cell-code" id="cb38"><pre class="sourceCode numberSource r number-lines code-with-copy"><code class="sourceCode r"><span id="cb38-1"><a href="#cb38-1"></a><span class="fu">hist</span>(df<span class="sc">$</span>age)</span></code><button title="Copy to Clipboard" class="code-copy-button"><i class="bi"></i></button></pre></div>
 <div class="cell-output-display">
-<div>
-<figure>
 <p><img data-src="Module10-DataVisualization_files/figure-revealjs/unnamed-chunk-12-1.png" width="960"></p>
-</figure>
-</div>
 </div>
-<div class="sourceCode cell-code" id="cb39"><pre class="sourceCode numberSource r number-lines code-with-copy"><code class="sourceCode r"><span id="cb39-1"><a></a><span class="fu">hist</span>(</span>
-<span id="cb39-2"><a></a>    df<span class="sc">$</span>age, </span>
-<span id="cb39-3"><a></a>    <span class="at">freq=</span><span class="cn">FALSE</span>, </span>
-<span id="cb39-4"><a></a>    <span class="at">main=</span><span class="st">"Histogram"</span>, </span>
-<span id="cb39-5"><a></a>    <span class="at">xlab=</span><span class="st">"Age (years)"</span></span>
-<span id="cb39-6"><a></a>    )</span></code><button title="Copy to Clipboard" class="code-copy-button"><i class="bi"></i></button></pre></div>
+<div class="sourceCode cell-code" id="cb39"><pre class="sourceCode numberSource r number-lines code-with-copy"><code class="sourceCode r"><span id="cb39-1"><a href="#cb39-1"></a><span class="fu">hist</span>(</span>
+<span id="cb39-2"><a href="#cb39-2"></a>    df<span class="sc">$</span>age, </span>
+<span id="cb39-3"><a href="#cb39-3"></a>    <span class="at">freq=</span><span class="cn">FALSE</span>, </span>
+<span id="cb39-4"><a href="#cb39-4"></a>    <span class="at">main=</span><span class="st">"Histogram"</span>, </span>
+<span id="cb39-5"><a href="#cb39-5"></a>    <span class="at">xlab=</span><span class="st">"Age (years)"</span></span>
+<span id="cb39-6"><a href="#cb39-6"></a>    )</span></code><button title="Copy to Clipboard" class="code-copy-button"><i class="bi"></i></button></pre></div>
 <div class="cell-output-display">
-<div>
-<figure>
 <p><img data-src="Module10-DataVisualization_files/figure-revealjs/unnamed-chunk-12-2.png" width="960"></p>
-</figure>
-</div>
 </div>
 </div>
 </section>
 <section id="plot-help-file" class="slide level2">
 <h2><code>plot()</code> Help File</h2>
 <div class="cell">
-<div class="sourceCode cell-code" id="cb40"><pre class="sourceCode numberSource r number-lines code-with-copy"><code class="sourceCode r"><span id="cb40-1"><a></a>?plot</span></code><button title="Copy to Clipboard" class="code-copy-button"><i class="bi"></i></button></pre></div>
+<div class="sourceCode cell-code" id="cb40"><pre class="sourceCode numberSource r number-lines code-with-copy"><code class="sourceCode r"><span id="cb40-1"><a href="#cb40-1"></a>?plot</span></code><button title="Copy to Clipboard" class="code-copy-button"><i class="bi"></i></button></pre></div>
 </div>
 <p>Generic X-Y Plotting</p>
 <p>Description:</p>
@@ -1612,37 +1578,29 @@ <h2><code>plot()</code> Help File</h2>
 <section id="plot-example" class="slide level2">
 <h2><code>plot()</code> example</h2>
 <div class="cell">
-<div class="sourceCode cell-code" id="cb48"><pre class="sourceCode numberSource r number-lines code-with-copy"><code class="sourceCode r"><span id="cb48-1"><a></a><span class="fu">plot</span>(df<span class="sc">$</span>age, df<span class="sc">$</span>IgG_concentration)</span></code><button title="Copy to Clipboard" class="code-copy-button"><i class="bi"></i></button></pre></div>
+<div class="sourceCode cell-code" id="cb48"><pre class="sourceCode numberSource r number-lines code-with-copy"><code class="sourceCode r"><span id="cb48-1"><a href="#cb48-1"></a><span class="fu">plot</span>(df<span class="sc">$</span>age, df<span class="sc">$</span>IgG_concentration)</span></code><button title="Copy to Clipboard" class="code-copy-button"><i class="bi"></i></button></pre></div>
 <div class="cell-output-display">
-<div>
-<figure>
 <p><img data-src="Module10-DataVisualization_files/figure-revealjs/unnamed-chunk-15-1.png" width="960"></p>
-</figure>
-</div>
 </div>
-<div class="sourceCode cell-code" id="cb49"><pre class="sourceCode numberSource r number-lines code-with-copy"><code class="sourceCode r"><span id="cb49-1"><a></a><span class="fu">plot</span>(</span>
-<span id="cb49-2"><a></a>    df<span class="sc">$</span>age, </span>
-<span id="cb49-3"><a></a>    df<span class="sc">$</span>IgG_concentration, </span>
-<span id="cb49-4"><a></a>    <span class="at">type=</span><span class="st">"p"</span>, </span>
-<span id="cb49-5"><a></a>    <span class="at">main=</span><span class="st">"Age by IgG Concentrations"</span>, </span>
-<span id="cb49-6"><a></a>    <span class="at">xlab=</span><span class="st">"Age (years)"</span>, </span>
-<span id="cb49-7"><a></a>    <span class="at">ylab=</span><span class="st">"IgG Concentration (mIU/mL)"</span>, </span>
-<span id="cb49-8"><a></a>    <span class="at">pch=</span><span class="dv">16</span>, </span>
-<span id="cb49-9"><a></a>    <span class="at">cex=</span><span class="fl">0.9</span>,</span>
-<span id="cb49-10"><a></a>    <span class="at">col=</span><span class="st">"lightblue"</span>)</span></code><button title="Copy to Clipboard" class="code-copy-button"><i class="bi"></i></button></pre></div>
+<div class="sourceCode cell-code" id="cb49"><pre class="sourceCode numberSource r number-lines code-with-copy"><code class="sourceCode r"><span id="cb49-1"><a href="#cb49-1"></a><span class="fu">plot</span>(</span>
+<span id="cb49-2"><a href="#cb49-2"></a>    df<span class="sc">$</span>age, </span>
+<span id="cb49-3"><a href="#cb49-3"></a>    df<span class="sc">$</span>IgG_concentration, </span>
+<span id="cb49-4"><a href="#cb49-4"></a>    <span class="at">type=</span><span class="st">"p"</span>, </span>
+<span id="cb49-5"><a href="#cb49-5"></a>    <span class="at">main=</span><span class="st">"Age by IgG Concentrations"</span>, </span>
+<span id="cb49-6"><a href="#cb49-6"></a>    <span class="at">xlab=</span><span class="st">"Age (years)"</span>, </span>
+<span id="cb49-7"><a href="#cb49-7"></a>    <span class="at">ylab=</span><span class="st">"IgG Concentration (IU/mL)"</span>, </span>
+<span id="cb49-8"><a href="#cb49-8"></a>    <span class="at">pch=</span><span class="dv">16</span>, </span>
+<span id="cb49-9"><a href="#cb49-9"></a>    <span class="at">cex=</span><span class="fl">0.9</span>,</span>
+<span id="cb49-10"><a href="#cb49-10"></a>    <span class="at">col=</span><span class="st">"lightblue"</span>)</span></code><button title="Copy to Clipboard" class="code-copy-button"><i class="bi"></i></button></pre></div>
 <div class="cell-output-display">
-<div>
-<figure>
 <p><img data-src="Module10-DataVisualization_files/figure-revealjs/unnamed-chunk-15-2.png" width="960"></p>
-</figure>
-</div>
 </div>
 </div>
 </section>
 <section id="boxplot-help-file" class="slide level2">
 <h2><code>boxplot()</code> Help File</h2>
 <div class="cell">
-<div class="sourceCode cell-code" id="cb50"><pre class="sourceCode numberSource r number-lines code-with-copy"><code class="sourceCode r"><span id="cb50-1"><a></a>?boxplot</span></code><button title="Copy to Clipboard" class="code-copy-button"><i class="bi"></i></button></pre></div>
+<div class="sourceCode cell-code" id="cb50"><pre class="sourceCode numberSource r number-lines code-with-copy"><code class="sourceCode r"><span id="cb50-1"><a href="#cb50-1"></a>?boxplot</span></code><button title="Copy to Clipboard" class="code-copy-button"><i class="bi"></i></button></pre></div>
 </div>
 <p>Box Plots</p>
 <p>Description:</p>
@@ -1813,7 +1771,7 @@ <h2><code>boxplot()</code> Help File</h2>
 </section>
 <section id="boxplot-example" class="slide level2">
 <h2><code>boxplot()</code> example</h2>
-<p>Reminder</p>
+<p>Reminder of the function signature:</p>
 <pre><code>boxplot(formula, data = NULL, ..., subset, na.action = NULL,
         xlab = mklab(y_var = horizontal),
         ylab = mklab(y_var =!horizontal),
@@ -1821,207 +1779,209 @@ <h2><code>boxplot()</code> example</h2>
         drop = FALSE, sep = ".", lex.order = FALSE)</code></pre>
 <p>Let’s practice</p>
 <div class="cell">
-<div class="sourceCode cell-code" id="cb66"><pre class="sourceCode numberSource r number-lines code-with-copy"><code class="sourceCode r"><span id="cb66-1"><a></a><span class="fu">boxplot</span>(IgG_concentration<span class="sc">~</span>age_group, <span class="at">data=</span>df)</span></code><button title="Copy to Clipboard" class="code-copy-button"><i class="bi"></i></button></pre></div>
+<div class="sourceCode cell-code" id="cb66"><pre class="sourceCode numberSource r number-lines code-with-copy"><code class="sourceCode r"><span id="cb66-1"><a href="#cb66-1"></a><span class="fu">boxplot</span>(IgG_concentration<span class="sc">~</span>age_group, <span class="at">data=</span>df)</span></code><button title="Copy to Clipboard" class="code-copy-button"><i class="bi"></i></button></pre></div>
 <div class="cell-output-display">
-<div>
-<figure>
 <p><img data-src="Module10-DataVisualization_files/figure-revealjs/unnamed-chunk-18-1.png" width="960"></p>
-</figure>
-</div>
 </div>
-<div class="sourceCode cell-code" id="cb67"><pre class="sourceCode numberSource r number-lines code-with-copy"><code class="sourceCode r"><span id="cb67-1"><a></a><span class="fu">boxplot</span>(</span>
-<span id="cb67-2"><a></a>    <span class="fu">log</span>(df<span class="sc">$</span>IgG_concentration)<span class="sc">~</span>df<span class="sc">$</span>age_group, </span>
-<span id="cb67-3"><a></a>    <span class="at">main=</span><span class="st">"Age by IgG Concentrations"</span>, </span>
-<span id="cb67-4"><a></a>    <span class="at">xlab=</span><span class="st">"Age Group (years)"</span>, </span>
-<span id="cb67-5"><a></a>    <span class="at">ylab=</span><span class="st">"log IgG Concentration (mIU/mL)"</span>, </span>
-<span id="cb67-6"><a></a>    <span class="at">names=</span><span class="fu">c</span>(<span class="st">"1-5"</span>,<span class="st">"6-10"</span>, <span class="st">"11-15"</span>), </span>
-<span id="cb67-7"><a></a>    <span class="at">varwidth=</span>T</span>
-<span id="cb67-8"><a></a>    )</span></code><button title="Copy to Clipboard" class="code-copy-button"><i class="bi"></i></button></pre></div>
+<div class="sourceCode cell-code" id="cb67"><pre class="sourceCode numberSource r number-lines code-with-copy"><code class="sourceCode r"><span id="cb67-1"><a href="#cb67-1"></a><span class="fu">boxplot</span>(</span>
+<span id="cb67-2"><a href="#cb67-2"></a>    <span class="fu">log</span>(df<span class="sc">$</span>IgG_concentration)<span class="sc">~</span>df<span class="sc">$</span>age_group, </span>
+<span id="cb67-3"><a href="#cb67-3"></a>    <span class="at">main=</span><span class="st">"Age by IgG Concentrations"</span>, </span>
+<span id="cb67-4"><a href="#cb67-4"></a>    <span class="at">xlab=</span><span class="st">"Age Group (years)"</span>, </span>
+<span id="cb67-5"><a href="#cb67-5"></a>    <span class="at">ylab=</span><span class="st">"log IgG Concentration (IU/mL)"</span>, </span>
+<span id="cb67-6"><a href="#cb67-6"></a>    <span class="at">names=</span><span class="fu">c</span>(<span class="st">"1-5"</span>,<span class="st">"6-10"</span>, <span class="st">"11-15"</span>), </span>
+<span id="cb67-7"><a href="#cb67-7"></a>    <span class="at">varwidth=</span><span class="cn">TRUE</span></span>
+<span id="cb67-8"><a href="#cb67-8"></a>    )</span></code><button title="Copy to Clipboard" class="code-copy-button"><i class="bi"></i></button></pre></div>
 <div class="cell-output-display">
-<div>
-<figure>
 <p><img data-src="Module10-DataVisualization_files/figure-revealjs/unnamed-chunk-18-2.png" width="960"></p>
-</figure>
-</div>
 </div>
 </div>
 </section>
 <section id="barplot-help-file" class="slide level2">
 <h2><code>barplot()</code> Help File</h2>
 <div class="cell">
-<div class="sourceCode cell-code" id="cb68"><pre class="sourceCode numberSource r number-lines code-with-copy"><code class="sourceCode r"><span id="cb68-1"><a></a>?barplot</span></code><button title="Copy to Clipboard" class="code-copy-button"><i class="bi"></i></button></pre></div>
+<div class="sourceCode cell-code" id="cb68"><pre class="sourceCode numberSource r number-lines code-with-copy"><code class="sourceCode r"><span id="cb68-1"><a href="#cb68-1"></a>?barplot</span></code><button title="Copy to Clipboard" class="code-copy-button"><i class="bi"></i></button></pre></div>
 </div>
-<p>Box Plots</p>
+<p>Bar Plots</p>
 <p>Description:</p>
-<pre><code> Produce box-and-whisker plot(s) of the given (grouped) values.</code></pre>
+<pre><code> Creates a bar plot with vertical or horizontal bars.</code></pre>
 <p>Usage:</p>
-<pre><code> boxplot(x, ...)
- 
- ## S3 method for class 'formula'
- boxplot(formula, data = NULL, ..., subset, na.action = NULL,
-         xlab = mklab(y_var = horizontal),
-         ylab = mklab(y_var =!horizontal),
-         add = FALSE, ann = !add, horizontal = FALSE,
-         drop = FALSE, sep = ".", lex.order = FALSE)
+<pre><code> barplot(height, ...)
  
  ## Default S3 method:
- boxplot(x, ..., range = 1.5, width = NULL, varwidth = FALSE,
-         notch = FALSE, outline = TRUE, names, plot = TRUE,
-         border = par("fg"), col = "lightgray", log = "",
-         pars = list(boxwex = 0.8, staplewex = 0.5, outwex = 0.5),
-          ann = !add, horizontal = FALSE, add = FALSE, at = NULL)
+ barplot(height, width = 1, space = NULL,
+         names.arg = NULL, legend.text = NULL, beside = FALSE,
+         horiz = FALSE, density = NULL, angle = 45,
+         col = NULL, border = par("fg"),
+         main = NULL, sub = NULL, xlab = NULL, ylab = NULL,
+         xlim = NULL, ylim = NULL, xpd = TRUE, log = "",
+         axes = TRUE, axisnames = TRUE,
+         cex.axis = par("cex.axis"), cex.names = par("cex.axis"),
+         inside = TRUE, plot = TRUE, axis.lty = 0, offset = 0,
+         add = FALSE, ann = !add &amp;&amp; par("ann"), args.legend = NULL, ...)
+ 
+ ## S3 method for class 'formula'
+ barplot(formula, data, subset, na.action,
+         horiz = FALSE, xlab = NULL, ylab = NULL, ...)
  </code></pre>
 <p>Arguments:</p>
-<p>formula: a formula, such as ‘y ~ grp’, where ‘y’ is a numeric vector of data values to be split into groups according to the grouping variable ‘grp’ (usually a factor). Note that ‘~ g1 + g2’ is equivalent to ‘g1:g2’.</p>
-<pre><code>data: a data.frame (or list) from which the variables in 'formula'
+<p>height: either a vector or matrix of values describing the bars which make up the plot. If ‘height’ is a vector, the plot consists of a sequence of rectangular bars with heights given by the values in the vector. If ‘height’ is a matrix and ‘beside’ is ‘FALSE’ then each bar of the plot corresponds to a column of ‘height’, with the values in the column giving the heights of stacked sub-bars making up the bar. If ‘height’ is a matrix and ‘beside’ is ‘TRUE’, then the values in each column are juxtaposed rather than stacked.</p>
+<p>width: optional vector of bar widths. Re-cycled to length the number of bars drawn. Specifying a single value will have no visible effect unless ‘xlim’ is specified.</p>
+<p>space: the amount of space (as a fraction of the average bar width) left before each bar. May be given as a single number or one number per bar. If ‘height’ is a matrix and ‘beside’ is ‘TRUE’, ‘space’ may be specified by two numbers, where the first is the space between bars in the same group, and the second the space between the groups. If not given explicitly, it defaults to ‘c(0,1)’ if ‘height’ is a matrix and ‘beside’ is ‘TRUE’, and to 0.2 otherwise.</p>
+<p>names.arg: a vector of names to be plotted below each bar or group of bars. If this argument is omitted, then the names are taken from the ‘names’ attribute of ‘height’ if this is a vector, or the column names if it is a matrix.</p>
+<p>legend.text: a vector of text used to construct a legend for the plot, or a logical indicating whether a legend should be included. This is only useful when ‘height’ is a matrix. In that case given legend labels should correspond to the rows of ‘height’; if ‘legend.text’ is true, the row names of ‘height’ will be used as labels if they are non-null.</p>
+<p>beside: a logical value. If ‘FALSE’, the columns of ‘height’ are portrayed as stacked bars, and if ‘TRUE’ the columns are portrayed as juxtaposed bars.</p>
+<p>horiz: a logical value. If ‘FALSE’, the bars are drawn vertically with the first bar to the left. If ‘TRUE’, the bars are drawn horizontally with the first at the bottom.</p>
+<p>density: a vector giving the density of shading lines, in lines per inch, for the bars or bar components. The default value of ‘NULL’ means that no shading lines are drawn. Non-positive values of ‘density’ also inhibit the drawing of shading lines.</p>
+<p>angle: the slope of shading lines, given as an angle in degrees (counter-clockwise), for the bars or bar components.</p>
+<pre><code> col: a vector of colors for the bars or bar components.  By
+      default, '"grey"' is used if 'height' is a vector, and a
+      gamma-corrected grey palette if 'height' is a matrix; see
+      'grey.colors'.</code></pre>
+<p>border: the color to be used for the border of the bars. Use ‘border = NA’ to omit borders. If there are shading lines, ‘border = TRUE’ means use the same colour for the border as for the shading lines.</p>
+<p>main,sub: main title and subtitle for the plot.</p>
+<pre><code>xlab: a label for the x axis.
+
+ylab: a label for the y axis.
+
+xlim: limits for the x axis.
+
+ylim: limits for the y axis.
+
+ xpd: logical. Should bars be allowed to go outside region?
+
+ log: string specifying if axis scales should be logarithmic; see
+      'plot.default'.
+
+axes: logical.  If 'TRUE', a vertical (or horizontal, if 'horiz' is
+      true) axis is drawn.</code></pre>
+<p>axisnames: logical. If ‘TRUE’, and if there are ‘names.arg’ (see above), the other axis is drawn (with ‘lty = 0’) and labeled.</p>
+<p>cex.axis: expansion factor for numeric axis labels (see ‘par(’cex’)’).</p>
+<p>cex.names: expansion factor for axis names (bar labels).</p>
+<p>inside: logical. If ‘TRUE’, the lines which divide adjacent (non-stacked!) bars will be drawn. Only applies when ‘space = 0’ (which it partly is when ‘beside = TRUE’).</p>
+<pre><code>plot: logical.  If 'FALSE', nothing is plotted.</code></pre>
+<p>axis.lty: the graphics parameter ‘lty’ (see ‘par(’lty’)’) applied to the axis and tick marks of the categorical (default horizontal) axis. Note that by default the axis is suppressed.</p>
+<p>offset: a vector indicating how much the bars should be shifted relative to the x axis.</p>
+<pre><code> add: logical specifying if bars should be added to an already
+      existing plot; defaults to 'FALSE'.
+
+ ann: logical specifying if the default annotation ('main', 'sub',
+      'xlab', 'ylab') should appear on the plot, see 'title'.</code></pre>
+<p>args.legend: list of additional arguments to pass to ‘legend()’; names of the list are used as argument names. Only used if ‘legend.text’ is supplied.</p>
+<p>formula: a formula where the ‘y’ variables are numeric data to plot against the categorical ‘x’ variables. The formula can have one of three forms:</p>
+<pre><code>            y ~ x
+            y ~ x1 + x2
+            cbind(y1, y2) ~ x
+      
+      (see the examples).
+
+data: a data frame (or list) from which the variables in formula
       should be taken.</code></pre>
-<p>subset: an optional vector specifying a subset of observations to be used for plotting.</p>
-<p>na.action: a function which indicates what should happen when the data contain ’NA’s. The default is to ignore missing values in either the response or the group.</p>
-<p>xlab, ylab: x- and y-axis annotation, since R 3.6.0 with a non-empty default. Can be suppressed by ‘ann=FALSE’.</p>
-<pre><code> ann: 'logical' indicating if axes should be annotated (by 'xlab'
-      and 'ylab').</code></pre>
-<p>drop, sep, lex.order: passed to ‘split.default’, see there.</p>
-<pre><code>   x: for specifying data from which the boxplots are to be
-      produced. Either a numeric vector, or a single list
-      containing such vectors. Additional unnamed arguments specify
-      further data as separate vectors (each corresponding to a
-      component boxplot).  'NA's are allowed in the data.
-
- ...: For the 'formula' method, named arguments to be passed to the
-      default method.
-
-      For the default method, unnamed arguments are additional data
-      vectors (unless 'x' is a list when they are ignored), and
-      named arguments are arguments and graphical parameters to be
-      passed to 'bxp' in addition to the ones given by argument
-      'pars' (and override those in 'pars'). Note that 'bxp' may or
-      may not make use of graphical parameters it is passed: see
-      its documentation.</code></pre>
-<p>range: this determines how far the plot whiskers extend out from the box. If ‘range’ is positive, the whiskers extend to the most extreme data point which is no more than ‘range’ times the interquartile range from the box. A value of zero causes the whiskers to extend to the data extremes.</p>
-<p>width: a vector giving the relative widths of the boxes making up the plot.</p>
-<p>varwidth: if ‘varwidth’ is ‘TRUE’, the boxes are drawn with widths proportional to the square-roots of the number of observations in the groups.</p>
-<p>notch: if ‘notch’ is ‘TRUE’, a notch is drawn in each side of the boxes. If the notches of two plots do not overlap this is ‘strong evidence’ that the two medians differ (Chambers <em>et al</em>, 1983, p.&nbsp;62). See ‘boxplot.stats’ for the calculations used.</p>
-<p>outline: if ‘outline’ is not true, the outliers are not drawn (as points whereas S+ uses lines).</p>
-<p>names: group labels which will be printed under each boxplot. Can be a character vector or an expression (see plotmath).</p>
-<p>boxwex: a scale factor to be applied to all boxes. When there are only a few groups, the appearance of the plot can be improved by making the boxes narrower.</p>
-<p>staplewex: staple line width expansion, proportional to box width.</p>
-<p>outwex: outlier line width expansion, proportional to box width.</p>
-<pre><code>plot: if 'TRUE' (the default) then a boxplot is produced.  If not,
-      the summaries which the boxplots are based on are returned.</code></pre>
-<p>border: an optional vector of colors for the outlines of the boxplots. The values in ‘border’ are recycled if the length of ‘border’ is less than the number of plots.</p>
-<pre><code> col: if 'col' is non-null it is assumed to contain colors to be
-      used to colour the bodies of the box plots. By default they
-      are in the background colour.
-
- log: character indicating if x or y or both coordinates should be
-      plotted in log scale.
-
-pars: a list of (potentially many) more graphical parameters, e.g.,
-      'boxwex' or 'outpch'; these are passed to 'bxp' (if 'plot' is
-      true); for details, see there.</code></pre>
-<p>horizontal: logical indicating if the boxplots should be horizontal; default ‘FALSE’ means vertical boxes.</p>
-<pre><code> add: logical, if true _add_ boxplot to current plot.
-
-  at: numeric vector giving the locations where the boxplots should
-      be drawn, particularly when 'add = TRUE'; defaults to '1:n'
-      where 'n' is the number of boxes.</code></pre>
-<p>Details:</p>
-<pre><code> The generic function 'boxplot' currently has a default method
- ('boxplot.default') and a formula interface ('boxplot.formula').
-
- If multiple groups are supplied either as multiple arguments or
- via a formula, parallel boxplots will be plotted, in the order of
- the arguments or the order of the levels of the factor (see
- 'factor').
-
- Missing values are ignored when forming boxplots.</code></pre>
+<p>subset: an optional vector specifying a subset of observations to be used.</p>
+<p>na.action: a function which indicates what should happen when the data contain ‘NA’ values. The default is to ignore missing values in the given variables.</p>
+<pre><code> ...: arguments to be passed to/from other methods.  For the
+      default method these can include further arguments (such as
+      'axes', 'asp' and 'main') and graphical parameters (see
+      'par') which are passed to 'plot.window()', 'title()' and
+      'axis'.</code></pre>
 <p>Value:</p>
-<pre><code> List with the following components:</code></pre>
-<p>stats: a matrix, each column contains the extreme of the lower whisker, the lower hinge, the median, the upper hinge and the extreme of the upper whisker for one group/plot. If all the inputs have the same class attribute, so will this component.</p>
-<pre><code>   n: a vector with the number of (non-'NA') observations in each
-      group.
-
-conf: a matrix where each column contains the lower and upper
-      extremes of the notch.
-
- out: the values of any data points which lie beyond the extremes
-      of the whiskers.</code></pre>
-<p>group: a vector of the same length as ‘out’ whose elements indicate to which group the outlier belongs.</p>
-<p>names: a vector of names for the groups.</p>
+<pre><code> A numeric vector (or matrix, when 'beside = TRUE'), say 'mp',
+ giving the coordinates of _all_ the bar midpoints drawn, useful
+ for adding to the graph.
+
+ If 'beside' is true, use 'colMeans(mp)' for the midpoints of each
+ _group_ of bars, see example.</code></pre>
+<p>Author(s):</p>
+<pre><code> R Core, with a contribution by Arni Magnusson.</code></pre>
 <p>References:</p>
-<pre><code> Becker, R. A., Chambers, J. M. and Wilks, A. R. (1988).  _The New
- S Language_.  Wadsworth &amp; Brooks/Cole.
-
- Chambers, J. M., Cleveland, W. S., Kleiner, B. and Tukey, P. A.
- (1983).  _Graphical Methods for Data Analysis_.  Wadsworth &amp;
- Brooks/Cole.
-
- Murrell, P. (2005).  _R Graphics_.  Chapman &amp; Hall/CRC Press.
+<pre><code> Becker, R. A., Chambers, J. M. and Wilks, A. R. (1988) _The New S
+ Language_.  Wadsworth &amp; Brooks/Cole.
 
- See also 'boxplot.stats'.</code></pre>
+ Murrell, P. (2005) _R Graphics_. Chapman &amp; Hall/CRC Press.</code></pre>
 <p>See Also:</p>
-<pre><code> 'boxplot.stats' which does the computation, 'bxp' for the plotting
- and more examples; and 'stripchart' for an alternative (with small
- data sets).</code></pre>
+<pre><code> 'plot(..., type = "h")', 'dotchart'; 'hist' for bars of a
+ _continuous_ variable.  'mosaicplot()', more sophisticated to
+ visualize _several_ categorical variables.</code></pre>
 <p>Examples:</p>
-<pre><code> ## boxplot on a formula:
- boxplot(count ~ spray, data = InsectSprays, col = "lightgray")
- # *add* notches (somewhat funny here &lt;--&gt; warning "notches .. outside hinges"):
- boxplot(count ~ spray, data = InsectSprays,
-         notch = TRUE, add = TRUE, col = "blue")
+<pre><code> # Formula method
+ barplot(GNP ~ Year, data = longley)
+ barplot(cbind(Employed, Unemployed) ~ Year, data = longley)
  
- boxplot(decrease ~ treatment, data = OrchardSprays, col = "bisque",
-         log = "y")
- ## horizontal=TRUE, switching  y &lt;--&gt; x :
- boxplot(decrease ~ treatment, data = OrchardSprays, col = "bisque",
-         log = "x", horizontal=TRUE)
+ ## 3rd form of formula - 2 categories :
+ op &lt;- par(mfrow = 2:1, mgp = c(3,1,0)/2, mar = .1+c(3,3:1))
+ summary(d.Titanic &lt;- as.data.frame(Titanic))
+ barplot(Freq ~ Class + Survived, data = d.Titanic,
+         subset = Age == "Adult" &amp; Sex == "Male",
+         main = "barplot(Freq ~ Class + Survived, *)", ylab = "# {passengers}", legend.text = TRUE)
+ # Corresponding table :
+ (xt &lt;- xtabs(Freq ~ Survived + Class + Sex, d.Titanic, subset = Age=="Adult"))
+ # Alternatively, a mosaic plot :
+ mosaicplot(xt[,,"Male"], main = "mosaicplot(Freq ~ Class + Survived, *)", color=TRUE)
+ par(op)
  
- rb &lt;- boxplot(decrease ~ treatment, data = OrchardSprays, col = "bisque")
- title("Comparing boxplot()s and non-robust mean +/- SD")
- mn.t &lt;- tapply(OrchardSprays$decrease, OrchardSprays$treatment, mean)
- sd.t &lt;- tapply(OrchardSprays$decrease, OrchardSprays$treatment, sd)
- xi &lt;- 0.3 + seq(rb$n)
- points(xi, mn.t, col = "orange", pch = 18)
- arrows(xi, mn.t - sd.t, xi, mn.t + sd.t,
-        code = 3, col = "pink", angle = 75, length = .1)
  
- ## boxplot on a matrix:
- mat &lt;- cbind(Uni05 = (1:100)/21, Norm = rnorm(100),
-              `5T` = rt(100, df = 5), Gam2 = rgamma(100, shape = 2))
- boxplot(mat) # directly, calling boxplot.matrix()
+ # Default method
+ require(grDevices) # for colours
+ tN &lt;- table(Ni &lt;- stats::rpois(100, lambda = 5))
+ r &lt;- barplot(tN, col = rainbow(20))
+ #- type = "h" plotting *is* 'bar'plot
+ lines(r, tN, type = "h", col = "red", lwd = 2)
  
- ## boxplot on a data frame:
- df. &lt;- as.data.frame(mat)
- par(las = 1) # all axis labels horizontal
- boxplot(df., main = "boxplot(*, horizontal = TRUE)", horizontal = TRUE)
+ barplot(tN, space = 1.5, axisnames = FALSE,
+         sub = "barplot(..., space= 1.5, axisnames = FALSE)")
  
- ## Using 'at = ' and adding boxplots -- example idea by Roger Bivand :
- boxplot(len ~ dose, data = ToothGrowth,
-         boxwex = 0.25, at = 1:3 - 0.2,
-         subset = supp == "VC", col = "yellow",
-         main = "Guinea Pigs' Tooth Growth",
-         xlab = "Vitamin C dose mg",
-         ylab = "tooth length",
-         xlim = c(0.5, 3.5), ylim = c(0, 35), yaxs = "i")
- boxplot(len ~ dose, data = ToothGrowth, add = TRUE,
-         boxwex = 0.25, at = 1:3 + 0.2,
-         subset = supp == "OJ", col = "orange")
- legend(2, 9, c("Ascorbic acid", "Orange juice"),
-        fill = c("yellow", "orange"))
+ barplot(VADeaths, plot = FALSE)
+ barplot(VADeaths, plot = FALSE, beside = TRUE)
  
- ## With less effort (slightly different) using factor *interaction*:
- boxplot(len ~ dose:supp, data = ToothGrowth,
-         boxwex = 0.5, col = c("orange", "yellow"),
-         main = "Guinea Pigs' Tooth Growth",
-         xlab = "Vitamin C dose mg", ylab = "tooth length",
-         sep = ":", lex.order = TRUE, ylim = c(0, 35), yaxs = "i")
+ mp &lt;- barplot(VADeaths) # default
+ tot &lt;- colMeans(VADeaths)
+ text(mp, tot + 3, format(tot), xpd = TRUE, col = "blue")
+ barplot(VADeaths, beside = TRUE,
+         col = c("lightblue", "mistyrose", "lightcyan",
+                 "lavender", "cornsilk"),
+         legend.text = rownames(VADeaths), ylim = c(0, 100))
+ title(main = "Death Rates in Virginia", font.main = 4)
  
- ## more examples in  help(bxp)</code></pre>
+ hh &lt;- t(VADeaths)[, 5:1]
+ mybarcol &lt;- "gray20"
+ mp &lt;- barplot(hh, beside = TRUE,
+         col = c("lightblue", "mistyrose",
+                 "lightcyan", "lavender"),
+         legend.text = colnames(VADeaths), ylim = c(0,100),
+         main = "Death Rates in Virginia", font.main = 4,
+         sub = "Faked upper 2*sigma error bars", col.sub = mybarcol,
+         cex.names = 1.5)
+ segments(mp, hh, mp, hh + 2*sqrt(1000*hh/100), col = mybarcol, lwd = 1.5)
+ stopifnot(dim(mp) == dim(hh))  # corresponding matrices
+ mtext(side = 1, at = colMeans(mp), line = -2,
+       text = paste("Mean", formatC(colMeans(hh))), col = "red")
+ 
+ # Bar shading example
+ barplot(VADeaths, angle = 15+10*1:5, density = 20, col = "black",
+         legend.text = rownames(VADeaths))
+ title(main = list("Death Rates in Virginia", font = 4))
+ 
+ # Border color
+ barplot(VADeaths, border = "dark blue") 
+ 
+ 
+ # Log scales (not much sense here)
+ barplot(tN, col = heat.colors(12), log = "y")
+ barplot(tN, col = gray.colors(20), log = "xy")
+ 
+ # Legend location
+ barplot(height = cbind(x = c(465, 91) / 465 * 100,
+                        y = c(840, 200) / 840 * 100,
+                        z = c(37, 17) / 37 * 100),
+         beside = FALSE,
+         width = c(465, 840, 37),
+         col = c(1, 2),
+         legend.text = c("A", "B"),
+         args.legend = list(x = "topleft"))</code></pre>
 </section>
 <section id="barplot-example" class="slide level2">
 <h2><code>barplot()</code> example</h2>
 <p>The function takes a lot of arguments to control the way our data is plotted.</p>
-<p>Reminder</p>
+<p>Reminder function signature</p>
 <pre><code>barplot(height, width = 1, space = NULL,
         names.arg = NULL, legend.text = NULL, beside = FALSE,
         horiz = FALSE, density = NULL, angle = 45,
@@ -2033,23 +1993,15 @@ <h2><code>barplot()</code> example</h2>
         inside = TRUE, plot = TRUE, axis.lty = 0, offset = 0,
         add = FALSE, ann = !add &amp;&amp; par("ann"), args.legend = NULL, ...)</code></pre>
 <div class="cell">
-<div class="sourceCode cell-code" id="cb84"><pre class="sourceCode numberSource r number-lines code-with-copy"><code class="sourceCode r"><span id="cb84-1"><a></a>freq <span class="ot">&lt;-</span> <span class="fu">table</span>(df<span class="sc">$</span>seropos, df<span class="sc">$</span>age_group)</span>
-<span id="cb84-2"><a></a><span class="fu">barplot</span>(freq)</span></code><button title="Copy to Clipboard" class="code-copy-button"><i class="bi"></i></button></pre></div>
+<div class="sourceCode cell-code" id="cb83"><pre class="sourceCode numberSource r number-lines code-with-copy"><code class="sourceCode r"><span id="cb83-1"><a href="#cb83-1"></a>freq <span class="ot">&lt;-</span> <span class="fu">table</span>(df<span class="sc">$</span>seropos, df<span class="sc">$</span>age_group)</span>
+<span id="cb83-2"><a href="#cb83-2"></a><span class="fu">barplot</span>(freq)</span></code><button title="Copy to Clipboard" class="code-copy-button"><i class="bi"></i></button></pre></div>
 <div class="cell-output-display">
-<div>
-<figure>
 <p><img data-src="Module10-DataVisualization_files/figure-revealjs/unnamed-chunk-21-1.png" width="960"></p>
-</figure>
-</div>
 </div>
-<div class="sourceCode cell-code" id="cb85"><pre class="sourceCode numberSource r number-lines code-with-copy"><code class="sourceCode r"><span id="cb85-1"><a></a>prop <span class="ot">&lt;-</span> <span class="fu">prop.table</span>(freq)</span>
-<span id="cb85-2"><a></a><span class="fu">barplot</span>(prop)</span></code><button title="Copy to Clipboard" class="code-copy-button"><i class="bi"></i></button></pre></div>
+<div class="sourceCode cell-code" id="cb84"><pre class="sourceCode numberSource r number-lines code-with-copy"><code class="sourceCode r"><span id="cb84-1"><a href="#cb84-1"></a>prop.cell.percentages <span class="ot">&lt;-</span> <span class="fu">prop.table</span>(freq)</span>
+<span id="cb84-2"><a href="#cb84-2"></a><span class="fu">barplot</span>(prop.cell.percentages)</span></code><button title="Copy to Clipboard" class="code-copy-button"><i class="bi"></i></button></pre></div>
 <div class="cell-output-display">
-<div>
-<figure>
 <p><img data-src="Module10-DataVisualization_files/figure-revealjs/unnamed-chunk-21-2.png" width="960"></p>
-</figure>
-</div>
 </div>
 </div>
 </section>
@@ -2057,7 +2009,7 @@ <h2><code>barplot()</code> example</h2>
 <h2>3. Legend!</h2>
 <p>In base R plotting, the legend is not generated automatically. This is nice because it gives you a huge amount of control over how your legend looks, but it also makes it easy to mislabel your colors, symbols, line types, etc. So always check that each legend entry matches what was actually plotted.</p>
 <div class="cell">
-<div class="sourceCode cell-code" id="cb86"><pre class="sourceCode numberSource r number-lines code-with-copy"><code class="sourceCode r"><span id="cb86-1"><a></a>?legend</span></code><button title="Copy to Clipboard" class="code-copy-button"><i class="bi"></i></button></pre></div>
+<div class="sourceCode cell-code" id="cb85"><pre class="sourceCode numberSource r number-lines code-with-copy"><code class="sourceCode r"><span id="cb85-1"><a href="#cb85-1"></a>?legend</span></code><button title="Copy to Clipboard" class="code-copy-button"><i class="bi"></i></button></pre></div>
 </div>
 <div class="cell">
 <div class="cell-output cell-output-stdout">
@@ -2422,7 +2374,7 @@ <h2>3. Legend!</h2>
 </section>
 <section id="add-legend-to-the-plot" class="slide level2">
 <h2>Add legend to the plot</h2>
-<p>Reminder</p>
+<p>Reminder function signature</p>
 <pre><code>legend(x, y = NULL, legend, fill = NULL, col = par("col"),
        border = "black", lty, lwd, pch,
        angle = 45, density = NULL, bty = "o", bg = par("bg"),
@@ -2437,45 +2389,55 @@ <h2>Add legend to the plot</h2>
        seg.len = 2)</code></pre>
 <p>Let’s practice</p>
 <div class="cell">
-<div class="sourceCode cell-code" id="cb89"><pre class="sourceCode numberSource r number-lines code-with-copy"><code class="sourceCode r"><span id="cb89-1"><a></a><span class="fu">barplot</span>(prop, <span class="at">col=</span><span class="fu">c</span>(<span class="st">"darkblue"</span>,<span class="st">"red"</span>), <span class="at">ylim=</span><span class="fu">c</span>(<span class="dv">0</span>,<span class="fl">0.7</span>), <span class="at">main=</span><span class="st">"Seropositivity by Age Group"</span>)</span>
-<span id="cb89-2"><a></a><span class="fu">legend</span>(<span class="at">x=</span><span class="fl">2.5</span>, <span class="at">y=</span><span class="fl">0.7</span>,</span>
-<span id="cb89-3"><a></a>             <span class="at">fill=</span><span class="fu">c</span>(<span class="st">"darkblue"</span>,<span class="st">"red"</span>), </span>
-<span id="cb89-4"><a></a>             <span class="at">legend =</span> <span class="fu">c</span>(<span class="st">"seronegative"</span>, <span class="st">"seropositive"</span>))</span></code><button title="Copy to Clipboard" class="code-copy-button"><i class="bi"></i></button></pre></div>
-
+<div class="sourceCode cell-code" id="cb88"><pre class="sourceCode numberSource r number-lines code-with-copy"><code class="sourceCode r"><span id="cb88-1"><a href="#cb88-1"></a><span class="fu">barplot</span>(prop.cell.percentages, <span class="at">col=</span><span class="fu">c</span>(<span class="st">"darkblue"</span>,<span class="st">"red"</span>), <span class="at">ylim=</span><span class="fu">c</span>(<span class="dv">0</span>,<span class="fl">0.5</span>), <span class="at">main=</span><span class="st">"Seropositivity by Age Group"</span>)</span>
+<span id="cb88-2"><a href="#cb88-2"></a><span class="fu">legend</span>(<span class="at">x=</span><span class="fl">2.5</span>, <span class="at">y=</span><span class="fl">0.5</span>,</span>
+<span id="cb88-3"><a href="#cb88-3"></a>             <span class="at">fill=</span><span class="fu">c</span>(<span class="st">"darkblue"</span>,<span class="st">"red"</span>), </span>
+<span id="cb88-4"><a href="#cb88-4"></a>             <span class="at">legend =</span> <span class="fu">c</span>(<span class="st">"seronegative"</span>, <span class="st">"seropositive"</span>))</span></code><button title="Copy to Clipboard" class="code-copy-button"><i class="bi"></i></button></pre></div>
 </div>
-<img data-src="Module10-DataVisualization_files/figure-revealjs/unnamed-chunk-24-1.png" width="960" class="r-stretch"></section>
+</section>
+<section id="add-legend-to-the-plot-1" class="slide level2">
+<h2>Add legend to the plot</h2>
+
+<img data-src="Module10-DataVisualization_files/figure-revealjs/unnamed-chunk-25-1.png" width="960" class="r-stretch"></section>
 <section id="barplot-example-1" class="slide level2">
 <h2><code>barplot()</code> example</h2>
 <p>Getting closer, but what I really want is column proportions (i.e., the proportions should sum to one for each age group). Also, the age groups need more meaningful names.</p>
 <div class="cell">
-<div class="sourceCode cell-code" id="cb90"><pre class="sourceCode numberSource r number-lines code-with-copy"><code class="sourceCode r"><span id="cb90-1"><a></a>freq <span class="ot">&lt;-</span> <span class="fu">table</span>(df<span class="sc">$</span>seropos, df<span class="sc">$</span>age_group)</span>
-<span id="cb90-2"><a></a>tot.per.age.group <span class="ot">&lt;-</span> <span class="fu">colSums</span>(freq)</span>
-<span id="cb90-3"><a></a>age.seropos.matrix <span class="ot">&lt;-</span> <span class="fu">t</span>(<span class="fu">t</span>(freq)<span class="sc">/</span>tot.per.age.group)</span>
-<span id="cb90-4"><a></a><span class="fu">colnames</span>(age.seropos.matrix) <span class="ot">&lt;-</span> <span class="fu">c</span>(<span class="st">"1-5 yo"</span>, <span class="st">"6-10 yo"</span>, <span class="st">"11-15 yo"</span>)</span>
-<span id="cb90-5"><a></a></span>
-<span id="cb90-6"><a></a><span class="fu">barplot</span>(age.seropos.matrix, <span class="at">col=</span><span class="fu">c</span>(<span class="st">"darkblue"</span>,<span class="st">"red"</span>), <span class="at">ylim=</span><span class="fu">c</span>(<span class="dv">0</span>,<span class="fl">1.35</span>), <span class="at">main=</span><span class="st">"Seropositivity by Age Group"</span>)</span>
-<span id="cb90-7"><a></a><span class="fu">axis</span>(<span class="dv">2</span>, <span class="at">at =</span> <span class="fu">c</span>(<span class="fl">0.2</span>, <span class="fl">0.4</span>, <span class="fl">0.6</span>, <span class="fl">0.8</span>,<span class="dv">1</span>))</span>
-<span id="cb90-8"><a></a><span class="fu">legend</span>(<span class="at">x=</span><span class="fl">2.8</span>, <span class="at">y=</span><span class="fl">1.35</span>,</span>
-<span id="cb90-9"><a></a>             <span class="at">fill=</span><span class="fu">c</span>(<span class="st">"darkblue"</span>,<span class="st">"red"</span>), </span>
-<span id="cb90-10"><a></a>             <span class="at">legend =</span> <span class="fu">c</span>(<span class="st">"seronegative"</span>, <span class="st">"seropositive"</span>))</span></code><button title="Copy to Clipboard" class="code-copy-button"><i class="bi"></i></button></pre></div>
-
+<div class="sourceCode cell-code" id="cb89"><pre class="sourceCode numberSource r number-lines code-with-copy"><code class="sourceCode r"><span id="cb89-1"><a href="#cb89-1"></a>freq <span class="ot">&lt;-</span> <span class="fu">table</span>(df<span class="sc">$</span>seropos, df<span class="sc">$</span>age_group)</span>
+<span id="cb89-2"><a href="#cb89-2"></a>prop.column.percentages <span class="ot">&lt;-</span> <span class="fu">prop.table</span>(freq, <span class="at">margin=</span><span class="dv">2</span>)</span>
+<span id="cb89-3"><a href="#cb89-3"></a><span class="fu">colnames</span>(prop.column.percentages) <span class="ot">&lt;-</span> <span class="fu">c</span>(<span class="st">"1-5 yo"</span>, <span class="st">"6-10 yo"</span>, <span class="st">"11-15 yo"</span>)</span>
+<span id="cb89-4"><a href="#cb89-4"></a></span>
+<span id="cb89-5"><a href="#cb89-5"></a><span class="fu">barplot</span>(prop.column.percentages, <span class="at">col=</span><span class="fu">c</span>(<span class="st">"darkblue"</span>,<span class="st">"red"</span>), <span class="at">ylim=</span><span class="fu">c</span>(<span class="dv">0</span>,<span class="fl">1.35</span>), <span class="at">main=</span><span class="st">"Seropositivity by Age Group"</span>)</span>
+<span id="cb89-6"><a href="#cb89-6"></a><span class="fu">axis</span>(<span class="dv">2</span>, <span class="at">at =</span> <span class="fu">c</span>(<span class="fl">0.2</span>, <span class="fl">0.4</span>, <span class="fl">0.6</span>, <span class="fl">0.8</span>,<span class="dv">1</span>))</span>
+<span id="cb89-7"><a href="#cb89-7"></a><span class="fu">legend</span>(<span class="at">x=</span><span class="fl">2.8</span>, <span class="at">y=</span><span class="fl">1.35</span>,</span>
+<span id="cb89-8"><a href="#cb89-8"></a>             <span class="at">fill=</span><span class="fu">c</span>(<span class="st">"darkblue"</span>,<span class="st">"red"</span>), </span>
+<span id="cb89-9"><a href="#cb89-9"></a>             <span class="at">legend =</span> <span class="fu">c</span>(<span class="st">"seronegative"</span>, <span class="st">"seropositive"</span>))</span></code><button title="Copy to Clipboard" class="code-copy-button"><i class="bi"></i></button></pre></div>
 </div>
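The difference between cell proportions and column proportions can be checked on a small table. This is a minimal sketch, assuming made-up toy vectors `seropos` and `age_group` (hypothetical stand-ins for `df$seropos` and `df$age_group`, not the course data). It also confirms that `prop.table(freq, margin = 2)` matches the manual `t(t(freq) / colSums(freq))` calculation it replaces.

```r
# Hypothetical toy data standing in for df$seropos and df$age_group
seropos   <- c(0, 1, 1, 0, 1, 1, 0, 0, 1, 1)
age_group <- c("young", "young", "young", "middle", "middle",
               "old", "old", "old", "old", "old")
freq <- table(seropos, age_group)

# Cell proportions: every cell divided by the grand total
cell.props <- prop.table(freq)

# Column proportions: every cell divided by its column (age group) total
column.props <- prop.table(freq, margin = 2)

# Manual equivalent of the column proportions
manual.props <- t(t(freq) / colSums(freq))

sum(cell.props)        # all cells together sum to 1
colSums(column.props)  # each age group sums to 1
```

With column proportions, each stacked bar in `barplot()` has a total height of 1, so the bars compare seropositivity within each age group rather than each group's share of the whole sample.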
-<img data-src="Module10-DataVisualization_files/figure-revealjs/unnamed-chunk-25-1.png" width="960" class="r-stretch"></section>
+</section>
 <section id="barplot-example-2" class="slide level2">
 <h2><code>barplot()</code> example</h2>
+
+<img data-src="Module10-DataVisualization_files/figure-revealjs/unnamed-chunk-27-1.png" width="960" class="r-stretch"></section>
+<section id="barplot-example-3" class="slide level2">
+<h2><code>barplot()</code> example</h2>
+<p>Now, let’s look at seropositivity by two individual-level characteristics in the same plot.</p>
 <div class="cell">
-<div class="sourceCode cell-code" id="cb91"><pre class="sourceCode numberSource r number-lines code-with-copy"><code class="sourceCode r"><span id="cb91-1"><a></a><span class="fu">par</span>(<span class="at">mfrow =</span> <span class="fu">c</span>(<span class="dv">1</span>,<span class="dv">2</span>))</span>
-<span id="cb91-2"><a></a><span class="fu">barplot</span>(age.seropos.matrix, <span class="at">col=</span><span class="fu">c</span>(<span class="st">"darkblue"</span>,<span class="st">"red"</span>), <span class="at">ylim=</span><span class="fu">c</span>(<span class="dv">0</span>,<span class="fl">1.35</span>), <span class="at">main=</span><span class="st">"Seropositivity by Age Group"</span>)</span>
-<span id="cb91-3"><a></a><span class="fu">axis</span>(<span class="dv">2</span>, <span class="at">at =</span> <span class="fu">c</span>(<span class="fl">0.2</span>, <span class="fl">0.4</span>, <span class="fl">0.6</span>, <span class="fl">0.8</span>,<span class="dv">1</span>))</span>
-<span id="cb91-4"><a></a><span class="fu">legend</span>(<span class="at">x=</span><span class="dv">1</span>, <span class="at">y=</span><span class="fl">1.35</span>, <span class="at">fill=</span><span class="fu">c</span>(<span class="st">"darkblue"</span>,<span class="st">"red"</span>), <span class="at">legend =</span> <span class="fu">c</span>(<span class="st">"seronegative"</span>, <span class="st">"seropositive"</span>))</span>
-<span id="cb91-5"><a></a></span>
-<span id="cb91-6"><a></a><span class="fu">barplot</span>(slum.seropos.matrix, <span class="at">col=</span><span class="fu">c</span>(<span class="st">"darkblue"</span>,<span class="st">"red"</span>), <span class="at">ylim=</span><span class="fu">c</span>(<span class="dv">0</span>,<span class="fl">1.35</span>), <span class="at">main=</span><span class="st">"Seropositivity by Residence"</span>)</span>
-<span id="cb91-7"><a></a><span class="fu">axis</span>(<span class="dv">2</span>, <span class="at">at =</span> <span class="fu">c</span>(<span class="fl">0.2</span>, <span class="fl">0.4</span>, <span class="fl">0.6</span>, <span class="fl">0.8</span>,<span class="dv">1</span>))</span>
-<span id="cb91-8"><a></a><span class="fu">legend</span>(<span class="at">x=</span><span class="dv">1</span>, <span class="at">y=</span><span class="fl">1.35</span>, <span class="at">fill=</span><span class="fu">c</span>(<span class="st">"darkblue"</span>,<span class="st">"red"</span>),  <span class="at">legend =</span> <span class="fu">c</span>(<span class="st">"seronegative"</span>, <span class="st">"seropositive"</span>))</span></code><button title="Copy to Clipboard" class="code-copy-button"><i class="bi"></i></button></pre></div>
-
+<div class="sourceCode cell-code" id="cb90"><pre class="sourceCode numberSource r number-lines code-with-copy"><code class="sourceCode r"><span id="cb90-1"><a href="#cb90-1"></a><span class="fu">par</span>(<span class="at">mfrow =</span> <span class="fu">c</span>(<span class="dv">1</span>,<span class="dv">2</span>))</span>
+<span id="cb90-2"><a href="#cb90-2"></a><span class="fu">barplot</span>(prop.column.percentages, <span class="at">col=</span><span class="fu">c</span>(<span class="st">"darkblue"</span>,<span class="st">"red"</span>), <span class="at">ylim=</span><span class="fu">c</span>(<span class="dv">0</span>,<span class="fl">1.35</span>), <span class="at">main=</span><span class="st">"Seropositivity by Age Group"</span>)</span>
+<span id="cb90-3"><a href="#cb90-3"></a><span class="fu">axis</span>(<span class="dv">2</span>, <span class="at">at =</span> <span class="fu">c</span>(<span class="fl">0.2</span>, <span class="fl">0.4</span>, <span class="fl">0.6</span>, <span class="fl">0.8</span>,<span class="dv">1</span>))</span>
+<span id="cb90-4"><a href="#cb90-4"></a><span class="fu">legend</span>(<span class="st">"topright"</span>,</span>
+<span id="cb90-5"><a href="#cb90-5"></a>             <span class="at">fill=</span><span class="fu">c</span>(<span class="st">"darkblue"</span>,<span class="st">"red"</span>), </span>
+<span id="cb90-6"><a href="#cb90-6"></a>             <span class="at">legend =</span> <span class="fu">c</span>(<span class="st">"seronegative"</span>, <span class="st">"seropositive"</span>))</span>
+<span id="cb90-7"><a href="#cb90-7"></a></span>
+<span id="cb90-8"><a href="#cb90-8"></a><span class="fu">barplot</span>(prop.column.percentages2, <span class="at">col=</span><span class="fu">c</span>(<span class="st">"darkblue"</span>,<span class="st">"red"</span>), <span class="at">ylim=</span><span class="fu">c</span>(<span class="dv">0</span>,<span class="fl">1.35</span>), <span class="at">main=</span><span class="st">"Seropositivity by Residence"</span>)</span>
+<span id="cb90-9"><a href="#cb90-9"></a><span class="fu">axis</span>(<span class="dv">2</span>, <span class="at">at =</span> <span class="fu">c</span>(<span class="fl">0.2</span>, <span class="fl">0.4</span>, <span class="fl">0.6</span>, <span class="fl">0.8</span>,<span class="dv">1</span>))</span>
+<span id="cb90-10"><a href="#cb90-10"></a><span class="fu">legend</span>(<span class="st">"topright"</span>, <span class="at">fill=</span><span class="fu">c</span>(<span class="st">"darkblue"</span>,<span class="st">"red"</span>),  <span class="at">legend =</span> <span class="fu">c</span>(<span class="st">"seronegative"</span>, <span class="st">"seropositive"</span>))</span></code><button title="Copy to Clipboard" class="code-copy-button"><i class="bi"></i></button></pre></div>
 </div>
-<img data-src="Module10-DataVisualization_files/figure-revealjs/unnamed-chunk-27-1.png" width="960" class="r-stretch"></section>
+</section>
+<section id="barplot-example-4" class="slide level2">
+<h2><code>barplot()</code> example</h2>
+
+<img data-src="Module10-DataVisualization_files/figure-revealjs/unnamed-chunk-30-1.png" width="960" class="r-stretch"></section>
 <section id="summary" class="slide level2">
 <h2>Summary</h2>
 <ul>
@@ -2484,16 +2446,14 @@ <h2>Summary</h2>
 </section>
 <section id="acknowledgements" class="slide level2">
 <h2>Acknowledgements</h2>
-<p>These are the materials I looked through, modified, or extracted to complete this module’s lecture.</p>
+<p>These are the materials we looked through, modified, or extracted to complete this module’s lecture.</p>
 <ul>
 <li><a href="https://towardsdatascience.com/base-plotting-in-r-eb365da06b22">“Base Plotting in R” by Medium</a></li>
 <li><a href="https://r-graph-gallery.com/74-margin-and-oma-cheatsheet.html">“Base R margins: a cheatsheet”</a></li>
 </ul>
 
-<div class="quarto-auto-generated-content">
 <div class="footer footer-default">
 
-</div>
 </div>
 </section>
     </div>
@@ -2522,6 +2482,7 @@ <h2>Acknowledgements</h2>
       Reveal.initialize({
 'controlsAuto': true,
 'previewLinksAuto': false,
+'smaller': true,
 'pdfSeparateFragments': false,
 'autoAnimateEasing': "ease",
 'autoAnimateDuration': 1,
@@ -2776,7 +2737,18 @@ <h2>Acknowledgements</h2>
         }
         return false;
       }
-      const onCopySuccess = function(e) {
+      const clipboard = new window.ClipboardJS('.code-copy-button', {
+        text: function(trigger) {
+          const codeEl = trigger.previousElementSibling.cloneNode(true);
+          for (const childEl of codeEl.children) {
+            if (isCodeAnnotation(childEl)) {
+              childEl.remove();
+            }
+          }
+          return codeEl.innerText;
+        }
+      });
+      clipboard.on('success', function(e) {
         // button target
         const button = e.trigger;
         // don't keep focus
@@ -2808,50 +2780,11 @@ <h2>Acknowledgements</h2>
         }, 1000);
         // clear code selection
         e.clearSelection();
-      }
-      const getTextToCopy = function(trigger) {
-          const codeEl = trigger.previousElementSibling.cloneNode(true);
-          for (const childEl of codeEl.children) {
-            if (isCodeAnnotation(childEl)) {
-              childEl.remove();
-            }
-          }
-          return codeEl.innerText;
-      }
-      const clipboard = new window.ClipboardJS('.code-copy-button:not([data-in-quarto-modal])', {
-        text: getTextToCopy
       });
-      clipboard.on('success', onCopySuccess);
-      if (window.document.getElementById('quarto-embedded-source-code-modal')) {
-        // For code content inside modals, clipBoardJS needs to be initialized with a container option
-        // TODO: Check when it could be a function (https://github.com/zenorocha/clipboard.js/issues/860)
-        const clipboardModal = new window.ClipboardJS('.code-copy-button[data-in-quarto-modal]', {
-          text: getTextToCopy,
-          container: window.document.getElementById('quarto-embedded-source-code-modal')
-        });
-        clipboardModal.on('success', onCopySuccess);
-      }
-        var localhostRegex = new RegExp(/^(?:http|https):\/\/localhost\:?[0-9]*\//);
-        var mailtoRegex = new RegExp(/^mailto:/);
-          var filterRegex = new RegExp('/' + window.location.host + '/');
-        var isInternal = (href) => {
-            return filterRegex.test(href) || localhostRegex.test(href) || mailtoRegex.test(href);
-        }
-        // Inspect non-navigation links and adorn them if external
-     	var links = window.document.querySelectorAll('a[href]:not(.nav-link):not(.navbar-brand):not(.toc-action):not(.sidebar-link):not(.sidebar-item-toggle):not(.pagination-link):not(.no-external):not([aria-hidden]):not(.dropdown-item):not(.quarto-navigation-tool):not(.about-link)');
-        for (var i=0; i<links.length; i++) {
-          const link = links[i];
-          if (!isInternal(link.href)) {
-            // undo the damage that might have been done by quarto-nav.js in the case of
-            // links that we want to consider external
-            if (link.dataset.originalHref !== undefined) {
-              link.href = link.dataset.originalHref;
-            }
-          }
-        }
-      function tippyHover(el, contentFn, onTriggerFn, onUntriggerFn) {
+      function tippyHover(el, contentFn) {
         const config = {
           allowHTML: true,
+          content: contentFn,
           maxWidth: 500,
           delay: 100,
           arrow: false,
@@ -2861,17 +2794,8 @@ <h2>Acknowledgements</h2>
           interactive: true,
           interactiveBorder: 10,
           theme: 'light-border',
-          placement: 'bottom-start',
+          placement: 'bottom-start'
         };
-        if (contentFn) {
-          config.content = contentFn;
-        }
-        if (onTriggerFn) {
-          config.onTrigger = onTriggerFn;
-        }
-        if (onUntriggerFn) {
-          config.onUntrigger = onUntriggerFn;
-        }
           config['offset'] = [0,0];
           config['maxWidth'] = 700;
         window.tippy(el, config); 
@@ -2885,11 +2809,7 @@ <h2>Acknowledgements</h2>
           try { href = new URL(href).hash; } catch {}
           const id = href.replace(/^#\/?/, "");
           const note = window.document.getElementById(id);
-          if (note) {
-            return note.innerHTML;
-          } else {
-            return "";
-          }
+          return note.innerHTML;
         });
       }
       const findCites = (el) => {
diff --git a/docs/modules/Module10-DataVisualization_files/figure-revealjs/unnamed-chunk-15-2.png b/docs/modules/Module10-DataVisualization_files/figure-revealjs/unnamed-chunk-15-2.png
index 5656265..4e5c9c8 100644
Binary files a/docs/modules/Module10-DataVisualization_files/figure-revealjs/unnamed-chunk-15-2.png and b/docs/modules/Module10-DataVisualization_files/figure-revealjs/unnamed-chunk-15-2.png differ
diff --git a/docs/modules/Module10-DataVisualization_files/figure-revealjs/unnamed-chunk-24-1.png b/docs/modules/Module10-DataVisualization_files/figure-revealjs/unnamed-chunk-24-1.png
deleted file mode 100644
index 5313a2a..0000000
Binary files a/docs/modules/Module10-DataVisualization_files/figure-revealjs/unnamed-chunk-24-1.png and /dev/null differ
diff --git a/docs/modules/Module10-DataVisualization_files/figure-revealjs/unnamed-chunk-25-1.png b/docs/modules/Module10-DataVisualization_files/figure-revealjs/unnamed-chunk-25-1.png
index 232d44e..edfae88 100644
Binary files a/docs/modules/Module10-DataVisualization_files/figure-revealjs/unnamed-chunk-25-1.png and b/docs/modules/Module10-DataVisualization_files/figure-revealjs/unnamed-chunk-25-1.png differ
diff --git a/docs/modules/Module10-DataVisualization_files/figure-revealjs/unnamed-chunk-27-1.png b/docs/modules/Module10-DataVisualization_files/figure-revealjs/unnamed-chunk-27-1.png
index 1abfaa6..232d44e 100644
Binary files a/docs/modules/Module10-DataVisualization_files/figure-revealjs/unnamed-chunk-27-1.png and b/docs/modules/Module10-DataVisualization_files/figure-revealjs/unnamed-chunk-27-1.png differ
diff --git a/docs/modules/Module10-DataVisualization_files/figure-revealjs/unnamed-chunk-30-1.png b/docs/modules/Module10-DataVisualization_files/figure-revealjs/unnamed-chunk-30-1.png
new file mode 100644
index 0000000..c6eb02c
Binary files /dev/null and b/docs/modules/Module10-DataVisualization_files/figure-revealjs/unnamed-chunk-30-1.png differ
diff --git a/docs/modules/ModuleXX-Iteration.html b/docs/modules/ModuleXX-Iteration.html
index adee8c8..8c66c94 100644
--- a/docs/modules/ModuleXX-Iteration.html
+++ b/docs/modules/ModuleXX-Iteration.html
@@ -8,11 +8,11 @@
 <link href="../site_libs/quarto-html/light-border.css" rel="stylesheet">
 <link href="../site_libs/quarto-html/quarto-html.min.css" rel="stylesheet" data-mode="light">
 <link href="../site_libs/quarto-html/quarto-syntax-highlighting.css" rel="stylesheet" id="quarto-text-highlighting-styles"><meta charset="utf-8">
-  <meta name="generator" content="quarto-1.6.1">
+  <meta name="generator" content="quarto-1.3.353">
 
   <meta name="author" content="Amy Winter">
   <meta name="author" content="Zane Billings">
-  <title>SISMID Module NUMBER Materials (2025) – Iteration in R</title>
+  <title>SISMID Module NUMBER Materials (2025) - Iteration in R</title>
   <meta name="apple-mobile-web-app-capable" content="yes">
   <meta name="apple-mobile-web-app-status-bar-style" content="black-translucent">
   <meta name="viewport" content="width=device-width, initial-scale=1.0, maximum-scale=1.0, user-scalable=no, minimal-ui">
@@ -32,7 +32,7 @@
     }
     /* CSS for syntax highlighting */
     pre > code.sourceCode { white-space: pre; position: relative; }
-    pre > code.sourceCode > span { line-height: 1.25; }
+    pre > code.sourceCode > span { display: inline-block; line-height: 1.25; }
     pre > code.sourceCode > span:empty { height: 1.2em; }
     .sourceCode { overflow: visible; }
     code.sourceCode > span { color: inherit; text-decoration: inherit; }
@@ -43,7 +43,7 @@
     }
     @media print {
     pre > code.sourceCode { white-space: pre-wrap; }
-    pre > code.sourceCode > span { display: inline-block; text-indent: -5em; padding-left: 5em; }
+    pre > code.sourceCode > span { text-indent: -5em; padding-left: 5em; }
     }
     pre.numberSource code
       { counter-reset: source-line 0; }
@@ -71,7 +71,7 @@
     code span.at { color: #657422; } /* Attribute */
     code span.bn { color: #ad0000; } /* BaseN */
     code span.bu { } /* BuiltIn */
-    code span.cf { color: #003b4f; font-weight: bold; } /* ControlFlow */
+    code span.cf { color: #003b4f; } /* ControlFlow */
     code span.ch { color: #20794d; } /* Char */
     code span.cn { color: #8f5902; } /* Constant */
     code span.co { color: #5e5e5e; } /* Comment */
@@ -85,7 +85,7 @@
     code span.fu { color: #4758ab; } /* Function */
     code span.im { color: #00769e; } /* Import */
     code span.in { color: #5e5e5e; } /* Information */
-    code span.kw { color: #003b4f; font-weight: bold; } /* Keyword */
+    code span.kw { color: #003b4f; } /* Keyword */
     code span.op { color: #5e5e5e; } /* Operator */
     code span.ot { color: #003b4f; } /* Other */
     code span.pp { color: #ad0000; } /* Preprocessor */
@@ -222,8 +222,7 @@
   }
 
   .callout.callout-titled .callout-body > .callout-content > :last-child {
-    padding-bottom: 0.5rem;
-    margin-bottom: 0;
+    margin-bottom: 0.5rem;
   }
 
   .callout.callout-titled .callout-icon::before {
@@ -407,44 +406,11 @@ <h1 class="title">Iteration in R</h1>
 </div>
 </div>
 
-</section><section id="TOC">
-<nav role="doc-toc"> 
-<h2 id="toc-title">Page Items</h2>
-<ul>
-<li><a href="#/learning-goals" id="/toc-learning-goals">Learning goals</a></li>
-<li><a href="#/what-is-iteration" id="/toc-what-is-iteration">What is iteration?</a></li>
-<li><a href="#/parts-of-a-loop" id="/toc-parts-of-a-loop">Parts of a loop</a></li>
-<li><a href="#/parts-of-a-loop-1" id="/toc-parts-of-a-loop-1">Parts of a loop</a></li>
-<li><a href="#/header-parts" id="/toc-header-parts">Header parts</a></li>
-<li><a href="#/header-parts-1" id="/toc-header-parts-1">Header parts</a></li>
-<li><a href="#/loop-iteration-1" id="/toc-loop-iteration-1">Loop iteration 1</a></li>
-<li><a href="#/loop-iteration-2" id="/toc-loop-iteration-2">Loop iteration 2</a></li>
-<li><a href="#/loop-iteration-3" id="/toc-loop-iteration-3">Loop iteration 3</a></li>
-<li><a href="#/the-loop-structure-automates-this-process-for-us-so-we-dont-have-to-copy-and-paste-our-code" id="/toc-the-loop-structure-automates-this-process-for-us-so-we-dont-have-to-copy-and-paste-our-code">The loop structure automates this process for us so we don’t have to copy and paste our code!</a></li>
-<li><a href="#/remember-write-dry-code" id="/toc-remember-write-dry-code">Remember: write DRY code!</a></li>
-
-<li><a href="#/you-try-it" id="/toc-you-try-it">You try it!</a></li>
-<li><a href="#/wait-did-we-need-to-do-that" id="/toc-wait-did-we-need-to-do-that">Wait, did we need to do that?</a></li>
-<li><a href="#/wait-did-we-need-to-do-that-1" id="/toc-wait-did-we-need-to-do-that-1">Wait, did we need to do that?</a></li>
-<li><a href="#/wait-did-we-need-to-do-that-2" id="/toc-wait-did-we-need-to-do-that-2">Wait, did we need to do that?</a></li>
-<li><a href="#/loop-walkthrough" id="/toc-loop-walkthrough">Loop walkthrough</a></li>
-<li><a href="#/loop-walkthrough-1" id="/toc-loop-walkthrough-1">Loop walkthrough</a></li>
-<li><a href="#/loop-walkthrough-2" id="/toc-loop-walkthrough-2">Loop walkthrough</a></li>
-<li><a href="#/loop-walkthrough-3" id="/toc-loop-walkthrough-3">Loop walkthrough</a></li>
-<li><a href="#/you-try-it-if-we-have-time" id="/toc-you-try-it-if-we-have-time">You try it! (if we have time)</a></li>
-<li><a href="#/main-problem-solution" id="/toc-main-problem-solution">Main problem solution</a></li>
-<li><a href="#/main-problem-solution-1" id="/toc-main-problem-solution-1">Main problem solution</a></li>
-<li><a href="#/bonus-problem-solution" id="/toc-bonus-problem-solution">Bonus problem solution</a></li>
-<li><a href="#/bonus-problem-solution-1" id="/toc-bonus-problem-solution-1">Bonus problem solution</a></li>
-<li><a href="#/more-practice-on-your-own" id="/toc-more-practice-on-your-own">More practice on your own</a></li>
-</ul>
-</nav>
 </section>
 <section id="learning-goals" class="slide level2">
 <h2>Learning goals</h2>
 <ol type="1">
 <li>Replace repetitive code with a <code>for</code> loop</li>
-<li>Compare and contrast <code>for</code> loops and <code>*apply()</code> functions</li>
 <li>Use vectorization to replace unnecessary loops</li>
 </ol>
 </section>
@@ -455,16 +421,16 @@ <h2>What is iteration?</h2>
 <li>In <code>R</code>, this means running the same code multiple times in a row.</li>
 </ul>
 <div class="cell">
-<div class="sourceCode cell-code" id="cb1"><pre class="sourceCode numberSource r number-lines code-with-copy"><code class="sourceCode r"><span id="cb1-1"><a></a><span class="fu">data</span>(<span class="st">"penguins"</span>, <span class="at">package =</span> <span class="st">"palmerpenguins"</span>)</span>
-<span id="cb1-2"><a></a><span class="cf">for</span> (this_island <span class="cf">in</span> <span class="fu">levels</span>(penguins<span class="sc">$</span>island)) {</span>
-<span id="cb1-3"><a></a>    island_mean <span class="ot">&lt;-</span></span>
-<span id="cb1-4"><a></a>        penguins<span class="sc">$</span>bill_depth_mm[penguins<span class="sc">$</span>island <span class="sc">==</span> this_island] <span class="sc">|&gt;</span></span>
-<span id="cb1-5"><a></a>        <span class="fu">mean</span>(<span class="at">na.rm =</span> <span class="cn">TRUE</span>) <span class="sc">|&gt;</span></span>
-<span id="cb1-6"><a></a>        <span class="fu">round</span>(<span class="at">digits =</span> <span class="dv">2</span>)</span>
-<span id="cb1-7"><a></a>    </span>
-<span id="cb1-8"><a></a>    <span class="fu">cat</span>(<span class="fu">paste</span>(<span class="st">"The mean bill depth on"</span>, this_island, <span class="st">"Island was"</span>, island_mean,</span>
-<span id="cb1-9"><a></a>                            <span class="st">"mm.</span><span class="sc">\n</span><span class="st">"</span>))</span>
-<span id="cb1-10"><a></a>}</span></code><button title="Copy to Clipboard" class="code-copy-button"><i class="bi"></i></button></pre></div>
+<div class="sourceCode cell-code" id="cb1"><pre class="sourceCode numberSource r number-lines code-with-copy"><code class="sourceCode r"><span id="cb1-1"><a href="#cb1-1"></a><span class="fu">data</span>(<span class="st">"penguins"</span>, <span class="at">package =</span> <span class="st">"palmerpenguins"</span>)</span>
+<span id="cb1-2"><a href="#cb1-2"></a><span class="cf">for</span> (this_island <span class="cf">in</span> <span class="fu">levels</span>(penguins<span class="sc">$</span>island)) {</span>
+<span id="cb1-3"><a href="#cb1-3"></a>    island_mean <span class="ot">&lt;-</span></span>
+<span id="cb1-4"><a href="#cb1-4"></a>        penguins<span class="sc">$</span>bill_depth_mm[penguins<span class="sc">$</span>island <span class="sc">==</span> this_island] <span class="sc">|&gt;</span></span>
+<span id="cb1-5"><a href="#cb1-5"></a>        <span class="fu">mean</span>(<span class="at">na.rm =</span> <span class="cn">TRUE</span>) <span class="sc">|&gt;</span></span>
+<span id="cb1-6"><a href="#cb1-6"></a>        <span class="fu">round</span>(<span class="at">digits =</span> <span class="dv">2</span>)</span>
+<span id="cb1-7"><a href="#cb1-7"></a>    </span>
+<span id="cb1-8"><a href="#cb1-8"></a>    <span class="fu">cat</span>(<span class="fu">paste</span>(<span class="st">"The mean bill depth on"</span>, this_island, <span class="st">"Island was"</span>, island_mean,</span>
+<span id="cb1-9"><a href="#cb1-9"></a>                            <span class="st">"mm.</span><span class="sc">\n</span><span class="st">"</span>))</span>
+<span id="cb1-10"><a href="#cb1-10"></a>}</span></code><button title="Copy to Clipboard" class="code-copy-button"><i class="bi"></i></button></pre></div>
 <div class="cell-output cell-output-stdout">
 <pre><code>The mean bill depth on Biscoe Island was 15.87 mm.
 The mean bill depth on Dream Island was 18.34 mm.
@@ -475,37 +441,37 @@ <h2>What is iteration?</h2>
 <section id="parts-of-a-loop" class="slide level2">
 <h2>Parts of a loop</h2>
 <div class="cell">
-<div class="sourceCode cell-code" id="cb3" data-code-line-numbers="1,9"><pre class="sourceCode numberSource r number-lines code-with-copy"><code class="sourceCode r"><span id="cb3-1"><a></a><span class="cf">for</span> (this_island <span class="cf">in</span> <span class="fu">levels</span>(penguins<span class="sc">$</span>island)) {</span>
-<span id="cb3-2"><a></a>    island_mean <span class="ot">&lt;-</span></span>
-<span id="cb3-3"><a></a>        penguins<span class="sc">$</span>bill_depth_mm[penguins<span class="sc">$</span>island <span class="sc">==</span> this_island] <span class="sc">|&gt;</span></span>
-<span id="cb3-4"><a></a>        <span class="fu">mean</span>(<span class="at">na.rm =</span> <span class="cn">TRUE</span>) <span class="sc">|&gt;</span></span>
-<span id="cb3-5"><a></a>        <span class="fu">round</span>(<span class="at">digits =</span> <span class="dv">2</span>)</span>
-<span id="cb3-6"><a></a>    </span>
-<span id="cb3-7"><a></a>    <span class="fu">cat</span>(<span class="fu">paste</span>(<span class="st">"The mean bill depth on"</span>, this_island, <span class="st">"Island was"</span>, island_mean,</span>
-<span id="cb3-8"><a></a>                            <span class="st">"mm.</span><span class="sc">\n</span><span class="st">"</span>))</span>
-<span id="cb3-9"><a></a>}</span></code><button title="Copy to Clipboard" class="code-copy-button"><i class="bi"></i></button></pre></div>
+<div class="sourceCode cell-code" id="cb3" data-code-line-numbers="1,9"><pre class="sourceCode numberSource r number-lines code-with-copy"><code class="sourceCode r"><span id="cb3-1"><a href="#cb3-1"></a><span class="cf">for</span> (this_island <span class="cf">in</span> <span class="fu">levels</span>(penguins<span class="sc">$</span>island)) {</span>
+<span id="cb3-2"><a href="#cb3-2"></a>    island_mean <span class="ot">&lt;-</span></span>
+<span id="cb3-3"><a href="#cb3-3"></a>        penguins<span class="sc">$</span>bill_depth_mm[penguins<span class="sc">$</span>island <span class="sc">==</span> this_island] <span class="sc">|&gt;</span></span>
+<span id="cb3-4"><a href="#cb3-4"></a>        <span class="fu">mean</span>(<span class="at">na.rm =</span> <span class="cn">TRUE</span>) <span class="sc">|&gt;</span></span>
+<span id="cb3-5"><a href="#cb3-5"></a>        <span class="fu">round</span>(<span class="at">digits =</span> <span class="dv">2</span>)</span>
+<span id="cb3-6"><a href="#cb3-6"></a>    </span>
+<span id="cb3-7"><a href="#cb3-7"></a>    <span class="fu">cat</span>(<span class="fu">paste</span>(<span class="st">"The mean bill depth on"</span>, this_island, <span class="st">"Island was"</span>, island_mean,</span>
+<span id="cb3-8"><a href="#cb3-8"></a>                            <span class="st">"mm.</span><span class="sc">\n</span><span class="st">"</span>))</span>
+<span id="cb3-9"><a href="#cb3-9"></a>}</span></code><button title="Copy to Clipboard" class="code-copy-button"><i class="bi"></i></button></pre></div>
 </div>
 <p>The <strong>header</strong> declares how many times we will repeat the same code. The header contains a <strong>control variable</strong> that changes in each repetition and a <strong>sequence</strong> of values for the control variable to take.</p>
 </section>
 <section id="parts-of-a-loop-1" class="slide level2">
 <h2>Parts of a loop</h2>
 <div class="cell">
-<div class="sourceCode cell-code" id="cb4" data-code-line-numbers="2-8"><pre class="sourceCode numberSource r number-lines code-with-copy"><code class="sourceCode r"><span id="cb4-1"><a></a><span class="cf">for</span> (this_island <span class="cf">in</span> <span class="fu">levels</span>(penguins<span class="sc">$</span>island)) {</span>
-<span id="cb4-2"><a></a>    island_mean <span class="ot">&lt;-</span></span>
-<span id="cb4-3"><a></a>        penguins<span class="sc">$</span>bill_depth_mm[penguins<span class="sc">$</span>island <span class="sc">==</span> this_island] <span class="sc">|&gt;</span></span>
-<span id="cb4-4"><a></a>        <span class="fu">mean</span>(<span class="at">na.rm =</span> <span class="cn">TRUE</span>) <span class="sc">|&gt;</span></span>
-<span id="cb4-5"><a></a>        <span class="fu">round</span>(<span class="at">digits =</span> <span class="dv">2</span>)</span>
-<span id="cb4-6"><a></a>    </span>
-<span id="cb4-7"><a></a>    <span class="fu">cat</span>(<span class="fu">paste</span>(<span class="st">"The mean bill depth on"</span>, this_island, <span class="st">"Island was"</span>, island_mean,</span>
-<span id="cb4-8"><a></a>                            <span class="st">"mm.</span><span class="sc">\n</span><span class="st">"</span>))</span>
-<span id="cb4-9"><a></a>}</span></code><button title="Copy to Clipboard" class="code-copy-button"><i class="bi"></i></button></pre></div>
+<div class="sourceCode cell-code" id="cb4" data-code-line-numbers="2-8"><pre class="sourceCode numberSource r number-lines code-with-copy"><code class="sourceCode r"><span id="cb4-1"><a href="#cb4-1"></a><span class="cf">for</span> (this_island <span class="cf">in</span> <span class="fu">levels</span>(penguins<span class="sc">$</span>island)) {</span>
+<span id="cb4-2"><a href="#cb4-2"></a>    island_mean <span class="ot">&lt;-</span></span>
+<span id="cb4-3"><a href="#cb4-3"></a>        penguins<span class="sc">$</span>bill_depth_mm[penguins<span class="sc">$</span>island <span class="sc">==</span> this_island] <span class="sc">|&gt;</span></span>
+<span id="cb4-4"><a href="#cb4-4"></a>        <span class="fu">mean</span>(<span class="at">na.rm =</span> <span class="cn">TRUE</span>) <span class="sc">|&gt;</span></span>
+<span id="cb4-5"><a href="#cb4-5"></a>        <span class="fu">round</span>(<span class="at">digits =</span> <span class="dv">2</span>)</span>
+<span id="cb4-6"><a href="#cb4-6"></a>    </span>
+<span id="cb4-7"><a href="#cb4-7"></a>    <span class="fu">cat</span>(<span class="fu">paste</span>(<span class="st">"The mean bill depth on"</span>, this_island, <span class="st">"Island was"</span>, island_mean,</span>
+<span id="cb4-8"><a href="#cb4-8"></a>                            <span class="st">"mm.</span><span class="sc">\n</span><span class="st">"</span>))</span>
+<span id="cb4-9"><a href="#cb4-9"></a>}</span></code><button title="Copy to Clipboard" class="code-copy-button"><i class="bi"></i></button></pre></div>
 </div>
 <p>The <strong>body</strong> of the loop contains code that will be repeated a number of times based on the header instructions. In <code>R</code>, the body has to be surrounded by curly braces.</p>
 </section>
 <section id="header-parts" class="slide level2">
 <h2>Header parts</h2>
 <div class="cell">
-<div class="sourceCode cell-code" id="cb5"><pre class="sourceCode numberSource r number-lines code-with-copy"><code class="sourceCode r"><span id="cb5-1"><a></a><span class="cf">for</span> (this_island <span class="cf">in</span> <span class="fu">levels</span>(penguins<span class="sc">$</span>island)) {...}</span></code><button title="Copy to Clipboard" class="code-copy-button"><i class="bi"></i></button></pre></div>
+<div class="sourceCode cell-code" id="cb5"><pre class="sourceCode numberSource r number-lines code-with-copy"><code class="sourceCode r"><span id="cb5-1"><a href="#cb5-1"></a><span class="cf">for</span> (this_island <span class="cf">in</span> <span class="fu">levels</span>(penguins<span class="sc">$</span>island)) {...}</span></code><button title="Copy to Clipboard" class="code-copy-button"><i class="bi"></i></button></pre></div>
 </div>
 <ul>
 <li><code>for</code>: keyword that declares we are doing a for loop.</li>
@@ -519,12 +485,12 @@ <h2>Header parts</h2>
 <section id="header-parts-1" class="slide level2">
 <h2>Header parts</h2>
 <div class="cell">
-<div class="sourceCode cell-code" id="cb6"><pre class="sourceCode numberSource r number-lines code-with-copy"><code class="sourceCode r"><span id="cb6-1"><a></a><span class="cf">for</span> (this_island <span class="cf">in</span> <span class="fu">levels</span>(penguins<span class="sc">$</span>island)) {...}</span></code><button title="Copy to Clipboard" class="code-copy-button"><i class="bi"></i></button></pre></div>
+<div class="sourceCode cell-code" id="cb6"><pre class="sourceCode numberSource r number-lines code-with-copy"><code class="sourceCode r"><span id="cb6-1"><a href="#cb6-1"></a><span class="cf">for</span> (this_island <span class="cf">in</span> <span class="fu">levels</span>(penguins<span class="sc">$</span>island)) {...}</span></code><button title="Copy to Clipboard" class="code-copy-button"><i class="bi"></i></button></pre></div>
 </div>
 <ul>
 <li>Since <code>levels(penguins$island)</code> evaluates to <code>c("Biscoe", "Dream", "Torgersen")</code>, our loop will repeat 3 times.</li>
 </ul>
-<table class="caption-top">
+<table>
 <thead>
 <tr class="header">
 <th>Iteration</th>
@@ -553,13 +519,13 @@ <h2>Header parts</h2>
 <section id="loop-iteration-1" class="slide level2">
 <h2>Loop iteration 1</h2>
 <div class="cell">
-<div class="sourceCode cell-code" id="cb7"><pre class="sourceCode numberSource r number-lines code-with-copy"><code class="sourceCode r"><span id="cb7-1"><a></a>island_mean <span class="ot">&lt;-</span></span>
-<span id="cb7-2"><a></a>    penguins<span class="sc">$</span>bill_depth_mm[penguins<span class="sc">$</span>island <span class="sc">==</span> <span class="st">"Biscoe"</span>] <span class="sc">|&gt;</span></span>
-<span id="cb7-3"><a></a>    <span class="fu">mean</span>(<span class="at">na.rm =</span> <span class="cn">TRUE</span>) <span class="sc">|&gt;</span></span>
-<span id="cb7-4"><a></a>    <span class="fu">round</span>(<span class="at">digits =</span> <span class="dv">2</span>)</span>
-<span id="cb7-5"><a></a></span>
-<span id="cb7-6"><a></a><span class="fu">cat</span>(<span class="fu">paste</span>(<span class="st">"The mean bill depth on"</span>, <span class="st">"Biscoe"</span>, <span class="st">"Island was"</span>, island_mean,</span>
-<span id="cb7-7"><a></a>                    <span class="st">"mm.</span><span class="sc">\n</span><span class="st">"</span>))</span></code><button title="Copy to Clipboard" class="code-copy-button"><i class="bi"></i></button></pre></div>
+<div class="sourceCode cell-code" id="cb7"><pre class="sourceCode numberSource r number-lines code-with-copy"><code class="sourceCode r"><span id="cb7-1"><a href="#cb7-1"></a>island_mean <span class="ot">&lt;-</span></span>
+<span id="cb7-2"><a href="#cb7-2"></a>    penguins<span class="sc">$</span>bill_depth_mm[penguins<span class="sc">$</span>island <span class="sc">==</span> <span class="st">"Biscoe"</span>] <span class="sc">|&gt;</span></span>
+<span id="cb7-3"><a href="#cb7-3"></a>    <span class="fu">mean</span>(<span class="at">na.rm =</span> <span class="cn">TRUE</span>) <span class="sc">|&gt;</span></span>
+<span id="cb7-4"><a href="#cb7-4"></a>    <span class="fu">round</span>(<span class="at">digits =</span> <span class="dv">2</span>)</span>
+<span id="cb7-5"><a href="#cb7-5"></a></span>
+<span id="cb7-6"><a href="#cb7-6"></a><span class="fu">cat</span>(<span class="fu">paste</span>(<span class="st">"The mean bill depth on"</span>, <span class="st">"Biscoe"</span>, <span class="st">"Island was"</span>, island_mean,</span>
+<span id="cb7-7"><a href="#cb7-7"></a>                    <span class="st">"mm.</span><span class="sc">\n</span><span class="st">"</span>))</span></code><button title="Copy to Clipboard" class="code-copy-button"><i class="bi"></i></button></pre></div>
 <div class="cell-output cell-output-stdout">
 <pre><code>The mean bill depth on Biscoe Island was 15.87 mm.</code></pre>
 </div>
@@ -568,13 +534,13 @@ <h2>Loop iteration 1</h2>
 <section id="loop-iteration-2" class="slide level2">
 <h2>Loop iteration 2</h2>
 <div class="cell">
-<div class="sourceCode cell-code" id="cb9"><pre class="sourceCode numberSource r number-lines code-with-copy"><code class="sourceCode r"><span id="cb9-1"><a></a>island_mean <span class="ot">&lt;-</span></span>
-<span id="cb9-2"><a></a>    penguins<span class="sc">$</span>bill_depth_mm[penguins<span class="sc">$</span>island <span class="sc">==</span> <span class="st">"Dream"</span>] <span class="sc">|&gt;</span></span>
-<span id="cb9-3"><a></a>    <span class="fu">mean</span>(<span class="at">na.rm =</span> <span class="cn">TRUE</span>) <span class="sc">|&gt;</span></span>
-<span id="cb9-4"><a></a>    <span class="fu">round</span>(<span class="at">digits =</span> <span class="dv">2</span>)</span>
-<span id="cb9-5"><a></a></span>
-<span id="cb9-6"><a></a><span class="fu">cat</span>(<span class="fu">paste</span>(<span class="st">"The mean bill depth on"</span>, <span class="st">"Dream"</span>, <span class="st">"Island was"</span>, island_mean,</span>
-<span id="cb9-7"><a></a>                    <span class="st">"mm.</span><span class="sc">\n</span><span class="st">"</span>))</span></code><button title="Copy to Clipboard" class="code-copy-button"><i class="bi"></i></button></pre></div>
+<div class="sourceCode cell-code" id="cb9"><pre class="sourceCode numberSource r number-lines code-with-copy"><code class="sourceCode r"><span id="cb9-1"><a href="#cb9-1"></a>island_mean <span class="ot">&lt;-</span></span>
+<span id="cb9-2"><a href="#cb9-2"></a>    penguins<span class="sc">$</span>bill_depth_mm[penguins<span class="sc">$</span>island <span class="sc">==</span> <span class="st">"Dream"</span>] <span class="sc">|&gt;</span></span>
+<span id="cb9-3"><a href="#cb9-3"></a>    <span class="fu">mean</span>(<span class="at">na.rm =</span> <span class="cn">TRUE</span>) <span class="sc">|&gt;</span></span>
+<span id="cb9-4"><a href="#cb9-4"></a>    <span class="fu">round</span>(<span class="at">digits =</span> <span class="dv">2</span>)</span>
+<span id="cb9-5"><a href="#cb9-5"></a></span>
+<span id="cb9-6"><a href="#cb9-6"></a><span class="fu">cat</span>(<span class="fu">paste</span>(<span class="st">"The mean bill depth on"</span>, <span class="st">"Dream"</span>, <span class="st">"Island was"</span>, island_mean,</span>
+<span id="cb9-7"><a href="#cb9-7"></a>                    <span class="st">"mm.</span><span class="sc">\n</span><span class="st">"</span>))</span></code><button title="Copy to Clipboard" class="code-copy-button"><i class="bi"></i></button></pre></div>
 <div class="cell-output cell-output-stdout">
 <pre><code>The mean bill depth on Dream Island was 18.34 mm.</code></pre>
 </div>
@@ -583,13 +549,13 @@ <h2>Loop iteration 2</h2>
 <section id="loop-iteration-3" class="slide level2">
 <h2>Loop iteration 3</h2>
 <div class="cell">
-<div class="sourceCode cell-code" id="cb11"><pre class="sourceCode numberSource r number-lines code-with-copy"><code class="sourceCode r"><span id="cb11-1"><a></a>island_mean <span class="ot">&lt;-</span></span>
-<span id="cb11-2"><a></a>    penguins<span class="sc">$</span>bill_depth_mm[penguins<span class="sc">$</span>island <span class="sc">==</span> <span class="st">"Torgersen"</span>] <span class="sc">|&gt;</span></span>
-<span id="cb11-3"><a></a>    <span class="fu">mean</span>(<span class="at">na.rm =</span> <span class="cn">TRUE</span>) <span class="sc">|&gt;</span></span>
-<span id="cb11-4"><a></a>    <span class="fu">round</span>(<span class="at">digits =</span> <span class="dv">2</span>)</span>
-<span id="cb11-5"><a></a></span>
-<span id="cb11-6"><a></a><span class="fu">cat</span>(<span class="fu">paste</span>(<span class="st">"The mean bill depth on"</span>, <span class="st">"Torgersen"</span>, <span class="st">"Island was"</span>, island_mean,</span>
-<span id="cb11-7"><a></a>                    <span class="st">"mm.</span><span class="sc">\n</span><span class="st">"</span>))</span></code><button title="Copy to Clipboard" class="code-copy-button"><i class="bi"></i></button></pre></div>
+<div class="sourceCode cell-code" id="cb11"><pre class="sourceCode numberSource r number-lines code-with-copy"><code class="sourceCode r"><span id="cb11-1"><a href="#cb11-1"></a>island_mean <span class="ot">&lt;-</span></span>
+<span id="cb11-2"><a href="#cb11-2"></a>    penguins<span class="sc">$</span>bill_depth_mm[penguins<span class="sc">$</span>island <span class="sc">==</span> <span class="st">"Torgersen"</span>] <span class="sc">|&gt;</span></span>
+<span id="cb11-3"><a href="#cb11-3"></a>    <span class="fu">mean</span>(<span class="at">na.rm =</span> <span class="cn">TRUE</span>) <span class="sc">|&gt;</span></span>
+<span id="cb11-4"><a href="#cb11-4"></a>    <span class="fu">round</span>(<span class="at">digits =</span> <span class="dv">2</span>)</span>
+<span id="cb11-5"><a href="#cb11-5"></a></span>
+<span id="cb11-6"><a href="#cb11-6"></a><span class="fu">cat</span>(<span class="fu">paste</span>(<span class="st">"The mean bill depth on"</span>, <span class="st">"Torgersen"</span>, <span class="st">"Island was"</span>, island_mean,</span>
+<span id="cb11-7"><a href="#cb11-7"></a>                    <span class="st">"mm.</span><span class="sc">\n</span><span class="st">"</span>))</span></code><button title="Copy to Clipboard" class="code-copy-button"><i class="bi"></i></button></pre></div>
 <div class="cell-output cell-output-stdout">
 <pre><code>The mean bill depth on Torgersen Island was 18.43 mm.</code></pre>
 </div>
@@ -598,15 +564,15 @@ <h2>Loop iteration 3</h2>
 <section id="the-loop-structure-automates-this-process-for-us-so-we-dont-have-to-copy-and-paste-our-code" class="slide level2">
 <h2>The loop structure automates this process for us so we don’t have to copy and paste our code!</h2>
 <div class="cell">
-<div class="sourceCode cell-code" id="cb13"><pre class="sourceCode numberSource r number-lines code-with-copy"><code class="sourceCode r"><span id="cb13-1"><a></a><span class="cf">for</span> (this_island <span class="cf">in</span> <span class="fu">levels</span>(penguins<span class="sc">$</span>island)) {</span>
-<span id="cb13-2"><a></a>    island_mean <span class="ot">&lt;-</span></span>
-<span id="cb13-3"><a></a>        penguins<span class="sc">$</span>bill_depth_mm[penguins<span class="sc">$</span>island <span class="sc">==</span> this_island] <span class="sc">|&gt;</span></span>
-<span id="cb13-4"><a></a>        <span class="fu">mean</span>(<span class="at">na.rm =</span> <span class="cn">TRUE</span>) <span class="sc">|&gt;</span></span>
-<span id="cb13-5"><a></a>        <span class="fu">round</span>(<span class="at">digits =</span> <span class="dv">2</span>)</span>
-<span id="cb13-6"><a></a>    </span>
-<span id="cb13-7"><a></a>    <span class="fu">cat</span>(<span class="fu">paste</span>(<span class="st">"The mean bill depth on"</span>, this_island, <span class="st">"Island was"</span>, island_mean,</span>
-<span id="cb13-8"><a></a>                            <span class="st">"mm.</span><span class="sc">\n</span><span class="st">"</span>))</span>
-<span id="cb13-9"><a></a>}</span></code><button title="Copy to Clipboard" class="code-copy-button"><i class="bi"></i></button></pre></div>
+<div class="sourceCode cell-code" id="cb13"><pre class="sourceCode numberSource r number-lines code-with-copy"><code class="sourceCode r"><span id="cb13-1"><a href="#cb13-1"></a><span class="cf">for</span> (this_island <span class="cf">in</span> <span class="fu">levels</span>(penguins<span class="sc">$</span>island)) {</span>
+<span id="cb13-2"><a href="#cb13-2"></a>    island_mean <span class="ot">&lt;-</span></span>
+<span id="cb13-3"><a href="#cb13-3"></a>        penguins<span class="sc">$</span>bill_depth_mm[penguins<span class="sc">$</span>island <span class="sc">==</span> this_island] <span class="sc">|&gt;</span></span>
+<span id="cb13-4"><a href="#cb13-4"></a>        <span class="fu">mean</span>(<span class="at">na.rm =</span> <span class="cn">TRUE</span>) <span class="sc">|&gt;</span></span>
+<span id="cb13-5"><a href="#cb13-5"></a>        <span class="fu">round</span>(<span class="at">digits =</span> <span class="dv">2</span>)</span>
+<span id="cb13-6"><a href="#cb13-6"></a>    </span>
+<span id="cb13-7"><a href="#cb13-7"></a>    <span class="fu">cat</span>(<span class="fu">paste</span>(<span class="st">"The mean bill depth on"</span>, this_island, <span class="st">"Island was"</span>, island_mean,</span>
+<span id="cb13-8"><a href="#cb13-8"></a>                            <span class="st">"mm.</span><span class="sc">\n</span><span class="st">"</span>))</span>
+<span id="cb13-9"><a href="#cb13-9"></a>}</span></code><button title="Copy to Clipboard" class="code-copy-button"><i class="bi"></i></button></pre></div>
 <div class="cell-output cell-output-stdout">
 <pre><code>The mean bill depth on Biscoe Island was 15.87 mm.
 The mean bill depth on Dream Island was 18.34 mm.
@@ -636,9 +602,9 @@ <h2>You try it!</h2>
 <p>Write a loop that goes from 1 to 10, squares each of the numbers, and prints the squared number.</p>
 <div class="fragment">
 <div class="cell">
-<div class="sourceCode cell-code" id="cb15"><pre class="sourceCode numberSource r number-lines code-with-copy"><code class="sourceCode r"><span id="cb15-1"><a></a><span class="cf">for</span> (i <span class="cf">in</span> <span class="dv">1</span><span class="sc">:</span><span class="dv">10</span>) {</span>
-<span id="cb15-2"><a></a>    <span class="fu">cat</span>(i <span class="sc">^</span> <span class="dv">2</span>, <span class="st">"</span><span class="sc">\n</span><span class="st">"</span>)</span>
-<span id="cb15-3"><a></a>}</span></code><button title="Copy to Clipboard" class="code-copy-button"><i class="bi"></i></button></pre></div>
+<div class="sourceCode cell-code" id="cb15"><pre class="sourceCode numberSource r number-lines code-with-copy"><code class="sourceCode r"><span id="cb15-1"><a href="#cb15-1"></a><span class="cf">for</span> (i <span class="cf">in</span> <span class="dv">1</span><span class="sc">:</span><span class="dv">10</span>) {</span>
+<span id="cb15-2"><a href="#cb15-2"></a>    <span class="fu">cat</span>(i <span class="sc">^</span> <span class="dv">2</span>, <span class="st">"</span><span class="sc">\n</span><span class="st">"</span>)</span>
+<span id="cb15-3"><a href="#cb15-3"></a>}</span></code><button title="Copy to Clipboard" class="code-copy-button"><i class="bi"></i></button></pre></div>
 <div class="cell-output cell-output-stdout">
 <pre><code>1 
 4 
@@ -670,8 +636,8 @@ <h2>Wait, did we need to do that?</h2>
 <li>Almost all basic operations in R are <strong>vectorized</strong>: they work on a vector of arguments all at the same time.</li>
 </ul>
 <div class="cell">
-<div class="sourceCode cell-code" id="cb17"><pre class="sourceCode numberSource r number-lines code-with-copy"><code class="sourceCode r"><span id="cb17-1"><a></a><span class="co"># No loop needed!</span></span>
-<span id="cb17-2"><a></a>(<span class="dv">1</span><span class="sc">:</span><span class="dv">10</span>)<span class="sc">^</span><span class="dv">2</span></span></code><button title="Copy to Clipboard" class="code-copy-button"><i class="bi"></i></button></pre></div>
+<div class="sourceCode cell-code" id="cb17"><pre class="sourceCode numberSource r number-lines code-with-copy"><code class="sourceCode r"><span id="cb17-1"><a href="#cb17-1"></a><span class="co"># No loop needed!</span></span>
+<span id="cb17-2"><a href="#cb17-2"></a>(<span class="dv">1</span><span class="sc">:</span><span class="dv">10</span>)<span class="sc">^</span><span class="dv">2</span></span></code><button title="Copy to Clipboard" class="code-copy-button"><i class="bi"></i></button></pre></div>
 <div class="cell-output cell-output-stdout">
 <pre><code> [1]   1   4   9  16  25  36  49  64  81 100</code></pre>
 </div>
@@ -685,15 +651,15 @@ <h2>Wait, did we need to do that?</h2>
 <li>Almost all basic operations in R are <strong>vectorized</strong>: they work on a vector of arguments all at the same time.</li>
 </ul>
 <div class="cell">
-<div class="sourceCode cell-code" id="cb19"><pre class="sourceCode numberSource r number-lines code-with-copy"><code class="sourceCode r"><span id="cb19-1"><a></a><span class="co"># No loop needed!</span></span>
-<span id="cb19-2"><a></a>(<span class="dv">1</span><span class="sc">:</span><span class="dv">10</span>)<span class="sc">^</span><span class="dv">2</span></span></code><button title="Copy to Clipboard" class="code-copy-button"><i class="bi"></i></button></pre></div>
+<div class="sourceCode cell-code" id="cb19"><pre class="sourceCode numberSource r number-lines code-with-copy"><code class="sourceCode r"><span id="cb19-1"><a href="#cb19-1"></a><span class="co"># No loop needed!</span></span>
+<span id="cb19-2"><a href="#cb19-2"></a>(<span class="dv">1</span><span class="sc">:</span><span class="dv">10</span>)<span class="sc">^</span><span class="dv">2</span></span></code><button title="Copy to Clipboard" class="code-copy-button"><i class="bi"></i></button></pre></div>
 <div class="cell-output cell-output-stdout">
 <pre><code> [1]   1   4   9  16  25  36  49  64  81 100</code></pre>
 </div>
 </div>
 <div class="cell">
-<div class="sourceCode cell-code" id="cb21"><pre class="sourceCode numberSource r number-lines code-with-copy"><code class="sourceCode r"><span id="cb21-1"><a></a><span class="co"># Get the first 10 odd numbers, a common CS 101 loop problem on exams</span></span>
-<span id="cb21-2"><a></a>(<span class="dv">1</span><span class="sc">:</span><span class="dv">20</span>)[<span class="fu">which</span>((<span class="dv">1</span><span class="sc">:</span><span class="dv">20</span> <span class="sc">%%</span> <span class="dv">2</span>) <span class="sc">==</span> <span class="dv">1</span>)]</span></code><button title="Copy to Clipboard" class="code-copy-button"><i class="bi"></i></button></pre></div>
+<div class="sourceCode cell-code" id="cb21"><pre class="sourceCode numberSource r number-lines code-with-copy"><code class="sourceCode r"><span id="cb21-1"><a href="#cb21-1"></a><span class="co"># Get the first 10 odd numbers, a common CS 101 loop problem on exams</span></span>
+<span id="cb21-2"><a href="#cb21-2"></a>(<span class="dv">1</span><span class="sc">:</span><span class="dv">20</span>)[<span class="fu">which</span>((<span class="dv">1</span><span class="sc">:</span><span class="dv">20</span> <span class="sc">%%</span> <span class="dv">2</span>) <span class="sc">==</span> <span class="dv">1</span>)]</span></code><button title="Copy to Clipboard" class="code-copy-button"><i class="bi"></i></button></pre></div>
 <div class="cell-output cell-output-stdout">
 <pre><code> [1]  1  3  5  7  9 11 13 15 17 19</code></pre>
 </div>
@@ -710,9 +676,9 @@ <h2>Loop walkthrough</h2>
 </ul>
 <div class="fragment">
 <div class="cell">
-<div class="sourceCode cell-code" id="cb23"><pre class="sourceCode numberSource r number-lines code-with-copy"><code class="sourceCode r"><span id="cb23-1"><a></a>meas <span class="ot">&lt;-</span> <span class="fu">readRDS</span>(here<span class="sc">::</span><span class="fu">here</span>(<span class="st">"data"</span>, <span class="st">"measles_final.Rds"</span>)) <span class="sc">|&gt;</span></span>
-<span id="cb23-2"><a></a>    <span class="fu">subset</span>(vaccine_antigen <span class="sc">==</span> <span class="st">"MCV1"</span>)</span>
-<span id="cb23-3"><a></a><span class="fu">str</span>(meas)</span></code><button title="Copy to Clipboard" class="code-copy-button"><i class="bi"></i></button></pre></div>
+<div class="sourceCode cell-code" id="cb23"><pre class="sourceCode numberSource r number-lines code-with-copy"><code class="sourceCode r"><span id="cb23-1"><a href="#cb23-1"></a>meas <span class="ot">&lt;-</span> <span class="fu">readRDS</span>(here<span class="sc">::</span><span class="fu">here</span>(<span class="st">"data"</span>, <span class="st">"measles_final.Rds"</span>)) <span class="sc">|&gt;</span></span>
+<span id="cb23-2"><a href="#cb23-2"></a>    <span class="fu">subset</span>(vaccine_antigen <span class="sc">==</span> <span class="st">"MCV1"</span>)</span>
+<span id="cb23-3"><a href="#cb23-3"></a><span class="fu">str</span>(meas)</span></code><button title="Copy to Clipboard" class="code-copy-button"><i class="bi"></i></button></pre></div>
 <div class="cell-output cell-output-stdout">
 <pre><code>'data.frame':   7972 obs. of  7 variables:
  $ iso3c           : chr  "AFG" "AFG" "AFG" "AFG" ...
@@ -733,7 +699,7 @@ <h2>Loop walkthrough</h2>
 </ul>
 <div class="fragment">
 <div class="cell">
-<div class="sourceCode cell-code" id="cb25"><pre class="sourceCode numberSource r number-lines code-with-copy"><code class="sourceCode r"><span id="cb25-1"><a></a>res <span class="ot">&lt;-</span> <span class="fu">vector</span>(<span class="at">mode =</span> <span class="st">"list"</span>, <span class="at">length =</span> <span class="fu">length</span>(<span class="fu">unique</span>(meas<span class="sc">$</span>country)))</span></code><button title="Copy to Clipboard" class="code-copy-button"><i class="bi"></i></button></pre></div>
+<div class="sourceCode cell-code" id="cb25"><pre class="sourceCode numberSource r number-lines code-with-copy"><code class="sourceCode r"><span id="cb25-1"><a href="#cb25-1"></a>res <span class="ot">&lt;-</span> <span class="fu">vector</span>(<span class="at">mode =</span> <span class="st">"list"</span>, <span class="at">length =</span> <span class="fu">length</span>(<span class="fu">unique</span>(meas<span class="sc">$</span>country)))</span></code><button title="Copy to Clipboard" class="code-copy-button"><i class="bi"></i></button></pre></div>
 </div>
 <ul>
 <li>This is called <em>preallocation</em> and it can make your loops much faster.</li>
@@ -748,8 +714,8 @@ <h2>Loop walkthrough</h2>
 </ul>
 <div class="fragment">
 <div class="cell">
-<div class="sourceCode cell-code" id="cb26"><pre class="sourceCode numberSource r number-lines code-with-copy"><code class="sourceCode r"><span id="cb26-1"><a></a>countries <span class="ot">&lt;-</span> <span class="fu">unique</span>(meas<span class="sc">$</span>country)</span>
-<span id="cb26-2"><a></a><span class="cf">for</span> (i <span class="cf">in</span> <span class="dv">1</span><span class="sc">:</span><span class="fu">length</span>(countries)) {...}</span></code><button title="Copy to Clipboard" class="code-copy-button"><i class="bi"></i></button></pre></div>
+<div class="sourceCode cell-code" id="cb26"><pre class="sourceCode numberSource r number-lines code-with-copy"><code class="sourceCode r"><span id="cb26-1"><a href="#cb26-1"></a>countries <span class="ot">&lt;-</span> <span class="fu">unique</span>(meas<span class="sc">$</span>country)</span>
+<span id="cb26-2"><a href="#cb26-2"></a><span class="cf">for</span> (i <span class="cf">in</span> <span class="dv">1</span><span class="sc">:</span><span class="fu">length</span>(countries)) {...}</span></code><button title="Copy to Clipboard" class="code-copy-button"><i class="bi"></i></button></pre></div>
 </div>
 </div>
 <div class="fragment">
@@ -765,10 +731,10 @@ <h2>Loop walkthrough</h2>
 </ul>
 <div class="fragment">
 <div class="cell">
-<div class="sourceCode cell-code" id="cb27"><pre class="sourceCode numberSource r number-lines code-with-copy"><code class="sourceCode r"><span id="cb27-1"><a></a><span class="cf">for</span> (i <span class="cf">in</span> <span class="dv">1</span><span class="sc">:</span><span class="fu">length</span>(countries)) {</span>
-<span id="cb27-2"><a></a>    <span class="co"># Get the data for the current country only</span></span>
-<span id="cb27-3"><a></a>    country_data <span class="ot">&lt;-</span> <span class="fu">subset</span>(meas, country <span class="sc">==</span> countries[i])</span>
-<span id="cb27-4"><a></a>}</span></code><button title="Copy to Clipboard" class="code-copy-button"><i class="bi"></i></button></pre></div>
+<div class="sourceCode cell-code" id="cb27"><pre class="sourceCode numberSource r number-lines code-with-copy"><code class="sourceCode r"><span id="cb27-1"><a href="#cb27-1"></a><span class="cf">for</span> (i <span class="cf">in</span> <span class="dv">1</span><span class="sc">:</span><span class="fu">length</span>(countries)) {</span>
+<span id="cb27-2"><a href="#cb27-2"></a>    <span class="co"># Get the data for the current country only</span></span>
+<span id="cb27-3"><a href="#cb27-3"></a>    country_data <span class="ot">&lt;-</span> <span class="fu">subset</span>(meas, country <span class="sc">==</span> countries[i])</span>
+<span id="cb27-4"><a href="#cb27-4"></a>}</span></code><button title="Copy to Clipboard" class="code-copy-button"><i class="bi"></i></button></pre></div>
 </div>
 </div>
 <div class="fragment">
@@ -778,16 +744,16 @@ <h2>Loop walkthrough</h2>
 </div>
 <div class="fragment">
 <div class="cell">
-<div class="sourceCode cell-code" id="cb28"><pre class="sourceCode numberSource r number-lines code-with-copy"><code class="sourceCode r"><span id="cb28-1"><a></a><span class="cf">for</span> (i <span class="cf">in</span> <span class="dv">1</span><span class="sc">:</span><span class="fu">length</span>(countries)) {</span>
-<span id="cb28-2"><a></a>    <span class="co"># Get the data for the current country only</span></span>
-<span id="cb28-3"><a></a>    country_data <span class="ot">&lt;-</span> <span class="fu">subset</span>(meas, country <span class="sc">==</span> countries[i])</span>
-<span id="cb28-4"><a></a>    </span>
-<span id="cb28-5"><a></a>    <span class="co"># Get the summary statistics for this country</span></span>
-<span id="cb28-6"><a></a>    country_cases <span class="ot">&lt;-</span> country_data<span class="sc">$</span>Cases</span>
-<span id="cb28-7"><a></a>    country_med <span class="ot">&lt;-</span> <span class="fu">median</span>(country_cases, <span class="at">na.rm =</span> <span class="cn">TRUE</span>)</span>
-<span id="cb28-8"><a></a>    country_iqr <span class="ot">&lt;-</span> <span class="fu">IQR</span>(country_cases, <span class="at">na.rm =</span> <span class="cn">TRUE</span>)</span>
-<span id="cb28-9"><a></a>    country_range <span class="ot">&lt;-</span> <span class="fu">range</span>(country_cases, <span class="at">na.rm =</span> <span class="cn">TRUE</span>)</span>
-<span id="cb28-10"><a></a>}</span></code><button title="Copy to Clipboard" class="code-copy-button"><i class="bi"></i></button></pre></div>
+<div class="sourceCode cell-code" id="cb28"><pre class="sourceCode numberSource r number-lines code-with-copy"><code class="sourceCode r"><span id="cb28-1"><a href="#cb28-1"></a><span class="cf">for</span> (i <span class="cf">in</span> <span class="dv">1</span><span class="sc">:</span><span class="fu">length</span>(countries)) {</span>
+<span id="cb28-2"><a href="#cb28-2"></a>    <span class="co"># Get the data for the current country only</span></span>
+<span id="cb28-3"><a href="#cb28-3"></a>    country_data <span class="ot">&lt;-</span> <span class="fu">subset</span>(meas, country <span class="sc">==</span> countries[i])</span>
+<span id="cb28-4"><a href="#cb28-4"></a>    </span>
+<span id="cb28-5"><a href="#cb28-5"></a>    <span class="co"># Get the summary statistics for this country</span></span>
+<span id="cb28-6"><a href="#cb28-6"></a>    country_cases <span class="ot">&lt;-</span> country_data<span class="sc">$</span>Cases</span>
+<span id="cb28-7"><a href="#cb28-7"></a>    country_med <span class="ot">&lt;-</span> <span class="fu">median</span>(country_cases, <span class="at">na.rm =</span> <span class="cn">TRUE</span>)</span>
+<span id="cb28-8"><a href="#cb28-8"></a>    country_iqr <span class="ot">&lt;-</span> <span class="fu">IQR</span>(country_cases, <span class="at">na.rm =</span> <span class="cn">TRUE</span>)</span>
+<span id="cb28-9"><a href="#cb28-9"></a>    country_range <span class="ot">&lt;-</span> <span class="fu">range</span>(country_cases, <span class="at">na.rm =</span> <span class="cn">TRUE</span>)</span>
+<span id="cb28-10"><a href="#cb28-10"></a>}</span></code><button title="Copy to Clipboard" class="code-copy-button"><i class="bi"></i></button></pre></div>
 </div>
 </div>
 <div class="fragment">
@@ -795,27 +761,27 @@ <h2>Loop walkthrough</h2>
 <li>Next we save the summary statistics into a data frame.</li>
 </ul>
 <div class="cell">
-<div class="sourceCode cell-code" id="cb29"><pre class="sourceCode numberSource r number-lines code-with-copy"><code class="sourceCode r"><span id="cb29-1"><a></a><span class="cf">for</span> (i <span class="cf">in</span> <span class="dv">1</span><span class="sc">:</span><span class="fu">length</span>(countries)) {</span>
-<span id="cb29-2"><a></a>    <span class="co"># Get the data for the current country only</span></span>
-<span id="cb29-3"><a></a>    country_data <span class="ot">&lt;-</span> <span class="fu">subset</span>(meas, country <span class="sc">==</span> countries[i])</span>
-<span id="cb29-4"><a></a>    </span>
-<span id="cb29-5"><a></a>    <span class="co"># Get the summary statistics for this country</span></span>
-<span id="cb29-6"><a></a>    country_cases <span class="ot">&lt;-</span> country_data<span class="sc">$</span>Cases</span>
-<span id="cb29-7"><a></a>    country_quart <span class="ot">&lt;-</span> <span class="fu">quantile</span>(</span>
-<span id="cb29-8"><a></a>        country_cases, <span class="at">na.rm =</span> <span class="cn">TRUE</span>, <span class="at">probs =</span> <span class="fu">c</span>(<span class="fl">0.25</span>, <span class="fl">0.5</span>, <span class="fl">0.75</span>)</span>
-<span id="cb29-9"><a></a>    )</span>
-<span id="cb29-10"><a></a>    country_range <span class="ot">&lt;-</span> <span class="fu">range</span>(country_cases, <span class="at">na.rm =</span> <span class="cn">TRUE</span>)</span>
-<span id="cb29-11"><a></a>    </span>
-<span id="cb29-12"><a></a>    <span class="co"># Save the summary statistics into a data frame</span></span>
-<span id="cb29-13"><a></a>    country_summary <span class="ot">&lt;-</span> <span class="fu">data.frame</span>(</span>
-<span id="cb29-14"><a></a>        <span class="at">country =</span> countries[[i]],</span>
-<span id="cb29-15"><a></a>        <span class="at">min =</span> country_range[[<span class="dv">1</span>]],</span>
-<span id="cb29-16"><a></a>        <span class="at">Q1 =</span> country_quart[[<span class="dv">1</span>]],</span>
-<span id="cb29-17"><a></a>        <span class="at">median =</span> country_quart[[<span class="dv">2</span>]],</span>
-<span id="cb29-18"><a></a>        <span class="at">Q3 =</span> country_quart[[<span class="dv">3</span>]],</span>
-<span id="cb29-19"><a></a>        <span class="at">max =</span> country_range[[<span class="dv">2</span>]]</span>
-<span id="cb29-20"><a></a>    )</span>
-<span id="cb29-21"><a></a>}</span></code><button title="Copy to Clipboard" class="code-copy-button"><i class="bi"></i></button></pre></div>
+<div class="sourceCode cell-code" id="cb29"><pre class="sourceCode numberSource r number-lines code-with-copy"><code class="sourceCode r"><span id="cb29-1"><a href="#cb29-1"></a><span class="cf">for</span> (i <span class="cf">in</span> <span class="dv">1</span><span class="sc">:</span><span class="fu">length</span>(countries)) {</span>
+<span id="cb29-2"><a href="#cb29-2"></a>    <span class="co"># Get the data for the current country only</span></span>
+<span id="cb29-3"><a href="#cb29-3"></a>    country_data <span class="ot">&lt;-</span> <span class="fu">subset</span>(meas, country <span class="sc">==</span> countries[i])</span>
+<span id="cb29-4"><a href="#cb29-4"></a>    </span>
+<span id="cb29-5"><a href="#cb29-5"></a>    <span class="co"># Get the summary statistics for this country</span></span>
+<span id="cb29-6"><a href="#cb29-6"></a>    country_cases <span class="ot">&lt;-</span> country_data<span class="sc">$</span>Cases</span>
+<span id="cb29-7"><a href="#cb29-7"></a>    country_quart <span class="ot">&lt;-</span> <span class="fu">quantile</span>(</span>
+<span id="cb29-8"><a href="#cb29-8"></a>        country_cases, <span class="at">na.rm =</span> <span class="cn">TRUE</span>, <span class="at">probs =</span> <span class="fu">c</span>(<span class="fl">0.25</span>, <span class="fl">0.5</span>, <span class="fl">0.75</span>)</span>
+<span id="cb29-9"><a href="#cb29-9"></a>    )</span>
+<span id="cb29-10"><a href="#cb29-10"></a>    country_range <span class="ot">&lt;-</span> <span class="fu">range</span>(country_cases, <span class="at">na.rm =</span> <span class="cn">TRUE</span>)</span>
+<span id="cb29-11"><a href="#cb29-11"></a>    </span>
+<span id="cb29-12"><a href="#cb29-12"></a>    <span class="co"># Save the summary statistics into a data frame</span></span>
+<span id="cb29-13"><a href="#cb29-13"></a>    country_summary <span class="ot">&lt;-</span> <span class="fu">data.frame</span>(</span>
+<span id="cb29-14"><a href="#cb29-14"></a>        <span class="at">country =</span> countries[[i]],</span>
+<span id="cb29-15"><a href="#cb29-15"></a>        <span class="at">min =</span> country_range[[<span class="dv">1</span>]],</span>
+<span id="cb29-16"><a href="#cb29-16"></a>        <span class="at">Q1 =</span> country_quart[[<span class="dv">1</span>]],</span>
+<span id="cb29-17"><a href="#cb29-17"></a>        <span class="at">median =</span> country_quart[[<span class="dv">2</span>]],</span>
+<span id="cb29-18"><a href="#cb29-18"></a>        <span class="at">Q3 =</span> country_quart[[<span class="dv">3</span>]],</span>
+<span id="cb29-19"><a href="#cb29-19"></a>        <span class="at">max =</span> country_range[[<span class="dv">2</span>]]</span>
+<span id="cb29-20"><a href="#cb29-20"></a>    )</span>
+<span id="cb29-21"><a href="#cb29-21"></a>}</span></code><button title="Copy to Clipboard" class="code-copy-button"><i class="bi"></i></button></pre></div>
 </div>
 </div>
 <div class="fragment">
@@ -823,30 +789,30 @@ <h2>Loop walkthrough</h2>
 <li>And finally, we save the data frame as the next element in our storage list.</li>
 </ul>
 <div class="cell">
-<div class="sourceCode cell-code" id="cb30"><pre class="sourceCode numberSource r number-lines code-with-copy"><code class="sourceCode r"><span id="cb30-1"><a></a><span class="cf">for</span> (i <span class="cf">in</span> <span class="dv">1</span><span class="sc">:</span><span class="fu">length</span>(countries)) {</span>
-<span id="cb30-2"><a></a>    <span class="co"># Get the data for the current country only</span></span>
-<span id="cb30-3"><a></a>    country_data <span class="ot">&lt;-</span> <span class="fu">subset</span>(meas, country <span class="sc">==</span> countries[i])</span>
-<span id="cb30-4"><a></a>    </span>
-<span id="cb30-5"><a></a>    <span class="co"># Get the summary statistics for this country</span></span>
-<span id="cb30-6"><a></a>    country_cases <span class="ot">&lt;-</span> country_data<span class="sc">$</span>Cases</span>
-<span id="cb30-7"><a></a>    country_quart <span class="ot">&lt;-</span> <span class="fu">quantile</span>(</span>
-<span id="cb30-8"><a></a>        country_cases, <span class="at">na.rm =</span> <span class="cn">TRUE</span>, <span class="at">probs =</span> <span class="fu">c</span>(<span class="fl">0.25</span>, <span class="fl">0.5</span>, <span class="fl">0.75</span>)</span>
-<span id="cb30-9"><a></a>    )</span>
-<span id="cb30-10"><a></a>    country_range <span class="ot">&lt;-</span> <span class="fu">range</span>(country_cases, <span class="at">na.rm =</span> <span class="cn">TRUE</span>)</span>
-<span id="cb30-11"><a></a>    </span>
-<span id="cb30-12"><a></a>    <span class="co"># Save the summary statistics into a data frame</span></span>
-<span id="cb30-13"><a></a>    country_summary <span class="ot">&lt;-</span> <span class="fu">data.frame</span>(</span>
-<span id="cb30-14"><a></a>        <span class="at">country =</span> countries[[i]],</span>
-<span id="cb30-15"><a></a>        <span class="at">min =</span> country_range[[<span class="dv">1</span>]],</span>
-<span id="cb30-16"><a></a>        <span class="at">Q1 =</span> country_quart[[<span class="dv">1</span>]],</span>
-<span id="cb30-17"><a></a>        <span class="at">median =</span> country_quart[[<span class="dv">2</span>]],</span>
-<span id="cb30-18"><a></a>        <span class="at">Q3 =</span> country_quart[[<span class="dv">3</span>]],</span>
-<span id="cb30-19"><a></a>        <span class="at">max =</span> country_range[[<span class="dv">2</span>]]</span>
-<span id="cb30-20"><a></a>    )</span>
-<span id="cb30-21"><a></a>    </span>
-<span id="cb30-22"><a></a>    <span class="co"># Save the results to our container</span></span>
-<span id="cb30-23"><a></a>    res[[i]] <span class="ot">&lt;-</span> country_summary</span>
-<span id="cb30-24"><a></a>}</span></code><button title="Copy to Clipboard" class="code-copy-button"><i class="bi"></i></button></pre></div>
+<div class="sourceCode cell-code" id="cb30"><pre class="sourceCode numberSource r number-lines code-with-copy"><code class="sourceCode r"><span id="cb30-1"><a href="#cb30-1"></a><span class="cf">for</span> (i <span class="cf">in</span> <span class="dv">1</span><span class="sc">:</span><span class="fu">length</span>(countries)) {</span>
+<span id="cb30-2"><a href="#cb30-2"></a>    <span class="co"># Get the data for the current country only</span></span>
+<span id="cb30-3"><a href="#cb30-3"></a>    country_data <span class="ot">&lt;-</span> <span class="fu">subset</span>(meas, country <span class="sc">==</span> countries[i])</span>
+<span id="cb30-4"><a href="#cb30-4"></a>    </span>
+<span id="cb30-5"><a href="#cb30-5"></a>    <span class="co"># Get the summary statistics for this country</span></span>
+<span id="cb30-6"><a href="#cb30-6"></a>    country_cases <span class="ot">&lt;-</span> country_data<span class="sc">$</span>Cases</span>
+<span id="cb30-7"><a href="#cb30-7"></a>    country_quart <span class="ot">&lt;-</span> <span class="fu">quantile</span>(</span>
+<span id="cb30-8"><a href="#cb30-8"></a>        country_cases, <span class="at">na.rm =</span> <span class="cn">TRUE</span>, <span class="at">probs =</span> <span class="fu">c</span>(<span class="fl">0.25</span>, <span class="fl">0.5</span>, <span class="fl">0.75</span>)</span>
+<span id="cb30-9"><a href="#cb30-9"></a>    )</span>
+<span id="cb30-10"><a href="#cb30-10"></a>    country_range <span class="ot">&lt;-</span> <span class="fu">range</span>(country_cases, <span class="at">na.rm =</span> <span class="cn">TRUE</span>)</span>
+<span id="cb30-11"><a href="#cb30-11"></a>    </span>
+<span id="cb30-12"><a href="#cb30-12"></a>    <span class="co"># Save the summary statistics into a data frame</span></span>
+<span id="cb30-13"><a href="#cb30-13"></a>    country_summary <span class="ot">&lt;-</span> <span class="fu">data.frame</span>(</span>
+<span id="cb30-14"><a href="#cb30-14"></a>        <span class="at">country =</span> countries[[i]],</span>
+<span id="cb30-15"><a href="#cb30-15"></a>        <span class="at">min =</span> country_range[[<span class="dv">1</span>]],</span>
+<span id="cb30-16"><a href="#cb30-16"></a>        <span class="at">Q1 =</span> country_quart[[<span class="dv">1</span>]],</span>
+<span id="cb30-17"><a href="#cb30-17"></a>        <span class="at">median =</span> country_quart[[<span class="dv">2</span>]],</span>
+<span id="cb30-18"><a href="#cb30-18"></a>        <span class="at">Q3 =</span> country_quart[[<span class="dv">3</span>]],</span>
+<span id="cb30-19"><a href="#cb30-19"></a>        <span class="at">max =</span> country_range[[<span class="dv">2</span>]]</span>
+<span id="cb30-20"><a href="#cb30-20"></a>    )</span>
+<span id="cb30-21"><a href="#cb30-21"></a>    </span>
+<span id="cb30-22"><a href="#cb30-22"></a>    <span class="co"># Save the results to our container</span></span>
+<span id="cb30-23"><a href="#cb30-23"></a>    res[[i]] <span class="ot">&lt;-</span> country_summary</span>
+<span id="cb30-24"><a href="#cb30-24"></a>}</span></code><button title="Copy to Clipboard" class="code-copy-button"><i class="bi"></i></button></pre></div>
 <div class="cell-output cell-output-stderr">
 <pre><code>Warning in min(x): no non-missing arguments to min; returning Inf</code></pre>
 </div>
@@ -872,7 +838,7 @@ <h2>Loop walkthrough</h2>
 <li>Let’s take a look at the results.</li>
 </ul>
 <div class="cell">
-<div class="sourceCode cell-code" id="cb37"><pre class="sourceCode numberSource r number-lines code-with-copy"><code class="sourceCode r"><span id="cb37-1"><a></a><span class="fu">head</span>(res)</span></code><button title="Copy to Clipboard" class="code-copy-button"><i class="bi"></i></button></pre></div>
+<div class="sourceCode cell-code" id="cb37"><pre class="sourceCode numberSource r number-lines code-with-copy"><code class="sourceCode r"><span id="cb37-1"><a href="#cb37-1"></a><span class="fu">head</span>(res)</span></code><button title="Copy to Clipboard" class="code-copy-button"><i class="bi"></i></button></pre></div>
 <div class="cell-output cell-output-stdout">
 <pre><code>[[1]]
       country min   Q1 median   Q3   max
@@ -908,10 +874,10 @@ <h2>Loop walkthrough</h2>
 <li>We can use a <em>vectorization</em> trick: the function <code>do.call()</code> seems like ancient computer science magic. And it is. But it will actually help us a lot.</li>
 </ul>
 <div class="cell">
-<div class="sourceCode cell-code" id="cb39"><pre class="sourceCode numberSource r number-lines code-with-copy"><code class="sourceCode r"><span id="cb39-1"><a></a>res_df <span class="ot">&lt;-</span> <span class="fu">do.call</span>(rbind, res)</span>
-<span id="cb39-2"><a></a><span class="fu">head</span>(res_df)</span></code><button title="Copy to Clipboard" class="code-copy-button"><i class="bi"></i></button></pre></div>
+<div class="sourceCode cell-code" id="cb39"><pre class="sourceCode numberSource r number-lines code-with-copy"><code class="sourceCode r"><span id="cb39-1"><a href="#cb39-1"></a>res_df <span class="ot">&lt;-</span> <span class="fu">do.call</span>(rbind, res)</span>
+<span id="cb39-2"><a href="#cb39-2"></a><span class="fu">head</span>(res_df)</span></code><button title="Copy to Clipboard" class="code-copy-button"><i class="bi"></i></button></pre></div>
 <div class="cell-output-display">
-<table class="caption-top">
+<table>
 <thead>
 <tr class="header">
 <th style="text-align: left;">country</th>
@@ -981,7 +947,7 @@ <h2>Loop walkthrough</h2>
 </div>
 <div class="fragment">
 <div class="cell">
-<div class="sourceCode cell-code" id="cb40"><pre class="sourceCode numberSource r number-lines code-with-copy"><code class="sourceCode r"><span id="cb40-1"><a></a>?rbind</span></code><button title="Copy to Clipboard" class="code-copy-button"><i class="bi"></i></button></pre></div>
+<div class="sourceCode cell-code" id="cb40"><pre class="sourceCode numberSource r number-lines code-with-copy"><code class="sourceCode r"><span id="cb40-1"><a href="#cb40-1"></a>?rbind</span></code><button title="Copy to Clipboard" class="code-copy-button"><i class="bi"></i></button></pre></div>
 <div class="cell-output cell-output-stdout">
 <pre><code>Combine R Objects by Rows or Columns
 
@@ -1115,8 +1081,8 @@ <h2>Loop walkthrough</h2>
      Factors have their levels expanded as necessary (in the order of
      the levels of the level sets of the factors encountered) and the
      result is an ordered factor if and only if all the components were
-     ordered factors.  Old-style categories (integer vectors with
-     levels) are promoted to factors.
+     ordered factors.  (The last point differs from S-PLUS.)  Old-style
+     categories (integer vectors with levels) are promoted to factors.
 
      Note that for result column 'j', 'factor(., exclude = X(j))' is
      applied, where
@@ -1200,7 +1166,7 @@ <h2>Loop walkthrough</h2>
 </div>
 <div class="fragment">
 <div class="cell">
-<div class="sourceCode cell-code" id="cb42"><pre class="sourceCode numberSource r number-lines code-with-copy"><code class="sourceCode r"><span id="cb42-1"><a></a>?do.call</span></code><button title="Copy to Clipboard" class="code-copy-button"><i class="bi"></i></button></pre></div>
+<div class="sourceCode cell-code" id="cb42"><pre class="sourceCode numberSource r number-lines code-with-copy"><code class="sourceCode r"><span id="cb42-1"><a href="#cb42-1"></a>?do.call</span></code><button title="Copy to Clipboard" class="code-copy-button"><i class="bi"></i></button></pre></div>
 <div class="cell-output cell-output-stdout">
 <pre><code>Execute a Function Call
 
@@ -1297,13 +1263,13 @@ <h2>Loop walkthrough</h2>
 <li>OK, so basically what happened is that</li>
 </ul>
 <div class="cell">
-<div class="sourceCode cell-code" id="cb44"><pre class="sourceCode numberSource r number-lines code-with-copy"><code class="sourceCode r"><span id="cb44-1"><a></a><span class="fu">do.call</span>(rbind, list)</span></code><button title="Copy to Clipboard" class="code-copy-button"><i class="bi"></i></button></pre></div>
+<div class="sourceCode cell-code" id="cb44"><pre class="sourceCode numberSource r number-lines code-with-copy"><code class="sourceCode r"><span id="cb44-1"><a href="#cb44-1"></a><span class="fu">do.call</span>(rbind, list)</span></code><button title="Copy to Clipboard" class="code-copy-button"><i class="bi"></i></button></pre></div>
 </div>
 <ul>
 <li>Gets transformed into</li>
 </ul>
 <div class="cell">
-<div class="sourceCode cell-code" id="cb45"><pre class="sourceCode numberSource r number-lines code-with-copy"><code class="sourceCode r"><span id="cb45-1"><a></a><span class="fu">rbind</span>(list[[<span class="dv">1</span>]], list[[<span class="dv">2</span>]], list[[<span class="dv">3</span>]], ..., list[[<span class="fu">length</span>(list)]])</span></code><button title="Copy to Clipboard" class="code-copy-button"><i class="bi"></i></button></pre></div>
+<div class="sourceCode cell-code" id="cb45"><pre class="sourceCode numberSource r number-lines code-with-copy"><code class="sourceCode r"><span id="cb45-1"><a href="#cb45-1"></a><span class="fu">rbind</span>(list[[<span class="dv">1</span>]], list[[<span class="dv">2</span>]], list[[<span class="dv">3</span>]], ..., list[[<span class="fu">length</span>(list)]])</span></code><button title="Copy to Clipboard" class="code-copy-button"><i class="bi"></i></button></pre></div>
 </div>
 <ul>
 <li>That’s vectorization magic!</li>
@@ -1323,25 +1289,25 @@ <h2>You try it! (if we have time)</h2>
 <section id="main-problem-solution" class="slide level2">
 <h2>Main problem solution</h2>
 <div class="cell">
-<div class="sourceCode cell-code" id="cb46"><pre class="sourceCode numberSource r number-lines code-with-copy"><code class="sourceCode r"><span id="cb46-1"><a></a>meas<span class="sc">$</span>cases_per_thousand <span class="ot">&lt;-</span> meas<span class="sc">$</span>Cases <span class="sc">/</span> <span class="fu">as.numeric</span>(meas<span class="sc">$</span>total_pop) <span class="sc">*</span> <span class="dv">1000</span></span>
-<span id="cb46-2"><a></a>countries <span class="ot">&lt;-</span> <span class="fu">unique</span>(meas<span class="sc">$</span>country)</span>
-<span id="cb46-3"><a></a></span>
-<span id="cb46-4"><a></a><span class="fu">plot</span>(</span>
-<span id="cb46-5"><a></a>    <span class="cn">NULL</span>, <span class="cn">NULL</span>,</span>
-<span id="cb46-6"><a></a>    <span class="at">xlim =</span> <span class="fu">c</span>(<span class="dv">1980</span>, <span class="dv">2022</span>),</span>
-<span id="cb46-7"><a></a>    <span class="at">ylim =</span> <span class="fu">c</span>(<span class="dv">0</span>, <span class="dv">50</span>),</span>
-<span id="cb46-8"><a></a>    <span class="at">xlab =</span> <span class="st">"Year"</span>,</span>
-<span id="cb46-9"><a></a>    <span class="at">ylab =</span> <span class="st">"Incidence per 1000 people"</span></span>
-<span id="cb46-10"><a></a>)</span>
-<span id="cb46-11"><a></a></span>
-<span id="cb46-12"><a></a><span class="cf">for</span> (i <span class="cf">in</span> <span class="dv">1</span><span class="sc">:</span><span class="fu">length</span>(countries)) {</span>
-<span id="cb46-13"><a></a>    country_data <span class="ot">&lt;-</span> <span class="fu">subset</span>(meas, country <span class="sc">==</span> countries[[i]])</span>
-<span id="cb46-14"><a></a>    <span class="fu">lines</span>(</span>
-<span id="cb46-15"><a></a>        <span class="at">x =</span> country_data<span class="sc">$</span>time,</span>
-<span id="cb46-16"><a></a>        <span class="at">y =</span> country_data<span class="sc">$</span>cases_per_thousand,</span>
-<span id="cb46-17"><a></a>        <span class="at">col =</span> <span class="fu">adjustcolor</span>(<span class="st">"black"</span>, <span class="at">alpha.f =</span> <span class="fl">0.25</span>)</span>
-<span id="cb46-18"><a></a>    )</span>
-<span id="cb46-19"><a></a>}</span></code><button title="Copy to Clipboard" class="code-copy-button"><i class="bi"></i></button></pre></div>
+<div class="sourceCode cell-code" id="cb46"><pre class="sourceCode numberSource r number-lines code-with-copy"><code class="sourceCode r"><span id="cb46-1"><a href="#cb46-1"></a>meas<span class="sc">$</span>cases_per_thousand <span class="ot">&lt;-</span> meas<span class="sc">$</span>Cases <span class="sc">/</span> <span class="fu">as.numeric</span>(meas<span class="sc">$</span>total_pop) <span class="sc">*</span> <span class="dv">1000</span></span>
+<span id="cb46-2"><a href="#cb46-2"></a>countries <span class="ot">&lt;-</span> <span class="fu">unique</span>(meas<span class="sc">$</span>country)</span>
+<span id="cb46-3"><a href="#cb46-3"></a></span>
+<span id="cb46-4"><a href="#cb46-4"></a><span class="fu">plot</span>(</span>
+<span id="cb46-5"><a href="#cb46-5"></a>    <span class="cn">NULL</span>, <span class="cn">NULL</span>,</span>
+<span id="cb46-6"><a href="#cb46-6"></a>    <span class="at">xlim =</span> <span class="fu">c</span>(<span class="dv">1980</span>, <span class="dv">2022</span>),</span>
+<span id="cb46-7"><a href="#cb46-7"></a>    <span class="at">ylim =</span> <span class="fu">c</span>(<span class="dv">0</span>, <span class="dv">50</span>),</span>
+<span id="cb46-8"><a href="#cb46-8"></a>    <span class="at">xlab =</span> <span class="st">"Year"</span>,</span>
+<span id="cb46-9"><a href="#cb46-9"></a>    <span class="at">ylab =</span> <span class="st">"Incidence per 1000 people"</span></span>
+<span id="cb46-10"><a href="#cb46-10"></a>)</span>
+<span id="cb46-11"><a href="#cb46-11"></a></span>
+<span id="cb46-12"><a href="#cb46-12"></a><span class="cf">for</span> (i <span class="cf">in</span> <span class="dv">1</span><span class="sc">:</span><span class="fu">length</span>(countries)) {</span>
+<span id="cb46-13"><a href="#cb46-13"></a>    country_data <span class="ot">&lt;-</span> <span class="fu">subset</span>(meas, country <span class="sc">==</span> countries[[i]])</span>
+<span id="cb46-14"><a href="#cb46-14"></a>    <span class="fu">lines</span>(</span>
+<span id="cb46-15"><a href="#cb46-15"></a>        <span class="at">x =</span> country_data<span class="sc">$</span>time,</span>
+<span id="cb46-16"><a href="#cb46-16"></a>        <span class="at">y =</span> country_data<span class="sc">$</span>cases_per_thousand,</span>
+<span id="cb46-17"><a href="#cb46-17"></a>        <span class="at">col =</span> <span class="fu">adjustcolor</span>(<span class="st">"black"</span>, <span class="at">alpha.f =</span> <span class="fl">0.25</span>)</span>
+<span id="cb46-18"><a href="#cb46-18"></a>    )</span>
+<span id="cb46-19"><a href="#cb46-19"></a>}</span></code><button title="Copy to Clipboard" class="code-copy-button"><i class="bi"></i></button></pre></div>
 </div>
 </section>
 <section id="main-problem-solution-1" class="slide level2">
@@ -1351,38 +1317,38 @@ <h2>Main problem solution</h2>
 <section id="bonus-problem-solution" class="slide level2">
 <h2>Bonus problem solution</h2>
 <div class="cell">
-<div class="sourceCode cell-code" id="cb47"><pre class="sourceCode numberSource r number-lines code-with-copy"><code class="sourceCode r"><span id="cb47-1"><a></a><span class="co"># First calculate the cumulative cases, treating NA as zeroes</span></span>
-<span id="cb47-2"><a></a>cumulative_cases <span class="ot">&lt;-</span> <span class="fu">ave</span>(</span>
-<span id="cb47-3"><a></a>    <span class="at">x =</span> <span class="fu">ifelse</span>(<span class="fu">is.na</span>(meas<span class="sc">$</span>Cases), <span class="dv">0</span>, meas<span class="sc">$</span>Cases),</span>
-<span id="cb47-4"><a></a>    meas<span class="sc">$</span>country,</span>
-<span id="cb47-5"><a></a>    <span class="at">FUN =</span> cumsum</span>
-<span id="cb47-6"><a></a>)</span>
-<span id="cb47-7"><a></a></span>
-<span id="cb47-8"><a></a><span class="co"># Now put the NAs back where they should be</span></span>
-<span id="cb47-9"><a></a>meas<span class="sc">$</span>cumulative_cases <span class="ot">&lt;-</span> cumulative_cases <span class="sc">+</span> (meas<span class="sc">$</span>Cases <span class="sc">*</span> <span class="dv">0</span>)</span>
-<span id="cb47-10"><a></a></span>
-<span id="cb47-11"><a></a><span class="fu">plot</span>(</span>
-<span id="cb47-12"><a></a>    <span class="cn">NULL</span>, <span class="cn">NULL</span>,</span>
-<span id="cb47-13"><a></a>    <span class="at">xlim =</span> <span class="fu">c</span>(<span class="dv">1980</span>, <span class="dv">2022</span>),</span>
-<span id="cb47-14"><a></a>    <span class="at">ylim =</span> <span class="fu">c</span>(<span class="dv">1</span>, <span class="fl">6.2e6</span>),</span>
-<span id="cb47-15"><a></a>    <span class="at">xlab =</span> <span class="st">"Year"</span>,</span>
-<span id="cb47-16"><a></a>    <span class="at">ylab =</span> <span class="st">"Cumulative cases per 1000 people"</span></span>
-<span id="cb47-17"><a></a>)</span>
-<span id="cb47-18"><a></a></span>
-<span id="cb47-19"><a></a><span class="cf">for</span> (i <span class="cf">in</span> <span class="dv">1</span><span class="sc">:</span><span class="fu">length</span>(countries)) {</span>
-<span id="cb47-20"><a></a>    country_data <span class="ot">&lt;-</span> <span class="fu">subset</span>(meas, country <span class="sc">==</span> countries[[i]])</span>
-<span id="cb47-21"><a></a>    <span class="fu">lines</span>(</span>
-<span id="cb47-22"><a></a>        <span class="at">x =</span> country_data<span class="sc">$</span>time,</span>
-<span id="cb47-23"><a></a>        <span class="at">y =</span> country_data<span class="sc">$</span>cumulative_cases,</span>
-<span id="cb47-24"><a></a>        <span class="at">col =</span> <span class="fu">adjustcolor</span>(<span class="st">"black"</span>, <span class="at">alpha.f =</span> <span class="fl">0.25</span>)</span>
-<span id="cb47-25"><a></a>    )</span>
-<span id="cb47-26"><a></a>}</span>
-<span id="cb47-27"><a></a></span>
-<span id="cb47-28"><a></a><span class="fu">text</span>(</span>
-<span id="cb47-29"><a></a>    <span class="at">x =</span> <span class="dv">2020</span>,</span>
-<span id="cb47-30"><a></a>    <span class="at">y =</span> <span class="fl">6e6</span>,</span>
-<span id="cb47-31"><a></a>    <span class="at">labels =</span> <span class="st">"China →"</span></span>
-<span id="cb47-32"><a></a>)</span></code><button title="Copy to Clipboard" class="code-copy-button"><i class="bi"></i></button></pre></div>
+<div class="sourceCode cell-code" id="cb47"><pre class="sourceCode numberSource r number-lines code-with-copy"><code class="sourceCode r"><span id="cb47-1"><a href="#cb47-1"></a><span class="co"># First calculate the cumulative cases, treating NA as zeroes</span></span>
+<span id="cb47-2"><a href="#cb47-2"></a>cumulative_cases <span class="ot">&lt;-</span> <span class="fu">ave</span>(</span>
+<span id="cb47-3"><a href="#cb47-3"></a>    <span class="at">x =</span> <span class="fu">ifelse</span>(<span class="fu">is.na</span>(meas<span class="sc">$</span>Cases), <span class="dv">0</span>, meas<span class="sc">$</span>Cases),</span>
+<span id="cb47-4"><a href="#cb47-4"></a>    meas<span class="sc">$</span>country,</span>
+<span id="cb47-5"><a href="#cb47-5"></a>    <span class="at">FUN =</span> cumsum</span>
+<span id="cb47-6"><a href="#cb47-6"></a>)</span>
+<span id="cb47-7"><a href="#cb47-7"></a></span>
+<span id="cb47-8"><a href="#cb47-8"></a><span class="co"># Now put the NAs back where they should be</span></span>
+<span id="cb47-9"><a href="#cb47-9"></a>meas<span class="sc">$</span>cumulative_cases <span class="ot">&lt;-</span> cumulative_cases <span class="sc">+</span> (meas<span class="sc">$</span>Cases <span class="sc">*</span> <span class="dv">0</span>)</span>
+<span id="cb47-10"><a href="#cb47-10"></a></span>
+<span id="cb47-11"><a href="#cb47-11"></a><span class="fu">plot</span>(</span>
+<span id="cb47-12"><a href="#cb47-12"></a>    <span class="cn">NULL</span>, <span class="cn">NULL</span>,</span>
+<span id="cb47-13"><a href="#cb47-13"></a>    <span class="at">xlim =</span> <span class="fu">c</span>(<span class="dv">1980</span>, <span class="dv">2022</span>),</span>
+<span id="cb47-14"><a href="#cb47-14"></a>    <span class="at">ylim =</span> <span class="fu">c</span>(<span class="dv">1</span>, <span class="fl">6.2e6</span>),</span>
+<span id="cb47-15"><a href="#cb47-15"></a>    <span class="at">xlab =</span> <span class="st">"Year"</span>,</span>
+<span id="cb47-16"><a href="#cb47-16"></a>    <span class="at">ylab =</span> <span class="st">"Cumulative cases"</span></span>
+<span id="cb47-17"><a href="#cb47-17"></a>)</span>
+<span id="cb47-18"><a href="#cb47-18"></a></span>
+<span id="cb47-19"><a href="#cb47-19"></a><span class="cf">for</span> (i <span class="cf">in</span> <span class="dv">1</span><span class="sc">:</span><span class="fu">length</span>(countries)) {</span>
+<span id="cb47-20"><a href="#cb47-20"></a>    country_data <span class="ot">&lt;-</span> <span class="fu">subset</span>(meas, country <span class="sc">==</span> countries[[i]])</span>
+<span id="cb47-21"><a href="#cb47-21"></a>    <span class="fu">lines</span>(</span>
+<span id="cb47-22"><a href="#cb47-22"></a>        <span class="at">x =</span> country_data<span class="sc">$</span>time,</span>
+<span id="cb47-23"><a href="#cb47-23"></a>        <span class="at">y =</span> country_data<span class="sc">$</span>cumulative_cases,</span>
+<span id="cb47-24"><a href="#cb47-24"></a>        <span class="at">col =</span> <span class="fu">adjustcolor</span>(<span class="st">"black"</span>, <span class="at">alpha.f =</span> <span class="fl">0.25</span>)</span>
+<span id="cb47-25"><a href="#cb47-25"></a>    )</span>
+<span id="cb47-26"><a href="#cb47-26"></a>}</span>
+<span id="cb47-27"><a href="#cb47-27"></a></span>
+<span id="cb47-28"><a href="#cb47-28"></a><span class="fu">text</span>(</span>
+<span id="cb47-29"><a href="#cb47-29"></a>    <span class="at">x =</span> <span class="dv">2020</span>,</span>
+<span id="cb47-30"><a href="#cb47-30"></a>    <span class="at">y =</span> <span class="fl">6e6</span>,</span>
+<span id="cb47-31"><a href="#cb47-31"></a>    <span class="at">labels =</span> <span class="st">"China →"</span></span>
+<span id="cb47-32"><a href="#cb47-32"></a>)</span></code><button title="Copy to Clipboard" class="code-copy-button"><i class="bi"></i></button></pre></div>
 </div>
 </section>
 <section id="bonus-problem-solution-1" class="slide level2">
@@ -1396,10 +1362,8 @@ <h2>More practice on your own</h2>
 <li>Assess the impact of <code>age_months</code> as a confounder in the Diphtheria serology data. First, write code to transform <code>age_months</code> into age ranges for each year. Then, using a loop, calculate the crude odds ratio for the effect of vaccination on infection for each of the age ranges. How does the odds ratio change as age increases? Can you formalize this analysis by fitting a logistic regression model with <code>age_months</code> and vaccination as predictors?</li>
 </ul>
 
-<div class="quarto-auto-generated-content">
 <div class="footer footer-default">
 
-</div>
 </div>
 </section>
     </div>
@@ -1428,6 +1392,7 @@ <h2>More practice on your own</h2>
       Reveal.initialize({
 'controlsAuto': true,
 'previewLinksAuto': false,
+'smaller': false,
 'pdfSeparateFragments': false,
 'autoAnimateEasing': "ease",
 'autoAnimateDuration': 1,
@@ -1613,81 +1578,43 @@ <h2>More practice on your own</h2>
       });
     </script>
     
-
     <script>
-
       // htmlwidgets need to know to resize themselves when slides are shown/hidden.
-
       // Fire the "slideenter" event (handled by htmlwidgets.js) when the current
-
       // slide changes (different for each slide format).
-
       (function () {
-
         // dispatch for htmlwidgets
-
         function fireSlideEnter() {
-
           const event = window.document.createEvent("Event");
-
           event.initEvent("slideenter", true, true);
-
           window.document.dispatchEvent(event);
-
         }
 
-    
-
         function fireSlideChanged(previousSlide, currentSlide) {
-
           fireSlideEnter();
 
-    
-
           // dispatch for shiny
-
           if (window.jQuery) {
-
             if (previousSlide) {
-
               window.jQuery(previousSlide).trigger("hidden");
-
             }
-
             if (currentSlide) {
-
               window.jQuery(currentSlide).trigger("shown");
-
             }
-
           }
-
         }
 
-    
-
         // hookup for slidy
-
         if (window.w3c_slidy) {
-
           window.w3c_slidy.add_observer(function (slide_num) {
-
             // slide_num starts at position 1
-
             fireSlideChanged(null, w3c_slidy.slides[slide_num - 1]);
-
           });
-
         }
 
-    
-
       })();
-
     </script>
 
-    
-
     <script id="quarto-html-after-body" type="application/javascript">
     window.document.addEventListener("DOMContentLoaded", function (event) {
       const toggleBodyColorMode = (bsSheetEl) => {
@@ -1720,7 +1647,18 @@ <h2>More practice on your own</h2>
         }
         return false;
       }
-      const onCopySuccess = function(e) {
+      const clipboard = new window.ClipboardJS('.code-copy-button', {
+        text: function(trigger) {
+          const codeEl = trigger.previousElementSibling.cloneNode(true);
+          for (const childEl of codeEl.children) {
+            if (isCodeAnnotation(childEl)) {
+              childEl.remove();
+            }
+          }
+          return codeEl.innerText;
+        }
+      });
+      clipboard.on('success', function(e) {
         // button target
         const button = e.trigger;
         // don't keep focus
@@ -1752,50 +1690,11 @@ <h2>More practice on your own</h2>
         }, 1000);
         // clear code selection
         e.clearSelection();
-      }
-      const getTextToCopy = function(trigger) {
-          const codeEl = trigger.previousElementSibling.cloneNode(true);
-          for (const childEl of codeEl.children) {
-            if (isCodeAnnotation(childEl)) {
-              childEl.remove();
-            }
-          }
-          return codeEl.innerText;
-      }
-      const clipboard = new window.ClipboardJS('.code-copy-button:not([data-in-quarto-modal])', {
-        text: getTextToCopy
       });
-      clipboard.on('success', onCopySuccess);
-      if (window.document.getElementById('quarto-embedded-source-code-modal')) {
-        // For code content inside modals, clipBoardJS needs to be initialized with a container option
-        // TODO: Check when it could be a function (https://github.com/zenorocha/clipboard.js/issues/860)
-        const clipboardModal = new window.ClipboardJS('.code-copy-button[data-in-quarto-modal]', {
-          text: getTextToCopy,
-          container: window.document.getElementById('quarto-embedded-source-code-modal')
-        });
-        clipboardModal.on('success', onCopySuccess);
-      }
-        var localhostRegex = new RegExp(/^(?:http|https):\/\/localhost\:?[0-9]*\//);
-        var mailtoRegex = new RegExp(/^mailto:/);
-          var filterRegex = new RegExp('/' + window.location.host + '/');
-        var isInternal = (href) => {
-            return filterRegex.test(href) || localhostRegex.test(href) || mailtoRegex.test(href);
-        }
-        // Inspect non-navigation links and adorn them if external
-     	var links = window.document.querySelectorAll('a[href]:not(.nav-link):not(.navbar-brand):not(.toc-action):not(.sidebar-link):not(.sidebar-item-toggle):not(.pagination-link):not(.no-external):not([aria-hidden]):not(.dropdown-item):not(.quarto-navigation-tool):not(.about-link)');
-        for (var i=0; i<links.length; i++) {
-          const link = links[i];
-          if (!isInternal(link.href)) {
-            // undo the damage that might have been done by quarto-nav.js in the case of
-            // links that we want to consider external
-            if (link.dataset.originalHref !== undefined) {
-              link.href = link.dataset.originalHref;
-            }
-          }
-        }
-      function tippyHover(el, contentFn, onTriggerFn, onUntriggerFn) {
+      function tippyHover(el, contentFn) {
         const config = {
           allowHTML: true,
+          content: contentFn,
           maxWidth: 500,
           delay: 100,
           arrow: false,
@@ -1805,17 +1704,8 @@ <h2>More practice on your own</h2>
           interactive: true,
           interactiveBorder: 10,
           theme: 'light-border',
-          placement: 'bottom-start',
+          placement: 'bottom-start'
         };
-        if (contentFn) {
-          config.content = contentFn;
-        }
-        if (onTriggerFn) {
-          config.onTrigger = onTriggerFn;
-        }
-        if (onUntriggerFn) {
-          config.onUntrigger = onUntriggerFn;
-        }
           config['offset'] = [0,0];
           config['maxWidth'] = 700;
         window.tippy(el, config); 
@@ -1829,11 +1719,7 @@ <h2>More practice on your own</h2>
           try { href = new URL(href).hash; } catch {}
           const id = href.replace(/^#\/?/, "");
           const note = window.document.getElementById(id);
-          if (note) {
-            return note.innerHTML;
-          } else {
-            return "";
-          }
+          return note.innerHTML;
         });
       }
       const findCites = (el) => {
diff --git a/docs/modules/ModuleXX-Iteration_files/figure-revealjs/unnamed-chunk-30-1.png b/docs/modules/ModuleXX-Iteration_files/figure-revealjs/unnamed-chunk-30-1.png
index 84a077e..d1c55ba 100644
Binary files a/docs/modules/ModuleXX-Iteration_files/figure-revealjs/unnamed-chunk-30-1.png and b/docs/modules/ModuleXX-Iteration_files/figure-revealjs/unnamed-chunk-30-1.png differ
diff --git a/docs/modules/ModuleXX-Iteration_files/figure-revealjs/unnamed-chunk-32-1.png b/docs/modules/ModuleXX-Iteration_files/figure-revealjs/unnamed-chunk-32-1.png
index 009fce5..66ca0eb 100644
Binary files a/docs/modules/ModuleXX-Iteration_files/figure-revealjs/unnamed-chunk-32-1.png and b/docs/modules/ModuleXX-Iteration_files/figure-revealjs/unnamed-chunk-32-1.png differ
diff --git a/docs/search.json b/docs/search.json
index 94b0514..63a76f6 100644
--- a/docs/search.json
+++ b/docs/search.json
@@ -44,11 +44,7 @@
     "href": "modules/ModuleXX-Iteration.html#learning-goals",
     "title": "Iteration in R",
     "section": "Learning goals",
-    "text": "Learning goals\n\nReplace repetitive code with a for loop\nCompare and contrast for loops and *apply() functions\nUse vectorization to replace unnecessary loops",
-    "crumbs": [
-      "Day 2",
-      "Iteration in R"
-    ]
+    "text": "Learning goals\n\nReplace repetitive code with a for loop\nUse vectorization to replace unnecessary loops"
   },
   {
     "objectID": "index.html",
@@ -411,88 +407,56 @@
     "href": "modules/ModuleXX-Iteration.html#what-is-iteration",
     "title": "Iteration in R",
     "section": "What is iteration?",
-    "text": "What is iteration?\n\nWhenever you repeat something, that’s iteration.\nIn R, this means running the same code multiple times in a row.\n\n\ndata(\"penguins\", package = \"palmerpenguins\")\nfor (this_island in levels(penguins$island)) {\n    island_mean &lt;-\n        penguins$bill_depth_mm[penguins$island == this_island] |&gt;\n        mean(na.rm = TRUE) |&gt;\n        round(digits = 2)\n    \n    cat(paste(\"The mean bill depth on\", this_island, \"Island was\", island_mean,\n                            \"mm.\\n\"))\n}\n\nThe mean bill depth on Biscoe Island was 15.87 mm.\nThe mean bill depth on Dream Island was 18.34 mm.\nThe mean bill depth on Torgersen Island was 18.43 mm.",
-    "crumbs": [
-      "Day 2",
-      "Iteration in R"
-    ]
+    "text": "What is iteration?\n\nWhenever you repeat something, that’s iteration.\nIn R, this means running the same code multiple times in a row.\n\n\ndata(\"penguins\", package = \"palmerpenguins\")\nfor (this_island in levels(penguins$island)) {\n    island_mean &lt;-\n        penguins$bill_depth_mm[penguins$island == this_island] |&gt;\n        mean(na.rm = TRUE) |&gt;\n        round(digits = 2)\n    \n    cat(paste(\"The mean bill depth on\", this_island, \"Island was\", island_mean,\n                            \"mm.\\n\"))\n}\n\nThe mean bill depth on Biscoe Island was 15.87 mm.\nThe mean bill depth on Dream Island was 18.34 mm.\nThe mean bill depth on Torgersen Island was 18.43 mm."
   },
   {
     "objectID": "modules/ModuleXX-Iteration.html#parts-of-a-loop",
     "href": "modules/ModuleXX-Iteration.html#parts-of-a-loop",
     "title": "Iteration in R",
     "section": "Parts of a loop",
-    "text": "Parts of a loop\n\nfor (this_island in levels(penguins$island)) {\n    island_mean &lt;-\n        penguins$bill_depth_mm[penguins$island == this_island] |&gt;\n        mean(na.rm = TRUE) |&gt;\n        round(digits = 2)\n    \n    cat(paste(\"The mean bill depth on\", this_island, \"Island was\", island_mean,\n                            \"mm.\\n\"))\n}\n\nThe header declares how many times we will repeat the same code. The header contains a control variable that changes in each repetition and a sequence of values for the control variable to take.",
-    "crumbs": [
-      "Day 2",
-      "Iteration in R"
-    ]
+    "text": "Parts of a loop\n\nfor (this_island in levels(penguins$island)) {\n    island_mean &lt;-\n        penguins$bill_depth_mm[penguins$island == this_island] |&gt;\n        mean(na.rm = TRUE) |&gt;\n        round(digits = 2)\n    \n    cat(paste(\"The mean bill depth on\", this_island, \"Island was\", island_mean,\n                            \"mm.\\n\"))\n}\n\nThe header declares how many times we will repeat the same code. The header contains a control variable that changes in each repetition and a sequence of values for the control variable to take."
   },
   {
     "objectID": "modules/ModuleXX-Iteration.html#parts-of-a-loop-1",
     "href": "modules/ModuleXX-Iteration.html#parts-of-a-loop-1",
     "title": "Iteration in R",
     "section": "Parts of a loop",
-    "text": "Parts of a loop\n\nfor (this_island in levels(penguins$island)) {\n    island_mean &lt;-\n        penguins$bill_depth_mm[penguins$island == this_island] |&gt;\n        mean(na.rm = TRUE) |&gt;\n        round(digits = 2)\n    \n    cat(paste(\"The mean bill depth on\", this_island, \"Island was\", island_mean,\n                            \"mm.\\n\"))\n}\n\nThe body of the loop contains code that will be repeated a number of times based on the header instructions. In R, the body has to be surrounded by curly braces.",
-    "crumbs": [
-      "Day 2",
-      "Iteration in R"
-    ]
+    "text": "Parts of a loop\n\nfor (this_island in levels(penguins$island)) {\n    island_mean &lt;-\n        penguins$bill_depth_mm[penguins$island == this_island] |&gt;\n        mean(na.rm = TRUE) |&gt;\n        round(digits = 2)\n    \n    cat(paste(\"The mean bill depth on\", this_island, \"Island was\", island_mean,\n                            \"mm.\\n\"))\n}\n\nThe body of the loop contains code that will be repeated a number of times based on the header instructions. In R, the body has to be surrounded by curly braces."
   },
   {
     "objectID": "modules/ModuleXX-Iteration.html#header-parts",
     "href": "modules/ModuleXX-Iteration.html#header-parts",
     "title": "Iteration in R",
     "section": "Header parts",
-    "text": "Header parts\n\nfor (this_island in levels(penguins$island)) {...}\n\n\nfor: keyword that declares we are doing a for loop.\n(...): parentheses after for declare the control variable and sequence.\nthis_island: the control variable.\nin: keyword that separates the control varibale and sequence.\nlevels(penguins$island): the sequence.\n{}: curly braces will contain the body code.",
-    "crumbs": [
-      "Day 2",
-      "Iteration in R"
-    ]
+    "text": "Header parts\n\nfor (this_island in levels(penguins$island)) {...}\n\n\nfor: keyword that declares we are doing a for loop.\n(...): parentheses after for declare the control variable and sequence.\nthis_island: the control variable.\nin: keyword that separates the control variable and sequence.\nlevels(penguins$island): the sequence.\n{}: curly braces will contain the body code."
   },
   {
     "objectID": "modules/ModuleXX-Iteration.html#header-parts-1",
     "href": "modules/ModuleXX-Iteration.html#header-parts-1",
     "title": "Iteration in R",
     "section": "Header parts",
-    "text": "Header parts\n\nfor (this_island in levels(penguins$island)) {...}\n\n\nSince levels(penguins$island) evaluates to c(\"Biscoe\", \"Dream\", \"Torgersen\"), our loop will repeat 3 times.\n\n\n\n\nIteration\nthis_island\n\n\n\n\n1\n“Biscoe”\n\n\n2\n“Dream”\n\n\n3\n“Torgersen”\n\n\n\n\nEverything inside of {...} will be repeated three times.",
-    "crumbs": [
-      "Day 2",
-      "Iteration in R"
-    ]
+    "text": "Header parts\n\nfor (this_island in levels(penguins$island)) {...}\n\n\nSince levels(penguins$island) evaluates to c(\"Biscoe\", \"Dream\", \"Torgersen\"), our loop will repeat 3 times.\n\n\n\n\nIteration\nthis_island\n\n\n\n\n1\n“Biscoe”\n\n\n2\n“Dream”\n\n\n3\n“Torgersen”\n\n\n\n\nEverything inside of {...} will be repeated three times."
   },
   {
     "objectID": "modules/ModuleXX-Iteration.html#loop-iteration-1",
     "href": "modules/ModuleXX-Iteration.html#loop-iteration-1",
     "title": "Iteration in R",
     "section": "Loop iteration 1",
-    "text": "Loop iteration 1\n\nisland_mean &lt;-\n    penguins$bill_depth_mm[penguins$island == \"Biscoe\"] |&gt;\n    mean(na.rm = TRUE) |&gt;\n    round(digits = 2)\n\ncat(paste(\"The mean bill depth on\", \"Biscoe\", \"Island was\", island_mean,\n                    \"mm.\\n\"))\n\nThe mean bill depth on Biscoe Island was 15.87 mm.",
-    "crumbs": [
-      "Day 2",
-      "Iteration in R"
-    ]
+    "text": "Loop iteration 1\n\nisland_mean &lt;-\n    penguins$bill_depth_mm[penguins$island == \"Biscoe\"] |&gt;\n    mean(na.rm = TRUE) |&gt;\n    round(digits = 2)\n\ncat(paste(\"The mean bill depth on\", \"Biscoe\", \"Island was\", island_mean,\n                    \"mm.\\n\"))\n\nThe mean bill depth on Biscoe Island was 15.87 mm."
   },
   {
     "objectID": "modules/ModuleXX-Iteration.html#loop-iteration-2",
     "href": "modules/ModuleXX-Iteration.html#loop-iteration-2",
     "title": "Iteration in R",
     "section": "Loop iteration 2",
-    "text": "Loop iteration 2\n\nisland_mean &lt;-\n    penguins$bill_depth_mm[penguins$island == \"Dream\"] |&gt;\n    mean(na.rm = TRUE) |&gt;\n    round(digits = 2)\n\ncat(paste(\"The mean bill depth on\", \"Dream\", \"Island was\", island_mean,\n                    \"mm.\\n\"))\n\nThe mean bill depth on Dream Island was 18.34 mm.",
-    "crumbs": [
-      "Day 2",
-      "Iteration in R"
-    ]
+    "text": "Loop iteration 2\n\nisland_mean &lt;-\n    penguins$bill_depth_mm[penguins$island == \"Dream\"] |&gt;\n    mean(na.rm = TRUE) |&gt;\n    round(digits = 2)\n\ncat(paste(\"The mean bill depth on\", \"Dream\", \"Island was\", island_mean,\n                    \"mm.\\n\"))\n\nThe mean bill depth on Dream Island was 18.34 mm."
   },
   {
     "objectID": "modules/ModuleXX-Iteration.html#loop-iteration-3",
     "href": "modules/ModuleXX-Iteration.html#loop-iteration-3",
     "title": "Iteration in R",
     "section": "Loop iteration 3",
-    "text": "Loop iteration 3\n\nisland_mean &lt;-\n    penguins$bill_depth_mm[penguins$island == \"Torgersen\"] |&gt;\n    mean(na.rm = TRUE) |&gt;\n    round(digits = 2)\n\ncat(paste(\"The mean bill depth on\", \"Torgersen\", \"Island was\", island_mean,\n                    \"mm.\\n\"))\n\nThe mean bill depth on Torgersen Island was 18.43 mm.",
-    "crumbs": [
-      "Day 2",
-      "Iteration in R"
-    ]
+    "text": "Loop iteration 3\n\nisland_mean &lt;-\n    penguins$bill_depth_mm[penguins$island == \"Torgersen\"] |&gt;\n    mean(na.rm = TRUE) |&gt;\n    round(digits = 2)\n\ncat(paste(\"The mean bill depth on\", \"Torgersen\", \"Island was\", island_mean,\n                    \"mm.\\n\"))\n\nThe mean bill depth on Torgersen Island was 18.43 mm."
   },
   {
     "objectID": "modules/ModuleXX-Iteration.html#the-loop-structure-automates-this-process-for-us-so-we-dont-have-to-copy",
@@ -510,22 +474,14 @@
     "href": "modules/ModuleXX-Iteration.html#the-loop-structure-automates-this-process-for-us-so-we-dont-have-to-copy-and-paste-our-code",
     "title": "Iteration in R",
     "section": "The loop structure automates this process for us so we don’t have to copy and paste our code!",
-    "text": "The loop structure automates this process for us so we don’t have to copy and paste our code!\n\nfor (this_island in levels(penguins$island)) {\n    island_mean &lt;-\n        penguins$bill_depth_mm[penguins$island == this_island] |&gt;\n        mean(na.rm = TRUE) |&gt;\n        round(digits = 2)\n    \n    cat(paste(\"The mean bill depth on\", this_island, \"Island was\", island_mean,\n                            \"mm.\\n\"))\n}\n\nThe mean bill depth on Biscoe Island was 15.87 mm.\nThe mean bill depth on Dream Island was 18.34 mm.\nThe mean bill depth on Torgersen Island was 18.43 mm.",
-    "crumbs": [
-      "Day 2",
-      "Iteration in R"
-    ]
+    "text": "The loop structure automates this process for us so we don’t have to copy and paste our code!\n\nfor (this_island in levels(penguins$island)) {\n    island_mean &lt;-\n        penguins$bill_depth_mm[penguins$island == this_island] |&gt;\n        mean(na.rm = TRUE) |&gt;\n        round(digits = 2)\n    \n    cat(paste(\"The mean bill depth on\", this_island, \"Island was\", island_mean,\n                            \"mm.\\n\"))\n}\n\nThe mean bill depth on Biscoe Island was 15.87 mm.\nThe mean bill depth on Dream Island was 18.34 mm.\nThe mean bill depth on Torgersen Island was 18.43 mm."
   },
   {
     "objectID": "modules/ModuleXX-Iteration.html#remember-write-dry-code",
     "href": "modules/ModuleXX-Iteration.html#remember-write-dry-code",
     "title": "Iteration in R",
     "section": "Remember: write DRY code!",
-    "text": "Remember: write DRY code!\n\nDRY = “Don’t Repeat Yourself”\nInstead of copying and pasting, write loops and functions.\nEasier to debug and change in the future!\n\n\n\nOf course, we all copy and paste code sometimes. If you are running on a tight deadline or can’t get a loop or function to work, you might need to. DRY code is good, but working code is best!",
-    "crumbs": [
-      "Day 2",
-      "Iteration in R"
-    ]
+    "text": "Remember: write DRY code!\n\nDRY = “Don’t Repeat Yourself”\nInstead of copying and pasting, write loops and functions.\nEasier to debug and change in the future!\n\n\n\nOf course, we all copy and paste code sometimes. If you are running on a tight deadline or can’t get a loop or function to work, you might need to. DRY code is good, but working code is best!"
   },
   {
     "objectID": "modules/ModuleXX-Iteration.html#tweet-slide",
@@ -543,154 +499,98 @@
     "href": "modules/ModuleXX-Iteration.html#you-try-it",
     "title": "Iteration in R",
     "section": "You try it!",
-    "text": "You try it!\nWrite a loop that goes from 1 to 10, squares each of the numbers, and prints the squared number.\n\n\nfor (i in 1:10) {\n    cat(i ^ 2, \"\\n\")\n}\n\n1 \n4 \n9 \n16 \n25 \n36 \n49 \n64 \n81 \n100",
-    "crumbs": [
-      "Day 2",
-      "Iteration in R"
-    ]
+    "text": "You try it!\nWrite a loop that goes from 1 to 10, squares each of the numbers, and prints the squared number.\n\n\nfor (i in 1:10) {\n    cat(i ^ 2, \"\\n\")\n}\n\n1 \n4 \n9 \n16 \n25 \n36 \n49 \n64 \n81 \n100"
   },
   {
     "objectID": "modules/ModuleXX-Iteration.html#wait-did-we-need-to-do-that",
     "href": "modules/ModuleXX-Iteration.html#wait-did-we-need-to-do-that",
     "title": "Iteration in R",
     "section": "Wait, did we need to do that?",
-    "text": "Wait, did we need to do that?\n\nWell, yes, because you need to practice loops!\nBut technically no, because we can use vectorization.\nAlmost all basic operations in R are vectorized: they work on a vector of arguments all at the same time.",
-    "crumbs": [
-      "Day 2",
-      "Iteration in R"
-    ]
+    "text": "Wait, did we need to do that?\n\nWell, yes, because you need to practice loops!\nBut technically no, because we can use vectorization.\nAlmost all basic operations in R are vectorized: they work on a vector of arguments all at the same time."
   },
   {
     "objectID": "modules/ModuleXX-Iteration.html#wait-did-we-need-to-do-that-1",
     "href": "modules/ModuleXX-Iteration.html#wait-did-we-need-to-do-that-1",
     "title": "Iteration in R",
     "section": "Wait, did we need to do that?",
-    "text": "Wait, did we need to do that?\n\nWell, yes, because you need to practice loops!\nBut technically no, because we can use vectorization.\nAlmost all basic operations in R are vectorized: they work on a vector of arguments all at the same time.\n\n\n# No loop needed!\n(1:10)^2\n\n [1]   1   4   9  16  25  36  49  64  81 100",
-    "crumbs": [
-      "Day 2",
-      "Iteration in R"
-    ]
+    "text": "Wait, did we need to do that?\n\nWell, yes, because you need to practice loops!\nBut technically no, because we can use vectorization.\nAlmost all basic operations in R are vectorized: they work on a vector of arguments all at the same time.\n\n\n# No loop needed!\n(1:10)^2\n\n [1]   1   4   9  16  25  36  49  64  81 100"
   },
   {
     "objectID": "modules/ModuleXX-Iteration.html#wait-did-we-need-to-do-that-2",
     "href": "modules/ModuleXX-Iteration.html#wait-did-we-need-to-do-that-2",
     "title": "Iteration in R",
     "section": "Wait, did we need to do that?",
-    "text": "Wait, did we need to do that?\n\nWell, yes, because you need to practice loops!\nBut technically no, because we can use vectorization.\nAlmost all basic operations in R are vectorized: they work on a vector of arguments all at the same time.\n\n\n# No loop needed!\n(1:10)^2\n\n [1]   1   4   9  16  25  36  49  64  81 100\n\n\n\n# Get the first 10 odd numbers, a common CS 101 loop problem on exams\n(1:20)[which((1:20 %% 2) == 1)]\n\n [1]  1  3  5  7  9 11 13 15 17 19\n\n\n\nSo you should really try vectorization first, then use loops only when you can’t use vectorization.",
-    "crumbs": [
-      "Day 2",
-      "Iteration in R"
-    ]
+    "text": "Wait, did we need to do that?\n\nWell, yes, because you need to practice loops!\nBut technically no, because we can use vectorization.\nAlmost all basic operations in R are vectorized: they work on a vector of arguments all at the same time.\n\n\n# No loop needed!\n(1:10)^2\n\n [1]   1   4   9  16  25  36  49  64  81 100\n\n\n\n# Get the first 10 odd numbers, a common CS 101 loop problem on exams\n(1:20)[which((1:20 %% 2) == 1)]\n\n [1]  1  3  5  7  9 11 13 15 17 19\n\n\n\nSo you should really try vectorization first, then use loops only when you can’t use vectorization."
   },
   {
     "objectID": "modules/ModuleXX-Iteration.html#loop-walkthrough",
     "href": "modules/ModuleXX-Iteration.html#loop-walkthrough",
     "title": "Iteration in R",
     "section": "Loop walkthrough",
-    "text": "Loop walkthrough\n\nLet’s walk through a complex but useful example where we can’t use vectorization.\nLoad the cleaned measles dataset, and subset it so you only have MCV1 records.\n\n\n\nmeas &lt;- readRDS(here::here(\"data\", \"measles_final.Rds\")) |&gt;\n    subset(vaccine_antigen == \"MCV1\")\nstr(meas)\n\n'data.frame':   7972 obs. of  7 variables:\n $ iso3c           : chr  \"AFG\" \"AFG\" \"AFG\" \"AFG\" ...\n $ time            : int  1980 1981 1982 1983 1984 1985 1986 1987 1988 1989 ...\n $ country         : chr  \"Afghanistan\" \"Afghanistan\" \"Afghanistan\" \"Afghanistan\" ...\n $ Cases           : int  2792 5166 2900 640 353 2012 1511 638 1154 492 ...\n $ vaccine_antigen : chr  \"MCV1\" \"MCV1\" \"MCV1\" \"MCV1\" ...\n $ vaccine_coverage: int  11 NA 8 9 14 14 14 31 34 22 ...\n $ total_pop       : chr  \"12486631\" \"11155195\" \"10088289\" \"9951449\" ...",
-    "crumbs": [
-      "Day 2",
-      "Iteration in R"
-    ]
+    "text": "Loop walkthrough\n\nLet’s walk through a complex but useful example where we can’t use vectorization.\nLoad the cleaned measles dataset, and subset it so you only have MCV1 records.\n\n\n\nmeas &lt;- readRDS(here::here(\"data\", \"measles_final.Rds\")) |&gt;\n    subset(vaccine_antigen == \"MCV1\")\nstr(meas)\n\n'data.frame':   7972 obs. of  7 variables:\n $ iso3c           : chr  \"AFG\" \"AFG\" \"AFG\" \"AFG\" ...\n $ time            : int  1980 1981 1982 1983 1984 1985 1986 1987 1988 1989 ...\n $ country         : chr  \"Afghanistan\" \"Afghanistan\" \"Afghanistan\" \"Afghanistan\" ...\n $ Cases           : int  2792 5166 2900 640 353 2012 1511 638 1154 492 ...\n $ vaccine_antigen : chr  \"MCV1\" \"MCV1\" \"MCV1\" \"MCV1\" ...\n $ vaccine_coverage: int  11 NA 8 9 14 14 14 31 34 22 ...\n $ total_pop       : chr  \"12486631\" \"11155195\" \"10088289\" \"9951449\" ..."
   },
   {
     "objectID": "modules/ModuleXX-Iteration.html#loop-walkthrough-1",
     "href": "modules/ModuleXX-Iteration.html#loop-walkthrough-1",
     "title": "Iteration in R",
     "section": "Loop walkthrough",
-    "text": "Loop walkthrough\n\nFirst, make an empty list. This is where we’ll store our results. Make it the same length as the number of countries in the dataset.\n\n\n\nres &lt;- vector(mode = \"list\", length = length(unique(meas$country)))\n\n\nThis is called preallocation and it can make your loops much faster.",
-    "crumbs": [
-      "Day 2",
-      "Iteration in R"
-    ]
+    "text": "Loop walkthrough\n\nFirst, make an empty list. This is where we’ll store our results. Make it the same length as the number of countries in the dataset.\n\n\n\nres &lt;- vector(mode = \"list\", length = length(unique(meas$country)))\n\n\nThis is called preallocation and it can make your loops much faster."
   },
   {
     "objectID": "modules/ModuleXX-Iteration.html#loop-walkthrough-2",
     "href": "modules/ModuleXX-Iteration.html#loop-walkthrough-2",
     "title": "Iteration in R",
     "section": "Loop walkthrough",
-    "text": "Loop walkthrough\n\nLoop through every country in the dataset, and get the median, first and third quartiles, and range for each country. Store those summary statistics in a data frame.\nWhat should the header look like?\n\n\n\ncountries &lt;- unique(meas$country)\nfor (i in 1:length(countries)) {...}\n\n\n\n\nNote that we use the index as the control variable. When you need to do complex operations inside a loop, this is easier than the for-each construction we used earlier.",
-    "crumbs": [
-      "Day 2",
-      "Iteration in R"
-    ]
+    "text": "Loop walkthrough\n\nLoop through every country in the dataset, and get the median, first and third quartiles, and range for each country. Store those summary statistics in a data frame.\nWhat should the header look like?\n\n\n\ncountries &lt;- unique(meas$country)\nfor (i in 1:length(countries)) {...}\n\n\n\n\nNote that we use the index as the control variable. When you need to do complex operations inside a loop, this is easier than the for-each construction we used earlier."
   },
   {
     "objectID": "modules/ModuleXX-Iteration.html#loop-walkthrough-3",
     "href": "modules/ModuleXX-Iteration.html#loop-walkthrough-3",
     "title": "Iteration in R",
     "section": "Loop walkthrough",
-    "text": "Loop walkthrough\n\nNow write out the body of the code. First we need to subset the data, to get only the data for the current country.\n\n\n\nfor (i in 1:length(countries)) {\n    # Get the data for the current country only\n    country_data &lt;- subset(meas, country == countries[i])\n}\n\n\n\n\nNext we need to get the summary of the cases for that country.\n\n\n\n\nfor (i in 1:length(countries)) {\n    # Get the data for the current country only\n    country_data &lt;- subset(meas, country == countries[i])\n    \n    # Get the summary statistics for this country\n    country_cases &lt;- country_data$Cases\n    country_med &lt;- median(country_cases, na.rm = TRUE)\n    country_iqr &lt;- IQR(country_cases, na.rm = TRUE)\n    country_range &lt;- range(country_cases, na.rm = TRUE)\n}\n\n\n\n\nNext we save the summary statistics into a data frame.\n\n\nfor (i in 1:length(countries)) {\n    # Get the data for the current country only\n    country_data &lt;- subset(meas, country == countries[i])\n    \n    # Get the summary statistics for this country\n    country_cases &lt;- country_data$Cases\n    country_quart &lt;- quantile(\n        country_cases, na.rm = TRUE, probs = c(0.25, 0.5, 0.75)\n    )\n    country_range &lt;- range(country_cases, na.rm = TRUE)\n    \n    # Save the summary statistics into a data frame\n    country_summary &lt;- data.frame(\n        country = countries[[i]],\n        min = country_range[[1]],\n        Q1 = country_quart[[1]],\n        median = country_quart[[2]],\n        Q3 = country_quart[[3]],\n        max = country_range[[2]]\n    )\n}\n\n\n\n\nAnd finally, we save the data frame as the next element in our storage list.\n\n\nfor (i in 1:length(countries)) {\n    # Get the data for the current country only\n    country_data &lt;- subset(meas, country == countries[i])\n    \n    # Get the summary statistics for this country\n    country_cases &lt;- country_data$Cases\n    country_quart &lt;- quantile(\n        
country_cases, na.rm = TRUE, probs = c(0.25, 0.5, 0.75)\n    )\n    country_range &lt;- range(country_cases, na.rm = TRUE)\n    \n    # Save the summary statistics into a data frame\n    country_summary &lt;- data.frame(\n        country = countries[[i]],\n        min = country_range[[1]],\n        Q1 = country_quart[[1]],\n        median = country_quart[[2]],\n        Q3 = country_quart[[3]],\n        max = country_range[[2]]\n    )\n    \n    # Save the results to our container\n    res[[i]] &lt;- country_summary\n}\n\nWarning in min(x): no non-missing arguments to min; returning Inf\n\n\nWarning in max(x): no non-missing arguments to max; returning -Inf\n\n\nWarning in min(x): no non-missing arguments to min; returning Inf\n\n\nWarning in max(x): no non-missing arguments to max; returning -Inf\n\n\nWarning in min(x): no non-missing arguments to min; returning Inf\n\n\nWarning in max(x): no non-missing arguments to max; returning -Inf\n\n\n\n\n\nLet’s take a look at the results.\n\n\nhead(res)\n\n[[1]]\n      country min   Q1 median   Q3   max\n1 Afghanistan 353 1154   2205 5166 31107\n\n[[2]]\n  country min  Q1 median    Q3   max\n1  Angola  29 700   3271 14474 30067\n\n[[3]]\n  country min Q1 median Q3    max\n1 Albania   0  1     12 29 136034\n\n[[4]]\n  country min Q1 median Q3 max\n1 Andorra   0  0      1  2   5\n\n[[5]]\n               country min    Q1 median   Q3  max\n1 United Arab Emirates  22 89.75    320 1128 2913\n\n[[6]]\n    country min Q1 median     Q3   max\n1 Argentina   0  0     17 4591.5 42093\n\n\n\nHow do we deal with this to get it into a nice form?\n\n\n\n\nWe can use a vectorization trick: the function do.call() seems like ancient computer science magic. And it is. 
But it will actually help us a lot.\n\n\nres_df &lt;- do.call(rbind, res)\nhead(res_df)\n\n\n\n\ncountry\nmin\nQ1\nmedian\nQ3\nmax\n\n\n\n\nAfghanistan\n353\n1154.00\n2205\n5166.0\n31107\n\n\nAngola\n29\n700.00\n3271\n14474.0\n30067\n\n\nAlbania\n0\n1.00\n12\n29.0\n136034\n\n\nAndorra\n0\n0.00\n1\n2.0\n5\n\n\nUnited Arab Emirates\n22\n89.75\n320\n1128.0\n2913\n\n\nArgentina\n0\n0.00\n17\n4591.5\n42093\n\n\n\n\n\n\nIt combined our data frames together! Let’s take a look at the rbind and do.call() help packages to see what happened.\n\n\n\n\n?rbind\n\nCombine R Objects by Rows or Columns\n\nDescription:\n\n     Take a sequence of vector, matrix or data-frame arguments and\n     combine by _c_olumns or _r_ows, respectively.  These are generic\n     functions with methods for other R classes.\n\nUsage:\n\n     cbind(..., deparse.level = 1)\n     rbind(..., deparse.level = 1)\n     ## S3 method for class 'data.frame'\n     rbind(..., deparse.level = 1, make.row.names = TRUE,\n           stringsAsFactors = FALSE, factor.exclude = TRUE)\n     \nArguments:\n\n     ...: (generalized) vectors or matrices.  These can be given as\n          named arguments.  Other R objects may be coerced as\n          appropriate, or S4 methods may be used: see sections\n          'Details' and 'Value'.  
(For the '\"data.frame\"' method of\n          'cbind' these can be further arguments to 'data.frame' such\n          as 'stringsAsFactors'.)\n\ndeparse.level: integer controlling the construction of labels in the\n          case of non-matrix-like arguments (for the default method):\n          'deparse.level = 0' constructs no labels;\n          the default 'deparse.level = 1' typically and 'deparse.level\n          = 2' always construct labels from the argument names, see the\n          'Value' section below.\n\nmake.row.names: (only for data frame method:) logical indicating if\n          unique and valid 'row.names' should be constructed from the\n          arguments.\n\nstringsAsFactors: logical, passed to 'as.data.frame'; only has an\n          effect when the '...' arguments contain a (non-'data.frame')\n          'character'.\n\nfactor.exclude: if the data frames contain factors, the default 'TRUE'\n          ensures that 'NA' levels of factors are kept, see PR#17562\n          and the 'Data frame methods'.  In R versions up to 3.6.x,\n          'factor.exclude = NA' has been implicitly hardcoded (R &lt;=\n          3.6.0) or the default (R = 3.6.x, x &gt;= 1).\n\nDetails:\n\n     The functions 'cbind' and 'rbind' are S3 generic, with methods for\n     data frames.  The data frame method will be used if at least one\n     argument is a data frame and the rest are vectors or matrices.\n     There can be other methods; in particular, there is one for time\n     series objects.  See the section on 'Dispatch' for how the method\n     to be used is selected.  If some of the arguments are of an S4\n     class, i.e., 'isS4(.)' is true, S4 methods are sought also, and\n     the hidden 'cbind' / 'rbind' functions from package 'methods'\n     maybe called, which in turn build on 'cbind2' or 'rbind2',\n     respectively.  
In that case, 'deparse.level' is obeyed, similarly\n     to the default method.\n\n     In the default method, all the vectors/matrices must be atomic\n     (see 'vector') or lists.  Expressions are not allowed.  Language\n     objects (such as formulae and calls) and pairlists will be coerced\n     to lists: other objects (such as names and external pointers) will\n     be included as elements in a list result.  Any classes the inputs\n     might have are discarded (in particular, factors are replaced by\n     their internal codes).\n\n     If there are several matrix arguments, they must all have the same\n     number of columns (or rows) and this will be the number of columns\n     (or rows) of the result.  If all the arguments are vectors, the\n     number of columns (rows) in the result is equal to the length of\n     the longest vector.  Values in shorter arguments are recycled to\n     achieve this length (with a 'warning' if they are recycled only\n     _fractionally_).\n\n     When the arguments consist of a mix of matrices and vectors the\n     number of columns (rows) of the result is determined by the number\n     of columns (rows) of the matrix arguments.  Any vectors have their\n     values recycled or subsetted to achieve this length.\n\n     For 'cbind' ('rbind'), vectors of zero length (including 'NULL')\n     are ignored unless the result would have zero rows (columns), for\n     S compatibility.  (Zero-extent matrices do not occur in S3 and are\n     not ignored in R.)\n\n     Matrices are restricted to less than 2^31 rows and columns even on\n     64-bit systems.  So input vectors have the same length\n     restriction: as from R 3.2.0 input matrices with more elements\n     (but meeting the row and column restrictions) are allowed.\n\nValue:\n\n     For the default method, a matrix combining the '...' arguments\n     column-wise or row-wise.  
(Exception: if there are no inputs or\n     all the inputs are 'NULL', the value is 'NULL'.)\n\n     The type of a matrix result determined from the highest type of\n     any of the inputs in the hierarchy raw &lt; logical &lt; integer &lt;\n     double &lt; complex &lt; character &lt; list .\n\n     For 'cbind' ('rbind') the column (row) names are taken from the\n     'colnames' ('rownames') of the arguments if these are matrix-like.\n     Otherwise from the names of the arguments or where those are not\n     supplied and 'deparse.level &gt; 0', by deparsing the expressions\n     given, for 'deparse.level = 1' only if that gives a sensible name\n     (a 'symbol', see 'is.symbol').\n\n     For 'cbind' row names are taken from the first argument with\n     appropriate names: rownames for a matrix, or names for a vector of\n     length the number of rows of the result.\n\n     For 'rbind' column names are taken from the first argument with\n     appropriate names: colnames for a matrix, or names for a vector of\n     length the number of columns of the result.\n\nData frame methods:\n\n     The 'cbind' data frame method is just a wrapper for\n     'data.frame(..., check.names = FALSE)'.  This means that it will\n     split matrix columns in data frame arguments, and convert\n     character columns to factors unless 'stringsAsFactors = FALSE' is\n     specified.\n\n     The 'rbind' data frame method first drops all zero-column and\n     zero-row arguments.  (If that leaves none, it returns the first\n     argument with columns otherwise a zero-column zero-row data\n     frame.)  It then takes the classes of the columns from the first\n     data frame, and matches columns by name (rather than by position).\n     Factors have their levels expanded as necessary (in the order of\n     the levels of the level sets of the factors encountered) and the\n     result is an ordered factor if and only if all the components were\n     ordered factors.  
Old-style categories (integer vectors with\n     levels) are promoted to factors.\n\n     Note that for result column 'j', 'factor(., exclude = X(j))' is\n     applied, where\n\n       X(j) := if(isTRUE(factor.exclude)) {\n                  if(!NA.lev[j]) NA # else NULL\n               } else factor.exclude\n     \n     where 'NA.lev[j]' is true iff any contributing data frame has had\n     a 'factor' in column 'j' with an explicit 'NA' level.\n\nDispatch:\n\n     The method dispatching is _not_ done via 'UseMethod()', but by\n     C-internal dispatching.  Therefore there is no need for, e.g.,\n     'rbind.default'.\n\n     The dispatch algorithm is described in the source file\n     ('.../src/main/bind.c') as\n\n       1. For each argument we get the list of possible class\n          memberships from the class attribute.\n\n       2. We inspect each class in turn to see if there is an\n          applicable method.\n\n       3. If we find a method, we use it.  Otherwise, if there was an\n          S4 object among the arguments, we try S4 dispatch; otherwise,\n          we use the default code.\n\n     If you want to combine other objects with data frames, it may be\n     necessary to coerce them to data frames first.  (Note that this\n     algorithm can result in calling the data frame method if all the\n     arguments are either data frames or vectors, and this will result\n     in the coercion of character vectors to factors.)\n\nReferences:\n\n     Becker, R. A., Chambers, J. M. and Wilks, A. R. (1988) _The New S\n     Language_.  
Wadsworth & Brooks/Cole.\n\nSee Also:\n\n     'c' to combine vectors (and lists) as vectors, 'data.frame' to\n     combine vectors and matrices as a data frame.\n\nExamples:\n\n     m &lt;- cbind(1, 1:7) # the '1' (= shorter vector) is recycled\n     m\n     m &lt;- cbind(m, 8:14)[, c(1, 3, 2)] # insert a column\n     m\n     cbind(1:7, diag(3)) # vector is subset -&gt; warning\n     \n     cbind(0, rbind(1, 1:3))\n     cbind(I = 0, X = rbind(a = 1, b = 1:3))  # use some names\n     xx &lt;- data.frame(I = rep(0,2))\n     cbind(xx, X = rbind(a = 1, b = 1:3))   # named differently\n     \n     cbind(0, matrix(1, nrow = 0, ncol = 4)) #&gt; Warning (making sense)\n     dim(cbind(0, matrix(1, nrow = 2, ncol = 0))) #-&gt; 2 x 1\n     \n     ## deparse.level\n     dd &lt;- 10\n     rbind(1:4, c = 2, \"a++\" = 10, dd, deparse.level = 0) # middle 2 rownames\n     rbind(1:4, c = 2, \"a++\" = 10, dd, deparse.level = 1) # 3 rownames (default)\n     rbind(1:4, c = 2, \"a++\" = 10, dd, deparse.level = 2) # 4 rownames\n     \n     ## cheap row names:\n     b0 &lt;- gl(3,4, labels=letters[1:3])\n     bf &lt;- setNames(b0, paste0(\"o\", seq_along(b0)))\n     df  &lt;- data.frame(a = 1, B = b0, f = gl(4,3))\n     df. &lt;- data.frame(a = 1, B = bf, f = gl(4,3))\n     new &lt;- data.frame(a = 8, B =\"B\", f = \"1\")\n     (df1  &lt;- rbind(df , new))\n     (df.1 &lt;- rbind(df., new))\n     stopifnot(identical(df1, rbind(df,  new, make.row.names=FALSE)),\n               identical(df1, rbind(df., new, make.row.names=FALSE)))\n\n\n\n\n\n?do.call\n\nExecute a Function Call\n\nDescription:\n\n     'do.call' constructs and executes a function call from a name or a\n     function and a list of arguments to be passed to it.\n\nUsage:\n\n     do.call(what, args, quote = FALSE, envir = parent.frame())\n     \nArguments:\n\n    what: either a function or a non-empty character string naming the\n          function to be called.\n\n    args: a _list_ of arguments to the function call.  
The 'names'\n          attribute of 'args' gives the argument names.\n\n   quote: a logical value indicating whether to quote the arguments.\n\n   envir: an environment within which to evaluate the call.  This will\n          be most useful if 'what' is a character string and the\n          arguments are symbols or quoted expressions.\n\nDetails:\n\n     If 'quote' is 'FALSE', the default, then the arguments are\n     evaluated (in the calling environment, not in 'envir').  If\n     'quote' is 'TRUE' then each argument is quoted (see 'quote') so\n     that the effect of argument evaluation is to remove the quotes -\n     leaving the original arguments unevaluated when the call is\n     constructed.\n\n     The behavior of some functions, such as 'substitute', will not be\n     the same for functions evaluated using 'do.call' as if they were\n     evaluated from the interpreter.  The precise semantics are\n     currently undefined and subject to change.\n\nValue:\n\n     The result of the (evaluated) function call.\n\nWarning:\n\n     This should not be used to attempt to evade restrictions on the\n     use of '.Internal' and other non-API calls.\n\nReferences:\n\n     Becker, R. A., Chambers, J. M. and Wilks, A. R. (1988) _The New S\n     Language_.  
Wadsworth & Brooks/Cole.\n\nSee Also:\n\n     'call' which creates an unevaluated call.\n\nExamples:\n\n     do.call(\"complex\", list(imaginary = 1:3))\n     \n     ## if we already have a list (e.g., a data frame)\n     ## we need c() to add further arguments\n     tmp &lt;- expand.grid(letters[1:2], 1:3, c(\"+\", \"-\"))\n     do.call(\"paste\", c(tmp, sep = \"\"))\n     \n     do.call(paste, list(as.name(\"A\"), as.name(\"B\")), quote = TRUE)\n     \n     ## examples of where objects will be found.\n     A &lt;- 2\n     f &lt;- function(x) print(x^2)\n     env &lt;- new.env()\n     assign(\"A\", 10, envir = env)\n     assign(\"f\", f, envir = env)\n     f &lt;- function(x) print(x)\n     f(A)                                      # 2\n     do.call(\"f\", list(A))                     # 2\n     do.call(\"f\", list(A), envir = env)        # 4\n     do.call( f,  list(A), envir = env)        # 2\n     do.call(\"f\", list(quote(A)), envir = env) # 100\n     do.call( f,  list(quote(A)), envir = env) # 10\n     do.call(\"f\", list(as.name(\"A\")), envir = env) # 100\n     \n     eval(call(\"f\", A))                      # 2\n     eval(call(\"f\", quote(A)))               # 2\n     eval(call(\"f\", A), envir = env)         # 4\n     eval(call(\"f\", quote(A)), envir = env)  # 100\n\n\n\n\n\nOK, so basically what happened is that\n\n\ndo.call(rbind, list)\n\n\nGets transformed into\n\n\nrbind(list[[1]], list[[2]], list[[3]], ..., list[[length(list)]])\n\n\nThat’s vectorization magic!",
-    "crumbs": [
-      "Day 2",
-      "Iteration in R"
-    ]
+    "text": "Loop walkthrough\n\nNow write out the body of the code. First we need to subset the data, to get only the data for the current country.\n\n\n\nfor (i in 1:length(countries)) {\n    # Get the data for the current country only\n    country_data &lt;- subset(meas, country == countries[i])\n}\n\n\n\n\nNext we need to get the summary of the cases for that country.\n\n\n\n\nfor (i in 1:length(countries)) {\n    # Get the data for the current country only\n    country_data &lt;- subset(meas, country == countries[i])\n    \n    # Get the summary statistics for this country\n    country_cases &lt;- country_data$Cases\n    country_med &lt;- median(country_cases, na.rm = TRUE)\n    country_iqr &lt;- IQR(country_cases, na.rm = TRUE)\n    country_range &lt;- range(country_cases, na.rm = TRUE)\n}\n\n\n\n\nNext we save the summary statistics into a data frame.\n\n\nfor (i in 1:length(countries)) {\n    # Get the data for the current country only\n    country_data &lt;- subset(meas, country == countries[i])\n    \n    # Get the summary statistics for this country\n    country_cases &lt;- country_data$Cases\n    country_quart &lt;- quantile(\n        country_cases, na.rm = TRUE, probs = c(0.25, 0.5, 0.75)\n    )\n    country_range &lt;- range(country_cases, na.rm = TRUE)\n    \n    # Save the summary statistics into a data frame\n    country_summary &lt;- data.frame(\n        country = countries[[i]],\n        min = country_range[[1]],\n        Q1 = country_quart[[1]],\n        median = country_quart[[2]],\n        Q3 = country_quart[[3]],\n        max = country_range[[2]]\n    )\n}\n\n\n\n\nAnd finally, we save the data frame as the next element in our storage list.\n\n\nfor (i in 1:length(countries)) {\n    # Get the data for the current country only\n    country_data &lt;- subset(meas, country == countries[i])\n    \n    # Get the summary statistics for this country\n    country_cases &lt;- country_data$Cases\n    country_quart &lt;- quantile(\n        
country_cases, na.rm = TRUE, probs = c(0.25, 0.5, 0.75)\n    )\n    country_range &lt;- range(country_cases, na.rm = TRUE)\n    \n    # Save the summary statistics into a data frame\n    country_summary &lt;- data.frame(\n        country = countries[[i]],\n        min = country_range[[1]],\n        Q1 = country_quart[[1]],\n        median = country_quart[[2]],\n        Q3 = country_quart[[3]],\n        max = country_range[[2]]\n    )\n    \n    # Save the results to our container\n    res[[i]] &lt;- country_summary\n}\n\nWarning in min(x): no non-missing arguments to min; returning Inf\n\n\nWarning in max(x): no non-missing arguments to max; returning -Inf\n\n\nWarning in min(x): no non-missing arguments to min; returning Inf\n\n\nWarning in max(x): no non-missing arguments to max; returning -Inf\n\n\nWarning in min(x): no non-missing arguments to min; returning Inf\n\n\nWarning in max(x): no non-missing arguments to max; returning -Inf\n\n\n\n\n\nLet’s take a look at the results.\n\n\nhead(res)\n\n[[1]]\n      country min   Q1 median   Q3   max\n1 Afghanistan 353 1154   2205 5166 31107\n\n[[2]]\n  country min  Q1 median    Q3   max\n1  Angola  29 700   3271 14474 30067\n\n[[3]]\n  country min Q1 median Q3    max\n1 Albania   0  1     12 29 136034\n\n[[4]]\n  country min Q1 median Q3 max\n1 Andorra   0  0      1  2   5\n\n[[5]]\n               country min    Q1 median   Q3  max\n1 United Arab Emirates  22 89.75    320 1128 2913\n\n[[6]]\n    country min Q1 median     Q3   max\n1 Argentina   0  0     17 4591.5 42093\n\n\n\nHow do we deal with this to get it into a nice form?\n\n\n\n\nWe can use a vectorization trick: the function do.call() seems like ancient computer science magic. And it is. 
But it will actually help us a lot.\n\n\nres_df &lt;- do.call(rbind, res)\nhead(res_df)\n\n\n\n\ncountry\nmin\nQ1\nmedian\nQ3\nmax\n\n\n\n\nAfghanistan\n353\n1154.00\n2205\n5166.0\n31107\n\n\nAngola\n29\n700.00\n3271\n14474.0\n30067\n\n\nAlbania\n0\n1.00\n12\n29.0\n136034\n\n\nAndorra\n0\n0.00\n1\n2.0\n5\n\n\nUnited Arab Emirates\n22\n89.75\n320\n1128.0\n2913\n\n\nArgentina\n0\n0.00\n17\n4591.5\n42093\n\n\n\n\n\n\nIt combined our data frames together! Let’s take a look at the rbind() and do.call() help pages to see what happened.\n\n\n\n\n?rbind\n\nCombine R Objects by Rows or Columns\n\nDescription:\n\n     Take a sequence of vector, matrix or data-frame arguments and\n     combine by _c_olumns or _r_ows, respectively.  These are generic\n     functions with methods for other R classes.\n\nUsage:\n\n     cbind(..., deparse.level = 1)\n     rbind(..., deparse.level = 1)\n     ## S3 method for class 'data.frame'\n     rbind(..., deparse.level = 1, make.row.names = TRUE,\n           stringsAsFactors = FALSE, factor.exclude = TRUE)\n     \nArguments:\n\n     ...: (generalized) vectors or matrices.  These can be given as\n          named arguments.  Other R objects may be coerced as\n          appropriate, or S4 methods may be used: see sections\n          'Details' and 'Value'.  
(For the '\"data.frame\"' method of\n          'cbind' these can be further arguments to 'data.frame' such\n          as 'stringsAsFactors'.)\n\ndeparse.level: integer controlling the construction of labels in the\n          case of non-matrix-like arguments (for the default method):\n          'deparse.level = 0' constructs no labels;\n          the default 'deparse.level = 1' typically and 'deparse.level\n          = 2' always construct labels from the argument names, see the\n          'Value' section below.\n\nmake.row.names: (only for data frame method:) logical indicating if\n          unique and valid 'row.names' should be constructed from the\n          arguments.\n\nstringsAsFactors: logical, passed to 'as.data.frame'; only has an\n          effect when the '...' arguments contain a (non-'data.frame')\n          'character'.\n\nfactor.exclude: if the data frames contain factors, the default 'TRUE'\n          ensures that 'NA' levels of factors are kept, see PR#17562\n          and the 'Data frame methods'.  In R versions up to 3.6.x,\n          'factor.exclude = NA' has been implicitly hardcoded (R &lt;=\n          3.6.0) or the default (R = 3.6.x, x &gt;= 1).\n\nDetails:\n\n     The functions 'cbind' and 'rbind' are S3 generic, with methods for\n     data frames.  The data frame method will be used if at least one\n     argument is a data frame and the rest are vectors or matrices.\n     There can be other methods; in particular, there is one for time\n     series objects.  See the section on 'Dispatch' for how the method\n     to be used is selected.  If some of the arguments are of an S4\n     class, i.e., 'isS4(.)' is true, S4 methods are sought also, and\n     the hidden 'cbind' / 'rbind' functions from package 'methods'\n     maybe called, which in turn build on 'cbind2' or 'rbind2',\n     respectively.  
In that case, 'deparse.level' is obeyed, similarly\n     to the default method.\n\n     In the default method, all the vectors/matrices must be atomic\n     (see 'vector') or lists.  Expressions are not allowed.  Language\n     objects (such as formulae and calls) and pairlists will be coerced\n     to lists: other objects (such as names and external pointers) will\n     be included as elements in a list result.  Any classes the inputs\n     might have are discarded (in particular, factors are replaced by\n     their internal codes).\n\n     If there are several matrix arguments, they must all have the same\n     number of columns (or rows) and this will be the number of columns\n     (or rows) of the result.  If all the arguments are vectors, the\n     number of columns (rows) in the result is equal to the length of\n     the longest vector.  Values in shorter arguments are recycled to\n     achieve this length (with a 'warning' if they are recycled only\n     _fractionally_).\n\n     When the arguments consist of a mix of matrices and vectors the\n     number of columns (rows) of the result is determined by the number\n     of columns (rows) of the matrix arguments.  Any vectors have their\n     values recycled or subsetted to achieve this length.\n\n     For 'cbind' ('rbind'), vectors of zero length (including 'NULL')\n     are ignored unless the result would have zero rows (columns), for\n     S compatibility.  (Zero-extent matrices do not occur in S3 and are\n     not ignored in R.)\n\n     Matrices are restricted to less than 2^31 rows and columns even on\n     64-bit systems.  So input vectors have the same length\n     restriction: as from R 3.2.0 input matrices with more elements\n     (but meeting the row and column restrictions) are allowed.\n\nValue:\n\n     For the default method, a matrix combining the '...' arguments\n     column-wise or row-wise.  
(Exception: if there are no inputs or\n     all the inputs are 'NULL', the value is 'NULL'.)\n\n     The type of a matrix result determined from the highest type of\n     any of the inputs in the hierarchy raw &lt; logical &lt; integer &lt;\n     double &lt; complex &lt; character &lt; list .\n\n     For 'cbind' ('rbind') the column (row) names are taken from the\n     'colnames' ('rownames') of the arguments if these are matrix-like.\n     Otherwise from the names of the arguments or where those are not\n     supplied and 'deparse.level &gt; 0', by deparsing the expressions\n     given, for 'deparse.level = 1' only if that gives a sensible name\n     (a 'symbol', see 'is.symbol').\n\n     For 'cbind' row names are taken from the first argument with\n     appropriate names: rownames for a matrix, or names for a vector of\n     length the number of rows of the result.\n\n     For 'rbind' column names are taken from the first argument with\n     appropriate names: colnames for a matrix, or names for a vector of\n     length the number of columns of the result.\n\nData frame methods:\n\n     The 'cbind' data frame method is just a wrapper for\n     'data.frame(..., check.names = FALSE)'.  This means that it will\n     split matrix columns in data frame arguments, and convert\n     character columns to factors unless 'stringsAsFactors = FALSE' is\n     specified.\n\n     The 'rbind' data frame method first drops all zero-column and\n     zero-row arguments.  (If that leaves none, it returns the first\n     argument with columns otherwise a zero-column zero-row data\n     frame.)  It then takes the classes of the columns from the first\n     data frame, and matches columns by name (rather than by position).\n     Factors have their levels expanded as necessary (in the order of\n     the levels of the level sets of the factors encountered) and the\n     result is an ordered factor if and only if all the components were\n     ordered factors.  
(The last point differs from S-PLUS.)  Old-style\n     categories (integer vectors with levels) are promoted to factors.\n\n     Note that for result column 'j', 'factor(., exclude = X(j))' is\n     applied, where\n\n       X(j) := if(isTRUE(factor.exclude)) {\n                  if(!NA.lev[j]) NA # else NULL\n               } else factor.exclude\n     \n     where 'NA.lev[j]' is true iff any contributing data frame has had\n     a 'factor' in column 'j' with an explicit 'NA' level.\n\nDispatch:\n\n     The method dispatching is _not_ done via 'UseMethod()', but by\n     C-internal dispatching.  Therefore there is no need for, e.g.,\n     'rbind.default'.\n\n     The dispatch algorithm is described in the source file\n     ('.../src/main/bind.c') as\n\n       1. For each argument we get the list of possible class\n          memberships from the class attribute.\n\n       2. We inspect each class in turn to see if there is an\n          applicable method.\n\n       3. If we find a method, we use it.  Otherwise, if there was an\n          S4 object among the arguments, we try S4 dispatch; otherwise,\n          we use the default code.\n\n     If you want to combine other objects with data frames, it may be\n     necessary to coerce them to data frames first.  (Note that this\n     algorithm can result in calling the data frame method if all the\n     arguments are either data frames or vectors, and this will result\n     in the coercion of character vectors to factors.)\n\nReferences:\n\n     Becker, R. A., Chambers, J. M. and Wilks, A. R. (1988) _The New S\n     Language_.  
Wadsworth & Brooks/Cole.\n\nSee Also:\n\n     'c' to combine vectors (and lists) as vectors, 'data.frame' to\n     combine vectors and matrices as a data frame.\n\nExamples:\n\n     m &lt;- cbind(1, 1:7) # the '1' (= shorter vector) is recycled\n     m\n     m &lt;- cbind(m, 8:14)[, c(1, 3, 2)] # insert a column\n     m\n     cbind(1:7, diag(3)) # vector is subset -&gt; warning\n     \n     cbind(0, rbind(1, 1:3))\n     cbind(I = 0, X = rbind(a = 1, b = 1:3))  # use some names\n     xx &lt;- data.frame(I = rep(0,2))\n     cbind(xx, X = rbind(a = 1, b = 1:3))   # named differently\n     \n     cbind(0, matrix(1, nrow = 0, ncol = 4)) #&gt; Warning (making sense)\n     dim(cbind(0, matrix(1, nrow = 2, ncol = 0))) #-&gt; 2 x 1\n     \n     ## deparse.level\n     dd &lt;- 10\n     rbind(1:4, c = 2, \"a++\" = 10, dd, deparse.level = 0) # middle 2 rownames\n     rbind(1:4, c = 2, \"a++\" = 10, dd, deparse.level = 1) # 3 rownames (default)\n     rbind(1:4, c = 2, \"a++\" = 10, dd, deparse.level = 2) # 4 rownames\n     \n     ## cheap row names:\n     b0 &lt;- gl(3,4, labels=letters[1:3])\n     bf &lt;- setNames(b0, paste0(\"o\", seq_along(b0)))\n     df  &lt;- data.frame(a = 1, B = b0, f = gl(4,3))\n     df. &lt;- data.frame(a = 1, B = bf, f = gl(4,3))\n     new &lt;- data.frame(a = 8, B =\"B\", f = \"1\")\n     (df1  &lt;- rbind(df , new))\n     (df.1 &lt;- rbind(df., new))\n     stopifnot(identical(df1, rbind(df,  new, make.row.names=FALSE)),\n               identical(df1, rbind(df., new, make.row.names=FALSE)))\n\n\n\n\n\n?do.call\n\nExecute a Function Call\n\nDescription:\n\n     'do.call' constructs and executes a function call from a name or a\n     function and a list of arguments to be passed to it.\n\nUsage:\n\n     do.call(what, args, quote = FALSE, envir = parent.frame())\n     \nArguments:\n\n    what: either a function or a non-empty character string naming the\n          function to be called.\n\n    args: a _list_ of arguments to the function call.  
The 'names'\n          attribute of 'args' gives the argument names.\n\n   quote: a logical value indicating whether to quote the arguments.\n\n   envir: an environment within which to evaluate the call.  This will\n          be most useful if 'what' is a character string and the\n          arguments are symbols or quoted expressions.\n\nDetails:\n\n     If 'quote' is 'FALSE', the default, then the arguments are\n     evaluated (in the calling environment, not in 'envir').  If\n     'quote' is 'TRUE' then each argument is quoted (see 'quote') so\n     that the effect of argument evaluation is to remove the quotes -\n     leaving the original arguments unevaluated when the call is\n     constructed.\n\n     The behavior of some functions, such as 'substitute', will not be\n     the same for functions evaluated using 'do.call' as if they were\n     evaluated from the interpreter.  The precise semantics are\n     currently undefined and subject to change.\n\nValue:\n\n     The result of the (evaluated) function call.\n\nWarning:\n\n     This should not be used to attempt to evade restrictions on the\n     use of '.Internal' and other non-API calls.\n\nReferences:\n\n     Becker, R. A., Chambers, J. M. and Wilks, A. R. (1988) _The New S\n     Language_.  
Wadsworth & Brooks/Cole.\n\nSee Also:\n\n     'call' which creates an unevaluated call.\n\nExamples:\n\n     do.call(\"complex\", list(imaginary = 1:3))\n     \n     ## if we already have a list (e.g., a data frame)\n     ## we need c() to add further arguments\n     tmp &lt;- expand.grid(letters[1:2], 1:3, c(\"+\", \"-\"))\n     do.call(\"paste\", c(tmp, sep = \"\"))\n     \n     do.call(paste, list(as.name(\"A\"), as.name(\"B\")), quote = TRUE)\n     \n     ## examples of where objects will be found.\n     A &lt;- 2\n     f &lt;- function(x) print(x^2)\n     env &lt;- new.env()\n     assign(\"A\", 10, envir = env)\n     assign(\"f\", f, envir = env)\n     f &lt;- function(x) print(x)\n     f(A)                                      # 2\n     do.call(\"f\", list(A))                     # 2\n     do.call(\"f\", list(A), envir = env)        # 4\n     do.call( f,  list(A), envir = env)        # 2\n     do.call(\"f\", list(quote(A)), envir = env) # 100\n     do.call( f,  list(quote(A)), envir = env) # 10\n     do.call(\"f\", list(as.name(\"A\")), envir = env) # 100\n     \n     eval(call(\"f\", A))                      # 2\n     eval(call(\"f\", quote(A)))               # 2\n     eval(call(\"f\", A), envir = env)         # 4\n     eval(call(\"f\", quote(A)), envir = env)  # 100\n\n\n\n\n\nOK, so basically what happened is that\n\n\ndo.call(rbind, list)\n\n\nGets transformed into\n\n\nrbind(list[[1]], list[[2]], list[[3]], ..., list[[length(list)]])\n\n\nThat’s vectorization magic!"
   },
   {
     "objectID": "modules/ModuleXX-Iteration.html#you-try-it-if-we-have-time",
     "href": "modules/ModuleXX-Iteration.html#you-try-it-if-we-have-time",
     "title": "Iteration in R",
     "section": "You try it! (if we have time)",
-    "text": "You try it! (if we have time)\n\nUse the code you wrote before the get the incidence per 1000 people on the entire measles data set (add a column for incidence to the full data).\nUse the code plot(NULL, NULL, ...) to make a blank plot. You will need to set the xlim and ylim arguments to sensible values, and specify the axis titles as “Year” and “Incidence per 1000 people”.\nUsing a for loop and the lines() function, make a plot that shows all of the incidence curves over time, overlapping on the plot.\nHINT: use col = adjustcolor(black, alpha.f = 0.25) to make the curves transparent, so you can see the others.\nBONUS PROBLEM: using the function cumsum(), make a plot of the cumulative incidence per 1000 people over time for all of the countries. (Dealing with the NA’s here is tricky!!)",
-    "crumbs": [
-      "Day 2",
-      "Iteration in R"
-    ]
+    "text": "You try it! (if we have time)\n\nUse the code you wrote before to get the incidence per 1000 people on the entire measles data set (add a column for incidence to the full data).\nUse the code plot(NULL, NULL, ...) to make a blank plot. You will need to set the xlim and ylim arguments to sensible values, and specify the axis titles as “Year” and “Incidence per 1000 people”.\nUsing a for loop and the lines() function, make a plot that shows all of the incidence curves over time, overlapping on the plot.\nHINT: use col = adjustcolor(\"black\", alpha.f = 0.25) to make the curves transparent, so you can see the others.\nBONUS PROBLEM: using the function cumsum(), make a plot of the cumulative incidence per 1000 people over time for all of the countries. (Dealing with the NAs here is tricky!!)"
   },
   {
     "objectID": "modules/ModuleXX-Iteration.html#main-problem-solution",
     "href": "modules/ModuleXX-Iteration.html#main-problem-solution",
     "title": "Iteration in R",
     "section": "Main problem solution",
-    "text": "Main problem solution\n\nmeas$cases_per_thousand &lt;- meas$Cases / as.numeric(meas$total_pop) * 1000\ncountries &lt;- unique(meas$country)\n\nplot(\n    NULL, NULL,\n    xlim = c(1980, 2022),\n    ylim = c(0, 50),\n    xlab = \"Year\",\n    ylab = \"Incidence per 1000 people\"\n)\n\nfor (i in 1:length(countries)) {\n    country_data &lt;- subset(meas, country == countries[[i]])\n    lines(\n        x = country_data$time,\n        y = country_data$cases_per_thousand,\n        col = adjustcolor(\"black\", alpha.f = 0.25)\n    )\n}",
-    "crumbs": [
-      "Day 2",
-      "Iteration in R"
-    ]
+    "text": "Main problem solution\n\nmeas$cases_per_thousand &lt;- meas$Cases / as.numeric(meas$total_pop) * 1000\ncountries &lt;- unique(meas$country)\n\nplot(\n    NULL, NULL,\n    xlim = c(1980, 2022),\n    ylim = c(0, 50),\n    xlab = \"Year\",\n    ylab = \"Incidence per 1000 people\"\n)\n\nfor (i in 1:length(countries)) {\n    country_data &lt;- subset(meas, country == countries[[i]])\n    lines(\n        x = country_data$time,\n        y = country_data$cases_per_thousand,\n        col = adjustcolor(\"black\", alpha.f = 0.25)\n    )\n}"
   },
   {
     "objectID": "modules/ModuleXX-Iteration.html#bonus-problem-solution",
     "href": "modules/ModuleXX-Iteration.html#bonus-problem-solution",
     "title": "Iteration in R",
     "section": "Bonus problem solution",
-    "text": "Bonus problem solution\n\n# First calculate the cumulative cases, treating NA as zeroes\ncumulative_cases &lt;- ave(\n    x = ifelse(is.na(meas$Cases), 0, meas$Cases),\n    meas$country,\n    FUN = cumsum\n)\n\n# Now put the NAs back where they should be\nmeas$cumulative_cases &lt;- cumulative_cases + (meas$Cases * 0)\n\nplot(\n    NULL, NULL,\n    xlim = c(1980, 2022),\n    ylim = c(1, 6.2e6),\n    xlab = \"Year\",\n    ylab = \"Cumulative cases per 1000 people\"\n)\n\nfor (i in 1:length(countries)) {\n    country_data &lt;- subset(meas, country == countries[[i]])\n    lines(\n        x = country_data$time,\n        y = country_data$cumulative_cases,\n        col = adjustcolor(\"black\", alpha.f = 0.25)\n    )\n}\n\ntext(\n    x = 2020,\n    y = 6e6,\n    labels = \"China →\"\n)",
-    "crumbs": [
-      "Day 2",
-      "Iteration in R"
-    ]
+    "text": "Bonus problem solution\n\n# First calculate the cumulative cases, treating NA as zeroes\ncumulative_cases &lt;- ave(\n    x = ifelse(is.na(meas$Cases), 0, meas$Cases),\n    meas$country,\n    FUN = cumsum\n)\n\n# Now put the NAs back where they should be\nmeas$cumulative_cases &lt;- cumulative_cases + (meas$Cases * 0)\n\nplot(\n    NULL, NULL,\n    xlim = c(1980, 2022),\n    ylim = c(1, 6.2e6),\n    xlab = \"Year\",\n    ylab = \"Cumulative cases\"\n)\n\nfor (i in 1:length(countries)) {\n    country_data &lt;- subset(meas, country == countries[[i]])\n    lines(\n        x = country_data$time,\n        y = country_data$cumulative_cases,\n        col = adjustcolor(\"black\", alpha.f = 0.25)\n    )\n}\n\ntext(\n    x = 2020,\n    y = 6e6,\n    labels = \"China →\"\n)"
   },
   {
     "objectID": "modules/ModuleXX-Iteration.html#main-problem-solution-1",
     "href": "modules/ModuleXX-Iteration.html#main-problem-solution-1",
     "title": "Iteration in R",
     "section": "Main problem solution",
-    "text": "Main problem solution",
-    "crumbs": [
-      "Day 2",
-      "Iteration in R"
-    ]
+    "text": "Main problem solution"
   },
   {
     "objectID": "modules/ModuleXX-Iteration.html#bonus-problem-solution-1",
     "href": "modules/ModuleXX-Iteration.html#bonus-problem-solution-1",
     "title": "Iteration in R",
     "section": "Bonus problem solution",
-    "text": "Bonus problem solution",
-    "crumbs": [
-      "Day 2",
-      "Iteration in R"
-    ]
+    "text": "Bonus problem solution"
   },
   {
     "objectID": "modules/ModuleXX-Iteration.html#more-practice-on-your-own",
     "href": "modules/ModuleXX-Iteration.html#more-practice-on-your-own",
     "title": "Iteration in R",
     "section": "More practice on your own",
-    "text": "More practice on your own\n\nMerge the countries-regions.csv data with the measles_final.Rds data. Reshape the measles data so that MCV1 and MCV2 vaccine coverage are two separate columns. Then use a loop to fit a poisson regression model for each continent where Cases is the outcome, and MCV1 coverage and MCV2 coverage are the predictors. Discuss your findings, and try adding an interation term.\nAssess the impact of age_months as a confounder in the Diphtheria serology data. First, write code to transform age_months into age ranges for each year. Then, using a loop, calculate the crude odds ratio for the effect of vaccination on infection for each of the age ranges. How does the odds ratio change as age increases? Can you formalize this analysis by fitting a logistic regression model with age_months and vaccination as predictors?",
-    "crumbs": [
-      "Day 2",
-      "Iteration in R"
-    ]
+    "text": "More practice on your own\n\nMerge the countries-regions.csv data with the measles_final.Rds data. Reshape the measles data so that MCV1 and MCV2 vaccine coverage are two separate columns. Then use a loop to fit a Poisson regression model for each continent where Cases is the outcome, and MCV1 coverage and MCV2 coverage are the predictors. Discuss your findings, and try adding an interaction term.\nAssess the impact of age_months as a confounder in the Diphtheria serology data. First, write code to transform age_months into age ranges for each year. Then, using a loop, calculate the crude odds ratio for the effect of vaccination on infection for each of the age ranges. How does the odds ratio change as age increases? Can you formalize this analysis by fitting a logistic regression model with age_months and vaccination as predictors?"
   },
   {
     "objectID": "modules/ModuleXX-RMarkdown.html#learning-goals",
@@ -760,14 +660,14 @@
     "href": "modules/Module10-DataVisualization.html#prep-data",
     "title": "Module 10: Data Visualization",
     "section": "Prep data",
-    "text": "Prep data\nCreate age_group three level factor variable\n\ndf$age_group &lt;- ifelse(df$age &lt;= 5, \"young\", \n                       ifelse(df$age&lt;=10 & df$age&gt;5, \"middle\", \n                              ifelse(df$age&gt;10, \"old\", NA)))\ndf$age_group &lt;- factor(df$age_group, levels=c(\"young\", \"middle\", \"old\"))\n\nCreate seropos binary variable representing seropositivity if antibody concentrations are &gt;10 mIUmL.\n\ndf$seropos &lt;- ifelse(df$IgG_concentration&lt;10, 0, \n                                        ifelse(df$IgG_concentration&gt;=10, 1, NA))"
+    "text": "Prep data\nCreate age_group three-level factor variable\n\ndf$age_group &lt;- ifelse(df$age &lt;= 5, \"young\", \n                       ifelse(df$age&lt;=10 & df$age&gt;5, \"middle\", \"old\")) \ndf$age_group &lt;- factor(df$age_group, levels=c(\"young\", \"middle\", \"old\"))\n\nCreate seropos binary variable representing seropositivity if antibody concentrations are &gt;=10 IU/mL.\n\ndf$seropos &lt;- ifelse(df$IgG_concentration&lt;10, 0, 1)"
   },
   {
     "objectID": "modules/Module10-DataVisualization.html#base-r-data-visualizattion-functions",
     "href": "modules/Module10-DataVisualization.html#base-r-data-visualizattion-functions",
     "title": "Module 10: Data Visualization",
     "section": "Base R data visualizattion functions",
-    "text": "Base R data visualizattion functions\nThe Base R ‘graphics’ package has a ton of graphics options.\n\nlibrary(help = \"graphics\")\n\n\n\nRegistered S3 method overwritten by 'printr':\n  method                from     \n  knit_print.data.frame rmarkdown\n\n\n        Information on package 'graphics'\n\nDescription:\n\nPackage:            graphics\nVersion:            4.3.1\nPriority:           base\nTitle:              The R Graphics Package\nAuthor:             R Core Team and contributors worldwide\nMaintainer:         R Core Team &lt;do-use-Contact-address@r-project.org&gt;\nContact:            R-help mailing list &lt;r-help@r-project.org&gt;\nDescription:        R functions for base graphics.\nImports:            grDevices\nLicense:            Part of R 4.3.1\nNeedsCompilation:   yes\nBuilt:              R 4.3.1; aarch64-apple-darwin20; 2023-06-16\n                    21:53:01 UTC; unix\n\nIndex:\n\nAxis                    Generic Function to Add an Axis to a Plot\nabline                  Add Straight Lines to a Plot\narrows                  Add Arrows to a Plot\nassocplot               Association Plots\naxTicks                 Compute Axis Tickmark Locations\naxis                    Add an Axis to a Plot\naxis.POSIXct            Date and Date-time Plotting Functions\nbarplot                 Bar Plots\nbox                     Draw a Box around a Plot\nboxplot                 Box Plots\nboxplot.matrix          Draw a Boxplot for each Column (Row) of a\n                        Matrix\nbxp                     Draw Box Plots from Summaries\ncdplot                  Conditional Density Plots\nclip                    Set Clipping Region\ncontour                 Display Contours\ncoplot                  Conditioning Plots\ncurve                   Draw Function Plots\ndotchart                Cleveland's Dot Plots\nfilled.contour          Level (Contour) Plots\nfourfoldplot            Fourfold Plots\nframe                   Create / Start a New Plot 
Frame\ngraphics-package        The R Graphics Package\ngrconvertX              Convert between Graphics Coordinate Systems\ngrid                    Add Grid to a Plot\nhist                    Histograms\nhist.POSIXt             Histogram of a Date or Date-Time Object\nidentify                Identify Points in a Scatter Plot\nimage                   Display a Color Image\nlayout                  Specifying Complex Plot Arrangements\nlegend                  Add Legends to Plots\nlines                   Add Connected Line Segments to a Plot\nlocator                 Graphical Input\nmatplot                 Plot Columns of Matrices\nmosaicplot              Mosaic Plots\nmtext                   Write Text into the Margins of a Plot\npairs                   Scatterplot Matrices\npanel.smooth            Simple Panel Plot\npar                     Set or Query Graphical Parameters\npersp                   Perspective Plots\npie                     Pie Charts\nplot.data.frame         Plot Method for Data Frames\nplot.default            The Default Scatterplot Function\nplot.design             Plot Univariate Effects of a Design or Model\nplot.factor             Plotting Factor Variables\nplot.formula            Formula Notation for Scatterplots\nplot.histogram          Plot Histograms\nplot.raster             Plotting Raster Images\nplot.table              Plot Methods for 'table' Objects\nplot.window             Set up World Coordinates for Graphics Window\nplot.xy                 Basic Internal Plot Function\npoints                  Add Points to a Plot\npolygon                 Polygon Drawing\npolypath                Path Drawing\nrasterImage             Draw One or More Raster Images\nrect                    Draw One or More Rectangles\nrug                     Add a Rug to a Plot\nscreen                  Creating and Controlling Multiple Screens on a\n                        Single Device\nsegments                Add Line Segments to a Plot\nsmoothScatter           
Scatterplots with Smoothed Densities Color\n                        Representation\nspineplot               Spine Plots and Spinograms\nstars                   Star (Spider/Radar) Plots and Segment Diagrams\nstem                    Stem-and-Leaf Plots\nstripchart              1-D Scatter Plots\nstrwidth                Plotting Dimensions of Character Strings and\n                        Math Expressions\nsunflowerplot           Produce a Sunflower Scatter Plot\nsymbols                 Draw Symbols (Circles, Squares, Stars,\n                        Thermometers, Boxplots)\ntext                    Add Text to a Plot\ntitle                   Plot Annotation\nxinch                   Graphical Units\nxspline                 Draw an X-spline"
+    "text": "Base R data visualizattion functions\nThe Base R ‘graphics’ package has a ton of graphics options.\n\nhelp(package = \"graphics\")\n\n\n\nRegistered S3 method overwritten by 'printr':\n  method                from     \n  knit_print.data.frame rmarkdown\n\n\n        Information on package 'graphics'\n\nDescription:\n\nPackage:            graphics\nVersion:            4.3.1\nPriority:           base\nTitle:              The R Graphics Package\nAuthor:             R Core Team and contributors worldwide\nMaintainer:         R Core Team &lt;do-use-Contact-address@r-project.org&gt;\nContact:            R-help mailing list &lt;r-help@r-project.org&gt;\nDescription:        R functions for base graphics.\nImports:            grDevices\nLicense:            Part of R 4.3.1\nNeedsCompilation:   yes\nBuilt:              R 4.3.1; aarch64-apple-darwin20; 2023-06-16\n                    21:53:01 UTC; unix\n\nIndex:\n\nAxis                    Generic Function to Add an Axis to a Plot\nabline                  Add Straight Lines to a Plot\narrows                  Add Arrows to a Plot\nassocplot               Association Plots\naxTicks                 Compute Axis Tickmark Locations\naxis                    Add an Axis to a Plot\naxis.POSIXct            Date and Date-time Plotting Functions\nbarplot                 Bar Plots\nbox                     Draw a Box around a Plot\nboxplot                 Box Plots\nboxplot.matrix          Draw a Boxplot for each Column (Row) of a\n                        Matrix\nbxp                     Draw Box Plots from Summaries\ncdplot                  Conditional Density Plots\nclip                    Set Clipping Region\ncontour                 Display Contours\ncoplot                  Conditioning Plots\ncurve                   Draw Function Plots\ndotchart                Cleveland's Dot Plots\nfilled.contour          Level (Contour) Plots\nfourfoldplot            Fourfold Plots\nframe                   Create / Start a New Plot 
Frame\ngraphics-package        The R Graphics Package\ngrconvertX              Convert between Graphics Coordinate Systems\ngrid                    Add Grid to a Plot\nhist                    Histograms\nhist.POSIXt             Histogram of a Date or Date-Time Object\nidentify                Identify Points in a Scatter Plot\nimage                   Display a Color Image\nlayout                  Specifying Complex Plot Arrangements\nlegend                  Add Legends to Plots\nlines                   Add Connected Line Segments to a Plot\nlocator                 Graphical Input\nmatplot                 Plot Columns of Matrices\nmosaicplot              Mosaic Plots\nmtext                   Write Text into the Margins of a Plot\npairs                   Scatterplot Matrices\npanel.smooth            Simple Panel Plot\npar                     Set or Query Graphical Parameters\npersp                   Perspective Plots\npie                     Pie Charts\nplot.data.frame         Plot Method for Data Frames\nplot.default            The Default Scatterplot Function\nplot.design             Plot Univariate Effects of a Design or Model\nplot.factor             Plotting Factor Variables\nplot.formula            Formula Notation for Scatterplots\nplot.histogram          Plot Histograms\nplot.raster             Plotting Raster Images\nplot.table              Plot Methods for 'table' Objects\nplot.window             Set up World Coordinates for Graphics Window\nplot.xy                 Basic Internal Plot Function\npoints                  Add Points to a Plot\npolygon                 Polygon Drawing\npolypath                Path Drawing\nrasterImage             Draw One or More Raster Images\nrect                    Draw One or More Rectangles\nrug                     Add a Rug to a Plot\nscreen                  Creating and Controlling Multiple Screens on a\n                        Single Device\nsegments                Add Line Segments to a Plot\nsmoothScatter           
Scatterplots with Smoothed Densities Color\n                        Representation\nspineplot               Spine Plots and Spinograms\nstars                   Star (Spider/Radar) Plots and Segment Diagrams\nstem                    Stem-and-Leaf Plots\nstripchart              1-D Scatter Plots\nstrwidth                Plotting Dimensions of Character Strings and\n                        Math Expressions\nsunflowerplot           Produce a Sunflower Scatter Plot\nsymbols                 Draw Symbols (Circles, Squares, Stars,\n                        Thermometers, Boxplots)\ntext                    Add Text to a Plot\ntitle                   Plot Annotation\nxinch                   Graphical Units\nxspline                 Draw an X-spline"
   },
   {
     "objectID": "modules/Module10-DataVisualization.html#focus-on-a-handful-here-today",
@@ -788,7 +688,7 @@
     "href": "modules/Module10-DataVisualization.html#histogram-example",
     "title": "Module 10: Data Visualization",
     "section": "histogram() example",
-    "text": "histogram() example\nReminder\nhist(x, breaks = \"Sturges\",\n     freq = NULL, probability = !freq,\n     include.lowest = TRUE, right = TRUE, fuzz = 1e-7,\n     density = NULL, angle = 45, col = \"lightgray\", border = NULL,\n     main = paste(\"Histogram of\" , xname),\n     xlim = range(breaks), ylim = NULL,\n     xlab = xname, ylab,\n     axes = TRUE, plot = TRUE, labels = FALSE,\n     nclass = NULL, warn.unused = TRUE, ...)\nLet’s practice\n\nhist(df$age)\n\n\n\n\n\n\n\nhist(\n    df$age, \n    freq=FALSE, \n    main=\"Histogram\", \n    xlab=\"Age (years)\"\n    )"
+    "text": "histogram() example\nReminder function signature\nhist(x, breaks = \"Sturges\",\n     freq = NULL, probability = !freq,\n     include.lowest = TRUE, right = TRUE, fuzz = 1e-7,\n     density = NULL, angle = 45, col = \"lightgray\", border = NULL,\n     main = paste(\"Histogram of\" , xname),\n     xlim = range(breaks), ylim = NULL,\n     xlab = xname, ylab,\n     axes = TRUE, plot = TRUE, labels = FALSE,\n     nclass = NULL, warn.unused = TRUE, ...)\nLet’s practice\n\nhist(df$age)\n\n\n\nhist(\n    df$age, \n    freq=FALSE, \n    main=\"Histogram\", \n    xlab=\"Age (years)\"\n    )"
   },
   {
     "objectID": "modules/Module10-DataVisualization.html#plot-help-file",
@@ -809,7 +709,7 @@
     "href": "modules/Module10-DataVisualization.html#plot-example",
     "title": "Module 10: Data Visualization",
     "section": "plot() example",
-    "text": "plot() example\n\nplot(df$age, df$IgG_concentration)\n\n\n\n\n\n\n\nplot(\n    df$age, \n    df$IgG_concentration, \n    type=\"p\", \n    main=\"Age by IgG Concentrations\", \n    xlab=\"Age (years)\", \n    ylab=\"IgG Concentration (mIU/mL)\", \n    pch=16, \n    cex=0.9,\n    col=\"lightblue\")"
+    "text": "plot() example\n\nplot(df$age, df$IgG_concentration)\n\n\n\nplot(\n    df$age, \n    df$IgG_concentration, \n    type=\"p\", \n    main=\"Age by IgG Concentrations\", \n    xlab=\"Age (years)\", \n    ylab=\"IgG Concentration (IU/mL)\", \n    pch=16, \n    cex=0.9,\n    col=\"lightblue\")"
   },
   {
     "objectID": "modules/Module10-DataVisualization.html#boxplot-help-file",
@@ -823,21 +723,21 @@
     "href": "modules/Module10-DataVisualization.html#boxplot-example",
     "title": "Module 10: Data Visualization",
     "section": "boxplot() example",
-    "text": "boxplot() example\nReminder\nboxplot(formula, data = NULL, ..., subset, na.action = NULL,\n        xlab = mklab(y_var = horizontal),\n        ylab = mklab(y_var =!horizontal),\n        add = FALSE, ann = !add, horizontal = FALSE,\n        drop = FALSE, sep = \".\", lex.order = FALSE)\nLet’s practice\n\nboxplot(IgG_concentration~age_group, data=df)\n\n\n\n\n\n\n\nboxplot(\n    log(df$IgG_concentration)~df$age_group, \n    main=\"Age by IgG Concentrations\", \n    xlab=\"Age Group (years)\", \n    ylab=\"log IgG Concentration (mIU/mL)\", \n    names=c(\"1-5\",\"6-10\", \"11-15\"), \n    varwidth=T\n    )"
+    "text": "boxplot() example\nReminder function signature\nboxplot(formula, data = NULL, ..., subset, na.action = NULL,\n        xlab = mklab(y_var = horizontal),\n        ylab = mklab(y_var =!horizontal),\n        add = FALSE, ann = !add, horizontal = FALSE,\n        drop = FALSE, sep = \".\", lex.order = FALSE)\nLet’s practice\n\nboxplot(IgG_concentration~age_group, data=df)\n\n\n\nboxplot(\n    log(df$IgG_concentration)~df$age_group, \n    main=\"Age by IgG Concentrations\", \n    xlab=\"Age Group (years)\", \n    ylab=\"log IgG Concentration (mIU/mL)\", \n    names=c(\"1-5\",\"6-10\", \"11-15\"), \n    varwidth=T\n    )"
   },
   {
     "objectID": "modules/Module10-DataVisualization.html#barplot-help-file",
     "href": "modules/Module10-DataVisualization.html#barplot-help-file",
     "title": "Module 10: Data Visualization",
     "section": "barplot() Help File",
-    "text": "barplot() Help File\n\n?barplot\n\nBox Plots\nDescription:\n Produce box-and-whisker plot(s) of the given (grouped) values.\nUsage:\n boxplot(x, ...)\n \n ## S3 method for class 'formula'\n boxplot(formula, data = NULL, ..., subset, na.action = NULL,\n         xlab = mklab(y_var = horizontal),\n         ylab = mklab(y_var =!horizontal),\n         add = FALSE, ann = !add, horizontal = FALSE,\n         drop = FALSE, sep = \".\", lex.order = FALSE)\n \n ## Default S3 method:\n boxplot(x, ..., range = 1.5, width = NULL, varwidth = FALSE,\n         notch = FALSE, outline = TRUE, names, plot = TRUE,\n         border = par(\"fg\"), col = \"lightgray\", log = \"\",\n         pars = list(boxwex = 0.8, staplewex = 0.5, outwex = 0.5),\n          ann = !add, horizontal = FALSE, add = FALSE, at = NULL)\n \nArguments:\nformula: a formula, such as ‘y ~ grp’, where ‘y’ is a numeric vector of data values to be split into groups according to the grouping variable ‘grp’ (usually a factor). Note that ‘~ g1 + g2’ is equivalent to ‘g1:g2’.\ndata: a data.frame (or list) from which the variables in 'formula'\n      should be taken.\nsubset: an optional vector specifying a subset of observations to be used for plotting.\nna.action: a function which indicates what should happen when the data contain ’NA’s. The default is to ignore missing values in either the response or the group.\nxlab, ylab: x- and y-axis annotation, since R 3.6.0 with a non-empty default. Can be suppressed by ‘ann=FALSE’.\n ann: 'logical' indicating if axes should be annotated (by 'xlab'\n      and 'ylab').\ndrop, sep, lex.order: passed to ‘split.default’, see there.\n   x: for specifying data from which the boxplots are to be\n      produced. Either a numeric vector, or a single list\n      containing such vectors. Additional unnamed arguments specify\n      further data as separate vectors (each corresponding to a\n      component boxplot).  
'NA's are allowed in the data.\n\n ...: For the 'formula' method, named arguments to be passed to the\n      default method.\n\n      For the default method, unnamed arguments are additional data\n      vectors (unless 'x' is a list when they are ignored), and\n      named arguments are arguments and graphical parameters to be\n      passed to 'bxp' in addition to the ones given by argument\n      'pars' (and override those in 'pars'). Note that 'bxp' may or\n      may not make use of graphical parameters it is passed: see\n      its documentation.\nrange: this determines how far the plot whiskers extend out from the box. If ‘range’ is positive, the whiskers extend to the most extreme data point which is no more than ‘range’ times the interquartile range from the box. A value of zero causes the whiskers to extend to the data extremes.\nwidth: a vector giving the relative widths of the boxes making up the plot.\nvarwidth: if ‘varwidth’ is ‘TRUE’, the boxes are drawn with widths proportional to the square-roots of the number of observations in the groups.\nnotch: if ‘notch’ is ‘TRUE’, a notch is drawn in each side of the boxes. If the notches of two plots do not overlap this is ‘strong evidence’ that the two medians differ (Chambers et al, 1983, p. 62). See ‘boxplot.stats’ for the calculations used.\noutline: if ‘outline’ is not true, the outliers are not drawn (as points whereas S+ uses lines).\nnames: group labels which will be printed under each boxplot. Can be a character vector or an expression (see plotmath).\nboxwex: a scale factor to be applied to all boxes. When there are only a few groups, the appearance of the plot can be improved by making the boxes narrower.\nstaplewex: staple line width expansion, proportional to box width.\noutwex: outlier line width expansion, proportional to box width.\nplot: if 'TRUE' (the default) then a boxplot is produced.  
If not,\n      the summaries which the boxplots are based on are returned.\nborder: an optional vector of colors for the outlines of the boxplots. The values in ‘border’ are recycled if the length of ‘border’ is less than the number of plots.\n col: if 'col' is non-null it is assumed to contain colors to be\n      used to colour the bodies of the box plots. By default they\n      are in the background colour.\n\n log: character indicating if x or y or both coordinates should be\n      plotted in log scale.\n\npars: a list of (potentially many) more graphical parameters, e.g.,\n      'boxwex' or 'outpch'; these are passed to 'bxp' (if 'plot' is\n      true); for details, see there.\nhorizontal: logical indicating if the boxplots should be horizontal; default ‘FALSE’ means vertical boxes.\n add: logical, if true _add_ boxplot to current plot.\n\n  at: numeric vector giving the locations where the boxplots should\n      be drawn, particularly when 'add = TRUE'; defaults to '1:n'\n      where 'n' is the number of boxes.\nDetails:\n The generic function 'boxplot' currently has a default method\n ('boxplot.default') and a formula interface ('boxplot.formula').\n\n If multiple groups are supplied either as multiple arguments or\n via a formula, parallel boxplots will be plotted, in the order of\n the arguments or the order of the levels of the factor (see\n 'factor').\n\n Missing values are ignored when forming boxplots.\nValue:\n List with the following components:\nstats: a matrix, each column contains the extreme of the lower whisker, the lower hinge, the median, the upper hinge and the extreme of the upper whisker for one group/plot. 
If all the inputs have the same class attribute, so will this component.\n   n: a vector with the number of (non-'NA') observations in each\n      group.\n\nconf: a matrix where each column contains the lower and upper\n      extremes of the notch.\n\n out: the values of any data points which lie beyond the extremes\n      of the whiskers.\ngroup: a vector of the same length as ‘out’ whose elements indicate to which group the outlier belongs.\nnames: a vector of names for the groups.\nReferences:\n Becker, R. A., Chambers, J. M. and Wilks, A. R. (1988).  _The New\n S Language_.  Wadsworth & Brooks/Cole.\n\n Chambers, J. M., Cleveland, W. S., Kleiner, B. and Tukey, P. A.\n (1983).  _Graphical Methods for Data Analysis_.  Wadsworth &\n Brooks/Cole.\n\n Murrell, P. (2005).  _R Graphics_.  Chapman & Hall/CRC Press.\n\n See also 'boxplot.stats'.\nSee Also:\n 'boxplot.stats' which does the computation, 'bxp' for the plotting\n and more examples; and 'stripchart' for an alternative (with small\n data sets).\nExamples:\n ## boxplot on a formula:\n boxplot(count ~ spray, data = InsectSprays, col = \"lightgray\")\n # *add* notches (somewhat funny here &lt;--&gt; warning \"notches .. 
outside hinges\"):\n boxplot(count ~ spray, data = InsectSprays,\n         notch = TRUE, add = TRUE, col = \"blue\")\n \n boxplot(decrease ~ treatment, data = OrchardSprays, col = \"bisque\",\n         log = \"y\")\n ## horizontal=TRUE, switching  y &lt;--&gt; x :\n boxplot(decrease ~ treatment, data = OrchardSprays, col = \"bisque\",\n         log = \"x\", horizontal=TRUE)\n \n rb &lt;- boxplot(decrease ~ treatment, data = OrchardSprays, col = \"bisque\")\n title(\"Comparing boxplot()s and non-robust mean +/- SD\")\n mn.t &lt;- tapply(OrchardSprays$decrease, OrchardSprays$treatment, mean)\n sd.t &lt;- tapply(OrchardSprays$decrease, OrchardSprays$treatment, sd)\n xi &lt;- 0.3 + seq(rb$n)\n points(xi, mn.t, col = \"orange\", pch = 18)\n arrows(xi, mn.t - sd.t, xi, mn.t + sd.t,\n        code = 3, col = \"pink\", angle = 75, length = .1)\n \n ## boxplot on a matrix:\n mat &lt;- cbind(Uni05 = (1:100)/21, Norm = rnorm(100),\n              `5T` = rt(100, df = 5), Gam2 = rgamma(100, shape = 2))\n boxplot(mat) # directly, calling boxplot.matrix()\n \n ## boxplot on a data frame:\n df. 
&lt;- as.data.frame(mat)\n par(las = 1) # all axis labels horizontal\n boxplot(df., main = \"boxplot(*, horizontal = TRUE)\", horizontal = TRUE)\n \n ## Using 'at = ' and adding boxplots -- example idea by Roger Bivand :\n boxplot(len ~ dose, data = ToothGrowth,\n         boxwex = 0.25, at = 1:3 - 0.2,\n         subset = supp == \"VC\", col = \"yellow\",\n         main = \"Guinea Pigs' Tooth Growth\",\n         xlab = \"Vitamin C dose mg\",\n         ylab = \"tooth length\",\n         xlim = c(0.5, 3.5), ylim = c(0, 35), yaxs = \"i\")\n boxplot(len ~ dose, data = ToothGrowth, add = TRUE,\n         boxwex = 0.25, at = 1:3 + 0.2,\n         subset = supp == \"OJ\", col = \"orange\")\n legend(2, 9, c(\"Ascorbic acid\", \"Orange juice\"),\n        fill = c(\"yellow\", \"orange\"))\n \n ## With less effort (slightly different) using factor *interaction*:\n boxplot(len ~ dose:supp, data = ToothGrowth,\n         boxwex = 0.5, col = c(\"orange\", \"yellow\"),\n         main = \"Guinea Pigs' Tooth Growth\",\n         xlab = \"Vitamin C dose mg\", ylab = \"tooth length\",\n         sep = \":\", lex.order = TRUE, ylim = c(0, 35), yaxs = \"i\")\n \n ## more examples in  help(bxp)"
+    "text": "barplot() Help File\n\n?barplot\n\nBar Plots\nDescription:\n Creates a bar plot with vertical or horizontal bars.\nUsage:\n barplot(height, ...)\n \n ## Default S3 method:\n barplot(height, width = 1, space = NULL,\n         names.arg = NULL, legend.text = NULL, beside = FALSE,\n         horiz = FALSE, density = NULL, angle = 45,\n         col = NULL, border = par(\"fg\"),\n         main = NULL, sub = NULL, xlab = NULL, ylab = NULL,\n         xlim = NULL, ylim = NULL, xpd = TRUE, log = \"\",\n         axes = TRUE, axisnames = TRUE,\n         cex.axis = par(\"cex.axis\"), cex.names = par(\"cex.axis\"),\n         inside = TRUE, plot = TRUE, axis.lty = 0, offset = 0,\n         add = FALSE, ann = !add && par(\"ann\"), args.legend = NULL, ...)\n \n ## S3 method for class 'formula'\n barplot(formula, data, subset, na.action,\n         horiz = FALSE, xlab = NULL, ylab = NULL, ...)\n \nArguments:\nheight: either a vector or matrix of values describing the bars which make up the plot. If ‘height’ is a vector, the plot consists of a sequence of rectangular bars with heights given by the values in the vector. If ‘height’ is a matrix and ‘beside’ is ‘FALSE’ then each bar of the plot corresponds to a column of ‘height’, with the values in the column giving the heights of stacked sub-bars making up the bar. If ‘height’ is a matrix and ‘beside’ is ‘TRUE’, then the values in each column are juxtaposed rather than stacked.\nwidth: optional vector of bar widths. Re-cycled to length the number of bars drawn. Specifying a single value will have no visible effect unless ‘xlim’ is specified.\nspace: the amount of space (as a fraction of the average bar width) left before each bar. May be given as a single number or one number per bar. If ‘height’ is a matrix and ‘beside’ is ‘TRUE’, ‘space’ may be specified by two numbers, where the first is the space between bars in the same group, and the second the space between the groups. 
If not given explicitly, it defaults to ‘c(0,1)’ if ‘height’ is a matrix and ‘beside’ is ‘TRUE’, and to 0.2 otherwise.\nnames.arg: a vector of names to be plotted below each bar or group of bars. If this argument is omitted, then the names are taken from the ‘names’ attribute of ‘height’ if this is a vector, or the column names if it is a matrix.\nlegend.text: a vector of text used to construct a legend for the plot, or a logical indicating whether a legend should be included. This is only useful when ‘height’ is a matrix. In that case given legend labels should correspond to the rows of ‘height’; if ‘legend.text’ is true, the row names of ‘height’ will be used as labels if they are non-null.\nbeside: a logical value. If ‘FALSE’, the columns of ‘height’ are portrayed as stacked bars, and if ‘TRUE’ the columns are portrayed as juxtaposed bars.\nhoriz: a logical value. If ‘FALSE’, the bars are drawn vertically with the first bar to the left. If ‘TRUE’, the bars are drawn horizontally with the first at the bottom.\ndensity: a vector giving the density of shading lines, in lines per inch, for the bars or bar components. The default value of ‘NULL’ means that no shading lines are drawn. Non-positive values of ‘density’ also inhibit the drawing of shading lines.\nangle: the slope of shading lines, given as an angle in degrees (counter-clockwise), for the bars or bar components.\n col: a vector of colors for the bars or bar components.  By\n      default, '\"grey\"' is used if 'height' is a vector, and a\n      gamma-corrected grey palette if 'height' is a matrix; see\n      'grey.colors'.\nborder: the color to be used for the border of the bars. Use ‘border = NA’ to omit borders. 
If there are shading lines, ‘border = TRUE’ means use the same colour for the border as for the shading lines.\nmain,sub: main title and subtitle for the plot.\nxlab: a label for the x axis.\n\nylab: a label for the y axis.\n\nxlim: limits for the x axis.\n\nylim: limits for the y axis.\n\n xpd: logical. Should bars be allowed to go outside region?\n\n log: string specifying if axis scales should be logarithmic; see\n      'plot.default'.\n\naxes: logical.  If 'TRUE', a vertical (or horizontal, if 'horiz' is\n      true) axis is drawn.\naxisnames: logical. If ‘TRUE’, and if there are ‘names.arg’ (see above), the other axis is drawn (with ‘lty = 0’) and labeled.\ncex.axis: expansion factor for numeric axis labels (see ‘par(’cex’)’).\ncex.names: expansion factor for axis names (bar labels).\ninside: logical. If ‘TRUE’, the lines which divide adjacent (non-stacked!) bars will be drawn. Only applies when ‘space = 0’ (which it partly is when ‘beside = TRUE’).\nplot: logical.  If 'FALSE', nothing is plotted.\naxis.lty: the graphics parameter ‘lty’ (see ‘par(’lty’)’) applied to the axis and tick marks of the categorical (default horizontal) axis. Note that by default the axis is suppressed.\noffset: a vector indicating how much the bars should be shifted relative to the x axis.\n add: logical specifying if bars should be added to an already\n      existing plot; defaults to 'FALSE'.\n\n ann: logical specifying if the default annotation ('main', 'sub',\n      'xlab', 'ylab') should appear on the plot, see 'title'.\nargs.legend: list of additional arguments to pass to ‘legend()’; names of the list are used as argument names. Only used if ‘legend.text’ is supplied.\nformula: a formula where the ‘y’ variables are numeric data to plot against the categorical ‘x’ variables. 
The formula can have one of three forms:\n            y ~ x\n            y ~ x1 + x2\n            cbind(y1, y2) ~ x\n      \n      (see the examples).\n\ndata: a data frame (or list) from which the variables in formula\n      should be taken.\nsubset: an optional vector specifying a subset of observations to be used.\nna.action: a function which indicates what should happen when the data contain ‘NA’ values. The default is to ignore missing values in the given variables.\n ...: arguments to be passed to/from other methods.  For the\n      default method these can include further arguments (such as\n      'axes', 'asp' and 'main') and graphical parameters (see\n      'par') which are passed to 'plot.window()', 'title()' and\n      'axis'.\nValue:\n A numeric vector (or matrix, when 'beside = TRUE'), say 'mp',\n giving the coordinates of _all_ the bar midpoints drawn, useful\n for adding to the graph.\n\n If 'beside' is true, use 'colMeans(mp)' for the midpoints of each\n _group_ of bars, see example.\nAuthor(s):\n R Core, with a contribution by Arni Magnusson.\nReferences:\n Becker, R. A., Chambers, J. M. and Wilks, A. R. (1988) _The New S\n Language_.  Wadsworth & Brooks/Cole.\n\n Murrell, P. (2005) _R Graphics_. Chapman & Hall/CRC Press.\nSee Also:\n 'plot(..., type = \"h\")', 'dotchart'; 'hist' for bars of a\n _continuous_ variable.  
'mosaicplot()', more sophisticated to\n visualize _several_ categorical variables.\nExamples:\n # Formula method\n barplot(GNP ~ Year, data = longley)\n barplot(cbind(Employed, Unemployed) ~ Year, data = longley)\n \n ## 3rd form of formula - 2 categories :\n op &lt;- par(mfrow = 2:1, mgp = c(3,1,0)/2, mar = .1+c(3,3:1))\n summary(d.Titanic &lt;- as.data.frame(Titanic))\n barplot(Freq ~ Class + Survived, data = d.Titanic,\n         subset = Age == \"Adult\" & Sex == \"Male\",\n         main = \"barplot(Freq ~ Class + Survived, *)\", ylab = \"# {passengers}\", legend.text = TRUE)\n # Corresponding table :\n (xt &lt;- xtabs(Freq ~ Survived + Class + Sex, d.Titanic, subset = Age==\"Adult\"))\n # Alternatively, a mosaic plot :\n mosaicplot(xt[,,\"Male\"], main = \"mosaicplot(Freq ~ Class + Survived, *)\", color=TRUE)\n par(op)\n \n \n # Default method\n require(grDevices) # for colours\n tN &lt;- table(Ni &lt;- stats::rpois(100, lambda = 5))\n r &lt;- barplot(tN, col = rainbow(20))\n #- type = \"h\" plotting *is* 'bar'plot\n lines(r, tN, type = \"h\", col = \"red\", lwd = 2)\n \n barplot(tN, space = 1.5, axisnames = FALSE,\n         sub = \"barplot(..., space= 1.5, axisnames = FALSE)\")\n \n barplot(VADeaths, plot = FALSE)\n barplot(VADeaths, plot = FALSE, beside = TRUE)\n \n mp &lt;- barplot(VADeaths) # default\n tot &lt;- colMeans(VADeaths)\n text(mp, tot + 3, format(tot), xpd = TRUE, col = \"blue\")\n barplot(VADeaths, beside = TRUE,\n         col = c(\"lightblue\", \"mistyrose\", \"lightcyan\",\n                 \"lavender\", \"cornsilk\"),\n         legend.text = rownames(VADeaths), ylim = c(0, 100))\n title(main = \"Death Rates in Virginia\", font.main = 4)\n \n hh &lt;- t(VADeaths)[, 5:1]\n mybarcol &lt;- \"gray20\"\n mp &lt;- barplot(hh, beside = TRUE,\n         col = c(\"lightblue\", \"mistyrose\",\n                 \"lightcyan\", \"lavender\"),\n         legend.text = colnames(VADeaths), ylim = c(0,100),\n         main = \"Death Rates in Virginia\", font.main 
= 4,\n         sub = \"Faked upper 2*sigma error bars\", col.sub = mybarcol,\n         cex.names = 1.5)\n segments(mp, hh, mp, hh + 2*sqrt(1000*hh/100), col = mybarcol, lwd = 1.5)\n stopifnot(dim(mp) == dim(hh))  # corresponding matrices\n mtext(side = 1, at = colMeans(mp), line = -2,\n       text = paste(\"Mean\", formatC(colMeans(hh))), col = \"red\")\n \n # Bar shading example\n barplot(VADeaths, angle = 15+10*1:5, density = 20, col = \"black\",\n         legend.text = rownames(VADeaths))\n title(main = list(\"Death Rates in Virginia\", font = 4))\n \n # Border color\n barplot(VADeaths, border = \"dark blue\") \n \n \n # Log scales (not much sense here)\n barplot(tN, col = heat.colors(12), log = \"y\")\n barplot(tN, col = gray.colors(20), log = \"xy\")\n \n # Legend location\n barplot(height = cbind(x = c(465, 91) / 465 * 100,\n                        y = c(840, 200) / 840 * 100,\n                        z = c(37, 17) / 37 * 100),\n         beside = FALSE,\n         width = c(465, 840, 37),\n         col = c(1, 2),\n         legend.text = c(\"A\", \"B\"),\n         args.legend = list(x = \"topleft\"))"
   },
   {
     "objectID": "modules/Module10-DataVisualization.html#barplot-example",
     "href": "modules/Module10-DataVisualization.html#barplot-example",
     "title": "Module 10: Data Visualization",
     "section": "barplot() example",
-    "text": "barplot() example\nThe function takes the a lot of arguments to control the way the way our data is plotted.\nReminder\nbarplot(height, width = 1, space = NULL,\n        names.arg = NULL, legend.text = NULL, beside = FALSE,\n        horiz = FALSE, density = NULL, angle = 45,\n        col = NULL, border = par(\"fg\"),\n        main = NULL, sub = NULL, xlab = NULL, ylab = NULL,\n        xlim = NULL, ylim = NULL, xpd = TRUE, log = \"\",\n        axes = TRUE, axisnames = TRUE,\n        cex.axis = par(\"cex.axis\"), cex.names = par(\"cex.axis\"),\n        inside = TRUE, plot = TRUE, axis.lty = 0, offset = 0,\n        add = FALSE, ann = !add && par(\"ann\"), args.legend = NULL, ...)\n\nfreq &lt;- table(df$seropos, df$age_group)\nbarplot(freq)\n\n\n\n\n\n\n\nprop &lt;- prop.table(freq)\nbarplot(prop)"
+    "text": "barplot() example\nThe function takes a lot of arguments to control the way our data is plotted.\nReminder function signature\nbarplot(height, width = 1, space = NULL,\n        names.arg = NULL, legend.text = NULL, beside = FALSE,\n        horiz = FALSE, density = NULL, angle = 45,\n        col = NULL, border = par(\"fg\"),\n        main = NULL, sub = NULL, xlab = NULL, ylab = NULL,\n        xlim = NULL, ylim = NULL, xpd = TRUE, log = \"\",\n        axes = TRUE, axisnames = TRUE,\n        cex.axis = par(\"cex.axis\"), cex.names = par(\"cex.axis\"),\n        inside = TRUE, plot = TRUE, axis.lty = 0, offset = 0,\n        add = FALSE, ann = !add && par(\"ann\"), args.legend = NULL, ...)\n\nfreq &lt;- table(df$seropos, df$age_group)\nbarplot(freq)\n\n\n\nprop.cell.percentages &lt;- prop.table(freq)\nbarplot(prop.cell.percentages)"
   },
   {
     "objectID": "modules/Module10-DataVisualization.html#legend",
@@ -851,7 +751,7 @@
     "href": "modules/Module10-DataVisualization.html#barplot-example-1",
     "title": "Module 10: Data Visualization",
     "section": "barplot() example",
-    "text": "barplot() example\nGetting closer, but what I really want is column proportions (i.e., the proportions should sum to one for each age group). Also, the age groups need more meaningful names.\n\nfreq &lt;- table(df$seropos, df$age_group)\ntot.per.age.group &lt;- colSums(freq)\nage.seropos.matrix &lt;- t(t(freq)/tot.per.age.group)\ncolnames(age.seropos.matrix) &lt;- c(\"1-5 yo\", \"6-10 yo\", \"11-15 yo\")\n\nbarplot(age.seropos.matrix, col=c(\"darkblue\",\"red\"), ylim=c(0,1.35), main=\"Seropositivity by Age Group\")\naxis(2, at = c(0.2, 0.4, 0.6, 0.8,1))\nlegend(x=2.8, y=1.35,\n             fill=c(\"darkblue\",\"red\"), \n             legend = c(\"seronegative\", \"seropositive\"))"
+    "text": "barplot() example\nGetting closer, but what I really want is column proportions (i.e., the proportions should sum to one for each age group). Also, the age groups need more meaningful names.\n\nfreq &lt;- table(df$seropos, df$age_group)\nprop.column.percentages &lt;- prop.table(freq, margin=2)\ncolnames(prop.column.percentages) &lt;- c(\"1-5 yo\", \"6-10 yo\", \"11-15 yo\")\n\nbarplot(prop.column.percentages, col=c(\"darkblue\",\"red\"), ylim=c(0,1.35), main=\"Seropositivity by Age Group\")\naxis(2, at = c(0.2, 0.4, 0.6, 0.8,1))\nlegend(x=2.8, y=1.35,\n             fill=c(\"darkblue\",\"red\"), \n             legend = c(\"seronegative\", \"seropositive\"))"
   },
   {
     "objectID": "modules/Module10-DataVisualization.html#summary",
@@ -865,7 +765,7 @@
     "href": "modules/Module10-DataVisualization.html#acknowledgements",
     "title": "Module 10: Data Visualization",
     "section": "Acknowledgements",
-    "text": "Acknowledgements\nThese are the materials I looked through, modified, or extracted to complete this module’s lecture.\n\n“Base Plotting in R” by Medium\n  [\"Base R margins: a cheatsheet\"](https://r-graph-gallery.com/74-margin-and-oma-cheatsheet.html)"
+    "text": "Acknowledgements\nThese are the materials we looked through, modified, or extracted to complete this module’s lecture.\n\n“Base Plotting in R” by Medium\n  [\"Base R margins: a cheatsheet\"](https://r-graph-gallery.com/74-margin-and-oma-cheatsheet.html)"
   },
   {
     "objectID": "modules/Module10-DataVisualization.html#base-r-plotting",
@@ -879,7 +779,7 @@
     "href": "modules/Module10-DataVisualization.html#parameters",
     "title": "Module 10: Data Visualization",
     "section": "1. Parameters",
-    "text": "1. Parameters\nThe parameter section fixes the settings for all your plots, basically the plot options. Adding attributes via par() before you call the plot creates ‘global’ settings for your plot.\nIn the example below, we have set two commonly used optional attributes in the global plot settings. - The mfrow specifies that we have one row and two columns of plots — that is, two plots side by side. - The mar attribute is a vector of our margin widths, with the first value indicating the margin below the plot (5), the second indicating the margin to the left of the plot (5), the third, the top of the plot(4), and the fourth to the left (1).\npar(mfrow = c(1,2), mar = c(5,5,4,1))"
+    "text": "1. Parameters\nThe parameter section fixes the settings for all your plots, basically the plot options. Adding attributes via par() before you call the plot creates ‘global’ settings for your plot.\nIn the example below, we have set two commonly used optional attributes in the global plot settings.\n\nThe mfrow attribute specifies that we have one row and two columns of plots (that is, two plots side by side).\nThe mar attribute is a vector of our margin widths, in the order bottom, left, top, right: the first value is the margin below the plot (5), the second is the margin to the left of the plot (5), the third is the margin above the plot (4), and the fourth is the margin to the right of the plot (1).\n\npar(mfrow = c(1,2), mar = c(5,5,4,1))"
   },
   {
     "objectID": "modules/Module10-DataVisualization.html#plot-attributes",
@@ -893,14 +793,14 @@
     "href": "modules/Module10-DataVisualization.html#barplot-example-2",
     "title": "Module 10: Data Visualization",
     "section": "barplot() example",
-    "text": "barplot() example\nNow, let look at seropositivity by two individual level characteristics in the same plot.\n\npar(mfrow = c(1,2))\nbarplot(age.seropos.matrix, col=c(\"darkblue\",\"red\"), ylim=c(0,1.35), main=\"Seropositivity by Age Group\")\naxis(2, at = c(0.2, 0.4, 0.6, 0.8,1))\nlegend(x=1, y=1.35, fill=c(\"darkblue\",\"red\"), legend = c(\"seronegative\", \"seropositive\"))\n\nbarplot(slum.seropos.matrix, col=c(\"darkblue\",\"red\"), ylim=c(0,1.35), main=\"Seropositivity by Residence\")\naxis(2, at = c(0.2, 0.4, 0.6, 0.8,1))\nlegend(x=1, y=1.35, fill=c(\"darkblue\",\"red\"),  legend = c(\"seronegative\", \"seropositive\"))"
+    "text": "barplot() example"
   },
   {
     "objectID": "modules/Module10-DataVisualization.html#add-legend-to-the-plot",
     "href": "modules/Module10-DataVisualization.html#add-legend-to-the-plot",
     "title": "Module 10: Data Visualization",
     "section": "Add legend to the plot",
-    "text": "Add legend to the plot\nReminder\nlegend(x, y = NULL, legend, fill = NULL, col = par(\"col\"),\n       border = \"black\", lty, lwd, pch,\n       angle = 45, density = NULL, bty = \"o\", bg = par(\"bg\"),\n       box.lwd = par(\"lwd\"), box.lty = par(\"lty\"), box.col = par(\"fg\"),\n       pt.bg = NA, cex = 1, pt.cex = cex, pt.lwd = lwd,\n       xjust = 0, yjust = 1, x.intersp = 1, y.intersp = 1,\n       adj = c(0, 0.5), text.width = NULL, text.col = par(\"col\"),\n       text.font = NULL, merge = do.lines && has.pch, trace = FALSE,\n       plot = TRUE, ncol = 1, horiz = FALSE, title = NULL,\n       inset = 0, xpd, title.col = text.col[1], title.adj = 0.5,\n       title.cex = cex[1], title.font = text.font[1],\n       seg.len = 2)\nLet’s practice\n\nbarplot(prop, col=c(\"darkblue\",\"red\"), ylim=c(0,0.7), main=\"Seropositivity by Age Group\")\nlegend(x=2.5, y=0.7,\n             fill=c(\"darkblue\",\"red\"), \n             legend = c(\"seronegative\", \"seropositive\"))"
+    "text": "Add legend to the plot\nReminder function signature\nlegend(x, y = NULL, legend, fill = NULL, col = par(\"col\"),\n       border = \"black\", lty, lwd, pch,\n       angle = 45, density = NULL, bty = \"o\", bg = par(\"bg\"),\n       box.lwd = par(\"lwd\"), box.lty = par(\"lty\"), box.col = par(\"fg\"),\n       pt.bg = NA, cex = 1, pt.cex = cex, pt.lwd = lwd,\n       xjust = 0, yjust = 1, x.intersp = 1, y.intersp = 1,\n       adj = c(0, 0.5), text.width = NULL, text.col = par(\"col\"),\n       text.font = NULL, merge = do.lines && has.pch, trace = FALSE,\n       plot = TRUE, ncol = 1, horiz = FALSE, title = NULL,\n       inset = 0, xpd, title.col = text.col[1], title.adj = 0.5,\n       title.cex = cex[1], title.font = text.font[1],\n       seg.len = 2)\nLet’s practice\n\nbarplot(prop.cell.percentages, col=c(\"darkblue\",\"red\"), ylim=c(0,0.5), main=\"Seropositivity by Age Group\")\nlegend(x=2.5, y=0.5,\n             fill=c(\"darkblue\",\"red\"), \n             legend = c(\"seronegative\", \"seropositive\"))"
   },
   {
     "objectID": "modules/Module10-DataVisualization.html#lots-of-parameters-options",
@@ -914,7 +814,7 @@
     "href": "modules/Module10-DataVisualization.html#common-parameter-options",
     "title": "Module 10: Data Visualization",
     "section": "Common parameter options",
-    "text": "Common parameter options\nSix useful parameter arguments help improve the readability of the plot:\n\nxlab: specifies the x-axis label of the plot\nylab: specifies the y-axis label\nmain: titles your graph\npch: specifies the symbology of your graph\nlty: specifies the line type of your graph\nlwd: specifies line thickness\ncex : specifies size\ncol: specifies the colors for your graph.\n\nWe will explore use of these arguments below."
+    "text": "Common parameter options\nEight useful parameter arguments help improve the readability of the plot:\n\nxlab: specifies the x-axis label of the plot\nylab: specifies the y-axis label\nmain: titles your graph\npch: specifies the symbology of your graph\nlty: specifies the line type of your graph\nlwd: specifies line thickness\ncex : specifies size\ncol: specifies the colors for your graph.\n\nWe will explore use of these arguments below."
   },
   {
     "objectID": "modules/Module10-DataVisualization.html#common-parameter-options-1",
@@ -928,7 +828,7 @@
     "href": "modules/Module00-Welcome.html#welcome-to-sismid-workshop-introduction-to-r",
     "title": "Welcome to SISMID Workshop: Introduction to R",
     "section": "Welcome to SISMID Workshop: Introduction to R!",
-    "text": "Welcome to SISMID Workshop: Introduction to R!\nAmy Winter (she/her) Assistant Professor, Department of Epidemiology and Biostatistics Email: awinter@uga.edu\nZane Billings (he/him) PhD Candidate, Department of Epidemiology and Biostatistics Email: Wesley.Billings@uga.edu"
+    "text": "Welcome to SISMID Workshop: Introduction to R!\nAmy Winter (she/her)\nAssistant Professor, Department of Epidemiology and Biostatistics\nEmail: awinter@uga.edu\n\nZane Billings (he/him)\nPhD Candidate, Department of Epidemiology and Biostatistics\nEmail: Wesley.Billings@uga.edu"
   },
   {
     "objectID": "modules/Module00-Welcome.html#introductions",
@@ -963,7 +863,7 @@
     "href": "modules/Module00-Welcome.html#what-is-r-3",
     "title": "Welcome to SISMID Workshop: Introduction to R",
     "section": "What is R?",
-    "text": "What is R?\n\nProgram: R is a clear and accessible programming tool\nTransform: R is made up of a collection of libraries designed specifically for data science\nDiscover: Investigate the data, refine your hypothesis and analyze them\nModel: R provides a wide array of tools to capture the right model for your data\nCommunicate: Integrate codes, graphs, and outputs to a report with R Markdown or build Shiny apps to share with the world"
+    "text": "What is R?\n\nProgram: R is a clear and accessible programming tool\nTransform: R is made up of a collection of packages/libraries designed specifically for statistical computing\nDiscover: Investigate the data, refine your hypotheses, and analyze them\nModel: R provides a wide array of tools to capture the right model for your data\nCommunicate: Integrate code, graphs, and outputs into a report with R Markdown or build Shiny apps to share with the world"
   },
   {
     "objectID": "modules/Module00-Welcome.html#why-r",
@@ -990,8 +890,8 @@
     "objectID": "modules/Module00-Welcome.html#is-r-difficult",
     "href": "modules/Module00-Welcome.html#is-r-difficult",
     "title": "Welcome to SISMID Workshop: Introduction to R",
-    "section": "Is R DIfficult?",
-    "text": "Is R DIfficult?\n\nShort answer – It has a steep learning curve.\nYears ago, R was a difficult language to master. The language was confusing and not as structured as the other programming tools.\nHadley Wickham developed a collection of packages called tidyverse. Data manipulation became trivial and intuitive. Creating a graph was not so difficult anymore."
+    "section": "Is R Difficult?",
+    "text": "Is R Difficult?\n\nShort answer – It has a steep learning curve, like all programming languages\nYears ago, R was a difficult language to master.\nHadley Wickham developed a collection of packages called tidyverse. Data manipulation became trivial and intuitive. Creating a graph was not so difficult anymore."
   },
   {
     "objectID": "modules/Module00-Welcome.html#workshop-objective",
@@ -1012,7 +912,7 @@
     "href": "modules/Module00-Welcome.html#workshop-overview",
     "title": "Welcome to SISMID Workshop: Introduction to R",
     "section": "Workshop Overview",
-    "text": "Workshop Overview\n14 lecture blocks that will each: - Start with learning objectives - End with summary slides - Include mini-exercise(s) or a full exercise\nThemes that will show up throughout the workshop: - Reproducibility - Good coding techniques - Thinking algorithmically - Basic terms / R jargon"
+    "text": "Workshop Overview\n14 lecture blocks that will each:\n\nStart with learning objectives\nEnd with summary slides\nInclude mini-exercise(s) or a full exercise\n\nThemes that will show up throughout the workshop:\n\nReproducibility\nGood coding techniques\nThinking algorithmically\nBasic terms / R jargon"
   },
   {
     "objectID": "modules/Module00-Welcome.html#course-format",
@@ -1075,7 +975,7 @@
     "href": "modules/Module00-Welcome.html#overall-workshop-objectives",
     "title": "Welcome to SISMID Workshop: Introduction to R",
     "section": "Overall Workshop Objectives",
-    "text": "Overall Workshop Objectives\nBy the end of this workshop, you should be able to\n\nstart a new project, read in data, and conduct basic data manipulation, analysis, and visualization\nknow how to use and find packages/functions that we did not specifically learn in class\ntroubleshoot errors (xxzane? – not included right now)"
+    "text": "Overall Workshop Objectives\nBy the end of this workshop, you should be able to\n\nstart a new project, read in data, and conduct basic data manipulation, analysis, and visualization\nknow how to use and find packages/functions that we did not specifically learn in class\ntroubleshoot errors"
   },
   {
     "objectID": "modules/Module00-Welcome.html#this-workshop-differs-from-introduction-to-tidervyse",
@@ -1089,392 +989,252 @@
     "href": "modules/Module00-Welcome.html#this-workshop-differs-from-introduction-to-tidyverse",
     "title": "Welcome to SISMID Workshop: Introduction to R",
     "section": "This workshop differs from “Introduction to Tidyverse”",
-    "text": "This workshop differs from “Introduction to Tidyverse”\nWe will focus this class on using Base R functions and packages, i.e., pre-installed into R and the basis for most other functions and packages! If you know Base R then are will be more equipped to use all the other useful/pretty packages that exit.\nthe Tidyverse is one set of useful/pretty packages, designed to can make your code more intuitive as compared to the original older Base R. Tidyverse advantages:\n\nconsistent structure - making it easier to learn how to use different packages\nparticularly good for wrangling (manipulating, cleaning, joining) data\n\nmore flexible for visualizing data"
+    "text": "This workshop differs from “Introduction to Tidyverse”\nWe will focus this class on using Base R functions and packages, i.e., those pre-installed into R and the basis for most other functions and packages! If you know Base R then you will be better equipped to use all the other useful/pretty packages that exist.\nThe Tidyverse is one set of useful/pretty packages, designed to make your code more intuitive as compared to the original older Base R. Tidyverse advantages:\n\nconsistent structure - making it easier to learn how to use different packages\nparticularly good for wrangling (manipulating, cleaning, joining) data\n\nmore flexible for visualizing data"
   },
   {
     "objectID": "modules/Module01-Intro.html#learning-objectives",
     "href": "modules/Module01-Intro.html#learning-objectives",
     "title": "Module 1: Introduction to RStudio and R Basics",
     "section": "Learning Objectives",
-    "text": "Learning Objectives\nAfter module 1, you should be able to…\n\nCreate and save an R script\nDescribe the utility and differences b/w the console and an R script\nModify R Studio windows\nCreate objects\nDescribe the difference b/w character, numeric, list, and matrix objects\nReference objects in the RStudio Global Environment\nUse basic arithmetic operators in R\nUse comments within an R script to create header, sections, and make notes",
-    "crumbs": [
-      "Day 1",
-      "Module 1: Introduction to RStudio and R Basics"
-    ]
+    "text": "Learning Objectives\nAfter module 1, you should be able to…\n\nCreate and save an R script\nDescribe the utility and differences b/w the Console and the Source panes\nModify RStudio panes\nCreate objects\nDescribe the difference b/w character, numeric, list, and matrix objects\nReference objects in the RStudio Environment pane\nUse basic arithmetic operators in R\nUse comments within an R script to create headers, sections, and make notes"
   },
   {
     "objectID": "modules/Module01-Intro.html#working-with-r-rstudio",
     "href": "modules/Module01-Intro.html#working-with-r-rstudio",
     "title": "Module 1: Introduction to RStudio and R Basics",
     "section": "Working with R – RStudio",
-    "text": "Working with R – RStudio\nRStudio is an Integrated Development Environment (IDE) for R\n\nIt helps the user effectively use R\nMakes things easier\nIs NOT a dropdown statistical tool (such as Stata)\n\nSee Rcmdr or Radiant",
-    "crumbs": [
-      "Day 1",
-      "Module 1: Introduction to RStudio and R Basics"
-    ]
+    "text": "Working with R – RStudio\nRStudio is an Integrated Development Environment (IDE) for R\n\nIt helps the user effectively use R\nMakes things easier\nIs NOT a dropdown statistical tool (such as Stata)\n\nSee jamovi or also Rcmdr, Radiant"
   },
   {
     "objectID": "modules/Module01-Intro.html#rstudio",
     "href": "modules/Module01-Intro.html#rstudio",
     "title": "Module 1: Introduction to RStudio and R Basics",
     "section": "RStudio",
-    "text": "RStudio\nEasier working with R\n\nSyntax highlighting, code completion, and smart indentation\nEasily manage multiple working directories and projects\n\nMore information\n\nWorkspace browser and data viewer\nPlot history, zooming, and flexible image and file export\nIntegrated R help and documentation\nSearchable command history",
-    "crumbs": [
-      "Day 1",
-      "Module 1: Introduction to RStudio and R Basics"
-    ]
+    "text": "RStudio\nEasier working with R\n\nSyntax highlighting, code completion, and smart indentation\nEasily manage multiple working directories and projects\n\nMore information\n\nWorkspace browser and data viewer\nPlot history, zooming, and flexible image and file export\nIntegrated R help and documentation\nSearchable command history"
   },
   {
     "objectID": "modules/Module01-Intro.html#rstudio-1",
     "href": "modules/Module01-Intro.html#rstudio-1",
     "title": "Module 1: Introduction to RStudio and R Basics",
     "section": "RStudio",
-    "text": "RStudio",
-    "crumbs": [
-      "Day 1",
-      "Module 1: Introduction to RStudio and R Basics"
-    ]
+    "text": "RStudio"
   },
   {
     "objectID": "modules/Module01-Intro.html#getting-the-editor",
     "href": "modules/Module01-Intro.html#getting-the-editor",
     "title": "Module 1: Introduction to RStudio and R Basics",
     "section": "Getting the editor",
-    "text": "Getting the editor",
-    "crumbs": [
-      "Day 1",
-      "Module 1: Introduction to RStudio and R Basics"
-    ]
+    "text": "Getting the editor"
   },
   {
     "objectID": "modules/Module01-Intro.html#working-with-r-in-rstudio---2-major-panes",
     "href": "modules/Module01-Intro.html#working-with-r-in-rstudio---2-major-panes",
     "title": "Module 1: Introduction to RStudio and R Basics",
     "section": "Working with R in RStudio - 2 major panes:",
-    "text": "Working with R in RStudio - 2 major panes:\n\nThe Source/Editor: “Analysis” Script + Interactive Exploration\n\nStatic copy of what you did (reproducibility)\nTop by default\n\nThe R Console: “interprets” whatever you type\n\nCalculator\nTry things out interactively, then add to your editor\nBottom by default",
-    "crumbs": [
-      "Day 1",
-      "Module 1: Introduction to RStudio and R Basics"
-    ]
+    "text": "Working with R in RStudio - 2 major panes:\n\nThe Source/Editor:\n\n\n“Analysis” Script\nStatic copy of what you did (reproducibility)\nTop by default\n\n\nThe R Console: “interprets” whatever you type:\n\nCalculator\nTry things out interactively, then add to your editor\nBottom by default"
   },
   {
     "objectID": "modules/Module01-Intro.html#source-editor",
     "href": "modules/Module01-Intro.html#source-editor",
     "title": "Module 1: Introduction to RStudio and R Basics",
     "section": "Source / Editor",
-    "text": "Source / Editor\n\nWhere files open to\nHave R code and comments in them\nWhere code is saved",
-    "crumbs": [
-      "Day 1",
-      "Module 1: Introduction to RStudio and R Basics"
-    ]
+    "text": "Source / Editor\n\nWhere files open to\nHave R code and comments in them\nWhere code is saved"
   },
   {
     "objectID": "modules/Module01-Intro.html#r-console",
     "href": "modules/Module01-Intro.html#r-console",
     "title": "Module 1: Introduction to RStudio and R Basics",
     "section": "R Console",
-    "text": "R Console\n\nWhere code is executed (where things happen)\nYou can type here for things interactively\nCode is not saved",
-    "crumbs": [
-      "Day 1",
-      "Module 1: Introduction to RStudio and R Basics"
-    ]
+    "text": "R Console\n\nWhere code is executed (where things happen)\nYou can type here for things interactively\nCode is not saved"
   },
   {
     "objectID": "modules/Module01-Intro.html#rstudio-2",
     "href": "modules/Module01-Intro.html#rstudio-2",
     "title": "Module 1: Introduction to RStudio and R Basics",
     "section": "RStudio",
-    "text": "RStudio\nUseful RStudio “cheat sheet”: https://github.com/rstudio/cheatsheets/blob/main/rstudio-ide.pdf",
-    "crumbs": [
-      "Day 1",
-      "Module 1: Introduction to RStudio and R Basics"
-    ]
+    "text": "RStudio\nUseful RStudio “cheat sheet”: https://github.com/rstudio/cheatsheets/blob/main/rstudio-ide.pdf"
   },
   {
     "objectID": "modules/Module01-Intro.html#rstudio-layout",
     "href": "modules/Module01-Intro.html#rstudio-layout",
     "title": "Module 1: Introduction to RStudio and R Basics",
     "section": "RStudio Layout",
-    "text": "RStudio Layout\nIf RStudio doesn’t look the way you want (or like our RStudio), then do:\nRStudio –&gt; View –&gt; Panes –&gt; Pane Layout",
-    "crumbs": [
-      "Day 1",
-      "Module 1: Introduction to RStudio and R Basics"
-    ]
+    "text": "RStudio Layout\nIf RStudio doesn’t look the way you want (or like our RStudio), then do:\nIn the RStudio menu bar, go to View –&gt; Panes –&gt; Pane Layout"
   },
   {
     "objectID": "modules/Module01-Intro.html#workspaceenvironment",
     "href": "modules/Module01-Intro.html#workspaceenvironment",
     "title": "Module 1: Introduction to RStudio and R Basics",
     "section": "Workspace/Environment",
-    "text": "Workspace/Environment\n\nTells you what objects are in R\nWhat exists in memory/what is loaded?/what did I read in?",
-    "crumbs": [
-      "Day 1",
-      "Module 1: Introduction to RStudio and R Basics"
-    ]
+    "text": "Workspace/Environment\n\nTells you what objects are in R\nWhat exists in memory/what is loaded?/what did I read in?"
   },
   {
     "objectID": "modules/Module01-Intro.html#workspacehistory",
     "href": "modules/Module01-Intro.html#workspacehistory",
     "title": "Module 1: Introduction to RStudio and R Basics",
     "section": "Workspace/History",
-    "text": "Workspace/History\n\nShows previous commands. Good to look at for debugging, but don’t rely on it.\nAlso type the “up” key in the Console to scroll through previous commands",
-    "crumbs": [
-      "Day 1",
-      "Module 1: Introduction to RStudio and R Basics"
-    ]
+    "text": "Workspace/History\n\nShows previous commands. Good to look at for debugging, but don’t rely on it.\nYou can also press the “up” and “down” keys in the Console to scroll through previous commands"
   },
   {
     "objectID": "modules/Module01-Intro.html#workspaceother-panes",
     "href": "modules/Module01-Intro.html#workspaceother-panes",
     "title": "Module 1: Introduction to RStudio and R Basics",
     "section": "Workspace/Other Panes",
-    "text": "Workspace/Other Panes\n\nFiles - shows the files on your computer of the directory you are working in\nViewer - can view data or R objects\nHelp - shows help of R commands\nPlots - pictures and figures\nPackages - list of R packages that are loaded in memory",
-    "crumbs": [
-      "Day 1",
-      "Module 1: Introduction to RStudio and R Basics"
-    ]
+    "text": "Workspace/Other Panes\n\nFiles - shows the files in the directory you are working in\nViewer - can view data or R objects\nHelp - shows help of R commands\nPlots - pictures and figures\nPackages - list of R packages that are loaded in memory"
   },
   {
     "objectID": "modules/Module01-Intro.html#getting-started",
     "href": "modules/Module01-Intro.html#getting-started",
     "title": "Module 1: Introduction to RStudio and R Basics",
     "section": "Getting Started",
-    "text": "Getting Started\n\nFile –&gt; New File –&gt; R Script\nSave the blank R script as Module1.R",
-    "crumbs": [
-      "Day 1",
-      "Module 1: Introduction to RStudio and R Basics"
-    ]
+    "text": "Getting Started\n\nIn the RStudio menu bar, go to File –&gt; New File –&gt; R Script\nSave the blank R script as Module1.R"
   },
   {
     "objectID": "modules/Module01-Intro.html#explaining-output-on-slides",
     "href": "modules/Module01-Intro.html#explaining-output-on-slides",
     "title": "Module 1: Introduction to RStudio and R Basics",
     "section": "Explaining output on slides",
-    "text": "Explaining output on slides\nIn slides, a command (we’ll also call them code or a code chunk) will look like this\n\nprint(\"I'm code\")\n\n[1] \"I'm code\"\n\n\nAnd then directly after it, will be the output of the code.\nSo print(\"I'm code\") is the code chunk and [1] \"I'm code\" is the output.",
-    "crumbs": [
-      "Day 1",
-      "Module 1: Introduction to RStudio and R Basics"
-    ]
+    "text": "Explaining output on slides\nIn slides, the R command/code will be in a box, and directly after it will be the output of the code, starting with [1]\n\nprint(\"I'm code\")\n\n[1] \"I'm code\"\n\n\nSo print(\"I'm code\") is the command and [1] \"I'm code\" is the output.\n\nCommands/code and output written as inline text will be in blue typewriter font. For example: code"
   },
   {
     "objectID": "modules/Module01-Intro.html#r-as-a-calculator",
     "href": "modules/Module01-Intro.html#r-as-a-calculator",
     "title": "Module 1: Introduction to RStudio and R Basics",
     "section": "R as a calculator",
-    "text": "R as a calculator\nYou can do basic arithmetic in R, which I surprisingly use all the time.\n\n2 + 2\n\n[1] 4\n\n2 * 4\n\n[1] 8\n\n2^3\n\n[1] 8",
-    "crumbs": [
-      "Day 1",
-      "Module 1: Introduction to RStudio and R Basics"
-    ]
+    "text": "R as a calculator\nYou can do basic arithmetic in R, which I surprisingly use all the time.\n\n2 + 2\n\n[1] 4\n\n2 * 4\n\n[1] 8\n\n2^3\n\n[1] 8"
   },
   {
     "objectID": "modules/Module01-Intro.html#r-as-a-calculator-1",
     "href": "modules/Module01-Intro.html#r-as-a-calculator-1",
     "title": "Module 1: Introduction to RStudio and R Basics",
     "section": "R as a calculator",
-    "text": "R as a calculator\n\nThe R console is a full calculator\nTry to play around with it:\n\n+, -, /, * are add, subtract, divide and multiply\n^ or ** is power\nparentheses – ( and ) – work with order of operations\n%% finds the remainder",
-    "crumbs": [
-      "Day 1",
-      "Module 1: Introduction to RStudio and R Basics"
-    ]
+    "text": "R as a calculator\n\nThe R console is a full calculator\nArithmetic operators:\n\n+, -, /, * are add, subtract, divide and multiply\n^ or ** is power\nparentheses – ( and ) – work with order of operations\n%% finds the remainder"
   },
   {
     "objectID": "modules/Module01-Intro.html#execute-run-code",
     "href": "modules/Module01-Intro.html#execute-run-code",
     "title": "Module 1: Introduction to RStudio and R Basics",
     "section": "Execute / Run Code",
-    "text": "Execute / Run Code\nTo execute or run a line of code, you just put your cursor on line of code and then:\n\nPress Run (which you will find at the top of your window)\n\nOR\n\nPress Cmd + Return (iOS) OR Ctrl + Enter (Windows).\n\nTo execute or run multiple lines of code, you just need to highlight the code you want to run and then follow option 1 or 2.",
-    "crumbs": [
-      "Day 1",
-      "Module 1: Introduction to RStudio and R Basics"
-    ]
+    "text": "Execute / Run Code\nTo execute or run a line of code (i.e., a command), put your cursor on the command and then:\n\nPress Run (which you will find at the top of your window)\n\nOR\n\nPress Cmd + Return (macOS) OR Ctrl + Enter (Windows).\n\nTo execute or run multiple lines of code, highlight the code you want to run and then follow option 1 or 2."
   },
   {
     "objectID": "modules/Module01-Intro.html#mini-exercise",
     "href": "modules/Module01-Intro.html#mini-exercise",
     "title": "Module 1: Introduction to RStudio and R Basics",
     "section": "Mini exercise",
-    "text": "Mini exercise\nExecute 5+4 from your .R file, and then find the answer 9 in the Console.",
-    "crumbs": [
-      "Day 1",
-      "Module 1: Introduction to RStudio and R Basics"
-    ]
+    "text": "Mini exercise\nExecute 5+4 from your .R file, and then find the answer 9 in the Console."
   },
   {
     "objectID": "modules/Module01-Intro.html#commenting-in-scripts",
     "href": "modules/Module01-Intro.html#commenting-in-scripts",
     "title": "Module 1: Introduction to RStudio and R Basics",
     "section": "Commenting in Scripts",
-    "text": "Commenting in Scripts\nThe syntax # creates a comment, which means anything to the right of # will not be executed / run\nCommenting is useful to:\n\nCreate headers for R Scripts\nCreate sections within an R Script\nExplain what is happening in your code",
-    "crumbs": [
-      "Day 1",
-      "Module 1: Introduction to RStudio and R Basics"
-    ]
+    "text": "Commenting in Scripts\nThe syntax # creates a comment, which means anything to the right of # will not be executed / run\nCommenting is useful to:\n\nCreate headers for R Scripts\nCreate sections within an R Script\nExplain what is happening in your code"
   },
   {
     "objectID": "modules/Module01-Intro.html#commenting-an-r-script-header",
     "href": "modules/Module01-Intro.html#commenting-an-r-script-header",
     "title": "Module 1: Introduction to RStudio and R Basics",
     "section": "Commenting an R Script header",
-    "text": "Commenting an R Script header\nAdd a comment header to Module1.R. This is the one I typically use, but you may have your own preference. The goal is that you are consistent so that future you / collaborators can make sense of your code.\n\n### Title: Module 1\n### Author: Amy Winter \n### Objective: Mini Exercise - Developing first R Script\n### Date: 15 July 2024",
-    "crumbs": [
-      "Day 1",
-      "Module 1: Introduction to RStudio and R Basics"
-    ]
+    "text": "Commenting an R Script header\nAdd a comment header to Module1.R. This is the one I typically use, but you may have your own preference. The goal is that you are consistent so that future you / collaborators can make sense of your code.\n\n### Title: Module 1\n### Author: Amy Winter \n### Objective: Mini Exercise - Developing first R Script\n### Date: 15 July 2024"
   },
   {
     "objectID": "modules/Module01-Intro.html#commenting-to-create-sections",
     "href": "modules/Module01-Intro.html#commenting-to-create-sections",
     "title": "Module 1: Introduction to RStudio and R Basics",
     "section": "Commenting to create sections",
-    "text": "Commenting to create sections\nYou can also create sections within your code by ending a comment with 4 hash marks. This is very useful for creating an outline of your R Script. The “Outline” can be found in the top right of the your source window.\n\n# Section 1 Header ####\n## Section 2 Sub-header ####\n### Section 3 Sub-sub-header ####\n#### Section 4 Sub-sub-sub-header ####",
-    "crumbs": [
-      "Day 1",
-      "Module 1: Introduction to RStudio and R Basics"
-    ]
+    "text": "Commenting to create sections\nYou can also create sections within your code by ending a comment with 4 hash marks. This is very useful for creating an outline of your R Script. The “Outline” can be found in the top right of your Source pane.\n\n# Section 1 Header ####\n## Section 2 Sub-header ####\n### Section 3 Sub-sub-header ####\n#### Section 4 Sub-sub-sub-header ####"
   },
   {
     "objectID": "modules/Module01-Intro.html#commenting-to-explain-code",
     "href": "modules/Module01-Intro.html#commenting-to-explain-code",
     "title": "Module 1: Introduction to RStudio and R Basics",
     "section": "Commenting to explain code",
-    "text": "Commenting to explain code\n\n## this # is still a comment\n### you can use many #'s as you want\n\n# sometimes you have a really long comment,\n#    like explaining what you are doing\n#    for a step in analysis. \n# Take it to another line",
-    "crumbs": [
-      "Day 1",
-      "Module 1: Introduction to RStudio and R Basics"
-    ]
+    "text": "Commenting to explain code\n\n## this # is still a comment\n### you can use as many #'s as you want\n\n# sometimes you have a really long comment,\n#    like explaining what you are doing\n#    for a step in analysis. \n# Take it to another line\n\nI tend to use:\n\nOne hash mark with a space to describe what is happening in the following few lines of code\nOne hash mark with no space after a command to list specifics\n\n\n# Practicing my arithmetic\n5+2\n3*5\n9/8\n\n5+2 #5 plus 2"
   },
   {
     "objectID": "modules/Module01-Intro.html#commenting-to-explain-code-1",
     "href": "modules/Module01-Intro.html#commenting-to-explain-code-1",
     "title": "Module 1: Introduction to RStudio and R Basics",
     "section": "Commenting to explain code",
-    "text": "Commenting to explain code\nI tend to use:\n\nOne hash tag with a space to describe what is happening in the following few lines of code\nOne hastag with no space after a command to list specifics\n\n\n# Practicing my arithmetic\n5+2\n3*5\n9/8\n\n5+2 #5 plus 2",
-    "crumbs": [
-      "Day 1",
-      "Module 1: Introduction to RStudio and R Basics"
-    ]
+    "text": "Commenting to explain code\nI tend to use:\n\nOne hash mark with a space to describe what is happening in the following few lines of code\nOne hash mark with no space after a command to list specifics\n\n\n# Practicing my arithmetic\n5+2\n3*5\n9/8\n\n5+2 #5 plus 2"
   },
   {
     "objectID": "modules/Module01-Intro.html#object---basic-terms",
     "href": "modules/Module01-Intro.html#object---basic-terms",
     "title": "Module 1: Introduction to RStudio and R Basics",
     "section": "Object - Basic terms",
-    "text": "Object - Basic terms\nObject - an object is something that can be worked with in R - can be lots of different things!\n\na scalar / number\na vector\na matrix of numbers\na list\na plot\na function\n\n… many more",
-    "crumbs": [
-      "Day 1",
-      "Module 1: Introduction to RStudio and R Basics"
-    ]
+    "text": "Object - Basic terms\nObject - an object is something that can be worked with in R - can be lots of different things!\n\na scalar / number\na vector\na matrix of numbers\na list\na plot\na function\n\n… many more"
   },
   {
     "objectID": "modules/Module01-Intro.html#objects",
     "href": "modules/Module01-Intro.html#objects",
     "title": "Module 1: Introduction to RStudio and R Basics",
     "section": "Objects",
-    "text": "Objects\n\nYou can create objects from within the R environment and from files on your computer\nR uses &lt;- to assign values to an object name\nNote: Object names are case-sensitive, i.e. X and x are different\nHere are examples of creating five different objects:\n\n\nnumber.object &lt;- 3\ncharacter.object &lt;- \"blue\"\nvector.object1 &lt;- c(2,3,4,5)\nvector.object2 &lt;- c(\"blue\", \"red\", \"yellow\")\nmatrix.object &lt;- matrix(data=vector.object1, nrow=2, ncol=2, byrow=TRUE)\n\nNote, c() and matrix() are functions, which we will talk more about in module 2.",
-    "crumbs": [
-      "Day 1",
-      "Module 1: Introduction to RStudio and R Basics"
-    ]
+    "text": "Objects\n\nYou can create objects from within the R environment and from files on your computer\nR uses &lt;- to assign values to an object name\nNote: Object names are case-sensitive, i.e. X and x are different\nHere are examples of creating five different objects:\n\n\nnumber.object &lt;- 3\ncharacter.object &lt;- \"blue\"\nvector.object1 &lt;- c(2,3,4,5)\nvector.object2 &lt;- c(\"blue\", \"red\", \"yellow\")\nmatrix.object &lt;- matrix(data=vector.object1, nrow=2, ncol=2, byrow=TRUE)\n\nNote, c() and matrix() are functions, which we will talk more about in module 2."
   },
   {
     "objectID": "modules/Module01-Intro.html#objects-1",
     "href": "modules/Module01-Intro.html#objects-1",
     "title": "Module 1: Introduction to RStudio and R Basics",
     "section": "Objects",
-    "text": "Objects\nNote, you can find these objects now in the Global Environment.\n\nAlso, you can call them anytime (i.e, see them in the Console) by executing (running) the object. For example,\n\ncharacter.object\n\n[1] \"blue\"\n\n\n\nmatrix.object\n\n     [,1] [,2]\n[1,]    2    3\n[2,]    4    5",
-    "crumbs": [
-      "Day 1",
-      "Module 1: Introduction to RStudio and R Basics"
-    ]
+    "text": "Objects\nNote, you can find these objects now in the Global Environment.\n\nAlso, you can print them anytime (i.e., see them in the Console) by executing (running) the object. For example,\n\ncharacter.object\n\n[1] \"blue\"\n\n\n\nmatrix.object\n\n     [,1] [,2]\n[1,]    2    3\n[2,]    4    5"
   },
   {
     "objectID": "modules/Module01-Intro.html#assignment---good-coding",
     "href": "modules/Module01-Intro.html#assignment---good-coding",
     "title": "Module 1: Introduction to RStudio and R Basics",
     "section": "Assignment - Good coding",
-    "text": "Assignment - Good coding\n= and &lt;- can both be used for assignment, but &lt;- is better coding practice, because == is a logical operator. We will talk about this more, later.",
-    "crumbs": [
-      "Day 1",
-      "Module 1: Introduction to RStudio and R Basics"
-    ]
+    "text": "Assignment - Good coding\n= and &lt;- can both be used for assignment, but &lt;- is better coding practice, because = is easily confused with ==, which is a logical operator. We will talk about this more, later."
   },
   {
     "objectID": "modules/Module01-Intro.html#lists",
     "href": "modules/Module01-Intro.html#lists",
     "title": "Module 1: Introduction to RStudio and R Basics",
     "section": "Lists",
-    "text": "Lists\nList is a special data class, that can hold vectors, strings, matrices, models, list of other lists.\n\nlist.object &lt;- list(number.object, vector.object2, matrix.object)\nlist.object\n\n[[1]]\n[1] 3\n\n[[2]]\n[1] \"blue\"   \"red\"    \"yellow\"\n\n[[3]]\n     [,1] [,2]\n[1,]    2    3\n[2,]    4    5",
-    "crumbs": [
-      "Day 1",
-      "Module 1: Introduction to RStudio and R Basics"
-    ]
+    "text": "Lists\nA list is a special data class that can hold vectors, strings, matrices, models, and even other lists.\n\nlist.object &lt;- list(number.object, vector.object2, matrix.object)\nlist.object\n\n[[1]]\n[1] 3\n\n[[2]]\n[1] \"blue\"   \"red\"    \"yellow\"\n\n[[3]]\n     [,1] [,2]\n[1,]    2    3\n[2,]    4    5"
   },
   {
     "objectID": "modules/Module01-Intro.html#useful-r-studio-shortcuts",
     "href": "modules/Module01-Intro.html#useful-r-studio-shortcuts",
     "title": "Module 1: Introduction to RStudio and R Basics",
     "section": "Useful R Studio Shortcuts",
-    "text": "Useful R Studio Shortcuts\nWill certainly save you time\n\nCmd + Return (iOS) OR Ctrl + Enter (Windows) in your script evaluates current line/selection\n\nIt’s like copying and pasting the code into the console for it to run.\n\npressing Up/Down in the Console allows you to navigate command history\n\nSee http://www.rstudio.com/ide/docs/using/keyboard_shortcuts for many more",
-    "crumbs": [
-      "Day 1",
-      "Module 1: Introduction to RStudio and R Basics"
-    ]
+    "text": "Useful R Studio Shortcuts\nWill certainly save you time\n\nCmd + Return (macOS) OR Ctrl + Enter (Windows) in your script evaluates current line/selection\n\nIt’s like copying and pasting the code into the console for it to run.\n\nPressing Up/Down in the Console allows you to navigate command history\n\nSee http://www.rstudio.com/ide/docs/using/keyboard_shortcuts for many more"
   },
   {
     "objectID": "modules/Module01-Intro.html#rstudio-helps-with-tab-completion",
     "href": "modules/Module01-Intro.html#rstudio-helps-with-tab-completion",
     "title": "Module 1: Introduction to RStudio and R Basics",
     "section": "RStudio helps with “tab completion”",
-    "text": "RStudio helps with “tab completion”\nIf you start typing a object, RStudio will show you options that you can choose without typing out the whole object.",
-    "crumbs": [
-      "Day 1",
-      "Module 1: Introduction to RStudio and R Basics"
-    ]
+    "text": "RStudio helps with “tab completion”\nIf you start typing the name of an object, RStudio will show you options that you can choose without typing out the whole name."
   },
   {
     "objectID": "modules/Module01-Intro.html#summary",
     "href": "modules/Module01-Intro.html#summary",
     "title": "Module 1: Introduction to RStudio and R Basics",
     "section": "Summary",
-    "text": "Summary\n\nRStudio makes working in R easier\nThe Editor is for static code like R Scripts\nThe Console is for testing code that can’t be saved\nCommenting is your new best friend\nIn R we create objects that can be viewed in the Environment panel and called anytime\nAn object is something that can be worked with in R\nUse &lt;- syntax to create objects",
-    "crumbs": [
-      "Day 1",
-      "Module 1: Introduction to RStudio and R Basics"
-    ]
+    "text": "Summary\n\nRStudio makes working in R easier\nThe Editor is for static code like R Scripts\nThe Console is for testing code that can’t be saved\nCommenting is your new best friend\nIn R we create objects that can be viewed in the Environment pane and used anytime\nAn object is something that can be worked with in R\nUse &lt;- syntax to create objects"
   },
   {
     "objectID": "modules/Module01-Intro.html#mini-exercise-1",
     "href": "modules/Module01-Intro.html#mini-exercise-1",
     "title": "Module 1: Introduction to RStudio and R Basics",
     "section": "Mini Exercise",
-    "text": "Mini Exercise\nTry creating one or two of these objects in your R script\n\nnumber.object &lt;- 3\ncharacter.object &lt;- \"blue\"\nvector.object1 &lt;- c(2,3,4,5)\nvector.object2 &lt;- c(\"blue\", \"red\", \"yellow\")\nmatrix.object &lt;- matrix(data=vector.object1, nrow=2, ncol=2, byrow=TRUE)",
-    "crumbs": [
-      "Day 1",
-      "Module 1: Introduction to RStudio and R Basics"
-    ]
+    "text": "Mini Exercise\nTry creating one or two of these objects in your R script\n\nnumber.object &lt;- 3\ncharacter.object &lt;- \"blue\"\nvector.object1 &lt;- c(2,3,4,5)\nvector.object2 &lt;- c(\"blue\", \"red\", \"yellow\")\nmatrix.object &lt;- matrix(data=vector.object1, nrow=2, ncol=2, byrow=TRUE)"
   },
   {
     "objectID": "modules/Module01-Intro.html#acknowledgements",
     "href": "modules/Module01-Intro.html#acknowledgements",
     "title": "Module 1: Introduction to RStudio and R Basics",
     "section": "Acknowledgements",
-    "text": "Acknowledgements\nThese are the materials I looked through, modified, or extracted to complete this module’s lecture.\n\n“Introduction to R for Public Health Researchers” Johns Hopkins University\nSome RStudio snapshots were pulled from http://ayeimanol-r.net/2013/04/21/289/",
-    "crumbs": [
-      "Day 1",
-      "Module 1: Introduction to RStudio and R Basics"
-    ]
+    "text": "Acknowledgements\nThese are the materials we looked through, modified, or extracted to complete this module’s lecture.\n\n“Introduction to R for Public Health Researchers” Johns Hopkins University\nSome RStudio snapshots were pulled from http://ayeimanol-r.net/2013/04/21/289/"
   },
   {
     "objectID": "modules/Module01-Intro.html#mini-exercise-2",
     "href": "modules/Module01-Intro.html#mini-exercise-2",
     "title": "Module 1: Introduction to RStudio and R Basics",
     "section": "Mini Exercise",
-    "text": "Mini Exercise\n\nCreate a new number object and name it my.object\nCreate a vector of 4 numbers and name it my.vector using the c() function\nAdd my.object and my.vector together use arithmatic operator",
-    "crumbs": [
-      "Day 1",
-      "Module 1: Introduction to RStudio and R Basics"
-    ]
+    "text": "Mini Exercise\n\nCreate a new number object and name it my.object\nCreate a vector of 4 numbers and name it my.vector using the c() function\nAdd my.object and my.vector together using an arithmetic operator"
   },
   {
     "objectID": "modules/Module02-Functions.html#learning-objectives",
@@ -1495,7 +1255,7 @@
     "href": "modules/Module02-Functions.html#function",
     "title": "Module 2: Functions",
     "section": "Function",
-    "text": "Function\nThe general usage for a function is the name of the function followed by parentheses. Within the parentheses are arguments.\n\nfunction_name(argument1, argument2, ...)"
+    "text": "Function\nThe general usage for a function is the name of the function followed by parentheses (i.e., the function signature). Within the parentheses are arguments.\n\nfunction_name(argument1, argument2, ...)"
   },
   {
     "objectID": "modules/Module02-Functions.html#arguments---basic-term",
@@ -1537,28 +1297,28 @@
     "href": "modules/Module02-Functions.html#package---basic-term",
     "title": "Module 2: Functions",
     "section": "Package - Basic term",
-    "text": "Package - Basic term\nWhen you download R, it has a “base” set of functions, that are associated with a “base” set of packages including: ‘base’, ‘datasets’, ‘graphics’, ‘grDevices’, ‘methods’, ‘stats’, ‘methods’ (typically just referred to as Base R).\n\ne.g., the log() function comes from the ‘base’ package\n\nPackage - a package in R is a bundle or “package” of code (and or possibly data) that can be loaded together for easy repeated use or for sharing with others.\nPackages are analogous to software applications like Microsoft Word. After installation, your operating system allows you to use it, just like having Word installed allows you to use it."
+    "text": "Package - Basic term\nWhen you download R, it has a “base” set of functions, that are associated with a “base” set of packages including: ‘base’, ‘datasets’, ‘graphics’, ‘grDevices’, ‘methods’, ‘stats’ (typically just referred to as Base R).\n\ne.g., the log() function comes from the ‘base’ package\n\nPackage - a package in R is a bundle or “package” of code (and or possibly data) that can be loaded together for easy repeated use or for sharing with others.\nPackages are analogous to software applications like Microsoft Word. After installation, your operating system allows you to use it, just like having Word installed allows you to use it."
   },
   {
     "objectID": "modules/Module02-Functions.html#packages",
     "href": "modules/Module02-Functions.html#packages",
     "title": "Module 2: Functions",
     "section": "Packages",
-    "text": "Packages\nThe Packages window in RStudio can help you identify what have been installed (listed), and which one have been called (check mark).\nLets go look at the Packages window, find the base package and find the log() function. It automatically loads the help file that we looked at earlier using ?log."
+    "text": "Packages\nThe Packages pane in RStudio can help you identify which packages have been installed (listed), and which ones have been attached (check mark).\nLet’s go look at the Packages pane, find the base package and find the log() function. It automatically loads the help file that we looked at earlier using ?log."
   },
   {
     "objectID": "modules/Module02-Functions.html#additional-packages",
     "href": "modules/Module02-Functions.html#additional-packages",
     "title": "Module 2: Functions",
     "section": "Additional Packages",
-    "text": "Additional Packages\nYou can install additional packages for your uses from CRAN or GitHub. These additional packages are written by RStudio or R users/developers (like us)\n\nNot all packages available on CRAN or GitHub are trustworthy\nRStudio (the company) makes a lot of great packages\nWho wrote it? Hadley Wickham is a major authority on R (Employee and Developer at RStudio)\nHow to trust an R package"
+    "text": "Additional Packages\nYou can install additional packages for your use from CRAN or GitHub. These additional packages are written by RStudio or R users/developers (like us)\n\nNot all packages available on CRAN or GitHub are trustworthy\nRStudio (the company) makes a lot of great packages\nWho wrote it? Hadley Wickham is a major authority on R (Employee and Developer at RStudio)\nHow to trust an R package"
   },
   {
     "objectID": "modules/Module02-Functions.html#installing-and-calling-packages",
     "href": "modules/Module02-Functions.html#installing-and-calling-packages",
     "title": "Module 2: Functions",
     "section": "Installing and calling packages",
-    "text": "Installing and calling packages\nTo use the bundle or “package” of code (and or possibly data) from a package, you need to install and also call the package.\nTo install a package you can\n\ngo to Tools —&gt; Install Packages in the RStudio header\n\nOR\n\nuse the following code:\n\n\ninstall.packages(package_name)"
+    "text": "Installing and calling packages\nTo use the bundle or “package” of code (and or possibly data) from a package, you need to install and also call the package.\nTo install a package you can\n\ngo to Tools —&gt; Install Packages in the RStudio header\n\nOR\n\nuse the following code:\n\n\ninstall.packages(\"package_name\")"
   },
   {
     "objectID": "modules/Module02-Functions.html#mini-exercise",
@@ -1572,28 +1332,28 @@
     "href": "modules/Module02-Functions.html#functions-from-module-1",
     "title": "Module 2: Functions",
     "section": "Functions from Module 1",
-    "text": "Functions from Module 1\nThe combine function c() collects/combines/joins single R objects into a vector of R objects. It is mostly used for creating vectors of numbers, character strings, and other data types.\n\n?c\n\n\n\nCombine Values into a Vector or List\n\nDescription:\n\n     This is a generic function which combines its arguments.\n\n     The default method combines its arguments to form a vector.  All\n     arguments are coerced to a common type which is the type of the\n     returned value, and all attributes except names are removed.\n\nUsage:\n\n     ## S3 Generic function\n     c(...)\n     \n     ## Default S3 method:\n     c(..., recursive = FALSE, use.names = TRUE)\n     \nArguments:\n\n     ...: objects to be concatenated.  All 'NULL' entries are dropped\n          before method dispatch unless at the very beginning of the\n          argument list.\n\nrecursive: logical.  If 'recursive = TRUE', the function recursively\n          descends through lists (and pairlists) combining all their\n          elements into a vector.\n\nuse.names: logical indicating if 'names' should be preserved.\n\nDetails:\n\n     The output type is determined from the highest type of the\n     components in the hierarchy NULL &lt; raw &lt; logical &lt; integer &lt;\n     double &lt; complex &lt; character &lt; list &lt; expression.  Pairlists are\n     treated as lists, whereas non-vector components (such as 'name's /\n     'symbol's and 'call's) are treated as one-element 'list's which\n     cannot be unlisted even if 'recursive = TRUE'.\n\n     There is a 'c.factor' method which combines factors into a factor.\n\n     'c' is sometimes used for its side effect of removing attributes\n     except names, for example to turn an 'array' into a vector.\n     'as.vector' is a more intuitive way to do this, but also drops\n     names.  
Note that methods other than the default are not required\n     to do this (and they will almost certainly preserve a class\n     attribute).\n\n     This is a primitive function.\n\nValue:\n\n     'NULL' or an expression or a vector of an appropriate mode.  (With\n     no arguments the value is 'NULL'.)\n\nS4 methods:\n\n     This function is S4 generic, but with argument list '(x, ...)'.\n\nReferences:\n\n     Becker, R. A., Chambers, J. M. and Wilks, A. R. (1988) _The New S\n     Language_.  Wadsworth & Brooks/Cole.\n\nSee Also:\n\n     'unlist' and 'as.vector' to produce attribute-free vectors.\n\nExamples:\n\n     c(1,7:9)\n     c(1:5, 10.5, \"next\")\n     \n     ## uses with a single argument to drop attributes\n     x &lt;- 1:4\n     names(x) &lt;- letters[1:4]\n     x\n     c(x)          # has names\n     as.vector(x)  # no names\n     dim(x) &lt;- c(2,2)\n     x\n     c(x)\n     as.vector(x)\n     \n     ## append to a list:\n     ll &lt;- list(A = 1, c = \"C\")\n     ## do *not* use\n     c(ll, d = 1:3) # which is == c(ll, as.list(c(d = 1:3)))\n     ## but rather\n     c(ll, d = list(1:3))  # c() combining two lists\n     \n     c(list(A = c(B = 1)), recursive = TRUE)\n     \n     c(options(), recursive = TRUE)\n     c(list(A = c(B = 1, C = 2), B = c(E = 7)), recursive = TRUE)"
+    "text": "Functions from Module 1\nThe combine function c() concatenate/collects/combines single R objects into a vector of R objects. It is mostly used for creating vectors of numbers, character strings, and other data types.\n\n?c\n\n\n\nCombine Values into a Vector or List\n\nDescription:\n\n     This is a generic function which combines its arguments.\n\n     The default method combines its arguments to form a vector.  All\n     arguments are coerced to a common type which is the type of the\n     returned value, and all attributes except names are removed.\n\nUsage:\n\n     ## S3 Generic function\n     c(...)\n     \n     ## Default S3 method:\n     c(..., recursive = FALSE, use.names = TRUE)\n     \nArguments:\n\n     ...: objects to be concatenated.  All 'NULL' entries are dropped\n          before method dispatch unless at the very beginning of the\n          argument list.\n\nrecursive: logical.  If 'recursive = TRUE', the function recursively\n          descends through lists (and pairlists) combining all their\n          elements into a vector.\n\nuse.names: logical indicating if 'names' should be preserved.\n\nDetails:\n\n     The output type is determined from the highest type of the\n     components in the hierarchy NULL &lt; raw &lt; logical &lt; integer &lt;\n     double &lt; complex &lt; character &lt; list &lt; expression.  Pairlists are\n     treated as lists, whereas non-vector components (such as 'name's /\n     'symbol's and 'call's) are treated as one-element 'list's which\n     cannot be unlisted even if 'recursive = TRUE'.\n\n     There is a 'c.factor' method which combines factors into a factor.\n\n     'c' is sometimes used for its side effect of removing attributes\n     except names, for example to turn an 'array' into a vector.\n     'as.vector' is a more intuitive way to do this, but also drops\n     names.  
Note that methods other than the default are not required\n     to do this (and they will almost certainly preserve a class\n     attribute).\n\n     This is a primitive function.\n\nValue:\n\n     'NULL' or an expression or a vector of an appropriate mode.  (With\n     no arguments the value is 'NULL'.)\n\nS4 methods:\n\n     This function is S4 generic, but with argument list '(x, ...)'.\n\nReferences:\n\n     Becker, R. A., Chambers, J. M. and Wilks, A. R. (1988) _The New S\n     Language_.  Wadsworth & Brooks/Cole.\n\nSee Also:\n\n     'unlist' and 'as.vector' to produce attribute-free vectors.\n\nExamples:\n\n     c(1,7:9)\n     c(1:5, 10.5, \"next\")\n     \n     ## uses with a single argument to drop attributes\n     x &lt;- 1:4\n     names(x) &lt;- letters[1:4]\n     x\n     c(x)          # has names\n     as.vector(x)  # no names\n     dim(x) &lt;- c(2,2)\n     x\n     c(x)\n     as.vector(x)\n     \n     ## append to a list:\n     ll &lt;- list(A = 1, c = \"C\")\n     ## do *not* use\n     c(ll, d = 1:3) # which is == c(ll, as.list(c(d = 1:3)))\n     ## but rather\n     c(ll, d = list(1:3))  # c() combining two lists\n     \n     c(list(A = c(B = 1)), recursive = TRUE)\n     \n     c(options(), recursive = TRUE)\n     c(list(A = c(B = 1, C = 2), B = c(E = 7)), recursive = TRUE)"
   },
   {
     "objectID": "modules/Module02-Functions.html#functions-from-module-1-1",
     "href": "modules/Module02-Functions.html#functions-from-module-1-1",
     "title": "Module 2: Functions",
     "section": "Functions from Module 1",
-    "text": "Functions from Module 1\nThe matrix() function creates a matrix from the given set of values.\n\n?matrix\n\nxxamy - doesn’t seem to work - may need to paste in a screen shot figure\n\n\nNo documentation for 'matix' in specified packages and libraries"
+    "text": "Functions from Module 1\nThe matrix() function creates a matrix from the given set of values.\n\n?matrix\n\n\n\nMatrices\n\nDescription:\n\n     'matrix' creates a matrix from the given set of values.\n\n     'as.matrix' attempts to turn its argument into a matrix.\n\n     'is.matrix' tests if its argument is a (strict) matrix.\n\nUsage:\n\n     matrix(data = NA, nrow = 1, ncol = 1, byrow = FALSE,\n            dimnames = NULL)\n     \n     as.matrix(x, ...)\n     ## S3 method for class 'data.frame'\n     as.matrix(x, rownames.force = NA, ...)\n     \n     is.matrix(x)\n     \nArguments:\n\n    data: an optional data vector (including a list or 'expression'\n          vector).  Non-atomic classed R objects are coerced by\n          'as.vector' and all attributes discarded.\n\n    nrow: the desired number of rows.\n\n    ncol: the desired number of columns.\n\n   byrow: logical. If 'FALSE' (the default) the matrix is filled by\n          columns, otherwise the matrix is filled by rows.\n\ndimnames: A 'dimnames' attribute for the matrix: 'NULL' or a 'list' of\n          length 2 giving the row and column names respectively.  An\n          empty list is treated as 'NULL', and a list of length one as\n          row names.  The list can be named, and the list names will be\n          used as names for the dimensions.\n\n       x: an R object.\n\n     ...: additional arguments to be passed to or from methods.\n\nrownames.force: logical indicating if the resulting matrix should have\n          character (rather than 'NULL') 'rownames'.  The default,\n          'NA', uses 'NULL' rownames if the data frame has 'automatic'\n          row.names or for a zero-row data frame.\n\nDetails:\n\n     If one of 'nrow' or 'ncol' is not given, an attempt is made to\n     infer it from the length of 'data' and the other parameter.  
If\n     neither is given, a one-column matrix is returned.\n\n     If there are too few elements in 'data' to fill the matrix, then\n     the elements in 'data' are recycled.  If 'data' has length zero,\n     'NA' of an appropriate type is used for atomic vectors ('0' for\n     raw vectors) and 'NULL' for lists.\n\n     'is.matrix' returns 'TRUE' if 'x' is a vector and has a '\"dim\"'\n     attribute of length 2 and 'FALSE' otherwise.  Note that a\n     'data.frame' is *not* a matrix by this test.  The function is\n     generic: you can write methods to handle specific classes of\n     objects, see InternalMethods.\n\n     'as.matrix' is a generic function.  The method for data frames\n     will return a character matrix if there is only atomic columns and\n     any non-(numeric/logical/complex) column, applying 'as.vector' to\n     factors and 'format' to other non-character columns.  Otherwise,\n     the usual coercion hierarchy (logical &lt; integer &lt; double &lt;\n     complex) will be used, e.g., all-logical data frames will be\n     coerced to a logical matrix, mixed logical-integer will give a\n     integer matrix, etc.\n\n     The default method for 'as.matrix' calls 'as.vector(x)', and hence\n     e.g. coerces factors to character vectors.\n\n     When coercing a vector, it produces a one-column matrix, and\n     promotes the names (if any) of the vector to the rownames of the\n     matrix.\n\n     'is.matrix' is a primitive function.\n\n     The 'print' method for a matrix gives a rectangular layout with\n     dimnames or indices.  
For a list matrix, the entries of length not\n     one are printed in the form 'integer,7' indicating the type and\n     length.\n\nNote:\n\n     If you just want to convert a vector to a matrix, something like\n\n       dim(x) &lt;- c(nx, ny)\n       dimnames(x) &lt;- list(row_names, col_names)\n     \n     will avoid duplicating 'x' _and_ preserve 'class(x)' which may be\n     useful, e.g., for 'Date' objects.\n\nReferences:\n\n     Becker, R. A., Chambers, J. M. and Wilks, A. R. (1988) _The New S\n     Language_.  Wadsworth & Brooks/Cole.\n\nSee Also:\n\n     'data.matrix', which attempts to convert to a numeric matrix.\n\n     A matrix is the special case of a two-dimensional 'array'.\n     'inherits(m, \"array\")' is true for a 'matrix' 'm'.\n\nExamples:\n\n     is.matrix(as.matrix(1:10))\n     !is.matrix(warpbreaks)  # data.frame, NOT matrix!\n     warpbreaks[1:10,]\n     as.matrix(warpbreaks[1:10,])  # using as.matrix.data.frame(.) method\n     \n     ## Example of setting row and column names\n     mdat &lt;- matrix(c(1,2,3, 11,12,13), nrow = 2, ncol = 3, byrow = TRUE,\n                    dimnames = list(c(\"row1\", \"row2\"),\n                                    c(\"C.1\", \"C.2\", \"C.3\")))\n     mdat"
   },
   {
     "objectID": "modules/Module02-Functions.html#summary",
     "href": "modules/Module02-Functions.html#summary",
     "title": "Module 2: Functions",
     "section": "Summary",
-    "text": "Summary\n\nFunctions are “self contained” modules of code that accomplish specific tasks.\nArguments are what you pass to functions (e.g., objects on which you carry out the task or options for how to carry out the task)\nArguments may include defaults that the author of the function specified as being “good enough in standard cases”, but that can be changed.\nAn R Package is a bundle or “package” of code (and or possibly data) that can be used by installing it once and calling it (using library()) each time R/Rstudio is opened\nThe Help window in RStudio is useful for to get more information about functions and packages"
+    "text": "Summary\n\nFunctions are “self contained” modules of code that accomplish specific tasks.\nArguments are what you pass to functions (e.g., objects on which you carry out the task or options for how to carry out the task)\nArguments may include defaults that the author of the function specified as being “good enough in standard cases”, but that can be changed.\nAn R Package is a bundle or “package” of code (and or possibly data) that can be used by installing it once and attaching it (using library()) each time R/RStudio is opened\nThe Help window in RStudio is useful for getting more information about functions and packages"
   },
   {
     "objectID": "modules/Module02-Functions.html#acknowledgements",
     "href": "modules/Module02-Functions.html#acknowledgements",
     "title": "Module 2: Functions",
     "section": "Acknowledgements",
-    "text": "Acknowledgements\nThese are the materials I looked through, modified, or extracted to complete this module’s lecture.\n\n“Introduction to R - ARCHIVED” from Harvard Chan Bioinformatics Core (HBC)"
+    "text": "Acknowledgements\nThese are the materials we looked through, modified, or extracted to complete this module’s lecture.\n\n“Introduction to R - ARCHIVED” from Harvard Chan Bioinformatics Core (HBC)"
   },
   {
     "objectID": "modules/Module02-Functions.html#sure-that-is-easy-enough-but-how-do-you-know",
@@ -1614,7 +1374,7 @@
     "href": "modules/Module03-WorkingDirectories.html#learning-objectives",
     "title": "Module 3: Working Directories",
     "section": "Learning Objectives",
-    "text": "Learning Objectives\nAfter module 3, you should be able to…\n\nUnderstand your own systems file structure and the purpose of the working directory\nDetermine the working directory\nChange the working directory"
+    "text": "Learning Objectives\nAfter module 3, you should be able to…\n\nUnderstand your own system’s file structure and the purpose of the working directory\nDetermine the working directory\nChange the working directory"
   },
   {
     "objectID": "modules/Module03-WorkingDirectories.html#file-structure",
@@ -1628,7 +1388,7 @@
     "href": "modules/Module03-WorkingDirectories.html#working-directory-basic-term",
     "title": "Module 3: Working Directories",
     "section": "Working Directory – Basic term",
-    "text": "Working Directory – Basic term\n\nR “looks” for files on your computer relative to the “working” directory\nFor example, if you want to load data into R or save a figure, you will need to tell R where/store the file\nMany people recommend not setting a directory in the scripts, rather assume you’re in the directory the script is in"
+    "text": "Working Directory – Basic term\n\nR “looks” for files on your computer relative to the “working” directory\nFor example, if you want to load data into R or save a figure, you will need to tell R where to look for or store the file\nMany people recommend not setting a directory in the scripts, rather assume you’re in the directory the script is in"
   },
   {
     "objectID": "modules/Module03-WorkingDirectories.html#getting-and-setting-the-working-directory-using-code",
@@ -1670,14 +1430,14 @@
     "href": "modules/Module03-WorkingDirectories.html#setting-the-working-directory",
     "title": "Module 3: Working Directories",
     "section": "Setting the Working Directory",
-    "text": "Setting the Working Directory\nIf you have not yet saved a “source” file, it will set working directory to the default location. See RStudio -&gt; Preferences -&gt; General for default location.\nTo change the working directory to another location, go to Session –&gt; Set Working Directory –&gt; Choose Directory`\nAgain, RStudio will show the code in the Console for the action you took with your cursor."
+    "text": "Setting the Working Directory\nIf you have not yet saved a “source” file, RStudio will set the working directory to the default location. To find the default location, go to the Tools Menu in the Menu Bar -&gt; Global Options -&gt; General.\nTo change the working directory to another location, find the Session Menu in the Menu Bar –&gt; Set Working Directory –&gt; Choose Directory\nAgain, RStudio will show the code in the Console for the action you took with your cursor."
   },
   {
     "objectID": "modules/Module03-WorkingDirectories.html#summary",
     "href": "modules/Module03-WorkingDirectories.html#summary",
     "title": "Module 3: Working Directories",
     "section": "Summary",
-    "text": "Summary\n\nR “looks” for files on your computer relative to the “working” directory\nAbsolute path points to the same location in a file system - it is specific to your system and your system alone\nRelative path points is based on the current working directory\nTwo functions, setwd() and getwd(), are your new best friends."
+    "text": "Summary\n\nR “looks” for files on your computer relative to the “working” directory\nAn absolute path always points to the same location in a file system - it is specific to your system and your system alone\nA relative path is based on the current working directory\nTwo functions, setwd() and getwd(), are useful for identifying and manipulating the working directory."
   },
   {
     "objectID": "modules/Module03-WorkingDirectories.html#acknowledgements",
@@ -1698,7 +1458,7 @@
     "href": "modules/Module04-RProject.html#learning-objectives",
     "title": "Module 4: R Project",
     "section": "Learning Objectives",
-    "text": "Learning Objectives\nAfter module 4, you should be able to…\n\nCreate an R Project\nCheck you are in the desired R Project\nReference the Files window in RStudio\nDescribe “good” R Project organization"
+    "text": "Learning Objectives\nAfter module 4, you should be able to…\n\nCreate an R Project\nCheck you are in the desired R Project\nReference the Files pane in RStudio\nDescribe “good” R Project organization"
   },
   {
     "objectID": "modules/Module04-RProject.html#rstudio-project",
@@ -1712,42 +1472,42 @@
     "href": "modules/Module04-RProject.html#rstudio-project-creation",
     "title": "Module 4: R Project",
     "section": "RStudio Project Creation",
-    "text": "RStudio Project Creation\nLet’s create a new RStudio Project.\nGo to File –&gt; New Project –&gt; New Directory –&gt; New Project\nCall your Project “IntroToR_RProject”"
+    "text": "RStudio Project Creation\nLet’s create a new RStudio Project.\nFind the File Menu in the Menu Bar –&gt; New Project –&gt; New Directory –&gt; New Project\nName your Project “SISMID_IntroToR_RProject”"
   },
   {
     "objectID": "modules/Module04-RProject.html#rstudio-project-organization",
     "href": "modules/Module04-RProject.html#rstudio-project-organization",
     "title": "Module 4: R Project",
     "section": "RStudio Project Organization",
-    "text": "RStudio Project Organization\nThis is my personal preference for organizing an R Project. But, for this workshop it will be mandatory as it will help us help you. A critical component of conducting any data analysis is being able to reproduce it! Organizing your code, data, output, and figures is a necessary (although not sufficient) condition for reproducibility.\nCreate 4 sub-directories with the following names within your “SISMID_IntroToR_RProject” folder:\n\ncode\ndata\noutput\nfigures\n\nWe will be working from this directory for the remainder of the Workshop. Take a moment to move any R scripts you have already created to the ‘code’ sub-directories."
+    "text": "RStudio Project Organization\nThis is my personal preference for organizing an R Project. But, for this workshop it will be mandatory as it will help us help you. A critical component of conducting any data analysis is being able to reproduce it! Organizing your code, data, output, and figures is a necessary (although not sufficient) condition for reproducibility.\nCreate 4 sub-directories with the following names within your “SISMID_IntroToR_RProject” folder:\n\ncode\ndata\noutput\nfigures\n\nWe will be working from this directory for the remainder of the Workshop. Take a moment to move any R scripts you have already created to the ‘code’ sub-directory."
   },
   {
     "objectID": "modules/Module04-RProject.html#some-things-to-notice-in-an-r-project",
     "href": "modules/Module04-RProject.html#some-things-to-notice-in-an-r-project",
     "title": "Module 4: R Project",
     "section": "Some things to notice in an R Project",
-    "text": "Some things to notice in an R Project\n\nThe name of the R Project will be shown at the top of the RStudio application\nIf you check the working directory using getwd() you will find the working directory is set to the location where the R Project was saved.\nThe Files window in RStudio is also set to the location where the R Project was saved, making it easy to navigate to sub-directories directly from RStudio."
+    "text": "Some things to notice in an R Project\n\nThe name of the R Project will be shown at the top of the RStudio Window\nIf you check the working directory using getwd() you will find the working directory is set to the location where the R Project was saved.\nThe Files pane in RStudio is also set to the location where the R Project was saved, making it easy to navigate to sub-directories directly from RStudio."
   },
   {
     "objectID": "modules/Module04-RProject.html#r-project---common-issues",
     "href": "modules/Module04-RProject.html#r-project---common-issues",
     "title": "Module 4: R Project",
     "section": "R Project - Common issues",
-    "text": "R Project - Common issues\nIf you simply open RStudio, it will not automatically open your R Project. As a result, when you say run a function to import data using the relative path based on your working directory, it won’t be able to find the data.\nTo open a previously created R Project, you need to open the R Project (i.e., SISMID_IntroToR_RProject.RProj)"
+    "text": "R Project - Common issues\nIf you simply open RStudio, it will not automatically open your R Project. As a result, when you say run a function to import data using the relative path based on your working directory, it won’t be able to find the data.\nTo open a previously created R Project, you need to open the R Project (i.e., double click on SISMID_IntroToR_RProject.RProj)"
   },
   {
     "objectID": "modules/Module04-RProject.html#summary",
     "href": "modules/Module04-RProject.html#summary",
     "title": "Module 4: R Project",
     "section": "Summary",
-    "text": "Summary\n\nR Projects are really helpful for lots of reasons, including to improve the reproducibility of your work\nConsistently set up your R Project’s sub-directories so that you can easily navigate the project"
+    "text": "Summary\n\nR Projects are really helpful for lots of reasons, including to improve the reproducibility of your work\nConsistently set up your R Project’s sub-directories so that you can easily navigate the project\nIf you get an error that a file can’t be found, make sure you correctly opened the R Project by looking for the Project name at the top of the RStudio application window."
   },
   {
     "objectID": "modules/Module04-RProject.html#mini-exercise",
     "href": "modules/Module04-RProject.html#mini-exercise",
     "title": "Module 4: R Project",
     "section": "Mini Exercise",
-    "text": "Mini Exercise\n\nClose R Studio\nReopen you R Project\nCheck that you are actually in the R Project\nCreate a new R script and save it in your ‘code’ subdirectory\nCreate a vector of numbers and then get a summary statistics of that vector (e.g., sum, mean, median)\nAdd comment(s) to your R script to explain your code."
+    "text": "Mini Exercise\n\nClose RStudio\nReopen your R Project\nCheck that you are actually in the R Project\nCreate a new R script and save it in your ‘code’ subdirectory\nCreate a vector of numbers\nCreate a vector of character values\nAdd comment(s) to your R script to explain your code."
   },
   {
     "objectID": "modules/Module04-RProject.html#acknowledgements",
@@ -1761,21 +1521,21 @@
     "href": "modules/Module05-DataImportExport.html#learning-objectives",
     "title": "Module 5: Data Import and Export",
     "section": "Learning Objectives",
-    "text": "Learning Objectives\nAfter module 5, you should be able to…\n\nUse Base R functions to load data\nInstall and call external R Packages to extend R’s functionality\nInstall any type of data into R\nFind loaded data in the Global Environment window of RStudio\nReading and writing R .Rds and .Rda/.RData files"
+    "text": "Learning Objectives\nAfter module 5, you should be able to…\n\nUse Base R functions to load data\nInstall and attach external R Packages to extend R’s functionality\nLoad any type of data into R\nFind loaded data in the Environment pane of RStudio\nReading and writing R .Rds and .Rda/.RData files"
   },
   {
     "objectID": "modules/Module05-DataImportExport.html#import-read-data",
     "href": "modules/Module05-DataImportExport.html#import-read-data",
     "title": "Module 5: Data Import and Export",
     "section": "Import (read) Data",
-    "text": "Import (read) Data\n\nImporting or ‘Reading in’ data is the first step of any real project/analysis\nR can read almost any file format, especially with external, non-Base R, packages\nWe are going to focus on simple delimited files first.\n\ncomma separated (e.g. ‘.csv’)\ntab delimited (e.g. ‘.txt’)\n\n\nA delimited file is a sequential file with column delimiters. Each delimited file is a stream of records, which consists of fields that are ordered by column. Each record contains fields for one row. Within each row, individual fields are separated by column delimiters (IBM.com definition)"
+    "text": "Import (read) Data\n\nImporting or ‘Reading in’ data is the first step of any real project / data analysis\nR can read almost any file format, especially with external, non-Base R, packages\nWe are going to focus on simple delimited files first.\n\ncomma separated (e.g. ‘.csv’)\ntab delimited (e.g. ‘.txt’)\n\n\nA delimited file is a sequential file with column delimiters. Each delimited file is a stream of records, which consists of fields that are ordered by column. Each record contains fields for one row. Within each row, individual fields are separated by column delimiters (IBM.com definition)"
   },
   {
     "objectID": "modules/Module05-DataImportExport.html#mini-exercise",
     "href": "modules/Module05-DataImportExport.html#mini-exercise",
     "title": "Module 5: Data Import and Export",
     "section": "Mini exercise",
-    "text": "Mini exercise\n\nDownload Module 5 data from the website and save the data to your data subdirectory – specifically SISMID_IntroToR_RProject/data\nOpen the data files in a text editor application and familiarize you self with the data.\nDetermine the delminiter of the two ‘.txt’ files"
+    "text": "Mini exercise\n\nDownload Module 5 data from the website and save the data to your data subdirectory – specifically SISMID_IntroToR_RProject/data\nOpen the ‘.csv’ and ‘.txt’ data files in a text editor application (e.g., Notepad for Windows or TextEdit for Mac) and familiarize yourself with the data\nOpen the ‘.xlsx’ data file in Excel and familiarize yourself with the data. If you use a Mac, do not open the file in Numbers, as it can corrupt the file. If you do not have Excel, you can upload the file to Google Sheets\nDetermine the delimiter of the two ‘.txt’ files"
   },
   {
     "objectID": "modules/Module05-DataImportExport.html#import-delimited-data",
@@ -1789,35 +1549,35 @@
     "href": "modules/Module05-DataImportExport.html#import-.csv-files",
     "title": "Module 5: Data Import and Export",
     "section": "Import .csv files",
-    "text": "Import .csv files\nReminder\nread.csv(file, header = TRUE, sep = \",\", quote = \"\\\"\",\n         dec = \".\", fill = TRUE, comment.char = \"\", ...)\nfile is the first argument and is the path to your file, in quotes\n-       can be path in your local computer -- absolute file path or relative file path \n-       can be path to a file on a website"
+    "text": "Import .csv files\nFunction signature reminder\nread.csv(file, header = TRUE, sep = \",\", quote = \"\\\"\",\n         dec = \".\", fill = TRUE, comment.char = \"\", ...)\n    -       `file` is the first argument and is the path to your file, in quotes \n    \n            -       can be path in your local computer -- absolute file path or relative file path \n            -       can be path to a file on a website"
   },
   {
     "objectID": "modules/Module05-DataImportExport.html#import-.csv-files-1",
     "href": "modules/Module05-DataImportExport.html#import-.csv-files-1",
     "title": "Module 5: Data Import and Export",
     "section": "Import .csv files",
-    "text": "Import .csv files\nLets import a new data file\n\n## Examples\ndf &lt;- read.csv(file = \"data/serodata.csv\") #relative path\ndf &lt;- read.csv(file = \"~/Dropbox/Git/SISMID-2024/modules/data/serodata.csv\") #absolute path starting from my home directory\n\nNote #1, I assigned the data frame to an object called df. I could have called the data anything, but in order to use the data (i.e., as an object we can find in the Environment), I need to assign it as an object.\nNote #2, Look to the Environment window, you will see the df object ready to be used."
+    "text": "Import .csv files\nLet’s import a new data file\n\n## Examples\ndf &lt;- read.csv(file = \"data/serodata.csv\") #relative path\n\nNote #1, I assigned the data frame to an object called df. I could have called the data anything, but in order to use the data (i.e., as an object we can find in the Environment), I need to assign it as an object.\nNote #2, Look at the Environment pane, you will see the df object ready to be used."
   },
   {
     "objectID": "modules/Module05-DataImportExport.html#import-.txt-files",
     "href": "modules/Module05-DataImportExport.html#import-.txt-files",
     "title": "Module 5: Data Import and Export",
     "section": "Import .txt files",
-    "text": "Import .txt files\nread.csv() is a special case of read.delim() – a general function to read a delimited file into a data frame\nread.delim(file, header = TRUE, sep = \"\\t\", quote = \"\\\"\",\n           dec = \".\", fill = TRUE, comment.char = \"\", ...)\n\nfile is the path to your file, in quotes\ndelim is what separates the fields within a record. The default for csv is comma"
+    "text": "Import .txt files\nread.csv() is a special case of read.delim() – a general function to read a delimited file into a data frame\nFunction signature reminder\nread.delim(file, header = TRUE, sep = \"\\t\", quote = \"\\\"\",\n           dec = \".\", fill = TRUE, comment.char = \"\", ...)\n    - `file` is the path to your file, in quotes \n    - `sep` is what separates the fields within a record. The default for read.delim() is a tab; the default for read.csv() is a comma"
   },
   {
     "objectID": "modules/Module05-DataImportExport.html#import-.txt-files-1",
     "href": "modules/Module05-DataImportExport.html#import-.txt-files-1",
     "title": "Module 5: Data Import and Export",
     "section": "Import .txt files",
-    "text": "Import .txt files\nLets first import ‘serodata1.txt’ which uses a tab delminiter and ‘serodata2.txt’ which uses a semicolon delminiter.\n\n## Examples\ndf &lt;- read.delim(file = \"data/serodata.txt\", sep = \"\\t\")\ndf &lt;- read.delim(file = \"data/serodata.txt\", sep = \";\")\n\nThe data is now successfully read into your R workspace, many times actually. Notice, that each time we imported the data we assigned the data to the df object, meaning we replaced it each time we reassinged the df object."
+    "text": "Import .txt files\nLet’s first import ‘serodata1.txt’ which uses a tab delimiter and ‘serodata2.txt’ which uses a semicolon delimiter.\n\n## Examples\ndf &lt;- read.delim(file = \"data/serodata1.txt\", sep = \"\\t\")\ndf &lt;- read.delim(file = \"data/serodata2.txt\", sep = \";\")\n\nThe data have now been read into your R workspace twice. Notice that each time we imported the data we assigned it to the df object, meaning we replaced the contents of df each time we reassigned it."
   },
   {
     "objectID": "modules/Module05-DataImportExport.html#what-if-we-have-a-.xlsx-file---what-do-we-do",
     "href": "modules/Module05-DataImportExport.html#what-if-we-have-a-.xlsx-file---what-do-we-do",
     "title": "Module 5: Data Import and Export",
     "section": "What if we have a .xlsx file - what do we do?",
-    "text": "What if we have a .xlsx file - what do we do?\n\nGoogle / Ask ChatGPT\nFind and vet function and package you want\nInstall package\nCall package\nUse function"
+    "text": "What if we have a .xlsx file - what do we do?\n\nGoogle / Ask ChatGPT\nFind and vet function and package you want\nInstall package\nAttach package\nUse function"
   },
   {
     "objectID": "modules/Module05-DataImportExport.html#internet-search",
@@ -1859,35 +1619,35 @@
     "href": "modules/Module05-DataImportExport.html#use-function-1",
     "title": "Module 5: Data Import and Export",
     "section": "5. Use Function",
-    "text": "5. Use Function\nReminder\nread_excel(\n  path,\n  sheet = NULL,\n  range = NULL,\n  col_names = TRUE,\n  col_types = NULL,\n  na = \"\",\n  trim_ws = TRUE,\n  skip = 0,\n  n_max = Inf,\n  guess_max = min(1000, n_max),\n  progress = readxl_progress(),\n  .name_repair = \"unique\"\n)\nLet’s practice\n\ndf &lt;- read_excel(path = \"data/serodata.xlsx\", sheet = \"Data\")"
+    "text": "5. Use Function\nReminder of function signature\nread_excel(\n  path,\n  sheet = NULL,\n  range = NULL,\n  col_names = TRUE,\n  col_types = NULL,\n  na = \"\",\n  trim_ws = TRUE,\n  skip = 0,\n  n_max = Inf,\n  guess_max = min(1000, n_max),\n  progress = readxl_progress(),\n  .name_repair = \"unique\"\n)\nLet’s practice\n\ndf &lt;- read_excel(path = \"data/serodata.xlsx\", sheet = \"Data\")"
   },
   {
     "objectID": "modules/Module05-DataImportExport.html#lets-make-some-mistakes",
     "href": "modules/Module05-DataImportExport.html#lets-make-some-mistakes",
     "title": "Module 5: Data Import and Export",
     "section": "Lets make some mistakes",
-    "text": "Lets make some mistakes\n\nWhat if we read in the data without assinging it to an object (i.e., read_xlsx(path = \"data/serodata.xlsx\", sheet = \"Data\"))?\nWhat if we forget to specify the sheet argument? (i.e., dd &lt;- read_xlsx(path = \"data/serodata.xlsx\"))?"
+    "text": "Lets make some mistakes\n\nWhat if we read in the data without assigning it to an object (i.e., read_excel(path = \"data/serodata.xlsx\", sheet = \"Data\"))?\nWhat if we forget to specify the sheet argument? (i.e., dd &lt;- read_excel(path = \"data/serodata.xlsx\"))?"
   },
   {
     "objectID": "modules/Module05-DataImportExport.html#installing-and-calling-packages---common-confusion",
     "href": "modules/Module05-DataImportExport.html#installing-and-calling-packages---common-confusion",
     "title": "Module 5: Data Import and Export",
     "section": "Installing and calling packages - Common confusion",
-    "text": "Installing and calling packages - Common confusion\nYou only need to install a package once (unless you update R), but you will need to call or load a package each time you want to use it.\nThe exception to this rule are the “base” set of packages (i.e., Base R) that are installed automatically when you install R and that automatically called whenever you open R or RStudio."
+    "text": "Installing and calling packages - Common confusion\n\nYou only need to install a package once (unless you update R or want to update the package), but you will need to call or load a package each time you want to use it.\n\nThe exception to this rule is the “base” set of packages (i.e., Base R) that are installed automatically when you install R and that are automatically called whenever you open R or RStudio."
   },
   {
     "objectID": "modules/Module05-DataImportExport.html#common-error",
     "href": "modules/Module05-DataImportExport.html#common-error",
     "title": "Module 5: Data Import and Export",
     "section": "Common Error",
-    "text": "Common Error\nBe prepared to see the error\n\nError: could not find function \"some_function\"\n\nThis usually mean that either\n\nyou called the function by the wrong name\nyou have not installed a package that contains the function\nyou have installed a package but you forgot to call it (i.e., library(package_name)) – most likely"
+    "text": "Common Error\nBe prepared to see this error\n\nError: could not find function \"some_function_name\"\n\nThis usually means that either\n\nyou called the function by the wrong name\nyou have not installed a package that contains the function\nyou have installed a package but you forgot to attach it (i.e., require(package_name)) – most likely"
   },
   {
     "objectID": "modules/Module05-DataImportExport.html#export-write-data",
     "href": "modules/Module05-DataImportExport.html#export-write-data",
     "title": "Module 5: Data Import and Export",
     "section": "Export (write) Data",
-    "text": "Export (write) Data\n\nExporting or ‘Writing out’ data allows you to save modified files to future use or sharing\nR can write almost any file format, especially with external, non-Base R, packages\nWe are going to focus again on writing delimited files"
+    "text": "Export (write) Data\n\nExporting or ‘Writing out’ data allows you to save modified files for future use or sharing\nR can write almost any file format, especially with external, non-Base R, packages\nWe are going to focus again on writing delimited files"
   },
   {
     "objectID": "modules/Module05-DataImportExport.html#export-delimited-data",
@@ -1901,14 +1661,14 @@
     "href": "modules/Module05-DataImportExport.html#export-delimited-data-1",
     "title": "Module 5: Data Import and Export",
     "section": "Export delimited data",
-    "text": "Export delimited data\n\nwrite.csv(df, file=\"data/serodata_new.csv\", row.names = FALSE) #comma delimited\nwrite.table(df, file=\"data/serodata1_new.txt\", sep=\"\\t\", row.names = FALSE) #tab delimited\nwrite.table(df, file=\"data/serodata2_new.txt\", sep=\";\", row.names = FALSE) #semicolon delimited\n\nNote, I wrote the data to new file names. Even though we didn’t change the data at all in this module, it is good practice to keep raw data raw, and not to write over it."
+    "text": "Export delimited data\nLet’s practice exporting the data as three files with three different delimiters (comma, tab, semicolon)\n\nwrite.csv(df, file=\"data/serodata_new.csv\", row.names = FALSE) #comma delimited\nwrite.table(df, file=\"data/serodata1_new.txt\", sep=\"\\t\", row.names = FALSE) #tab delimited\nwrite.table(df, file=\"data/serodata2_new.txt\", sep=\";\", row.names = FALSE) #semicolon delimited\n\nNote, I wrote the data to new file names. Even though we didn’t change the data at all in this module, it is good practice to keep raw data raw, and not to write over it."
   },
   {
     "objectID": "modules/Module05-DataImportExport.html#r-.rds-and-.rdardata-files",
     "href": "modules/Module05-DataImportExport.html#r-.rds-and-.rdardata-files",
     "title": "Module 5: Data Import and Export",
     "section": "R .rds and .rda/RData files",
-    "text": "R .rds and .rda/RData files\nThere are two file extensions worth discussing.\nR has two native data formats—Rdata (sometimes shortened to Rda) and Rds. These formats are used when R objects are saved for later use. Rdata is used to save multiple R objects, while Rds is used to save a single R object."
+    "text": "R .rds and .rda/RData files\nThere are two file extensions worth discussing.\nR has two native data formats—‘Rdata’ (sometimes shortened to ‘Rda’) and ‘Rds’. These formats are used when R objects are saved for later use. ‘Rdata’ is used to save multiple R objects, while ‘Rds’ is used to save a single R object. ‘Rds’ is fast to write/read and is very small."
   },
   {
     "objectID": "modules/Module05-DataImportExport.html#rds-binary-file",
@@ -1922,14 +1682,14 @@
     "href": "modules/Module05-DataImportExport.html#rdardata-files",
     "title": "Module 5: Data Import and Export",
     "section": ".rda/RData files",
-    "text": ".rda/RData files\nThe Base R functions save() and load() can be used to save and load multiple R objects.\nsave() writes an external representation of R objects to the specified file, and can by loaded back into the environment using load(). A nice feature about using save and load is that the R object is directly imported into the environment and you don’t have to assign it to an object. The files can be saved as .RData or .rda files.\nsave(object1, object2, file = \"filename.RData\")\nload(\"filename.RData\")\nNote, that when you read .RData files you don’t need to assign it to an abjecct. It simply reads in the objects as they were saved. Therefore, load(\"filename.RData\") will read in object1 and object2 directly into the Global Environment."
+    "text": ".rda/RData files\nThe Base R functions save() and load() can be used to save and load multiple R objects.\nsave() writes an external representation of R objects to the specified file, which can be loaded back into the environment using load(). A nice feature of using save() and load() is that the R objects are imported directly into the environment under their original names, so you don’t have to specify the names. The files can be saved as .RData or .Rda files.\nFunction signature\nsave(object1, object2, file = \"filename.RData\")\nload(\"filename.RData\")\nNote that you separate the objects you want to save with commas."
   },
   {
     "objectID": "modules/Module05-DataImportExport.html#summary",
     "href": "modules/Module05-DataImportExport.html#summary",
     "title": "Module 5: Data Import and Export",
     "section": "Summary",
-    "text": "Summary\n\nImporting or ‘Reading in’ data is the first step of any real project/analysis\nThe Base R ‘util’ package we can find a handful of useful functions including read.csv() and read.delim() to importing/reading data or write.csv() and write.table() for exporti/writing data\nWhen importing data (exception is object from .RData), you must assign it to an object, otherwise it cannot be called/used\nProperly read data can be found in the Environment window of RStudio\nYou only need to install a package once (unless you update R), but you will need to call or load a package each time you want to use it.\nTo complete a tasek you don’t know how to do (e.g., reading in an excel data file) use the following steps: 1. Google / Ask ChatGPT, 2. Find and vet function and package you want, 3. Install package, 4. Call package, 5. Use function"
+    "text": "Summary\n\nImporting or ‘Reading in’ data is the first step of any real project / data analysis\nThe Base R ‘utils’ package has useful functions including read.csv() and read.delim() for importing/reading data and write.csv() and write.table() for exporting/writing data\nWhen importing data (the exception is objects from .RData files), you must assign it to an object, otherwise it cannot be used\nIf data are imported correctly, they can be found in the Environment pane of RStudio\nYou only need to install a package once (unless you update R or the package), but you will need to attach a package each time you want to use it.\nTo complete a task you don’t know how to do (e.g., reading in an Excel data file) use the following steps: 1. Google / Ask ChatGPT, 2. Find and vet function and package you want, 3. Install package, 4. Attach package, 5. Use function"
   },
   {
     "objectID": "modules/Module05-DataImportExport.html#acknowledgements",
@@ -1942,8 +1702,8 @@
     "objectID": "modules/Module05-DataImportExport.html#mini-exercise-1",
     "href": "modules/Module05-DataImportExport.html#mini-exercise-1",
     "title": "Module 5: Data Import and Export",
-    "section": "Mini Exercise",
-    "text": "Mini Exercise\nIf your R Project is not already open, open it so we take advantage of it setting a useful working directory for us in order to import data."
+    "section": "Mini exercise",
+    "text": "Mini exercise\nIf your R Project is not already open, open it so we take advantage of it setting a useful working directory for us in order to import data."
   },
   {
     "objectID": "modules/Module05-DataImportExport.html#mini-exercise-2",
@@ -1978,63 +1738,63 @@
     "href": "modules/Module06-DataSubset.html#description-of-data",
     "title": "Module 6: Get to Know Your Data and Subsetting",
     "section": "Description of data",
-    "text": "Description of data\nThis is data based on a simulated pathogen X IgG antibody serological survey. The rows represent individuals. Variables include IgG concentrations in IU/mL, age in years, gender, and residence based on slum characterization. We will use this dataset for lectures throughout the Workshop."
+    "text": "Description of data\nThis is data based on a simulated pathogen X IgG antibody serological survey. The rows represent individuals. Variables include IgG concentrations in IU/mL, age in years, gender, and residence based on slum characterization. We will use this dataset for modules throughout the Workshop."
   },
   {
     "objectID": "modules/Module06-DataSubset.html#view-the-data-as-a-whole-dataframe",
     "href": "modules/Module06-DataSubset.html#view-the-data-as-a-whole-dataframe",
     "title": "Module 6: Get to Know Your Data and Subsetting",
     "section": "View the data as a whole dataframe",
-    "text": "View the data as a whole dataframe\nThe View() function, one of the few Base R functions with a capital letter can be used to open a new tab in the Console and view the data as you would in excel.\n\nView(df)"
+    "text": "View the data as a whole dataframe\nThe View() function, one of the few Base R functions with a capital letter, can be used to open a new tab in the Console and view the data as you would in Excel.\n\nView(df)"
   },
   {
     "objectID": "modules/Module06-DataSubset.html#view-the-data-as-a-whole-dataframe-1",
     "href": "modules/Module06-DataSubset.html#view-the-data-as-a-whole-dataframe-1",
     "title": "Module 6: Get to Know Your Data and Subsetting",
     "section": "View the data as a whole dataframe",
-    "text": "View the data as a whole dataframe\nYou can also open a new tab of the data by clicking on the data icon beside the object in the Environment window."
+    "text": "View the data as a whole dataframe\nYou can also open a new tab of the data by clicking on the data icon beside the object in the Environment pane\n\nYou can also hold down Cmd or CTRL and click on the name of a data frame in your code."
   },
   {
     "objectID": "modules/Module06-DataSubset.html#indexing",
     "href": "modules/Module06-DataSubset.html#indexing",
     "title": "Module 6: Get to Know Your Data and Subsetting",
     "section": "Indexing",
-    "text": "Indexing\nR contains several constructs which allow access to individual elements or subsets through indexing operations. Indexing can be used both to extract part of an object and to replace parts of an object (or to add parts). There are three basic indexing syntax: [ ], [[ ]] and $.\n\nx[i] #if x is a vector\nx[i, j] #if x is a matrix/data frame\nx[[i]] #if x is a list\nx$a #if x is a data frame or list\nx$\"a\" #if x is a data frame or list"
+    "text": "Indexing\nR contains several operators which allow access to individual elements or subsets through indexing. Indexing can be used both to extract part of an object and to replace parts of an object (or to add parts). There are three basic indexing operators: [, [[ and $.\n\nx[i] #if x is a vector\nx[i, j] #if x is a matrix/data frame\nx[[i]] #if x is a list\nx$a #if x is a data frame or list\nx$\"a\" #if x is a data frame or list"
   },
   {
     "objectID": "modules/Module06-DataSubset.html#vectors-and-multi-dimensional-objects",
     "href": "modules/Module06-DataSubset.html#vectors-and-multi-dimensional-objects",
     "title": "Module 6: Get to Know Your Data and Subsetting",
     "section": "Vectors and multi-dimensional objects",
-    "text": "Vectors and multi-dimensional objects\nTo index a vector, vector[i] select the ith element. To index a multi-dimensional objects such as a matrix, matrix[i, j] selects the element in row i and column j, where as in a three dimensional array[k, i, i, j] selects the element in matrix k, row i, and column j.\nLet’s practice by first creating the same objects as we did in Module 1.\n\nnumber.object &lt;- 3\ncharacter.object &lt;- \"blue\"\nvector.object1 &lt;- c(2,3,4,5)\nvector.object2 &lt;- c(\"blue\", \"red\", \"yellow\")\nmatrix.object &lt;- matrix(data=vector.object1, nrow=2, ncol=2, byrow=TRUE)\n\nHere is a reminder of what these objects look like.\n\nvector.object1\n\n[1] 2 3 4 5\n\nmatrix.object\n\n     [,1] [,2]\n[1,]    2    3\n[2,]    4    5\n\n\nFinally, let’s use indexing to pull our elements of the objects.\n\nvector.object1[2] #pulling the second element\n\n[1] 3\n\nmatrix.object[1,2] #pulling the element in row 1 column 2\n\n[1] 3"
+    "text": "Vectors and multi-dimensional objects\nTo index a vector, vector[i] selects the ith element. To index a multi-dimensional object such as a matrix, matrix[i, j] selects the element in row i and column j, whereas in a three-dimensional array, array[i, j, k] selects the element in row i, column j, and matrix k.\nLet’s practice by first creating the same objects as we did in Module 1.\n\nnumber.object &lt;- 3\ncharacter.object &lt;- \"blue\"\nvector.object1 &lt;- c(2,3,4,5)\nvector.object2 &lt;- c(\"blue\", \"red\", \"yellow\")\nmatrix.object &lt;- matrix(data=vector.object1, nrow=2, ncol=2, byrow=TRUE)\n\nHere is a reminder of what these objects look like.\n\nvector.object1\n\n[1] 2 3 4 5\n\nmatrix.object\n\n     [,1] [,2]\n[1,]    2    3\n[2,]    4    5\n\n\nFinally, let’s use indexing to pull out elements of the objects.\n\nvector.object1[2] #pulling the second element\n\n[1] 3\n\nmatrix.object[1,2] #pulling the element in row 1 column 2\n\n[1] 3"
   },
   {
     "objectID": "modules/Module06-DataSubset.html#list-objects",
     "href": "modules/Module06-DataSubset.html#list-objects",
     "title": "Module 6: Get to Know Your Data and Subsetting",
     "section": "List objects",
-    "text": "List objects\nFor lists, one generally uses list[[p]] to select any single element p.\nLet’s practice by creating the same list as we did in Module 1.\n\nlist.object &lt;- list(number.object, vector.object2, matrix.object)\nlist.object\n\n[[1]]\n[1] 3\n\n[[2]]\n[1] \"blue\"   \"red\"    \"yellow\"\n\n[[3]]\n     [,1] [,2]\n[1,]    2    3\n[2,]    4    5\n\n\nNow we use indexing to pull out the 3rd element in the list.\n\nlist.object[[3]]\n\n     [,1] [,2]\n[1,]    2    3\n[2,]    4    5"
+    "text": "List objects\nFor lists, one generally uses list[[p]] to select any single element p.\nLet’s practice by creating the same list as we did in Module 1.\n\nlist.object &lt;- list(number.object, vector.object2, matrix.object)\nlist.object\n\n[[1]]\n[1] 3\n\n[[2]]\n[1] \"blue\"   \"red\"    \"yellow\"\n\n[[3]]\n     [,1] [,2]\n[1,]    2    3\n[2,]    4    5\n\n\nNow we use indexing to pull out the 3rd element in the list.\n\nlist.object[[3]]\n\n     [,1] [,2]\n[1,]    2    3\n[2,]    4    5\n\n\nWhat happens if we use a single square bracket?\n\nlist.object[3]\n\n[[1]]\n     [,1] [,2]\n[1,]    2    3\n[2,]    4    5\n\n\nThe [[ operator is called the “extract” operator and gives us the element from the list. The [ operator is called the “subset” operator and gives us a subset of the list, that is still a list."
   },
   {
     "objectID": "modules/Module06-DataSubset.html#for-indexing",
     "href": "modules/Module06-DataSubset.html#for-indexing",
     "title": "Module 6: Get to Know Your Data and Subsetting",
     "section": "$ for indexing",
-    "text": "$ for indexing\n$ allows only a literal character string or a symbol as the index.\n\ndf$IgG_concentration\n\n  [1] 3.176895e-01 3.436823e+00 3.000000e-01 1.432363e+02 4.476534e-01\n  [6] 2.527076e-02 6.101083e-01 3.000000e-01 2.916968e+00 1.649819e+00\n [11] 4.574007e+00 1.583904e+02           NA 1.065068e+02 1.113870e+02\n [16] 4.144893e+01 3.000000e-01 2.527076e-01 8.159247e+01 1.825342e+02\n [21] 4.244656e+01 1.193493e+02 3.000000e-01 3.000000e-01 9.025271e-01\n [26] 3.501805e-01 3.000000e-01 1.227437e+00 1.702055e+02 3.000000e-01\n [31] 4.801444e-01 2.527076e-02 3.000000e-01 5.776173e-02 4.801444e-01\n [36] 3.826715e-01 3.000000e-01 4.048558e+02 3.000000e-01 5.451264e-01\n [41] 3.000000e-01 5.590753e+01 2.202166e-01 1.709760e+02 1.227437e+00\n [46] 4.567527e+02 4.838480e+01 1.227437e-01 1.877256e-01 3.000000e-01\n [51] 3.501805e-01 3.339350e+00 3.000000e-01 5.451264e-01           NA\n [56] 2.104693e+00           NA 3.826715e-01 3.926366e+01 1.129964e+00\n [61] 3.501805e+00 7.542808e+01 4.800475e+01 1.000000e+00 4.068884e+01\n [66] 3.000000e-01 4.377672e+01 1.193493e+02 6.977740e+01 1.373288e+02\n [71] 1.642979e+02           NA 1.542808e+02 6.033058e-01 2.809917e-01\n [76] 1.966942e+00 2.041322e+00 2.115702e+00 4.663043e+02 3.000000e-01\n [81] 1.500796e+02 1.543790e+02 2.561983e-01 1.596338e+02 1.732484e+02\n [86] 4.641304e+02 3.736364e+01 1.572452e+02 3.000000e-01 3.000000e-01\n [91] 8.264463e-02 6.776859e-01 7.272727e-01 2.066116e-01 1.966942e+00\n [96] 3.000000e-01 3.000000e-01 2.809917e-01 8.016529e-01 1.818182e-01\n[101] 1.818182e-01 8.264463e-02 3.422727e+01 8.743506e+00 3.000000e-01\n[106] 1.641720e+02 4.049587e-01 1.001592e+02 4.489130e+02 1.101911e+02\n[111] 4.440909e+01 1.288217e+02 2.840909e+01 1.003981e+02 8.512397e-01\n[116] 1.322314e-01 1.297521e+00 1.570248e-01 1.966942e+00 1.536624e+02\n[121] 3.000000e-01 3.000000e-01 1.074380e+00 1.099174e+00 3.057851e-01\n[126] 3.000000e-01 5.785124e-02 4.391304e+02 6.130435e+02 
1.074380e-01\n[131] 7.125796e+01 4.222727e+01 1.620223e+02 3.750000e+01 1.534236e+02\n[136] 6.239130e+02 5.521739e+02 5.785124e-02 6.547945e-01 8.767123e-02\n[141] 3.000000e-01 2.849315e+00 3.835616e-02 2.849315e-01 4.649315e+00\n[146] 1.369863e-01 3.589041e-01 1.049315e+00 4.668998e+01 1.473510e+02\n[151] 4.589744e+01 2.109589e-01 1.741722e+02 2.496503e+01 1.850993e+02\n[156] 1.863014e-01 1.863014e-01 4.589744e+01 1.942881e+02 5.079646e+02\n[161] 8.767123e-01 2.750685e+00 1.503311e+02 3.000000e-01 3.095890e-01\n[166] 3.000000e-01 6.371681e+02 6.054795e-01 1.955298e+02 1.786424e+02\n[171] 1.120861e+02 1.331954e+02 2.159292e+02 5.628319e+02 1.900662e+02\n[176] 6.547945e-01 1.665753e+00 1.739238e+02 9.991722e+01 9.321192e+01\n[181] 8.767123e-02           NA 6.794521e-01 5.808219e-01 1.369863e-01\n[186] 2.060274e+00 1.610099e+02 4.082192e-01 8.273973e-01 4.601770e+02\n[191] 1.389073e+02 3.867133e+01 9.260274e-01 5.918874e+01 1.870861e+02\n[196] 4.328767e-01 6.301370e-02 3.000000e-01 1.548013e+02 5.819536e+01\n[201] 1.724338e+02 1.932401e+01 2.164420e+00 9.757412e-01 1.509434e-01\n[206] 1.509434e-01 7.766571e+01 4.319563e+01 1.752022e-01 3.094775e+01\n[211] 1.266846e-01 2.919806e+01 9.545455e+00 2.735115e+01 1.314841e+02\n[216] 3.643985e+01 1.498559e+02 9.363636e+00 2.479784e-01 5.390836e-02\n[221] 8.787062e-01 1.994609e-01 3.000000e-01 3.000000e-01 5.390836e-03\n[226] 4.177898e-01 3.000000e-01 2.479784e-01 2.964960e-02 2.964960e-01\n[231] 5.148248e+00 1.994609e-01 3.000000e-01 1.779539e+02 3.290210e+02\n[236] 3.000000e-01 1.809798e+02 4.905660e-01 1.266846e-01 1.543948e+02\n[241] 1.379683e+02 6.153846e+02 1.474784e+02 3.000000e-01 1.024259e+00\n[246] 4.444056e+02 3.000000e-01 2.504043e+00 3.000000e-01 3.000000e-01\n[251] 7.816712e-02 3.000000e-01 5.390836e-02 1.494236e+02 5.972622e+01\n[256] 6.361186e-01 1.837896e+02 1.320809e+02 1.571906e-01 1.520231e+02\n[261] 3.000000e-01 3.000000e-01 1.823699e+02 3.000000e-01 2.173913e+00\n[266] 2.142202e+01 3.000000e-01 
3.408027e+00 4.155963e+01 9.698997e-02\n[271] 1.238532e+01 9.528926e+00 1.916185e+02 1.060201e+00 3.679104e+02\n[276] 4.288991e+01 9.971098e+01 3.000000e-01 1.208092e+02 3.000000e-01\n[281] 6.688963e-03 2.505017e+00 1.481605e+00 3.000000e-01 5.183946e-01\n[286] 3.000000e-01 1.872910e-01 3.678930e-01 3.000000e-01 4.529851e+02\n[291] 3.169725e+01 3.000000e-01 4.922018e+01 2.548507e+02 1.661850e+02\n[296] 9.164179e+02 3.678930e-01 1.236994e+02 6.705202e+01 3.834862e+01\n[301] 1.963211e+00 3.000000e-01 2.474916e-01 3.000000e-01 2.173913e-01\n[306] 8.193980e-01 2.444816e+00 3.000000e-01 1.571906e-01 1.849711e+02\n[311] 6.119403e+02 3.000000e-01 4.280936e-01 9.698997e-02 3.678930e-02\n[316] 4.832090e+02 1.390173e+02 3.000000e-01 6.555970e+02 1.526012e+02\n[321] 3.000000e-01 7.222222e-01 7.724426e+01 3.000000e-01 6.111111e-01\n[326] 1.555556e+00 3.055556e-01 1.500000e+00 1.470772e+02 1.694444e+00\n[331] 3.138298e+02 1.414405e+02 1.990605e+02 4.212766e+02 3.000000e-01\n[336] 3.000000e-01 6.478723e+02 3.000000e-01 2.222222e+00 3.000000e-01\n[341] 2.055556e+00 2.777778e-02 8.333333e-02 1.032359e+02 1.611111e+00\n[346] 8.333333e-02 2.333333e+00 5.755319e+02 1.686848e+02 1.111111e-01\n[351] 3.000000e-01 8.372340e+02 3.000000e-01 3.784504e+01 3.819149e+02\n[356] 5.555556e-02 3.000000e+02 1.855950e+02 1.944444e-01 3.000000e-01\n[361] 5.555556e-02 1.138889e+00 4.254237e+01 3.000000e-01 3.000000e-01\n[366] 3.000000e-01 3.000000e-01 3.138298e+02 1.235908e+02 4.159574e+02\n[371] 3.009685e+01 1.567850e+02 1.367432e+02 3.731235e+01 9.164927e+01\n[376] 2.936170e+02 8.820459e+01 1.035491e+02 7.379958e+01 3.000000e-01\n[381] 1.718750e+02 2.128527e+00 1.253918e+00 2.382445e-01 4.639498e-01\n[386] 1.253918e-01 1.253918e-01 3.000000e-01 1.000000e+00 1.570043e+02\n[391] 4.344086e+02 2.184953e+00 1.507837e+00 3.228840e-01 4.588024e+01\n[396] 1.660560e+02 3.000000e-01 3.043011e+02 2.612903e+02 1.621767e+02\n[401] 3.228840e-01 4.639498e-01 2.495298e+00 3.257053e+00 3.793103e-01\n[406]           
NA 6.896552e-02 3.000000e-01 1.423197e+00 3.000000e-01\n[411] 3.000000e-01 1.786638e+02 3.279570e+02           NA 1.903017e+02\n[416] 1.654095e+02 4.639498e-01 1.815733e+02 1.366771e+00 1.536050e-01\n[421] 1.306587e+01 2.129032e+02 1.925647e+02 3.000000e-01 1.028213e+00\n[426] 3.793103e-01 8.025078e-01 4.860215e+02 3.000000e-01 2.100313e-01\n[431] 2.767665e+01 1.592476e+00 9.717868e-02 1.028213e+00 3.793103e-01\n[436] 1.292026e+02 4.425150e+01 3.193548e+02 1.860991e+02 6.614420e-01\n[441] 5.203762e-01 1.330819e+02 1.673491e+02 3.000000e-01 1.117457e+02\n[446] 3.045509e+01 3.000000e-01 8.280255e-02 3.000000e-01 1.200637e+00\n[451] 1.687898e-01 7.367273e+02 8.280255e-02 5.127389e-01 1.974522e-01\n[456] 7.993631e-01 3.000000e-01 3.298182e+02 9.736842e+01 3.000000e-01\n[461] 3.000000e-01 4.214545e+02 3.000000e-01 2.578182e+02 2.261147e-01\n[466] 3.000000e-01 1.883901e+02 9.458204e+01 3.000000e-01 3.000000e-01\n[471] 7.707006e-01 5.032727e+02 1.544586e+00 1.431115e+02 3.000000e-01\n[476] 1.458599e+00 1.247678e+02           NA 4.334545e+02 3.000000e-01\n[481] 6.156364e+02 9.574303e+01 1.928019e+02 1.888545e+02 1.598297e+02\n[486] 5.127389e-01 1.171053e+02           NA 2.547771e-02 1.707430e+02\n[491] 3.000000e-01 1.869969e+02 4.731481e+01 1.988390e+02 3.000000e-01\n[496] 8.808050e+01 2.003185e+00 3.000000e-01 3.509259e+01 9.365325e+01\n[501] 3.000000e-01 3.736111e+01 1.674923e+02 8.808050e+01 1.656347e+02\n[506] 3.722222e+01 6.756364e+02 3.000000e-01 1.698142e+02 1.628483e+02\n[511] 5.985130e-01 1.903346e+00 3.000000e-01 3.000000e-01 8.996283e-01\n[516] 3.977695e-01 3.000000e-01 3.000000e-01 3.000000e-01 3.000000e-01\n[521] 7.446809e+02 6.095745e+02 1.427445e+02 3.000000e-01 2.973978e-02\n[526] 3.977695e-01 4.095745e+02 4.595745e+02 3.000000e-01 1.976341e+02\n[531] 3.776596e+02 1.777603e+02 4.312268e-01 6.765957e+02 7.978723e+02\n[536] 9.665427e-02 1.879338e+02 4.358670e+01 3.000000e-01 3.000000e-01\n[541] 2.638955e+01 3.180523e+01 1.746845e+02 1.876972e+02 
1.044164e+02\n[546] 1.202681e+02 1.630915e+02 1.276025e+02 8.880126e+01 3.563830e+02\n[551] 2.212766e+02 1.969121e+01 3.755319e+02 1.214511e+02 1.034700e+02\n[556] 3.000000e-01 3.643123e-01 6.319703e-02 3.000000e-01 3.000000e-01\n[561] 3.000000e-01 3.000000e-01 3.000000e-01 3.000000e-01 3.000000e-01\n[566] 3.000000e-01 1.664038e+02 2.946809e+02 4.391924e+01 1.874606e+02\n[571] 1.143533e+02 1.600158e+02 1.635688e-01 8.809148e+01 1.337539e+02\n[576] 1.985804e+02 1.578864e+02 3.000000e-01 3.000000e-01 1.953642e-01\n[581] 1.119205e+00 2.523636e+02 3.000000e-01 4.844371e+00 3.000000e-01\n[586] 1.492553e+02 1.993617e+02 2.847682e-01 3.145695e-01 3.000000e-01\n[591] 3.406429e+01 6.595745e+01 3.000000e-01 2.174545e+02           NA\n[596] 5.957447e+01 7.236364e+02 3.000000e-01 3.000000e-01 3.000000e-01\n[601] 2.676364e+02 1.891489e+02 3.036364e+02 3.000000e-01 3.000000e-01\n[606] 3.000000e-01 3.000000e-01 3.000000e-01 1.447020e+00 2.130909e+02\n[611] 1.357616e-01 3.000000e-01 3.000000e-01 5.534545e+02 1.891489e+02\n[616] 7.202128e+01 3.250287e+01 1.655629e-02 3.123636e+02 3.000000e-01\n[621] 7.138298e+01 3.000000e-01 6.946809e+01 4.012629e+01 1.629787e+02\n[626] 1.508511e+02 1.655629e-02 3.000000e-01 4.635762e-02 3.000000e-01\n[631] 3.000000e-01 3.000000e-01 1.942553e+02 3.690909e+02 3.000000e-01\n[636] 3.000000e-01 2.847682e+00 1.435106e+02 3.000000e-01 4.752009e+01\n[641] 2.621125e+01 1.055319e+02 3.000000e-01 1.149007e+00 2.927273e+02\n[646] 3.000000e-01 3.000000e-01 4.839265e+01 3.000000e-01 3.000000e-01\n[651] 2.251656e-01\n\n\nNote, if you have spaces in your variable name, you will need to use back ticks variable name after the $. This is a good reason to not create variables / column names with spaces."
+    "text": "$ for indexing\n$ allows only a literal character string or a symbol as the index.\n\ndf$IgG_concentration\n\n  [1] 3.176895e-01 3.436823e+00 3.000000e-01 1.432363e+02 4.476534e-01\n  [6] 2.527076e-02 6.101083e-01 3.000000e-01 2.916968e+00 1.649819e+00\n [11] 4.574007e+00 1.583904e+02           NA 1.065068e+02 1.113870e+02\n [16] 4.144893e+01 3.000000e-01 2.527076e-01 8.159247e+01 1.825342e+02\n [21] 4.244656e+01 1.193493e+02 3.000000e-01 3.000000e-01 9.025271e-01\n [26] 3.501805e-01 3.000000e-01 1.227437e+00 1.702055e+02 3.000000e-01\n [31] 4.801444e-01 2.527076e-02 3.000000e-01 5.776173e-02 4.801444e-01\n [36] 3.826715e-01 3.000000e-01 4.048558e+02 3.000000e-01 5.451264e-01\n [41] 3.000000e-01 5.590753e+01 2.202166e-01 1.709760e+02 1.227437e+00\n [46] 4.567527e+02 4.838480e+01 1.227437e-01 1.877256e-01 3.000000e-01\n [51] 3.501805e-01 3.339350e+00 3.000000e-01 5.451264e-01           NA\n [56] 2.104693e+00           NA 3.826715e-01 3.926366e+01 1.129964e+00\n [61] 3.501805e+00 7.542808e+01 4.800475e+01 1.000000e+00 4.068884e+01\n [66] 3.000000e-01 4.377672e+01 1.193493e+02 6.977740e+01 1.373288e+02\n [71] 1.642979e+02           NA 1.542808e+02 6.033058e-01 2.809917e-01\n [76] 1.966942e+00 2.041322e+00 2.115702e+00 4.663043e+02 3.000000e-01\n [81] 1.500796e+02 1.543790e+02 2.561983e-01 1.596338e+02 1.732484e+02\n [86] 4.641304e+02 3.736364e+01 1.572452e+02 3.000000e-01 3.000000e-01\n [91] 8.264463e-02 6.776859e-01 7.272727e-01 2.066116e-01 1.966942e+00\n [96] 3.000000e-01 3.000000e-01 2.809917e-01 8.016529e-01 1.818182e-01\n[101] 1.818182e-01 8.264463e-02 3.422727e+01 8.743506e+00 3.000000e-01\n[106] 1.641720e+02 4.049587e-01 1.001592e+02 4.489130e+02 1.101911e+02\n[111] 4.440909e+01 1.288217e+02 2.840909e+01 1.003981e+02 8.512397e-01\n[116] 1.322314e-01 1.297521e+00 1.570248e-01 1.966942e+00 1.536624e+02\n[121] 3.000000e-01 3.000000e-01 1.074380e+00 1.099174e+00 3.057851e-01\n[126] 3.000000e-01 5.785124e-02 4.391304e+02 6.130435e+02 
1.074380e-01\n[131] 7.125796e+01 4.222727e+01 1.620223e+02 3.750000e+01 1.534236e+02\n[136] 6.239130e+02 5.521739e+02 5.785124e-02 6.547945e-01 8.767123e-02\n[141] 3.000000e-01 2.849315e+00 3.835616e-02 2.849315e-01 4.649315e+00\n[146] 1.369863e-01 3.589041e-01 1.049315e+00 4.668998e+01 1.473510e+02\n[151] 4.589744e+01 2.109589e-01 1.741722e+02 2.496503e+01 1.850993e+02\n[156] 1.863014e-01 1.863014e-01 4.589744e+01 1.942881e+02 5.079646e+02\n[161] 8.767123e-01 2.750685e+00 1.503311e+02 3.000000e-01 3.095890e-01\n[166] 3.000000e-01 6.371681e+02 6.054795e-01 1.955298e+02 1.786424e+02\n[171] 1.120861e+02 1.331954e+02 2.159292e+02 5.628319e+02 1.900662e+02\n[176] 6.547945e-01 1.665753e+00 1.739238e+02 9.991722e+01 9.321192e+01\n[181] 8.767123e-02           NA 6.794521e-01 5.808219e-01 1.369863e-01\n[186] 2.060274e+00 1.610099e+02 4.082192e-01 8.273973e-01 4.601770e+02\n[191] 1.389073e+02 3.867133e+01 9.260274e-01 5.918874e+01 1.870861e+02\n[196] 4.328767e-01 6.301370e-02 3.000000e-01 1.548013e+02 5.819536e+01\n[201] 1.724338e+02 1.932401e+01 2.164420e+00 9.757412e-01 1.509434e-01\n[206] 1.509434e-01 7.766571e+01 4.319563e+01 1.752022e-01 3.094775e+01\n[211] 1.266846e-01 2.919806e+01 9.545455e+00 2.735115e+01 1.314841e+02\n[216] 3.643985e+01 1.498559e+02 9.363636e+00 2.479784e-01 5.390836e-02\n[221] 8.787062e-01 1.994609e-01 3.000000e-01 3.000000e-01 5.390836e-03\n[226] 4.177898e-01 3.000000e-01 2.479784e-01 2.964960e-02 2.964960e-01\n[231] 5.148248e+00 1.994609e-01 3.000000e-01 1.779539e+02 3.290210e+02\n[236] 3.000000e-01 1.809798e+02 4.905660e-01 1.266846e-01 1.543948e+02\n[241] 1.379683e+02 6.153846e+02 1.474784e+02 3.000000e-01 1.024259e+00\n[246] 4.444056e+02 3.000000e-01 2.504043e+00 3.000000e-01 3.000000e-01\n[251] 7.816712e-02 3.000000e-01 5.390836e-02 1.494236e+02 5.972622e+01\n[256] 6.361186e-01 1.837896e+02 1.320809e+02 1.571906e-01 1.520231e+02\n[261] 3.000000e-01 3.000000e-01 1.823699e+02 3.000000e-01 2.173913e+00\n[266] 2.142202e+01 3.000000e-01 
3.408027e+00 4.155963e+01 9.698997e-02\n[271] 1.238532e+01 9.528926e+00 1.916185e+02 1.060201e+00 3.679104e+02\n[276] 4.288991e+01 9.971098e+01 3.000000e-01 1.208092e+02 3.000000e-01\n[281] 6.688963e-03 2.505017e+00 1.481605e+00 3.000000e-01 5.183946e-01\n[286] 3.000000e-01 1.872910e-01 3.678930e-01 3.000000e-01 4.529851e+02\n[291] 3.169725e+01 3.000000e-01 4.922018e+01 2.548507e+02 1.661850e+02\n[296] 9.164179e+02 3.678930e-01 1.236994e+02 6.705202e+01 3.834862e+01\n[301] 1.963211e+00 3.000000e-01 2.474916e-01 3.000000e-01 2.173913e-01\n[306] 8.193980e-01 2.444816e+00 3.000000e-01 1.571906e-01 1.849711e+02\n[311] 6.119403e+02 3.000000e-01 4.280936e-01 9.698997e-02 3.678930e-02\n[316] 4.832090e+02 1.390173e+02 3.000000e-01 6.555970e+02 1.526012e+02\n[321] 3.000000e-01 7.222222e-01 7.724426e+01 3.000000e-01 6.111111e-01\n[326] 1.555556e+00 3.055556e-01 1.500000e+00 1.470772e+02 1.694444e+00\n[331] 3.138298e+02 1.414405e+02 1.990605e+02 4.212766e+02 3.000000e-01\n[336] 3.000000e-01 6.478723e+02 3.000000e-01 2.222222e+00 3.000000e-01\n[341] 2.055556e+00 2.777778e-02 8.333333e-02 1.032359e+02 1.611111e+00\n[346] 8.333333e-02 2.333333e+00 5.755319e+02 1.686848e+02 1.111111e-01\n[351] 3.000000e-01 8.372340e+02 3.000000e-01 3.784504e+01 3.819149e+02\n[356] 5.555556e-02 3.000000e+02 1.855950e+02 1.944444e-01 3.000000e-01\n[361] 5.555556e-02 1.138889e+00 4.254237e+01 3.000000e-01 3.000000e-01\n[366] 3.000000e-01 3.000000e-01 3.138298e+02 1.235908e+02 4.159574e+02\n[371] 3.009685e+01 1.567850e+02 1.367432e+02 3.731235e+01 9.164927e+01\n[376] 2.936170e+02 8.820459e+01 1.035491e+02 7.379958e+01 3.000000e-01\n[381] 1.718750e+02 2.128527e+00 1.253918e+00 2.382445e-01 4.639498e-01\n[386] 1.253918e-01 1.253918e-01 3.000000e-01 1.000000e+00 1.570043e+02\n[391] 4.344086e+02 2.184953e+00 1.507837e+00 3.228840e-01 4.588024e+01\n[396] 1.660560e+02 3.000000e-01 3.043011e+02 2.612903e+02 1.621767e+02\n[401] 3.228840e-01 4.639498e-01 2.495298e+00 3.257053e+00 3.793103e-01\n[406]           
NA 6.896552e-02 3.000000e-01 1.423197e+00 3.000000e-01\n[411] 3.000000e-01 1.786638e+02 3.279570e+02           NA 1.903017e+02\n[416] 1.654095e+02 4.639498e-01 1.815733e+02 1.366771e+00 1.536050e-01\n[421] 1.306587e+01 2.129032e+02 1.925647e+02 3.000000e-01 1.028213e+00\n[426] 3.793103e-01 8.025078e-01 4.860215e+02 3.000000e-01 2.100313e-01\n[431] 2.767665e+01 1.592476e+00 9.717868e-02 1.028213e+00 3.793103e-01\n[436] 1.292026e+02 4.425150e+01 3.193548e+02 1.860991e+02 6.614420e-01\n[441] 5.203762e-01 1.330819e+02 1.673491e+02 3.000000e-01 1.117457e+02\n[446] 3.045509e+01 3.000000e-01 8.280255e-02 3.000000e-01 1.200637e+00\n[451] 1.687898e-01 7.367273e+02 8.280255e-02 5.127389e-01 1.974522e-01\n[456] 7.993631e-01 3.000000e-01 3.298182e+02 9.736842e+01 3.000000e-01\n[461] 3.000000e-01 4.214545e+02 3.000000e-01 2.578182e+02 2.261147e-01\n[466] 3.000000e-01 1.883901e+02 9.458204e+01 3.000000e-01 3.000000e-01\n[471] 7.707006e-01 5.032727e+02 1.544586e+00 1.431115e+02 3.000000e-01\n[476] 1.458599e+00 1.247678e+02           NA 4.334545e+02 3.000000e-01\n[481] 6.156364e+02 9.574303e+01 1.928019e+02 1.888545e+02 1.598297e+02\n[486] 5.127389e-01 1.171053e+02           NA 2.547771e-02 1.707430e+02\n[491] 3.000000e-01 1.869969e+02 4.731481e+01 1.988390e+02 3.000000e-01\n[496] 8.808050e+01 2.003185e+00 3.000000e-01 3.509259e+01 9.365325e+01\n[501] 3.000000e-01 3.736111e+01 1.674923e+02 8.808050e+01 1.656347e+02\n[506] 3.722222e+01 6.756364e+02 3.000000e-01 1.698142e+02 1.628483e+02\n[511] 5.985130e-01 1.903346e+00 3.000000e-01 3.000000e-01 8.996283e-01\n[516] 3.977695e-01 3.000000e-01 3.000000e-01 3.000000e-01 3.000000e-01\n[521] 7.446809e+02 6.095745e+02 1.427445e+02 3.000000e-01 2.973978e-02\n[526] 3.977695e-01 4.095745e+02 4.595745e+02 3.000000e-01 1.976341e+02\n[531] 3.776596e+02 1.777603e+02 4.312268e-01 6.765957e+02 7.978723e+02\n[536] 9.665427e-02 1.879338e+02 4.358670e+01 3.000000e-01 3.000000e-01\n[541] 2.638955e+01 3.180523e+01 1.746845e+02 1.876972e+02 
1.044164e+02\n[546] 1.202681e+02 1.630915e+02 1.276025e+02 8.880126e+01 3.563830e+02\n[551] 2.212766e+02 1.969121e+01 3.755319e+02 1.214511e+02 1.034700e+02\n[556] 3.000000e-01 3.643123e-01 6.319703e-02 3.000000e-01 3.000000e-01\n[561] 3.000000e-01 3.000000e-01 3.000000e-01 3.000000e-01 3.000000e-01\n[566] 3.000000e-01 1.664038e+02 2.946809e+02 4.391924e+01 1.874606e+02\n[571] 1.143533e+02 1.600158e+02 1.635688e-01 8.809148e+01 1.337539e+02\n[576] 1.985804e+02 1.578864e+02 3.000000e-01 3.000000e-01 1.953642e-01\n[581] 1.119205e+00 2.523636e+02 3.000000e-01 4.844371e+00 3.000000e-01\n[586] 1.492553e+02 1.993617e+02 2.847682e-01 3.145695e-01 3.000000e-01\n[591] 3.406429e+01 6.595745e+01 3.000000e-01 2.174545e+02           NA\n[596] 5.957447e+01 7.236364e+02 3.000000e-01 3.000000e-01 3.000000e-01\n[601] 2.676364e+02 1.891489e+02 3.036364e+02 3.000000e-01 3.000000e-01\n[606] 3.000000e-01 3.000000e-01 3.000000e-01 1.447020e+00 2.130909e+02\n[611] 1.357616e-01 3.000000e-01 3.000000e-01 5.534545e+02 1.891489e+02\n[616] 7.202128e+01 3.250287e+01 1.655629e-02 3.123636e+02 3.000000e-01\n[621] 7.138298e+01 3.000000e-01 6.946809e+01 4.012629e+01 1.629787e+02\n[626] 1.508511e+02 1.655629e-02 3.000000e-01 4.635762e-02 3.000000e-01\n[631] 3.000000e-01 3.000000e-01 1.942553e+02 3.690909e+02 3.000000e-01\n[636] 3.000000e-01 2.847682e+00 1.435106e+02 3.000000e-01 4.752009e+01\n[641] 2.621125e+01 1.055319e+02 3.000000e-01 1.149007e+00 2.927273e+02\n[646] 3.000000e-01 3.000000e-01 4.839265e+01 3.000000e-01 3.000000e-01\n[651] 2.251656e-01\n\n\nNote, if you have spaces in your variable name, you will need to use back ticks ` after the $. This is a good reason to not create variables / column names with spaces."
   },
   {
     "objectID": "modules/Module06-DataSubset.html#for-indexing-with-lists",
     "href": "modules/Module06-DataSubset.html#for-indexing-with-lists",
     "title": "Module 6: Get to Know Your Data and Subsetting",
     "section": "$ for indexing with lists",
-    "text": "$ for indexing with lists\nList elements can be named\n\nlist.object.named &lt;- list(\n  emory = number.object,\n  uga = vector.object2,\n  gsu = matrix.object\n)\nlist.object.named\n\n$emory\n[1] 3\n\n$uga\n[1] \"blue\"   \"red\"    \"yellow\"\n\n$gsu\n     [,1] [,2]\n[1,]    2    3\n[2,]    4    5\n\n\nIf list elements are named, than you can reference data from list using $ or using double square brackets, [[ ]]\n\nlist.object.named$uga \n\n[1] \"blue\"   \"red\"    \"yellow\"\n\nlist.object.named[[\"uga\"]] \n\n[1] \"blue\"   \"red\"    \"yellow\""
+    "text": "$ for indexing with lists\n$ allows only a literal character string or a symbol as the index. For a list it extracts a named element.\nList elements can be named\n\nlist.object.named &lt;- list(\n  emory = number.object,\n  uga = vector.object2,\n  gsu = matrix.object\n)\nlist.object.named\n\n$emory\n[1] 3\n\n$uga\n[1] \"blue\"   \"red\"    \"yellow\"\n\n$gsu\n     [,1] [,2]\n[1,]    2    3\n[2,]    4    5\n\n\nIf list elements are named, then you can reference data from the list using $ or using double square brackets, [[\n\nlist.object.named$uga \n\n[1] \"blue\"   \"red\"    \"yellow\"\n\nlist.object.named[[\"uga\"]] \n\n[1] \"blue\"   \"red\"    \"yellow\""
   },
   {
     "objectID": "modules/Module06-DataSubset.html#using-indexing-to-rename-columns",
     "href": "modules/Module06-DataSubset.html#using-indexing-to-rename-columns",
     "title": "Module 6: Get to Know Your Data and Subsetting",
     "section": "Using indexing to rename columns",
-    "text": "Using indexing to rename columns\nAs mentioned above, indexing can be used both to extract part of an object and to replace parts of an object (or to add parts).\n\ncolnames(df) # just prints\n\n[1] \"observation_id\"    \"IgG_concentration\" \"age\"              \n[4] \"gender\"            \"slum\"             \n\ncolnames(df)[1:2] &lt;- c(\"IgG_concentration_mIU/mL\", \"age_year\") # reassigns\ncolnames(df)\n\n[1] \"IgG_concentration_mIU/mL\" \"age_year\"                \n[3] \"age\"                      \"gender\"                  \n[5] \"slum\"                    \n\ncolnames(df)[1:2] &lt;- c(\"IgG_concentration\", \"age\") #reset"
+    "text": "Using indexing to rename columns\nAs mentioned above, indexing can be used both to extract part of an object and to replace parts of an object (or to add parts).\n\ncolnames(df) \n\n[1] \"observation_id\"    \"IgG_concentration\" \"age\"              \n[4] \"gender\"            \"slum\"             \n\ncolnames(df)[2:3] &lt;- c(\"IgG_concentration_IU/mL\", \"age_year\") # reassigns\ncolnames(df)\n\n[1] \"observation_id\"          \"IgG_concentration_IU/mL\"\n[3] \"age_year\"                \"gender\"                 \n[5] \"slum\"                   \n\n\nFor the sake of the module, I am going to reassign them back to the original variable names\n\ncolnames(df)[2:3] &lt;- c(\"IgG_concentration\", \"age\") #reset"
   },
   {
     "objectID": "modules/Module06-DataSubset.html#using-indexing-to-subset-data",
@@ -2048,7 +1808,7 @@
     "href": "modules/Module06-DataSubset.html#logical-operators",
     "title": "Module 6: Get to Know Your Data and Subsetting",
     "section": "Logical operators",
-    "text": "Logical operators\nLogical operators can be evaluated on object(s) in order to return a binary response of TRUE/FALSE\n\n\n\noperator\noperator option\ndescription\n\n\n\n\n&lt;\n%l%\nless than\n\n\n&lt;=\n%le%\nless than or equal to\n\n\n&gt;\n%g%\ngreater than\n\n\n&gt;=\n%ge%\ngreater than or equal to\n\n\n==\n\nequal to\n\n\n!=\nnot equal to\n\n\n\nx&y\n\nx and y\n\n\nx|y\n\nx or y\n\n\n%in%\n\nmatch\n\n\n%!in%\n\ndo not match"
+    "text": "Logical operators\nLogical operators can be evaluated on object(s) in order to return a binary response of TRUE/FALSE\n\n\n\noperator\noperator option\ndescription\n\n\n\n\n&lt;\n%l%\nless than\n\n\n&lt;=\n%le%\nless than or equal to\n\n\n&gt;\n%g%\ngreater than\n\n\n&gt;=\n%ge%\ngreater than or equal to\n\n\n==\n\nequal to\n\n\n!=\n\nnot equal to\n\n\nx&y\n\nx and y\n\n\nx|y\n\nx or y\n\n\n%in%\n\nmatch\n\n\n%!in%\n\ndo not match"
   },
   {
     "objectID": "modules/Module06-DataSubset.html#logical-operators-examples",
@@ -2062,21 +1822,21 @@
     "href": "modules/Module06-DataSubset.html#using-indexing-and-logical-operators-to-rename-columns",
     "title": "Module 6: Get to Know Your Data and Subsetting",
     "section": "Using indexing and logical operators to rename columns",
-    "text": "Using indexing and logical operators to rename columns\n\nWe can assign the column names from data frame df to an object cn, then we can modify cn directly using indexing and logical operators, finally we reassign the column names, cn, back to the data frame df:\n\n\ncn &lt;- colnames(df)\ncn\n\n[1] \"IgG_concentration\" \"age\"               \"age\"              \n[4] \"gender\"            \"slum\"             \n\ncn[cn==\"IgG_concentration\"] &lt;-\"IgG_concentration_mIU\" #rename cn to \"IgG_concentration_mIU\" when cn is \"IgG_concentration\"\ncolnames(df) &lt;- cn\n\nNote, I am resetting the column name back to the original name for the sake of the rest of the module.\n\ncolnames(df)[colnames(df)==\"IgG_concentration_mIU\"] &lt;- \"IgG_concentration\" #reset"
+    "text": "Using indexing and logical operators to rename columns\n\nWe can assign the column names from data frame df to an object cn, modify cn directly using indexing and logical operators, and finally reassign the column names, cn, back to the data frame df:\n\n\ncn &lt;- colnames(df)\ncn\n\n[1] \"observation_id\"    \"IgG_concentration\" \"age\"              \n[4] \"gender\"            \"slum\"             \n\ncn==\"IgG_concentration\"\n\n[1] FALSE  TRUE FALSE FALSE FALSE\n\ncn[cn==\"IgG_concentration\"] &lt;-\"IgG_concentration_mIU\" #rename cn to \"IgG_concentration_mIU\" when cn is \"IgG_concentration\"\ncolnames(df) &lt;- cn\ncolnames(df)\n\n[1] \"observation_id\"        \"IgG_concentration_mIU\" \"age\"                  \n[4] \"gender\"                \"slum\"                 \n\n\nNote, I am resetting the column name back to the original name for the sake of the rest of the module.\n\ncolnames(df)[colnames(df)==\"IgG_concentration_mIU\"] &lt;- \"IgG_concentration\" #reset"
   },
   {
     "objectID": "modules/Module06-DataSubset.html#using-indexing-and-logical-operators-to-subset-data",
     "href": "modules/Module06-DataSubset.html#using-indexing-and-logical-operators-to-subset-data",
     "title": "Module 6: Get to Know Your Data and Subsetting",
     "section": "Using indexing and logical operators to subset data",
-    "text": "Using indexing and logical operators to subset data\nIn this example, we subset by rows and pull only observations with an age of less than or equal to 10 and then saved the subset data to df_lt10. Note that the logical operators df$age&lt;=10 is before the comma because I want to subset by rows (the first dimension).\n\ndf_lte10 &lt;- df[df$age&lt;=10, ]\n\nIn this example, we subset by rows and pull only observations with an age of less than or equal to 5 OR greater than 10.\n\ndf_lte5_gt10 &lt;- df[df$age&lt;=5 | df$age&gt;10, ]\n\nLets check that my subsets worked using the summary() function.\n\nsummary(df_lte10$age)\n\n    Min.  1st Qu.   Median     Mean  3rd Qu.     Max.     NA's \n0.005391 0.300000 0.300000 0.724742 0.640788 9.545455       10 \n\nsummary(df_lte5_gt10$age)\n\n    Min.  1st Qu.   Median     Mean  3rd Qu.     Max.     NA's \n  0.0054   0.3000   1.6018  87.9886 142.8362 916.4179       10"
+    "text": "Using indexing and logical operators to subset data\nIn this example, we subset by rows, pull only the observations with an age of less than or equal to 10, and save the subset data to df_lte10. Note that the logical expression df$age&lt;=10 comes before the comma because I want to subset by rows (the first dimension).\n\ndf_lte10 &lt;- df[df$age&lt;=10, ]\n\nLet's check that my subsets worked using the summary() function.\n\nsummary(df_lte10$age)\n\n   Min. 1st Qu.  Median    Mean 3rd Qu.    Max.    NA's \n    1.0     3.0     4.0     4.8     7.0    10.0       9 \n\n\n\nIn the next example, we subset by rows and pull only observations with an age of less than or equal to 5 OR greater than 10.\n\ndf_lte5_gt10 &lt;- df[df$age&lt;=5 | df$age&gt;10, ]\n\nLet's check that my subsets worked using the summary() function.\n\nsummary(df_lte5_gt10$age)\n\n   Min. 1st Qu.  Median    Mean 3rd Qu.    Max.    NA's \n   1.00    2.50    4.00    6.08   11.00   15.00       9"
   },
   {
     "objectID": "modules/Module06-DataSubset.html#missing-values",
     "href": "modules/Module06-DataSubset.html#missing-values",
     "title": "Module 6: Get to Know Your Data and Subsetting",
     "section": "Missing values",
-    "text": "Missing values\nMissing data need to be carefully described and dealt with in data analysis. Understanding the different types of missing data and how you can identify them, is the first step to data cleaning.\nTypes of “missing” values:\n\nNA - general missing data\nNaN - stands for “Not a Number”, happens when you do 0/0.\nInf and -Inf - Infinity, happens when you divide a positive number (or negative number) by 0.\nblank space - sometimes when data is read it, there is a blank space left"
+    "text": "Missing values\nMissing data need to be carefully described and dealt with in data analysis. Understanding the different types of missing data and how you can identify them is the first step to data cleaning.\nTypes of “missing” values:\n\nNA - stands for “Not Available”, general missing data\nNaN - stands for “Not a Number”, happens when you do 0/0.\nInf and -Inf - Infinity, happens when you divide a positive number (or negative number) by 0.\nblank space - sometimes when data is read in, there is a blank space left\nan empty string (e.g., \"\")\nNULL - undefined value that represents something that does not exist"
   },
   {
     "objectID": "modules/Module06-DataSubset.html#more-logical-operators",
@@ -2097,7 +1857,7 @@
     "href": "modules/Module06-DataSubset.html#more-logical-operators-examples-1",
     "title": "Module 6: Get to Know Your Data and Subsetting",
     "section": "More logical operators examples",
-    "text": "More logical operators examples\nany(is.na(x)) means do we have any NA’s in the object x?\n\nany(is.na(df$IgG_concentration)) # are there any NAs - YES/TRUE\n\n[1] FALSE\n\nany(is.na(df$slum)) # are there any NAs- NO/FALSE\n\n[1] FALSE\n\n\nwhich(is.na(x)) means which of the elements in object x are NA’s?\n\nwhich(is.na(df$IgG_concentration)) \n\ninteger(0)\n\nwhich(is.na(df$slum)) \n\ninteger(0)"
+    "text": "More logical operators examples\nany(is.na(x)) means do we have any NA’s in the object x?\n\nany(is.na(df$IgG_concentration)) # are there any NAs - YES/TRUE\n\n[1] TRUE\n\nany(is.na(df$slum)) # are there any NAs- NO/FALSE\n\n[1] FALSE\n\n\nwhich(is.na(x)) means which of the elements in object x are NA’s?\n\nwhich(is.na(df$IgG_concentration)) \n\n [1]  13  55  57  72 182 406 414 478 488 595\n\nwhich(is.na(df$slum)) \n\ninteger(0)"
   },
   {
     "objectID": "modules/Module06-DataSubset.html#subset-function",
@@ -2118,14 +1878,14 @@
     "href": "modules/Module06-DataSubset.html#subset-function-vs-logical-operators",
     "title": "Module 6: Get to Know Your Data and Subsetting",
     "section": "subset() function vs logical operators",
-    "text": "subset() function vs logical operators\nsubset() automatically removes NAs, which is a different behavior from doing logical operations on NAs.\n\nsummary(df_lte10$age)\n\n\n\n\nMin.\n1st Qu.\nMedian\nMean\n3rd Qu.\nMax.\nNA’s\n\n\n\n\n0.0053908\n0.3\n0.3\n0.7247421\n0.6407876\n9.545454\n10\n\n\n\n\nsummary(df_lte10_v2$age)\n\n\n\n\nMin.\n1st Qu.\nMedian\nMean\n3rd Qu.\nMax.\n\n\n\n\n0.0053908\n0.3\n0.3\n0.7247421\n0.6407876\n9.545454\n\n\n\n\n\nWe can also see this by looking at the number or rows in each dataset.\n\nnrow(df_lte10)\n\n[1] 370\n\nnrow(df_lte10_v2)\n\n[1] 360"
+    "text": "subset() function vs logical operators\nsubset() automatically removes NAs, which is a different behavior from doing logical operations on NAs.\n\nsummary(df_lte10$age) #created with indexing\n\n\n\n\nMin.\n1st Qu.\nMedian\nMean\n3rd Qu.\nMax.\nNA’s\n\n\n\n\n1\n3\n4\n4.8\n7\n10\n9\n\n\n\n\nsummary(df_lte10_v2$age) #created with the subset function\n\n\n\n\nMin.\n1st Qu.\nMedian\nMean\n3rd Qu.\nMax.\n\n\n\n\n1\n3\n4\n4.8\n7\n10\n\n\n\n\n\nWe can also see this by looking at the number of rows in each dataset.\n\nnrow(df_lte10)\n\n[1] 504\n\nnrow(df_lte10_v2)\n\n[1] 495"
   },
   {
     "objectID": "modules/Module06-DataSubset.html#summary",
     "href": "modules/Module06-DataSubset.html#summary",
     "title": "Module 6: Get to Know Your Data and Subsetting",
     "section": "Summary",
-    "text": "Summary\n\ncolnames(), str() and summary()functions from Base R are great functions to assess the data type and some summary statistics\nThere are three basic indexing syntax: [ ], [[ ]] and $\nIndexing can be used to extract part of an object (e.g., subset data) and to replace parts of an object (e.g., rename variables / columns)\nLogical operators can be evaluated on object(s) in order to return a binary response of TRUE/FALSE, and are useful for decision rules for indexing\nThere are 5 “types” of missing values, the most common being “NA”\nLogical operators meant to determine missing values are very helpful for data cleaning\nThe Base R subset() function is a slightly easier way to select variables and observations."
+    "text": "Summary\n\ncolnames(), str() and summary() from Base R are functions to assess the data type and some summary statistics\nThere are three basic indexing operators: [, [[ and $\nIndexing can be used to extract part of an object (e.g., subset data) and to replace parts of an object (e.g., rename variables / columns)\nLogical operators can be evaluated on object(s) in order to return a binary response of TRUE/FALSE, and are useful for decision rules for indexing\nThere are 7 “types” of missing values, the most common being “NA”\nLogical operators meant to determine missing values are very helpful for data cleaning\nThe Base R subset() function is a slightly easier way to select variables and observations."
   },
   {
     "objectID": "modules/Module06-DataSubset.html#acknowledgements",
@@ -2139,20 +1899,615 @@
     "href": "modules/Module06-DataSubset.html#using-indexing-to-subset-by-columns",
     "title": "Module 6: Get to Know Your Data and Subsetting",
     "section": "Using indexing to subset by columns",
-    "text": "Using indexing to subset by columns\nWe can also subset a data frames and matrices (2-dimensional objects) using the bracket [ row , column ]. We can subset by columns and pull the x column using the index of the column or the column name.\nFor example, here I am pulling the 3nd column, which has the variable name age\n\ndf[ , \"age\"] #same as df[ , 3]\n\n  [1] 3.176895e-01 3.436823e+00 3.000000e-01 1.432363e+02 4.476534e-01\n  [6] 2.527076e-02 6.101083e-01 3.000000e-01 2.916968e+00 1.649819e+00\n [11] 4.574007e+00 1.583904e+02           NA 1.065068e+02 1.113870e+02\n [16] 4.144893e+01 3.000000e-01 2.527076e-01 8.159247e+01 1.825342e+02\n [21] 4.244656e+01 1.193493e+02 3.000000e-01 3.000000e-01 9.025271e-01\n [26] 3.501805e-01 3.000000e-01 1.227437e+00 1.702055e+02 3.000000e-01\n [31] 4.801444e-01 2.527076e-02 3.000000e-01 5.776173e-02 4.801444e-01\n [36] 3.826715e-01 3.000000e-01 4.048558e+02 3.000000e-01 5.451264e-01\n [41] 3.000000e-01 5.590753e+01 2.202166e-01 1.709760e+02 1.227437e+00\n [46] 4.567527e+02 4.838480e+01 1.227437e-01 1.877256e-01 3.000000e-01\n [51] 3.501805e-01 3.339350e+00 3.000000e-01 5.451264e-01           NA\n [56] 2.104693e+00           NA 3.826715e-01 3.926366e+01 1.129964e+00\n [61] 3.501805e+00 7.542808e+01 4.800475e+01 1.000000e+00 4.068884e+01\n [66] 3.000000e-01 4.377672e+01 1.193493e+02 6.977740e+01 1.373288e+02\n [71] 1.642979e+02           NA 1.542808e+02 6.033058e-01 2.809917e-01\n [76] 1.966942e+00 2.041322e+00 2.115702e+00 4.663043e+02 3.000000e-01\n [81] 1.500796e+02 1.543790e+02 2.561983e-01 1.596338e+02 1.732484e+02\n [86] 4.641304e+02 3.736364e+01 1.572452e+02 3.000000e-01 3.000000e-01\n [91] 8.264463e-02 6.776859e-01 7.272727e-01 2.066116e-01 1.966942e+00\n [96] 3.000000e-01 3.000000e-01 2.809917e-01 8.016529e-01 1.818182e-01\n[101] 1.818182e-01 8.264463e-02 3.422727e+01 8.743506e+00 3.000000e-01\n[106] 1.641720e+02 4.049587e-01 1.001592e+02 4.489130e+02 1.101911e+02\n[111] 4.440909e+01 1.288217e+02 
2.840909e+01 1.003981e+02 8.512397e-01\n[116] 1.322314e-01 1.297521e+00 1.570248e-01 1.966942e+00 1.536624e+02\n[121] 3.000000e-01 3.000000e-01 1.074380e+00 1.099174e+00 3.057851e-01\n[126] 3.000000e-01 5.785124e-02 4.391304e+02 6.130435e+02 1.074380e-01\n[131] 7.125796e+01 4.222727e+01 1.620223e+02 3.750000e+01 1.534236e+02\n[136] 6.239130e+02 5.521739e+02 5.785124e-02 6.547945e-01 8.767123e-02\n[141] 3.000000e-01 2.849315e+00 3.835616e-02 2.849315e-01 4.649315e+00\n[146] 1.369863e-01 3.589041e-01 1.049315e+00 4.668998e+01 1.473510e+02\n[151] 4.589744e+01 2.109589e-01 1.741722e+02 2.496503e+01 1.850993e+02\n[156] 1.863014e-01 1.863014e-01 4.589744e+01 1.942881e+02 5.079646e+02\n[161] 8.767123e-01 2.750685e+00 1.503311e+02 3.000000e-01 3.095890e-01\n[166] 3.000000e-01 6.371681e+02 6.054795e-01 1.955298e+02 1.786424e+02\n[171] 1.120861e+02 1.331954e+02 2.159292e+02 5.628319e+02 1.900662e+02\n[176] 6.547945e-01 1.665753e+00 1.739238e+02 9.991722e+01 9.321192e+01\n[181] 8.767123e-02           NA 6.794521e-01 5.808219e-01 1.369863e-01\n[186] 2.060274e+00 1.610099e+02 4.082192e-01 8.273973e-01 4.601770e+02\n[191] 1.389073e+02 3.867133e+01 9.260274e-01 5.918874e+01 1.870861e+02\n[196] 4.328767e-01 6.301370e-02 3.000000e-01 1.548013e+02 5.819536e+01\n[201] 1.724338e+02 1.932401e+01 2.164420e+00 9.757412e-01 1.509434e-01\n[206] 1.509434e-01 7.766571e+01 4.319563e+01 1.752022e-01 3.094775e+01\n[211] 1.266846e-01 2.919806e+01 9.545455e+00 2.735115e+01 1.314841e+02\n[216] 3.643985e+01 1.498559e+02 9.363636e+00 2.479784e-01 5.390836e-02\n[221] 8.787062e-01 1.994609e-01 3.000000e-01 3.000000e-01 5.390836e-03\n[226] 4.177898e-01 3.000000e-01 2.479784e-01 2.964960e-02 2.964960e-01\n[231] 5.148248e+00 1.994609e-01 3.000000e-01 1.779539e+02 3.290210e+02\n[236] 3.000000e-01 1.809798e+02 4.905660e-01 1.266846e-01 1.543948e+02\n[241] 1.379683e+02 6.153846e+02 1.474784e+02 3.000000e-01 1.024259e+00\n[246] 4.444056e+02 3.000000e-01 2.504043e+00 3.000000e-01 3.000000e-01\n[251] 
7.816712e-02 3.000000e-01 5.390836e-02 1.494236e+02 5.972622e+01\n[256] 6.361186e-01 1.837896e+02 1.320809e+02 1.571906e-01 1.520231e+02\n[261] 3.000000e-01 3.000000e-01 1.823699e+02 3.000000e-01 2.173913e+00\n[266] 2.142202e+01 3.000000e-01 3.408027e+00 4.155963e+01 9.698997e-02\n[271] 1.238532e+01 9.528926e+00 1.916185e+02 1.060201e+00 3.679104e+02\n[276] 4.288991e+01 9.971098e+01 3.000000e-01 1.208092e+02 3.000000e-01\n[281] 6.688963e-03 2.505017e+00 1.481605e+00 3.000000e-01 5.183946e-01\n[286] 3.000000e-01 1.872910e-01 3.678930e-01 3.000000e-01 4.529851e+02\n[291] 3.169725e+01 3.000000e-01 4.922018e+01 2.548507e+02 1.661850e+02\n[296] 9.164179e+02 3.678930e-01 1.236994e+02 6.705202e+01 3.834862e+01\n[301] 1.963211e+00 3.000000e-01 2.474916e-01 3.000000e-01 2.173913e-01\n[306] 8.193980e-01 2.444816e+00 3.000000e-01 1.571906e-01 1.849711e+02\n[311] 6.119403e+02 3.000000e-01 4.280936e-01 9.698997e-02 3.678930e-02\n[316] 4.832090e+02 1.390173e+02 3.000000e-01 6.555970e+02 1.526012e+02\n[321] 3.000000e-01 7.222222e-01 7.724426e+01 3.000000e-01 6.111111e-01\n[326] 1.555556e+00 3.055556e-01 1.500000e+00 1.470772e+02 1.694444e+00\n[331] 3.138298e+02 1.414405e+02 1.990605e+02 4.212766e+02 3.000000e-01\n[336] 3.000000e-01 6.478723e+02 3.000000e-01 2.222222e+00 3.000000e-01\n[341] 2.055556e+00 2.777778e-02 8.333333e-02 1.032359e+02 1.611111e+00\n[346] 8.333333e-02 2.333333e+00 5.755319e+02 1.686848e+02 1.111111e-01\n[351] 3.000000e-01 8.372340e+02 3.000000e-01 3.784504e+01 3.819149e+02\n[356] 5.555556e-02 3.000000e+02 1.855950e+02 1.944444e-01 3.000000e-01\n[361] 5.555556e-02 1.138889e+00 4.254237e+01 3.000000e-01 3.000000e-01\n[366] 3.000000e-01 3.000000e-01 3.138298e+02 1.235908e+02 4.159574e+02\n[371] 3.009685e+01 1.567850e+02 1.367432e+02 3.731235e+01 9.164927e+01\n[376] 2.936170e+02 8.820459e+01 1.035491e+02 7.379958e+01 3.000000e-01\n[381] 1.718750e+02 2.128527e+00 1.253918e+00 2.382445e-01 4.639498e-01\n[386] 1.253918e-01 1.253918e-01 3.000000e-01 1.000000e+00 
1.570043e+02\n[391] 4.344086e+02 2.184953e+00 1.507837e+00 3.228840e-01 4.588024e+01\n[396] 1.660560e+02 3.000000e-01 3.043011e+02 2.612903e+02 1.621767e+02\n[401] 3.228840e-01 4.639498e-01 2.495298e+00 3.257053e+00 3.793103e-01\n[406]           NA 6.896552e-02 3.000000e-01 1.423197e+00 3.000000e-01\n[411] 3.000000e-01 1.786638e+02 3.279570e+02           NA 1.903017e+02\n[416] 1.654095e+02 4.639498e-01 1.815733e+02 1.366771e+00 1.536050e-01\n[421] 1.306587e+01 2.129032e+02 1.925647e+02 3.000000e-01 1.028213e+00\n[426] 3.793103e-01 8.025078e-01 4.860215e+02 3.000000e-01 2.100313e-01\n[431] 2.767665e+01 1.592476e+00 9.717868e-02 1.028213e+00 3.793103e-01\n[436] 1.292026e+02 4.425150e+01 3.193548e+02 1.860991e+02 6.614420e-01\n[441] 5.203762e-01 1.330819e+02 1.673491e+02 3.000000e-01 1.117457e+02\n[446] 3.045509e+01 3.000000e-01 8.280255e-02 3.000000e-01 1.200637e+00\n[451] 1.687898e-01 7.367273e+02 8.280255e-02 5.127389e-01 1.974522e-01\n[456] 7.993631e-01 3.000000e-01 3.298182e+02 9.736842e+01 3.000000e-01\n[461] 3.000000e-01 4.214545e+02 3.000000e-01 2.578182e+02 2.261147e-01\n[466] 3.000000e-01 1.883901e+02 9.458204e+01 3.000000e-01 3.000000e-01\n[471] 7.707006e-01 5.032727e+02 1.544586e+00 1.431115e+02 3.000000e-01\n[476] 1.458599e+00 1.247678e+02           NA 4.334545e+02 3.000000e-01\n[481] 6.156364e+02 9.574303e+01 1.928019e+02 1.888545e+02 1.598297e+02\n[486] 5.127389e-01 1.171053e+02           NA 2.547771e-02 1.707430e+02\n[491] 3.000000e-01 1.869969e+02 4.731481e+01 1.988390e+02 3.000000e-01\n[496] 8.808050e+01 2.003185e+00 3.000000e-01 3.509259e+01 9.365325e+01\n[501] 3.000000e-01 3.736111e+01 1.674923e+02 8.808050e+01 1.656347e+02\n[506] 3.722222e+01 6.756364e+02 3.000000e-01 1.698142e+02 1.628483e+02\n[511] 5.985130e-01 1.903346e+00 3.000000e-01 3.000000e-01 8.996283e-01\n[516] 3.977695e-01 3.000000e-01 3.000000e-01 3.000000e-01 3.000000e-01\n[521] 7.446809e+02 6.095745e+02 1.427445e+02 3.000000e-01 2.973978e-02\n[526] 3.977695e-01 4.095745e+02 
4.595745e+02 3.000000e-01 1.976341e+02\n[531] 3.776596e+02 1.777603e+02 4.312268e-01 6.765957e+02 7.978723e+02\n[536] 9.665427e-02 1.879338e+02 4.358670e+01 3.000000e-01 3.000000e-01\n[541] 2.638955e+01 3.180523e+01 1.746845e+02 1.876972e+02 1.044164e+02\n[546] 1.202681e+02 1.630915e+02 1.276025e+02 8.880126e+01 3.563830e+02\n[551] 2.212766e+02 1.969121e+01 3.755319e+02 1.214511e+02 1.034700e+02\n[556] 3.000000e-01 3.643123e-01 6.319703e-02 3.000000e-01 3.000000e-01\n[561] 3.000000e-01 3.000000e-01 3.000000e-01 3.000000e-01 3.000000e-01\n[566] 3.000000e-01 1.664038e+02 2.946809e+02 4.391924e+01 1.874606e+02\n[571] 1.143533e+02 1.600158e+02 1.635688e-01 8.809148e+01 1.337539e+02\n[576] 1.985804e+02 1.578864e+02 3.000000e-01 3.000000e-01 1.953642e-01\n[581] 1.119205e+00 2.523636e+02 3.000000e-01 4.844371e+00 3.000000e-01\n[586] 1.492553e+02 1.993617e+02 2.847682e-01 3.145695e-01 3.000000e-01\n[591] 3.406429e+01 6.595745e+01 3.000000e-01 2.174545e+02           NA\n[596] 5.957447e+01 7.236364e+02 3.000000e-01 3.000000e-01 3.000000e-01\n[601] 2.676364e+02 1.891489e+02 3.036364e+02 3.000000e-01 3.000000e-01\n[606] 3.000000e-01 3.000000e-01 3.000000e-01 1.447020e+00 2.130909e+02\n[611] 1.357616e-01 3.000000e-01 3.000000e-01 5.534545e+02 1.891489e+02\n[616] 7.202128e+01 3.250287e+01 1.655629e-02 3.123636e+02 3.000000e-01\n[621] 7.138298e+01 3.000000e-01 6.946809e+01 4.012629e+01 1.629787e+02\n[626] 1.508511e+02 1.655629e-02 3.000000e-01 4.635762e-02 3.000000e-01\n[631] 3.000000e-01 3.000000e-01 1.942553e+02 3.690909e+02 3.000000e-01\n[636] 3.000000e-01 2.847682e+00 1.435106e+02 3.000000e-01 4.752009e+01\n[641] 2.621125e+01 1.055319e+02 3.000000e-01 1.149007e+00 2.927273e+02\n[646] 3.000000e-01 3.000000e-01 4.839265e+01 3.000000e-01 3.000000e-01\n[651] 2.251656e-01\n\n\nWe can select multiple columns using multiple column names:\n\ndf[, c(\"age\", \"gender\")] #same as df[ , c(3,4)]\n\n             age gender\n1   3.176895e-01 Female\n2   3.436823e+00 Female\n3   
3.000000e-01   Male\n4   1.432363e+02   Male\n5   4.476534e-01   Male\n6   2.527076e-02   Male\n7   6.101083e-01 Female\n8   3.000000e-01 Female\n9   2.916968e+00   Male\n10  1.649819e+00   Male\n11  4.574007e+00   Male\n12  1.583904e+02 Female\n13            NA   Male\n14  1.065068e+02   Male\n15  1.113870e+02   Male\n16  4.144893e+01   Male\n17  3.000000e-01   Male\n18  2.527076e-01 Female\n19  8.159247e+01 Female\n20  1.825342e+02   Male\n21  4.244656e+01   Male\n22  1.193493e+02 Female\n23  3.000000e-01   Male\n24  3.000000e-01 Female\n25  9.025271e-01 Female\n26  3.501805e-01   Male\n27  3.000000e-01   Male\n28  1.227437e+00 Female\n29  1.702055e+02 Female\n30  3.000000e-01 Female\n31  4.801444e-01   Male\n32  2.527076e-02   Male\n33  3.000000e-01 Female\n34  5.776173e-02   Male\n35  4.801444e-01 Female\n36  3.826715e-01 Female\n37  3.000000e-01   Male\n38  4.048558e+02   Male\n39  3.000000e-01   Male\n40  5.451264e-01   Male\n41  3.000000e-01 Female\n42  5.590753e+01   Male\n43  2.202166e-01 Female\n44  1.709760e+02   Male\n45  1.227437e+00   Male\n46  4.567527e+02   Male\n47  4.838480e+01   Male\n48  1.227437e-01 Female\n49  1.877256e-01 Female\n50  3.000000e-01 Female\n51  3.501805e-01   Male\n52  3.339350e+00   Male\n53  3.000000e-01 Female\n54  5.451264e-01 Female\n55            NA   Male\n56  2.104693e+00   Male\n57            NA   Male\n58  3.826715e-01 Female\n59  3.926366e+01 Female\n60  1.129964e+00   Male\n61  3.501805e+00 Female\n62  7.542808e+01 Female\n63  4.800475e+01 Female\n64  1.000000e+00   Male\n65  4.068884e+01   Male\n66  3.000000e-01 Female\n67  4.377672e+01 Female\n68  1.193493e+02   Male\n69  6.977740e+01   Male\n70  1.373288e+02 Female\n71  1.642979e+02   Male\n72            NA Female\n73  1.542808e+02   Male\n74  6.033058e-01   Male\n75  2.809917e-01   Male\n76  1.966942e+00   Male\n77  2.041322e+00   Male\n78  2.115702e+00 Female\n79  4.663043e+02   Male\n80  3.000000e-01   Male\n81  1.500796e+02   Male\n82  1.543790e+02 Female\n83  
2.561983e-01 Female\n84  1.596338e+02   Male\n85  1.732484e+02 Female\n86  4.641304e+02 Female\n87  3.736364e+01   Male\n88  1.572452e+02 Female\n89  3.000000e-01   Male\n90  3.000000e-01   Male\n91  8.264463e-02   Male\n92  6.776859e-01 Female\n93  7.272727e-01   Male\n94  2.066116e-01 Female\n95  1.966942e+00   Male\n96  3.000000e-01   Male\n97  3.000000e-01   Male\n98  2.809917e-01 Female\n99  8.016529e-01 Female\n100 1.818182e-01 Female\n101 1.818182e-01   Male\n102 8.264463e-02 Female\n103 3.422727e+01 Female\n104 8.743506e+00   Male\n105 3.000000e-01   Male\n106 1.641720e+02 Female\n107 4.049587e-01   Male\n108 1.001592e+02   Male\n109 4.489130e+02 Female\n110 1.101911e+02 Female\n111 4.440909e+01   Male\n112 1.288217e+02 Female\n113 2.840909e+01   Male\n114 1.003981e+02 Female\n115 8.512397e-01 Female\n116 1.322314e-01   Male\n117 1.297521e+00 Female\n118 1.570248e-01   Male\n119 1.966942e+00 Female\n120 1.536624e+02   Male\n121 3.000000e-01 Female\n122 3.000000e-01 Female\n123 1.074380e+00   Male\n124 1.099174e+00 Female\n125 3.057851e-01 Female\n126 3.000000e-01 Female\n127 5.785124e-02 Female\n128 4.391304e+02 Female\n129 6.130435e+02 Female\n130 1.074380e-01   Male\n131 7.125796e+01   Male\n132 4.222727e+01   Male\n133 1.620223e+02 Female\n134 3.750000e+01 Female\n135 1.534236e+02 Female\n136 6.239130e+02 Female\n137 5.521739e+02   Male\n138 5.785124e-02 Female\n139 6.547945e-01 Female\n140 8.767123e-02 Female\n141 3.000000e-01   Male\n142 2.849315e+00 Female\n143 3.835616e-02   Male\n144 2.849315e-01   Male\n145 4.649315e+00   Male\n146 1.369863e-01 Female\n147 3.589041e-01   Male\n148 1.049315e+00   Male\n149 4.668998e+01 Female\n150 1.473510e+02 Female\n151 4.589744e+01   Male\n152 2.109589e-01   Male\n153 1.741722e+02 Female\n154 2.496503e+01 Female\n155 1.850993e+02   Male\n156 1.863014e-01   Male\n157 1.863014e-01   Male\n158 4.589744e+01 Female\n159 1.942881e+02 Female\n160 5.079646e+02 Female\n161 8.767123e-01   Male\n162 2.750685e+00   Male\n163 
1.503311e+02 Female\n164 3.000000e-01   Male\n165 3.095890e-01   Male\n166 3.000000e-01   Male\n167 6.371681e+02 Female\n168 6.054795e-01 Female\n169 1.955298e+02 Female\n170 1.786424e+02   Male\n171 1.120861e+02 Female\n172 1.331954e+02   Male\n173 2.159292e+02   Male\n174 5.628319e+02   Male\n175 1.900662e+02 Female\n176 6.547945e-01   Male\n177 1.665753e+00   Male\n178 1.739238e+02   Male\n179 9.991722e+01   Male\n180 9.321192e+01   Male\n181 8.767123e-02 Female\n182           NA   Male\n183 6.794521e-01 Female\n184 5.808219e-01   Male\n185 1.369863e-01 Female\n186 2.060274e+00 Female\n187 1.610099e+02   Male\n188 4.082192e-01 Female\n189 8.273973e-01   Male\n190 4.601770e+02 Female\n191 1.389073e+02 Female\n192 3.867133e+01 Female\n193 9.260274e-01 Female\n194 5.918874e+01 Female\n195 1.870861e+02 Female\n196 4.328767e-01   Male\n197 6.301370e-02   Male\n198 3.000000e-01 Female\n199 1.548013e+02   Male\n200 5.819536e+01 Female\n201 1.724338e+02 Female\n202 1.932401e+01 Female\n203 2.164420e+00 Female\n204 9.757412e-01 Female\n205 1.509434e-01   Male\n206 1.509434e-01 Female\n207 7.766571e+01   Male\n208 4.319563e+01 Female\n209 1.752022e-01   Male\n210 3.094775e+01 Female\n211 1.266846e-01   Male\n212 2.919806e+01   Male\n213 9.545455e+00 Female\n214 2.735115e+01 Female\n215 1.314841e+02 Female\n216 3.643985e+01   Male\n217 1.498559e+02 Female\n218 9.363636e+00 Female\n219 2.479784e-01   Male\n220 5.390836e-02 Female\n221 8.787062e-01 Female\n222 1.994609e-01   Male\n223 3.000000e-01 Female\n224 3.000000e-01   Male\n225 5.390836e-03 Female\n226 4.177898e-01 Female\n227 3.000000e-01 Female\n228 2.479784e-01   Male\n229 2.964960e-02   Male\n230 2.964960e-01   Male\n231 5.148248e+00 Female\n232 1.994609e-01   Male\n233 3.000000e-01   Male\n234 1.779539e+02   Male\n235 3.290210e+02 Female\n236 3.000000e-01   Male\n237 1.809798e+02 Female\n238 4.905660e-01   Male\n239 1.266846e-01   Male\n240 1.543948e+02 Female\n241 1.379683e+02 Female\n242 6.153846e+02   Male\n243 
1.474784e+02   Male\n244 3.000000e-01 Female\n245 1.024259e+00   Male\n246 4.444056e+02 Female\n247 3.000000e-01   Male\n248 2.504043e+00 Female\n249 3.000000e-01 Female\n250 3.000000e-01 Female\n251 7.816712e-02 Female\n252 3.000000e-01 Female\n253 5.390836e-02   Male\n254 1.494236e+02 Female\n255 5.972622e+01   Male\n256 6.361186e-01 Female\n257 1.837896e+02 Female\n258 1.320809e+02 Female\n259 1.571906e-01   Male\n260 1.520231e+02   Male\n261 3.000000e-01 Female\n262 3.000000e-01 Female\n263 1.823699e+02   Male\n264 3.000000e-01   Male\n265 2.173913e+00   Male\n266 2.142202e+01   Male\n267 3.000000e-01 Female\n268 3.408027e+00   Male\n269 4.155963e+01   Male\n270 9.698997e-02   Male\n271 1.238532e+01 Female\n272 9.528926e+00   Male\n273 1.916185e+02 Female\n274 1.060201e+00   Male\n275 3.679104e+02 Female\n276 4.288991e+01   Male\n277 9.971098e+01   Male\n278 3.000000e-01   Male\n279 1.208092e+02   Male\n280 3.000000e-01   Male\n281 6.688963e-03 Female\n282 2.505017e+00 Female\n283 1.481605e+00   Male\n284 3.000000e-01 Female\n285 5.183946e-01 Female\n286 3.000000e-01 Female\n287 1.872910e-01   Male\n288 3.678930e-01 Female\n289 3.000000e-01   Male\n290 4.529851e+02 Female\n291 3.169725e+01 Female\n292 3.000000e-01   Male\n293 4.922018e+01   Male\n294 2.548507e+02   Male\n295 1.661850e+02   Male\n296 9.164179e+02   Male\n297 3.678930e-01 Female\n298 1.236994e+02   Male\n299 6.705202e+01   Male\n300 3.834862e+01   Male\n301 1.963211e+00 Female\n302 3.000000e-01   Male\n303 2.474916e-01   Male\n304 3.000000e-01 Female\n305 2.173913e-01   Male\n306 8.193980e-01   Male\n307 2.444816e+00 Female\n308 3.000000e-01   Male\n309 1.571906e-01 Female\n310 1.849711e+02   Male\n311 6.119403e+02 Female\n312 3.000000e-01 Female\n313 4.280936e-01 Female\n314 9.698997e-02   Male\n315 3.678930e-02 Female\n316 4.832090e+02   Male\n317 1.390173e+02 Female\n318 3.000000e-01   Male\n319 6.555970e+02 Female\n320 1.526012e+02 Female\n321 3.000000e-01 Female\n322 7.222222e-01   Male\n323 
7.724426e+01   Male\n324 3.000000e-01   Male\n325 6.111111e-01 Female\n326 1.555556e+00 Female\n327 3.055556e-01   Male\n328 1.500000e+00   Male\n329 1.470772e+02   Male\n330 1.694444e+00 Female\n331 3.138298e+02 Female\n332 1.414405e+02 Female\n333 1.990605e+02 Female\n334 4.212766e+02   Male\n335 3.000000e-01   Male\n336 3.000000e-01   Male\n337 6.478723e+02   Male\n338 3.000000e-01   Male\n339 2.222222e+00 Female\n340 3.000000e-01   Male\n341 2.055556e+00   Male\n342 2.777778e-02 Female\n343 8.333333e-02   Male\n344 1.032359e+02 Female\n345 1.611111e+00 Female\n346 8.333333e-02 Female\n347 2.333333e+00 Female\n348 5.755319e+02   Male\n349 1.686848e+02 Female\n350 1.111111e-01   Male\n351 3.000000e-01   Male\n352 8.372340e+02 Female\n353 3.000000e-01   Male\n354 3.784504e+01   Male\n355 3.819149e+02   Male\n356 5.555556e-02 Female\n357 3.000000e+02 Female\n358 1.855950e+02   Male\n359 1.944444e-01 Female\n360 3.000000e-01   Male\n361 5.555556e-02 Female\n362 1.138889e+00   Male\n363 4.254237e+01 Female\n364 3.000000e-01   Male\n365 3.000000e-01   Male\n366 3.000000e-01 Female\n367 3.000000e-01 Female\n368 3.138298e+02 Female\n369 1.235908e+02   Male\n370 4.159574e+02   Male\n371 3.009685e+01 Female\n372 1.567850e+02 Female\n373 1.367432e+02 Female\n374 3.731235e+01 Female\n375 9.164927e+01   Male\n376 2.936170e+02 Female\n377 8.820459e+01 Female\n378 1.035491e+02   Male\n379 7.379958e+01 Female\n380 3.000000e-01   Male\n381 1.718750e+02   Male\n382 2.128527e+00   Male\n383 1.253918e+00 Female\n384 2.382445e-01   Male\n385 4.639498e-01 Female\n386 1.253918e-01   Male\n387 1.253918e-01   Male\n388 3.000000e-01 Female\n389 1.000000e+00   Male\n390 1.570043e+02   Male\n391 4.344086e+02 Female\n392 2.184953e+00   Male\n393 1.507837e+00 Female\n394 3.228840e-01 Female\n395 4.588024e+01   Male\n396 1.660560e+02   Male\n397 3.000000e-01   Male\n398 3.043011e+02   Male\n399 2.612903e+02 Female\n400 1.621767e+02   Male\n401 3.228840e-01   Male\n402 4.639498e-01 Female\n403 
2.495298e+00 Female\n404 3.257053e+00 Female\n405 3.793103e-01 Female\n406           NA   Male\n407 6.896552e-02 Female\n408 3.000000e-01   Male\n409 1.423197e+00 Female\n410 3.000000e-01 Female\n411 3.000000e-01 Female\n412 1.786638e+02   Male\n413 3.279570e+02   Male\n414           NA Female\n415 1.903017e+02   Male\n416 1.654095e+02 Female\n417 4.639498e-01 Female\n418 1.815733e+02   Male\n419 1.366771e+00   Male\n420 1.536050e-01 Female\n421 1.306587e+01   Male\n422 2.129032e+02 Female\n423 1.925647e+02   Male\n424 3.000000e-01 Female\n425 1.028213e+00 Female\n426 3.793103e-01 Female\n427 8.025078e-01 Female\n428 4.860215e+02 Female\n429 3.000000e-01 Female\n430 2.100313e-01   Male\n431 2.767665e+01 Female\n432 1.592476e+00   Male\n433 9.717868e-02 Female\n434 1.028213e+00 Female\n435 3.793103e-01   Male\n436 1.292026e+02   Male\n437 4.425150e+01 Female\n438 3.193548e+02 Female\n439 1.860991e+02 Female\n440 6.614420e-01 Female\n441 5.203762e-01   Male\n442 1.330819e+02   Male\n443 1.673491e+02 Female\n444 3.000000e-01   Male\n445 1.117457e+02   Male\n446 3.045509e+01 Female\n447 3.000000e-01   Male\n448 8.280255e-02 Female\n449 3.000000e-01 Female\n450 1.200637e+00 Female\n451 1.687898e-01   Male\n452 7.367273e+02 Female\n453 8.280255e-02   Male\n454 5.127389e-01   Male\n455 1.974522e-01   Male\n456 7.993631e-01 Female\n457 3.000000e-01   Male\n458 3.298182e+02   Male\n459 9.736842e+01 Female\n460 3.000000e-01 Female\n461 3.000000e-01 Female\n462 4.214545e+02 Female\n463 3.000000e-01   Male\n464 2.578182e+02 Female\n465 2.261147e-01   Male\n466 3.000000e-01 Female\n467 1.883901e+02   Male\n468 9.458204e+01 Female\n469 3.000000e-01 Female\n470 3.000000e-01   Male\n471 7.707006e-01 Female\n472 5.032727e+02   Male\n473 1.544586e+00 Female\n474 1.431115e+02 Female\n475 3.000000e-01   Male\n476 1.458599e+00   Male\n477 1.247678e+02 Female\n478           NA Female\n479 4.334545e+02   Male\n480 3.000000e-01 Female\n481 6.156364e+02 Female\n482 9.574303e+01   Male\n483 
1.928019e+02   Male\n484 1.888545e+02   Male\n485 1.598297e+02 Female\n486 5.127389e-01   Male\n487 1.171053e+02 Female\n488           NA   Male\n489 2.547771e-02 Female\n490 1.707430e+02 Female\n491 3.000000e-01   Male\n492 1.869969e+02   Male\n493 4.731481e+01   Male\n494 1.988390e+02 Female\n495 3.000000e-01   Male\n496 8.808050e+01   Male\n497 2.003185e+00 Female\n498 3.000000e-01   Male\n499 3.509259e+01 Female\n500 9.365325e+01 Female\n501 3.000000e-01   Male\n502 3.736111e+01 Female\n503 1.674923e+02 Female\n504 8.808050e+01   Male\n505 1.656347e+02 Female\n506 3.722222e+01 Female\n507 6.756364e+02 Female\n508 3.000000e-01   Male\n509 1.698142e+02   Male\n510 1.628483e+02 Female\n511 5.985130e-01   Male\n512 1.903346e+00 Female\n513 3.000000e-01   Male\n514 3.000000e-01   Male\n515 8.996283e-01   Male\n516 3.977695e-01 Female\n517 3.000000e-01   Male\n518 3.000000e-01   Male\n519 3.000000e-01   Male\n520 3.000000e-01 Female\n521 7.446809e+02   Male\n522 6.095745e+02 Female\n523 1.427445e+02   Male\n524 3.000000e-01 Female\n525 2.973978e-02   Male\n526 3.977695e-01 Female\n527 4.095745e+02 Female\n528 4.595745e+02   Male\n529 3.000000e-01 Female\n530 1.976341e+02 Female\n531 3.776596e+02 Female\n532 1.777603e+02 Female\n533 4.312268e-01   Male\n534 6.765957e+02 Female\n535 7.978723e+02   Male\n536 9.665427e-02   Male\n537 1.879338e+02   Male\n538 4.358670e+01 Female\n539 3.000000e-01 Female\n540 3.000000e-01   Male\n541 2.638955e+01   Male\n542 3.180523e+01 Female\n543 1.746845e+02   Male\n544 1.876972e+02   Male\n545 1.044164e+02   Male\n546 1.202681e+02   Male\n547 1.630915e+02 Female\n548 1.276025e+02 Female\n549 8.880126e+01   Male\n550 3.563830e+02   Male\n551 2.212766e+02   Male\n552 1.969121e+01 Female\n553 3.755319e+02 Female\n554 1.214511e+02   Male\n555 1.034700e+02 Female\n556 3.000000e-01 Female\n557 3.643123e-01 Female\n558 6.319703e-02 Female\n559 3.000000e-01   Male\n560 3.000000e-01   Male\n561 3.000000e-01 Female\n562 3.000000e-01 Female\n563 
3.000000e-01   Male\n564 3.000000e-01   Male\n565 3.000000e-01 Female\n566 3.000000e-01   Male\n567 1.664038e+02 Female\n568 2.946809e+02 Female\n569 4.391924e+01   Male\n570 1.874606e+02 Female\n571 1.143533e+02   Male\n572 1.600158e+02   Male\n573 1.635688e-01   Male\n574 8.809148e+01 Female\n575 1.337539e+02   Male\n576 1.985804e+02   Male\n577 1.578864e+02 Female\n578 3.000000e-01 Female\n579 3.000000e-01   Male\n580 1.953642e-01 Female\n581 1.119205e+00   Male\n582 2.523636e+02   Male\n583 3.000000e-01   Male\n584 4.844371e+00 Female\n585 3.000000e-01   Male\n586 1.492553e+02 Female\n587 1.993617e+02   Male\n588 2.847682e-01 Female\n589 3.145695e-01 Female\n590 3.000000e-01   Male\n591 3.406429e+01 Female\n592 6.595745e+01   Male\n593 3.000000e-01   Male\n594 2.174545e+02   Male\n595           NA Female\n596 5.957447e+01 Female\n597 7.236364e+02 Female\n598 3.000000e-01   Male\n599 3.000000e-01 Female\n600 3.000000e-01   Male\n601 2.676364e+02   Male\n602 1.891489e+02   Male\n603 3.036364e+02 Female\n604 3.000000e-01 Female\n605 3.000000e-01   Male\n606 3.000000e-01   Male\n607 3.000000e-01 Female\n608 3.000000e-01   Male\n609 1.447020e+00   Male\n610 2.130909e+02 Female\n611 1.357616e-01 Female\n612 3.000000e-01 Female\n613 3.000000e-01 Female\n614 5.534545e+02 Female\n615 1.891489e+02 Female\n616 7.202128e+01 Female\n617 3.250287e+01   Male\n618 1.655629e-02   Male\n619 3.123636e+02   Male\n620 3.000000e-01   Male\n621 7.138298e+01   Male\n622 3.000000e-01 Female\n623 6.946809e+01 Female\n624 4.012629e+01   Male\n625 1.629787e+02 Female\n626 1.508511e+02 Female\n627 1.655629e-02   Male\n628 3.000000e-01   Male\n629 4.635762e-02   Male\n630 3.000000e-01 Female\n631 3.000000e-01 Female\n632 3.000000e-01   Male\n633 1.942553e+02   Male\n634 3.690909e+02   Male\n635 3.000000e-01 Female\n636 3.000000e-01 Female\n637 2.847682e+00   Male\n638 1.435106e+02 Female\n639 3.000000e-01   Male\n640 4.752009e+01 Female\n641 2.621125e+01 Female\n642 1.055319e+02 Female\n643 
3.000000e-01 Female\n644 1.149007e+00   Male\n645 2.927273e+02 Female\n646 3.000000e-01 Female\n647 3.000000e-01 Female\n648 4.839265e+01   Male\n649 3.000000e-01   Male\n650 3.000000e-01 Female\n651 2.251656e-01 Female\n\n\nWe can remove select columns using indexing as well, OR by simply changing the column to NULL\n\ndf[, -5] #remove column 5, \"slum\" variable\n\n    IgG_concentration          age age.1 gender\n1                5772 3.176895e-01     2 Female\n2                8095 3.436823e+00     4 Female\n3                9784 3.000000e-01     4   Male\n4                9338 1.432363e+02     4   Male\n5                6369 4.476534e-01     1   Male\n6                6885 2.527076e-02     4   Male\n7                6252 6.101083e-01     4 Female\n8                8913 3.000000e-01    NA Female\n9                7332 2.916968e+00     4   Male\n10               6941 1.649819e+00     2   Male\n11               5104 4.574007e+00     3   Male\n12               9078 1.583904e+02    15 Female\n13               9960           NA     8   Male\n14               9651 1.065068e+02    12   Male\n15               9229 1.113870e+02    15   Male\n16               5210 4.144893e+01     9   Male\n17               5105 3.000000e-01     8   Male\n18               7607 2.527076e-01     7 Female\n19               7582 8.159247e+01    11 Female\n20               8179 1.825342e+02    10   Male\n21               5660 4.244656e+01     8   Male\n22               6696 1.193493e+02    11 Female\n23               7842 3.000000e-01     2   Male\n24               6578 3.000000e-01     2 Female\n25               9619 9.025271e-01     3 Female\n26               9838 3.501805e-01     5   Male\n27               6935 3.000000e-01     1   Male\n28               5885 1.227437e+00     3 Female\n29               9657 1.702055e+02     5 Female\n30               9146 3.000000e-01     5 Female\n31               7056 4.801444e-01     3   Male\n32               9144 2.527076e-02     1   Male\n33           
    8696 3.000000e-01     4 Female\n34               7042 5.776173e-02     3   Male\n35               5278 4.801444e-01     2 Female\n36               6541 3.826715e-01    11 Female\n37               6070 3.000000e-01     7   Male\n38               5490 4.048558e+02     8   Male\n39               6527 3.000000e-01     6   Male\n40               5389 5.451264e-01     6   Male\n41               9003 3.000000e-01    11 Female\n42               6682 5.590753e+01    10   Male\n43               7844 2.202166e-01     6 Female\n44               8257 1.709760e+02    12   Male\n45               7767 1.227437e+00    11   Male\n46               8391 4.567527e+02    10   Male\n47               8317 4.838480e+01    11   Male\n48               7397 1.227437e-01    13 Female\n49               8495 1.877256e-01     3 Female\n50               8093 3.000000e-01     4 Female\n51               7375 3.501805e-01     3   Male\n52               5255 3.339350e+00     1   Male\n53               8445 3.000000e-01     2 Female\n54               8959 5.451264e-01     2 Female\n55               8400           NA     4   Male\n56               7420 2.104693e+00     2   Male\n57               5206           NA     2   Male\n58               7431 3.826715e-01     3 Female\n59               7230 3.926366e+01     3 Female\n60               8208 1.129964e+00     4   Male\n61               8538 3.501805e+00     1 Female\n62               6125 7.542808e+01    13 Female\n63               5767 4.800475e+01    13 Female\n64               5487 1.000000e+00     6   Male\n65               5539 4.068884e+01    13   Male\n66               5759 3.000000e-01     5 Female\n67               6845 4.377672e+01    13 Female\n68               7170 1.193493e+02    14   Male\n69               6588 6.977740e+01    13   Male\n70               7939 1.373288e+02     8 Female\n71               5006 1.642979e+02     7   Male\n72               9180           NA     6 Female\n73               9638 1.542808e+02    13   Male\n74  
             7781 6.033058e-01     3   Male\n75               6932 2.809917e-01     4   Male\n76               8120 1.966942e+00     2   Male\n77               9292 2.041322e+00    NA   Male\n78               9228 2.115702e+00     5 Female\n79               8185 4.663043e+02     3   Male\n80               6797 3.000000e-01     3   Male\n81               5970 1.500796e+02    14   Male\n82               7219 1.543790e+02    11 Female\n83               6870 2.561983e-01     7 Female\n84               7653 1.596338e+02     7   Male\n85               8824 1.732484e+02    11 Female\n86               8311 4.641304e+02     9 Female\n87               9458 3.736364e+01    14   Male\n88               8275 1.572452e+02    13 Female\n89               6786 3.000000e-01     1   Male\n90               6595 3.000000e-01     1   Male\n91               5264 8.264463e-02     4   Male\n92               9188 6.776859e-01     1 Female\n93               6611 7.272727e-01     2   Male\n94               6840 2.066116e-01     3 Female\n95               5663 1.966942e+00     2   Male\n96               9611 3.000000e-01     1   Male\n97               7717 3.000000e-01     2   Male\n98               8374 2.809917e-01     2 Female\n99               5134 8.016529e-01     4 Female\n100              8122 1.818182e-01     5 Female\n101              6192 1.818182e-01     5   Male\n102              9668 8.264463e-02     6 Female\n103              9577 3.422727e+01    14 Female\n104              6403 8.743506e+00    14   Male\n105              9464 3.000000e-01    10   Male\n106              8157 1.641720e+02     6 Female\n107              9451 4.049587e-01     6   Male\n108              6615 1.001592e+02     8   Male\n109              9074 4.489130e+02     6 Female\n110              7479 1.101911e+02    12 Female\n111              8946 4.440909e+01    12   Male\n112              5296 1.288217e+02    14 Female\n113              6238 2.840909e+01    15   Male\n114              6303 1.003981e+02    12 
Female\n115              6662 8.512397e-01     4 Female\n116              6251 1.322314e-01     4   Male\n117              9110 1.297521e+00     3 Female\n118              8480 1.570248e-01    NA   Male\n119              5229 1.966942e+00     2 Female\n120              9173 1.536624e+02     3   Male\n121              9896 3.000000e-01    NA Female\n122              5057 3.000000e-01     3 Female\n123              7732 1.074380e+00     3   Male\n124              6882 1.099174e+00     2 Female\n125              9587 3.057851e-01     4 Female\n126              9930 3.000000e-01    10 Female\n127              6960 5.785124e-02     7 Female\n128              6335 4.391304e+02    11 Female\n129              6286 6.130435e+02     6 Female\n130              9035 1.074380e-01    11   Male\n131              5720 7.125796e+01     9   Male\n132              7368 4.222727e+01     6   Male\n133              5170 1.620223e+02    13 Female\n134              6691 3.750000e+01    10 Female\n135              6173 1.534236e+02     6 Female\n136              8170 6.239130e+02    11 Female\n137              9637 5.521739e+02     7   Male\n138              9482 5.785124e-02     6 Female\n139              7880 6.547945e-01     4 Female\n140              6307 8.767123e-02     4 Female\n141              8822 3.000000e-01     4   Male\n142              8190 2.849315e+00     4 Female\n143              7554 3.835616e-02     4   Male\n144              6519 2.849315e-01     4   Male\n145              9764 4.649315e+00     3   Male\n146              8792 1.369863e-01     4 Female\n147              6721 3.589041e-01     3   Male\n148              9042 1.049315e+00     3   Male\n149              7407 4.668998e+01    13 Female\n150              7229 1.473510e+02     7 Female\n151              7532 4.589744e+01    10   Male\n152              6516 2.109589e-01     6   Male\n153              7941 1.741722e+02    10 Female\n154              8124 2.496503e+01    12 Female\n155              7869 
1.850993e+02    10   Male\n156              5647 1.863014e-01    10   Male\n157              9120 1.863014e-01    13   Male\n158              6608 4.589744e+01    13 Female\n159              8635 1.942881e+02     5 Female\n160              9341 5.079646e+02     3 Female\n161              9982 8.767123e-01     4   Male\n162              6976 2.750685e+00     1   Male\n163              6008 1.503311e+02     3 Female\n164              5432 3.000000e-01     4   Male\n165              5749 3.095890e-01     4   Male\n166              6428 3.000000e-01     1   Male\n167              5947 6.371681e+02     5 Female\n168              6027 6.054795e-01     6 Female\n169              5064 1.955298e+02    14 Female\n170              5861 1.786424e+02     6   Male\n171              6702 1.120861e+02    13 Female\n172              7851 1.331954e+02     9   Male\n173              8310 2.159292e+02    11   Male\n174              5897 5.628319e+02    10   Male\n175              9249 1.900662e+02     5 Female\n176              9163 6.547945e-01    14   Male\n177              6550 1.665753e+00     7   Male\n178              5859 1.739238e+02    10   Male\n179              5607 9.991722e+01     6   Male\n180              8746 9.321192e+01     5   Male\n181              5274 8.767123e-02     3 Female\n182              9412           NA     4   Male\n183              5691 6.794521e-01     2 Female\n184              9016 5.808219e-01     3   Male\n185              9128 1.369863e-01     3 Female\n186              8539 2.060274e+00     2 Female\n187              5703 1.610099e+02     3   Male\n188              9573 4.082192e-01     5 Female\n189              5852 8.273973e-01     2   Male\n190              5971 4.601770e+02     3 Female\n191              7015 1.389073e+02    14 Female\n192              8221 3.867133e+01     9 Female\n193              6752 9.260274e-01    14 Female\n194              7436 5.918874e+01     9 Female\n195              6869 1.870861e+02     8 Female\n196          
    8947 4.328767e-01     7   Male\n197              7360 6.301370e-02    13   Male\n198              7494 3.000000e-01     8 Female\n199              8243 1.548013e+02     6   Male\n200              6176 5.819536e+01    12 Female\n201              6818 1.724338e+02    14 Female\n202              8083 1.932401e+01    15 Female\n203              6711 2.164420e+00     2 Female\n204              8890 9.757412e-01     4 Female\n205              5576 1.509434e-01     3   Male\n206              8396 1.509434e-01     3 Female\n207              5986 7.766571e+01     3   Male\n208              9758 4.319563e+01     4 Female\n209              5444 1.752022e-01     3   Male\n210              6394 3.094775e+01    14 Female\n211              5694 1.266846e-01     8   Male\n212              9604 2.919806e+01     7   Male\n213              7895 9.545455e+00    14 Female\n214              5141 2.735115e+01    13 Female\n215              8034 1.314841e+02    13 Female\n216              6566 3.643985e+01     7   Male\n217              6827 1.498559e+02     8 Female\n218              7400 9.363636e+00    10 Female\n219              9094 2.479784e-01     9   Male\n220              9474 5.390836e-02     9 Female\n221              7984 8.787062e-01     3 Female\n222              9524 1.994609e-01     4   Male\n223              9598 3.000000e-01     4 Female\n224              9664 3.000000e-01     4   Male\n225              9910 5.390836e-03     2 Female\n226              9216 4.177898e-01     1 Female\n227              9706 3.000000e-01     3 Female\n228              5320 2.479784e-01     2   Male\n229              5256 2.964960e-02     3   Male\n230              9006 2.964960e-01     5   Male\n231              6413 5.148248e+00     2 Female\n232              8717 1.994609e-01     2   Male\n233              9873 3.000000e-01     9   Male\n234              6699 1.779539e+02    13   Male\n235              8228 3.290210e+02    10 Female\n236              6494 3.000000e-01     6   Male\n237 
             9294 1.809798e+02    13 Female\n238              7680 4.905660e-01    11   Male\n239              7534 1.266846e-01    10   Male\n240              9920 1.543948e+02     8 Female\n241              9814 1.379683e+02     9 Female\n242              5363 6.153846e+02    10   Male\n243              5842 1.474784e+02    14   Male\n244              7992 3.000000e-01     1 Female\n245              5565 1.024259e+00     2   Male\n246              5258 4.444056e+02     3 Female\n247              8200 3.000000e-01     2   Male\n248              8795 2.504043e+00     3 Female\n249              7676 3.000000e-01     2 Female\n250              7029 3.000000e-01     3 Female\n251              7535 7.816712e-02     5 Female\n252              5026 3.000000e-01    10 Female\n253              8630 5.390836e-02     7   Male\n254              6989 1.494236e+02    13 Female\n255              8454 5.972622e+01    15   Male\n256              9741 6.361186e-01    11 Female\n257              6418 1.837896e+02    10 Female\n258              9922 1.320809e+02     3 Female\n259              8504 1.571906e-01     2   Male\n260              6491 1.520231e+02     3   Male\n261              6002 3.000000e-01     3 Female\n262              7127 3.000000e-01     3 Female\n263              8540 1.823699e+02     4   Male\n264              7115 3.000000e-01     3   Male\n265              7268 2.173913e+00     2   Male\n266              8279 2.142202e+01     4   Male\n267              8880 3.000000e-01     2 Female\n268              8076 3.408027e+00     8   Male\n269              6250 4.155963e+01    11   Male\n270              8542 9.698997e-02     6   Male\n271              5393 1.238532e+01    14 Female\n272              9197 9.528926e+00    14   Male\n273              6651 1.916185e+02     5 Female\n274              7473 1.060201e+00     5   Male\n275              6589 3.679104e+02    10 Female\n276              6867 4.288991e+01    13   Male\n277              5413 9.971098e+01     6   
Male\n278              6765 3.000000e-01     5   Male\n279              8933 1.208092e+02    12   Male\n280              6294 3.000000e-01     2   Male\n281              8688 6.688963e-03     3 Female\n282              8108 2.505017e+00     1 Female\n283              6926 1.481605e+00     1   Male\n284              5880 3.000000e-01     1 Female\n285              5529 5.183946e-01     2 Female\n286              8963 3.000000e-01     5 Female\n287              9594 1.872910e-01     5   Male\n288              8075 3.678930e-01     4 Female\n289              5680 3.000000e-01     2   Male\n290              5617 4.529851e+02    NA Female\n291              5080 3.169725e+01     6 Female\n292              7719 3.000000e-01     8   Male\n293              6780 4.922018e+01    15   Male\n294              8768 2.548507e+02    11   Male\n295              7031 1.661850e+02    14   Male\n296              7740 9.164179e+02     6   Male\n297              8855 3.678930e-01    10 Female\n298              7241 1.236994e+02    12   Male\n299              8156 6.705202e+01    14   Male\n300              7333 3.834862e+01    10   Male\n301              6906 1.963211e+00     1 Female\n302              9511 3.000000e-01     3   Male\n303              9336 2.474916e-01     2   Male\n304              6644 3.000000e-01     3 Female\n305              5554 2.173913e-01     4   Male\n306              8094 8.193980e-01     3   Male\n307              8836 2.444816e+00     4 Female\n308              7147 3.000000e-01     4   Male\n309              7745 1.571906e-01     1 Female\n310              9345 1.849711e+02     7   Male\n311              5606 6.119403e+02    11 Female\n312              9766 3.000000e-01     7 Female\n313              6666 4.280936e-01     5 Female\n314              9965 9.698997e-02    10   Male\n315              7927 3.678930e-02     9 Female\n316              6266 4.832090e+02    13   Male\n317              9487 1.390173e+02    11 Female\n318              7089 
3.000000e-01    13   Male\n319              5731 6.555970e+02     9 Female\n320              7962 1.526012e+02    15 Female\n321              9532 3.000000e-01     7 Female\n322              6687 7.222222e-01     4   Male\n323              6570 7.724426e+01     1   Male\n324              5781 3.000000e-01     1   Male\n325              8935 6.111111e-01     2 Female\n326              5780 1.555556e+00     2 Female\n327              9029 3.055556e-01     3   Male\n328              5668 1.500000e+00     2   Male\n329              8203 1.470772e+02     3   Male\n330              7381 1.694444e+00     4 Female\n331              7734 3.138298e+02     7 Female\n332              7257 1.414405e+02    11 Female\n333              8418 1.990605e+02    10 Female\n334              8259 4.212766e+02     5   Male\n335              5587 3.000000e-01     8   Male\n336              8499 3.000000e-01    15   Male\n337              7897 6.478723e+02    14   Male\n338              8300 3.000000e-01     2   Male\n339              9691 2.222222e+00     2 Female\n340              5873 3.000000e-01     2   Male\n341              6690 2.055556e+00     5   Male\n342              9970 2.777778e-02     4 Female\n343              8978 8.333333e-02     3   Male\n344              6181 1.032359e+02     5 Female\n345              8218 1.611111e+00     4 Female\n346              5387 8.333333e-02     2 Female\n347              7850 2.333333e+00     1 Female\n348              7326 5.755319e+02     7   Male\n349              8448 1.686848e+02     8 Female\n350              7264 1.111111e-01    NA   Male\n351              8361 3.000000e-01     9   Male\n352              7497 8.372340e+02     8 Female\n353              5559 3.000000e-01     5   Male\n354              7321 3.784504e+01    14   Male\n355              8372 3.819149e+02    14   Male\n356              5030 5.555556e-02     7 Female\n357              6936 3.000000e+02    13 Female\n358              9628 1.855950e+02     2   Male\n359          
    8558 1.944444e-01     1 Female\n360              7840 3.000000e-01     1   Male\n361              5100 5.555556e-02     4 Female\n362              8244 1.138889e+00     3   Male\n363              9115 4.254237e+01     4 Female\n364              5489 3.000000e-01     3   Male\n365              5766 3.000000e-01     1   Male\n366              5024 3.000000e-01     5 Female\n367              8599 3.000000e-01     4 Female\n368              8895 3.138298e+02     4 Female\n369              7708 1.235908e+02     4   Male\n370              7646 4.159574e+02    11   Male\n371              6640 3.009685e+01    15 Female\n372              8958 1.567850e+02    12 Female\n373              6477 1.367432e+02    11 Female\n374              7910 3.731235e+01     8 Female\n375              7829 9.164927e+01    13   Male\n376              7503 2.936170e+02    10 Female\n377              5209 8.820459e+01    10 Female\n378              6763 1.035491e+02    15   Male\n379              8976 7.379958e+01     8 Female\n380              9223 3.000000e-01    14   Male\n381              7692 1.718750e+02     4   Male\n382              7453 2.128527e+00     1   Male\n383              9775 1.253918e+00     5 Female\n384              9662 2.382445e-01     2   Male\n385              8733 4.639498e-01     2 Female\n386              5695 1.253918e-01     4   Male\n387              7714 1.253918e-01     4   Male\n388              9224 3.000000e-01     2 Female\n389              7635 1.000000e+00     3   Male\n390              7176 1.570043e+02    11   Male\n391              6102 4.344086e+02    10 Female\n392              7817 2.184953e+00     6   Male\n393              9719 1.507837e+00    12 Female\n394              9740 3.228840e-01    10 Female\n395              9528 4.588024e+01     8   Male\n396              7142 1.660560e+02     8   Male\n397              5689 3.000000e-01    13   Male\n398              5439 3.043011e+02    10   Male\n399              6718 2.612903e+02    13 Female\n400 
             6569 1.621767e+02    10   Male\n401              9444 3.228840e-01     2   Male\n402              6964 4.639498e-01     4 Female\n403              6420 2.495298e+00     3 Female\n404              9189 3.257053e+00     2 Female\n405              9368 3.793103e-01     1 Female\n406              6360           NA     3   Male\n407              8196 6.896552e-02     3 Female\n408              8297 3.000000e-01     4   Male\n409              6674 1.423197e+00     5 Female\n410              5269 3.000000e-01     5 Female\n411              6599 3.000000e-01     1 Female\n412              7713 1.786638e+02    11   Male\n413              8644 3.279570e+02     6   Male\n414              9680           NA    14 Female\n415              6305 1.903017e+02     8   Male\n416              8493 1.654095e+02     8 Female\n417              5297 4.639498e-01     9 Female\n418              7723 1.815733e+02     7   Male\n419              7510 1.366771e+00     6   Male\n420              5102 1.536050e-01    12 Female\n421              7816 1.306587e+01     8   Male\n422              5143 2.129032e+02    11 Female\n423              7414 1.925647e+02    14   Male\n424              5127 3.000000e-01     3 Female\n425              5830 1.028213e+00     1 Female\n426              8929 3.793103e-01     5 Female\n427              7993 8.025078e-01     2 Female\n428              8092 4.860215e+02     3 Female\n429              9750 3.000000e-01     4 Female\n430              6660 2.100313e-01     2   Male\n431              8054 2.767665e+01     3 Female\n432              6086 1.592476e+00     4   Male\n433              6878 9.717868e-02     1 Female\n434              8125 1.028213e+00     7 Female\n435              9500 3.793103e-01    10   Male\n436              8105 1.292026e+02    11   Male\n437              9593 4.425150e+01     7 Female\n438              5202 3.193548e+02    10 Female\n439              7207 1.860991e+02    14 Female\n440              5518 6.614420e-01     7 
Female\n441              9820 5.203762e-01    11   Male\n442              6958 1.330819e+02    12   Male\n443              9445 1.673491e+02    10 Female\n444              8774 3.000000e-01     6   Male\n445              9614 1.117457e+02    13   Male\n446              9810 3.045509e+01     8 Female\n447              7271 3.000000e-01     2   Male\n448              8031 8.280255e-02     3 Female\n449              7232 3.000000e-01     1 Female\n450              7452 1.200637e+00     2 Female\n451              5921 1.687898e-01    NA   Male\n452              8136 7.367273e+02    NA Female\n453              6605 8.280255e-02     4   Male\n454              5125 5.127389e-01     4   Male\n455              5911 1.974522e-01     1   Male\n456              9644 7.993631e-01     2 Female\n457              5760 3.000000e-01     2   Male\n458              7055 3.298182e+02    12   Male\n459              9064 9.736842e+01    12 Female\n460              6925 3.000000e-01     8 Female\n461              7757 3.000000e-01    14 Female\n462              8527 4.214545e+02    13 Female\n463              8521 3.000000e-01     6   Male\n464              6260 2.578182e+02    11 Female\n465              9578 2.261147e-01    11   Male\n466              9570 3.000000e-01    10 Female\n467              6246 1.883901e+02    12   Male\n468              9622 9.458204e+01    14 Female\n469              7661 3.000000e-01    11 Female\n470              9374 3.000000e-01     1   Male\n471              8446 7.707006e-01     2 Female\n472              8332 5.032727e+02     3   Male\n473              8008 1.544586e+00     3 Female\n474              9365 1.431115e+02     5 Female\n475              9819 3.000000e-01     3   Male\n476              5173 1.458599e+00     1   Male\n477              6722 1.247678e+02     4 Female\n478              7668           NA     4 Female\n479              8980 4.334545e+02     4   Male\n480              5204 3.000000e-01     2 Female\n481              6412 
6.156364e+02     5 Female\n482              6404 9.574303e+01     7   Male\n483              5693 1.928019e+02     8   Male\n484              8100 1.888545e+02    10   Male\n485              9760 1.598297e+02     6 Female\n486              6377 5.127389e-01     7   Male\n487              6012 1.171053e+02    10 Female\n488              6224           NA     6   Male\n489              6561 2.547771e-02     6 Female\n490              8475 1.707430e+02    15 Female\n491              6629 3.000000e-01     5   Male\n492              7200 1.869969e+02     3   Male\n493              9453 4.731481e+01     5   Male\n494              6449 1.988390e+02     3 Female\n495              9452 3.000000e-01     5   Male\n496              7162 8.808050e+01     5   Male\n497              8962 2.003185e+00     1 Female\n498              7328 3.000000e-01     1   Male\n499              9097 3.509259e+01     7 Female\n500              9131 9.365325e+01    14 Female\n501              7280 3.000000e-01     9   Male\n502              5783 3.736111e+01    10 Female\n503              9895 1.674923e+02    10 Female\n504              7986 8.808050e+01    11   Male\n505              7146 1.656347e+02    11 Female\n506              8671 3.722222e+01    12 Female\n507              5273 6.756364e+02    11 Female\n508              5063 3.000000e-01    12   Male\n509              6729 1.698142e+02    12   Male\n510              9085 1.628483e+02    10 Female\n511              9929 5.985130e-01     1   Male\n512              8479 1.903346e+00     2 Female\n513              7395 3.000000e-01     4   Male\n514              6374 3.000000e-01     2   Male\n515              7878 8.996283e-01     3   Male\n516              9603 3.977695e-01     3 Female\n517              7994 3.000000e-01     2   Male\n518              5277 3.000000e-01     4   Male\n519              5054 3.000000e-01     3   Male\n520              5440 3.000000e-01     1 Female\n521              6551 7.446809e+02     4   Male\n522          
    5281 6.095745e+02    12 Female\n523              7145 1.427445e+02     6   Male\n524              5275 3.000000e-01     7 Female\n525              9542 2.973978e-02     7   Male\n526              9371 3.977695e-01    13 Female\n527              5598 4.095745e+02     8 Female\n528              7148 4.595745e+02     7   Male\n529              5624 3.000000e-01     8 Female\n530              6998 1.976341e+02     8 Female\n531              9286 3.776596e+02    11 Female\n532              7589 1.777603e+02    14 Female\n533              7095 4.312268e-01     3   Male\n534              5455 6.765957e+02     2 Female\n535              6257 7.978723e+02     2   Male\n536              8627 9.665427e-02     3   Male\n537              9786 1.879338e+02     2   Male\n538              8176 4.358670e+01     2 Female\n539              9198 3.000000e-01     3 Female\n540              6586 3.000000e-01     2   Male\n541              8850 2.638955e+01     5   Male\n542              9560 3.180523e+01    10 Female\n543              7144 1.746845e+02    14   Male\n544              8230 1.876972e+02     9   Male\n545              7559 1.044164e+02     6   Male\n546              5312 1.202681e+02     7   Male\n547              6560 1.630915e+02    14 Female\n548              6091 1.276025e+02     7 Female\n549              5578 8.880126e+01     7   Male\n550              5837 3.563830e+02     9   Male\n551              8347 2.212766e+02    14   Male\n552              6453 1.969121e+01    10 Female\n553              5758 3.755319e+02    13 Female\n554              5569 1.214511e+02     5   Male\n555              8766 1.034700e+02     4 Female\n556              8002 3.000000e-01     4 Female\n557              7839 3.643123e-01     5 Female\n558              5434 6.319703e-02     4 Female\n559              7636 3.000000e-01     4   Male\n560              6164 3.000000e-01     4   Male\n561              9243 3.000000e-01     3 Female\n562              5872 3.000000e-01     1 Female\n563 
             8079 3.000000e-01     4   Male\n564              9762 3.000000e-01     1   Male\n565              9476 3.000000e-01     1 Female\n566              8345 3.000000e-01     7   Male\n567              8128 1.664038e+02    13 Female\n568              7956 2.946809e+02    10 Female\n569              8677 4.391924e+01    14   Male\n570              5881 1.874606e+02    12 Female\n571              7498 1.143533e+02    14   Male\n572              8134 1.600158e+02     8   Male\n573              7748 1.635688e-01     7   Male\n574              7990 8.809148e+01    11 Female\n575              6184 1.337539e+02     8   Male\n576              6339 1.985804e+02    12   Male\n577              5113 1.578864e+02     9 Female\n578              9449 3.000000e-01     5 Female\n579              8110 3.000000e-01     4   Male\n580              9307 1.953642e-01     3 Female\n581              5555 1.119205e+00     2   Male\n582              9152 2.523636e+02     2   Male\n583              7969 3.000000e-01     3   Male\n584              6116 4.844371e+00     4 Female\n585              8294 3.000000e-01     4   Male\n586              8938 1.492553e+02     4 Female\n587              9539 1.993617e+02     5   Male\n588              9470 2.847682e-01     3 Female\n589              6677 3.145695e-01     6 Female\n590              8752 3.000000e-01     3   Male\n591              5574 3.406429e+01    11 Female\n592              5989 6.595745e+01    11   Male\n593              9813 3.000000e-01     7   Male\n594              6150 2.174545e+02     8   Male\n595              5730           NA     6 Female\n596              8038 5.957447e+01    10 Female\n597              5964 7.236364e+02     8 Female\n598              9043 3.000000e-01     8   Male\n599              5095 3.000000e-01     9 Female\n600              8922 3.000000e-01     8   Male\n601              5469 2.676364e+02    13   Male\n602              6726 1.891489e+02    11   Male\n603              7495 3.036364e+02     8 
Female\n604              8159 3.000000e-01     2 Female\n605              6709 3.000000e-01     4   Male\n606              5855 3.000000e-01     2   Male\n607              6058 3.000000e-01     2 Female\n608              7292 3.000000e-01     4   Male\n609              6437 1.447020e+00     2   Male\n610              9326 2.130909e+02     4 Female\n611              8222 1.357616e-01     2 Female\n612              6789 3.000000e-01     4 Female\n613              6348 3.000000e-01     1 Female\n614              5958 5.534545e+02     4 Female\n615              9211 1.891489e+02    12 Female\n616              9450 7.202128e+01     7 Female\n617              6540 3.250287e+01    11   Male\n618              8796 1.655629e-02     6   Male\n619              7971 3.123636e+02     8   Male\n620              7549 3.000000e-01    14   Male\n621              9799 7.138298e+01    11   Male\n622              7013 3.000000e-01     7 Female\n623              5599 6.946809e+01    14 Female\n624              8601 4.012629e+01     6   Male\n625              7383 1.629787e+02    13 Female\n626              6656 1.508511e+02    13 Female\n627              5641 1.655629e-02     3   Male\n628              6222 3.000000e-01     1   Male\n629              7674 4.635762e-02     3   Male\n630              5293 3.000000e-01     1 Female\n631              6715 3.000000e-01     1 Female\n632              7057 3.000000e-01     2   Male\n633              7072 1.942553e+02     4   Male\n634              6380 3.690909e+02     4   Male\n635              6762 3.000000e-01     2 Female\n636              5799 3.000000e-01     4 Female\n637              6681 2.847682e+00     5   Male\n638              8755 1.435106e+02     3 Female\n639              6896 3.000000e-01     3   Male\n640              5945 4.752009e+01     6 Female\n641              5035 2.621125e+01    11 Female\n642              6776 1.055319e+02     9 Female\n643              7863 3.000000e-01     7 Female\n644              9836 
1.149007e+00     8   Male\n645              7860 2.927273e+02    NA Female\n646              5248 3.000000e-01     8 Female\n647              5677 3.000000e-01    14 Female\n648              9576 4.839265e+01    10   Male\n649              5824 3.000000e-01    10   Male\n650              9184 3.000000e-01    11 Female\n651              5397 2.251656e-01    13 Female\n\n\n\ndf$slum &lt;- NULL # this is the same as above\n\nWe can also grab the age column using the $ operator.\n\ndf$age\n\n  [1] 3.176895e-01 3.436823e+00 3.000000e-01 1.432363e+02 4.476534e-01\n  [6] 2.527076e-02 6.101083e-01 3.000000e-01 2.916968e+00 1.649819e+00\n [11] 4.574007e+00 1.583904e+02           NA 1.065068e+02 1.113870e+02\n [16] 4.144893e+01 3.000000e-01 2.527076e-01 8.159247e+01 1.825342e+02\n [21] 4.244656e+01 1.193493e+02 3.000000e-01 3.000000e-01 9.025271e-01\n [26] 3.501805e-01 3.000000e-01 1.227437e+00 1.702055e+02 3.000000e-01\n [31] 4.801444e-01 2.527076e-02 3.000000e-01 5.776173e-02 4.801444e-01\n [36] 3.826715e-01 3.000000e-01 4.048558e+02 3.000000e-01 5.451264e-01\n [41] 3.000000e-01 5.590753e+01 2.202166e-01 1.709760e+02 1.227437e+00\n [46] 4.567527e+02 4.838480e+01 1.227437e-01 1.877256e-01 3.000000e-01\n [51] 3.501805e-01 3.339350e+00 3.000000e-01 5.451264e-01           NA\n [56] 2.104693e+00           NA 3.826715e-01 3.926366e+01 1.129964e+00\n [61] 3.501805e+00 7.542808e+01 4.800475e+01 1.000000e+00 4.068884e+01\n [66] 3.000000e-01 4.377672e+01 1.193493e+02 6.977740e+01 1.373288e+02\n [71] 1.642979e+02           NA 1.542808e+02 6.033058e-01 2.809917e-01\n [76] 1.966942e+00 2.041322e+00 2.115702e+00 4.663043e+02 3.000000e-01\n [81] 1.500796e+02 1.543790e+02 2.561983e-01 1.596338e+02 1.732484e+02\n [86] 4.641304e+02 3.736364e+01 1.572452e+02 3.000000e-01 3.000000e-01\n [91] 8.264463e-02 6.776859e-01 7.272727e-01 2.066116e-01 1.966942e+00\n [96] 3.000000e-01 3.000000e-01 2.809917e-01 8.016529e-01 1.818182e-01\n[101] 1.818182e-01 8.264463e-02 3.422727e+01 8.743506e+00 
3.000000e-01\n[106] 1.641720e+02 4.049587e-01 1.001592e+02 4.489130e+02 1.101911e+02\n[111] 4.440909e+01 1.288217e+02 2.840909e+01 1.003981e+02 8.512397e-01\n[116] 1.322314e-01 1.297521e+00 1.570248e-01 1.966942e+00 1.536624e+02\n[121] 3.000000e-01 3.000000e-01 1.074380e+00 1.099174e+00 3.057851e-01\n[126] 3.000000e-01 5.785124e-02 4.391304e+02 6.130435e+02 1.074380e-01\n[131] 7.125796e+01 4.222727e+01 1.620223e+02 3.750000e+01 1.534236e+02\n[136] 6.239130e+02 5.521739e+02 5.785124e-02 6.547945e-01 8.767123e-02\n[141] 3.000000e-01 2.849315e+00 3.835616e-02 2.849315e-01 4.649315e+00\n[146] 1.369863e-01 3.589041e-01 1.049315e+00 4.668998e+01 1.473510e+02\n[151] 4.589744e+01 2.109589e-01 1.741722e+02 2.496503e+01 1.850993e+02\n[156] 1.863014e-01 1.863014e-01 4.589744e+01 1.942881e+02 5.079646e+02\n[161] 8.767123e-01 2.750685e+00 1.503311e+02 3.000000e-01 3.095890e-01\n[166] 3.000000e-01 6.371681e+02 6.054795e-01 1.955298e+02 1.786424e+02\n[171] 1.120861e+02 1.331954e+02 2.159292e+02 5.628319e+02 1.900662e+02\n[176] 6.547945e-01 1.665753e+00 1.739238e+02 9.991722e+01 9.321192e+01\n[181] 8.767123e-02           NA 6.794521e-01 5.808219e-01 1.369863e-01\n[186] 2.060274e+00 1.610099e+02 4.082192e-01 8.273973e-01 4.601770e+02\n[191] 1.389073e+02 3.867133e+01 9.260274e-01 5.918874e+01 1.870861e+02\n[196] 4.328767e-01 6.301370e-02 3.000000e-01 1.548013e+02 5.819536e+01\n[201] 1.724338e+02 1.932401e+01 2.164420e+00 9.757412e-01 1.509434e-01\n[206] 1.509434e-01 7.766571e+01 4.319563e+01 1.752022e-01 3.094775e+01\n[211] 1.266846e-01 2.919806e+01 9.545455e+00 2.735115e+01 1.314841e+02\n[216] 3.643985e+01 1.498559e+02 9.363636e+00 2.479784e-01 5.390836e-02\n[221] 8.787062e-01 1.994609e-01 3.000000e-01 3.000000e-01 5.390836e-03\n[226] 4.177898e-01 3.000000e-01 2.479784e-01 2.964960e-02 2.964960e-01\n[231] 5.148248e+00 1.994609e-01 3.000000e-01 1.779539e+02 3.290210e+02\n[236] 3.000000e-01 1.809798e+02 4.905660e-01 1.266846e-01 1.543948e+02\n[241] 1.379683e+02 6.153846e+02 
1.474784e+02 3.000000e-01 1.024259e+00\n[246] 4.444056e+02 3.000000e-01 2.504043e+00 3.000000e-01 3.000000e-01\n[251] 7.816712e-02 3.000000e-01 5.390836e-02 1.494236e+02 5.972622e+01\n[256] 6.361186e-01 1.837896e+02 1.320809e+02 1.571906e-01 1.520231e+02\n[261] 3.000000e-01 3.000000e-01 1.823699e+02 3.000000e-01 2.173913e+00\n[266] 2.142202e+01 3.000000e-01 3.408027e+00 4.155963e+01 9.698997e-02\n[271] 1.238532e+01 9.528926e+00 1.916185e+02 1.060201e+00 3.679104e+02\n[276] 4.288991e+01 9.971098e+01 3.000000e-01 1.208092e+02 3.000000e-01\n[281] 6.688963e-03 2.505017e+00 1.481605e+00 3.000000e-01 5.183946e-01\n[286] 3.000000e-01 1.872910e-01 3.678930e-01 3.000000e-01 4.529851e+02\n[291] 3.169725e+01 3.000000e-01 4.922018e+01 2.548507e+02 1.661850e+02\n[296] 9.164179e+02 3.678930e-01 1.236994e+02 6.705202e+01 3.834862e+01\n[301] 1.963211e+00 3.000000e-01 2.474916e-01 3.000000e-01 2.173913e-01\n[306] 8.193980e-01 2.444816e+00 3.000000e-01 1.571906e-01 1.849711e+02\n[311] 6.119403e+02 3.000000e-01 4.280936e-01 9.698997e-02 3.678930e-02\n[316] 4.832090e+02 1.390173e+02 3.000000e-01 6.555970e+02 1.526012e+02\n[321] 3.000000e-01 7.222222e-01 7.724426e+01 3.000000e-01 6.111111e-01\n[326] 1.555556e+00 3.055556e-01 1.500000e+00 1.470772e+02 1.694444e+00\n[331] 3.138298e+02 1.414405e+02 1.990605e+02 4.212766e+02 3.000000e-01\n[336] 3.000000e-01 6.478723e+02 3.000000e-01 2.222222e+00 3.000000e-01\n[341] 2.055556e+00 2.777778e-02 8.333333e-02 1.032359e+02 1.611111e+00\n[346] 8.333333e-02 2.333333e+00 5.755319e+02 1.686848e+02 1.111111e-01\n[351] 3.000000e-01 8.372340e+02 3.000000e-01 3.784504e+01 3.819149e+02\n[356] 5.555556e-02 3.000000e+02 1.855950e+02 1.944444e-01 3.000000e-01\n[361] 5.555556e-02 1.138889e+00 4.254237e+01 3.000000e-01 3.000000e-01\n[366] 3.000000e-01 3.000000e-01 3.138298e+02 1.235908e+02 4.159574e+02\n[371] 3.009685e+01 1.567850e+02 1.367432e+02 3.731235e+01 9.164927e+01\n[376] 2.936170e+02 8.820459e+01 1.035491e+02 7.379958e+01 3.000000e-01\n[381] 
1.718750e+02 2.128527e+00 1.253918e+00 2.382445e-01 4.639498e-01\n[386] 1.253918e-01 1.253918e-01 3.000000e-01 1.000000e+00 1.570043e+02\n[391] 4.344086e+02 2.184953e+00 1.507837e+00 3.228840e-01 4.588024e+01\n[396] 1.660560e+02 3.000000e-01 3.043011e+02 2.612903e+02 1.621767e+02\n[401] 3.228840e-01 4.639498e-01 2.495298e+00 3.257053e+00 3.793103e-01\n[406]           NA 6.896552e-02 3.000000e-01 1.423197e+00 3.000000e-01\n[411] 3.000000e-01 1.786638e+02 3.279570e+02           NA 1.903017e+02\n[416] 1.654095e+02 4.639498e-01 1.815733e+02 1.366771e+00 1.536050e-01\n[421] 1.306587e+01 2.129032e+02 1.925647e+02 3.000000e-01 1.028213e+00\n[426] 3.793103e-01 8.025078e-01 4.860215e+02 3.000000e-01 2.100313e-01\n[431] 2.767665e+01 1.592476e+00 9.717868e-02 1.028213e+00 3.793103e-01\n[436] 1.292026e+02 4.425150e+01 3.193548e+02 1.860991e+02 6.614420e-01\n[441] 5.203762e-01 1.330819e+02 1.673491e+02 3.000000e-01 1.117457e+02\n[446] 3.045509e+01 3.000000e-01 8.280255e-02 3.000000e-01 1.200637e+00\n[451] 1.687898e-01 7.367273e+02 8.280255e-02 5.127389e-01 1.974522e-01\n[456] 7.993631e-01 3.000000e-01 3.298182e+02 9.736842e+01 3.000000e-01\n[461] 3.000000e-01 4.214545e+02 3.000000e-01 2.578182e+02 2.261147e-01\n[466] 3.000000e-01 1.883901e+02 9.458204e+01 3.000000e-01 3.000000e-01\n[471] 7.707006e-01 5.032727e+02 1.544586e+00 1.431115e+02 3.000000e-01\n[476] 1.458599e+00 1.247678e+02           NA 4.334545e+02 3.000000e-01\n[481] 6.156364e+02 9.574303e+01 1.928019e+02 1.888545e+02 1.598297e+02\n[486] 5.127389e-01 1.171053e+02           NA 2.547771e-02 1.707430e+02\n[491] 3.000000e-01 1.869969e+02 4.731481e+01 1.988390e+02 3.000000e-01\n[496] 8.808050e+01 2.003185e+00 3.000000e-01 3.509259e+01 9.365325e+01\n[501] 3.000000e-01 3.736111e+01 1.674923e+02 8.808050e+01 1.656347e+02\n[506] 3.722222e+01 6.756364e+02 3.000000e-01 1.698142e+02 1.628483e+02\n[511] 5.985130e-01 1.903346e+00 3.000000e-01 3.000000e-01 8.996283e-01\n[516] 3.977695e-01 3.000000e-01 3.000000e-01 3.000000e-01 
3.000000e-01\n[521] 7.446809e+02 6.095745e+02 1.427445e+02 3.000000e-01 2.973978e-02\n[526] 3.977695e-01 4.095745e+02 4.595745e+02 3.000000e-01 1.976341e+02\n[531] 3.776596e+02 1.777603e+02 4.312268e-01 6.765957e+02 7.978723e+02\n[536] 9.665427e-02 1.879338e+02 4.358670e+01 3.000000e-01 3.000000e-01\n[541] 2.638955e+01 3.180523e+01 1.746845e+02 1.876972e+02 1.044164e+02\n[546] 1.202681e+02 1.630915e+02 1.276025e+02 8.880126e+01 3.563830e+02\n[551] 2.212766e+02 1.969121e+01 3.755319e+02 1.214511e+02 1.034700e+02\n[556] 3.000000e-01 3.643123e-01 6.319703e-02 3.000000e-01 3.000000e-01\n[561] 3.000000e-01 3.000000e-01 3.000000e-01 3.000000e-01 3.000000e-01\n[566] 3.000000e-01 1.664038e+02 2.946809e+02 4.391924e+01 1.874606e+02\n[571] 1.143533e+02 1.600158e+02 1.635688e-01 8.809148e+01 1.337539e+02\n[576] 1.985804e+02 1.578864e+02 3.000000e-01 3.000000e-01 1.953642e-01\n[581] 1.119205e+00 2.523636e+02 3.000000e-01 4.844371e+00 3.000000e-01\n[586] 1.492553e+02 1.993617e+02 2.847682e-01 3.145695e-01 3.000000e-01\n[591] 3.406429e+01 6.595745e+01 3.000000e-01 2.174545e+02           NA\n[596] 5.957447e+01 7.236364e+02 3.000000e-01 3.000000e-01 3.000000e-01\n[601] 2.676364e+02 1.891489e+02 3.036364e+02 3.000000e-01 3.000000e-01\n[606] 3.000000e-01 3.000000e-01 3.000000e-01 1.447020e+00 2.130909e+02\n[611] 1.357616e-01 3.000000e-01 3.000000e-01 5.534545e+02 1.891489e+02\n[616] 7.202128e+01 3.250287e+01 1.655629e-02 3.123636e+02 3.000000e-01\n[621] 7.138298e+01 3.000000e-01 6.946809e+01 4.012629e+01 1.629787e+02\n[626] 1.508511e+02 1.655629e-02 3.000000e-01 4.635762e-02 3.000000e-01\n[631] 3.000000e-01 3.000000e-01 1.942553e+02 3.690909e+02 3.000000e-01\n[636] 3.000000e-01 2.847682e+00 1.435106e+02 3.000000e-01 4.752009e+01\n[641] 2.621125e+01 1.055319e+02 3.000000e-01 1.149007e+00 2.927273e+02\n[646] 3.000000e-01 3.000000e-01 4.839265e+01 3.000000e-01 3.000000e-01\n[651] 2.251656e-01"
+    "text": "Using indexing to subset by columns\nWe can also subset data frames and matrices (2-dimensional objects) using the bracket [ row , column ]. We can subset by columns and pull the x column using the index of the column or the column name. Leaving either row or column dimension blank means to select all of them.\nFor example, here I am pulling the 3rd column, which has the variable name age, for all of rows.\n\ndf[ , \"age\"] #same as df[ , 3]\n\nWe can select multiple columns using multiple column names, again this is selecting these variables for all of the rows.\n\ndf[, c(\"age\", \"gender\")] #same as df[ , c(3,4)]\n\n    age gender\n1     2 Female\n2     4 Female\n3     4   Male\n4     4   Male\n5     1   Male\n6     4   Male\n7     4 Female\n8    NA Female\n9     4   Male\n10    2   Male\n11    3   Male\n12   15 Female\n13    8   Male\n14   12   Male\n15   15   Male\n16    9   Male\n17    8   Male\n18    7 Female\n19   11 Female\n20   10   Male\n21    8   Male\n22   11 Female\n23    2   Male\n24    2 Female\n25    3 Female\n26    5   Male\n27    1   Male\n28    3 Female\n29    5 Female\n30    5 Female\n31    3   Male\n32    1   Male\n33    4 Female\n34    3   Male\n35    2 Female\n36   11 Female\n37    7   Male\n38    8   Male\n39    6   Male\n40    6   Male\n41   11 Female\n42   10   Male\n43    6 Female\n44   12   Male\n45   11   Male\n46   10   Male\n47   11   Male\n48   13 Female\n49    3 Female\n50    4 Female\n51    3   Male\n52    1   Male\n53    2 Female\n54    2 Female\n55    4   Male\n56    2   Male\n57    2   Male\n58    3 Female\n59    3 Female\n60    4   Male\n61    1 Female\n62   13 Female\n63   13 Female\n64    6   Male\n65   13   Male\n66    5 Female\n67   13 Female\n68   14   Male\n69   13   Male\n70    8 Female\n71    7   Male\n72    6 Female\n73   13   Male\n74    3   Male\n75    4   Male\n76    2   Male\n77   NA   Male\n78    5 Female\n79    3   Male\n80    3   Male\n81   14   Male\n82   11 Female\n83    7 Female\n84    7   
Male\n85   11 Female\n86    9 Female\n87   14   Male\n88   13 Female\n89    1   Male\n90    1   Male\n91    4   Male\n92    1 Female\n93    2   Male\n94    3 Female\n95    2   Male\n96    1   Male\n97    2   Male\n98    2 Female\n99    4 Female\n100   5 Female\n101   5   Male\n102   6 Female\n103  14 Female\n104  14   Male\n105  10   Male\n106   6 Female\n107   6   Male\n108   8   Male\n109   6 Female\n110  12 Female\n111  12   Male\n112  14 Female\n113  15   Male\n114  12 Female\n115   4 Female\n116   4   Male\n117   3 Female\n118  NA   Male\n119   2 Female\n120   3   Male\n121  NA Female\n122   3 Female\n123   3   Male\n124   2 Female\n125   4 Female\n126  10 Female\n127   7 Female\n128  11 Female\n129   6 Female\n130  11   Male\n131   9   Male\n132   6   Male\n133  13 Female\n134  10 Female\n135   6 Female\n136  11 Female\n137   7   Male\n138   6 Female\n139   4 Female\n140   4 Female\n141   4   Male\n142   4 Female\n143   4   Male\n144   4   Male\n145   3   Male\n146   4 Female\n147   3   Male\n148   3   Male\n149  13 Female\n150   7 Female\n151  10   Male\n152   6   Male\n153  10 Female\n154  12 Female\n155  10   Male\n156  10   Male\n157  13   Male\n158  13 Female\n159   5 Female\n160   3 Female\n161   4   Male\n162   1   Male\n163   3 Female\n164   4   Male\n165   4   Male\n166   1   Male\n167   5 Female\n168   6 Female\n169  14 Female\n170   6   Male\n171  13 Female\n172   9   Male\n173  11   Male\n174  10   Male\n175   5 Female\n176  14   Male\n177   7   Male\n178  10   Male\n179   6   Male\n180   5   Male\n181   3 Female\n182   4   Male\n183   2 Female\n184   3   Male\n185   3 Female\n186   2 Female\n187   3   Male\n188   5 Female\n189   2   Male\n190   3 Female\n191  14 Female\n192   9 Female\n193  14 Female\n194   9 Female\n195   8 Female\n196   7   Male\n197  13   Male\n198   8 Female\n199   6   Male\n200  12 Female\n201  14 Female\n202  15 Female\n203   2 Female\n204   4 Female\n205   3   Male\n206   3 Female\n207   3   Male\n208   4 Female\n209   3   
Male\n210  14 Female\n211   8   Male\n212   7   Male\n213  14 Female\n214  13 Female\n215  13 Female\n216   7   Male\n217   8 Female\n218  10 Female\n219   9   Male\n220   9 Female\n221   3 Female\n222   4   Male\n223   4 Female\n224   4   Male\n225   2 Female\n226   1 Female\n227   3 Female\n228   2   Male\n229   3   Male\n230   5   Male\n231   2 Female\n232   2   Male\n233   9   Male\n234  13   Male\n235  10 Female\n236   6   Male\n237  13 Female\n238  11   Male\n239  10   Male\n240   8 Female\n241   9 Female\n242  10   Male\n243  14   Male\n244   1 Female\n245   2   Male\n246   3 Female\n247   2   Male\n248   3 Female\n249   2 Female\n250   3 Female\n251   5 Female\n252  10 Female\n253   7   Male\n254  13 Female\n255  15   Male\n256  11 Female\n257  10 Female\n258   3 Female\n259   2   Male\n260   3   Male\n261   3 Female\n262   3 Female\n263   4   Male\n264   3   Male\n265   2   Male\n266   4   Male\n267   2 Female\n268   8   Male\n269  11   Male\n270   6   Male\n271  14 Female\n272  14   Male\n273   5 Female\n274   5   Male\n275  10 Female\n276  13   Male\n277   6   Male\n278   5   Male\n279  12   Male\n280   2   Male\n281   3 Female\n282   1 Female\n283   1   Male\n284   1 Female\n285   2 Female\n286   5 Female\n287   5   Male\n288   4 Female\n289   2   Male\n290  NA Female\n291   6 Female\n292   8   Male\n293  15   Male\n294  11   Male\n295  14   Male\n296   6   Male\n297  10 Female\n298  12   Male\n299  14   Male\n300  10   Male\n301   1 Female\n302   3   Male\n303   2   Male\n304   3 Female\n305   4   Male\n306   3   Male\n307   4 Female\n308   4   Male\n309   1 Female\n310   7   Male\n311  11 Female\n312   7 Female\n313   5 Female\n314  10   Male\n315   9 Female\n316  13   Male\n317  11 Female\n318  13   Male\n319   9 Female\n320  15 Female\n321   7 Female\n322   4   Male\n323   1   Male\n324   1   Male\n325   2 Female\n326   2 Female\n327   3   Male\n328   2   Male\n329   3   Male\n330   4 Female\n331   7 Female\n332  11 Female\n333  10 Female\n334   5   
Male\n335   8   Male\n336  15   Male\n337  14   Male\n338   2   Male\n339   2 Female\n340   2   Male\n341   5   Male\n342   4 Female\n343   3   Male\n344   5 Female\n345   4 Female\n346   2 Female\n347   1 Female\n348   7   Male\n349   8 Female\n350  NA   Male\n351   9   Male\n352   8 Female\n353   5   Male\n354  14   Male\n355  14   Male\n356   7 Female\n357  13 Female\n358   2   Male\n359   1 Female\n360   1   Male\n361   4 Female\n362   3   Male\n363   4 Female\n364   3   Male\n365   1   Male\n366   5 Female\n367   4 Female\n368   4 Female\n369   4   Male\n370  11   Male\n371  15 Female\n372  12 Female\n373  11 Female\n374   8 Female\n375  13   Male\n376  10 Female\n377  10 Female\n378  15   Male\n379   8 Female\n380  14   Male\n381   4   Male\n382   1   Male\n383   5 Female\n384   2   Male\n385   2 Female\n386   4   Male\n387   4   Male\n388   2 Female\n389   3   Male\n390  11   Male\n391  10 Female\n392   6   Male\n393  12 Female\n394  10 Female\n395   8   Male\n396   8   Male\n397  13   Male\n398  10   Male\n399  13 Female\n400  10   Male\n401   2   Male\n402   4 Female\n403   3 Female\n404   2 Female\n405   1 Female\n406   3   Male\n407   3 Female\n408   4   Male\n409   5 Female\n410   5 Female\n411   1 Female\n412  11   Male\n413   6   Male\n414  14 Female\n415   8   Male\n416   8 Female\n417   9 Female\n418   7   Male\n419   6   Male\n420  12 Female\n421   8   Male\n422  11 Female\n423  14   Male\n424   3 Female\n425   1 Female\n426   5 Female\n427   2 Female\n428   3 Female\n429   4 Female\n430   2   Male\n431   3 Female\n432   4   Male\n433   1 Female\n434   7 Female\n435  10   Male\n436  11   Male\n437   7 Female\n438  10 Female\n439  14 Female\n440   7 Female\n441  11   Male\n442  12   Male\n443  10 Female\n444   6   Male\n445  13   Male\n446   8 Female\n447   2   Male\n448   3 Female\n449   1 Female\n450   2 Female\n451  NA   Male\n452  NA Female\n453   4   Male\n454   4   Male\n455   1   Male\n456   2 Female\n457   2   Male\n458  12   Male\n459  12 
Female\n460   8 Female\n461  14 Female\n462  13 Female\n463   6   Male\n464  11 Female\n465  11   Male\n466  10 Female\n467  12   Male\n468  14 Female\n469  11 Female\n470   1   Male\n471   2 Female\n472   3   Male\n473   3 Female\n474   5 Female\n475   3   Male\n476   1   Male\n477   4 Female\n478   4 Female\n479   4   Male\n480   2 Female\n481   5 Female\n482   7   Male\n483   8   Male\n484  10   Male\n485   6 Female\n486   7   Male\n487  10 Female\n488   6   Male\n489   6 Female\n490  15 Female\n491   5   Male\n492   3   Male\n493   5   Male\n494   3 Female\n495   5   Male\n496   5   Male\n497   1 Female\n498   1   Male\n499   7 Female\n500  14 Female\n501   9   Male\n502  10 Female\n503  10 Female\n504  11   Male\n505  11 Female\n506  12 Female\n507  11 Female\n508  12   Male\n509  12   Male\n510  10 Female\n511   1   Male\n512   2 Female\n513   4   Male\n514   2   Male\n515   3   Male\n516   3 Female\n517   2   Male\n518   4   Male\n519   3   Male\n520   1 Female\n521   4   Male\n522  12 Female\n523   6   Male\n524   7 Female\n525   7   Male\n526  13 Female\n527   8 Female\n528   7   Male\n529   8 Female\n530   8 Female\n531  11 Female\n532  14 Female\n533   3   Male\n534   2 Female\n535   2   Male\n536   3   Male\n537   2   Male\n538   2 Female\n539   3 Female\n540   2   Male\n541   5   Male\n542  10 Female\n543  14   Male\n544   9   Male\n545   6   Male\n546   7   Male\n547  14 Female\n548   7 Female\n549   7   Male\n550   9   Male\n551  14   Male\n552  10 Female\n553  13 Female\n554   5   Male\n555   4 Female\n556   4 Female\n557   5 Female\n558   4 Female\n559   4   Male\n560   4   Male\n561   3 Female\n562   1 Female\n563   4   Male\n564   1   Male\n565   1 Female\n566   7   Male\n567  13 Female\n568  10 Female\n569  14   Male\n570  12 Female\n571  14   Male\n572   8   Male\n573   7   Male\n574  11 Female\n575   8   Male\n576  12   Male\n577   9 Female\n578   5 Female\n579   4   Male\n580   3 Female\n581   2   Male\n582   2   Male\n583   3   Male\n584   4 
Female\n585   4   Male\n586   4 Female\n587   5   Male\n588   3 Female\n589   6 Female\n590   3   Male\n591  11 Female\n592  11   Male\n593   7   Male\n594   8   Male\n595   6 Female\n596  10 Female\n597   8 Female\n598   8   Male\n599   9 Female\n600   8   Male\n601  13   Male\n602  11   Male\n603   8 Female\n604   2 Female\n605   4   Male\n606   2   Male\n607   2 Female\n608   4   Male\n609   2   Male\n610   4 Female\n611   2 Female\n612   4 Female\n613   1 Female\n614   4 Female\n615  12 Female\n616   7 Female\n617  11   Male\n618   6   Male\n619   8   Male\n620  14   Male\n621  11   Male\n622   7 Female\n623  14 Female\n624   6   Male\n625  13 Female\n626  13 Female\n627   3   Male\n628   1   Male\n629   3   Male\n630   1 Female\n631   1 Female\n632   2   Male\n633   4   Male\n634   4   Male\n635   2 Female\n636   4 Female\n637   5   Male\n638   3 Female\n639   3   Male\n640   6 Female\n641  11 Female\n642   9 Female\n643   7 Female\n644   8   Male\n645  NA Female\n646   8 Female\n647  14 Female\n648  10   Male\n649  10   Male\n650  11 Female\n651  13 Female\n\n\nWe can remove select columns using indexing as well, OR by simply changing the column to NULL\n\ndf[, -5] #remove column 5, \"slum\" variable\n\n\ndf$slum &lt;- NULL # this is the same as above\n\nWe can also grab the age column using the $ operator, again this is selecting the variable for all of the rows.\n\ndf$age"
   },
   {
     "objectID": "modules/Module06-DataSubset.html#using-indexing-to-subset-by-rows",
     "href": "modules/Module06-DataSubset.html#using-indexing-to-subset-by-rows",
     "title": "Module 6: Get to Know Your Data and Subsetting",
     "section": "Using indexing to subset by rows",
-    "text": "Using indexing to subset by rows\nWe can use indexing to also subset by rows. For example, here we pull the 100th observation/row.\n\ndf[100,] \n\n    IgG_concentration       age age gender     slum\n100              8122 0.1818182   5 Female Non slum\n\n\nAnd, here we pull the age of the 100th observation/row.\n\ndf[100,\"age\"] \n\n[1] 0.1818182"
+    "text": "Using indexing to subset by rows\nWe can also use indexing to subset by rows. For example, here we pull the 100th observation/row.\n\ndf[100,] \n\n    observation_id IgG_concentration age gender     slum\n100           8122         0.1818182   5 Female Non slum\n\n\nAnd here we pull the age of the 100th observation/row.\n\ndf[100,\"age\"] \n\n[1] 5"
   },
   {
     "objectID": "modules/Module06-DataSubset.html#logical-operators-to-help-identify-and-missing-data",
     "href": "modules/Module06-DataSubset.html#logical-operators-to-help-identify-and-missing-data",
     "title": "Module 6: Get to Know Your Data and Subsetting",
     "section": "Logical operators to help identify and missing data",
-    "text": "Logical operators to help identify and missing data\n\n\n\noperator\noperator option\ndescription\n\n\n\n\nis.na\n\nis NAN or NA\n\n\nis.nan\n\nis NAN\n\n\n!is.na\n\nis not NAN or NA\n\n\n!is.nan\n\nis not NAN\n\n\nis.infinite\n\nis infinite\n\n\nany\n\nare any TRUE\n\n\nwhich\n\nwhich are TRUE"
+    "text": "Logical operators to help identify and missing data\n\n\n\noperator\ndescription\n\n\n\n\n\nis.na\nis NaN or NA\n\n\n\nis.nan\nis NaN\n\n\n\n!is.na\nis not NaN or NA\n\n\n\n!is.nan\nis not NaN\n\n\n\nis.infinite\nis infinite\n\n\n\nany\nare any TRUE\n\n\n\nall\nall are TRUE\n\n\n\nwhich\nwhich are TRUE"
+  },
+  {
+    "objectID": "modules/Module02-Functions.html#installing-and-attaching-packages",
+    "href": "modules/Module02-Functions.html#installing-and-attaching-packages",
+    "title": "Module 2: Functions",
+    "section": "Installing and attaching packages",
+    "text": "Installing and attaching packages\nTo use the bundle or “package” of code (and possibly data) from a package, you need to both install and attach the package.\nTo install a package you can\n\ngo to the Tools menu —&gt; Install Packages in the RStudio menu bar\n\nOR\n\nuse the following code:\n\n\ninstall.packages(\"package_name\")"
+  },
+  {
+    "objectID": "modules/Module02-Functions.html#installing-and-attaching-packages-1",
+    "href": "modules/Module02-Functions.html#installing-and-attaching-packages-1",
+    "title": "Module 2: Functions",
+    "section": "Installing and attaching packages",
+    "text": "Installing and attaching packages\nTo attach (i.e., be able to use the package) you can use the following code:\n\nrequire(package_name) #library(package_name) also works\n\nMore on installing and attaching packages later…"
+  },
+  {
+    "objectID": "modules/Module05-DataImportExport.html#attach-package",
+    "href": "modules/Module05-DataImportExport.html#attach-package",
+    "title": "Module 5: Data Import and Export",
+    "section": "4. Attach Package",
+    "text": "4. Attach Package\nReminder - To attach (i.e., be able to use the package) you can use the following code:\n\nrequire(package_name)\n\nTherefore,\n\nrequire(readxl)"
+  },
+  {
+    "objectID": "modules/Module06-DataSubset.html#for-indexing-for-data-frame",
+    "href": "modules/Module06-DataSubset.html#for-indexing-for-data-frame",
+    "title": "Module 6: Get to Know Your Data and Subsetting",
+    "section": "$ for indexing for data frame",
+    "text": "$ for indexing for data frame\n$ allows only a literal character string or a symbol as the index. For a data frame it extracts a variable.\n\ndf$IgG_concentration\n\nNote, if you have spaces in your variable name, you will need to wrap the name in backticks (`) after the $. This is a good reason not to create variable / column names with spaces."
+  },
+  {
+    "objectID": "modules/Module07-VarCreationClassesSummaries.html#learning-objectives",
+    "href": "modules/Module07-VarCreationClassesSummaries.html#learning-objectives",
+    "title": "Module 7: Variable Creation, Classes, and Summaries",
+    "section": "Learning Objectives",
+    "text": "Learning Objectives\nAfter module 7, you should be able to…\n\nCreate new variables\nCharacterize variable classes\nManipulate the classes of variables\nConduct one-variable data summaries"
+  },
+  {
+    "objectID": "modules/Module07-VarCreationClassesSummaries.html#import-data-for-this-module",
+    "href": "modules/Module07-VarCreationClassesSummaries.html#import-data-for-this-module",
+    "title": "Module 7: Variable Creation, Classes, and Summaries",
+    "section": "Import data for this module",
+    "text": "Import data for this module\nLet’s first read in the data from the previous module and look at it briefly with a new function head().\n\ndf &lt;- read.csv(file = \"data/serodata.csv\") #relative path\nhead(x=df, n=3)\n\n  observation_id IgG_concentration age gender     slum\n1           5772         0.3176895   2 Female Non slum\n2           8095         3.4368231   4 Female Non slum\n3           9784         0.3000000   4   Male Non slum\n\n\nRegistered S3 method overwritten by 'printr':\n  method                from     \n  knit_print.data.frame rmarkdown\nReturn the First or Last Parts of an Object\nDescription:\n Returns the first or last parts of a vector, matrix, table, data\n frame or function.  Since 'head()' and 'tail()' are generic\n functions, they may also have been extended to other classes.\nUsage:\n head(x, ...)\n ## Default S3 method:\n head(x, n = 6L, ...)\n \n ## S3 method for class 'matrix'\n head(x, n = 6L, ...) # is exported as head.matrix()\n ## NB: The methods for 'data.frame' and 'array'  are identical to the 'matrix' one\n \n ## S3 method for class 'ftable'\n head(x, n = 6L, ...)\n ## S3 method for class 'function'\n head(x, n = 6L, ...)\n \n \n tail(x, ...)\n ## Default S3 method:\n tail(x, n = 6L, keepnums = FALSE, addrownums, ...)\n ## S3 method for class 'matrix'\n tail(x, n = 6L, keepnums = TRUE, addrownums, ...) # exported as tail.matrix()\n ## NB: The methods for 'data.frame', 'array', and 'table'\n ##     are identical to the  'matrix'  one\n \n ## S3 method for class 'ftable'\n tail(x, n = 6L, keepnums = FALSE, addrownums, ...)\n ## S3 method for class 'function'\n tail(x, n = 6L, ...)\n \nArguments:\n   x: an object\n\n   n: an integer vector of length up to 'dim(x)' (or 1, for\n      non-dimensioned objects).  A 'logical' is silently coerced to\n      integer.  
Values specify the indices to be selected in the\n      corresponding dimension (or along the length) of the object.\n      A positive value of 'n[i]' includes the first/last 'n[i]'\n      indices in that dimension, while a negative value excludes\n      the last/first 'abs(n[i])', including all remaining indices.\n      'NA' or non-specified values (when 'length(n) &lt;\n      length(dim(x))') select all indices in that dimension. Must\n      contain at least one non-missing value.\nkeepnums: in each dimension, if no names in that dimension are present, create them using the indices included in that dimension. Ignored if ‘dim(x)’ is ‘NULL’ or its length 1.\naddrownums: deprecated - ‘keepnums’ should be used instead. Taken as the value of ‘keepnums’ if it is explicitly set when ‘keepnums’ is not.\n ...: arguments to be passed to or from other methods.\nDetails:\n For vector/array based objects, 'head()' ('tail()') returns a\n subset of the same dimensionality as 'x', usually of the same\n class. For historical reasons, by default they select the first\n (last) 6 indices in the first dimension (\"rows\") or along the\n length of a non-dimensioned vector, and the full extent (all\n indices) in any remaining dimensions. 'head.matrix()' and\n 'tail.matrix()' are exported.\n\n The default and array(/matrix) methods for 'head()' and 'tail()'\n are quite general. They will work as is for any class which has a\n 'dim()' method, a 'length()' method (only required if 'dim()'\n returns 'NULL'), and a '[' method (that accepts the 'drop'\n argument and can subset in all dimensions in the dimensioned\n case).\n\n For functions, the lines of the deparsed function are returned as\n character strings.\n\n When 'x' is an array(/matrix) of dimensionality two and more,\n 'tail()' will add dimnames similar to how they would appear in a\n full printing of 'x' for all dimensions 'k' where 'n[k]' is\n specified and non-missing and 'dimnames(x)[[k]]' (or 'dimnames(x)'\n itself) is 'NULL'.  
Specifically, the form of the added dimnames\n will vary for different dimensions as follows:\n\n 'k=1' (rows): '\"[n,]\"' (right justified with whitespace padding)\n\n 'k=2' (columns): '\"[,n]\"' (with _no_ whitespace padding)\n\n 'k&gt;2' (higher dims): '\"n\"', i.e., the indices as _character_\n      values\n\n Setting 'keepnums = FALSE' suppresses this behaviour.\n\n As 'data.frame' subsetting ('indexing') keeps 'attributes', so do\n the 'head()' and 'tail()' methods for data frames.\nValue:\n An object (usually) like 'x' but generally smaller.  Hence, for\n 'array's, the result corresponds to 'x[.., drop=FALSE]'.  For\n 'ftable' objects 'x', a transformed 'format(x)'.\nNote:\n For array inputs the output of 'tail' when 'keepnums' is 'TRUE',\n any dimnames vectors added for dimensions '&gt;2' are the original\n numeric indices in that dimension _as character vectors_.  This\n means that, e.g., for 3-dimensional array 'arr', 'tail(arr,\n c(2,2,-1))[ , , 2]' and 'tail(arr, c(2,2,-1))[ , , \"2\"]' may both\n be valid but have completely different meanings.\nAuthor(s):\n Patrick Burns, improved and corrected by R-Core. Negative argument\n added by Vincent Goulet.  
Multi-dimension support added by Gabriel\n Becker.\nExamples:\n head(letters)\n head(letters, n = -6L)\n \n head(freeny.x, n = 10L)\n head(freeny.y)\n \n head(iris3)\n head(iris3, c(6L, 2L))\n head(iris3, c(6L, -1L, 2L))\n \n tail(letters)\n tail(letters, n = -6L)\n \n tail(freeny.x)\n ## the bottom-right \"corner\" :\n tail(freeny.x, n = c(4, 2))\n tail(freeny.y)\n \n tail(iris3)\n tail(iris3, c(6L, 2L))\n tail(iris3, c(6L, -1L, 2L))\n \n ## iris with dimnames stripped\n a3d &lt;- iris3 ; dimnames(a3d) &lt;- NULL\n tail(a3d, c(6, -1, 2)) # keepnums = TRUE is default here!\n tail(a3d, c(6, -1, 2), keepnums = FALSE)\n \n ## data frame w/ a (non-standard) attribute:\n treeS &lt;- structure(trees, foo = \"bar\")\n (n &lt;- nrow(treeS))\n stopifnot(exprs = { # attribute is kept\n     identical(htS &lt;- head(treeS), treeS[1:6, ])\n     identical(attr(htS, \"foo\") , \"bar\")\n     identical(tlS &lt;- tail(treeS), treeS[(n-5):n, ])\n     ## BUT if I use \"useAttrib(.)\", this is *not* ok, when n is of length 2:\n     ## --- because [i,j]-indexing of data frames *also* drops \"other\" attributes ..\n     identical(tail(treeS, 3:2), treeS[(n-2):n, 2:3] )\n })\n \n tail(library) # last lines of function\n \n head(stats::ftable(Titanic))\n \n ## 1d-array (with named dim) :\n a1 &lt;- array(1:7, 7); names(dim(a1)) &lt;- \"O2\"\n stopifnot(exprs = {\n   identical( tail(a1, 10), a1)\n   identical( head(a1, 10), a1)\n   identical( head(a1, 1), a1 [1 , drop=FALSE] ) # was a1[1] in R &lt;= 3.6.x\n   identical( tail(a1, 2), a1[6:7])\n   identical( tail(a1, 1), a1 [7 , drop=FALSE] ) # was a1[7] in R &lt;= 3.6.x\n })"
+  },
+  {
+    "objectID": "modules/Module07-VarCreationClassesSummaries.html#adding-new-columns",
+    "href": "modules/Module07-VarCreationClassesSummaries.html#adding-new-columns",
+    "title": "Module 7: Variable Creation, Classes, and Summaries",
+    "section": "Adding new columns",
+    "text": "Adding new columns\nYou can add a new column called log_IgG to df using the $ operator:\n\ndf$log_IgG &lt;- log(df$IgG_concentration)\nhead(df,3)\n\n\n\n\nobservation_id\nIgG_concentration\nage\ngender\nslum\nlog_IgG\n\n\n\n\n5772\n0.3176895\n2\nFemale\nNon slum\n-1.146681\n\n\n8095\n3.4368231\n4\nFemale\nNon slum\n1.234547\n\n\n9784\n0.3000000\n4\nMale\nNon slum\n-1.203973\n\n\n\n\n\nNote my use of an underscore in the variable name rather than a space. This is good coding practice and makes calling variables much less prone to error."
+  },
+  {
+    "objectID": "modules/Module07-VarCreationClassesSummaries.html#creating-conditional-variables",
+    "href": "modules/Module07-VarCreationClassesSummaries.html#creating-conditional-variables",
+    "title": "Module 7: Variable Creation, Classes, and Summaries",
+    "section": "Creating conditional variables",
+    "text": "Creating conditional variables\nOne frequently-used tool is creating variables with conditions. A general function for creating new variables based on existing variables is the Base R ifelse() function, which “returns a value depending on whether the element of test is TRUE or FALSE.”\nConditional Element Selection\nDescription:\n 'ifelse' returns a value with the same shape as 'test' which is\n filled with elements selected from either 'yes' or 'no' depending\n on whether the element of 'test' is 'TRUE' or 'FALSE'.\nUsage:\n ifelse(test, yes, no)\n \nArguments:\ntest: an object which can be coerced to logical mode.\n\n yes: return values for true elements of 'test'.\n\n  no: return values for false elements of 'test'.\nDetails:\n If 'yes' or 'no' are too short, their elements are recycled.\n 'yes' will be evaluated if and only if any element of 'test' is\n true, and analogously for 'no'.\n\n Missing values in 'test' give missing values in the result.\nValue:\n A vector of the same length and attributes (including dimensions\n and '\"class\"') as 'test' and data values from the values of 'yes'\n or 'no'.  
The mode of the answer will be coerced from logical to\n accommodate first any values taken from 'yes' and then any values\n taken from 'no'.\nWarning:\n The mode of the result may depend on the value of 'test' (see the\n examples), and the class attribute (see 'oldClass') of the result\n is taken from 'test' and may be inappropriate for the values\n selected from 'yes' and 'no'.\n\n Sometimes it is better to use a construction such as\n\n   (tmp &lt;- yes; tmp[!test] &lt;- no[!test]; tmp)\n \n , possibly extended to handle missing values in 'test'.\n\n Further note that 'if(test) yes else no' is much more efficient\n and often much preferable to 'ifelse(test, yes, no)' whenever\n 'test' is a simple true/false result, i.e., when 'length(test) ==\n 1'.\n\n The 'srcref' attribute of functions is handled specially: if\n 'test' is a simple true result and 'yes' evaluates to a function\n with 'srcref' attribute, 'ifelse' returns 'yes' including its\n attribute (the same applies to a false 'test' and 'no' argument).\n This functionality is only for backwards compatibility, the form\n 'if(test) yes else no' should be used whenever 'yes' and 'no' are\n functions.\nReferences:\n Becker, R. A., Chambers, J. M. and Wilks, A. R. (1988) _The New S\n Language_.  Wadsworth & Brooks/Cole.\nSee Also:\n 'if'.\nExamples:\n x &lt;- c(6:-4)\n sqrt(x)  #- gives warning\n sqrt(ifelse(x &gt;= 0, x, NA))  # no warning\n \n ## Note: the following also gives the warning !\n ifelse(x &gt;= 0, sqrt(x), NA)\n \n \n ## ifelse() strips attributes\n ## This is important when working with Dates and factors\n x &lt;- seq(as.Date(\"2000-02-29\"), as.Date(\"2004-10-04\"), by = \"1 month\")\n ## has many \"yyyy-mm-29\", but a few \"yyyy-03-01\" in the non-leap years\n y &lt;- ifelse(as.POSIXlt(x)$mday == 29, x, NA)\n head(y) # not what you expected ... 
==&gt; need restore the class attribute:\n class(y) &lt;- class(x)\n y\n ## This is a (not atypical) case where it is better *not* to use ifelse(),\n ## but rather the more efficient and still clear:\n y2 &lt;- x\n y2[as.POSIXlt(x)$mday != 29] &lt;- NA\n ## which gives the same as ifelse()+class() hack:\n stopifnot(identical(y2, y))\n \n \n ## example of different return modes (and 'test' alone determining length):\n yes &lt;- 1:3\n no  &lt;- pi^(1:4)\n utils::str( ifelse(NA,    yes, no) ) # logical, length 1\n utils::str( ifelse(TRUE,  yes, no) ) # integer, length 1\n utils::str( ifelse(FALSE, yes, no) ) # double,  length 1"
+  },
+  {
+    "objectID": "modules/Module07-VarCreationClassesSummaries.html#ifelse-example",
+    "href": "modules/Module07-VarCreationClassesSummaries.html#ifelse-example",
+    "title": "Module 7: Variable Creation, Classes, and Summaries",
+    "section": "ifelse example",
+    "text": "ifelse example\nReminder of the first three arguments in the ifelse() function are ifelse(test, yes, no).\n\ndf$age_group &lt;- ifelse(df$age &lt;= 5, \"young\", \"old\")\nhead(df)\n\n\n\n\n\n\n\n\n\n\n\n\n\n\nobservation_id\nIgG_concentration\nage\ngender\nslum\nlog_IgG\nseropos\nage_group\n\n\n\n\n5772\n0.3176895\n2\nFemale\nNon slum\n-1.1466807\nFALSE\nyoung\n\n\n8095\n3.4368231\n4\nFemale\nNon slum\n1.2345475\nFALSE\nyoung\n\n\n9784\n0.3000000\n4\nMale\nNon slum\n-1.2039728\nFALSE\nyoung\n\n\n9338\n143.2363014\n4\nMale\nNon slum\n4.9644957\nTRUE\nyoung\n\n\n6369\n0.4476534\n1\nMale\nNon slum\n-0.8037359\nFALSE\nyoung\n\n\n6885\n0.0252708\n4\nMale\nNon slum\n-3.6781074\nFALSE\nyoung\n\n\n\n\n\nLet’s delve into what is actually happening, with a focus on the NA values in age variable.\n\ndf$age &lt;= 5\n\n  [1]  TRUE  TRUE  TRUE  TRUE  TRUE  TRUE  TRUE    NA  TRUE  TRUE  TRUE FALSE\n [13] FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE  TRUE  TRUE\n [25]  TRUE  TRUE  TRUE  TRUE  TRUE  TRUE  TRUE  TRUE  TRUE  TRUE  TRUE FALSE\n [37] FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE\n [49]  TRUE  TRUE  TRUE  TRUE  TRUE  TRUE  TRUE  TRUE  TRUE  TRUE  TRUE  TRUE\n [61]  TRUE FALSE FALSE FALSE FALSE  TRUE FALSE FALSE FALSE FALSE FALSE FALSE\n [73] FALSE  TRUE  TRUE  TRUE    NA  TRUE  TRUE  TRUE FALSE FALSE FALSE FALSE\n [85] FALSE FALSE FALSE FALSE  TRUE  TRUE  TRUE  TRUE  TRUE  TRUE  TRUE  TRUE\n [97]  TRUE  TRUE  TRUE  TRUE  TRUE FALSE FALSE FALSE FALSE FALSE FALSE FALSE\n[109] FALSE FALSE FALSE FALSE FALSE FALSE  TRUE  TRUE  TRUE    NA  TRUE  TRUE\n[121]    NA  TRUE  TRUE  TRUE  TRUE FALSE FALSE FALSE FALSE FALSE FALSE FALSE\n[133] FALSE FALSE FALSE FALSE FALSE FALSE  TRUE  TRUE  TRUE  TRUE  TRUE  TRUE\n[145]  TRUE  TRUE  TRUE  TRUE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE\n[157] FALSE FALSE  TRUE  TRUE  TRUE  TRUE  TRUE  TRUE  TRUE  TRUE  TRUE FALSE\n[169] FALSE FALSE FALSE FALSE FALSE FALSE  TRUE FALSE FALSE 
FALSE FALSE  TRUE\n[181]  TRUE  TRUE  TRUE  TRUE  TRUE  TRUE  TRUE  TRUE  TRUE  TRUE FALSE FALSE\n[193] FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE  TRUE  TRUE\n[205]  TRUE  TRUE  TRUE  TRUE  TRUE FALSE FALSE FALSE FALSE FALSE FALSE FALSE\n[217] FALSE FALSE FALSE FALSE  TRUE  TRUE  TRUE  TRUE  TRUE  TRUE  TRUE  TRUE\n[229]  TRUE  TRUE  TRUE  TRUE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE\n[241] FALSE FALSE FALSE  TRUE  TRUE  TRUE  TRUE  TRUE  TRUE  TRUE  TRUE FALSE\n[253] FALSE FALSE FALSE FALSE FALSE  TRUE  TRUE  TRUE  TRUE  TRUE  TRUE  TRUE\n[265]  TRUE  TRUE  TRUE FALSE FALSE FALSE FALSE FALSE  TRUE  TRUE FALSE FALSE\n[277] FALSE  TRUE FALSE  TRUE  TRUE  TRUE  TRUE  TRUE  TRUE  TRUE  TRUE  TRUE\n[289]  TRUE    NA FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE\n[301]  TRUE  TRUE  TRUE  TRUE  TRUE  TRUE  TRUE  TRUE  TRUE FALSE FALSE FALSE\n[313]  TRUE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE  TRUE  TRUE  TRUE\n[325]  TRUE  TRUE  TRUE  TRUE  TRUE  TRUE FALSE FALSE FALSE  TRUE FALSE FALSE\n[337] FALSE  TRUE  TRUE  TRUE  TRUE  TRUE  TRUE  TRUE  TRUE  TRUE  TRUE FALSE\n[349] FALSE    NA FALSE FALSE  TRUE FALSE FALSE FALSE FALSE  TRUE  TRUE  TRUE\n[361]  TRUE  TRUE  TRUE  TRUE  TRUE  TRUE  TRUE  TRUE  TRUE FALSE FALSE FALSE\n[373] FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE  TRUE  TRUE  TRUE  TRUE\n[385]  TRUE  TRUE  TRUE  TRUE  TRUE FALSE FALSE FALSE FALSE FALSE FALSE FALSE\n[397] FALSE FALSE FALSE FALSE  TRUE  TRUE  TRUE  TRUE  TRUE  TRUE  TRUE  TRUE\n[409]  TRUE  TRUE  TRUE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE\n[421] FALSE FALSE FALSE  TRUE  TRUE  TRUE  TRUE  TRUE  TRUE  TRUE  TRUE  TRUE\n[433]  TRUE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE\n[445] FALSE FALSE  TRUE  TRUE  TRUE  TRUE    NA    NA  TRUE  TRUE  TRUE  TRUE\n[457]  TRUE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE\n[469] FALSE  TRUE  TRUE  TRUE  TRUE  TRUE  TRUE  TRUE  TRUE  TRUE  TRUE  TRUE\n[481] 
 TRUE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE  TRUE  TRUE\n[493]  TRUE  TRUE  TRUE  TRUE  TRUE  TRUE FALSE FALSE FALSE FALSE FALSE FALSE\n[505] FALSE FALSE FALSE FALSE FALSE FALSE  TRUE  TRUE  TRUE  TRUE  TRUE  TRUE\n[517]  TRUE  TRUE  TRUE  TRUE  TRUE FALSE FALSE FALSE FALSE FALSE FALSE FALSE\n[529] FALSE FALSE FALSE FALSE  TRUE  TRUE  TRUE  TRUE  TRUE  TRUE  TRUE  TRUE\n[541]  TRUE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE\n[553] FALSE  TRUE  TRUE  TRUE  TRUE  TRUE  TRUE  TRUE  TRUE  TRUE  TRUE  TRUE\n[565]  TRUE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE\n[577] FALSE  TRUE  TRUE  TRUE  TRUE  TRUE  TRUE  TRUE  TRUE  TRUE  TRUE  TRUE\n[589] FALSE  TRUE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE\n[601] FALSE FALSE FALSE  TRUE  TRUE  TRUE  TRUE  TRUE  TRUE  TRUE  TRUE  TRUE\n[613]  TRUE  TRUE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE\n[625] FALSE FALSE  TRUE  TRUE  TRUE  TRUE  TRUE  TRUE  TRUE  TRUE  TRUE  TRUE\n[637]  TRUE  TRUE  TRUE FALSE FALSE FALSE FALSE FALSE    NA FALSE FALSE FALSE\n[649] FALSE FALSE FALSE\n\ntable(df$age, df$age_group, useNA=\"always\", dnn=list(\"age\", \"\"))\n\n\n\n\nage/\nold\nyoung\nNA\n\n\n\n\n1\n0\n44\n0\n\n\n2\n0\n72\n0\n\n\n3\n0\n79\n0\n\n\n4\n0\n80\n0\n\n\n5\n0\n41\n0\n\n\n6\n38\n0\n0\n\n\n7\n38\n0\n0\n\n\n8\n39\n0\n0\n\n\n9\n20\n0\n0\n\n\n10\n44\n0\n0\n\n\n11\n41\n0\n0\n\n\n12\n23\n0\n0\n\n\n13\n35\n0\n0\n\n\n14\n37\n0\n0\n\n\n15\n11\n0\n0\n\n\nNA\n0\n0\n9"
+  },
+  {
+    "objectID": "modules/Module07-VarCreationClassesSummaries.html#nesting-ifelse-statements-example",
+    "href": "modules/Module07-VarCreationClassesSummaries.html#nesting-ifelse-statements-example",
+    "title": "Module 7: Variable Creation, Classes, and Summaries",
+    "section": "Nesting ifelse statements example",
+    "text": "Nesting ifelse statements example\n\ndf$age_group &lt;- ifelse(df$age &lt;= 5, \"young\", \n                       ifelse(df$age&lt;=10 & df$age&gt;5, \"middle\", \"old\"))\ntable(df$age, df$age_group, useNA=\"always\", dnn=list(\"age\", \"\"))\n\nage/ middle old young NA\n1 0 0 44 0\n2 0 0 72 0\n3 0 0 79 0\n4 0 0 80 0\n5 0 0 41 0\n6 38 0 0 0\n7 38 0 0 0\n8 39 0 0 0\n9 20 0 0 0\n10 44 0 0 0\n11 0 41 0 0\n12 0 23 0 0\n13 0 35 0 0\n14 0 37 0 0\n15 0 11 0 0\nNA 0 0 0 9\n\nNote that R puts the variable levels in alphabetical order; we will show how to change this later."
+  },
+  {
+    "objectID": "modules/Module07-VarCreationClassesSummaries.html#overview---data-classes",
+    "href": "modules/Module07-VarCreationClassesSummaries.html#overview---data-classes",
+    "title": "Module 7: Variable Creation, Classes, and Summaries",
+    "section": "Overview - Data Classes",
+    "text": "Overview - Data Classes\n\nOne dimensional types (i.e., vectors of characters, numeric, logical, or factor values)\nTwo dimensional types (e.g., matrix, data frame, tibble)\nSpecial data classes (e.g., lists, dates)."
+  },
+  {
+    "objectID": "modules/Module07-VarCreationClassesSummaries.html#class-function",
+    "href": "modules/Module07-VarCreationClassesSummaries.html#class-function",
+    "title": "Module 7: Variable Creation, Classes, and Summaries",
+    "section": "class() function",
+    "text": "class() function\nThe class() function allows you to evaluate the class of an object.\n\nclass(df$IgG_concentration)\n\n[1] \"numeric\"\n\nclass(df$age)\n\n[1] \"integer\"\n\nclass(df$gender)\n\n[1] \"character\""
+  },
+  {
+    "objectID": "modules/Module07-VarCreationClassesSummaries.html#one-dimensional-data-types",
+    "href": "modules/Module07-VarCreationClassesSummaries.html#one-dimensional-data-types",
+    "title": "Module 7: Variable Creation, Classes, and Summaries",
+    "section": "One dimensional data types",
+    "text": "One dimensional data types\n\nCharacter: strings or individual characters, quoted\nNumeric: any real number(s)\n\nDouble: contains fractional values (i.e., double precision) - default numeric\nInteger: any integer(s)/whole numbers\n\nLogical: variables composed of TRUE or FALSE\nFactor: categorical/qualitative variables"
+  },
+  {
+    "objectID": "modules/Module07-VarCreationClassesSummaries.html#character-and-numeric",
+    "href": "modules/Module07-VarCreationClassesSummaries.html#character-and-numeric",
+    "title": "Module 7: Variable Creation, Classes, and Summaries",
+    "section": "Character and numeric",
+    "text": "Character and numeric\nThis can also be a bit tricky.\nIf even one element of a vector is a character, R coerces every element to character, so the class of the whole vector is character.\n\nclass(c(1, 2, \"tree\")) \n\n[1] \"character\"\n\n\nHere, because the integers are in quotation marks, R reads them as character values.\n\nclass(c(\"1\", \"4\", \"7\")) \n\n[1] \"character\"\n\n\nNote that instead of creating a new vector object (e.g., x &lt;- c(\"1\", \"4\", \"7\")) and then passing x as the first argument of the class() function (e.g., class(x)), we combined the two steps and passed the vector directly to class()."
+  },
+  {
+    "objectID": "modules/Module07-VarCreationClassesSummaries.html#numeric-subclasses",
+    "href": "modules/Module07-VarCreationClassesSummaries.html#numeric-subclasses",
+    "title": "Module 7: Variable Creation, Classes, and Summaries",
+    "section": "Numeric Subclasses",
+    "text": "Numeric Subclasses\nThere are two major numeric subclasses:\n\nDouble is a special subset of numeric that contains fractional values. Double stands for double-precision.\nInteger is a special subset of numeric that contains only whole numbers.\n\ntypeof() identifies the vector type (double, integer, logical, or character), whereas class() identifies the root class. The difference between the two will become clearer when we look at two-dimensional classes below.\n\nclass(df$IgG_concentration)\n\n[1] \"numeric\"\n\nclass(df$age)\n\n[1] \"integer\"\n\ntypeof(df$IgG_concentration)\n\n[1] \"double\"\n\ntypeof(df$age)\n\n[1] \"integer\""
+  },
+  {
+    "objectID": "modules/Module07-VarCreationClassesSummaries.html#logical",
+    "href": "modules/Module07-VarCreationClassesSummaries.html#logical",
+    "title": "Module 7: Variable Creation, Classes, and Summaries",
+    "section": "Logical",
+    "text": "Logical\nReminder: logical is a type with only three possible values: TRUE, FALSE, and NA.\n\nclass(c(TRUE, FALSE, TRUE, TRUE, FALSE))\n\n[1] \"logical\"\n\n\nNote that when creating a logical object, TRUE and FALSE are NOT in quotes. Putting R special values (e.g., NA or FALSE) in quotation marks turns them into character values."
+  },
+  {
+    "objectID": "modules/Module07-VarCreationClassesSummaries.html#other-useful-functions-for-evaluatingsetting-classes",
+    "href": "modules/Module07-VarCreationClassesSummaries.html#other-useful-functions-for-evaluatingsetting-classes",
+    "title": "Module 7: Variable Creation, Classes, and Summaries",
+    "section": "Other useful functions for evaluating/setting classes",
+    "text": "Other useful functions for evaluating/setting classes\nThere are two useful families of functions associated with practically all R classes:\n\nis.CLASS_NAME(x) checks whether or not x is of a certain class and returns TRUE or FALSE. For example, is.integer, is.character, or is.numeric.\nas.CLASS_NAME(x) coerces x from its current class into another class. For example, as.integer, as.character, or as.numeric. This is particularly useful when, for example, an integer variable was read in as a character variable, or when you need to change a character variable to a factor variable (more on this later)."
+  },
+  {
+    "objectID": "modules/Module07-VarCreationClassesSummaries.html#examples-is.class_namex",
+    "href": "modules/Module07-VarCreationClassesSummaries.html#examples-is.class_namex",
+    "title": "Module 7: Variable Creation, Classes, and Summaries",
+    "section": "Examples is.CLASS_NAME(x)",
+    "text": "Examples is.CLASS_NAME(x)\n\nis.numeric(df$IgG_concentration)\n\n[1] TRUE\n\nis.character(df$age)\n\n[1] FALSE\n\nis.character(df$gender)\n\n[1] TRUE"
+  },
+  {
+    "objectID": "modules/Module07-VarCreationClassesSummaries.html#examples-as.class_namex",
+    "href": "modules/Module07-VarCreationClassesSummaries.html#examples-as.class_namex",
+    "title": "Module 7: Variable Creation, Classes, and Summaries",
+    "section": "Examples as.CLASS_NAME(x)",
+    "text": "Examples as.CLASS_NAME(x)\nIn some cases, coercion is seamless:\n\nas.character(c(1, 4, 7))\n\n[1] \"1\" \"4\" \"7\"\n\nas.numeric(c(\"1\", \"4\", \"7\"))\n\n[1] 1 4 7\n\nas.logical(c(\"TRUE\", \"FALSE\", \"FALSE\"))\n\n[1]  TRUE FALSE FALSE\n\n\nIn other cases, coercion is not possible; if executed, the code returns NA for each value that cannot be converted:\n\nas.numeric(c(\"1\", \"4\", \"7a\"))\n\nWarning: NAs introduced by coercion\n\n\n[1]  1  4 NA\n\nas.logical(c(\"TRUE\", \"FALSE\", \"UNKNOWN\"))\n\n[1]  TRUE FALSE    NA"
+  },
+  {
+    "objectID": "modules/Module07-VarCreationClassesSummaries.html#factors",
+    "href": "modules/Module07-VarCreationClassesSummaries.html#factors",
+    "title": "Module 7: Variable Creation, Classes, and Summaries",
+    "section": "Factors",
+    "text": "Factors\nA factor is a special character vector where the elements have pre-defined groups or ‘levels’. You can think of these as qualitative or categorical variables. Use the factor() function to create factors from character values.\n\nclass(df$age_group)\n\n[1] \"character\"\n\ndf$age_group_factor &lt;- factor(df$age_group)\nclass(df$age_group_factor)\n\n[1] \"factor\"\n\nlevels(df$age_group_factor)\n\n[1] \"middle\" \"old\"    \"young\" \n\n\nNote 1: levels are, by default, set in alphanumeric order, and the first level is always the “reference” group. However, we often prefer a different reference group.\nNote 2: we can also make ordered factors using factor(..., ordered=TRUE), but we won’t talk more about that."
+  },
+  {
+    "objectID": "modules/Module07-VarCreationClassesSummaries.html#reference-groups",
+    "href": "modules/Module07-VarCreationClassesSummaries.html#reference-groups",
+    "title": "Module 7: Variable Creation, Classes, and Summaries",
+    "section": "Reference Groups",
+    "text": "Reference Groups\nWhy do we care about reference groups?\nGeneralized linear regression allows you to compare the outcome of two or more groups. Your reference group is the group that everything else is compared to. Say we want to assess whether being in the young group (5 years old or younger) is associated with higher IgG antibody concentrations.\nBy default, middle is the reference group, so the model will only generate beta coefficients comparing middle to young AND middle to old. But we want young to be the reference group, so that the model generates beta coefficients comparing young to middle AND young to old."
+  },
+  {
+    "objectID": "modules/Module07-VarCreationClassesSummaries.html#changing-factor-reference",
+    "href": "modules/Module07-VarCreationClassesSummaries.html#changing-factor-reference",
+    "title": "Module 7: Variable Creation, Classes, and Summaries",
+    "section": "Changing factor reference",
+    "text": "Changing factor reference\nChanging the reference group of a factor variable.\n\nIf the object is already a factor then use relevel() function and the ref argument to specify the reference.\nIf the object is a character then use factor() function and levels argument to specify the order of the values, the first being the reference.\n\nLet’s look at the relevel() help file\nReorder Levels of Factor\nDescription:\n The levels of a factor are re-ordered so that the level specified\n by 'ref' is first and the others are moved down. This is useful\n for 'contr.treatment' contrasts which take the first level as the\n reference.\nUsage:\n relevel(x, ref, ...)\n \nArguments:\n   x: an unordered factor.\n\n ref: the reference level, typically a string.\n\n ...: additional arguments for future methods.\nDetails:\n This, as 'reorder()', is a special case of simply calling\n 'factor(x, levels = levels(x)[....])'.\nValue:\n A factor of the same length as 'x'.\nSee Also:\n 'factor', 'contr.treatment', 'levels', 'reorder'.\nExamples:\n warpbreaks$tension &lt;- relevel(warpbreaks$tension, ref = \"M\")\n summary(lm(breaks ~ wool + tension, data = warpbreaks))\n\nLet’s look at the factor() help file\nFactors\nDescription:\n The function 'factor' is used to encode a vector as a factor (the\n terms 'category' and 'enumerated type' are also used for factors).\n If argument 'ordered' is 'TRUE', the factor levels are assumed to\n be ordered.  
For compatibility with S there is also a function\n 'ordered'.\n\n 'is.factor', 'is.ordered', 'as.factor' and 'as.ordered' are the\n membership and coercion functions for these classes.\nUsage:\n factor(x = character(), levels, labels = levels,\n        exclude = NA, ordered = is.ordered(x), nmax = NA)\n \n ordered(x = character(), ...)\n \n is.factor(x)\n is.ordered(x)\n \n as.factor(x)\n as.ordered(x)\n \n addNA(x, ifany = FALSE)\n \n .valid.factor(object)\n \nArguments:\n   x: a vector of data, usually taking a small number of distinct\n      values.\nlevels: an optional vector of the unique values (as character strings) that ‘x’ might have taken. The default is the unique set of values taken by ‘as.character(x)’, sorted into increasing order of ‘x’. Note that this set can be specified as smaller than ‘sort(unique(x))’.\nlabels: either an optional character vector of labels for the levels (in the same order as ‘levels’ after removing those in ‘exclude’), or a character string of length 1. Duplicated values in ‘labels’ can be used to map different values of ‘x’ to the same factor level.\nexclude: a vector of values to be excluded when forming the set of levels. This may be factor with the same level set as ‘x’ or should be a ‘character’.\nordered: logical flag to determine if the levels should be regarded as ordered (in the order given).\nnmax: an upper bound on the number of levels; see 'Details'.\n\n ...: (in 'ordered(.)'): any of the above, apart from 'ordered'\n      itself.\nifany: only add an ‘NA’ level if it is used, i.e. if ‘any(is.na(x))’.\nobject: an R object.\nDetails:\n The type of the vector 'x' is not restricted; it only must have an\n 'as.character' method and be sortable (by 'order').\n\n Ordered factors differ from factors only in their class, but\n methods and the model-fitting functions treat the two classes\n quite differently.\n\n The encoding of the vector happens as follows.  First all the\n values in 'exclude' are removed from 'levels'. 
If 'x[i]' equals\n 'levels[j]', then the 'i'-th element of the result is 'j'.  If no\n match is found for 'x[i]' in 'levels' (which will happen for\n excluded values) then the 'i'-th element of the result is set to\n 'NA'.\n\n Normally the 'levels' used as an attribute of the result are the\n reduced set of levels after removing those in 'exclude', but this\n can be altered by supplying 'labels'.  This should either be a set\n of new labels for the levels, or a character string, in which case\n the levels are that character string with a sequence number\n appended.\n\n 'factor(x, exclude = NULL)' applied to a factor without 'NA's is a\n no-operation unless there are unused levels: in that case, a\n factor with the reduced level set is returned.  If 'exclude' is\n used, since R version 3.4.0, excluding non-existing character\n levels is equivalent to excluding nothing, and when 'exclude' is a\n 'character' vector, that _is_ applied to the levels of 'x'.\n Alternatively, 'exclude' can be factor with the same level set as\n 'x' and will exclude the levels present in 'exclude'.\n\n The codes of a factor may contain 'NA'.  For a numeric 'x', set\n 'exclude = NULL' to make 'NA' an extra level (prints as '&lt;NA&gt;');\n by default, this is the last level.\n\n If 'NA' is a level, the way to set a code to be missing (as\n opposed to the code of the missing level) is to use 'is.na' on the\n left-hand-side of an assignment (as in 'is.na(f)[i] &lt;- TRUE';\n indexing inside 'is.na' does not work).  Under those circumstances\n missing values are currently printed as '&lt;NA&gt;', i.e., identical to\n entries of level 'NA'.\n\n 'is.factor' is generic: you can write methods to handle specific\n classes of objects, see InternalMethods.\n\n Where 'levels' is not supplied, 'unique' is called.  
Since factors\n typically have quite a small number of levels, for large vectors\n 'x' it is helpful to supply 'nmax' as an upper bound on the number\n of unique values.\n\n When using 'c' to combine a (possibly ordered) factor with other\n objects, if all objects are (possibly ordered) factors, the result\n will be a factor with levels the union of the level sets of the\n elements, in the order the levels occur in the level sets of the\n elements (which means that if all the elements have the same level\n set, that is the level set of the result), equivalent to how\n 'unlist' operates on a list of factor objects.\nValue:\n 'factor' returns an object of class '\"factor\"' which has a set of\n integer codes the length of 'x' with a '\"levels\"' attribute of\n mode 'character' and unique ('!anyDuplicated(.)') entries.  If\n argument 'ordered' is true (or 'ordered()' is used) the result has\n class 'c(\"ordered\", \"factor\")'.  Undocumentedly for a long time,\n 'factor(x)' loses all 'attributes(x)' but '\"names\"', and resets\n '\"levels\"' and '\"class\"'.\n\n Applying 'factor' to an ordered or unordered factor returns a\n factor (of the same type) with just the levels which occur: see\n also '[.factor' for a more transparent way to achieve this.\n\n 'is.factor' returns 'TRUE' or 'FALSE' depending on whether its\n argument is of type factor or not.  Correspondingly, 'is.ordered'\n returns 'TRUE' when its argument is an ordered factor and 'FALSE'\n otherwise.\n\n 'as.factor' coerces its argument to a factor.  It is an\n abbreviated (sometimes faster) form of 'factor'.\n\n 'as.ordered(x)' returns 'x' if this is ordered, and 'ordered(x)'\n otherwise.\n\n 'addNA' modifies a factor by turning 'NA' into an extra level (so\n that 'NA' values are counted in tables, for instance).\n\n '.valid.factor(object)' checks the validity of a factor, currently\n only 'levels(object)', and returns 'TRUE' if it is valid,\n otherwise a string describing the validity problem.  
This function\n is used for 'validObject(&lt;factor&gt;)'.\nWarning:\n The interpretation of a factor depends on both the codes and the\n '\"levels\"' attribute.  Be careful only to compare factors with the\n same set of levels (in the same order).  In particular,\n 'as.numeric' applied to a factor is meaningless, and may happen by\n implicit coercion.  To transform a factor 'f' to approximately its\n original numeric values, 'as.numeric(levels(f))[f]' is recommended\n and slightly more efficient than 'as.numeric(as.character(f))'.\n\n The levels of a factor are by default sorted, but the sort order\n may well depend on the locale at the time of creation, and should\n not be assumed to be ASCII.\n\n There are some anomalies associated with factors that have 'NA' as\n a level.  It is suggested to use them sparingly, e.g., only for\n tabulation purposes.\nComparison operators and group generic methods:\n There are '\"factor\"' and '\"ordered\"' methods for the group generic\n 'Ops' which provide methods for the Comparison operators, and for\n the 'min', 'max', and 'range' generics in 'Summary' of\n '\"ordered\"'.  (The rest of the groups and the 'Math' group\n generate an error as they are not meaningful for factors.)\n\n Only '==' and '!=' can be used for factors: a factor can only be\n compared to another factor with an identical set of levels (not\n necessarily in the same ordering) or to a character vector.\n Ordered factors are compared in the same way, but the general\n dispatch mechanism precludes comparing ordered and unordered\n factors.\n\n All the comparison operators are available for ordered factors.\n Collation is done by the levels of the operands: if both operands\n are ordered factors they must have the same level set.\nNote:\n In earlier versions of R, storing character data as a factor was\n more space efficient if there is even a small proportion of\n repeats.  
However, identical character strings now share storage,\n so the difference is small in most cases.  (Integer values are\n stored in 4 bytes whereas each reference to a character string\n needs a pointer of 4 or 8 bytes.)\nReferences:\n Chambers, J. M. and Hastie, T. J. (1992) _Statistical Models in\n S_.  Wadsworth & Brooks/Cole.\nSee Also:\n '[.factor' for subsetting of factors.\n\n 'gl' for construction of balanced factors and 'C' for factors with\n specified contrasts.  'levels' and 'nlevels' for accessing the\n levels, and 'unclass' to get integer codes.\nExamples:\n (ff &lt;- factor(substring(\"statistics\", 1:10, 1:10), levels = letters))\n as.integer(ff)      # the internal codes\n (f. &lt;- factor(ff))  # drops the levels that do not occur\n ff[, drop = TRUE]   # the same, more transparently\n \n factor(letters[1:20], labels = \"letter\")\n \n class(ordered(4:1)) # \"ordered\", inheriting from \"factor\"\n z &lt;- factor(LETTERS[3:1], ordered = TRUE)\n ## and \"relational\" methods work:\n stopifnot(sort(z)[c(1,3)] == range(z), min(z) &lt; max(z))\n \n \n ## suppose you want \"NA\" as a level, and to allow missing values.\n (x &lt;- factor(c(1, 2, NA), exclude = NULL))\n is.na(x)[2] &lt;- TRUE\n x  # [1] 1    &lt;NA&gt; &lt;NA&gt;\n is.na(x)\n # [1] FALSE  TRUE FALSE\n \n ## More rational, since R 3.4.0 :\n factor(c(1:2, NA), exclude =  \"\" ) # keeps &lt;NA&gt; , as\n factor(c(1:2, NA), exclude = NULL) # always did\n ## exclude = &lt;character&gt;\n z # ordered levels 'A &lt; B &lt; C'\n factor(z, exclude = \"C\") # does exclude\n factor(z, exclude = \"B\") # ditto\n \n ## Now, labels maybe duplicated:\n ## factor() with duplicated labels allowing to \"merge levels\"\n x &lt;- c(\"Man\", \"Male\", \"Man\", \"Lady\", \"Female\")\n ## Map from 4 different values to only two levels:\n (xf &lt;- factor(x, levels = c(\"Male\", \"Man\" , \"Lady\",   \"Female\"),\n                  labels = c(\"Male\", \"Male\", \"Female\", \"Female\")))\n #&gt; [1] Male   Male  
 Male   Female Female\n #&gt; Levels: Male Female\n \n ## Using addNA()\n Month &lt;- airquality$Month\n table(addNA(Month))\n table(addNA(Month, ifany = TRUE))"
+  },
+  {
+    "objectID": "modules/Module07-VarCreationClassesSummaries.html#changing-factor-reference-examples",
+    "href": "modules/Module07-VarCreationClassesSummaries.html#changing-factor-reference-examples",
+    "title": "Module 7: Variable Creation, Classes, and Summaries",
+    "section": "Changing factor reference examples",
+    "text": "Changing factor reference examples\n\ndf$age_group_factor &lt;- relevel(df$age_group_factor, ref=\"young\")\nlevels(df$age_group_factor)\n\n[1] \"young\"  \"middle\" \"old\"   \n\n\nOR\n\ndf$age_group_factor &lt;- factor(df$age_group, levels=c(\"young\", \"middle\", \"old\"))\nlevels(df$age_group_factor)\n\n[1] \"young\"  \"middle\" \"old\"   \n\n\nArranging, tabulating, and plotting the data will now reflect the new order."
+  },
+  {
+    "objectID": "modules/Module07-VarCreationClassesSummaries.html#two-dimensional-data-classes",
+    "href": "modules/Module07-VarCreationClassesSummaries.html#two-dimensional-data-classes",
+    "title": "Module 7: Variable Creation, Classes, and Summaries",
+    "section": "Two-dimensional data classes",
+    "text": "Two-dimensional data classes\nTwo-dimensional classes are those we would often use to store data read from a file:\n\na matrix (matrix class)\na data frame (data.frame or tibble classes)"
+  },
+  {
+    "objectID": "modules/Module07-VarCreationClassesSummaries.html#matrices",
+    "href": "modules/Module07-VarCreationClassesSummaries.html#matrices",
+    "title": "Module 7: Variable Creation, Classes, and Summaries",
+    "section": "Matrices",
+    "text": "Matrices\nMatrices, like data frames, are composed of rows and columns. Unlike a data frame, an entire matrix is composed of one R class. For example, all entries are numeric, or all entries are character.\nas.matrix() creates a matrix from a data frame (where all values are the same class).\nYou can also create a matrix from scratch using matrix(). Use ?matrix to see the arguments.\n\nmatrix(data=1:6, ncol = 2) \n\n1 4\n2 5\n3 6\n\nmatrix(data=1:6, ncol=2, byrow=TRUE) \n\n1 2\n3 4\n5 6\n\nNote that the first matrix was filled with the numbers 1-6 by columns first and then rows, because the default value of the byrow argument is FALSE. In the second matrix, we changed byrow to TRUE, so the numbers 1-6 are filled by rows first and then columns."
+  },
+  {
+    "objectID": "modules/Module07-VarCreationClassesSummaries.html#data-frame",
+    "href": "modules/Module07-VarCreationClassesSummaries.html#data-frame",
+    "title": "Module 7: Variable Creation, Classes, and Summaries",
+    "section": "Data frame",
+    "text": "Data frame\nYou can transform an existing matrix into a data frame using as.data.frame().\n\nas.data.frame(matrix(1:6, ncol = 2)) \n\nV1 V2\n1 4\n2 5\n3 6"
+  },
+  {
+    "objectID": "modules/Module07-VarCreationClassesSummaries.html#numeric-variable-data-summary",
+    "href": "modules/Module07-VarCreationClassesSummaries.html#numeric-variable-data-summary",
+    "title": "Module 7: Variable Creation, Classes, and Summaries",
+    "section": "Numeric variable data summary",
+    "text": "Numeric variable data summary\nData summarization on numeric vectors/variables:\n\nmean(): takes the mean of x\nsd(): takes the standard deviation of x\nmedian(): takes the median of x\nquantile(): displays sample quantiles of x. By default, these are the minimum, 25th percentile, median, 75th percentile, and maximum\nrange(): displays the range. Same as c(min(), max())\nsum(): sum of x\nmax(): maximum value in x\nmin(): minimum value in x\n\nNote that all of these have the na.rm argument for handling missing data.\nArithmetic Mean\nDescription:\n Generic function for the (trimmed) arithmetic mean.\nUsage:\n mean(x, ...)\n \n ## Default S3 method:\n mean(x, trim = 0, na.rm = FALSE, ...)\n \nArguments:\n   x: An R object.  Currently there are methods for numeric/logical\n      vectors and date, date-time and time interval objects.\n      Complex vectors are allowed for 'trim = 0', only.\n\ntrim: the fraction (0 to 0.5) of observations to be trimmed from\n      each end of 'x' before the mean is computed.  Values of trim\n      outside that range are taken as the nearest endpoint.\nna.rm: a logical evaluating to ‘TRUE’ or ‘FALSE’ indicating whether ‘NA’ values should be stripped before the computation proceeds.\n ...: further arguments passed to or from other methods.\nValue:\n If 'trim' is zero (the default), the arithmetic mean of the values\n in 'x' is computed, as a numeric or complex vector of length one.\n If 'x' is not logical (coerced to numeric), numeric (including\n integer) or complex, 'NA_real_' is returned, with a warning.\n\n If 'trim' is non-zero, a symmetrically trimmed mean is computed\n with a fraction of 'trim' observations deleted from each end\n before the mean is computed.\nReferences:\n Becker, R. A., Chambers, J. M. and Wilks, A. R. (1988) _The New S\n Language_.  Wadsworth & Brooks/Cole.\nSee Also:\n 'weighted.mean', 'mean.POSIXct', 'colMeans' for row and column\n means.\nExamples:\n x &lt;- c(0:10, 50)\n xm &lt;- mean(x)\n c(xm, mean(x, trim = 0.10))"
+  },
+  {
+    "objectID": "modules/Module07-VarCreationClassesSummaries.html#numeric-variable-data-summary-examples",
+    "href": "modules/Module07-VarCreationClassesSummaries.html#numeric-variable-data-summary-examples",
+    "title": "Module 7: Variable Creation, Classes, and Summaries",
+    "section": "Numeric variable data summary examples",
+    "text": "Numeric variable data summary examples\n\nsummary(df)\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\nobservation_id\nIgG_concentration\nage\ngender\nslum\nlog_IgG\nseropos\nage_group\nage_group_factor\n\n\n\n\n\nMin. :5006\nMin. : 0.0054\nMin. : 1.000\nLength:651\nLength:651\nMin. :-5.2231\nMode :logical\nLength:651\nyoung :316\n\n\n\n1st Qu.:6306\n1st Qu.: 0.3000\n1st Qu.: 3.000\nClass :character\nClass :character\n1st Qu.:-1.2040\nFALSE:360\nClass :character\nmiddle:179\n\n\n\nMedian :7495\nMedian : 1.6658\nMedian : 6.000\nMode :character\nMode :character\nMedian : 0.5103\nTRUE :281\nMode :character\nold :147\n\n\n\nMean :7492\nMean : 87.3683\nMean : 6.606\nNA\nNA\nMean : 1.6074\nNA’s :10\nNA\nNA’s : 9\n\n\n\n3rd Qu.:8749\n3rd Qu.:141.4405\n3rd Qu.:10.000\nNA\nNA\n3rd Qu.: 4.9519\nNA\nNA\nNA\n\n\n\nMax. :9982\nMax. :916.4179\nMax. :15.000\nNA\nNA\nMax. : 6.8205\nNA\nNA\nNA\n\n\n\nNA\nNA’s :10\nNA’s :9\nNA\nNA\nNA’s :10\nNA\nNA\nNA\n\n\n\n\nrange(df$age)\n\n[1] NA NA\n\nrange(df$age, na.rm=TRUE)\n\n[1]  1 15\n\nmedian(df$IgG_concentration, na.rm=TRUE)\n\n[1] 1.665753"
+  },
+  {
+    "objectID": "modules/Module07-VarCreationClassesSummaries.html#character-variable-data-summaries",
+    "href": "modules/Module07-VarCreationClassesSummaries.html#character-variable-data-summaries",
+    "title": "Module 7: Variable Creation, Classes, and Summaries",
+    "section": "Character variable data summaries",
+    "text": "Character variable data summaries\nData summarization on character or factor vectors/variables using table()\nCross Tabulation and Table Creation\nDescription:\n 'table' uses cross-classifying factors to build a contingency\n table of the counts at each combination of factor levels.\nUsage:\n table(...,\n       exclude = if (useNA == \"no\") c(NA, NaN),\n       useNA = c(\"no\", \"ifany\", \"always\"),\n       dnn = list.names(...), deparse.level = 1)\n \n as.table(x, ...)\n is.table(x)\n \n ## S3 method for class 'table'\n as.data.frame(x, row.names = NULL, ...,\n               responseName = \"Freq\", stringsAsFactors = TRUE,\n               sep = \"\", base = list(LETTERS))\n \nArguments:\n ...: one or more objects which can be interpreted as factors\n      (including numbers or character strings), or a 'list' (such\n      as a data frame) whose components can be so interpreted.\n      (For 'as.table', arguments passed to specific methods; for\n      'as.data.frame', unused.)\nexclude: levels to remove for all factors in ‘…’. If it does not contain ‘NA’ and ‘useNA’ is not specified, it implies ‘useNA = “ifany”’. See ‘Details’ for its interpretation for non-factor arguments.\nuseNA: whether to include ‘NA’ values in the table. See ‘Details’. Can be abbreviated.\n dnn: the names to be given to the dimensions in the result (the\n      _dimnames names_).\ndeparse.level: controls how the default ‘dnn’ is constructed. See ‘Details’.\n   x: an arbitrary R object, or an object inheriting from class\n      '\"table\"' for the 'as.data.frame' method. 
Note that\n      'as.data.frame.table(x, *)' may be called explicitly for\n      non-table 'x' for \"reshaping\" 'array's.\nrow.names: a character vector giving the row names for the data frame.\nresponseName: The name to be used for the column of table entries, usually counts.\nstringsAsFactors: logical: should the classifying factors be returned as factors (the default) or character vectors?\nsep, base: passed to ‘provideDimnames’.\nDetails:\n If the argument 'dnn' is not supplied, the internal function\n 'list.names' is called to compute the 'dimname names' as follows:\n If '...' is one 'list' with its own 'names()', these 'names' are\n used.  Otherwise, if the arguments in '...' are named, those names\n are used.  For the remaining arguments, 'deparse.level = 0' gives\n an empty name, 'deparse.level = 1' uses the supplied argument if\n it is a symbol, and 'deparse.level = 2' will deparse the argument.\n\n Only when 'exclude' is specified (i.e., not by default) and\n non-empty, will 'table' potentially drop levels of factor\n arguments.\n\n 'useNA' controls if the table includes counts of 'NA' values: the\n allowed values correspond to never ('\"no\"'), only if the count is\n positive ('\"ifany\"') and even for zero counts ('\"always\"').  Note\n the somewhat \"pathological\" case of two different kinds of 'NA's\n which are treated differently, depending on both 'useNA' and\n 'exclude', see 'd.patho' in the 'Examples:' below.\n\n Both 'exclude' and 'useNA' operate on an \"all or none\" basis.  If\n you want to control the dimensions of a multiway table separately,\n modify each argument using 'factor' or 'addNA'.\n\n Non-factor arguments 'a' are coerced via 'factor(a,\n exclude=exclude)'.  
Since R 3.4.0, care is taken _not_ to count\n the excluded values (where they were included in the 'NA' count,\n previously).\n\n The 'summary' method for class '\"table\"' (used for objects created\n by 'table' or 'xtabs') which gives basic information and performs\n a chi-squared test for independence of factors (note that the\n function 'chisq.test' currently only handles 2-d tables).\nValue:\n 'table()' returns a _contingency table_, an object of class\n '\"table\"', an array of integer values.  Note that unlike S the\n result is always an 'array', a 1D array if one factor is given.\n\n 'as.table' and 'is.table' coerce to and test for contingency\n table, respectively.\n\n The 'as.data.frame' method for objects inheriting from class\n '\"table\"' can be used to convert the array-based representation of\n a contingency table to a data frame containing the classifying\n factors and the corresponding entries (the latter as component\n named by 'responseName').  This is the inverse of 'xtabs'.\nReferences:\n Becker, R. A., Chambers, J. M. and Wilks, A. R. (1988) _The New S\n Language_.  
Wadsworth & Brooks/Cole.\nSee Also:\n 'tabulate' is the underlying function and allows finer control.\n\n Use 'ftable' for printing (and more) of multidimensional tables.\n 'margin.table', 'prop.table', 'addmargins'.\n\n 'addNA' for constructing factors with 'NA' as a level.\n\n 'xtabs' for cross tabulation of data frames with a formula\n interface.\nExamples:\n require(stats) # for rpois and xtabs\n ## Simple frequency distribution\n table(rpois(100, 5))\n ## Check the design:\n with(warpbreaks, table(wool, tension))\n table(state.division, state.region)\n \n # simple two-way contingency table\n with(airquality, table(cut(Temp, quantile(Temp)), Month))\n \n a &lt;- letters[1:3]\n table(a, sample(a))                    # dnn is c(\"a\", \"\")\n table(a, sample(a), deparse.level = 0) # dnn is c(\"\", \"\")\n table(a, sample(a), deparse.level = 2) # dnn is c(\"a\", \"sample(a)\")\n \n ## xtabs() &lt;-&gt; as.data.frame.table() :\n UCBAdmissions ## already a contingency table\n DF &lt;- as.data.frame(UCBAdmissions)\n class(tab &lt;- xtabs(Freq ~ ., DF)) # xtabs & table\n ## tab *is* \"the same\" as the original table:\n all(tab == UCBAdmissions)\n all.equal(dimnames(tab), dimnames(UCBAdmissions))\n \n a &lt;- rep(c(NA, 1/0:3), 10)\n table(a)                 # does not report NA's\n table(a, exclude = NULL) # reports NA's\n b &lt;- factor(rep(c(\"A\",\"B\",\"C\"), 10))\n table(b)\n table(b, exclude = \"B\")\n d &lt;- factor(rep(c(\"A\",\"B\",\"C\"), 10), levels = c(\"A\",\"B\",\"C\",\"D\",\"E\"))\n table(d, exclude = \"B\")\n print(table(b, d), zero.print = \".\")\n \n ## NA counting:\n is.na(d) &lt;- 3:4\n d. &lt;- addNA(d)\n d.[1:7]\n table(d.) # \", exclude = NULL\" is not needed\n ## i.e., if you want to count the NA's of 'd', use\n table(d, useNA = \"ifany\")\n \n ## \"pathological\" case:\n d.patho &lt;- addNA(c(1,NA,1:2,1:3))[-7]; is.na(d.patho) &lt;- 3:4\n d.patho\n ## just 3 consecutive NA's ? 
--- well, have *two* kinds of NAs here :\n as.integer(d.patho) # 1 4 NA NA 1 2\n ##\n ## In R &gt;= 3.4.0, table() allows to differentiate:\n table(d.patho)                   # counts the \"unusual\" NA\n table(d.patho, useNA = \"ifany\")  # counts all three\n table(d.patho, exclude = NULL)   #  (ditto)\n table(d.patho, exclude = NA)     # counts none\n \n ## Two-way tables with NA counts. The 3rd variant is absurd, but shows\n ## something that cannot be done using exclude or useNA.\n with(airquality,\n    table(OzHi = Ozone &gt; 80, Month, useNA = \"ifany\"))\n with(airquality,\n    table(OzHi = Ozone &gt; 80, Month, useNA = \"always\"))\n with(airquality,\n    table(OzHi = Ozone &gt; 80, addNA(Month)))"
+  },
+  {
+    "objectID": "modules/Module07-VarCreationClassesSummaries.html#character-variable-data-summary-examples",
+    "href": "modules/Module07-VarCreationClassesSummaries.html#character-variable-data-summary-examples",
+    "title": "Module 7: Variable Creation, Classes, and Summaries",
+    "section": "Character variable data summary examples",
+    "text": "Character variable data summary examples\nNumber of observations in each category\n\ntable(df$gender)\n\n\n\n\nFemale\nMale\n\n\n\n\n325\n326\n\n\n\n\ntable(df$gender, useNA=\"always\")\n\n\n\n\nFemale\nMale\nNA\n\n\n\n\n325\n326\n0\n\n\n\n\ntable(df$age_group, useNA=\"always\")\n\n\n\n\nmiddle\nold\nyoung\nNA\n\n\n\n\n179\n147\n316\n9\n\n\n\n\n\n\ntable(df$gender)/nrow(df) #if no NA values\n\n\n\n\nFemale\nMale\n\n\n\n\n0.499232\n0.500768\n\n\n\n\ntable(df$age_group)/nrow(df[!is.na(df$age_group),]) #if there are NA values\n\n\n\n\nmiddle\nold\nyoung\n\n\n\n\n0.2788162\n0.228972\n0.4922118\n\n\n\n\ntable(df$age_group)/nrow(subset(df, !is.na(df$age_group),)) #if there are NA values\n\n\n\n\nmiddle\nold\nyoung\n\n\n\n\n0.2788162\n0.228972\n0.4922118"
+  },
+  {
+    "objectID": "modules/Module07-VarCreationClassesSummaries.html#summary",
+    "href": "modules/Module07-VarCreationClassesSummaries.html#summary",
+    "title": "Module 7: Variable Creation, Classes, and Summaries",
+    "section": "Summary",
+    "text": "Summary\n\nAdd (or modify) columns/variables in a data frame by using $\nThere are two types of numeric class objects: integer and double\nLogical class objects only have the values TRUE or FALSE (without quotes)\nis.CLASS_NAME(x) can be used to test the class of an object x\nas.CLASS_NAME(x) can be used to change the class of an object x\nFactors are a special character class that has levels"
+  },
+  {
+    "objectID": "modules/Module07-VarCreationClassesSummaries.html#acknowledgements",
+    "href": "modules/Module07-VarCreationClassesSummaries.html#acknowledgements",
+    "title": "Module 7: Variable Creation, Classes, and Summaries",
+    "section": "Acknowledgements",
+    "text": "Acknowledgements\nThese are the materials we looked through, modified, or extracted to complete this module’s lecture.\n\n“Introduction to R for Public Health Researchers” Johns Hopkins University"
+  },
+  {
+    "objectID": "modules/Module07-VarCreationClassesSummaries.html#adding-new-columns-1",
+    "href": "modules/Module07-VarCreationClassesSummaries.html#adding-new-columns-1",
+    "title": "Module 7: Variable Creation, Classes, and Summaries",
+    "section": "Adding new columns",
+    "text": "Adding new columns\nWe can also add a new column using the transform() function:\n\n\nTransform an Object, for Example a Data Frame\n\nDescription:\n\n     'transform' is a generic function, which-at least currently-only\n     does anything useful with data frames.  'transform.default'\n     converts its first argument to a data frame if possible and calls\n     'transform.data.frame'.\n\nUsage:\n\n     transform(`_data`, ...)\n     \nArguments:\n\n   _data: The object to be transformed\n\n     ...: Further arguments of the form 'tag=value'\n\nDetails:\n\n     The '...' arguments to 'transform.data.frame' are tagged vector\n     expressions, which are evaluated in the data frame '_data'.  The\n     tags are matched against 'names(_data)', and for those that match,\n     the value replace the corresponding variable in '_data', and the\n     others are appended to '_data'.\n\nValue:\n\n     The modified value of '_data'.\n\nWarning:\n\n     This is a convenience function intended for use interactively.\n     For programming it is better to use the standard subsetting\n     arithmetic functions, and in particular the non-standard\n     evaluation of argument 'transform' can have unanticipated\n     consequences.\n\nNote:\n\n     If some of the values are not vectors of the appropriate length,\n     you deserve whatever you get!\n\nAuthor(s):\n\n     Peter Dalgaard\n\nSee Also:\n\n     'within' for a more flexible approach, 'subset', 'list',\n     'data.frame'\n\nExamples:\n\n     transform(airquality, Ozone = -Ozone)\n     transform(airquality, new = -Ozone, Temp = (Temp-32)/1.8)\n     \n     attach(airquality)\n     transform(Ozone, logOzone = log(Ozone)) # marginally interesting ...\n     detach(airquality)\n\n\nFor example, adding a binary column for seropositivity called seropos:\n\ndf &lt;- transform(df, seropos = IgG_concentration &gt;= 
10)\nhead(df)\n\n\n\n\n\n\n\n\n\n\n\n\n\nobservation_id\nIgG_concentration\nage\ngender\nslum\nlog_IgG\nseropos\n\n\n\n\n5772\n0.3176895\n2\nFemale\nNon slum\n-1.1466807\nFALSE\n\n\n8095\n3.4368231\n4\nFemale\nNon slum\n1.2345475\nFALSE\n\n\n9784\n0.3000000\n4\nMale\nNon slum\n-1.2039728\nFALSE\n\n\n9338\n143.2363014\n4\nMale\nNon slum\n4.9644957\nTRUE\n\n\n6369\n0.4476534\n1\nMale\nNon slum\n-0.8037359\nFALSE\n\n\n6885\n0.0252708\n4\nMale\nNon slum\n-3.6781074\nFALSE"
+  },
+  {
+    "objectID": "modules/Module08-DataMergeReshape.html#learning-objectives",
+    "href": "modules/Module08-DataMergeReshape.html#learning-objectives",
+    "title": "Module 8: Data Merging and Reshaping",
+    "section": "Learning Objectives",
+    "text": "Learning Objectives\nAfter module 8, you should be able to…\n\nMerge/join data together\nReshape data from wide to long\nReshape data from long to wide"
+  },
+  {
+    "objectID": "modules/Module08-DataMergeReshape.html#joining-types",
+    "href": "modules/Module08-DataMergeReshape.html#joining-types",
+    "title": "Module 8: Data Merging and Reshaping",
+    "section": "Joining types",
+    "text": "Joining types\nPay close attention to the number of rows in your data set before and after a join; an unexpected change in the row count helps flag that something has gone wrong. The row count you should expect depends on the type of merge:\n\n1:1 merge (one-to-one merge) – Simplest merge (sometimes things go wrong)\n1:m merge (one-to-many merge) – More complex (things often go wrong)\n\nThe “one” means that in one dataset each value of the merging variable (e.g., id) appears only once, and the “many” means that in the other dataset a value of the merging variable can appear multiple times\n\nm:m merge (many-to-many merge) – Danger zone (can be unpredictable)"
+  },
+  {
+    "objectID": "modules/Module08-DataMergeReshape.html#one-to-one-merge",
+    "href": "modules/Module08-DataMergeReshape.html#one-to-one-merge",
+    "title": "Module 8: Data Merging and Reshaping",
+    "section": "one-to-one merge",
+    "text": "one-to-one merge\n\nEach row of data represents a unique unit of analysis (e.g., identified by an id variable) that also exists in the other dataset\nThe other dataset will likely have variables that don’t exist in the current dataset (that’s why you are trying to merge it in)\nEach value of the merging variable (e.g., id) is represented a single time in each dataset\nYou should try to structure your data so that a 1:1 merge or 1:m merge is possible so that fewer things can go wrong."
+  },
+  {
+    "objectID": "modules/Module08-DataMergeReshape.html#merge-function",
+    "href": "modules/Module08-DataMergeReshape.html#merge-function",
+    "title": "Module 8: Data Merging and Reshaping",
+    "section": "merge() function",
+    "text": "merge() function\nWe will use the merge() function to perform a one-to-one merge\nMerge Two Data Frames\nDescription:\n Merge two data frames by common columns or row names, or do other\n versions of database _join_ operations.\nUsage:\n merge(x, y, ...)\n \n ## Default S3 method:\n merge(x, y, ...)\n \n ## S3 method for class 'data.frame'\n merge(x, y, by = intersect(names(x), names(y)),\n       by.x = by, by.y = by, all = FALSE, all.x = all, all.y = all,\n       sort = TRUE, suffixes = c(\".x\",\".y\"), no.dups = TRUE,\n       incomparables = NULL, ...)\n \nArguments:\nx, y: data frames, or objects to be coerced to one.\nby, by.x, by.y: specifications of the columns used for merging. See ‘Details’.\n all: logical; 'all = L' is shorthand for 'all.x = L' and 'all.y =\n      L', where 'L' is either 'TRUE' or 'FALSE'.\nall.x: logical; if ‘TRUE’, then extra rows will be added to the output, one for each row in ‘x’ that has no matching row in ‘y’. These rows will have ‘NA’s in those columns that are usually filled with values from ‘y’. The default is ‘FALSE’, so that only rows with data from both ‘x’ and ‘y’ are included in the output.\nall.y: logical; analogous to ‘all.x’.\nsort: logical.  Should the result be sorted on the 'by' columns?\nsuffixes: a character vector of length 2 specifying the suffixes to be used for making unique the names of columns in the result which are not used for merging (appearing in ‘by’ etc).\nno.dups: logical indicating that ‘suffixes’ are appended in more cases to avoid duplicated column names in the result. This was implicitly false before R version 3.5.0.\nincomparables: values which cannot be matched. See ‘match’. 
This is intended to be used for merging on one column, so these are incomparable values of that column.\n ...: arguments to be passed to or from methods.\nDetails:\n 'merge' is a generic function whose principal method is for data\n frames: the default method coerces its arguments to data frames\n and calls the '\"data.frame\"' method.\n\n By default the data frames are merged on the columns with names\n they both have, but separate specifications of the columns can be\n given by 'by.x' and 'by.y'.  The rows in the two data frames that\n match on the specified columns are extracted, and joined together.\n If there is more than one match, all possible matches contribute\n one row each.  For the precise meaning of 'match', see 'match'.\n\n Columns to merge on can be specified by name, number or by a\n logical vector: the name '\"row.names\"' or the number '0' specifies\n the row names.  If specified by name it must correspond uniquely\n to a named column in the input.\n\n If 'by' or both 'by.x' and 'by.y' are of length 0 (a length zero\n vector or 'NULL'), the result, 'r', is the _Cartesian product_ of\n 'x' and 'y', i.e., 'dim(r) = c(nrow(x)*nrow(y), ncol(x) +\n ncol(y))'.\n\n If 'all.x' is true, all the non matching cases of 'x' are appended\n to the result as well, with 'NA' filled in the corresponding\n columns of 'y'; analogously for 'all.y'.\n\n If the columns in the data frames not used in merging have any\n common names, these have 'suffixes' ('\".x\"' and '\".y\"' by default)\n appended to try to make the names of the result unique.  
If this\n is not possible, an error is thrown.\n\n If a 'by.x' column name matches one of 'y', and if 'no.dups' is\n true (as by default), the y version gets suffixed as well,\n avoiding duplicate column names in the result.\n\n The complexity of the algorithm used is proportional to the length\n of the answer.\n\n In SQL database terminology, the default value of 'all = FALSE'\n gives a _natural join_, a special case of an _inner join_.\n Specifying 'all.x = TRUE' gives a _left (outer) join_, 'all.y =\n TRUE' a _right (outer) join_, and both ('all = TRUE') a _(full)\n outer join_.  DBMSes do not match 'NULL' records, equivalent to\n 'incomparables = NA' in R.\nValue:\n A data frame.  The rows are by default lexicographically sorted on\n the common columns, but for 'sort = FALSE' are in an unspecified\n order.  The columns are the common columns followed by the\n remaining columns in 'x' and then those in 'y'.  If the matching\n involved row names, an extra character column called 'Row.names'\n is added at the left, and in all cases the result has 'automatic'\n row names.\nNote:\n This is intended to work with data frames with vector-like\n columns: some aspects work with data frames containing matrices,\n but not all.\n\n Currently long vectors are not accepted for inputs, which are thus\n restricted to less than 2^31 rows. 
That restriction also applies\n to the result for 32-bit platforms.\nSee Also:\n 'data.frame', 'by', 'cbind'.\n\n 'dendrogram' for a class which has a 'merge' method.\nExamples:\n authors &lt;- data.frame(\n     ## I(*) : use character columns of names to get sensible sort order\n     surname = I(c(\"Tukey\", \"Venables\", \"Tierney\", \"Ripley\", \"McNeil\")),\n     nationality = c(\"US\", \"Australia\", \"US\", \"UK\", \"Australia\"),\n     deceased = c(\"yes\", rep(\"no\", 4)))\n authorN &lt;- within(authors, { name &lt;- surname; rm(surname) })\n books &lt;- data.frame(\n     name = I(c(\"Tukey\", \"Venables\", \"Tierney\",\n              \"Ripley\", \"Ripley\", \"McNeil\", \"R Core\")),\n     title = c(\"Exploratory Data Analysis\",\n               \"Modern Applied Statistics ...\",\n               \"LISP-STAT\",\n               \"Spatial Statistics\", \"Stochastic Simulation\",\n               \"Interactive Data Analysis\",\n               \"An Introduction to R\"),\n     other.author = c(NA, \"Ripley\", NA, NA, NA, NA,\n                      \"Venables & Smith\"))\n \n (m0 &lt;- merge(authorN, books))\n (m1 &lt;- merge(authors, books, by.x = \"surname\", by.y = \"name\"))\n  m2 &lt;- merge(books, authors, by.x = \"name\", by.y = \"surname\")\n stopifnot(exprs = {\n    identical(m0, m2[, names(m0)])\n    as.character(m1[, 1]) == as.character(m2[, 1])\n    all.equal(m1[, -1], m2[, -1][ names(m1)[-1] ])\n    identical(dim(merge(m1, m2, by = NULL)),\n              c(nrow(m1)*nrow(m2), ncol(m1)+ncol(m2)))\n })\n \n ## \"R core\" is missing from authors and appears only here :\n merge(authors, books, by.x = \"surname\", by.y = \"name\", all = TRUE)\n \n \n ## example of using 'incomparables'\n x &lt;- data.frame(k1 = c(NA,NA,3,4,5), k2 = c(1,NA,NA,4,5), data = 1:5)\n y &lt;- data.frame(k1 = c(NA,2,NA,4,5), k2 = c(NA,NA,3,4,5), data = 1:5)\n merge(x, y, by = c(\"k1\",\"k2\")) # NA's match\n merge(x, y, by = \"k1\") # NA's match, so 6 rows\n merge(x, y, by = \"k2\", 
incomparables = NA) # 2 rows"
+  },
+  {
+    "objectID": "modules/Module08-DataMergeReshape.html#lets-import-the-new-data-we-want-to-merge-and-take-a-look",
+    "href": "modules/Module08-DataMergeReshape.html#lets-import-the-new-data-we-want-to-merge-and-take-a-look",
+    "title": "Module 8: Data Merging and Reshaping",
+    "section": "Let's import the new data we want to merge and take a look",
+    "text": "Let's import the new data we want to merge and take a look\nThe new data, serodata_new.csv, come from a follow-up serological survey conducted four years later. At this follow-up, individuals were retested for IgG antibody concentrations and their ages were recorded again.\n\ndf_new &lt;- read.csv(\"data/serodata_new.csv\")\nstr(df_new)\n\n'data.frame':   636 obs. of  3 variables:\n $ observation_id   : int  5772 8095 9784 9338 6369 6885 6252 8913 7332 6941 ...\n $ IgG_concentration: num  0.261 2.981 0.282 136.638 0.381 ...\n $ age              : int  6 8 8 8 5 8 8 NA 8 6 ...\n\nsummary(df_new)\n\n\n\n\n\nobservation_id\nIgG_concentration\nage\n\n\n\n\n\nMin. :5006\nMin. : 0.0051\nMin. : 5.00\n\n\n\n1st Qu.:6328\n1st Qu.: 0.2751\n1st Qu.: 7.00\n\n\n\nMedian :7494\nMedian : 1.5477\nMedian :10.00\n\n\n\nMean :7490\nMean : 82.7684\nMean :10.63\n\n\n\n3rd Qu.:8736\n3rd Qu.:129.6389\n3rd Qu.:14.00\n\n\n\nMax. :9982\nMax. :950.6590\nMax. :19.00\n\n\n\nNA\nNA\nNA’s :9"
+  },
+  {
+    "objectID": "modules/Module08-DataMergeReshape.html#merge-the-new-data-with-the-original-data",
+    "href": "modules/Module08-DataMergeReshape.html#merge-the-new-data-with-the-original-data",
+    "title": "Module 8: Data Merging and Reshaping",
+    "section": "Merge the new data with the original data",
+    "text": "Merge the new data with the original data\nLet's load the old data as well and look for a variable, or variables, to merge by.\n\ndf &lt;- read.csv(\"data/serodata.csv\")\ncolnames(df)\n\n[1] \"observation_id\"    \"IgG_concentration\" \"age\"              \n[4] \"gender\"            \"slum\"             \n\n\nWe notice that observation_id seems to be the obvious variable by which to merge. However, IgG_concentration and age have exactly the same names in both data sets. If we merge now, merge() appends the suffixes .x and .y to tell them apart:\n\nhead(merge(df, df_new, all.x=TRUE, all.y=TRUE, by=c('observation_id')))\n\n\n\n\n\n\n\n\n\n\n\n\n\nobservation_id\nIgG_concentration.x\nage.x\ngender\nslum\nIgG_concentration.y\nage.y\n\n\n\n\n5006\n164.2979452\n7\nMale\nNon slum\n155.5811325\n11\n\n\n5024\n0.3000000\n5\nFemale\nNon slum\n0.2918605\n9\n\n\n5026\n0.3000000\n10\nFemale\nNon slum\n0.2542945\n14\n\n\n5030\n0.0555556\n7\nFemale\nNon slum\n0.0533262\n11\n\n\n5035\n26.2112514\n11\nFemale\nNon slum\n22.0159300\n15\n\n\n5054\n0.3000000\n3\nMale\nNon slum\n0.2709671\n7"
+  },
+  {
+    "objectID": "modules/Module08-DataMergeReshape.html#merge-the-new-data-with-the-original-data-1",
+    "href": "modules/Module08-DataMergeReshape.html#merge-the-new-data-with-the-original-data-1",
+    "title": "Module 8: Data Merging and Reshaping",
+    "section": "Merge the new data with the original data",
+    "text": "Merge the new data with the original data\nThe first option is to rename the IgG_concentration and age variables before the merge, so that it is clear which values come from time point 1 and which from time point 2.\n\ndf$IgG_concentration_time1 &lt;- df$IgG_concentration\ndf$age_time1 &lt;- df$age\ndf$IgG_concentration &lt;- df$age &lt;- NULL #remove the original variables\n\ndf_new$IgG_concentration_time2 &lt;- df_new$IgG_concentration\ndf_new$age_time2 &lt;- df_new$age\ndf_new$IgG_concentration &lt;- df_new$age &lt;- NULL #remove the original variables\n\nNow, let's merge.\n\ndf_all_wide &lt;- merge(df, df_new, all.x=TRUE, all.y=TRUE, by=c('observation_id'))\nstr(df_all_wide)\n\n'data.frame':   651 obs. of  7 variables:\n $ observation_id         : int  5006 5024 5026 5030 5035 5054 5057 5063 5064 5080 ...\n $ gender                 : chr  \"Male\" \"Female\" \"Female\" \"Female\" ...\n $ slum                   : chr  \"Non slum\" \"Non slum\" \"Non slum\" \"Non slum\" ...\n $ IgG_concentration_time1: num  164.2979 0.3 0.3 0.0556 26.2113 ...\n $ age_time1              : int  7 5 10 7 11 3 3 12 14 6 ...\n $ IgG_concentration_time2: num  155.5811 0.2919 0.2543 0.0533 22.0159 ...\n $ age_time2              : int  11 9 14 11 15 7 7 16 18 10 ..."
+  },
+  {
+    "objectID": "modules/Module08-DataMergeReshape.html#merge-the-new-data-with-the-original-data-2",
+    "href": "modules/Module08-DataMergeReshape.html#merge-the-new-data-with-the-original-data-2",
+    "title": "Module 8: Data Merging and Reshaping",
+    "section": "Merge the new data with the original data",
+    "text": "Merge the new data with the original data\nThe second option is to add a time variable to the two data sets and then merge by observation_id, time, age, and IgG_concentration. Note that I need to read in the data again because I removed the IgG_concentration and age variables.\n\ndf &lt;- read.csv(\"data/serodata.csv\")\ndf_new &lt;- read.csv(\"data/serodata_new.csv\")\n\n\ndf$time &lt;- 1 #you can put in one number and it will repeat it\ndf_new$time &lt;- 2\nhead(df)\n\n\n\n\nobservation_id\nIgG_concentration\nage\ngender\nslum\ntime\n\n\n\n\n5772\n0.3176895\n2\nFemale\nNon slum\n1\n\n\n8095\n3.4368231\n4\nFemale\nNon slum\n1\n\n\n9784\n0.3000000\n4\nMale\nNon slum\n1\n\n\n9338\n143.2363014\n4\nMale\nNon slum\n1\n\n\n6369\n0.4476534\n1\nMale\nNon slum\n1\n\n\n6885\n0.0252708\n4\nMale\nNon slum\n1\n\n\n\n\nhead(df_new)\n\n\n\n\nobservation_id\nIgG_concentration\nage\ntime\n\n\n\n\n5772\n0.2612388\n6\n2\n\n\n8095\n2.9809049\n8\n2\n\n\n9784\n0.2819489\n8\n2\n\n\n9338\n136.6382260\n8\n2\n\n\n6369\n0.3810119\n5\n2\n\n\n6885\n0.0245951\n8\n2\n\n\n\n\n\nNow, let's merge. Note that “By default the data frames are merged on the columns with names they both have”; therefore, if I don’t specify the by argument, it will merge on all matching variables.\n\ndf_all_long &lt;- merge(df, df_new, all.x=TRUE, all.y=TRUE) \nstr(df_all_long)\n\n'data.frame':   1287 obs. of  6 variables:\n $ observation_id   : int  5006 5006 5024 5024 5026 5026 5030 5030 5035 5035 ...\n $ IgG_concentration: num  155.581 164.298 0.292 0.3 0.254 ...\n $ age              : int  11 7 9 5 14 10 11 7 15 11 ...\n $ time             : num  2 1 2 1 2 1 2 1 2 1 ...\n $ gender           : chr  NA \"Male\" NA \"Female\" ...\n $ slum             : chr  NA \"Non slum\" NA \"Non slum\" ...\n\n\nNote that there are 1287 rows, which is the sum of the number of rows of df (651 rows) and df_new (636 rows): the time values differ between the two data sets, so no rows match and every row from both is kept."
+  },
+  {
+    "objectID": "modules/Module08-DataMergeReshape.html#what-is-widelong-data",
+    "href": "modules/Module08-DataMergeReshape.html#what-is-widelong-data",
+    "title": "Module 8: Data Merging and Reshaping",
+    "section": "What is wide/long data?",
+    "text": "What is wide/long data?\nAbove, we actually created a wide and long version of the data.\nWide: has many columns\n\nmultiple columns per individual, values spread across multiple columns\neasier for humans to read\n\nLong: has many rows\n\ncolumn names become data\nmultiple rows per observation, a single column contains the values\neasier for R to make plots & do analysis"
+  },
+  {
+    "objectID": "modules/Module08-DataMergeReshape.html#reshape-function",
+    "href": "modules/Module08-DataMergeReshape.html#reshape-function",
+    "title": "Module 8: Data Merging and Reshaping",
+    "section": "reshape() function",
+    "text": "reshape() function\nThe reshape() function allows you to toggle between wide and long data\nReshape Grouped Data\nDescription:\n This function reshapes a data frame between 'wide' format (with\n repeated measurements in separate columns of the same row) and\n 'long' format (with the repeated measurements in separate rows).\nUsage:\n reshape(data, varying = NULL, v.names = NULL, timevar = \"time\",\n         idvar = \"id\", ids = 1:NROW(data),\n         times = seq_along(varying[[1]]),\n         drop = NULL, direction, new.row.names = NULL,\n         sep = \".\",\n         split = if (sep == \"\") {\n             list(regexp = \"[A-Za-z][0-9]\", include = TRUE)\n         } else {\n             list(regexp = sep, include = FALSE, fixed = TRUE)}\n         )\n \n ### Typical usage for converting from long to wide format:\n \n # reshape(data, direction = \"wide\",\n #         idvar = \"___\", timevar = \"___\", # mandatory\n #         v.names = c(___),    # time-varying variables\n #         varying = list(___)) # auto-generated if missing\n \n ### Typical usage for converting from wide to long format:\n \n ### If names of wide-format variables are in a 'nice' format\n \n # reshape(data, direction = \"long\",\n #         varying = c(___), # vector \n #         sep)              # to help guess 'v.names' and 'times'\n \n ### To specify long-format variable names explicitly\n \n # reshape(data, direction = \"long\",\n #         varying = ___,  # list / matrix / vector (use with care)\n #         v.names = ___,  # vector of variable names in long format\n #         timevar, times, # name / values of constructed time variable\n #         idvar, ids)     # name / values of constructed id variable\n \nArguments:\ndata: a data frame\nvarying: names of sets of variables in the wide format that correspond to single variables in long format (‘time-varying’). 
This is canonically a list of vectors of variable names, but it can optionally be a matrix of names, or a single vector of names. In each case, when ‘direction = “long”’, the names can be replaced by indices which are interpreted as referring to ‘names(data)’. See ‘Details’ for more details and options.\nv.names: names of variables in the long format that correspond to multiple variables in the wide format. See ‘Details’.\ntimevar: the variable in long format that differentiates multiple records from the same group or individual. If more than one record matches, the first will be taken (with a warning).\nidvar: Names of one or more variables in long format that identify multiple records from the same group/individual. These variables may also be present in wide format.\n ids: the values to use for a newly created 'idvar' variable in\n      long format.\ntimes: the values to use for a newly created ‘timevar’ variable in long format. See ‘Details’.\ndrop: a vector of names of variables to drop before reshaping.\ndirection: character string, partially matched to either ‘“wide”’ to reshape to wide format, or ‘“long”’ to reshape to long format.\nnew.row.names: character or ‘NULL’: a non-null value will be used for the row names of the result.\n sep: A character vector of length 1, indicating a separating\n      character in the variable names in the wide format.  This is\n      used for guessing 'v.names' and 'times' arguments based on\n      the names in 'varying'.  If 'sep == \"\"', the split is just\n      before the first numeral that follows an alphabetic\n      character.  This is also used to create variable names when\n      reshaping to wide format.\nsplit: A list with three components, ‘regexp’, ‘include’, and (optionally) ‘fixed’. This allows an extended interface to variable name splitting. 
See ‘Details’.\nDetails:\n Although 'reshape()' can be used in a variety of contexts, the\n motivating application is data from longitudinal studies, and the\n arguments of this function are named and described in those terms.\n A longitudinal study is characterized by repeated measurements of\n the same variable(s), e.g., height and weight, on each unit being\n studied (e.g., individual persons) at different time points (which\n are assumed to be the same for all units). These variables are\n called time-varying variables. The study may include other\n variables that are measured only once for each unit and do not\n vary with time (e.g., gender and race); these are called\n time-constant variables.\n\n A 'wide' format representation of a longitudinal dataset will have\n one record (row) for each unit, typically with some time-constant\n variables that occupy single columns, and some time-varying\n variables that occupy multiple columns (one column for each time\n point).  A 'long' format representation of the same dataset will\n have multiple records (rows) for each individual, with the\n time-constant variables being constant across these records and\n the time-varying variables varying across the records.  The 'long'\n format dataset will have two additional variables: a 'time'\n variable identifying which time point each record comes from, and\n an 'id' variable showing which records refer to the same unit.\n\n The type of conversion (long to wide or wide to long) is\n determined by the 'direction' argument, which is mandatory unless\n the 'data' argument is the result of a previous call to 'reshape'.\n In that case, the operation can be reversed simply using\n 'reshape(data)' (the other arguments are stored as attributes on\n the data frame).\n\n Conversion from long to wide format with 'direction = \"wide\"' is\n the simpler operation, and is mainly useful in the context of\n multivariate analysis where data is often expected as a\n wide-format matrix. 
In this case, the time variable 'timevar' and\n id variable 'idvar' must be specified. All other variables are\n assumed to be time-varying, unless the time-varying variables are\n explicitly specified via the 'v.names' argument.  A warning is\n issued if time-constant variables are not actually constant.\n\n Each time-varying variable is expanded into multiple variables in\n the wide format.  The names of these expanded variables are\n generated automatically, unless they are specified as the\n 'varying' argument in the form of a list (or matrix) with one\n component (or row) for each time-varying variable. If 'varying' is\n a vector of names, it is implicitly converted into a matrix, with\n one row for each time-varying variable. Use this option with care\n if there are multiple time-varying variables, as the ordering (by\n column, the default in the 'matrix' constructor) may be\n unintuitive, whereas the explicit list or matrix form is\n unambiguous.\n\n Conversion from wide to long with 'direction = \"long\"' is the more\n common operation as most (univariate) statistical modeling\n functions expect data in the long format. In the simpler case\n where there is only one time-varying variable, the corresponding\n columns in the wide format input can be specified as the 'varying'\n argument, which can be either a vector of column names or the\n corresponding column indices. The name of the corresponding\n variable in the long format output combining these columns can be\n optionally specified as the 'v.names' argument, and the name of\n the time variables as the 'timevar' argument. The values to use as\n the time values corresponding to the different columns in the wide\n format can be specified as the 'times' argument.  If 'v.names' is\n unspecified, the function will attempt to guess 'v.names' and\n 'times' from 'varying' (an explicitly specified 'times' argument\n is unused in that case).  
The default expects variable names like\n 'x.1', 'x.2', where 'sep = \".\"' specifies to split at the dot and\n drop it from the name.  To have alphabetic followed by numeric\n times use 'sep = \"\"'.\n\n Multiple time-varying variables can be specified in two ways,\n either with 'varying' as an atomic vector as above, or as a list\n (or a matrix). The first form is useful (and mandatory) if the\n automatic variable name splitting as described above is used; this\n requires the names of all time-varying variables to be suitably\n formatted in the same manner, and 'v.names' to be unspecified. If\n 'varying' is a list (with one component for each time-varying\n variable) or a matrix (one row for each time-varying variable),\n variable name splitting is not attempted, and 'v.names' and\n 'times' will generally need to be specified, although they will\n default to, respectively, the first variable name in each set, and\n sequential times.\n\n Also, guessing is not attempted if 'v.names' is given explicitly,\n even if 'varying' is an atomic vector. In that case, the number of\n time-varying variables is taken to be the length of 'v.names', and\n 'varying' is implicitly converted into a matrix, with one row for\n each time-varying variable. As in the case of long to wide\n conversion, the matrix is filled up by column, so careful\n attention needs to be paid to the order of variable names (or\n indices) in 'varying', which is taken to be like 'x.1', 'y.1',\n 'x.2', 'y.2' (i.e., variables corresponding to the same time point\n need to be grouped together).\n\n The 'split' argument should not usually be necessary.  The\n 'split$regexp' component is passed to either 'strsplit' or\n 'regexpr', where the latter is used if 'split$include' is 'TRUE',\n in which case the splitting occurs after the first character of\n the matched string.  
In the 'strsplit' case, the separator is not\n included in the result, and it is possible to specify fixed-string\n matching using 'split$fixed'.\nValue:\n The reshaped data frame with added attributes to simplify\n reshaping back to the original form.\nSee Also:\n 'stack', 'aperm'; 'relist' for reshaping the result of 'unlist'.\n 'xtabs' and 'as.data.frame.table' for creating contingency tables\n and converting them back to data frames.\nExamples:\n summary(Indometh) # data in long format\n \n ## long to wide (direction = \"wide\") requires idvar and timevar at a minimum\n reshape(Indometh, direction = \"wide\", idvar = \"Subject\", timevar = \"time\")\n \n ## can also explicitly specify name of combined variable\n wide &lt;- reshape(Indometh, direction = \"wide\", idvar = \"Subject\",\n                 timevar = \"time\", v.names = \"conc\", sep= \"_\")\n wide\n \n ## reverse transformation\n reshape(wide, direction = \"long\")\n reshape(wide, idvar = \"Subject\", varying = list(2:12),\n         v.names = \"conc\", direction = \"long\")\n \n ## times need not be numeric\n df &lt;- data.frame(id = rep(1:4, rep(2,4)),\n                  visit = I(rep(c(\"Before\",\"After\"), 4)),\n                  x = rnorm(4), y = runif(4))\n df\n reshape(df, timevar = \"visit\", idvar = \"id\", direction = \"wide\")\n ## warns that y is really varying\n reshape(df, timevar = \"visit\", idvar = \"id\", direction = \"wide\", v.names = \"x\")\n \n \n ##  unbalanced 'long' data leads to NA fill in 'wide' form\n df2 &lt;- df[1:7, ]\n df2\n reshape(df2, timevar = \"visit\", idvar = \"id\", direction = \"wide\")\n \n ## Alternative regular expressions for guessing names\n df3 &lt;- data.frame(id = 1:4, age = c(40,50,60,50), dose1 = c(1,2,1,2),\n                   dose2 = c(2,1,2,1), dose4 = c(3,3,3,3))\n reshape(df3, direction = \"long\", varying = 3:5, sep = \"\")\n \n \n ## an example that isn't longitudinal data\n state.x77 &lt;- as.data.frame(state.x77)\n long &lt;- 
reshape(state.x77, idvar = \"state\", ids = row.names(state.x77),\n                 times = names(state.x77), timevar = \"Characteristic\",\n                 varying = list(names(state.x77)), direction = \"long\")\n \n reshape(long, direction = \"wide\")\n \n reshape(long, direction = \"wide\", new.row.names = unique(long$state))\n \n ## multiple id variables\n df3 &lt;- data.frame(school = rep(1:3, each = 4), class = rep(9:10, 6),\n                   time = rep(c(1,1,2,2), 3), score = rnorm(12))\n wide &lt;- reshape(df3, idvar = c(\"school\", \"class\"), direction = \"wide\")\n wide\n ## transform back\n reshape(wide)"
+  },
+  {
+    "objectID": "modules/Module08-DataMergeReshape.html#long-to-wide-data",
+    "href": "modules/Module08-DataMergeReshape.html#long-to-wide-data",
+    "title": "Module 8: Data Merging and Reshaping",
+    "section": "long to wide data",
+    "text": "long to wide data\nxxzane - help"
+  },
+  {
+    "objectID": "modules/Module08-DataMergeReshape.html#wide-to-long-data",
+    "href": "modules/Module08-DataMergeReshape.html#wide-to-long-data",
+    "title": "Module 8: Data Merging and Reshaping",
+    "section": "wide to long data",
+    "text": "wide to long data\nxxzane - help"
+  },
+  {
+    "objectID": "modules/Module08-DataMergeReshape.html#summary",
+    "href": "modules/Module08-DataMergeReshape.html#summary",
+    "title": "Module 8: Data Merging and Reshaping",
+    "section": "Summary",
+    "text": "Summary\n\n…"
+  },
+  {
+    "objectID": "modules/Module08-DataMergeReshape.html#acknowledgements",
+    "href": "modules/Module08-DataMergeReshape.html#acknowledgements",
+    "title": "Module 8: Data Merging and Reshaping",
+    "section": "Acknowledgements",
+    "text": "Acknowledgements\nThese are the materials we looked through, modified, or extracted to complete this module’s lecture.\n\n“Introduction to R for Public Health Researchers” Johns Hopkins University"
+  },
+  {
+    "objectID": "modules/Module08-DataMergeReshape.html#lets-get-real",
+    "href": "modules/Module08-DataMergeReshape.html#lets-get-real",
+    "title": "Module 8: Data Merging and Reshaping",
+    "section": "Let’s get real",
+    "text": "Let’s get real\nUse the pivot_wider() and pivot_longer() functions from the tidyr package!"
+  },
+  {
+    "objectID": "modules/Module09-DataAnalysis.html#learning-objectives",
+    "href": "modules/Module09-DataAnalysis.html#learning-objectives",
+    "title": "Module 9: Data Analysis",
+    "section": "Learning Objectives",
+    "text": "Learning Objectives\nAfter module 9, you should be able to…\n\nDescriptively assess association between two variables\nCompute basic statistics\nFit a generalized linear model"
+  },
+  {
+    "objectID": "modules/Module09-DataAnalysis.html#import-data-for-this-module",
+    "href": "modules/Module09-DataAnalysis.html#import-data-for-this-module",
+    "title": "Module 9: Data Analysis",
+    "section": "Import data for this module",
+    "text": "Import data for this module\nLet’s read in our data (again) and take a quick look.\n\ndf &lt;- read.csv(file = \"data/serodata.csv\") #relative path\nhead(x=df, n=3)\n\n  observation_id IgG_concentration age gender     slum\n1           5772         0.3176895   2 Female Non slum\n2           8095         3.4368231   4 Female Non slum\n3           9784         0.3000000   4   Male Non slum"
+  },
+  {
+    "objectID": "modules/Module09-DataAnalysis.html#prep-data",
+    "href": "modules/Module09-DataAnalysis.html#prep-data",
+    "title": "Module 9: Data Analysis",
+    "section": "Prep data",
+    "text": "Prep data\nCreate age_group, a three-level factor variable\n\ndf$age_group &lt;- ifelse(df$age &lt;= 5, \"young\", \n                       ifelse(df$age&lt;=10 & df$age&gt;5, \"middle\", \"old\"))\ndf$age_group &lt;- factor(df$age_group, levels=c(\"young\", \"middle\", \"old\"))\n\nCreate seropos, a binary variable that is 1 (seropositive) if the antibody concentration is at least 10 IU/mL and 0 otherwise.\n\ndf$seropos &lt;- ifelse(df$IgG_concentration&lt;10, 0, 1)"
+  },
+  {
+    "objectID": "modules/Module09-DataAnalysis.html#variable-contingency-tables",
+    "href": "modules/Module09-DataAnalysis.html#variable-contingency-tables",
+    "title": "Module 9: Data Analysis",
+    "section": "2 variable contingency tables",
+    "text": "2 variable contingency tables\nWe used table() previously to look at one variable; now we can generate frequency tables for two or more variables. To get cell percentages, the prop.table() function is useful.\n\n?prop.table\n\n\nlibrary(printr)\n\nRegistered S3 method overwritten by 'printr':\n  method                from     \n  knit_print.data.frame rmarkdown\n\n?prop.table\n\nExpress Table Entries as Fraction of Marginal Table\n\nDescription:\n\n     Returns conditional proportions given 'margins', i.e. entries of\n     'x', divided by the appropriate marginal sums.\n\nUsage:\n\n     proportions(x, margin = NULL)\n     prop.table(x, margin = NULL)\n     \nArguments:\n\n       x: table\n\n  margin: a vector giving the margins to split by.  E.g., for a matrix\n          '1' indicates rows, '2' indicates columns, 'c(1, 2)'\n          indicates rows and columns.  When 'x' has named dimnames, it\n          can be a character vector selecting dimension names.\n\nValue:\n\n     Table like 'x' expressed relative to 'margin'\n\nNote:\n\n     'prop.table' is an earlier name, retained for back-compatibility.\n\nAuthor(s):\n\n     Peter Dalgaard\n\nSee Also:\n\n     'marginSums'. 'apply', 'sweep' are a more general mechanism for\n     sweeping out marginal statistics.\n\nExamples:\n\n     m &lt;- matrix(1:4, 2)\n     m\n     proportions(m, 1)\n     \n     DF &lt;- as.data.frame(UCBAdmissions)\n     tbl &lt;- xtabs(Freq ~ Gender + Admit, DF)\n     \n     proportions(tbl, \"Gender\")"
+  },
+  {
+    "objectID": "modules/Module09-DataAnalysis.html#chi-square-test",
+    "href": "modules/Module09-DataAnalysis.html#chi-square-test",
+    "title": "Module 9: Data Analysis",
+    "section": "Chi-Square test",
+    "text": "Chi-Square test\nThe chisq.test() function from the stats package tests the independence of factor variables.\n\n?chisq.test\n\nPearson’s Chi-squared Test for Count Data\nDescription:\n 'chisq.test' performs chi-squared contingency table tests and\n goodness-of-fit tests.\nUsage:\n chisq.test(x, y = NULL, correct = TRUE,\n            p = rep(1/length(x), length(x)), rescale.p = FALSE,\n            simulate.p.value = FALSE, B = 2000)\n \nArguments:\n   x: a numeric vector or matrix. 'x' and 'y' can also both be\n      factors.\n\n   y: a numeric vector; ignored if 'x' is a matrix.  If 'x' is a\n      factor, 'y' should be a factor of the same length.\ncorrect: a logical indicating whether to apply continuity correction when computing the test statistic for 2 by 2 tables: one half is subtracted from all |O - E| differences; however, the correction will not be bigger than the differences themselves. No correction is done if ‘simulate.p.value = TRUE’.\n   p: a vector of probabilities of the same length as 'x'.  An\n      error is given if any entry of 'p' is negative.\nrescale.p: a logical scalar; if TRUE then ‘p’ is rescaled (if necessary) to sum to 1. If ‘rescale.p’ is FALSE, and ‘p’ does not sum to 1, an error is given.\nsimulate.p.value: a logical indicating whether to compute p-values by Monte Carlo simulation.\n   B: an integer specifying the number of replicates used in the\n      Monte Carlo test.\nDetails:\n If 'x' is a matrix with one row or column, or if 'x' is a vector\n and 'y' is not given, then a _goodness-of-fit test_ is performed\n ('x' is treated as a one-dimensional contingency table).  The\n entries of 'x' must be non-negative integers.  In this case, the\n hypothesis tested is whether the population probabilities equal\n those in 'p', or are all equal if 'p' is not given.\n\n If 'x' is a matrix with at least two rows and columns, it is taken\n as a two-dimensional contingency table: the entries of 'x' must be\n non-negative integers.  
Otherwise, 'x' and 'y' must be vectors or\n factors of the same length; cases with missing values are removed,\n the objects are coerced to factors, and the contingency table is\n computed from these.  Then Pearson's chi-squared test is performed\n of the null hypothesis that the joint distribution of the cell\n counts in a 2-dimensional contingency table is the product of the\n row and column marginals.\n\n If 'simulate.p.value' is 'FALSE', the p-value is computed from the\n asymptotic chi-squared distribution of the test statistic;\n continuity correction is only used in the 2-by-2 case (if\n 'correct' is 'TRUE', the default).  Otherwise the p-value is\n computed for a Monte Carlo test (Hope, 1968) with 'B' replicates.\n The default 'B = 2000' implies a minimum p-value of about 0.0005\n (1/(B+1)).\n\n In the contingency table case, simulation is done by random\n sampling from the set of all contingency tables with given\n marginals, and works only if the marginals are strictly positive.\n Continuity correction is never used, and the statistic is quoted\n without it.  Note that this is not the usual sampling situation\n assumed for the chi-squared test but rather that for Fisher's\n exact test.\n\n In the goodness-of-fit case simulation is done by random sampling\n from the discrete distribution specified by 'p', each sample being\n of size 'n = sum(x)'.  
This simulation is done in R and may be\n slow.\nValue:\n A list with class '\"htest\"' containing the following components:\nstatistic: the value the chi-squared test statistic.\nparameter: the degrees of freedom of the approximate chi-squared distribution of the test statistic, ‘NA’ if the p-value is computed by Monte Carlo simulation.\np.value: the p-value for the test.\nmethod: a character string indicating the type of test performed, and whether Monte Carlo simulation or continuity correction was used.\ndata.name: a character string giving the name(s) of the data.\nobserved: the observed counts.\nexpected: the expected counts under the null hypothesis.\nresiduals: the Pearson residuals, ‘(observed - expected) / sqrt(expected)’.\nstdres: standardized residuals, ‘(observed - expected) / sqrt(V)’, where ‘V’ is the residual cell variance (Agresti, 2007, section 2.4.5 for the case where ‘x’ is a matrix, ‘n * p * (1 - p)’ otherwise).\nSource:\n The code for Monte Carlo simulation is a C translation of the\n Fortran algorithm of Patefield (1981).\nReferences:\n Hope, A. C. A. (1968).  A simplified Monte Carlo significance test\n procedure.  _Journal of the Royal Statistical Society Series B_,\n *30*, 582-598.  doi:10.1111/j.2517-6161.1968.tb00759.x\n &lt;https://doi.org/10.1111/j.2517-6161.1968.tb00759.x&gt;.\n\n Patefield, W. M. (1981).  Algorithm AS 159: An efficient method of\n generating r x c tables with given row and column totals.\n _Applied Statistics_, *30*, 91-97.  doi:10.2307/2346669\n &lt;https://doi.org/10.2307/2346669&gt;.\n\n Agresti, A. (2007).  _An Introduction to Categorical Data\n Analysis_, 2nd ed.  New York: John Wiley & Sons.  
Page 38.\nSee Also:\n For goodness-of-fit testing, notably of continuous distributions,\n 'ks.test'.\nExamples:\n ## From Agresti(2007) p.39\n M &lt;- as.table(rbind(c(762, 327, 468), c(484, 239, 477)))\n dimnames(M) &lt;- list(gender = c(\"F\", \"M\"),\n                     party = c(\"Democrat\",\"Independent\", \"Republican\"))\n (Xsq &lt;- chisq.test(M))  # Prints test summary\n Xsq$observed   # observed counts (same as M)\n Xsq$expected   # expected counts under the null\n Xsq$residuals  # Pearson residuals\n Xsq$stdres     # standardized residuals\n \n \n ## Effect of simulating p-values\n x &lt;- matrix(c(12, 5, 7, 7), ncol = 2)\n chisq.test(x)$p.value           # 0.4233\n chisq.test(x, simulate.p.value = TRUE, B = 10000)$p.value\n                                 # around 0.29!\n \n ## Testing for population probabilities\n ## Case A. Tabulated data\n x &lt;- c(A = 20, B = 15, C = 25)\n chisq.test(x)\n chisq.test(as.table(x))             # the same\n x &lt;- c(89,37,30,28,2)\n p &lt;- c(40,20,20,15,5)\n try(\n chisq.test(x, p = p)                # gives an error\n )\n chisq.test(x, p = p, rescale.p = TRUE)\n                                 # works\n p &lt;- c(0.40,0.20,0.20,0.19,0.01)\n                                 # Expected count in category 5\n                                 # is 1.86 &lt; 5 ==&gt; chi square approx.\n chisq.test(x, p = p)            #               maybe doubtful, but is ok!\n chisq.test(x, p = p, simulate.p.value = TRUE)\n \n ## Case B. Raw data\n x &lt;- trunc(5 * runif(100))\n chisq.test(table(x))            # NOT 'chisq.test(x)'!"
+  },
+  {
+    "objectID": "modules/Module09-DataAnalysis.html#chi-square-test-1",
+    "href": "modules/Module09-DataAnalysis.html#chi-square-test-1",
+    "title": "Module 9: Data Analysis",
+    "section": "Chi-Square test",
+    "text": "Chi-Square test\n\nchisq.test(freq)\n\n\n    Pearson's Chi-squared test\n\ndata:  freq\nX-squared = 175.85, df = 2, p-value &lt; 2.2e-16\n\n\nWe reject the null hypothesis that the proportions of seropositive individuals in the young, middle, and old age groups are the same."
+  },
+  {
+    "objectID": "modules/Module09-DataAnalysis.html#correlation",
+    "href": "modules/Module09-DataAnalysis.html#correlation",
+    "title": "Module 9: Data Analysis",
+    "section": "Correlation",
+    "text": "Correlation\nFirst, we compute the correlation by providing two vectors.\nLike other functions, cor() returns NA if there are NAs in the data. If you specify use = \"complete.obs\", it computes the correlation from the complete (non-missing) observations only.\n\ncor(df$age, df$IgG_concentration, method=\"pearson\")\n\n[1] NA\n\ncor(df$age, df$IgG_concentration, method=\"pearson\", use = \"complete.obs\") #IF have missing data\n\n[1] 0.2604783\n\n\nThere is a small positive correlation between IgG concentration and age."
+  },
+  {
+    "objectID": "modules/Module09-DataAnalysis.html#t-test",
+    "href": "modules/Module09-DataAnalysis.html#t-test",
+    "title": "Module 9: Data Analysis",
+    "section": "T-test",
+    "text": "T-test\nThe most commonly used t-tests are:\n\none-sample t-test – used to test whether the mean of a variable in one group equals a hypothesized value (the null hypothesis mean)\ntwo-sample t-test – used to test the difference in means of a variable between two groups (null hypothesis - the group means are the same)"
+  },
+  {
+    "objectID": "modules/Module09-DataAnalysis.html#t-test-1",
+    "href": "modules/Module09-DataAnalysis.html#t-test-1",
+    "title": "Module 9: Data Analysis",
+    "section": "T-test",
+    "text": "T-test\nWe can use the t.test() function from the stats package.\n\n?t.test\n\nStudent’s t-Test\nDescription:\n Performs one and two sample t-tests on vectors of data.\nUsage:\n t.test(x, ...)\n \n ## Default S3 method:\n t.test(x, y = NULL,\n        alternative = c(\"two.sided\", \"less\", \"greater\"),\n        mu = 0, paired = FALSE, var.equal = FALSE,\n        conf.level = 0.95, ...)\n \n ## S3 method for class 'formula'\n t.test(formula, data, subset, na.action, ...)\n \nArguments:\n   x: a (non-empty) numeric vector of data values.\n\n   y: an optional (non-empty) numeric vector of data values.\nalternative: a character string specifying the alternative hypothesis, must be one of ‘“two.sided”’ (default), ‘“greater”’ or ‘“less”’. You can specify just the initial letter.\n  mu: a number indicating the true value of the mean (or difference\n      in means if you are performing a two sample test).\npaired: a logical indicating whether you want a paired t-test.\nvar.equal: a logical variable indicating whether to treat the two variances as being equal. If ‘TRUE’ then the pooled variance is used to estimate the variance otherwise the Welch (or Satterthwaite) approximation to the degrees of freedom is used.\nconf.level: confidence level of the interval.\nformula: a formula of the form ‘lhs ~ rhs’ where ‘lhs’ is a numeric variable giving the data values and ‘rhs’ either ‘1’ for a one-sample or paired test or a factor with two levels giving the corresponding groups. If ‘lhs’ is of class ‘“Pair”’ and ‘rhs’ is ‘1’, a paired test is done.\ndata: an optional matrix or data frame (or similar: see\n      'model.frame') containing the variables in the formula\n      'formula'.  By default the variables are taken from\n      'environment(formula)'.\nsubset: an optional vector specifying a subset of observations to be used.\nna.action: a function which indicates what should happen when the data contain ‘NA’s. 
Defaults to ’getOption(“na.action”)’.\n ...: further arguments to be passed to or from methods.\nDetails:\n 'alternative = \"greater\"' is the alternative that 'x' has a larger\n mean than 'y'. For the one-sample case: that the mean is positive.\n\n If 'paired' is 'TRUE' then both 'x' and 'y' must be specified and\n they must be the same length.  Missing values are silently removed\n (in pairs if 'paired' is 'TRUE').  If 'var.equal' is 'TRUE' then\n the pooled estimate of the variance is used.  By default, if\n 'var.equal' is 'FALSE' then the variance is estimated separately\n for both groups and the Welch modification to the degrees of\n freedom is used.\n\n If the input data are effectively constant (compared to the larger\n of the two means) an error is generated.\nValue:\n A list with class '\"htest\"' containing the following components:\nstatistic: the value of the t-statistic.\nparameter: the degrees of freedom for the t-statistic.\np.value: the p-value for the test.\nconf.int: a confidence interval for the mean appropriate to the specified alternative hypothesis.\nestimate: the estimated mean or difference in means depending on whether it was a one-sample test or a two-sample test.\nnull.value: the specified hypothesized value of the mean or mean difference depending on whether it was a one-sample test or a two-sample test.\nstderr: the standard error of the mean (difference), used as denominator in the t-statistic formula.\nalternative: a character string describing the alternative hypothesis.\nmethod: a character string indicating what type of t-test was performed.\ndata.name: a character string giving the name(s) of the data.\nSee Also:\n 'prop.test'\nExamples:\n require(graphics)\n \n t.test(1:10, y = c(7:20))      # P = .00001855\n t.test(1:10, y = c(7:20, 200)) # P = .1245    -- NOT significant anymore\n \n ## Classical example: Student's sleep data\n plot(extra ~ group, data = sleep)\n ## Traditional interface\n with(sleep, t.test(extra[group == 1], 
extra[group == 2]))\n \n ## Formula interface\n t.test(extra ~ group, data = sleep)\n \n ## Formula interface to one-sample test\n t.test(extra ~ 1, data = sleep)\n \n ## Formula interface to paired test\n ## The sleep data are actually paired, so could have been in wide format:\n sleep2 &lt;- reshape(sleep, direction = \"wide\", \n                   idvar = \"ID\", timevar = \"group\")\n t.test(Pair(extra.1, extra.2) ~ 1, data = sleep2)"
+  },
+  {
+    "objectID": "modules/Module09-DataAnalysis.html#running-two-sample-t-test",
+    "href": "modules/Module09-DataAnalysis.html#running-two-sample-t-test",
+    "title": "Module 9: Data Analysis",
+    "section": "Running two-sample t-test",
+    "text": "Running two-sample t-test\nThe base R t.test() function from the stats package tests the difference in means of a variable between two groups. By default, it:\n\ntests whether the difference in means of a variable is equal to 0 (default mu=0)\nuses the “two sided” alternative (alternative = \"two.sided\")\nreturns results assuming a confidence level of 0.95 (conf.level = 0.95)\nassumes the data are not paired (paired = FALSE)\nassumes the true variances in the two groups are not equal (var.equal = FALSE)"
+  },
+  {
+    "objectID": "modules/Module09-DataAnalysis.html#running-two-sample-t-test-1",
+    "href": "modules/Module09-DataAnalysis.html#running-two-sample-t-test-1",
+    "title": "Module 9: Data Analysis",
+    "section": "Running two-sample t-test",
+    "text": "Running two-sample t-test\n\nIgG_young &lt;- df$IgG_concentration[df$age_group==\"young\"]\nIgG_old &lt;- df$IgG_concentration[df$age_group==\"old\"]\n\nt.test(IgG_young, IgG_old)\n\n\n    Welch Two Sample t-test\n\ndata:  IgG_young and IgG_old\nt = -6.1969, df = 259.54, p-value = 2.25e-09\nalternative hypothesis: true difference in means is not equal to 0\n95 percent confidence interval:\n -111.09281  -57.51515\nsample estimates:\nmean of x mean of y \n 45.05056 129.35454 \n\n\nThe mean IgG concentrations of the young and old groups are 45.05 and 129.35 IU/mL, respectively. We reject the null hypothesis that the difference in mean IgG concentration between the young and old groups is 0 IU/mL."
+  },
+  {
+    "objectID": "modules/Module09-DataAnalysis.html#linear-regression-fit-in-r",
+    "href": "modules/Module09-DataAnalysis.html#linear-regression-fit-in-r",
+    "title": "Module 9: Data Analysis",
+    "section": "Linear regression fit in R",
+    "text": "Linear regression fit in R\nTo fit regression models in R, we use the function glm() (Generalized Linear Model).\n\n?glm\n\nFitting Generalized Linear Models\nDescription:\n 'glm' is used to fit generalized linear models, specified by\n giving a symbolic description of the linear predictor and a\n description of the error distribution.\nUsage:\n glm(formula, family = gaussian, data, weights, subset,\n     na.action, start = NULL, etastart, mustart, offset,\n     control = list(...), model = TRUE, method = \"glm.fit\",\n     x = FALSE, y = TRUE, singular.ok = TRUE, contrasts = NULL, ...)\n \n glm.fit(x, y, weights = rep.int(1, nobs),\n         start = NULL, etastart = NULL, mustart = NULL,\n         offset = rep.int(0, nobs), family = gaussian(),\n         control = list(), intercept = TRUE, singular.ok = TRUE)\n \n ## S3 method for class 'glm'\n weights(object, type = c(\"prior\", \"working\"), ...)\n \nArguments:\nformula: an object of class ‘“formula”’ (or one that can be coerced to that class): a symbolic description of the model to be fitted. The details of model specification are given under ‘Details’.\nfamily: a description of the error distribution and link function to be used in the model. For ‘glm’ this can be a character string naming a family function, a family function or the result of a call to a family function. For ‘glm.fit’ only the third option is supported. (See ‘family’ for details of family functions.)\ndata: an optional data frame, list or environment (or object\n      coercible by 'as.data.frame' to a data frame) containing the\n      variables in the model.  If not found in 'data', the\n      variables are taken from 'environment(formula)', typically\n      the environment from which 'glm' is called.\nweights: an optional vector of ‘prior weights’ to be used in the fitting process. 
Should be ‘NULL’ or a numeric vector.\nsubset: an optional vector specifying a subset of observations to be used in the fitting process.\nna.action: a function which indicates what should happen when the data contain ‘NA’s. The default is set by the ’na.action’ setting of ‘options’, and is ‘na.fail’ if that is unset. The ‘factory-fresh’ default is ‘na.omit’. Another possible value is ‘NULL’, no action. Value ‘na.exclude’ can be useful.\nstart: starting values for the parameters in the linear predictor.\netastart: starting values for the linear predictor.\nmustart: starting values for the vector of means.\noffset: this can be used to specify an a priori known component to be included in the linear predictor during fitting. This should be ‘NULL’ or a numeric vector of length equal to the number of cases. One or more ‘offset’ terms can be included in the formula instead or as well, and if more than one is specified their sum is used. See ‘model.offset’.\ncontrol: a list of parameters for controlling the fitting process. For ‘glm.fit’ this is passed to ‘glm.control’.\nmodel: a logical value indicating whether model frame should be included as a component of the returned value.\nmethod: the method to be used in fitting the model. The default method ‘“glm.fit”’ uses iteratively reweighted least squares (IWLS): the alternative ‘“model.frame”’ returns the model frame and does no fitting.\n      User-supplied fitting functions can be supplied either as a\n      function or a character string naming a function, with a\n      function which takes the same arguments as 'glm.fit'.  
If\n      specified as a character string it is looked up from within\n      the 'stats' namespace.\n\nx, y: For 'glm': logical values indicating whether the response\n      vector and model matrix used in the fitting process should be\n      returned as components of the returned value.\n\n      For 'glm.fit': 'x' is a design matrix of dimension 'n * p',\n      and 'y' is a vector of observations of length 'n'.\nsingular.ok: logical; if ‘FALSE’ a singular fit is an error.\ncontrasts: an optional list. See the ‘contrasts.arg’ of ‘model.matrix.default’.\nintercept: logical. Should an intercept be included in the null model?\nobject: an object inheriting from class ‘“glm”’.\ntype: character, partial matching allowed.  Type of weights to\n      extract from the fitted model object.  Can be abbreviated.\n\n ...: For 'glm': arguments to be used to form the default 'control'\n      argument if it is not supplied directly.\n\n      For 'weights': further arguments passed to or from other\n      methods.\nDetails:\n A typical predictor has the form 'response ~ terms' where\n 'response' is the (numeric) response vector and 'terms' is a\n series of terms which specifies a linear predictor for 'response'.\n For 'binomial' and 'quasibinomial' families the response can also\n be specified as a 'factor' (when the first level denotes failure\n and all others success) or as a two-column matrix with the columns\n giving the numbers of successes and failures.  A terms\n specification of the form 'first + second' indicates all the terms\n in 'first' together with all the terms in 'second' with any\n duplicates removed.\n\n A specification of the form 'first:second' indicates the set of\n terms obtained by taking the interactions of all terms in 'first'\n with all terms in 'second'.  The specification 'first*second'\n indicates the _cross_ of 'first' and 'second'.  
This is the same\n as 'first + second + first:second'.\n\n The terms in the formula will be re-ordered so that main effects\n come first, followed by the interactions, all second-order, all\n third-order and so on: to avoid this pass a 'terms' object as the\n formula.\n\n Non-'NULL' 'weights' can be used to indicate that different\n observations have different dispersions (with the values in\n 'weights' being inversely proportional to the dispersions); or\n equivalently, when the elements of 'weights' are positive integers\n w_i, that each response y_i is the mean of w_i unit-weight\n observations.  For a binomial GLM prior weights are used to give\n the number of trials when the response is the proportion of\n successes: they would rarely be used for a Poisson GLM.\n\n 'glm.fit' is the workhorse function: it is not normally called\n directly but can be more efficient where the response vector,\n design matrix and family have already been calculated.\n\n If more than one of 'etastart', 'start' and 'mustart' is\n specified, the first in the list will be used.  It is often\n advisable to supply starting values for a 'quasi' family, and also\n for families with unusual links such as 'gaussian(\"log\")'.\n\n All of 'weights', 'subset', 'offset', 'etastart' and 'mustart' are\n evaluated in the same way as variables in 'formula', that is first\n in 'data' and then in the environment of 'formula'.\n\n For the background to warning messages about 'fitted probabilities\n numerically 0 or 1 occurred' for binomial GLMs, see Venables &\n Ripley (2002, pp. 197-8).\nValue:\n 'glm' returns an object of class inheriting from '\"glm\"' which\n inherits from the class '\"lm\"'. See later in this section.  
If a\n non-standard 'method' is used, the object will also inherit from\n the class (if any) returned by that function.\n\n The function 'summary' (i.e., 'summary.glm') can be used to obtain\n or print a summary of the results and the function 'anova' (i.e.,\n 'anova.glm') to produce an analysis of variance table.\n\n The generic accessor functions 'coefficients', 'effects',\n 'fitted.values' and 'residuals' can be used to extract various\n useful features of the value returned by 'glm'.\n\n 'weights' extracts a vector of weights, one for each case in the\n fit (after subsetting and 'na.action').\n\n An object of class '\"glm\"' is a list containing at least the\n following components:\ncoefficients: a named vector of coefficients\nresiduals: the working residuals, that is the residuals in the final iteration of the IWLS fit. Since cases with zero weights are omitted, their working residuals are ‘NA’.\nfitted.values: the fitted mean values, obtained by transforming the linear predictors by the inverse of the link function.\nrank: the numeric rank of the fitted linear model.\nfamily: the ‘family’ object used.\nlinear.predictors: the linear fit on link scale.\ndeviance: up to a constant, minus twice the maximized log-likelihood. Where sensible, the constant is chosen so that a saturated model has deviance zero.\n aic: A version of Akaike's _An Information Criterion_, minus twice\n      the maximized log-likelihood plus twice the number of\n      parameters, computed via the 'aic' component of the family.\n      For binomial and Poison families the dispersion is fixed at\n      one and the number of parameters is the number of\n      coefficients.  For gaussian, Gamma and inverse gaussian\n      families the dispersion is estimated from the residual\n      deviance, and the number of parameters is the number of\n      coefficients plus one.  
For a gaussian family the MLE of the\n      dispersion is used so this is a valid value of AIC, but for\n      Gamma and inverse gaussian families it is not.  For families\n      fitted by quasi-likelihood the value is 'NA'.\nnull.deviance: The deviance for the null model, comparable with ‘deviance’. The null model will include the offset, and an intercept if there is one in the model. Note that this will be incorrect if the link function depends on the data other than through the fitted mean: specify a zero offset to force a correct calculation.\niter: the number of iterations of IWLS used.\nweights: the working weights, that is the weights in the final iteration of the IWLS fit.\nprior.weights: the weights initially supplied, a vector of ’1’s if none were.\ndf.residual: the residual degrees of freedom.\ndf.null: the residual degrees of freedom for the null model.\n   y: if requested (the default) the 'y' vector used. (It is a\n      vector even for a binomial model.)\n\n   x: if requested, the model matrix.\nmodel: if requested (the default), the model frame.\nconverged: logical. Was the IWLS algorithm judged to have converged?\nboundary: logical. 
Is the fitted value on the boundary of the attainable values?\ncall: the matched call.\nformula: the formula supplied.\nterms: the ‘terms’ object used.\ndata: the 'data argument'.\noffset: the offset vector used.\ncontrol: the value of the ‘control’ argument used.\nmethod: the name of the fitter function used (when provided as a ‘character’ string to ‘glm()’) or the fitter ‘function’ (when provided as that).\ncontrasts: (where relevant) the contrasts used.\nxlevels: (where relevant) a record of the levels of the factors used in fitting.\nna.action: (where relevant) information returned by ‘model.frame’ on the special handling of ’NA’s.\n In addition, non-empty fits will have components 'qr', 'R' and\n 'effects' relating to the final weighted linear fit.\n\n Objects of class '\"glm\"' are normally of class 'c(\"glm\", \"lm\")',\n that is inherit from class '\"lm\"', and well-designed methods for\n class '\"lm\"' will be applied to the weighted linear model at the\n final iteration of IWLS.  However, care is needed, as extractor\n functions for class '\"glm\"' such as 'residuals' and 'weights' do\n *not* just pick out the component of the fit with the same name.\n\n If a 'binomial' 'glm' model was specified by giving a two-column\n response, the weights returned by 'prior.weights' are the total\n numbers of cases (factored by the supplied case weights) and the\n component 'y' of the result is the proportion of successes.\nFitting functions:\n The argument 'method' serves two purposes.  One is to allow the\n model frame to be recreated with no fitting.  The other is to\n allow the default fitting function 'glm.fit' to be replaced by a\n function which takes the same arguments and uses a different\n fitting algorithm.  
If 'glm.fit' is supplied as a character string\n it is used to search for a function of that name, starting in the\n 'stats' namespace.\n\n The class of the object return by the fitter (if any) will be\n prepended to the class returned by 'glm'.\nAuthor(s):\n The original R implementation of 'glm' was written by Simon Davies\n working for Ross Ihaka at the University of Auckland, but has\n since been extensively re-written by members of the R Core team.\n\n The design was inspired by the S function of the same name\n described in Hastie & Pregibon (1992).\nReferences:\n Dobson, A. J. (1990) _An Introduction to Generalized Linear\n Models._ London: Chapman and Hall.\n\n Hastie, T. J. and Pregibon, D. (1992) _Generalized linear models._\n Chapter 6 of _Statistical Models in S_ eds J. M. Chambers and T.\n J. Hastie, Wadsworth & Brooks/Cole.\n\n McCullagh P. and Nelder, J. A. (1989) _Generalized Linear Models._\n London: Chapman and Hall.\n\n Venables, W. N. and Ripley, B. D. (2002) _Modern Applied\n Statistics with S._ New York: Springer.\nSee Also:\n 'anova.glm', 'summary.glm', etc. 
for 'glm' methods, and the\n generic functions 'anova', 'summary', 'effects', 'fitted.values',\n and 'residuals'.\n\n 'lm' for non-generalized _linear_ models (which SAS calls GLMs,\n for 'general' linear models).\n\n 'loglin' and 'loglm' (package 'MASS') for fitting log-linear\n models (which binomial and Poisson GLMs are) to contingency\n tables.\n\n 'bigglm' in package 'biglm' for an alternative way to fit GLMs to\n large datasets (especially those with many cases).\n\n 'esoph', 'infert' and 'predict.glm' have examples of fitting\n binomial glms.\nExamples:\n ## Dobson (1990) Page 93: Randomized Controlled Trial :\n counts &lt;- c(18,17,15,20,10,20,25,13,12)\n outcome &lt;- gl(3,1,9)\n treatment &lt;- gl(3,3)\n data.frame(treatment, outcome, counts) # showing data\n glm.D93 &lt;- glm(counts ~ outcome + treatment, family = poisson())\n anova(glm.D93)\n summary(glm.D93)\n ## Computing AIC [in many ways]:\n (A0 &lt;- AIC(glm.D93))\n (ll &lt;- logLik(glm.D93))\n A1 &lt;- -2*c(ll) + 2*attr(ll, \"df\")\n A2 &lt;- glm.D93$family$aic(counts, mu=fitted(glm.D93), wt=1) +\n         2 * length(coef(glm.D93))\n stopifnot(exprs = {\n   all.equal(A0, A1)\n   all.equal(A1, A2)\n   all.equal(A1, glm.D93$aic)\n })\n \n \n ## an example with offsets from Venables & Ripley (2002, p.189)\n utils::data(anorexia, package = \"MASS\")\n \n anorex.1 &lt;- glm(Postwt ~ Prewt + Treat + offset(Prewt),\n                 family = gaussian, data = anorexia)\n summary(anorex.1)\n \n \n # A Gamma example, from McCullagh & Nelder (1989, pp. 
300-2)\n clotting &lt;- data.frame(\n     u = c(5,10,15,20,30,40,60,80,100),\n     lot1 = c(118,58,42,35,27,25,21,19,18),\n     lot2 = c(69,35,26,21,18,16,13,12,12))\n summary(glm(lot1 ~ log(u), data = clotting, family = Gamma))\n summary(glm(lot2 ~ log(u), data = clotting, family = Gamma))\n ## Aliased (\"S\"ingular) -&gt; 1 NA coefficient\n (fS &lt;- glm(lot2 ~ log(u) + log(u^2), data = clotting, family = Gamma))\n tools::assertError(update(fS, singular.ok=FALSE), verbose=interactive())\n ## -&gt; .. \"singular fit encountered\"\n \n ## Not run:\n \n ## for an example of the use of a terms object as a formula\n demo(glm.vr)\n ## End(Not run)"
+  },
+  {
+    "objectID": "modules/Module09-DataAnalysis.html#linear-regression-fit-in-r-1",
+    "href": "modules/Module09-DataAnalysis.html#linear-regression-fit-in-r-1",
+    "title": "Module 9: Data Analysis",
+    "section": "Linear regression fit in R",
+    "text": "Linear regression fit in R\nWe tend to focus on three arguments:\n\nformula – model formula written using names of columns in our data\ndata – our data frame\nfamily – error distribution and link function\n\n\nfit1 &lt;- glm(IgG_concentration~age+gender+slum, data=df, family=gaussian())\nfit2 &lt;- glm(seropos~age_group+gender+slum, data=df, family = binomial(link = \"logit\"))"
+  },
+  {
+    "objectID": "modules/Module09-DataAnalysis.html#summary.glm",
+    "href": "modules/Module09-DataAnalysis.html#summary.glm",
+    "title": "Module 9: Data Analysis",
+    "section": "summary.glm()",
+    "text": "summary.glm()\nThe summary() function when applied to a fit object based on a glm is technically the summary.glm() function and produces details of the model fit. Note on object oriented code.\n\nSummarizing Generalized Linear Model Fits\nDescription:\n These functions are all 'methods' for class 'glm' or 'summary.glm'\n objects.\nUsage:\n ## S3 method for class 'glm'\n summary(object, dispersion = NULL, correlation = FALSE,\n         symbolic.cor = FALSE, ...)\n \n ## S3 method for class 'summary.glm'\n print(x, digits = max(3, getOption(\"digits\") - 3),\n       symbolic.cor = x$symbolic.cor,\n       signif.stars = getOption(\"show.signif.stars\"),\n       show.residuals = FALSE, ...)\n \nArguments:\nobject: an object of class ‘“glm”’, usually, a result of a call to ‘glm’.\n   x: an object of class '\"summary.glm\"', usually, a result of a\n      call to 'summary.glm'.\ndispersion: the dispersion parameter for the family used. Either a single numerical value or ‘NULL’ (the default), when it is inferred from ‘object’ (see ‘Details’).\ncorrelation: logical; if ‘TRUE’, the correlation matrix of the estimated parameters is returned and printed.\ndigits: the number of significant digits to use when printing.\nsymbolic.cor: logical. If ‘TRUE’, print the correlations in a symbolic form (see ‘symnum’) rather than as numbers.\nsignif.stars: logical. If ‘TRUE’, ‘significance stars’ are printed for each coefficient.\nshow.residuals: logical. If ‘TRUE’ then a summary of the deviance residuals is printed at the head of the output.\n ...: further arguments passed to or from other methods.\nDetails:\n 'print.summary.glm' tries to be smart about formatting the\n coefficients, standard errors, etc. and additionally gives\n 'significance stars' if 'signif.stars' is 'TRUE'.  The\n 'coefficients' component of the result gives the estimated\n coefficients and their estimated standard errors, together with\n their ratio.  
This third column is labelled 't ratio' if the\n dispersion is estimated, and 'z ratio' if the dispersion is known\n (or fixed by the family).  A fourth column gives the two-tailed\n p-value corresponding to the t or z ratio based on a Student t or\n Normal reference distribution.  (It is possible that the\n dispersion is not known and there are no residual degrees of\n freedom from which to estimate it.  In that case the estimate is\n 'NaN'.)\n\n Aliased coefficients are omitted in the returned object but\n restored by the 'print' method.\n\n Correlations are printed to two decimal places (or symbolically):\n to see the actual correlations print 'summary(object)$correlation'\n directly.\n\n The dispersion of a GLM is not used in the fitting process, but it\n is needed to find standard errors.  If 'dispersion' is not\n supplied or 'NULL', the dispersion is taken as '1' for the\n 'binomial' and 'Poisson' families, and otherwise estimated by the\n residual Chisquared statistic (calculated from cases with non-zero\n weights) divided by the residual degrees of freedom.\n\n 'summary' can be used with Gaussian 'glm' fits to handle the case\n of a linear regression with known error variance, something not\n handled by 'summary.lm'.\nValue:\n 'summary.glm' returns an object of class '\"summary.glm\"', a list\n with components\n\ncall: the component from 'object'.\nfamily: the component from ‘object’.\ndeviance: the component from ‘object’.\ncontrasts: the component from ‘object’.\ndf.residual: the component from ‘object’.\nnull.deviance: the component from ‘object’.\ndf.null: the component from ‘object’.\ndeviance.resid: the deviance residuals: see ‘residuals.glm’.\ncoefficients: the matrix of coefficients, standard errors, z-values and p-values. 
Aliased coefficients are omitted.\naliased: named logical vector showing if the original coefficients are aliased.\ndispersion: either the supplied argument or the inferred/estimated dispersion if the former is ‘NULL’.\n  df: a 3-vector of the rank of the model and the number of\n      residual degrees of freedom, plus number of coefficients\n      (including aliased ones).\ncov.unscaled: the unscaled (‘dispersion = 1’) estimated covariance matrix of the estimated coefficients.\ncov.scaled: ditto, scaled by ‘dispersion’.\ncorrelation: (only if ‘correlation’ is true.) The estimated correlations of the estimated coefficients.\nsymbolic.cor: (only if ‘correlation’ is true.) The value of the argument ‘symbolic.cor’.\nSee Also:\n 'glm', 'summary'.\nExamples:\n ## For examples see example(glm)"
+  },
+  {
+    "objectID": "modules/Module09-DataAnalysis.html#linear-regression-fit-in-r-2",
+    "href": "modules/Module09-DataAnalysis.html#linear-regression-fit-in-r-2",
+    "title": "Module 9: Data Analysis",
+    "section": "Linear regression fit in R",
+    "text": "Linear regression fit in R\nLet’s look at the output…\n\nsummary(fit1)\n\n\nCall:\nglm(formula = IgG_concentration ~ age + gender + slum, family = gaussian(), \n    data = df)\n\nCoefficients:\n             Estimate Std. Error t value Pr(&gt;|t|)    \n(Intercept)    46.132     16.774   2.750  0.00613 ** \nage             9.324      1.388   6.718 4.15e-11 ***\ngenderMale     -9.655     11.543  -0.836  0.40321    \nslumNon slum  -20.353     14.299  -1.423  0.15513    \nslumSlum      -29.705     25.009  -1.188  0.23536    \n---\nSignif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1\n\n(Dispersion parameter for gaussian family taken to be 20918.39)\n\n    Null deviance: 14141483  on 631  degrees of freedom\nResidual deviance: 13115831  on 627  degrees of freedom\n  (19 observations deleted due to missingness)\nAIC: 8087.9\n\nNumber of Fisher Scoring iterations: 2\n\nsummary(fit2)\n\n\nCall:\nglm(formula = seropos ~ age_group + gender + slum, family = binomial(link = \"logit\"), \n    data = df)\n\nCoefficients:\n                Estimate Std. Error z value Pr(&gt;|z|)    \n(Intercept)      -1.3220     0.2516  -5.254 1.49e-07 ***\nage_groupmiddle   1.9020     0.2133   8.916  &lt; 2e-16 ***\nage_groupold      2.8443     0.2522  11.278  &lt; 2e-16 ***\ngenderMale       -0.1725     0.1895  -0.910    0.363    \nslumNon slum     -0.1099     0.2329  -0.472    0.637    \nslumSlum         -0.1073     0.4118  -0.261    0.794    \n---\nSignif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1\n\n(Dispersion parameter for binomial family taken to be 1)\n\n    Null deviance: 866.98  on 631  degrees of freedom\nResidual deviance: 679.10  on 626  degrees of freedom\n  (19 observations deleted due to missingness)\nAIC: 691.1\n\nNumber of Fisher Scoring iterations: 4"
+  },
+  {
+    "objectID": "modules/Module09-DataAnalysis.html#summary",
+    "href": "modules/Module09-DataAnalysis.html#summary",
+    "title": "Module 9: Data Analysis",
+    "section": "Summary",
+    "text": "Summary\n\nUse cor() or cor.test() to calculate the correlation between two numeric vectors.\nt.test() tests a mean against a null value, or tests the difference in means between two groups.\n  ... xxamy more"
+  },
+  {
+    "objectID": "modules/Module09-DataAnalysis.html#acknowledgements",
+    "href": "modules/Module09-DataAnalysis.html#acknowledgements",
+    "title": "Module 9: Data Analysis",
+    "section": "Acknowledgements",
+    "text": "Acknowledgements\nThese are the materials we looked through, modified, or extracted to complete this module’s lecture.\n\n“Introduction to R for Public Health Researchers” Johns Hopkins University"
+  },
+  {
+    "objectID": "modules/Module09-DataAnalysis.html#variable-contingency-tables-1",
+    "href": "modules/Module09-DataAnalysis.html#variable-contingency-tables-1",
+    "title": "Module 9: Data Analysis",
+    "section": "2 variable contingency tables",
+    "text": "2 variable contingency tables\nLet’s practice\n\nfreq &lt;- table(df$age_group, df$seropos)\nfreq\n\n\n\n\n/\n0\n1\n\n\n\n\nyoung\n254\n57\n\n\nmiddle\n70\n105\n\n\nold\n30\n116\n\n\n\n\n\nNow, let’s move to percentages\n\nprop.cell.percentages &lt;- prop.table(freq)\nprop.cell.percentages\n\n\n\n\n/\n0\n1\n\n\n\n\nyoung\n0.4018987\n0.0901899\n\n\nmiddle\n0.1107595\n0.1661392\n\n\nold\n0.0474684\n0.1835443\n\n\n\n\nprop.column.percentages &lt;- prop.table(freq, margin=2)\nprop.column.percentages\n\n\n\n\n/\n0\n1\n\n\n\n\nyoung\n0.7175141\n0.2050360\n\n\nmiddle\n0.1977401\n0.3776978\n\n\nold\n0.0847458\n0.4172662"
+  },
+  {
+    "objectID": "modules/Module09-DataAnalysis.html#correlation-confidence-interval",
+    "href": "modules/Module09-DataAnalysis.html#correlation-confidence-interval",
+    "title": "Module 9: Data Analysis",
+    "section": "Correlation confidence interval",
+    "text": "Correlation confidence interval\nThe function cor.test() also gives you the confidence interval of the correlation statistic. Note, it uses complete observations by default.\n\ncor.test(df$age, df$IgG_concentration, method=\"pearson\")\n\n\n    Pearson's product-moment correlation\n\ndata:  df$age and df$IgG_concentration\nt = 6.7717, df = 630, p-value = 2.921e-11\nalternative hypothesis: true correlation is not equal to 0\n95 percent confidence interval:\n 0.1862722 0.3317295\nsample estimates:\n      cor \n0.2604783"
+  },
+  {
+    "objectID": "modules/Module10-DataVisualization.html#parameters-1",
+    "href": "modules/Module10-DataVisualization.html#parameters-1",
+    "title": "Module 10: Data Visualization",
+    "section": "1. Parameters",
+    "text": "1. Parameters"
+  },
+  {
+    "objectID": "modules/Module10-DataVisualization.html#add-legend-to-the-plot-1",
+    "href": "modules/Module10-DataVisualization.html#add-legend-to-the-plot-1",
+    "title": "Module 10: Data Visualization",
+    "section": "Add legend to the plot",
+    "text": "Add legend to the plot"
+  },
+  {
+    "objectID": "modules/Module10-DataVisualization.html#barplot-example-3",
+    "href": "modules/Module10-DataVisualization.html#barplot-example-3",
+    "title": "Module 10: Data Visualization",
+    "section": "barplot() example",
+    "text": "barplot() example\nNow, let’s look at seropositivity by two individual-level characteristics in the same plot.\n\npar(mfrow = c(1,2))\nbarplot(prop.column.percentages, col=c(\"darkblue\",\"red\"), ylim=c(0,1.35), main=\"Seropositivity by Age Group\")\naxis(2, at = c(0.2, 0.4, 0.6, 0.8,1))\nlegend(\"topright\",\n             fill=c(\"darkblue\",\"red\"), \n             legend = c(\"seronegative\", \"seropositive\"))\n\nbarplot(prop.column.percentages2, col=c(\"darkblue\",\"red\"), ylim=c(0,1.35), main=\"Seropositivity by Residence\")\naxis(2, at = c(0.2, 0.4, 0.6, 0.8,1))\nlegend(\"topright\", fill=c(\"darkblue\",\"red\"),  legend = c(\"seronegative\", \"seropositive\"))"
+  },
+  {
+    "objectID": "modules/Module10-DataVisualization.html#barplot-example-4",
+    "href": "modules/Module10-DataVisualization.html#barplot-example-4",
+    "title": "Module 10: Data Visualization",
+    "section": "barplot() example",
+    "text": "barplot() example"
+  },
+  {
+    "objectID": "archive/CaseStudy01.html#learning-goals",
+    "href": "archive/CaseStudy01.html#learning-goals",
+    "title": "Algorithmic Thinking Case Study 1",
+    "section": "Learning goals",
+    "text": "Learning goals\n\nUse logical operators, subsetting functions, and math calculations in R\nTranslate human-understandable problem descriptions into instructions that R can understand."
+  },
+  {
+    "objectID": "archive/CaseStudy01.html#instructions",
+    "href": "archive/CaseStudy01.html#instructions",
+    "title": "Algorithmic Thinking Case Study 1",
+    "section": "Instructions",
+    "text": "Instructions\n\nMake a new R script for this case study, and save it to your code folder.\nWe’ll use the diphtheria serosample data from Exercise 1 for this case study. Load it into R and use the functions we’ve learned to look at it."
+  },
+  {
+    "objectID": "archive/CaseStudy01.html#instructions-1",
+    "href": "archive/CaseStudy01.html#instructions-1",
+    "title": "Algorithmic Thinking Case Study 1",
+    "section": "Instructions",
+    "text": "Instructions\n\nMake a new R script for this case study, and save it to your code folder.\nWe’ll use the diphtheria serosample data from Exercise 1 for this case study. Load it into R and use the functions we’ve learned to look at it.\nThe str() of your dataset should look like this.\n\n\n\ntibble [250 × 5] (S3: tbl_df/tbl/data.frame)\n $ age_months  : num [1:250] 15 44 103 88 88 118 85 19 78 112 ...\n $ group       : chr [1:250] \"urban\" \"rural\" \"urban\" \"urban\" ...\n $ DP_antibody : num [1:250] 0.481 0.657 1.368 1.218 0.333 ...\n $ DP_infection: num [1:250] 1 1 1 1 1 1 1 1 1 1 ...\n $ DP_vacc     : num [1:250] 0 1 1 1 1 1 1 1 1 1 ..."
+  },
+  {
+    "objectID": "archive/CaseStudy01.html#q1-was-the-overall-prevalence-higher-in-urban-or-rural-areas",
+    "href": "archive/CaseStudy01.html#q1-was-the-overall-prevalence-higher-in-urban-or-rural-areas",
+    "title": "Algorithmic Thinking Case Study 1",
+    "section": "Q1: Was the overall prevalence higher in urban or rural areas?",
+    "text": "Q1: Was the overall prevalence higher in urban or rural areas?\n\n\nHow do we calculate the prevalence from the data?\nHow do we calculate the prevalence separately for urban and rural areas?\nHow do we determine which prevalence is higher and if the difference is meaningful?"
+  },
+  {
+    "objectID": "archive/CaseStudy01.html#q1-how-do-we-calculate-the-prevalence-from-the-data",
+    "href": "archive/CaseStudy01.html#q1-how-do-we-calculate-the-prevalence-from-the-data",
+    "title": "Algorithmic Thinking Case Study 1",
+    "section": "Q1: How do we calculate the prevalence from the data?",
+    "text": "Q1: How do we calculate the prevalence from the data?\n\n\nThe variable DP_infection in our dataset is binary / dichotomous.\nThe prevalence is the number or percent of people who had the disease over some duration.\nThe average of a binary variable gives the prevalence!\n\n\n\n\nmean(diph$DP_infection)\n\n[1] 0.8"
+  },
+  {
+    "objectID": "archive/CaseStudy01.html#q1-how-do-we-calculate-the-prevalence-separately-for-urban-and-rural-areas",
+    "href": "archive/CaseStudy01.html#q1-how-do-we-calculate-the-prevalence-separately-for-urban-and-rural-areas",
+    "title": "Algorithmic Thinking Case Study 1",
+    "section": "Q1: How do we calculate the prevalence separately for urban and rural areas?",
+    "text": "Q1: How do we calculate the prevalence separately for urban and rural areas?\n\n\nmean(diph[diph$group == \"urban\", ]$DP_infection)\n\n[1] 0.8235294\n\nmean(diph[diph$group == \"rural\", ]$DP_infection)\n\n[1] 0.778626\n\n\n\n\n\nThere are many ways you could write this code! You can use subset(), or you can write the indices in many different ways.\ntbl_df objects, such as those from haven, follow different [[ rules than a base R data frame."
+  },
+  {
+    "objectID": "archive/CaseStudy01.html#q1-how-do-we-calculate-the-prevalence-separately-for-urban-and-rural-areas-1",
+    "href": "archive/CaseStudy01.html#q1-how-do-we-calculate-the-prevalence-separately-for-urban-and-rural-areas-1",
+    "title": "Algorithmic Thinking Case Study 1",
+    "section": "Q1: How do we calculate the prevalence separately for urban and rural areas?",
+    "text": "Q1: How do we calculate the prevalence separately for urban and rural areas?\n\nOne easy way is to use the aggregate() function.\n\n\naggregate(DP_infection ~ group, data = diph, FUN = mean)\n\n  group DP_infection\n1 rural    0.7786260\n2 urban    0.8235294"
+  },
+  {
+    "objectID": "archive/CaseStudy01.html#q1-how-do-we-determine-which-prevalence-is-higher-and-if-the-difference-is-meaningful",
+    "href": "archive/CaseStudy01.html#q1-how-do-we-determine-which-prevalence-is-higher-and-if-the-difference-is-meaningful",
+    "title": "Algorithmic Thinking Case Study 1",
+    "section": "Q1: How do we determine which prevalence is higher and if the difference is meaningful?",
+    "text": "Q1: How do we determine which prevalence is higher and if the difference is meaningful?\n\n\nWe probably need to include a confidence interval in our calculation.\nThis is actually not so easy without more advanced tools that we will learn in upcoming modules.\nRight now the best options are to do it by hand or google a function."
+  },
+  {
+    "objectID": "archive/CaseStudy01.html#q1-by-hand",
+    "href": "archive/CaseStudy01.html#q1-by-hand",
+    "title": "Algorithmic Thinking Case Study 1",
+    "section": "Q1: By hand",
+    "text": "Q1: By hand\n\np_urban &lt;- mean(diph[diph$group == \"urban\", ]$DP_infection)\np_rural &lt;- mean(diph[diph$group == \"rural\", ]$DP_infection)\nse_urban &lt;- sqrt(p_urban * (1 - p_urban) / nrow(diph[diph$group == \"urban\", ]))\nse_rural &lt;- sqrt(p_rural * (1 - p_rural) / nrow(diph[diph$group == \"rural\", ])) \n\nresult_urban &lt;- paste0(\n    \"Urban: \", round(p_urban, 2), \"; 95% CI: (\",\n    round(p_urban - 1.96 * se_urban, 2), \", \",\n    round(p_urban + 1.96 * se_urban, 2), \")\"\n)\n\nresult_rural &lt;- paste0(\n    \"Rural: \", round(p_rural, 2), \"; 95% CI: (\",\n    round(p_rural - 1.96 * se_rural, 2), \", \",\n    round(p_rural + 1.96 * se_rural, 2), \")\"\n)\n\ncat(result_urban, result_rural, sep = \"\\n\")\n\nUrban: 0.82; 95% CI: (0.76, 0.89)\nRural: 0.78; 95% CI: (0.71, 0.85)"
+  },
+  {
+    "objectID": "archive/CaseStudy01.html#q1-by-hand-1",
+    "href": "archive/CaseStudy01.html#q1-by-hand-1",
+    "title": "Algorithmic Thinking Case Study 1",
+    "section": "Q1: By hand",
+    "text": "Q1: By hand\n\nWe can see that the 95% CIs overlap, so the groups are probably not that different. To be sure, we need to do a 2-sample test! But this is not a statistics class.\nSome people will tell you that coding like this is “bad”. But “bad” code that gives you answers is better than broken code! We will learn techniques for writing this with less work and less repetition in upcoming modules."
+  },
+  {
+    "objectID": "archive/CaseStudy01.html#q1-googling-a-package",
+    "href": "archive/CaseStudy01.html#q1-googling-a-package",
+    "title": "Algorithmic Thinking Case Study 1",
+    "section": "Q1: Googling a package",
+    "text": "Q1: Googling a package\n\n\n# install.packages(\"DescTools\")\nlibrary(DescTools)\n\naggregate(DP_infection ~ group, data = diph, FUN = DescTools::MeanCI)\n\n  group DP_infection.mean DP_infection.lwr.ci DP_infection.upr.ci\n1 rural         0.7786260           0.7065872           0.8506647\n2 urban         0.8235294           0.7540334           0.8930254"
+  },
+  {
+    "objectID": "archive/CaseStudy01.html#you-try-it",
+    "href": "archive/CaseStudy01.html#you-try-it",
+    "title": "Algorithmic Thinking Case Study 1",
+    "section": "You try it!",
+    "text": "You try it!\n\nUsing any of the approaches you can think of, answer this question!\nHow many children under 5 were vaccinated? In children under 5, did vaccination lower the prevalence of infection?"
+  },
+  {
+    "objectID": "archive/CaseStudy01.html#you-try-it-1",
+    "href": "archive/CaseStudy01.html#you-try-it-1",
+    "title": "Algorithmic Thinking Case Study 1",
+    "section": "You try it!",
+    "text": "You try it!\n\n# How many children under 5 were vaccinated\nsum(diph$DP_vacc[diph$age_months &lt; 60])\n\n[1] 91\n\n# Prevalence in both vaccine groups for children under 5\naggregate(\n    DP_infection ~ DP_vacc,\n    data = subset(diph, age_months &lt; 60),\n    FUN = DescTools::MeanCI\n)\n\n  DP_vacc DP_infection.mean DP_infection.lwr.ci DP_infection.upr.ci\n1       0         0.4285714           0.1977457           0.6593972\n2       1         0.6373626           0.5366845           0.7380407\n\n\nIt appears that prevalence was HIGHER in the vaccine group? That is counterintuitive, but the sample size for the unvaccinated group is too small to be sure."
+  },
+  {
+    "objectID": "archive/CaseStudy01.html#congratulations-for-finishing-the-first-case-study",
+    "href": "archive/CaseStudy01.html#congratulations-for-finishing-the-first-case-study",
+    "title": "Algorithmic Thinking Case Study 1",
+    "section": "Congratulations for finishing the first case study!",
+    "text": "Congratulations for finishing the first case study!\n\nWhat R functions and skills did you practice?\nWhat other questions could you answer about the same dataset with the skills you know now?"
   }
 ]
\ No newline at end of file
diff --git a/exercises/Exercise02-Soln.R b/exercises/Exercise02-Soln.R
index 4e699ed..1a23f5b 100644
--- a/exercises/Exercise02-Soln.R
+++ b/exercises/Exercise02-Soln.R
@@ -84,12 +84,19 @@ oceania_countries <- cr[cr$region == "Oceania", "alpha.3"]
 # year recorded in the MeaslesCases data set. Ignore missing data values.
 # Which year had the highest number of measles cases in Oceania?
 # Which year had the lowest number of cases?
+# HINT 1: First subset the cases data so you only have the rows for countries in
+# Oceania, and you only have the columns which are cases in each year.
+# HINT 2: you can select a sequence of adjacent columns in the dataset using
+# the : (colon) operator, e.g. to get all of the years, you can use
+# X2023:X1980 as the "select" argument in subset().
+# HINT 3: After you subset the data, you can get the sums across each column
+# (e.g. with colSums()) and look at the highest and lowest values.
 colSums(
 	subset(cases, iso3c %in% oceania_countries, X2023:X1980),
 	na.rm = TRUE
 )
 
-# Part E: Are there any countries where the average number of measles cases from
+# BONUS PROBLEM Part E: Are there any countries where the average number of measles cases from
 # 2020 - 2023 for that country is higher than the average number of measles
 # cases from 1997 - 2000?
 # Only consider countries with no missing values during these periods. (HINT:
diff --git a/exercises/Exercise02.R b/exercises/Exercise02.R
index 21b50a3..eedb411 100644
--- a/exercises/Exercise02.R
+++ b/exercises/Exercise02.R
@@ -4,6 +4,8 @@
 # Zane Billings and Amy Winter
 ################################################################################
 
+# xxzane add question about variable classes from ex1 here
+
 # Question 1 ###################################################################
 # For this question, we'll use the measles vaccination coverage dataset.
 # Part A: load the file "MeaslesVaccinationCoverage.csv" into R.
diff --git a/exercises/Exercise03-Soln.R b/exercises/Exercise03-Soln.R
index 5e7607a..4002180 100644
--- a/exercises/Exercise03-Soln.R
+++ b/exercises/Exercise03-Soln.R
@@ -25,7 +25,7 @@ cases_long <-
 	reshape(
 		cases,
 		direction = "long",
-		varying = which(startsWith(names(cases), "X")),
+		varying = paste0("X", 2023:1980),
 		v.names = "Cases",
 		idvar = names(cases)[1:3],
 		times = 1980:2023
diff --git a/exercises/Exercise04-Soln.R b/exercises/Exercise04-Soln.R
index 9921492..1feac5e 100644
--- a/exercises/Exercise04-Soln.R
+++ b/exercises/Exercise04-Soln.R
@@ -80,7 +80,7 @@ prev_vac_by_decade <-
 # Part H: install the package "tinytable". Then use the function
 # tinytable::tt() on the result of Part G. This will give you a nice looking
 # table that's almost ready for publication!
-# install.packages(tinytable)
+# install.packages("tinytable")
 tinytable::tt(prev_vac_by_decade)
 
 # If you want to read more about tinytable, you can see the documentation here
@@ -138,7 +138,7 @@ crude_or <- o_v / o_u
 # package and look at the documentation/google to figure out how to calculate
 # an odds ratio using this package.
 # This also calculates the confidence interval for us!
-# install.packages("epitools")
+install.packages("epitools")
 epitools::epitab(
 	x = diph$DP_vacc,
 	y = diph$DP_infection,
@@ -177,7 +177,7 @@ model_ie <- glm(
 )
 summary(model_ie)
 
-anova(model_ie, model_me)
+anova(model_ie, model_me, test = "LRT")
 
 # Part G: for this question, choose the correct logistic regression model to
 # use based on the results of the likelihood ratio test you just performed.
@@ -199,7 +199,7 @@ exp(5.94)
 # epidemiology tasks for us!
 # (Note that the default CI method for confint is slightly different, so your
 # results might not be exactly the same. That's ok!)
-# install.packages(epiDisplay)
+install.packages("epiDisplay")
 epiDisplay::logistic.display(model_ie)
 
 # Part I: Final question for this model! Good job so far! Next we want to
diff --git a/exercises/Exercise05-Soln.R b/exercises/Exercise05-Soln.R
index eebdb41..81785f3 100644
--- a/exercises/Exercise05-Soln.R
+++ b/exercises/Exercise05-Soln.R
@@ -230,7 +230,7 @@ barplot(
 # fit a logistic regression model where DP_infection is the outcome and
 # DP_Antibody is the only predictor.
 diph_mod <- glm(
-	DP_vacc ~ DP_antibody,
+	DP_infection ~ DP_antibody,
 	data = diph,
 	family = "binomial"
 )
diff --git a/modules/Module00-Welcome.qmd b/modules/Module00-Welcome.qmd
index 5eac464..c4814a8 100644
--- a/modules/Module00-Welcome.qmd
+++ b/modules/Module00-Welcome.qmd
@@ -4,16 +4,23 @@ format:
   revealjs:
     scrollable: true
     smaller: true
+    toc: false
 ---
 
 ## Welcome to SISMID Workshop: Introduction to R!
 
 **Amy Winter (she/her)** 
+
 Assistant Professor, Department of Epidemiology and Biostatistics
+
 Email: awinter@uga.edu
 
+<br>
+
 **Zane Billings (he/him)** 
+
 PhD Candidate, Department of Epidemiology and Biostatistics
+
 Email: Wesley.Billings@uga.edu
 
 
@@ -57,7 +64,7 @@ knitr::include_graphics("https://www.r-project.org/logo/Rlogo.png")
 ## What is R?
 
 - Program: R is a clear and accessible programming tool
-- Transform: R is made up of a collection of libraries designed specifically for data science
+- Transform: R is made up of a collection of packages/libraries designed specifically for statistical computing
 - Discover: Investigate the data, refine your hypothesis and analyze them
 - Model: R provides a wide array of tools to capture the right model for your data
 - Communicate: Integrate codes, graphs, and outputs to a report with R Markdown or build Shiny apps to share with the world
@@ -86,28 +93,27 @@ knitr::include_graphics("https://www.r-project.org/logo/Rlogo.png")
 * Slower, and more memory intensive, than the more traditional programming languages (C, Perl, Python)
 
 
-## Is R DIfficult?
+## Is R Difficult?
 
-* Short answer – It has a steep learning curve. 
-* Years ago, R was a difficult language to master. The language was confusing and not as structured as the other programming tools. 
+* Short answer – It has a steep learning curve, like all programming languages
+* Years ago, R was a difficult language to master. 
 * Hadley Wickham developed a collection of packages called tidyverse. Data manipulation became trivial and intuitive. Creating a graph was not so difficult anymore.
 
 
-
 ## Overall Workshop Objectives
 
 By the end of this workshop, you should be able to 
 
 1. start a new project, read in data, and conduct basic data manipulation, analysis, and visualization
 2. know how to use and find packages/functions that we did not specifically learn in class
-3. troubleshoot errors (xxzane? -- not included right now)
+3. troubleshoot errors
 
 
 ## This workshop differs from "Introduction to Tidyverse"
 
 We will focus this class on using **Base R** functions and packages, i.e., pre-installed into R and the basis for most other functions and packages! If you know Base R then are will be more equipped to use all the other useful/pretty packages that exit.
 
-the Tidyverse is one set of useful/pretty packages, designed to can make your code more **intuitive** as compared to the original older Base R. **Tidyverse advantages**:  
+The Tidyverse is one set of useful/pretty packages, designed to make your code more **intuitive** compared to the original, older Base R. **Tidyverse advantages**:  
 
 -	**consistent structure** - making it easier to learn how to use different packages
 -	particularly good for **wrangling** (manipulating, cleaning, joining) data  
@@ -121,11 +127,13 @@ knitr::include_graphics("https://tidyverse.tidyverse.org/logo.png")
 ## Workshop Overview
 
 14 lecture blocks that will each:
+
 - Start with learning objectives
 - End with summary slides
 - Include mini-exercise(s) or a full exercise
 
 Themes that will show up throughout the workshop:
+
 - Reproducibility
 - Good coding techniques
 - Thinking algorithmically
@@ -136,11 +144,6 @@ Themes that will show up throughout the workshop:
 
 xxzane slides
 
-## Good coding techniques
-
-
-## Thinking algorithmically 
-
 
 ## Useful (+ Free) Resources
 
diff --git a/modules/Module01-Intro.qmd b/modules/Module01-Intro.qmd
index 49008f0..9191e01 100644
--- a/modules/Module01-Intro.qmd
+++ b/modules/Module01-Intro.qmd
@@ -4,6 +4,7 @@ format:
   revealjs:
     scrollable: true
     smaller: true
+    toc: false
 ---
 
 ## Learning Objectives
@@ -11,11 +12,11 @@ format:
 After module 1, you should be able to...
 
 -   Create and save an R script
--   Describe the utility and differences b/w the console and an R script
--   Modify R Studio windows
+-   Describe the utility and differences b/w the Console and the Source panes
+-   Modify RStudio panes
 -   Create objects
 -   Describe the difference b/w character, numeric, list, and matrix objects
--   Reference objects in the RStudio Global Environment
+-   Reference objects in the RStudio Environment pane
 -   Use basic arithmetic operators in R
 -   Use comments within an R script to create header, sections, and make notes
 
@@ -26,7 +27,7 @@ RStudio is an Integrated Development Environment (IDE) for R
 -   It helps the user effectively use R
 -   Makes things easier
 -   Is NOT a dropdown statistical tool (such as Stata)
-    -   See [Rcmdr](https://cran.r-project.org/web/packages/Rcmdr/index.html) or [Radiant](http://vnijs.github.io/radiant/)
+    -   See [jamovi](https://www.jamovi.org/), [Rcmdr](https://cran.r-project.org/web/packages/Rcmdr/index.html), or [Radiant](http://vnijs.github.io/radiant/)
 
 ```{r, fig.alt="RStudio logo", out.width = "30%", echo = FALSE, fig.align='center'}
 knitr::include_graphics("https://d33wubrfki0l68.cloudfront.net/62bcc8535a06077094ca3c29c383e37ad7334311/a263f/assets/img/logo.svg")
@@ -60,10 +61,14 @@ knitr::include_graphics("images/both.png")
 
 ## Working with R in RStudio - 2 major panes:
 
-1)  The **Source/Editor**: "Analysis" Script + Interactive Exploration
-    -   Static copy of what you did (reproducibility)
-    -   Top by default
-2)  The **R Console**: "interprets" whatever you type
+1) The **Source/Editor**: xxamy
+
+- "Analysis" Script
+- Static copy of what you did (reproducibility)
+- Top by default
+    
+2)  The **R Console**: "interprets" whatever you type:
+
     -   Calculator
     -   Try things out interactively, then add to your editor
     -   Bottom by default
@@ -102,7 +107,7 @@ knitr::include_graphics("images/rstudio_sheet.png")
 
 If RStudio doesn't look the way you want (or like our RStudio), then do:
 
-RStudio --\> View --\> Panes --\> Pane Layout
+In the RStudio menu bar, go to View --\> Panes --\> Pane Layout
 
 ```{r, out.width = "500px", echo = FALSE, fig.align='center'}
 knitr::include_graphics("images/pane_layout.png")
@@ -120,7 +125,7 @@ knitr::include_graphics("images/rstudio_environment.png")
 ## Workspace/History
 
 -   Shows previous commands. Good to look at for debugging, but **don't rely** on it.
--   Also type the "up" key in the Console to scroll through previous commands
+-   You can also press the "up" and "down" keys in the Console to scroll through previous commands
 
 ## Workspace/Other Panes
 
@@ -132,19 +137,22 @@ knitr::include_graphics("images/rstudio_environment.png")
 
 ## Getting Started
 
--   File --\> New File --\> R Script
+-   In the RStudio menu bar, go to File --\> New File --\> R Script
 -   Save the blank R script as Module1.R
 
 ## Explaining output on slides
 
-In slides, a command (we'll also call them code or a code chunk) will look like this
+In slides, the R command/code will be in a box, and directly after it will be the output of the code, starting with `[1]`
 
 ```{r echo=T}
 print("I'm code")
 ```
 
-And then directly after it, will be the output of the code.  
-So `print("I'm code")` is the code chunk and `[1] "I'm code"` is the output.
+So `print("I'm code")` is the command and `[1] "I'm code"` is the output.
+
+<br>
+
+Commands/code and output written as inline text will be in a blue typewriter font. For example, `code`.
 
 ## R as a calculator
 
@@ -159,16 +167,16 @@ You can do basic arithmetic in R, which I surprisingly use all the time.
 ## R as a calculator
 
 - The R console is a full calculator
-- Try to play around with it:
-    - +, -, /, * are add, subtract, divide and multiply
-    - ^ or ** is power
-    - parentheses -- ( and ) -- work with order of operations 
-    - %% finds the remainder
+- Arithmetic operators:
+    - `+`, `-`, `/`, `*` are add, subtract, divide and multiply
+    - `^` or `**` is power
+    - parentheses -- `(` and `)` -- work with order of operations 
+    - `%%` finds the remainder
     
 
 ## Execute / Run Code
 
-To execute or run a line of code, you just put your cursor on line of code and then:
+To execute or run a line of code (i.e., command), you just put your cursor on the command and then:
 
   1. Press Run (which you will find at the top of your window)
 
@@ -176,7 +184,7 @@ To execute or run a line of code, you just put your cursor on line of code and t
 
   2. Press `Cmd + Return` (iOS) OR `Ctrl + Enter` (Windows).
 
-To execute or run multiple lines of code, you just need to highlight the code you want to run and then follow option 1 or 2.
+To execute or run multiple lines of code, you need to highlight the code you want to run and then follow option 1 or 2.
 
 ## Mini exercise 
 
@@ -205,7 +213,7 @@ Add a comment header to Module1.R.  This is the one I typically use, but you may
 
 ## Commenting to create sections
 
-You can also create sections within your code by ending a comment with 4 hash marks. **This is very useful for creating an outline of your R Script.** The "Outline" can be found in the top right of the your source window.
+You can also create sections within your code by ending a comment with 4 hash marks. **This is very useful for creating an outline of your R Script.** The "Outline" can be found in the top right of your Source pane.
 
 ```{r, echo=T, eval=F}
 # Section 1 Header ####
@@ -231,12 +239,10 @@ knitr::include_graphics("images/outline.png")
 # Take it to another line
 ```
 
-## Commenting to explain code
-
 I tend to use:
 
--   One hash tag with a space to describe what is happening in the following few lines of code
--   One hastag with no space after a command to list specifics 
+-   One hash mark with a space to describe what is happening in the following few lines of code
+-   One hash mark with no space after a command to list specifics 
 
 ```{r, echo=T, eval=F}
 # Practicing my arithmetic
@@ -264,7 +270,7 @@ I tend to use:
 
 - You can create objects from within the R environment and from files on your computer
 - R uses `<-` to assign values to an object name 
-- Note: Object names are case-sensitive, i.e. X and x are different
+- Note: Object names are case-sensitive, i.e. `X` and `x` are different
 - Here are examples of creating five different objects:
 ```{r echo=T}
 number.object <- 3
@@ -298,7 +304,7 @@ knitr::include_graphics("images/global_env.png")
 ```
 
 
-Also, you can call them anytime (i.e, see them in the Console) by executing (running) the object.  For example,
+Also, you can print them anytime (i.e., see them in the Console) by executing (running) the object.  For example,
 
 ```{r, echo = TRUE}
 character.object
@@ -309,9 +315,11 @@ matrix.object
 ```
 
 
-## Assignment - Good coding
+# Object names and assignment - Good coding
+
+xxzane
 
-`=` and `<-` can both be used for assignment, but `<-` is better coding practice, because `==` is a logical operator. We will talk about this more, later.
+`=` and `<-` can both be used for assignment, but `<-` is better coding practice, because `=` doesn't work in some contexts and we want to distinguish assignment from the logical operator `==`. We will talk about this more later.
 
 ## Lists
 
@@ -352,7 +360,7 @@ knitr::include_graphics("images/tab.completion.png")
 -   The Editor is for static code like R Scripts
 -   The Console is for testing code that can't be saved
 -   Commenting is your new best friend
--   In R we create objects that can be viewed in the Environment panel and called anytime
+-   In R we create objects that can be viewed in the Environment pane and used anytime
 -   An object is something that can be worked with in R
 -   Use `<-` syntax to create objects
 
@@ -361,7 +369,7 @@ knitr::include_graphics("images/tab.completion.png")
 
 1. Create a new number object and name it `my.object`
 2. Create a vector of 4 numbers and name it `my.vector` using the `c()` function
-3. Add `my.object` and `my.vector` together use arithmatic operator
+3. Add `my.object` and `my.vector` together using an arithmetic operator
 
 ## Acknowledgements
 
diff --git a/modules/Module02-Functions.qmd b/modules/Module02-Functions.qmd
index 49ffa26..cb495a8 100644
--- a/modules/Module02-Functions.qmd
+++ b/modules/Module02-Functions.qmd
@@ -4,6 +4,7 @@ format:
   revealjs:
     scrollable: true
     smaller: true
+    toc: false
 ---
 
 ## Learning Objectives
@@ -18,7 +19,7 @@ After module 2, you should be able to...
 
 ## Function - Basic term
 
-**Function** - Functions are "self contained" modules of code that accomplish specific tasks. Functions usually take in some sort of object (e.g., vector, list), process it, and return a result. You can write your own, use functions that come directly from installing R (i.e., Base R functions), or use functions from external packages.
+**Function** - Functions are "self contained" modules of code that **accomplish specific tasks**. Functions usually take in some sort of object (e.g., vector, list), process it, and return a result. You can write your own, use functions that come directly from installing R (i.e., Base R functions), or use functions from external packages.
 
 A function might help you add numbers together, create a plot, or organize your data. In fact, we have already used three functions in the Module 1, including `c()`, `matrix()`, `list()`. Here is another one, `sum()`
 
@@ -29,7 +30,7 @@ sum(1, 20234)
 
 ## Function
 
-The general usage for a function is the name of the function followed by parentheses. Within the parentheses are **arguments**.
+The general usage for a function is the name of the function followed by parentheses (i.e., the function signature). Within the parentheses are **arguments**.
 
 ```{r echo=TRUE, eval=FALSE}
 function_name(argument1, argument2, ...)
@@ -107,7 +108,7 @@ log(10, base=2)
 
 ## Package - Basic term
 
-When you download R, it has a "base" set of functions, that are associated with a "base" set of packages including: 'base', 'datasets', 'graphics', 'grDevices', 'methods', 'stats', 'methods' (typically just referred to as **Base R**).
+When you download R, it has a "base" set of functions, that are associated with a "base" set of packages including: 'base', 'datasets', 'graphics', 'grDevices', 'methods', 'stats' (typically just referred to as **Base R**).
 
 -   e.g., the `log()` function comes from the 'base' package
 
@@ -117,27 +118,27 @@ Packages are analogous to software applications like Microsoft Word. After insta
 
 ## Packages
 
-The Packages window in RStudio can help you identify what have been installed (listed), and which one have been called (check mark).
+The Packages pane in RStudio can help you identify which packages have been installed (listed), and which ones have been attached (check mark).
 
-Lets go look at the Packages window, find the `base` package and find the `log()` function. It automatically loads the help file that we looked at earlier using `?log`.
+Let's go look at the Packages pane, find the `base` package, and find the `log()` function. It automatically loads the help file that we looked at earlier using `?log`.
 
 
 ## Additional Packages
 
-You can install additional packages for your uses from [CRAN](https://cran.r-project.org/) or [GitHub](https://github.com/). These additional packages are written by RStudio or R users/developers (like us)
+You can install additional packages for your use from [CRAN](https://cran.r-project.org/) or [GitHub](https://github.com/). These additional packages are written by RStudio or R users/developers (like us)
 
 -   Not all packages available on CRAN or GitHub are trustworthy
 -   RStudio (the company) makes a lot of great packages
 -   Who wrote it? **Hadley Wickham** is a major authority on R (Employee and Developer at RStudio)
 -   How to [trust](https://simplystatistics.org/posts/2015-11-06-how-i-decide-when-to-trust-an-r-package/#:~:text=The%20first%20thing%20I%20do,I%20immediately%20trust%20the%20package.) an R package
 
-## **Installing** and calling packages
+## **Installing** and attaching packages
 
-To use the bundle or "package" of code (and or possibly data) from a package, you need to install and also call the package.
+To use the bundle or "package" of code (and possibly data) from a package, you need to install and also attach the package.
 
 To install a package you can 
 
-1. go to Tools ---\> Install Packages in the RStudio header
+1. go to the Tools menu in the RStudio menu bar ---\> Install Packages
 
 OR
 
@@ -147,25 +148,25 @@ install.packages("package_name")
 ```
 
 
-## Installing and **calling** packages
+## Installing and **attaching** packages
 
-To call (i.e., be able to use the package) you can use the following code:
+To attach a package (i.e., be able to use it), you can use the following code:
 
 ```{r echo=TRUE, eval=FALSE}
-library(package_name)
+require(package_name) #library(package_name) also works
 ```
 
-More on installing and calling packages later...
+More on installing and attaching packages later...
 
 
-## Mini Exercise
+## Mini exercise
 
 Find and execute a **Base R** function that will round the number 0.86424 to two digits.
 
 
 ## Functions from Module 1
 
-The combine function `c()` collects/combines/joins single R objects into a vector of R objects. It is mostly used for creating vectors of numbers, character strings, and other data types. 
+The combine function `c()` concatenates/collects/combines single R objects into a vector of R objects. It is mostly used for creating vectors of numbers, character strings, and other data types. 
 
 ```{r echo=TRUE, eval=FALSE}
 ?c
@@ -185,10 +186,9 @@ The `matrix()` function creates a matrix from the given set of values.
 ?matrix
 ```
 
-xxamy - doesn't seem to work - may need to paste in a screen shot figure
 ```{r echo=FALSE}
 library(printr)
-?matix
+?matrix
 ```
 
 
@@ -197,8 +197,8 @@ library(printr)
 - Functions are "self contained" modules of code that accomplish specific tasks.
 - Arguments are what you pass to functions (e.g., objects on which you carry out the task or options for how to carry out the task)
 - Arguments may include defaults that the author of the function specified as being "good enough in standard cases", but that can be changed.
-- An R Package is a bundle or "package" of code (and or possibly data) that can be used by installing it once and calling it (using `library()`) each time R/Rstudio is opened
-- The Help window in RStudio is useful for to get more information about functions and packages 
+- An R Package is a bundle or "package" of code (and possibly data) that can be used by installing it once and attaching it (using `require()`) each time R/RStudio is opened
+- The Help pane in RStudio is useful for getting more information about functions and packages 
 
 
 ## Acknowledgements
diff --git a/modules/Module03-WorkingDirectories.qmd b/modules/Module03-WorkingDirectories.qmd
index 13bc943..de3d643 100644
--- a/modules/Module03-WorkingDirectories.qmd
+++ b/modules/Module03-WorkingDirectories.qmd
@@ -4,13 +4,14 @@ format:
   revealjs:
     scrollable: true
     smaller: true
+    toc: false
 ---
 
 ## Learning Objectives
 
 After module 3, you should be able to...
 
--   Understand your own systems file structure and the purpose of the working directory
+-   Understand your own system's file structure and the purpose of the working directory
 -   Determine the working directory
 -   Change the working directory
 
@@ -21,7 +22,7 @@ xxzane slide(s)
 ## Working Directory -- Basic term
 
 -   R "looks" for files on your computer relative to the "working" directory
--   For example, if you want to load data into R or save a figure, you will need to tell R where/store the file
+-   For example, if you want to load data into R or save a figure, you will need to tell R where to look for or store the file
 -   Many people recommend not setting a directory in the scripts, rather assume you're in the directory the script is in
 
 
@@ -70,9 +71,9 @@ setwd("~/Dropbox/Git/SISMID-2024")
 
 ## Setting the Working Directory
 
-If you have not yet saved a "source" file, it will set working directory to the default location. See RStudio -\> Preferences -\> General for default location.
+If you have not yet saved a "source" file, it will set the working directory to the default location. Find the Tools menu in the menu bar -\> Global Options -\> General to see the default location.
 
-To change the working directory to another location, go to Session --\> Set Working Directory --\> Choose Directory`
+To change the working directory to another location, find the Session menu in the menu bar --\> Set Working Directory --\> Choose Directory
 
 Again, RStudio will show the code in the Console for the action you took with your cursor.
 
@@ -82,7 +83,7 @@ Again, RStudio will show the code in the Console for the action you took with yo
 -   R "looks" for files on your computer relative to the "working" directory
 -   Absolute path points to the same location in a file system - it is specific to your system and your system alone
 -   Relative path points is based on the current working directory 
--   Two functions, `setwd()` and `getwd()`, are your new best friends.
+-   Two functions, `setwd()` and `getwd()`, are useful for identifying and manipulating the working directory.
 
 
 ## Acknowledgements
diff --git a/modules/Module04-RProject.qmd b/modules/Module04-RProject.qmd
index 41aa2c4..04125a3 100644
--- a/modules/Module04-RProject.qmd
+++ b/modules/Module04-RProject.qmd
@@ -4,6 +4,7 @@ format:
   revealjs:
     scrollable: true
     smaller: true
+    toc: false
 ---
 
 ## Learning Objectives
@@ -12,7 +13,7 @@ After module 4, you should be able to...
 
 -   Create an R Project
 -   Check you are in the desired R Project
--   Reference the Files window in RStudio
+-   Reference the Files pane in RStudio
 -   Describe "good" R Project organization
 
 ## RStudio Project
@@ -27,9 +28,9 @@ RStudio "Project" is one highly recommended strategy to build organized and repr
 
 Let's create a new RStudio Project.
 
-Go to File --\> New Project --\> New Directory --\> New Project
+Find the File Menu in the Menu Bar --\> New Project --\> New Directory --\> New Project
 
-Call your Project "IntroToR_RProject"
+Name your Project "SISMID_IntroToR_RProject"
 
 ## RStudio Project Organization
 
@@ -42,27 +43,27 @@ Create 4 sub-directories with the following names within your "SISMID_IntroToR_R
 -   output
 -   figures
 
-We will be working from this directory for the remainder of the Workshop. Take a moment to move any R scripts you have already created to the 'code' sub-directories. 
+We will be working from this directory for the remainder of the Workshop. Take a moment to move any R scripts you have already created to the 'code' sub-directory. 
 
 
 ## Some things to notice in an R Project
 
-1. The name of the R Project will be shown at the top of the RStudio application
+1. The name of the R Project will be shown at the top of the RStudio Window
 2. If you check the working directory using `getwd()` you will find the working directory is set to the location where the R Project was saved.
-3. The Files window in RStudio is also set to the location where the R Project was saved, making it easy to navigate to sub-directories directly from RStudio.
+3. The Files pane in RStudio is also set to the location where the R Project was saved, making it easy to navigate to sub-directories directly from RStudio.
 
 
 ## R Project - Common issues
 
 If you simply open RStudio, it will not automatically open your R Project.  As a result, when you say run a function to import data using the relative path based on your working directory, it won't be able to find the data.
 
-To open a previously created R Project, you need to open the R Project (i.e., SISMID_IntroToR_RProject.RProj)
+To open a previously created R Project, you need to open the R Project (i.e., double click on SISMID_IntroToR_RProject.Rproj)
 
 ## Summary
 
--   R Projects are really helpful for lots of reasons, including to improve the reproducibility of your work
--   Consistently set up your R Project's sub-directories so that you can easily navigate the project
--		If you get an error that a file can't be found, make sure you correctly opened the R Project by looking for the Project name at the top of the RStudio application window.
+- R Projects are really helpful for lots of reasons, including to improve the reproducibility of your work
+- Consistently set up your R Project's sub-directories so that you can easily navigate the project
+- If you get an error that a file can't be found, make sure you correctly opened the R Project by looking for the Project name at the top of the RStudio application window.
 
 
 ## Mini Exercise
diff --git a/modules/Module05-DataImportExport.qmd b/modules/Module05-DataImportExport.qmd
index e8ba99d..37e8fa8 100644
--- a/modules/Module05-DataImportExport.qmd
+++ b/modules/Module05-DataImportExport.qmd
@@ -4,6 +4,7 @@ format:
   revealjs:
     scrollable: true
     smaller: true
+    toc: false
 ---
 
 ## Learning Objectives
@@ -11,15 +12,15 @@ format:
 After module 5, you should be able to...
 
 -   Use Base R functions to load data
--   Install and call external R Packages to extend R's functionality
--   Install any type of data into R
--   Find loaded data in the Global Environment window of RStudio
+-   Install and attach external R Packages to extend R's functionality
+-   Load any type of data into R
+-   Find loaded data in the Environment pane of RStudio
 -   Reading and writing R .Rds and .Rda/.RData files
 
 
 ## Import (read) Data
 
--   Importing or 'Reading in' data is the first step of any real project/analysis
+-   Importing or 'Reading in' data is the first step of any real project / data analysis
 -   R can read almost any file format, especially with external, non-Base R, packages
 -   We are going to focus on simple delimited files first. 
     -   comma separated (e.g. '.csv')
@@ -31,9 +32,13 @@ A delimited file is a sequential file with column delimiters. Each delimited fil
 
 1. Download Module 5 data from the website and save the data to your data subdirectory -- specifically `SISMID_IntroToR_RProject/data`
 
-2. Open the data files in a text editor application and familiarize you self with the data.
+1. Open the '.csv' and '.txt' data files in a text editor application (e.g., Notepad on Windows or TextEdit on Mac) and familiarize yourself with the data
 
-3. Determine the delminiter of the two '.txt' files
+1. Open the '.xlsx' data file in Excel and familiarize yourself with the data
+    -   if you use a Mac, **do not** open it in Numbers; it can corrupt the file
+    -   if you do not have Excel, you can upload it to Google Sheets
+
+1. Determine the delimiter of the two '.txt' files
 
 
 ## Import delimited data
@@ -51,18 +56,18 @@ library(printr)
 
 ## Import .csv files
 
-Reminder
+Function signature reminder
 ```
 read.csv(file, header = TRUE, sep = ",", quote = "\"",
          dec = ".", fill = TRUE, comment.char = "", ...)
 ```
 
-`file` is the first argument and is the path to your file, in quotes 
-
-	- 		can be path in your local computer -- absolute file path or relative file path 
-	- 		can be path to a file on a website
+-   `file` is the first argument and is the path to your file, in quotes 
+
+    -   can be a path on your local computer -- absolute file path or relative file path 
+    -   can be a path to a file on a website
 
-## Mini Exercise
+## Mini exercise
 
 If your R Project is not already open, open it so we take advantage of it setting a useful working directory for us in order to import data.
 
@@ -74,30 +79,30 @@ Lets import a new data file
 ```{r, echo=TRUE, eval = FALSE}
 ## Examples
 df <- read.csv(file = "data/serodata.csv") #relative path
-df <- read.csv(file = "~/Dropbox/Git/SISMID-2024/modules/data/serodata.csv") #absolute path starting from my home directory
 ```
 
 
 Note #1, I assigned the data frame to an object called `df`.  I could have called the data anything, but in order to use the data (i.e., as an object we can find in the Environment), I need to assign it as an object. 
 
-Note #2, Look to the Environment window, you will see the `df` object ready to be used.
+Note #2, Look to the Environment pane, you will see the `df` object ready to be used.
 
 
 ## Import .txt files
 
 `read.csv()` is a special case of `read.delim()` -- a general function to read a delimited file into a data frame  
 
+Function signature reminder
 ```
 read.delim(file, header = TRUE, sep = "\t", quote = "\"",
            dec = ".", fill = TRUE, comment.char = "", ...)
 ```
 
-- `file` is the path to your file, in quotes 
-- `delim` is what separates the fields within a record. The default for csv is comma
+		- `file` is the path to your file, in quotes 
+		- `sep` is what separates the fields within a record. The default for `read.delim()` is a tab (`"\t"`); the default for `read.csv()` is a comma
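+
+As a minimal sketch of how `sep` changes what gets read (the `toy_text` and `toy` names are made up for illustration, and the `text` argument is used only so the example runs without a file on disk):
+
+```{r echo=TRUE}
+# two records with semicolon-separated fields, standing in for a small .txt file
+toy_text <- "id;age\n1;5\n2;12"
+# telling read.delim() that the delimiter is ";" splits each record into two columns
+toy <- read.delim(text = toy_text, sep = ";")
+colnames(toy)
+# leaving the default tab delimiter reads each record as a single column
+ncol(read.delim(text = toy_text))
+```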
 
 ## Import .txt files
 
-Lets first import 'serodata1.txt' which uses a tab delminiter and 'serodata2.txt' which uses a semicolon delminiter.
+Lets first import 'serodata1.txt' which uses a tab delimiter and 'serodata2.txt' which uses a semicolon delimiter.
 
 
 ```{r, eval = FALSE}
@@ -106,7 +111,7 @@ df <- read.delim(file = "data/serodata.txt", sep = "\t")
 df <- read.delim(file = "data/serodata.txt", sep = ";")
 ```
 
-The data is now successfully read into your R workspace, **many times actually.** Notice, that each time we imported the data we assigned the data to the `df` object, meaning we replaced it each time we reassinged the `df` object.  
+The dataset is now successfully read into your R workspace, **many times actually.** Notice that each time we imported the data we assigned it to the `df` object, meaning we replaced the previous contents each time we reassigned `df`.  
 
 
 ## What if we have a .xlsx file - what do we do?
@@ -114,7 +119,7 @@ The data is now successfully read into your R workspace, **many times actually.*
 1. Google / Ask ChatGPT
 2. Find and vet function and package you want
 3. Install package
-4. Call package
+4. Attach package
 5. Use function
 
 
@@ -152,19 +157,17 @@ Therefore,
 install.packages("readxl")
 ```
 
-## 4. Call Package
+## 4. Attach Package
 
-Reminder -- Installing and calling packages
-
-To call (i.e., be able to use the package) you can use the following code:
+Reminder - To attach a package (i.e., make its functions available to use) you can use the following code:
 ```{r echo=TRUE, eval=FALSE}
-library(package_name)
+require(package_name)
 ```
 
 Therefore, 
 
 ```{r echo=TRUE, eval=FALSE}
-library(readxl)
+require(readxl)
 ```
 
 ## 5. Use Function
@@ -181,7 +184,7 @@ library(readxl)
 
 ## 5. Use Function
 
-Reminder
+Reminder of function signature
 ```
 read_excel(
   path,
@@ -205,9 +208,7 @@ df <- read_excel(path = "data/serodata.xlsx", sheet = "Data")
 ```
 
 
-## Mini exercise
-
-Lets make some mistakes
+## Lets make some mistakes
 
 1. What if we read in the data without assigning it to an object (i.e., `read_excel(path = "data/serodata.xlsx", sheet = "Data")`)?
 
@@ -216,29 +217,33 @@ Lets make some mistakes
 
 ## Installing and calling packages - Common confusion
 
-You only need to install a package once (unless you update R), but you will need to call or load a package each time you want to use it. 
+</br>
+
+You only need to install a package once (unless you update R or want to update the package), but you will need to attach a package each time you want to use it. 
+
+</br>
 
 The exception to this rule are the "base" set of packages (i.e., **Base R**) that are installed automatically when you install R and that automatically called whenever you open R or RStudio.
 
 
 ## Common Error
 
-Be prepared to see the error 
+Be prepared to see this error
 
 ```{r echo=TRUE, eval=FALSE}
-Error: could not find function "some_function"
+Error: could not find function "some_function_name"
 ```
 
-This usually mean that either 
+This usually means that either 
 
 - you called the function by the wrong name 
 - you have not installed a package that contains the function
-- you have installed a package but you forgot to call it (i.e., `library(package_name)`) -- **most likely**
+- you have installed a package but you forgot to attach it (i.e., `require(package_name)`) -- **most likely**
 
 
 ## Export (write) Data 
 
--   Exporting or 'Writing out' data allows you to save modified files to future use or sharing
+-   Exporting or 'Writing out' data allows you to save modified files for future use or sharing
 -   R can write almost any file format, especially with external, non-Base R, packages
 -   We are going to focus again on writing delimited files
 
@@ -254,6 +259,8 @@ library(printr)
 
 ## Export delimited data
 
+Let's practice exporting the data as three files with three different delimiters (comma, tab, semicolon)
+
 ```{r echo=TRUE, eval=FALSE}
 write.csv(df, file="data/serodata_new.csv", row.names = FALSE) #comma delimited
 write.table(df, file="data/serodata1_new.txt", sep="\t", row.names = FALSE) #tab delimited
@@ -266,7 +273,7 @@ Note, I wrote the data to new file names.  Even though we didn't change the data
 
 There are two file extensions worth discussing.
 
-R has two native data formats—Rdata (sometimes shortened to Rda) and Rds. These formats are used when R objects are saved for later use. Rdata is used to save multiple R objects, while Rds is used to save a single R object. 
+R has two native data formats—'Rdata' (sometimes shortened to 'Rda') and 'Rds'. These formats are used when R objects are saved for later use. 'Rdata' is used to save multiple R objects, while 'Rds' is used to save a single R object. 'Rds' is fast to write/read and is very small.
 
 ## .rds binary file
 
@@ -285,25 +292,26 @@ object1 <- read_rds(file = "filename.rds")
 
 The Base R functions `save()` and `load()` can be used to save and load multiple R objects. 
 
-`save()` writes an external representation of R objects to the specified file, and can by loaded back into the environment using `load()`. A nice feature about using `save` and `load` is that the R object is directly imported into the environment and you don't have to assign it to an object. The files can be saved as `.RData` or `.rda` files.
+`save()` writes an external representation of R objects to the specified file, which can be loaded back into the environment using `load()`. A nice feature of using `save()` and `load()` is that the R objects are imported directly into the environment under their original names, so you don't have to specify the names. The files can be saved as `.RData` or `.Rda` files.
 
+Function signature
 ```
 save(object1, object2, file = "filename.RData")
 load("filename.RData")
 ```
 
-Note, that when you read .RData files you don't need to assign it to an abjecct.  It simply reads in the objects as they were saved.  Therefore, `load("filename.RData")` will read in `object1` and  `object2` directly into the Global Environment.
+Note, that you separate the objects you want to save with commas.
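+
+A small round trip shows this behavior. This is a sketch using made-up objects (`toy1`, `toy2`) and a temporary file from `tempfile()`, so nothing is written into your project folder:
+
+```{r echo=TRUE}
+toy1 <- 1:3
+toy2 <- c("a", "b")
+toy_file <- tempfile(fileext = ".RData")
+save(toy1, toy2, file = toy_file) # objects to save are separated by commas
+rm(toy1, toy2)                    # remove them from the environment
+load(toy_file)                    # both come back under their original names
+toy1
+toy2
+```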
 
 
 
 ## Summary
 
-- Importing or 'Reading in' data is the first step of any real project/analysis
-- The Base R 'util' package we can find a handful of useful functions including  `read.csv()` and `read.delim()` to importing/reading data or `write.csv()` and `write.table()` for exporti/writing data
-- When importing data (exception is object from .RData), you must assign it to an object, otherwise it cannot be called/used
-- Properly read data can be found in the Environment window of RStudio
-- You only need to install a package once (unless you update R), but you will need to call or load a package each time you want to use it. 
-- To complete a tasek you don't know how to do (e.g., reading in an excel data file) use the following steps: 1. Google / Ask ChatGPT, 2. Find and vet function and package you want, 3. Install package, 4. Call package, 5. Use function
+- Importing or 'Reading in' data is the first step of any real project / data analysis
+- The Base R 'utils' package has useful functions including `read.csv()` and `read.delim()` for importing/reading data and `write.csv()` and `write.table()` for exporting/writing data
+- When importing data (exception is object from .RData), you must assign it to an object, otherwise it cannot be used
+- If data are imported correctly, they can be found in the Environment pane of RStudio
+- You only need to install a package once (unless you update R or the package), but you will need to attach a package each time you want to use it. 
+- To complete a task you don't know how to do (e.g., reading in an excel data file) use the following steps: 1. Google / Ask ChatGPT, 2. Find and vet function and package you want, 3. Install package, 4. Attach package, 5. Use function
 
 
 ## Acknowledgements
diff --git a/modules/Module06-DataSubset.qmd b/modules/Module06-DataSubset.qmd
index 614394d..35e7e8a 100644
--- a/modules/Module06-DataSubset.qmd
+++ b/modules/Module06-DataSubset.qmd
@@ -5,6 +5,8 @@ format:
     scrollable: true
     smaller: true
     toc: false
+#execute: 
+#  echo: true
 ---
 
 ## Learning Objectives
@@ -48,11 +50,11 @@ Note, if you have a very large dataset with 15+ variables, `summary()` is not so
 
 ## Description of data
 
-This is data based on a simulated pathogen X IgG antibody serological survey.  The rows represent individuals. Variables include IgG concentrations in IU/mL, age in years, gender, and residence based on slum characterization.  We will use this dataset for lectures throughout the Workshop.
+This is data based on a simulated pathogen X IgG antibody serological survey.  The rows represent individuals. Variables include IgG concentrations in IU/mL, age in years, gender, and residence based on slum characterization.  We will use this dataset for modules throughout the Workshop.
 
 ## View the data as a whole dataframe
 
-The `View()` function, one of the few Base R functions with a capital letter can be used to open a new tab in the Console and view the data as you would in excel.
+The `View()` function, one of the few Base R functions with a capital letter, can be used to open a new tab in RStudio and view the data as you would in Excel.
 
 ```{r echo=TRUE, eval=FALSE}
 View(df)
@@ -64,15 +66,17 @@ knitr::include_graphics("images/ViewTab.png")
 
 ## View the data as a whole dataframe
 
-You can also open a new tab of the data by clicking on the data icon beside the object in the Environment window.
+You can also open a new tab of the data by clicking on the data icon beside the object in the Environment pane
 
 ```{r, out.width = "90%", echo = FALSE}
 knitr::include_graphics("images/View.png")
 ```
 
+You can also hold down `Cmd` or `CTRL` and click on the name of a data frame in your code.
+
 ## Indexing
 
-R contains several constructs which allow access to individual elements or subsets through indexing operations. Indexing can be used both to extract part of an object and to replace parts of an object (or to add parts). There are three basic indexing syntax: `[ ]`, `[[ ]]` and `$`.
+R contains several operators which allow access to individual elements or subsets through indexing. Indexing can be used both to extract part of an object and to replace parts of an object (or to add parts). There are three basic indexing operators: `[`, `[[` and `$`. 
 
 ```{r echo=TRUE, eval=FALSE}
 x[i] #if x is a vector
@@ -84,7 +88,7 @@ x$"a" #if x is a data frame or list
 
 ## Vectors and multi-dimensional objects
 
-To index a vector, `vector[i]` select the ith element. To index a multi-dimensional objects such as a matrix, `matrix[i, j]` selects the element in row i and column j, where as in a three dimensional `array[k, i, i, j]` selects the element in matrix k, row i, and column j. 
+To index a vector, `vector[i]` selects the ith element. To index a multi-dimensional object such as a matrix, `matrix[i, j]` selects the element in row i and column j, whereas for a three-dimensional array, `array[i, j, k]` selects the element in row i and column j of matrix k. 
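+
+A quick way to check how the three indices of an array line up is to build a small one. This is a sketch with a made-up object (`toy_array`); in R the dimensions are in the order rows, columns, matrices:
+
+```{r echo=TRUE}
+toy_array <- array(1:24, dim = c(2, 3, 4)) # 4 matrices, each with 2 rows and 3 columns
+toy_array[ , , 3]   # the whole third matrix
+toy_array[1, 2, 3]  # row 1, column 2 of the third matrix
+```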
 
 Let's practice by first creating the same objects as we did in Module 1.
 ```{r echo=T}
@@ -101,7 +105,7 @@ vector.object1
 matrix.object
 ```
 
-Finally, let's use indexing to pull our elements of the objects.  
+Finally, let's use indexing to pull out elements of the objects.  
 ```{r echo=T}
 vector.object1[2] #pulling the second element
 matrix.object[1,2] #pulling the element in row 1 column 2
@@ -123,18 +127,29 @@ Now we use indexing to pull out the 3rd element in the list.
 list.object[[3]]
 ```
 
-## $ for indexing
+What happens if we use a single square bracket?
+```{r echo=T}
+list.object[3]
+```
+
+The `[[` operator is called the "extract" operator and gives us the element
+from the list. The `[` operator is called the "subset" operator and gives
+us a subset of the list, which is still a list.
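+
+A quick check of the difference, using a made-up list (`toy_list`) rather than the module's `list.object`:
+
+```{r echo=TRUE}
+toy_list <- list(1:3, "a", c(TRUE, FALSE))
+class(toy_list[3])    # "list": still a list, of length 1
+class(toy_list[[3]])  # "logical": the element itself
+```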
 
-`$` allows only a literal character string or a symbol as the index.
+## $ for indexing for data frame
 
-```{r echo=T}
+`$` allows only a literal character string or a symbol as the index.  For a data frame it extracts a variable.
+
+```{r echo=T, eval=FALSE}
 df$IgG_concentration
 ```
 
-Note, if you have spaces in your variable name, you will need to use back ticks `variable name` after the `$`.  This is a good reason to not create variables / column names with spaces.
+Note, if you have spaces in your variable name, you will need to use back ticks \` after the `$`.  This is a good reason to not create variables / column names with spaces.
 
 ## $ for indexing with lists
 
+`$` allows only a literal character string or a symbol as the index.  For a list it extracts a named element.
+
 List elements can be named
 ```{r echo=TRUE}
 list.object.named <- list(
@@ -145,7 +160,7 @@ list.object.named <- list(
 list.object.named
 ```
 
-If list elements are named, than you can reference data from list using `$` or using double square brackets, `[[ ]]`
+If list elements are named, then you can reference data from the list using `$` or using double square brackets, `[[`
 ```{r echo=TRUE}
 list.object.named$uga 
 list.object.named[["uga"]] 
@@ -156,34 +171,38 @@ list.object.named[["uga"]]
 
 As mentioned above, indexing can be used both to extract part of an object and to replace parts of an object (or to add parts).
 
-```{r}
-colnames(df) # just prints
-colnames(df)[1:2] <- c("IgG_concentration_mIU/mL", "age_year") # reassigns
+```{r echo=TRUE}
+colnames(df) 
+colnames(df)[2:3] <- c("IgG_concentration_IU/mL", "age_year") # reassigns
 colnames(df)
-colnames(df)[1:2] <- c("IgG_concentration", "age") #reset
+```
+
+For the sake of the module, I am going to reassign them back to the original variable names
+```{r echo=TRUE}
+colnames(df)[2:3] <- c("IgG_concentration", "age") #reset
 ```
 
 ##  Using indexing to subset by columns
 
-We can also subset a data frames and matrices (2-dimensional objects) using the bracket `[ row , column ]`.  We can subset by columns and pull the `x` column using the index of the column or the column name. 
+We can also subset data frames and matrices (2-dimensional objects) using the bracket `[ row , column ]`.  We can subset by columns and pull the `x` column using the index of the column or the column name. Leaving either row or column dimension blank means to select all of them.
 
-For example, here I am pulling the 3nd column, which has the variable name `age`
-```{r echo=T}
+For example, here I am pulling the 3rd column, which has the variable name `age`, for all of the rows.
+```{r echo=T, eval=FALSE}
 df[ , "age"] #same as df[ , 3]
 ```
-We can select multiple columns using multiple column names:
+We can select multiple columns using multiple column names, again this is selecting these variables for all of the rows.
 ```{r echo=T}
 df[, c("age", "gender")] #same as df[ , c(3,4)]
 ```
 We can remove select columns using indexing as well, OR by simply changing the column to `NULL`
-```{r echo=T}
+```{r echo=T, eval=FALSE}
 df[, -5] #remove column 5, "slum" variable
 ```
 ```{r echo=TRUE, eval=FALSE}
 df$slum <- NULL # this is the same as above
 ```
-We can also grab the `age` column using the `$` operator. 
-```{r echo=T}
+We can also grab the `age` column using the `$` operator, again this is selecting the variable for all of the rows.
+```{r echo=T, eval=FALSE}
 df$age
 ```
 
@@ -211,7 +230,7 @@ operator | operator option |description
 `>`|%g%|greater than
 `>=`|%ge%|greater than or equal to
 `==`||equal to
-`!=`|not equal to
+`!=`||not equal to
 `x&y`||x and y
 `x|y`||x or y
 `%in%`||match
@@ -241,8 +260,10 @@ number.object %in% c(6,7,2)
 ```{r echo=TRUE}
 cn <- colnames(df)
 cn
+cn=="IgG_concentration"
 cn[cn=="IgG_concentration"] <-"IgG_concentration_mIU" #rename cn to "IgG_concentration_mIU" when cn is "IgG_concentration"
 colnames(df) <- cn
+colnames(df)
 ```
 
 Note, I am resetting the column name back to the original name for the sake of the rest of the module.
@@ -258,13 +279,19 @@ In this example, we subset by rows and pull only observations with an age of les
 ```{r echo=T}
 df_lte10 <- df[df$age<=10, ]
 ```
-In this example, we subset by rows and pull only observations with an age of less than or equal to 5 OR greater than 10.
+Let's check that my subset worked using the `summary()` function. 
+```{r echo=T}
+summary(df_lte10$age)
+```
+
+</br>
+
+In the next example, we subset by rows and pull only observations with an age of less than or equal to 5 OR greater than 10.
 ```{r echo=TRUE}
 df_lte5_gt10 <- df[df$age<=5 | df$age>10, ]
 ```
 Lets check that my subsets worked using the `summary()` function. 
 ```{r echo=T}
-summary(df_lte10$age)
 summary(df_lte5_gt10$age)
 ```
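+
+One thing to watch for with logical indexing: a row whose condition evaluates to `NA` is kept as a row of `NA`s rather than dropped. A minimal sketch with a made-up data frame (`toy_na` is not part of the workshop data):
+
+```{r echo=TRUE}
+toy_na <- data.frame(id = 1:3, age = c(3, NA, 12))
+toy_na$age <= 10                  # TRUE NA FALSE
+toy_na_lte10 <- toy_na[toy_na$age <= 10, ]
+nrow(toy_na_lte10)                # 2: the matching row plus an all-NA row
+# wrapping the condition in which() keeps only the rows where it is TRUE
+nrow(toy_na[which(toy_na$age <= 10), ])
+```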
 
@@ -275,24 +302,25 @@ Missing data need to be carefully described and dealt with in data analysis. Und
 
 Types of "missing" values:
 
--   `NA` - general missing data
--   `NaN` - stands for "**N**ot **a** **N**umber", happens when you do
-    0/0.
--   `Inf` and `-Inf` - Infinity, happens when you divide a positive
-    number (or negative number) by 0.
--   blank space - sometimes when data is read it, there is a blank space left
+- `NA` - stands for "**N**ot **A**vailable", general missing data
+- `NaN` - stands for "**N**ot **a** **N**umber", happens when you do 0/0.
+- `Inf` and `-Inf` - Infinity, happens when you divide a positive number (or negative number) by 0.
+- blank space - sometimes when data is read in, there is a blank space left
+- an empty string (e.g., `""`) 
+- `NULL` - undefined value that represents something that does not exist
 
 ## Logical operators to help identify and missing data
 
-operator | operator option |description
+operator |description
------|-----|-----:
+-----|-----:
-`is.na`||is NAN or NA
-`is.nan`||is NAN
-`!is.na`||is not NAN or NA
-`!is.nan`||is not NAN
-`is.infinite`||is infinite
-`any`||are any TRUE
-`which`||which are TRUE
+`is.na`|is NAN or NA
+`is.nan`|is NAN
+`!is.na`|is not NAN or NA
+`!is.nan`|is not NAN
+`is.infinite`|is infinite
+`any`|are any TRUE
+`all`|all are TRUE
+`which`|which are TRUE
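+
+To see how these functions behave on the different kinds of missing values, here is a small sketch with a made-up vector (`toy_values`):
+
+```{r echo=TRUE}
+toy_values <- c(1, NA, NaN, Inf)
+is.na(toy_values)        # TRUE for both NA and NaN
+is.nan(toy_values)       # TRUE only for NaN
+is.infinite(toy_values)  # TRUE only for Inf
+any(is.na(toy_values))   # is at least one value missing?
+all(is.na(toy_values))   # are all values missing?
+which(is.na(toy_values)) # positions of the missing values
+```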
 
 ## More logical operators examples
 
@@ -347,8 +375,8 @@ df_lt5_f <- subset(df, df$age<=5 & gender=="Female", select=c(IgG_concentration,
 `subset()` automatically removes NAs, which is a different behavior from doing logical operations on NAs.
 
 ```{r echo=T}
-summary(df_lte10$age)
-summary(df_lte10_v2$age)
+summary(df_lte10$age) #created with indexing
+summary(df_lte10_v2$age) #created with the subset function
 ```
 
 We can also see this by looking at the number or rows in each dataset.
@@ -362,11 +390,11 @@ nrow(df_lte10_v2)
 
 ## Summary
 
-- `colnames()`, `str()` and `summary()`functions from Base R are great functions to assess the data type and some summary statistics
-- There are three basic indexing syntax: `[ ]`, `[[ ]]` and `$`
+- The `colnames()`, `str()` and `summary()` functions from Base R help assess the data type and some summary statistics
+- There are three basic indexing operators: `[`, `[[` and `$`
 - Indexing can be used to extract part of an object (e.g., subset data) and to replace parts of an object (e.g., rename variables / columns)
 - Logical operators can be evaluated on object(s) in order to return a binary response of TRUE/FALSE, and are useful for decision rules for indexing
-- There are 5 “types” of missing values, the most common being “NA”
+- There are 7 “types” of missing values, the most common being “NA”
 - Logical operators meant to determine missing values are very helpful for data cleaning
 - The Base R `subset()` function is a slightly easier way to select variables and observations.
 
diff --git a/modules/Module07-VarCreationClassesSummaries.qmd b/modules/Module07-VarCreationClassesSummaries.qmd
index 704d75b..734654c 100644
--- a/modules/Module07-VarCreationClassesSummaries.qmd
+++ b/modules/Module07-VarCreationClassesSummaries.qmd
@@ -4,6 +4,7 @@ format:
   revealjs:
     smaller: true
     scrollable: true
+    toc: false
 ---
 
 ## Learning Objectives
@@ -39,6 +40,23 @@ head(df,3)
 
 Note, my use of the underscore in the variable name rather than a space.  This is good coding practice and make calling variables much less prone to error.
 
+## Adding new columns
+
+We can also add a new column using the `transform()` function:
+
+```{r, echo = FALSE}
+library(printr)
+?transform
+```
+
+For example, adding a binary column for seropositivity called `seropos`:
+
+```{r}
+df <- transform(df, seropos = IgG_concentration >= 10)
+head(df)
+```
+
+
 ## Creating conditional variables
 
 One frequently-used tool is creating variables with conditions. A general function for creating new variables based on existing variables is the Base R `ifelse()` function, which "returns a value depending on whether the element of test is `TRUE` or `FALSE`."
@@ -58,16 +76,22 @@ df$age_group <- ifelse(df$age <= 5, "young", "old")
 head(df)
 ```
 
+Let's delve into what is actually happening, with a focus on the NA values in the `age` variable.
+
+```{r echo=TRUE}
+df$age <= 5
+table(df$age, df$age_group, useNA="always", dnn=list("age", ""))
+```
 
 ## Nesting `ifelse` statements example
 
 ```{r echo=TRUE}
 df$age_group <- ifelse(df$age <= 5, "young", 
-                       ifelse(df$age<=10 & df$age>5, "middle", 
-                              ifelse(df$age>10, "old", NA)))
-head(df)
+                       ifelse(df$age<=10 & df$age>5, "middle", "old"))
+table(df$age, df$age_group, useNA="always", dnn=list("age", ""))
 ```
 
+Note, it puts the variable levels in alphabetical order, we will show how to change this later.
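+
+The same nesting can be tried on a short made-up vector (`toy_age`) to see what happens to a missing age. In this sketch the `NA` is carried through, because a test that evaluates to `NA` returns `NA`:
+
+```{r echo=TRUE}
+toy_age <- c(3, 8, 15, NA)
+toy_age_group <- ifelse(toy_age <= 5, "young",
+                        ifelse(toy_age <= 10 & toy_age > 5, "middle", "old"))
+toy_age_group # the last element stays NA rather than becoming "old"
+```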
 
 # Data Classes
 
@@ -89,11 +113,6 @@ class(df$age)
 class(df$gender)
 ```
 
-```{r, echo = FALSE, results = "asis"}
-library(printr)
-?head
-```
-
 
 ## One dimensional data types
 
@@ -118,7 +137,7 @@ Here because integers are in quotations, it is read as a character class by R.
 class(c("1", "4", "7")) 
 ```
 
-Note, this is the first time we have shown you nested functions.  Here, instead of creating a new vector object (e.g., `x <- c("1", "4", "7")`) and then feeding the vector object `x` into the first argument of the `class()` function (e.g., `class(x)`), we combined the two steps and directly fed a vector object into the class function.
+Note, instead of creating a new vector object (e.g., `x <- c("1", "4", "7")`) and then feeding the vector object `x` into the first argument of the `class()` function (e.g., `class(x)`), we combined the two steps and directly fed a vector object into the class function.
 
 ## Numeric Subclasses
 
@@ -140,13 +159,13 @@ typeof(df$age)
 
 ## Logical
 
-Reminder `logical` is a type that only has two possible elements: `TRUE` and `FALSE`. 
+Reminder: `logical` is a type that only has three possible elements: `TRUE`, `FALSE`, and `NA`.
 
 ```{r echo=TRUE}
 class(c(TRUE, FALSE, TRUE, TRUE, FALSE))
 ```
 
-Note that `logical` elements are NOT in quotes. Putting R special classes (e.g., `NA` or `FALSE`) in quotations turns them into character value. 
+Note that when creating a `logical` object, `TRUE` and `FALSE` are NOT in quotes. Putting R special values (e.g., `NA` or `FALSE`) in quotations turns them into character values. 
 
 
 ## Other useful functions for evaluating/setting classes
@@ -154,7 +173,7 @@ Note that `logical` elements are NOT in quotes. Putting R special classes (e.g.,
 There are two useful functions associated with practically all R classes: 
 
 - `is.CLASS_NAME(x)` to **logically check** whether or not `x` is of certain  class.  For example,  `is.integer` or `is.character` or `is.numeric`
-- `as.CLASS_NAME(x)` to **coerce between classes** `x` from current `x` class into a certain class. For example, `as.integer` or `as.character` or `as.numeric`.  This is particularly useful is maybe integer variable was read in as a character variable, or when you need to change a character variable to a factor variable (more on this later).
+- `as.CLASS_NAME(x)` to **coerce between classes**, converting `x` from its current class into another class. For example, `as.integer` or `as.character` or `as.numeric`.  This is particularly useful if, for example, an integer variable was read in as a character variable, or when you need to change a character variable to a factor variable (more on this later).
 
 ## Examples `is.CLASS_NAME(x)`
 
@@ -173,7 +192,7 @@ as.numeric(c("1", "4", "7"))
 as.logical(c("TRUE", "FALSE", "FALSE"))
 ```
 
-In some cases the coercing is not possible; if executed, will return `NA` (an R constant representing "**N**ot **A**vailable" i.e. missing value)
+In some cases the coercion is not possible; if executed, it will return `NA`
 ```{r echo=TRUE}
 as.numeric(c("1", "4", "7a"))
 as.logical(c("TRUE", "FALSE", "UNKNOWN"))
@@ -191,7 +210,9 @@ class(df$age_group_factor)
 levels(df$age_group_factor)
 ```
 
-Note that levels are, by default, set to **alphanumerical** order! And, the first is always the "reference" group. However, we often prefer a different reference group.
+Note 1, that levels are, by default, set to **alphanumerical** order! And, the first is always the "reference" group. However, we often prefer a different reference group.
+
+Note 2, we can also make ordered factors using `factor(..., ordered=TRUE)`, but we won't talk more about that.
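+
+A minimal sketch of the default ordering and two ways to change the reference group, using a made-up character vector (`toy_group`):
+
+```{r echo=TRUE}
+toy_group <- c("young", "old", "middle", "young")
+levels(factor(toy_group))  # alphabetical, so "middle" comes first
+# set the order yourself when creating the factor; the first level is the reference
+levels(factor(toy_group, levels = c("young", "middle", "old")))
+# or move one level to the front of an existing factor
+levels(relevel(factor(toy_group), ref = "young"))
+```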
 
 ## Reference Groups 
 
@@ -204,15 +225,20 @@ By default `middle` is the reference group therefore we will only generate beta
 ## Changing factor reference 
 
 Changing the reference group of a factor variable.
+
 - If the object is already a factor then use `relevel()` function and the `ref` argument to specify the reference.
 - If the object is a character then use `factor()` function and `levels` argument to specify the order of the values, the first being the reference.
 
 
+Let's look at the `relevel()` help file
 ```{r, echo = FALSE, results = "asis"}
 library(printr)
 ?relevel
 ```
 
+</br>
+
+Let's look at the `factor()` help file
 ```{r, echo = FALSE, results = "asis"}
 library(printr)
 ?factor
@@ -253,15 +279,15 @@ Matrices, like data frames are also composed of rows and columns. Matrices, unli
 You can also create a matrix from scratch using `matrix()` Use `?matrix` to see the arguments.  
 
 ```{r echo=TRUE}
-matrix(1:6, ncol = 2) 
-matrix(1:6, ncol=2, byrow=TRUE) 
+matrix(data=1:6, ncol = 2) 
+matrix(data=1:6, ncol=2, byrow=TRUE) 
 ```
 
-Notice, the first matrix filled in numbers 1-6 by columns first and then rows because default `byrow` argument is FALSE. In the second matrix, we changed the argument `byrow` to `TRUE`, and now numbers 1-6 are filled by rows first and then columns.
+Note, the first matrix filled in numbers 1-6 by columns first and then rows because default `byrow` argument is FALSE. In the second matrix, we changed the argument `byrow` to `TRUE`, and now numbers 1-6 are filled by rows first and then columns.
 
 ## Data frame 
 
-You can transform an existing matrix into data frames and tibble using `as.data.frame()`.  
+You can transform an existing matrix into a data frame using `as.data.frame()`  
 
 ```{r echo=TRUE}
 as.data.frame(matrix(1:6, ncol = 2) ) 
@@ -271,16 +297,17 @@ as.data.frame(matrix(1:6, ncol = 2) )
 ## Numeric variable data summary
 
 Data summarization on numeric vectors/variables:
--		`mean()`: takes the mean of x
--		`sd()`: takes the standard deviation of x
--		`median()`: takes the median of x
--		`quantile()`: displays sample quantiles of x. Default is min, IQR, max
--		`range()`: displays the range. Same as `c(min(), max())`
--		`sum()`: sum of x
--		`max()`: maximum value in x
--		`min()`: minimum value in x
-
-Note, **all have the ** `na.rm =` **argument for missing data**
+
+-	`mean()`: takes the mean of x
+-	`sd()`: takes the standard deviation of x
+-	`median()`: takes the median of x
+-	`quantile()`: displays sample quantiles of x. Default is min, 25th percentile, median, 75th percentile, max
+-	`range()`: displays the range. Same as `c(min(), max())`
+-	`sum()`: sum of x
+-	`max()`: maximum value in x
+-	`min()`: minimum value in x
+
+Note, **all have the** `na.rm` **argument for missing data**
 
 ```{r, echo = FALSE, results = "asis"}
 library(printr)
diff --git a/modules/Module08-DataMergeReshape.qmd b/modules/Module08-DataMergeReshape.qmd
index e1254f1..352593a 100644
--- a/modules/Module08-DataMergeReshape.qmd
+++ b/modules/Module08-DataMergeReshape.qmd
@@ -4,6 +4,7 @@ format:
   revealjs:
     scrollable: true
     smaller: true
+    toc: false
 ---
 
 ## Learning Objectives
@@ -26,7 +27,7 @@ Pay close attention to the number of rows in your data set before and after a jo
 
 ## one-to-one merge
 
--   This means that each row of data represents a unique unit of analysis that exists in another dataset
+-   This means that each row of data represents a unique unit of analysis that exists in another dataset (e.g., id variable)
 -   Will likely have variables that don’t exist in the current dataset (that’s why you are trying to merge it in)
 -   The merging variable (e.g., id) each represented a single time
 -   You should try to structure your data so that a 1:1 merge or 1:m merge is possible so that fewer things can go wrong.
@@ -88,7 +89,7 @@ str(df_all_wide)
 
 ## Merge the new data with the original data
 
-The second option is to add a time variable to the two data sets and then merge by `observation_id`, `time` as well as `IgG_concentration` and `age` variables. Note, I need to read in the data again b/c I removed the `IgG_concentration` and `age` variables.
+The second option is to add a time variable to the two data sets and then merge by `observation_id`, `time`, `age`, and `IgG_concentration`. Note, I need to read in the data again because I removed the `IgG_concentration` and `age` variables.
 
 ```{r echo=TRUE}
 df <- read.csv("data/serodata.csv")
@@ -96,8 +97,10 @@ df_new <- read.csv("data/serodata_new.csv")
 ```
 
 ```{r echo=TRUE}
-df$time <- 1
+df$time <- 1 #you can put in one number and it will repeat it
 df_new$time <- 2
+head(df)
+head(df_new)
 ```
 
 Now, lets merge. Note, "By default the data frames are merged on the columns with names they both have" therefore if I don't specify the by argument it will merge on all matching variables.
@@ -105,6 +108,7 @@ Now, lets merge. Note, "By default the data frames are merged on the columns wit
 df_all_long <- merge(df, df_new, all.x=T, all.y=T) 
 str(df_all_long)
 ```
+Note, there are 1287 rows, which is the sum of the number of rows of `df` (651 rows) and `df_new` (636 rows)
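+
+The same row-count logic can be checked on a tiny example. This is a sketch with made-up data frames (`toy_t1`, `toy_t2`); because `time` differs between them, no rows match and every row from both is kept:
+
+```{r echo=TRUE}
+toy_t1 <- data.frame(id = 1:3, value = c(10, 20, 30), time = 1)
+toy_t2 <- data.frame(id = 2:4, value = c(21, 31, 41), time = 2)
+# no `by` argument, so the merge uses all shared columns: id, value, time
+toy_long <- merge(toy_t1, toy_t2, all.x = TRUE, all.y = TRUE)
+nrow(toy_long) # 3 + 3 = 6
+```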
 
 
 ## What is wide/long data?
@@ -112,15 +116,15 @@ str(df_all_long)
 Above, we actually created a wide and long version of the data.
 
 Wide: has many columns
-    -   Multiple columns per individual, values spread across multiple columns 
-    -   Easier for humans to read
-Long: column names become data
-    -   Multiple rows per observation, a single column contains the values
-    -   Easier for R to make plots & do analysis
-
-```{r, fig.alt="Wide versus long data rearanges the position of column names and row content.", out.width = "60%", echo = FALSE, fig.align='center'}
-knitr::include_graphics("images/pivot.png")
-```
+
+- multiple columns per individual, values spread across multiple columns 
+- easier for humans to read
+    
+Long: has many rows
+
+- column names become data
+- multiple rows per observation, a single column contains the values
+- easier for R to make plots & do analysis
 
 ## `reshape()` function 
 
@@ -142,6 +146,11 @@ xxzane - help
 xxzane - help
 
 
+## Let's get real
+
+Use the `pivot_wider()` and `pivot_longer()` functions from the tidyr package!
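+
+A minimal sketch of both functions, assuming the tidyr package is installed. The data frame and column names here (`toy_wide`, `IgG_time1`, `IgG_time2`) are made up for illustration and are not the workshop data:
+
+```{r echo=TRUE, eval=FALSE}
+library(tidyr)
+toy_wide <- data.frame(id = 1:2, IgG_time1 = c(0.3, 5.1), IgG_time2 = c(0.4, 6.0))
+# wide to long: the two column names become values of a new `time` column
+toy_long <- pivot_longer(toy_wide, cols = c(IgG_time1, IgG_time2),
+                         names_to = "time", values_to = "IgG_concentration")
+# long back to wide: values of `time` become column names again
+toy_back <- pivot_wider(toy_long, names_from = time, values_from = IgG_concentration)
+```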
+
+
 
 ## Summary
 
diff --git a/modules/Module09-DataAnalysis.qmd b/modules/Module09-DataAnalysis.qmd
index 616b464..e6c094f 100644
--- a/modules/Module09-DataAnalysis.qmd
+++ b/modules/Module09-DataAnalysis.qmd
@@ -4,15 +4,16 @@ format:
   revealjs:
     scrollable: true
     smaller: true
+    toc: false
 ---
 
 ## Learning Objectives
 
 After module 9, you should be able to...
 
--		Descriptively assess association between two variables
--		Compute basic statistics 
--		Fit a generalized linear model
+-	Descriptively assess association between two variables
+-	Compute basic statistics 
+-	Fit a generalized linear model
 
 ## Import data for this module
 
@@ -28,15 +29,13 @@ head(x=df, n=3)
 Create `age_group` three level factor variable
 ```{r echo=TRUE}
 df$age_group <- ifelse(df$age <= 5, "young", 
-                       ifelse(df$age<=10 & df$age>5, "middle", 
-                              ifelse(df$age>10, "old", NA)))
+                       ifelse(df$age<=10 & df$age>5, "middle", "old"))
 df$age_group <- factor(df$age_group, levels=c("young", "middle", "old"))
 ```
 
 Create `seropos` binary variable representing seropositivity if antibody concentrations are >10 IU/mL.
 ```{r echo=TRUE}
-df$seropos <- ifelse(df$IgG_concentration<10, 0, 
-										ifelse(df$IgG_concentration>=10, 1, NA))
+df$seropos <- ifelse(df$IgG_concentration<10, 0, 1)
 ```
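+
+A quick check of the simplified `ifelse()` call, using a made-up vector (`toy_igg` is hypothetical). A value of exactly 10 is coded as seropositive, and a missing concentration stays `NA` because the comparison `NA < 10` is itself `NA`.
+
+```{r echo=TRUE}
+toy_igg <- c(0.3, 10, 25, NA) # made-up concentrations, one missing
+toy_seropos <- ifelse(toy_igg < 10, 0, 1)
+toy_seropos
+```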
 
 
@@ -44,17 +43,40 @@ df$seropos <- ifelse(df$IgG_concentration<10, 0,
 
 We use `table()` prior to look at one variable, now we can generate frequency tables for 2 plus variables.  To get cell percentages, the `prop.table()` is useful.  
 
+```{r echo=TRUE, eval=FALSE}
+?prop.table
+```
+
+```{r, echo = FALSE, results = "asis"}
+library(printr)
+?prop.table
+```
+
+## 2 variable contingency tables
+
+Let's practice
 ```{r echo=TRUE}
 freq <- table(df$age_group, df$seropos)
-prop <- prop.table(freq)
 freq
-prop
 ```
 
+Now, let's move to percentages
+```{r echo=TRUE}
+prop.cell.percentages <- prop.table(freq)
+prop.cell.percentages
+prop.column.percentages <- prop.table(freq, margin=2)
+prop.column.percentages
+```
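+
+To see what the `margin` argument changes, here is a small sketch on a made-up table (`toy` is hypothetical, not the serodata). With no `margin`, all cells together sum to 1; with `margin=1` each row sums to 1; with `margin=2` each column sums to 1.
+
+```{r echo=TRUE}
+# made-up 2x2 table: rows are groups, columns are a 0/1 outcome
+toy <- table(group = c("a", "a", "b", "b", "b"), outcome = c(0, 1, 1, 1, 0))
+sum(prop.table(toy))                 # cell proportions: whole table sums to 1
+rowSums(prop.table(toy, margin = 1)) # row proportions: each row sums to 1
+colSums(prop.table(toy, margin = 2)) # column proportions: each column sums to 1
+```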
+
+
 ## Chi-Square test
 
 The `chisq.test()` function test of independence of factor variables from `stats` package.
 
+```{r echo=TRUE, eval=FALSE}
+?chisq.test
+```
+
 ```{r, echo = FALSE, results = "asis"}
 library(printr)
 ?chisq.test
@@ -67,7 +89,7 @@ library(printr)
 chisq.test(freq)
 ```
 
-We reject the null hypothesis that the proportion of seropositive individuals who are young (<5yo) is the same for individuals who are middle (5-10yo) or old (>10yo).
+We reject the null hypothesis that the proportion of seropositive individuals is the same in the young, middle, and old age groups.
 
 
 ## Correlation
@@ -83,17 +105,31 @@ cor(df$age, df$IgG_concentration, method="pearson", use = "complete.obs") #IF ha
 
 Small positive correlation between IgG concentration and age.
 
+## Correlation confidence interval
+
+The function `cor.test()` also gives you the confidence interval of the correlation statistic. Note, it uses complete observations by default. 
+
+```{r echo=TRUE}
+cor.test(df$age, df$IgG_concentration, method="pearson")
+```
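+
+The result of `cor.test()` is a list, so the estimate and its confidence interval can be pulled out by name. A minimal sketch with made-up vectors (`x` and `y` are hypothetical); the pair containing the `NA` is dropped before the test is run.
+
+```{r echo=TRUE}
+# made-up vectors with one missing value
+x <- c(1, 2, 3, 4, 5, 6, NA)
+y <- c(2.1, 1.9, 3.5, 3.9, 5.2, 5.8, 7.0)
+res <- cor.test(x, y, method = "pearson")
+res$estimate # the correlation coefficient
+res$conf.int # 95% confidence interval by default
+```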
+
+
 ## T-test
 
 The commonly used are:
 
 -   **one-sample t-test** -- used to test mean of a variable in one group (to the null hypothesis mean)
--   **two-sample t-test** -- used to test difference in means of a variable between two groups (null hypothesis - the group means are the *same*); if "two groups" are data of the *same* individuals collected at 2 time points, we say it is two-sample paired t-test
+-   **two-sample t-test** -- used to test difference in means of a variable between two groups (null hypothesis - the group means are the *same*)
 
 ## T-test
 
 We can use the `t.test()` function from the `stats` package.
 
+
+```{r echo=TRUE, eval=FALSE}
+?t.test
+```
+
 ```{r, echo = FALSE, results = "asis"}
 library(printr)
 ?t.test
@@ -125,6 +161,10 @@ The mean IgG concenration of young and old is 45.05 and 129.35 IU/mL, respective
 To fit regression models in R, we use the function `glm()` (Generalized Linear Model).
 
 
+```{r echo=TRUE, eval=FALSE}
+?glm
+```
+
 ```{r, echo = FALSE, results = "asis"}
 library(printr)
 ?glm
@@ -134,9 +174,9 @@ library(printr)
 
 We tend to focus on three arguments:
 
--   `formula` -- model formula written using names of columns in our data
--   `data` -- our data frame
--		`family` -- error distribution and link function
+- `formula` -- model formula written using names of columns in our data
+- `data` -- our data frame
+- `family` -- error distribution and link function
 
 ```{r echo=TRUE}
 fit1 <- glm(IgG_concentration~age+gender+slum, data=df, family=gaussian())
@@ -170,9 +210,8 @@ summary(fit2)
 
 ## Summary
 
--		Use `cor()` to calculate correlation between two numeric vectors.
--   `corrplot()` and `ggpairs()` is nice for a quick visualization of correlations
--   `t.test()` or `t_test()` tests the mean compared to null or difference in means between two groups
+-	Use `cor()` or `cor.test()` to calculate the correlation between two numeric vectors.
+- `t.test()` tests a mean against a null value, or the difference in means between two groups
 -		... xxamy more
 
 ## Acknowledgements
diff --git a/modules/Module10-DataVisualization.qmd b/modules/Module10-DataVisualization.qmd
index aaf61c4..a30c192 100644
--- a/modules/Module10-DataVisualization.qmd
+++ b/modules/Module10-DataVisualization.qmd
@@ -4,6 +4,7 @@ format:
   revealjs:
     scrollable: true
     smaller: true
+    toc: false
 ---
 
 ## Learning Objectives
@@ -26,15 +27,13 @@ head(x=df, n=3)
 Create `age_group` three level factor variable
 ```{r echo=TRUE}
 df$age_group <- ifelse(df$age <= 5, "young", 
-                       ifelse(df$age<=10 & df$age>5, "middle", 
-                              ifelse(df$age>10, "old", NA)))
+                       ifelse(df$age<=10 & df$age>5, "middle", "old")) 
 df$age_group <- factor(df$age_group, levels=c("young", "middle", "old"))
 ```
 
-Create `seropos` binary variable representing seropositivity if antibody concentrations are >10 mIUmL.
+Create `seropos` binary variable representing seropositivity if antibody concentrations are >=10 IU/mL.
 ```{r echo=TRUE}
-df$seropos <- ifelse(df$IgG_concentration<10, 0, 
-										ifelse(df$IgG_concentration>=10, 1, NA))
+df$seropos <- ifelse(df$IgG_concentration<10, 0, 1)
 ```
 
 ## Base R data visualizattion functions
@@ -42,12 +41,12 @@ df$seropos <- ifelse(df$IgG_concentration<10, 0,
 The Base R 'graphics' package has a ton of graphics options. 
 
 ```{r echo=TRUE, eval=FALSE}
-library(help = "graphics")
+help(package = "graphics")
 ```
 
 ```{r echo=FALSE, eval=TRUE}
 library(printr)
-library(help = "graphics")
+help(package = "graphics")
 ```
 
 
@@ -64,15 +63,19 @@ To make a plot you often need to specify the following features:
 
 The parameter section fixes the settings for all your plots, basically the plot options. Adding attributes via `par()` before you call the plot creates ‘global’ settings for your plot.
 
-In the example below, we have set two commonly used optional attributes in the global plot settings. 
--		The `mfrow` specifies that we have one row and two columns of plots — that is, two plots side by side. 
--		The `mar` attribute is a vector of our margin widths, with the first value indicating the margin below the plot (5), the second indicating the margin to the left of the plot (5), the third, the top of the plot(4), and the fourth to the left (1).
+In the example below, we have set two commonly used optional attributes in the global plot settings.
+
+-	The `mfrow` specifies that we have one row and two columns of plots — that is, two plots side by side. 
+-	The `mar` attribute is a vector of our margin widths, with the first value indicating the margin below the plot (5), the second the margin to the left of the plot (5), the third the margin above the plot (4), and the fourth the margin to the right (1).
 
 ```
 par(mfrow = c(1,2), mar = c(5,5,4,1))
 ```
 
-```{r, out.width = "70%", echo = FALSE}
+
+## 1. Parameters
+
+```{r, out.width = "100%", echo = FALSE}
 knitr::include_graphics("images/par.png")
 ```
 
@@ -92,7 +95,7 @@ library(printr)
 
 ## Common parameter options
 
-Six useful parameter arguments help improve the readability of the plot:
+Eight useful parameter arguments help improve the readability of the plot:
 
 - `xlab`: specifies the x-axis label of the plot
 - `ylab`: specifies the y-axis label
@@ -137,7 +140,7 @@ library(printr)
 
 ## `histogram()` example
 
-Reminder
+Reminder of the function signature:
 ```
 hist(x, breaks = "Sturges",
      freq = NULL, probability = !freq,
@@ -186,7 +189,7 @@ plot(
 	type="p", 
 	main="Age by IgG Concentrations", 
 	xlab="Age (years)", 
-	ylab="IgG Concentration (mIU/mL)", 
+	ylab="IgG Concentration (IU/mL)", 
 	pch=16, 
 	cex=0.9,
 	col="lightblue")
@@ -208,7 +211,7 @@ library(printr)
 
 ## `boxplot()` example
 
-Reminder
+Reminder of the function signature:
 ```
 boxplot(formula, data = NULL, ..., subset, na.action = NULL,
         xlab = mklab(y_var = horizontal),
@@ -239,7 +242,7 @@ boxplot(
 
 ```{r, echo = FALSE, results = "asis"}
 library(printr)
-?boxplot
+?barplot
 ```
 
 
@@ -247,7 +250,7 @@ library(printr)
 
 The function takes the a lot of arguments to control the way the way our data is plotted. 
 
-Reminder
+Reminder of the function signature:
 ```
 barplot(height, width = 1, space = NULL,
         names.arg = NULL, legend.text = NULL, beside = FALSE,
@@ -264,8 +267,8 @@ barplot(height, width = 1, space = NULL,
 ```{r echo=TRUE}
 freq <- table(df$seropos, df$age_group)
 barplot(freq)
-prop <- prop.table(freq)
-barplot(prop)
+prop.cell.percentages <- prop.table(freq)
+barplot(prop.cell.percentages)
 ```
 
 ## 3. Legend!
@@ -285,7 +288,7 @@ library(printr)
 
 ## Add legend to the plot
 
-Reminder
+Reminder of the function signature:
 ```
 legend(x, y = NULL, legend, fill = NULL, col = par("col"),
        border = "black", lty, lwd, pch,
@@ -301,26 +304,52 @@ legend(x, y = NULL, legend, fill = NULL, col = par("col"),
        seg.len = 2)
 ```
 
-Let's practice (xxzane - can we make the plot bigger?)
-```{r echo=TRUE}
-barplot(prop, col=c("darkblue","red"), ylim=c(0,0.7), main="Seropositivity by Age Group")
-legend(x=2.5, y=0.7,
+Let's practice
+```{r echo=TRUE, eval=FALSE}
+barplot(prop.cell.percentages, col=c("darkblue","red"), ylim=c(0,0.5), main="Seropositivity by Age Group")
+legend(x=2.5, y=0.5,
 			 fill=c("darkblue","red"), 
 			 legend = c("seronegative", "seropositive"))
 
 ```
 
+
+## Add legend to the plot
+
+```{r echo=FALSE}
+barplot(prop.cell.percentages, col=c("darkblue","red"), ylim=c(0,0.5), main="Seropositivity by Age Group")
+legend(x=2.5, y=0.5,
+			 fill=c("darkblue","red"), 
+			 legend = c("seronegative", "seropositive"))
+
+```
+
+
 ## `barplot()` example
 
 Getting closer, but what I really want is column proportions (i.e., the proportions should sum to one for each age group). Also, the age groups need more meaningful names.
 
-```{r echo=TRUE}
+```{r echo=TRUE, eval=FALSE}
+freq <- table(df$seropos, df$age_group)
+prop.column.percentages <- prop.table(freq, margin=2)
+colnames(prop.column.percentages) <- c("1-5 yo", "6-10 yo", "11-15 yo")
+
+barplot(prop.column.percentages, col=c("darkblue","red"), ylim=c(0,1.35), main="Seropositivity by Age Group")
+axis(2, at = c(0.2, 0.4, 0.6, 0.8,1))
+legend(x=2.8, y=1.35,
+			 fill=c("darkblue","red"), 
+			 legend = c("seronegative", "seropositive"))
+
+```
+
+## `barplot()` example
+
+```{r echo=FALSE, eval=TRUE}
 freq <- table(df$seropos, df$age_group)
-tot.per.age.group <- colSums(freq)
-age.seropos.matrix <- t(t(freq)/tot.per.age.group)
-colnames(age.seropos.matrix) <- c("1-5 yo", "6-10 yo", "11-15 yo")
+prop.column.percentages <- prop.table(freq, margin=2)
+colnames(prop.column.percentages) <- c("1-5 yo", "6-10 yo", "11-15 yo")
 
-barplot(age.seropos.matrix, col=c("darkblue","red"), ylim=c(0,1.35), main="Seropositivity by Age Group")
+barplot(prop.column.percentages, col=c("darkblue","red"), ylim=c(0,1.35), main="Seropositivity by Age Group")
 axis(2, at = c(0.2, 0.4, 0.6, 0.8,1))
 legend(x=2.8, y=1.35,
 			 fill=c("darkblue","red"), 
@@ -328,28 +357,46 @@ legend(x=2.8, y=1.35,
 
 ```
 
+
+
 ## `barplot()` example
 
 Now, let look at seropositivity by two individual level characteristics in the same plot. 
 
 ```{r echo=FALSE}
-freq <- table(df$seropos, df$slum)
-tot.per.slum.cat <- colSums(freq)
-slum.seropos.matrix <- t(t(freq)/tot.per.slum.cat)
+freq2 <- table(df$seropos, df$slum)
+prop.column.percentages2 <- prop.table(freq2, margin=2)
 ```
 
-```{r echo=TRUE}
+```{r echo=TRUE, eval=FALSE}
 par(mfrow = c(1,2))
-barplot(age.seropos.matrix, col=c("darkblue","red"), ylim=c(0,1.35), main="Seropositivity by Age Group")
+barplot(prop.column.percentages, col=c("darkblue","red"), ylim=c(0,1.35), main="Seropositivity by Age Group")
 axis(2, at = c(0.2, 0.4, 0.6, 0.8,1))
-legend(x=1, y=1.35, fill=c("darkblue","red"), legend = c("seronegative", "seropositive"))
+legend("topright",
+			 fill=c("darkblue","red"), 
+			 legend = c("seronegative", "seropositive"))
 
-barplot(slum.seropos.matrix, col=c("darkblue","red"), ylim=c(0,1.35), main="Seropositivity by Residence")
+barplot(prop.column.percentages2, col=c("darkblue","red"), ylim=c(0,1.35), main="Seropositivity by Residence")
 axis(2, at = c(0.2, 0.4, 0.6, 0.8,1))
-legend(x=1, y=1.35, fill=c("darkblue","red"),  legend = c("seronegative", "seropositive"))
+legend("topright", fill=c("darkblue","red"),  legend = c("seronegative", "seropositive"))
 ```
 
 
+## `barplot()` example
+
+```{r echo=FALSE, eval=TRUE}
+par(mfrow = c(1,2))
+barplot(prop.column.percentages, col=c("darkblue","red"), ylim=c(0,1.35), main="Seropositivity by Age Group")
+axis(2, at = c(0.2, 0.4, 0.6, 0.8,1))
+legend("topright",
+			 fill=c("darkblue","red"), 
+			 legend = c("seronegative", "seropositive"))
+
+barplot(prop.column.percentages2, col=c("darkblue","red"), ylim=c(0,1.35), main="Seropositivity by Residence")
+axis(2, at = c(0.2, 0.4, 0.6, 0.8,1))
+legend("topright", fill=c("darkblue","red"),  legend = c("seronegative", "seropositive"))
+```
+
 
 ## Summary
 
diff --git a/modules/ModuleXX-Iteration.qmd b/modules/ModuleXX-Iteration.qmd
index 88c9d37..e748165 100644
--- a/modules/ModuleXX-Iteration.qmd
+++ b/modules/ModuleXX-Iteration.qmd
@@ -1,6 +1,8 @@
 ---
 title: "Iteration in R"
-format: revealjs
+format:
+  revealjs:
+    toc: false
 ---
 
 ```{r}
@@ -14,7 +16,6 @@ library(printr)
 ## Learning goals
 
 1. Replace repetitive code with a `for` loop
-1. Compare and contrast `for` loops and `*apply()` functions
 1. Use vectorization to replace unnecessary loops
 
 ## What is iteration?