Updates to R reference sheet #66

Open. Wants to merge 1 commit into base: master
39 changes: 4 additions & 35 deletions REFERENCE_R.md
@@ -84,29 +84,23 @@ Database functions:
| `desc(column)` | Sort a column (or numeric vector) in descending order |
| `everything()` | Select all variables (used in other functions) |
| `filter(tbl, condition)` | Keep rows that match a condition |
| `fct_reorder(factor_column, ordering_column, .desc = FALSE)` | Reorder a column by sorting according to another column |
| `group_by(tbl, columns_as_arguments)` | Group a tibble by the list of columns provided |
| `map(tbl, function)` | Apply the given function to each column, creating a list |
| `map_chr(tbl, function)` | Apply the given function to each column, creating a character vector |
| `map_df(tbl, function)` | Apply the given function to each column, creating a data frame |
| `mutate(tbl, column_name = ...)` | Create or modify a column in a tibble |
| `pivot_longer(tbl, column_range, names_to = ..., values_to = ...)` | Move values from column names to cells |
| `pivot_wider(tbl, names_from = ..., values_from = ...)` | Move variables from cells to column names |
| `pull(tbl, variable)` | Extract a single variable from a tibble |
| `rowwise(tbl)` | Organize a tibble row-by-row for other functions |
| `select(tbl, columns_as_arguments)` | Keep the given columns |
| `semi_join(tbl, joining_tbl)` | Keep rows that have matching values in joining_tbl |
| `separate(tbl, column, into, sep)` | Split values in a column into new columns based on a separator |
| `summarize(tbl, summaries_as_arguments)` | Compute summary statistics on columns |
| `ungroup(tbl)` | Undo the effect of group_by() |
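
As a rough illustration of how several of the functions in this table chain together, here is a minimal sketch. The `scores` tibble and its column names are hypothetical and are not part of the reference sheet; `arrange()` is assumed to be available from `dplyr`.

```r
library(dplyr)
library(tibble)

# Hypothetical toy data, invented for this example only
scores <- tibble(
  student = c("a", "b", "c", "d"),
  section = c("x", "x", "y", "y"),
  score   = c(80, 90, 70, 60)
)

section_means <- scores |>
  filter(score >= 65) |>                 # keep rows that match a condition
  group_by(section) |>                   # group by one column
  summarize(mean_score = mean(score)) |> # one summary row per group
  ungroup() |>                           # drop any remaining grouping
  arrange(desc(mean_score))              # sort in descending order

# Extract a single column as a plain vector
top_sections <- section_means |> pull(section)
```

With a single grouping column, `summarize()` already returns an ungrouped tibble, so the `ungroup()` call here is only a defensive habit.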

Functions used to convert one type to another:

| Function | Description |
| ------------------------------- | --------------------------------------------------- |
| `as_datetime(formatted_string)` | Convert a string to a date-time object |
| `as_factor(column)` | Convert a column to a factor / categorical variable |
| `as_tibble(object)` | Convert an object to a tibble |
| Function | Description |
| ------------------- | --------------------------------------------------- |
| `as_factor(column)` | Convert a column to a factor / categorical variable |
| `as_tibble(object)` | Convert an object to a tibble |
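
A short sketch of the two conversion functions that remain in this table, assuming `as_tibble()` comes from `tibble` and `as_factor()` from `forcats`. The `raw` data frame is a made-up example.

```r
library(tibble)
library(forcats)

# Hypothetical base R data frame, invented for this example only
raw <- data.frame(
  id    = 1:3,
  grade = c("low", "high", "low"),
  stringsAsFactors = FALSE
)

# Convert the data frame to a tibble
tbl <- as_tibble(raw)

# Convert a character column to a factor; for character input,
# as_factor() orders levels by first appearance rather than alphabetically
grade_factor <- as_factor(tbl$grade)
```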

Slicing functions:

@@ -119,15 +113,6 @@ Slicing functions:
| `unique(tbl)` | Delete duplicate rows |
| `tail(tbl)` | Get the last 6 rows of a tibble |

Functions used to manipulate strings:

| Function | Description |
| ----------------------------------------------- | ------------------------------------------------------ |
| `str_extract(string, pattern)` | Extract the first substring matching the given pattern |
| `str_replace_all(string, pattern, replacement)` | Replace all substrings matching the given pattern |
| `tolower(string)` | Convert a string to all-lowercase |
| `toupper(string)` | Convert a string to all-uppercase |
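
The string helpers in this table could be combined roughly as follows. This is only a sketch: the `labels` vector is hypothetical, and `str_extract()` / `str_replace_all()` are assumed to come from `stringr`.

```r
library(stringr)

# Hypothetical labels, invented for this example only
labels <- c("Week 7: classification1", "Week 8: classification2")

# First run of digits in each string
week_numbers <- str_extract(labels, "[0-9]+")

# Lowercase, then replace every run of non-alphanumeric characters with "_"
cleaned <- str_replace_all(tolower(labels), "[^a-z0-9]+", "_")

# Uppercase version of the original labels
shouted <- toupper(labels)
```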

## Visualization

A typical `ggplot2` syntax for creating a new plot looks something like this:
@@ -142,22 +127,11 @@ my_data |> ggplot(aes(x = column1, y = column2)) +
| Function | Description |
| ------------------------------- | ------------------------------------------------------------------------------------ |
| `aes(x, y, ...)` | Specifies how variables in the data are mapped to properties of the plot |
| `element_text(size, colour)` | Used with `theme` system to control text size, colour, etc. |
| `facet_grid(rows, cols)` | Creates matrix panels with plots based on specified rows or cols variable |
| `facet_wrap(facets)` | Creates a ribbon of panels wrapped in 2d using specified facets |
| `ggmap(map)` | Used to display visual maps from Google Maps or Stamen Maps |
| `ggpairs(tbl)` | Plots each variable against all the other variables in a scatterplot matrix |
| `ggplot(tbl, mapping)` | Initialize a `ggplot` object, specifying the data and aesthetic mapping for the plot |
| `ggsave(filename, plot)` | Saves specified plot with given filename to device |
| `ggtitle(title)` | Adds specified title to the plot |
| `labs(x, y, fill, colour, ...)` | Modifies labels on the plot, specifying what the new labels should be |
| `scale_color_manual(values)` | Manually change the colour for plots by specifying the values |
| `scale_fill_brewer(palette)` | Changes the fill colour palette to the specified palette |
| `scale_fill_distiller(palette)` | Changes the fill colour palette for continuous scales |
| `scale_x_continuous(limits)` | Customize x-axis scales for continuous x variables with specified options |
| `scale_x_date(limits, breaks)` | Customize the x-axis scales for date or time variables in a plot |
| `scale_y_continuous(limits)` | Customize y-axis scales for continuous y variables with specified options |
| `theme(text)` | Used to modify the non-data components of the plot with specified options |
| `xlab(label)` | Modifies the x-axis label to the specified label |
| `xlim(lo, hi)` | Displays only the specified range on the x-axis of the plot |
| `ylab(label)` | Modifies the y-axis label to the specified label |
@@ -168,15 +142,11 @@ Commonly used geometric objects are listed below.

| Function | Description |
| -------------------------------- | ----------------------------------------------------------------------------- |
| `geom_abline(slope, intercept)` | Adds a diagonal line to the plot with specified intercept and slope |
| `geom_bar(stat)` | Used to create bar graphs with specified `stat` (often "identity" or "count") |
| `geom_density()` | Used to create a smoothed line version of a histogram |
| `geom_freqpoly()` | Used to create a lined (not smooth) version of a histogram |
| `geom_histogram(bins, binwidth)` | Creates histogram plots with a specified number of bins or bin width |
| `geom_line()` | Adds lines to connect data points in the order of the x-axis |
| `geom_point()` | Used to create scatterplots |
| `geom_segment(x, y, xend, yend)` | Draws a straight line on plot connecting (x, y) to (xend, yend) |
| `geom_vline(xintercept)` | Adds a vertical line to the plot at the specified x-intercept |
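
To show how geometric objects from this table layer onto one plot, here is a minimal sketch. The `toy` data frame and the axis labels are hypothetical; only functions listed in this reference sheet are used.

```r
library(ggplot2)

# Hypothetical toy data, invented for this example only
toy <- data.frame(
  x = c(1, 2, 3, 4),
  y = c(2, 4, 5, 8)
)

p <- ggplot(toy, aes(x = x, y = y)) +
  geom_point() +                  # scatterplot layer
  geom_line() +                   # connect points in order of x
  geom_vline(xintercept = 2.5) +  # vertical reference line
  labs(x = "x value", y = "y value")

# Printing p renders the plot; ggsave("toy_plot.png", p) would write it to a file
```

Each `geom_*()` call adds one layer, so the object above holds three layers drawn in the order they were added.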

## Modeling

@@ -199,7 +169,6 @@ The functions below are relevant for Week 7 (`classification1`) and beyond.
| ------------------------------------------ | ---------------------------------------------- |
| `add_model(workflow, model_spec)` | Add a model to a workflow |
| `add_recipe(workflow, model_recipe)` | Add a recipe to a workflow |
| `add_row(data, col1, col2)` | Add rows to a dataframe |
| `all_predictors()` | Select all predictors |
| `bake(recipe, data)` | Applies the results of prep() to the data |
| `bind_cols(df1, df2)` | Combine multiple dataframes together |