betweenthepipes is an R package that currently holds two tutorials, created with learnr. There are also two sample data sets that are useful for learning to work with hockey data.
First, install this package from GitHub: devtools::install_github("meghall06/betweenthepipes")
- Introduction to R with Hockey Data. A beginner-friendly introduction to R and the tidyverse with sample hockey data. Introduces the basic tidyverse functions: filter(), select(), arrange(), mutate(), group_by(), and summarize().
- More Data Manipulation. Goes further into data manipulation, with details on pivoting data (using pivot_longer() and pivot_wider()), joining data, and working with strings.
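As a rough preview of the kind of code these tutorials cover, here is a minimal sketch that chains the basic dplyr verbs and then reshapes a count table with pivot_wider(). It assumes dplyr and tidyr are installed. The shots data frame and its column names are hypothetical, made up for illustration; they are not the actual columns of pbp_example, and this is not code taken from the tutorials.

```r
library(dplyr)
library(tidyr)

# Hypothetical toy data: column names are illustrative only and do NOT
# reflect the real structure of pbp_example.
shots <- data.frame(
  game_id = c(1, 1, 1, 2, 2, 2),
  team    = c("PHI", "PHI", "TOR", "PHI", "BOS", "BOS"),
  event   = c("SHOT", "GOAL", "SHOT", "GOAL", "SHOT", "HIT"),
  dist_ft = c(30, 12, 45, 8, 22, NA)
)

# Chain the basic verbs: keep shots and goals, pick columns, add a flag,
# then summarize per team and sort by goals scored.
team_summary <- shots %>%
  filter(event %in% c("SHOT", "GOAL")) %>%
  select(team, event, dist_ft) %>%
  mutate(is_goal = event == "GOAL") %>%
  group_by(team) %>%
  summarize(
    shots    = n(),
    goals    = sum(is_goal),
    avg_dist = mean(dist_ft)
  ) %>%
  arrange(desc(goals))

# Pivot event counts so that each event type becomes its own column,
# filling team/event combinations that never occurred with 0.
event_counts <- shots %>%
  count(team, event) %>%
  pivot_wider(names_from = event, values_from = n, values_fill = 0)

print(team_summary)
print(event_counts)
```

The same pattern should carry over to the package's data once you swap in its real column names, which you can check with names(pbp_example).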
Once the package has been installed, there are two ways to access the tutorials. You can run each tutorial individually with the following code:
library(betweenthepipes)
intro()
data_manip()
Or, if you have RStudio version 1.3 or later, there should be a Tutorial pane in the upper right corner (near Environment and Git). That pane should list all the tutorials available from the packages you've installed.
There are two data sets available in this package: pbp_example and bio_example. pbp_example is a data set containing NHL play-by-play data for four Philadelphia Flyers games from November 2019. bio_example is a data set containing some NHL biographic data from 2019, useful for practicing joins with the data in pbp_example. More information on the data sets is available with ?betweenthepipes::bio_example or ?betweenthepipes::pbp_example.
# To load the data sets into the global environment
library(betweenthepipes)
bio_example <- bio_example
pbp_example <- pbp_example
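Since bio_example is described as useful for practicing joins against pbp_example, the following sketch shows what such a join might look like. It assumes dplyr is installed. The two toy data frames and the shared player key are hypothetical stand-ins; check the help pages for the real column names before adapting this.

```r
library(dplyr)

# Hypothetical stand-ins for pbp_example and bio_example. The column names
# (including the "player" key) are assumptions for illustration only.
pbp_toy <- data.frame(
  player = c("A.PLAYER", "B.SKATER", "A.PLAYER", "C.GOALIE"),
  event  = c("SHOT", "GOAL", "GOAL", "SAVE")
)
bio_toy <- data.frame(
  player   = c("A.PLAYER", "B.SKATER"),
  position = c("C", "D")
)

# left_join() keeps every play-by-play row and adds the bio columns;
# players missing from the bio table get NA for position.
joined <- left_join(pbp_toy, bio_toy, by = "player")

# anti_join() returns the play-by-play rows with no match in the bio table,
# which is a quick way to spot keys that failed to join.
unmatched <- anti_join(pbp_toy, bio_toy, by = "player")

print(joined)
print(unmatched)
```

Checking unmatched rows after a join is a useful habit, because name formats often differ between data sources.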
In October 2020, I gave a tidyverse-focused workshop at the Carnegie Mellon Sports Analytics Conference using the data available in this package. The slides and code from the workshop are available on my website here.
The play-by-play data was scraped using the Evolving-Hockey R scraper and the biographic data was downloaded from NaturalStatTrick.