covid19R

The goal of covid19R is to provide a single package that gives users access to all of the tidy Covid-19 datasets collected by data packages that implement the covid19R tidy data standard.

To learn more about the Covid19R project, check our extensive documentation, which covers the data standards, how to get your data added to this list, and more.

Installation

You can install the development version from GitHub with:

remotes::install_github("covid19r/covid19r")

Getting the Data Information

To see what datasets are available, use get_covid19_data_info():

library(covid19R)

data_info <- get_covid19_data_info()

head(data_info) %>% knitr::kable()

data_set_name	package_name	function_to_get_data	data_details	data_url	license_url	data_types	location_types	spatial_extent	has_geospatial_info	get_info_passing	refresh_status	last_refresh_update
covid19nytimes_states	covid19nytimes	refresh_covid19nytimes_states	Open Source data from the New York Times on distribution of confirmed Covid-19 cases and deaths in the US States. For more, see https://www.nytimes.com/article/coronavirus-county-data-us.html or the readme at https://github.com/nytimes/covid-19-data.	https://raw.githubusercontent.com/nytimes/covid-19-data/master/us-counties.csv	https://github.com/nytimes/covid-19-data/blob/master/LICENSE	cases_total, deaths_total	state	country	FALSE	TRUE	Passed	2020-05-04 16:08:36
covid19nytimes_counties	covid19nytimes	refresh_covid19nytimes_counties	Open Source data from the New York Times on distribution of confirmed Covid-19 cases and deaths in the US by County. For more, see https://www.nytimes.com/article/coronavirus-county-data-us.html or the readme at https://github.com/nytimes/covid-19-data.	https://raw.githubusercontent.com/nytimes/covid-19-data/master/us-counties.csv	https://github.com/nytimes/covid-19-data/blob/master/LICENSE	cases_total, deaths_total	state	country	FALSE	TRUE	Passed	2020-05-04 16:08:39
covid19france	covid19france	refresh_covid19france	Open Source data from opencovid19-fr on distribution of confirmed Covid-19 cases and deaths in the US States. For more, see https://github.com/opencovid19-fr/data.	https://raw.githubusercontent.com/opencovid19-fr/data/master/dist/chiffres-cles.csv	https://github.com/opencovid19-fr/data/blob/master/LICENSE	confirmed, dead, icu, hospitalized, recovered, discovered	county, region, country, overseas collectivity	country	FALSE	TRUE	Passed	2020-05-04 16:08:47
CanadaC19_cases	CanadaC19	refresh_CanadaC19_cases	Open Source data from multiple public reporting data throughout Canada. For more, see https://github.com/ishaberry/Covid19Canada.	https://raw.githubusercontent.com/ishaberry/Covid19Canada/master/cases.csv	https://github.com/debusklaneml/CanadaC19/blob/master/LICENSE	cases_new	state	state	FALSE	TRUE	Passed	2020-05-04 16:08:48
covid19us	covid19us	refresh_covid19us	Open Source data from COVID Tracking Project on the distribution of Covid-19 cases and deaths in the US. For more, see https://github.com/opencovid19-fr/data.	https://covidtracking.com/api	https://github.com/aedobbyn/covid19us/blob/master/LICENSE.md	positive, negative, pending, hospitalized_currently, hospitalized_cumulative, in_icu_currently, in_icu_cumulative, on_ventilator_currently, on_ventilator_cumulative, recovered, death, hospitalized, total, total_test_results, death_increase, hospitalized_increase, negative_increase, positive_increase, total_test_results_increase	state	country	FALSE	TRUE	Passed	2020-05-04 16:08:50
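The info table is an ordinary data frame, so it can be filtered to find datasets that report a particular kind of measurement. The following is a minimal sketch, assuming dplyr is installed. To run offline it uses data_info_example, a hypothetical hand-built stand-in holding three rows and three columns copied from the table above. With the package loaded you would filter the result of get_covid19_data_info() instead.

```r
library(dplyr)

# Hypothetical stand-in for get_covid19_data_info(): three rows and
# three columns copied from the table above, so no network is needed.
data_info_example <- data.frame(
  data_set_name = c("covid19nytimes_states", "covid19france", "CanadaC19_cases"),
  package_name  = c("covid19nytimes", "covid19france", "CanadaC19"),
  data_types    = c("cases_total, deaths_total",
                    "confirmed, dead, icu, hospitalized, recovered, discovered",
                    "cases_new"),
  stringsAsFactors = FALSE
)

# Keep the datasets whose data_types field mentions hospitalizations,
# then pull out just their names.
hospital_sets <- data_info_example %>%
  filter(grepl("hospitalized", data_types, fixed = TRUE)) %>%
  pull(data_set_name)

print(hospital_sets)
```

The resulting names are the values you would pass to get_covid19_dataset().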

Accessing data

Once you have decided which dataset you want, you can access it with get_covid19_dataset():

library(dplyr)
#> 
#> Attaching package: 'dplyr'
#> The following objects are masked from 'package:stats':
#> 
#>     filter, lag
#> The following objects are masked from 'package:base':
#> 
#>     intersect, setdiff, setequal, union

nytimes_states <- get_covid19_dataset("covid19nytimes_states")
#> Parsed with column specification:
#> cols(
#>   date = col_date(format = ""),
#>   location = col_character(),
#>   location_type = col_character(),
#>   location_code = col_character(),
#>   location_code_type = col_character(),
#>   data_type = col_character(),
#>   value = col_double()
#> )

nytimes_states %>%
  filter(date == max(date)) %>%
  filter(data_type == "cases_total") %>%
  arrange(desc(value)) %>%
  head()
#> # A tibble: 6 x 7
#>   date       location location_type location_code location_code_t… data_type
#>   <date>     <chr>    <chr>         <chr>         <chr>            <chr>    
#> 1 2020-05-03 New York state         36            fips_code        cases_to…
#> 2 2020-05-03 New Jer… state         34            fips_code        cases_to…
#> 3 2020-05-03 Massach… state         25            fips_code        cases_to…
#> 4 2020-05-03 Illinois state         17            fips_code        cases_to…
#> 5 2020-05-03 Califor… state         06            fips_code        cases_to…
#> 6 2020-05-03 Pennsyl… state         42            fips_code        cases_to…
#> # … with 1 more variable: value <dbl>
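Because every dataset is stored in long format, comparing two data types side by side usually means reshaping to wide format first. The following is a rough sketch, assuming tidyr (version 1.0.0 or later) is installed. The data frame long_example is hypothetical and its values are made-up placeholders, not real counts. Only the column names follow the output shown above.

```r
library(tidyr)

# Hypothetical long-format data with a subset of the standard columns.
# The numbers are placeholders, not real case or death counts.
long_example <- data.frame(
  date      = as.Date(rep("2020-05-03", 4)),
  location  = c("New York", "New York", "New Jersey", "New Jersey"),
  data_type = c("cases_total", "deaths_total", "cases_total", "deaths_total"),
  value     = c(100, 10, 50, 5),
  stringsAsFactors = FALSE
)

# Spread each data_type into its own column: one row per date and location.
wide_example <- pivot_wider(
  long_example,
  names_from  = data_type,
  values_from = value
)

print(wide_example)
```

The same call should work on a real dataset such as nytimes_states, provided each combination of date, location, and data_type appears only once.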

The covid19R Data Standard

While many datasets have their own additional columns (e.g., latitude, longitude, population), all datasets are arranged in a long format and share the following columns:

date - The date in YYYY-MM-DD form
location - The name of the location as provided by the data source. The counties dataset provides both county and state, combined into one value and separated by a comma (,). They can be split with tidyr::separate() if you wish.
location_type - The type of location using the covid19R controlled vocabulary. Nested locations are indicated by multiple location types combined with an underscore (_).
location_code - A standardized location code using a national or international standard. In this case, FIPS state or county codes. See https://en.wikipedia.org/wiki/Federal_Information_Processing_Standard_state_code and https://en.wikipedia.org/wiki/FIPS_county_code for more.
location_code_type - The type of standardized location code being used according to the covid19R controlled vocabulary. Here we use fips_code.
data_type - The type of data in that given row. Here it includes cases_total and deaths_total, cumulative measures of cases and deaths.
value - The value recorded for the data type in that row (e.g., the number of cases or deaths).
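The split of a combined county and state location mentioned above can be sketched with tidyr::separate(). This example rests on an assumption: that the combined value looks like "county,state" with no space after the comma. The data frame counties_example and its contents are hypothetical placeholders, so check the separator against the real counties dataset before relying on it.

```r
library(tidyr)

# Hypothetical rows imitating the counties dataset. The "county,state"
# layout with no space after the comma is an assumption.
counties_example <- data.frame(
  location = c("Snohomish,Washington", "Cook,Illinois"),
  value    = c(1, 2),
  stringsAsFactors = FALSE
)

# Split the combined location into two new columns at the comma.
counties_split <- separate(
  counties_example,
  location,
  into = c("county", "state"),
  sep  = ","
)

print(counties_split)
```

If the real values include a space after the comma, sep = ",\\s*" would absorb it.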

Vocabularies

The location_type, location_code_type, and data_type columns in the datasets, and the spatial_extent column in the data info table, each have their own controlled vocabulary. Others might be introduced as the collection of packages matures. To see the possible values of a standardized vocabulary, use get_covid19_controlled_vocab():

get_covid19_controlled_vocab("location_type") %>%
  knitr::kable()

location_type	description
continent	continental scale
country	a country with soverign borders
state	a spatial area inside that country such as a state, province, canton, etc.
county	a spatial area demarcated within a state
city	a single municipality - the smallest spatial grain of government in a country
canton	the cantons of Switzerland and Principality of Liechtenstein (FL)
