KSUDS/p1_visualization

Data science and visualization

If 80% of data science work is data wrangling, 80% of your impact is through visualization.

Background

Hans Rosling is one of the most popular data scientists on the web. His original TED talk went viral among my friends when it came out. We are going to create some graphics using his formatted data as our weekly case study. Note that we need to remove Kuwait from the data (discussion on this).

Tasks

Visualization review

  • Complete a review of 2-3 different data visualizations used to answer specific questions. Some fun websites are pudding.cool, wonkblog, fivethirtyeight, and priceonomics (but you can use any website, blog, or article with a good visualization).

Slack, VS Code, RStudio, Git, and GitHub

  • Make sure you are in our Slack workspace.
  • Finish setting up VS Code for programming in R and Python.
  • Finish setting up RStudio.
  • Finish installing Git.
  • Finish creating your GitHub account and connecting to our organization.

R

  • Recreate the two graphics in this repo using the gapminder dataset from library(gapminder) (get them to match as closely as you can).
    • Use library(tidyverse) to load ggplot2 and dplyr, and use theme_bw() to duplicate the first plot.
    • Use scale_y_continuous(trans = "sqrt") to get the correct scale on the y-axis.
    • Build a weighted average data set of GDP using weighted.mean() with group_by() and summarise(). This data set supplies the black continent average line on the second plot.
    • Use theme_bw() to duplicate the second plot. You will need to use the new data set to make the black lines and dots showing the continent average.
    • Use ggsave() to save each plot as a .png with a width of 15 inches.

Python

  • Recreate the two graphics in this repo using the gapminder dataset from R's library(gapminder) (get them to match as closely as you can).
    • Export the data from R and import it into your Python environment.
    • Use plotnine or Altair to mimic the two graphics as closely as possible.
    • Build a weighted average data set using GDP. It supplies the black continent average line on the second plot.
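
The weighted average step can be sketched without any plotting library. The example below is only an illustration under stated assumptions. It assumes the exported file keeps the gapminder column names (country, continent, year, pop, gdpPercap), and it assumes population is the weight, which the task list does not spell out. The sample rows are made up, and the helper name continent_weighted_gdp is hypothetical.

```python
import csv
import io
from collections import defaultdict

# Made-up sample rows in the gapminder column layout; the values are
# illustrative only and are NOT real gapminder figures.
SAMPLE_CSV = """country,continent,year,pop,gdpPercap
A,Asia,2007,100,1000
B,Asia,2007,300,2000
Kuwait,Asia,2007,50,40000
C,Europe,2007,200,30000
D,Europe,2007,200,10000
"""


def continent_weighted_gdp(rows, exclude=("Kuwait",)):
    """Return {(continent, year): weighted mean of gdpPercap}.

    Assumption: population (pop) is used as the weight.
    Countries listed in `exclude` are dropped before averaging.
    """
    weighted_sum = defaultdict(float)  # sum of pop * gdpPercap per group
    total_pop = defaultdict(float)     # sum of pop per group
    for row in rows:
        if row["country"] in exclude:
            continue
        key = (row["continent"], int(row["year"]))
        pop = float(row["pop"])
        weighted_sum[key] += pop * float(row["gdpPercap"])
        total_pop[key] += pop
    return {key: weighted_sum[key] / total_pop[key] for key in weighted_sum}


# In practice, replace io.StringIO(SAMPLE_CSV) with open("your_export.csv").
rows = list(csv.DictReader(io.StringIO(SAMPLE_CSV)))
averages = continent_weighted_gdp(rows)
for (continent, year), value in sorted(averages.items()):
    print(continent, year, value)
```

Each resulting value is one point on a continent's black average line for that year. If you work in pandas instead, the same calculation is a group-by on continent and year followed by the same sum(weight * value) / sum(weight) formula.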

Readings

Visualization (being)

Technology

R

Python
