The Datasaurus Dozen
Thomas Lin Pedersen edited this page Sep 4, 2018 · 4 revisions
submitted by Tom Westlake
The Datasaurus Dozen is a playful twist on Anscombe's Quartet: a group of twelve datasets with nigh-identical summary statistics that nevertheless prove to be distinctly dissimilar when plotted on a graph.
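The "nigh-identical summary statistics" can be checked directly. The following is a minimal sketch in base R, assuming the datasaurus_dozen data frame shipped with the datasauRus package has the columns dataset, x and y (as the plotting code on this page implies); the name stats and the choice of statistics are illustrative only.

```r
library(datasauRus)

# Split the data by dataset and compute a few summary statistics for each one.
stats <- do.call(rbind, lapply(
  split(datasaurus_dozen, datasaurus_dozen$dataset),
  function(d) {
    data.frame(
      dataset = d$dataset[1],
      mean_x  = mean(d$x),
      mean_y  = mean(d$y),
      sd_x    = sd(d$x),
      sd_y    = sd(d$y),
      cor_xy  = cor(d$x, d$y)
    )
  }
))

# One row per dataset; the columns should agree closely across rows.
print(stats, digits = 3, row.names = FALSE)
```

If the statistics really are near-identical, every row of the printed table should look almost the same, even though the point clouds themselves look nothing alike.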
The animation below, utilising the datasauRus, ggplot2 and gganimate packages, highlights the dangers of relying solely on summary statistics without considering the whole distribution.
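library(datasauRus)
library(ggplot2)
library(gganimate)

# Scatter plot of all points; gganimate shows one dataset at a time
ggplot(datasaurus_dozen, aes(x = x, y = y)) +
  geom_point() +
  theme_minimal() +
  # Morph between datasets: transitions take 3 units, each state is held for 1
  transition_states(dataset, transition_length = 3, state_length = 1) +
  # Ease the movement in and out with a cubic curve
  ease_aes('cubic-in-out')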
Install gganimate from CRAN using install.packages('gganimate'), or the development version using devtools::install_github('thomasp85/gganimate').
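For a static view that needs only ggplot2, the same data can be drawn as one panel per dataset. This is a sketch of an alternative presentation, not part of the original example; the variable name p is illustrative.

```r
library(datasauRus)
library(ggplot2)

# One small scatter plot per dataset instead of an animation
p <- ggplot(datasaurus_dozen, aes(x = x, y = y)) +
  geom_point(size = 0.5) +
  facet_wrap(~dataset) +
  theme_minimal()

print(p)
```

Side-by-side panels make it easy to compare shapes at a glance, at the cost of the morphing effect that the animation provides.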