- What is data visualization?
- Why is data visualization important?
- How to do data visualization?
- Possible options and pitfalls
A technique for communicating insights from data through visual representation.
Allows easy understanding of large datasets.
Provides basic knowledge about the variables, such as their range and distribution.
An efficient way to identify, locate, manipulate, format, and present data.
- Ever-increasing amount of data.
- Distinct patterns are nearly impossible to spot by eye in large tables of raw numbers.
- Improved insight.
- Faster decision making.
- Plot in Base R
- ggplot2 package and associates
data(mtcars)
plot(mpg~wt, mtcars, pch=19, col="blue")
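
The base R call above can be extended with labels and extra layers. The following is a minimal sketch, not part of the original slides: the axis labels and title are illustrative choices, and the fitted line is an optional addition to show how base graphics add elements one call at a time.

```r
# Base R scatterplot of the built-in mtcars data with labelled axes.
data(mtcars)
plot(mpg ~ wt, data = mtcars,
     pch = 19, col = "blue",
     xlab = "Weight (1000 lbs)",
     ylab = "Miles per gallon",
     main = "Fuel efficiency vs. weight")

# Optional extra layer: a least-squares line from a simple linear model.
fit <- lm(mpg ~ wt, data = mtcars)
abline(fit, col = "red", lwd = 2)
```

Each element (points, line) is drawn by a separate function call onto the same device, which is what "low level of abstraction" means in practice.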
plot vs ggplot
Base plot | ggplot2 |
---|---|
In-built | Additional package |
Easy to learn | Steep learning curve |
Independent of data structures | Works only with data frames |
Easy for simple plots | Verbose for complex plots |
Low level of abstraction | High abstraction level |
Visually less appealing | Visually more appealing |
Based on the Grammar of Graphics (Wilkinson, 2005).
A plot is assembled from several building blocks, much like a sentence.
- data
- aesthetic mapping
- geometric object
- scales
- coordinate system
- position adjustments
- faceting
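
The building blocks listed above can be combined in a single plot. This is a rough sketch, assuming ggplot2 is installed; the palette, jitter width, axis limits, and faceting variable (`am`) are illustrative choices, not taken from the slides.

```r
library(ggplot2)

p <- ggplot(mtcars,                                      # data
            aes(x = wt, y = mpg,
                colour = factor(cyl))) +                 # aesthetic mapping
  geom_point(size = 3,                                   # geometric object
             position = position_jitter(width = 0.05,
                                        height = 0,
                                        seed = 1)) +     # position adjustment
  scale_colour_brewer(palette = "Dark2",
                      name = "Cylinders") +              # scale
  coord_cartesian(xlim = c(1, 6)) +                      # coordinate system
  facet_wrap(~ am)                                       # faceting (one panel per am value)

print(p)
```

Each `+` adds one block, so any of them can be removed or swapped without touching the rest.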
# install.packages("ggplot2", dependencies = TRUE)  # run once if ggplot2 is not installed
library(ggplot2)
ggplot(mtcars, aes(x= wt, y= mpg))+
geom_point(colour="blue", size=3)
ggplot(mtcars) #data
ggplot(mtcars, aes(x= wt, y= mpg)) #data+aesthetic map
ggplot(mtcars, aes(x= wt, y= mpg))+ #data+aesthetic map
geom_point() #geometric obj
ggplot(mtcars, aes(x= wt, y= mpg))+ #data+aesthetic map
geom_point(colour="blue", size=3) #geometric obj
ggplot(mtcars, aes(x= wt, y= mpg))+ #data+aesthetic map
geom_point(colour="blue", size=3)+ #geometric obj
ggtitle("Scatterplot") #Plot title
- DATA: only a data frame is allowed
- AES: maps variables in the data to aesthetics such as x, y, colour, size, and shape
- GEOM: the geometric object used to draw the data
- geom_point for point plot
- geom_bar for barplot
- geom_line for line plot
- geom_histogram for histogram
- geom_boxplot for boxplot
and so on
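
Two of the geoms listed above, sketched on `mtcars`. The bin width and fill colour are arbitrary choices made for this illustration.

```r
library(ggplot2)

# Histogram: one continuous variable, binned into counts.
p_hist <- ggplot(mtcars, aes(x = mpg)) +
  geom_histogram(binwidth = 2.5, fill = "steelblue", colour = "white")

# Boxplot: a continuous variable summarised per group.
# cyl is numeric in mtcars, so factor() turns it into a grouping variable.
p_box <- ggplot(mtcars, aes(x = factor(cyl), y = mpg)) +
  geom_boxplot()

print(p_hist)
print(p_box)
```

Only the geom and the aesthetics change between the two plots; the data and the overall structure of the call stay the same.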
ggplot(mtcars, aes(x=mpg))+
geom_bar()
ggplot(mtcars, aes(x=cyl, y=mpg, fill= cyl))+
geom_bar(stat="identity")
ggplot(mtcars, aes(x=cyl, y=mpg))+
geom_point(stat="identity", size=4)
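
With `stat="identity"` the bar height is taken directly from `y`, so the bar chart above stacks the `mpg` of every car that shares a `cyl` value. If a mean per group is what is wanted, one possible approach is to summarise first. A minimal sketch, where `mean_mpg` and `p_bar` are names made up for this example:

```r
library(ggplot2)

# Mean mpg per cylinder count, computed with base R.
mean_mpg <- aggregate(mpg ~ cyl, data = mtcars, FUN = mean)
mean_mpg$cyl <- factor(mean_mpg$cyl)  # discrete x axis and fill legend

p_bar <- ggplot(mean_mpg, aes(x = cyl, y = mpg, fill = cyl)) +
  geom_col()  # geom_col() is shorthand for geom_bar(stat = "identity")

print(p_bar)
```

Converting `cyl` to a factor also changes the fill legend from a continuous gradient to three discrete colours.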
You can export any plot using the plot window in R/RStudio.
To save plots in high resolution, ggsave() is helpful:
sct <- ggplot(mtcars, aes(x = wt, y = mpg)) +
  geom_point(colour = "blue", size = 3) +
  ggtitle("Scatterplot")
# ggsave() takes the file name first; the plot goes in the plot argument
ggsave("Scatterplot_with_R.jpeg", plot = sct, dpi = 300, device = "jpeg")
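
The pixel size of a saved image depends on both the physical size and the dpi, so it can help to set them explicitly. A minimal sketch under a few assumptions: the output goes to a temporary file (a hypothetical path chosen so nothing in the working directory is overwritten), and PNG is used instead of JPEG.

```r
library(ggplot2)

sct <- ggplot(mtcars, aes(x = wt, y = mpg)) +
  geom_point(colour = "blue", size = 3) +
  ggtitle("Scatterplot")

# Hypothetical output location: a file in the session's temporary directory.
out_file <- file.path(tempdir(), "scatterplot_300dpi.png")

# 6 x 4 inches at 300 dpi corresponds to 1800 x 1200 pixels.
ggsave(filename = out_file, plot = sct,
       width = 6, height = 4, units = "in", dpi = 300)
```

ggsave() picks the graphics device from the file extension when `device` is not given.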
- Use your own data and make a basic plot (scatterplot, barplot, histogram) in ggplot
- Change the colour of the plot
- What changes if you put colour or shape in the aesthetic mapping (aes) rather than in the geometric object?
- For further queries: nilanjan@wii.gov.in
- Slides: https://nilanjanchatterjee.github.io/projects/