---
title: "Beyond barcharts"
output: html_notebook
---
# Bar chart alternatives
Let's meet the iris dataset!
Read more about it in the R help page for iris (run `?iris` in the console).
```{r}
head(iris)
```
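Before plotting, it can help to check what the data frame contains. This is a minimal sketch using base R only; the object name `species_counts` is introduced here for illustration.

```{r}
# iris ships with base R (package datasets), so nothing needs to be installed
str(iris)                               # four numeric measurements plus the Species factor
species_counts <- table(iris$Species)   # number of flowers per species
species_counts
```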
## ggplot basics
Let's do ggplot basics now!
Make a jitter plot with `geom_jitter()`. Set the `x` aesthetic to `Species` and the `y` aesthetic to `Sepal.Length`.
```{r}
library(ggplot2)  # load ggplot2 once; every plot below relies on it
ggplot(iris, mapping = aes(y = Sepal.Length, x = Species)) +
  geom_jitter()
```
## Barchart in ggplot
Let's make the same plot, but as a barchart.
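Do you already know `stat_summary()`?
It's ggplot's best-kept secret.
First, replace `geom_jitter()` with `stat_summary(geom = 'bar')` to make a barchart.
Second, add `stat_summary(geom = 'errorbar')`.
With no summary function supplied, `stat_summary()` falls back to `mean_se()`: the bar height is the mean and the error bars span one standard error on either side.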
```{r}
ggplot(iris, mapping = aes(y = Sepal.Length, x = Species)) +
  stat_summary(geom = 'bar') +
  stat_summary(geom = 'errorbar')
```
That's your barchart right there!
Is it a beauty? Not quite, but we can fine-tune that at a later stage.
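To see which numbers the bars and error bars are drawn from, the summary can be reproduced by hand. This is a sketch that assumes the default behaviour of `stat_summary()`, which uses `mean_se()` when no summary function is supplied; `setosa`, `se` and `by_hand` are illustrative names.

```{r}
library(ggplot2)

# Sepal lengths of a single species
setosa <- iris$Sepal.Length[iris$Species == "setosa"]

# mean_se() returns the mean (y) and the mean minus/plus one standard error (ymin, ymax)
mean_se(setosa)

# The same three numbers computed by hand
se <- sd(setosa) / sqrt(length(setosa))
by_hand <- c(y = mean(setosa), ymin = mean(setosa) - se, ymax = mean(setosa) + se)
by_hand
```

`mean_se()` also takes a `mult` argument, so `mean_se(setosa, mult = 2)` widens the interval to two standard errors.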
## Density alternatives
Let's make a couple of alternatives.
### try `geom_violin()`
```{r}
ggplot(iris, mapping = aes(y = Sepal.Length, x = Species)) +
  geom_violin()
```
### try `geom_density()`
Replace the `x` aesthetic with `col = Species`, and maybe swap the x and y axes.
You can add the `linetype` aesthetic as well!
That way the plot still works in greyscale print.
```{r}
ggplot(iris, mapping = aes(x = Sepal.Length, col = Species, linetype = Species)) +
  geom_density()
```
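To check how the density plot holds up in print, the colour scale can be swapped for a grey one. This is a sketch assuming `scale_colour_grey()` from ggplot2; `p_grey` is an illustrative name.

```{r}
library(ggplot2)

# Same density plot, rendered in shades of grey:
# the linetype aesthetic keeps the three species distinguishable
p_grey <- ggplot(iris, aes(x = Sepal.Length, colour = Species, linetype = Species)) +
  geom_density() +
  scale_colour_grey() +
  theme_classic()
p_grey
```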
-------====== Wait here please, we'll get up to speed together in a second ===========------
## point/count alternatives
### dotplot
Similar to a violin plot, but shows the actual data points.
Use `geom_dotplot(binaxis = 'y',stackdir = 'center')`
```{r}
ggplot(iris, mapping = aes(y = Sepal.Length, x = Species)) +
  geom_dotplot(binaxis = 'y', stackdir = 'center')
```
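Unless told otherwise, ggplot picks a bin width of 1/30 of the data range for a dotplot, so the dot size may need tuning. This is a sketch with an explicit `binwidth`; the value 0.1 is an assumption, chosen because the sepal lengths are recorded to one decimal, and `p_dots` is an illustrative name.

```{r}
library(ggplot2)

# binwidth is given in the units of the binned axis (here: cm of sepal length)
p_dots <- ggplot(iris, aes(x = Species, y = Sepal.Length)) +
  geom_dotplot(binaxis = "y", stackdir = "center", binwidth = 0.1)
p_dots
```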
### histogram
```{r}
ggplot(iris, mapping = aes(x = Sepal.Length, fill = Species)) +
  geom_histogram(position = 'identity', alpha = .6)
```
### boxplot
```{r}
ggplot(iris, mapping = aes(y = Sepal.Length, x = Species)) +
  geom_boxplot()
```
-------====== Wait here please, we'll get up to speed together in a second ===========------
# Combining shapes!
ggplot can layer shapes on top of one another.
We can use this to our advantage and add more data to our figure.
### Boxplot with jitter
Make a boxplot like before, but add the data points with `geom_jitter()`.
Make the data points transparent and grey if you know how.
```{r}
ggplot(iris, mapping = aes(y = Sepal.Length, x = Species)) +
  geom_boxplot() +
  geom_jitter(alpha = .5, colour = 'grey', width = .2) +
  theme_classic()
```
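One thing to watch when layering raw points over a boxplot: the boxplot draws its outliers as points, and the jitter layer draws the same observations again. This is a sketch that hides the boxplot's own outlier points with `outlier.shape = NA`; `p_box` is an illustrative name and `grey40` is just a darker grey picked for visibility.

```{r}
library(ggplot2)

p_box <- ggplot(iris, aes(x = Species, y = Sepal.Length)) +
  # outliers are left to the jitter layer, so they are not drawn twice
  geom_boxplot(outlier.shape = NA) +
  # height = 0 jitters sideways only, so the measured values stay unchanged
  geom_jitter(alpha = 0.5, colour = "grey40", width = 0.2, height = 0) +
  theme_classic()
p_box
```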
### dotplot with median
Was that too easy? Then here is a harder one!
Next, make a dotplot, but add a red line that displays the median.
Use `stat_summary()` and the cheatsheet.
*hint: `stat_summary()` has a `fun=` argument that sets the function used to summarise the data.*
```{r}
ggplot(iris, mapping = aes(y = Sepal.Length, x = Species)) +
  geom_dotplot(binaxis = 'y', stackdir = 'center') +
  stat_summary(geom = 'crossbar', fun = median, col = 'red') +
  theme_classic()
```
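The red line sits at the per-species median, which can be checked against a plain base R calculation. This is a sketch; `medians` and `p_median` are illustrative names, and `fun.min` and `fun.max` are set to `median` as well (an addition to the solution above) so that the crossbar collapses to a single line.

```{r}
library(ggplot2)

# Median sepal length per species, computed without ggplot
medians <- tapply(iris$Sepal.Length, iris$Species, median)
medians

# The same summary as a plot layer
p_median <- ggplot(iris, aes(x = Species, y = Sepal.Length)) +
  stat_summary(fun = median, fun.min = median, fun.max = median,
               geom = "crossbar", colour = "red", width = 0.5)
p_median
```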
-------====== Wait here please, we'll get up to speed together in a second ===========------
# Extension packages
## ggdist
ggdist deals with displaying distributions, perfect!
Have a look at their cheatsheet.
[ggdist cheatsheet](https://github.com/mjskay/ggdist/blob/master/figures-source/cheat_sheet-slabinterval.pdf){.uri}
Try a ggdist visualisation on the iris dataset.
```{r}
ggplot(iris, mapping = aes(y = Sepal.Length, x = Species, shape = Species)) +
  ggdist::stat_halfeye()
```
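ggdist layers combine like any other ggplot layers. This is a sketch that pairs a half density with the raw observations, assuming the ggdist package is installed and that `stat_halfeye()` and `stat_dots()` take a `side` argument as described in the ggdist documentation; `p_rain` is an illustrative name, and the chunk does nothing if ggdist is missing.

```{r}
library(ggplot2)

# Only run when ggdist is available
if (requireNamespace("ggdist", quietly = TRUE)) {
  p_rain <- ggplot(iris, aes(x = Species, y = Sepal.Length)) +
    ggdist::stat_halfeye(side = "right") +   # density slab plus point and interval
    ggdist::stat_dots(side = "left") +       # raw observations as stacked dots
    theme_classic()
  print(p_rain)
}
```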
# A figure of your own?
Do you have a (barchart?) figure of your own at hand?
Maybe play with that!
I'll be around to help out.
```{r}
```
## Alternative:
Formatting your data into the right shape is as hard as working with ggplot itself.
I'd like you to try to make a boxplot of the iris dataset.
The x axis should show the different lengths that were measured.
The fill colour should be the Species, and the y axis should show the actual measured length.
Can you manage?
Now make your favourite alternative, maybe a dotplot with a crossbar?
Or some other shape you may have found in the ggdist package.
```{r}
```
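One way to get the data into shape, as a sketch rather than the only solution: `pivot_longer()` from tidyr stacks the measurement columns into a single column. Only the two length columns are reshaped here, which is one reading of the task; the names `iris_lengths`, `measurement`, `length_cm` and `p_lengths` are illustrative choices.

```{r}
library(ggplot2)
library(tidyr)

# One row per flower and measurement instead of one column per measurement
iris_lengths <- pivot_longer(
  iris,
  cols = c(Sepal.Length, Petal.Length),
  names_to = "measurement",
  values_to = "length_cm"
)
head(iris_lengths)

# Measurement on the x axis, species as fill colour, measured value on the y axis
p_lengths <- ggplot(iris_lengths, aes(x = measurement, y = length_cm, fill = Species)) +
  geom_boxplot()
p_lengths
```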
## Challenge!
Feeling up to a challenge? Try to re-create this:
![](images/clipboard-819503216.png)
```{r}
library(tidyr)  # provides pivot_longer()

# reshape to long format: one row per flower and measured variable
iris_long <- pivot_longer(iris, -Species, names_to = "variable", values_to = "value")
# fix the order in which the variables appear on the x axis
iris_long$variable <- factor(iris_long$variable,
                             levels = c("Petal.Width",
                                        "Sepal.Width",
                                        "Petal.Length",
                                        "Sepal.Length"))
# violin per variable, faceted by species, with the mean overlaid
ggplot(iris_long, aes(x = variable, y = value, fill = variable)) +
  # geom_dotplot(binaxis = "y", stackdir = "center", position = "dodge", alpha = .5) +
  geom_violin() +
  facet_wrap(~Species, scales = "free_y") +
  stat_summary(fun = mean, geom = "pointrange", shape = 18, size = 1, color = "black", show.legend = FALSE) +
  # linewidth replaces size for lines in ggplot2 >= 3.4.0
  stat_summary(fun = mean, geom = "line", aes(group = 1), linewidth = 1) +
  theme_minimal() +
  theme(axis.text.x = element_text(angle = 45, hjust = 1))
```