# smokingMouse
* [Presentation](https://docs.google.com/presentation/d/1Sh922_c_pLrM74313FbHaJPAES8Jeh4lLa8kHbkUsL8/edit?usp=sharing)
* Code: [LieberInstitute/smokingMouse_Indirects](https://github.com/LieberInstitute/smokingMouse_Indirects)
* Daianna's public notes: [Notion](https://unequaled-boursin-0e1.notion.site/Modeling-the-effects-of-nicotine-and-smoking-exposures-on-the-developing-brain-85c9d6f413da4fd3a7dc1a5255f667d4).
# Review
* Should we explore the relationships among the variables that describe our samples before running a differential expression analysis?
* Why do we use the `edgeR` package?
* Why is the `sort.by` argument of `topTable()` important?
* Why is the `coef` argument of `topTable()` important?
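
As a reminder of what the last two questions refer to, here is a minimal sketch on simulated data. It assumes the `limma` package is installed; the matrix `y`, the `group` and `age` variables, and the design are all made up for illustration and are not part of the course data.

```r
library("limma")

## Simulated log-expression matrix: 50 hypothetical genes, 8 hypothetical samples
set.seed(20210225)
n_genes <- 50
group <- factor(rep(c("A", "B"), each = 4))
age <- c(30, 45, 52, 61, 33, 47, 55, 60)
y <- matrix(
    rnorm(n_genes * 8),
    nrow = n_genes,
    dimnames = list(paste0("gene", seq_len(n_genes)), paste0("sample", 1:8))
)
## Add a group effect to the first 5 genes
y[1:5, group == "B"] <- y[1:5, group == "B"] + 3

## The design has three columns: "(Intercept)", "groupB" and "age"
design <- model.matrix(~ group + age)
fit <- eBayes(lmFit(y, design))

## coef chooses which column of the design the statistics are for
tt_group <- topTable(fit, coef = "groupB", number = Inf, sort.by = "none")
tt_age <- topTable(fit, coef = "age", number = Inf, sort.by = "none")

## sort.by = "none" keeps the genes in the same order as the input matrix,
## while sort.by = "P" reorders them by p-value
tt_p <- topTable(fit, coef = "groupB", number = Inf, sort.by = "P")
head(rownames(tt_group))
head(rownames(tt_p))
```

With `sort.by = "none"` the rows of the result line up with the rows of `y`, which matters whenever the table is combined with other per-gene information. Changing `coef` changes which hypothesis the log fold changes and p-values describe.
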

Let's use the data from http://research.libd.org/SPEAQeasy-example/bootcamp_intro
```{r "speaqeasy_data"}
speaqeasy_data <- file.path(tempdir(), "rse_speaqeasy.RData")
download.file("https://github.com/LieberInstitute/SPEAQeasy-example/blob/master/rse_speaqeasy.RData?raw=true", speaqeasy_data, mode = "wb")
library("SummarizedExperiment")
load(speaqeasy_data, verbose = TRUE)
rse_gene
```
* How many genes and samples do we have in these data?
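
One way to ask an object for its dimensions is sketched below on a tiny hypothetical `SummarizedExperiment` (`toy_rse`, built here only for illustration). The same functions should apply to `rse_gene`, where rows are genes and columns are samples.

```r
library("SummarizedExperiment")

## Hypothetical toy object: 4 genes (rows) by 3 samples (columns)
counts <- matrix(
    1:12,
    nrow = 4,
    dimnames = list(paste0("gene", 1:4), paste0("sample", 1:3))
)
toy_rse <- SummarizedExperiment(
    assays = list(counts = counts),
    colData = DataFrame(
        PrimaryDx = factor(c("Control", "Bipolar", "Control")),
        row.names = colnames(counts)
    )
)

## Rows (genes) first, then columns (samples)
dim(toy_rse)
nrow(toy_rse)
ncol(toy_rse)
```
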
## Team exercise
* Are there differences in `totalAssignedGene` or `mitoRate` between the diagnosis groups (`PrimaryDx`)?
* Plot the expression of _SNAP25_ for each diagnosis group.
* Suggest a statistical model that we could use in a differential expression analysis. Verify that it is a _full rank_ model. Which coefficient or coefficients would be of interest?
## Answers
```{r "respuestas"}
## Let's explore the PrimaryDx variable
table(rse_gene$PrimaryDx)
## Let's drop the "Other" diagnosis level because it has no information
## (droplevels() only removes unused levels)
rse_gene$PrimaryDx <- droplevels(rse_gene$PrimaryDx)
table(rse_gene$PrimaryDx)
## Let's numerically explore differences between diagnosis groups for
## several variables
with(colData(rse_gene), tapply(totalAssignedGene, PrimaryDx, summary))
with(colData(rse_gene), tapply(mitoRate, PrimaryDx, summary))
## We can do the same for other variables
with(colData(rse_gene), tapply(mitoRate, BrainRegion, summary))
## We can answer the first questions with iSEE
if (interactive()) iSEE::iSEE(rse_gene)
## Or make the plots ourselves. Here is one possible answer
## with ggplot2
library("ggplot2")
ggplot(
    as.data.frame(colData(rse_gene)),
    aes(y = totalAssignedGene, group = PrimaryDx, x = PrimaryDx)
) +
    geom_boxplot() +
    theme_bw(base_size = 20) +
    xlab("Diagnosis")
ggplot(
    as.data.frame(colData(rse_gene)),
    aes(y = totalAssignedGene, group = paste0(PrimaryDx, "_", BrainRegion), x = paste0(PrimaryDx, "_", BrainRegion))
) +
    geom_boxplot() +
    theme_bw(base_size = 20) +
    xlab("Diagnosis")
ggplot(
    as.data.frame(colData(rse_gene)),
    aes(y = mitoRate, group = PrimaryDx, x = PrimaryDx)
) +
    geom_boxplot() +
    theme_bw(base_size = 20) +
    xlab("Diagnosis")
ggplot(
    as.data.frame(colData(rse_gene)),
    aes(y = mitoRate, group = paste0(PrimaryDx, "_", BrainRegion), x = paste0(PrimaryDx, "_", BrainRegion))
) +
    geom_boxplot() +
    theme_bw(base_size = 20) +
    xlab("Diagnosis")
## Other variables
ggplot(
    as.data.frame(colData(rse_gene)),
    aes(y = mitoRate, group = BrainRegion, x = BrainRegion)
) +
    geom_boxplot() +
    theme_bw(base_size = 20) +
    xlab("Brain Region")
## Let's find the SNAP25 gene
rowRanges(rse_gene)
## In this object the gene names are stored in the "Symbol" variable
i <- which(rowRanges(rse_gene)$Symbol == "SNAP25")
i
## To plot with ggplot2, let's build a small data.frame
df <- data.frame(
    expression = assay(rse_gene)[i, ],
    Dx = rse_gene$PrimaryDx
)
## With the small data.frame in hand, we can make the plot
ggplot(df, aes(y = log2(expression + 0.5), group = Dx, x = Dx)) +
    geom_boxplot() +
    theme_bw(base_size = 20) +
    xlab("Diagnosis") +
    ylab("SNAP25: log2(x + 0.5)")
## https://bioconductor.org/packages/release/bioc/vignettes/scater/inst/doc/overview.html#3_Visualizing_expression_values
scater::plotExpression(
    as(rse_gene, "SingleCellExperiment"),
    features = rownames(rse_gene)[i],
    x = "PrimaryDx",
    exprs_values = "counts",
    colour_by = "BrainRegion",
    xlab = "Diagnosis"
)
if (requireNamespace("plotly", quietly = TRUE)) {
    ## You can install it with
    # install.packages("plotly")
    ## Let's save the output of plotExpression()
    p <- scater::plotExpression(
        as(rse_gene, "SingleCellExperiment"),
        features = rownames(rse_gene)[i],
        x = "PrimaryDx",
        exprs_values = "counts",
        colour_by = "BrainRegion",
        xlab = "Diagnosis"
    )
    ## scater::plotExpression() returns an object of class ggplot
    class(p)
    ## so we can use plotly to create an interactive
    ## version
    plotly::ggplotly(p)
}
## For the statistical model, let's explore the sample information
colnames(colData(rse_gene))
## We can use brain region because we have enough data
table(rse_gene$BrainRegion)
## But we cannot use "Race" because all samples have the same value
table(rse_gene$Race)
## Careful! Here it matters that we used droplevels(rse_gene$PrimaryDx);
## otherwise we would end up with a model that is not _full rank_
mod <- with(
    colData(rse_gene),
    model.matrix(~ PrimaryDx + totalAssignedGene + mitoRate + rRNA_rate + BrainRegion + Sex + AgeDeath)
)
## Let's explore the model interactively
if (interactive()) {
    ## We have to remove columns that have NAs.
    info_no_NAs <- colData(rse_gene)[, c(
        "PrimaryDx", "totalAssignedGene", "rRNA_rate", "BrainRegion", "Sex",
        "AgeDeath", "mitoRate", "Race"
    )]
    ExploreModelMatrix::ExploreModelMatrix(
        info_no_NAs,
        ~ PrimaryDx + totalAssignedGene + mitoRate + rRNA_rate + BrainRegion + Sex + AgeDeath
    )
    ## Let's look at a simpler model without the numeric (continuous) variables
    ## because ExploreModelMatrix shows them as if they were factors
    ## (categorical) instead of continuous
    ExploreModelMatrix::ExploreModelMatrix(
        info_no_NAs,
        ~ PrimaryDx + BrainRegion + Sex
    )
    ## If we add + Race we get errors because Race only has 1 level
    # ExploreModelMatrix::ExploreModelMatrix(
    #     info_no_NAs,
    #     ~ PrimaryDx + BrainRegion + Sex + Race
    # )
}
```
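
The exercise asks us to verify that the model is _full rank_. One common way to check this (a sketch, not necessarily how it was done in class) is to compare the rank of the model matrix with its number of columns. The example below uses made-up `dx` and `sex` factors rather than `rse_gene`, and the helper `is_full_rank()` is hypothetical. It shows how an unused factor level such as "Other" adds an all-zero column that breaks full rank, and how `droplevels()` fixes it.

```r
## Hypothetical sample information: the "Other" level exists but is never used
dx <- factor(
    c("Control", "Bipolar", "Control", "Bipolar", "Control", "Bipolar"),
    levels = c("Control", "Bipolar", "Other")
)
sex <- factor(c("F", "F", "M", "M", "F", "M"))

## Hypothetical helper: a model matrix is full rank when its rank equals
## its number of columns
is_full_rank <- function(mod) {
    qr(mod)$rank == ncol(mod)
}

## With the unused level, model.matrix() creates an all-zero "dxOther" column
mod_unused <- model.matrix(~ dx + sex)
colnames(mod_unused)
is_full_rank(mod_unused)

## After dropping the unused level the matrix is full rank
mod_dropped <- model.matrix(~ droplevels(dx) + sex)
colnames(mod_dropped)
is_full_rank(mod_dropped)
```

Assuming `mod` was built as in the chunk above, the same check would be `qr(mod)$rank == ncol(mod)`.
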

Want more data? We have plenty at LIBD, including http://eqtl.brainseq.org/phase2/.
# R/Bioconductor-powered Team Data Science
<iframe width="560" height="315" src="https://www.youtube.com/embed/33scakbTNO0" title="YouTube video player" frameborder="0" allow="accelerometer; autoplay; clipboard-write; encrypted-media; gyroscope; picture-in-picture" allowfullscreen></iframe>
<script async class="speakerdeck-embed" data-id="3c32410b600740abb4724486e83ebd30" data-ratio="1.77725118483412" src="//speakerdeck.com/assets/embed.js"></script>
# spatialLIBD
<blockquote class="twitter-tweet"><p lang="en" dir="ltr">🔥off the press! 👀 our <a href="https://twitter.com/biorxivpreprint?ref_src=twsrc%5Etfw">@biorxivpreprint</a> on human 🧠brain <a href="https://twitter.com/LieberInstitute?ref_src=twsrc%5Etfw">@LieberInstitute</a> spatial 🌌🔬transcriptomics data 🧬using Visium <a href="https://twitter.com/10xGenomics?ref_src=twsrc%5Etfw">@10xGenomics</a>🎉<a href="https://twitter.com/hashtag/spatialLIBD?src=hash&ref_src=twsrc%5Etfw">#spatialLIBD</a><br><br>🔍<a href="https://t.co/RTW0VscUKR">https://t.co/RTW0VscUKR</a> <br>👩🏾💻<a href="https://t.co/bsg04XKONr">https://t.co/bsg04XKONr</a><br>📚<a href="https://t.co/FJDOOzrAJ6">https://t.co/FJDOOzrAJ6</a><br>📦<a href="https://t.co/Au5jwADGhY">https://t.co/Au5jwADGhY</a><a href="https://t.co/PiWEDN9q2N">https://t.co/PiWEDN9q2N</a> <a href="https://t.co/aWy0yLlR50">pic.twitter.com/aWy0yLlR50</a></p>— 🇲🇽 Leonardo Collado-Torres (@lcolladotor) <a href="https://twitter.com/lcolladotor/status/1233661576433061888?ref_src=twsrc%5Etfw">February 29, 2020</a></blockquote> <script async src="https://platform.twitter.com/widgets.js" charset="utf-8"></script>
<script async class="speakerdeck-embed" data-id="329db23f5f17460da31f45c7695a9f06" data-ratio="1.33333333333333" src="//speakerdeck.com/assets/embed.js"></script>
<script async class="speakerdeck-embed" data-id="c48e671f4c93476489c3d9d679830bca" data-ratio="1.33333333333333" src="//speakerdeck.com/assets/embed.js"></script>
<script async class="speakerdeck-embed" data-id="9e099800589b49c29a99187a6415af91" data-ratio="1.77777777777778" src="//speakerdeck.com/assets/embed.js"></script>
* Recorded talks
This is the version from a webinar for BioTuring that [you can watch on their website](https://bioturing.com/sources/webinar/60752954a433e26dd8affcbd) or on YouTube
<iframe width="560" height="315" src="https://www.youtube.com/embed/S8884Kde-1U" title="YouTube video player" frameborder="0" allow="accelerometer; autoplay; clipboard-write; encrypted-media; gyroscope; picture-in-picture" allowfullscreen></iframe>
Here is an earlier version:
<iframe width="560" height="315" src="https://www.youtube.com/embed/aD2JU-vUv54" title="YouTube video player" frameborder="0" allow="accelerometer; autoplay; clipboard-write; encrypted-media; gyroscope; picture-in-picture" allowfullscreen></iframe>
* Paper: https://www.nature.com/articles/s41593-020-00787-0
* Software: http://research.libd.org/spatialLIBD/ or `r BiocStyle::Biocpkg("spatialLIBD")`
* shiny interface: http://spatial.libd.org/spatialLIBD/
* Book (under construction) where we explain how to use several tools: https://lmweber.org/OSTA-book/
* Pre-print about `SpatialExperiment`: https://www.biorxiv.org/content/10.1101/2021.01.27.428431v1
<blockquote class="twitter-tweet"><p lang="en" dir="ltr">Are you working with spatial transcriptomics data such as Visium from <a href="https://twitter.com/10xGenomics?ref_src=twsrc%5Etfw">@10xGenomics</a>? Then you'll be interested in <a href="https://twitter.com/hashtag/SpatialExperiment?src=hash&ref_src=twsrc%5Etfw">#SpatialExperiment</a> 📦 led by <a href="https://twitter.com/drighelli?ref_src=twsrc%5Etfw">@drighelli</a> <a href="https://twitter.com/lmwebr?ref_src=twsrc%5Etfw">@lmwebr</a> <a href="https://twitter.com/CrowellHL?ref_src=twsrc%5Etfw">@CrowellHL</a> with contributions by <a href="https://twitter.com/PardoBree?ref_src=twsrc%5Etfw">@PardoBree</a> <a href="https://twitter.com/shazanfar?ref_src=twsrc%5Etfw">@shazanfar</a> A Lun <a href="https://twitter.com/stephaniehicks?ref_src=twsrc%5Etfw">@stephaniehicks</a> <a href="https://twitter.com/drisso1893?ref_src=twsrc%5Etfw">@drisso1893</a> 🌟<br><br>📜 <a href="https://t.co/r36qlakRJe">https://t.co/r36qlakRJe</a> <a href="https://t.co/cWIiwLFitV">pic.twitter.com/cWIiwLFitV</a></p>— 🇲🇽 Leonardo Collado-Torres (@lcolladotor) <a href="https://twitter.com/lcolladotor/status/1355208674856329218?ref_src=twsrc%5Etfw">January 29, 2021</a></blockquote> <script async src="https://platform.twitter.com/widgets.js" charset="utf-8"></script>
Brenda Pardo
https://twitter.com/PardoBree
<blockquote class="twitter-tweet"><p lang="en" dir="ltr">Today I gave my first talk at a congress in <a href="https://twitter.com/hashtag/EuroBioc2020?src=hash&ref_src=twsrc%5Etfw">#EuroBioc2020</a> about our work on adapting the package <a href="https://twitter.com/hashtag/spatialLIBD?src=hash&ref_src=twsrc%5Etfw">#spatialLIBD</a> to use VisiumExperiment objects. <a href="https://t.co/U23yE32RWM">pic.twitter.com/U23yE32RWM</a></p>— Brenda Pardo (@PardoBree) <a href="https://twitter.com/PardoBree/status/1338560370382942209?ref_src=twsrc%5Etfw">December 14, 2020</a></blockquote> <script async src="https://platform.twitter.com/widgets.js" charset="utf-8"></script>
<blockquote class="twitter-tweet"><p lang="en" dir="ltr">Our paper describing our package <a href="https://twitter.com/hashtag/spatialLIBD?src=hash&ref_src=twsrc%5Etfw">#spatialLIBD</a> is finally out! 🎉🎉🎉<br><br>spatialLIBD is an <a href="https://twitter.com/hashtag/rstats?src=hash&ref_src=twsrc%5Etfw">#rstats</a> / <a href="https://twitter.com/Bioconductor?ref_src=twsrc%5Etfw">@Bioconductor</a> package to visualize spatial transcriptomics data.<br>⁰<br>This is especially exciting for me as it is my first paper as a first author 🦑.<a href="https://t.co/COW013x4GA">https://t.co/COW013x4GA</a><br><br>1/9 <a href="https://t.co/xevIUg3IsA">pic.twitter.com/xevIUg3IsA</a></p>— Brenda Pardo (@PardoBree) <a href="https://twitter.com/PardoBree/status/1388253938391175173?ref_src=twsrc%5Etfw">April 30, 2021</a></blockquote> <script async src="https://platform.twitter.com/widgets.js" charset="utf-8"></script>
# Community
* https://twitter.com/miR_community
* https://twitter.com/R_LGBTQ
* https://twitter.com/conecta_R
* https://twitter.com/LatinR_Conf
* https://twitter.com/R4DScommunity
* https://twitter.com/RConsortium
* https://twitter.com/rweekly_org
* https://twitter.com/RLadiesGlobal
* https://twitter.com/RLadiesBmore
* https://twitter.com/search?q=%23RLadiesMX&src=typed_query
<blockquote class="twitter-tweet"><p lang="en" dir="ltr">The blog post about the First annual meeting of <a href="https://twitter.com/hashtag/RLadiesMX?src=hash&ref_src=twsrc%5Etfw">#RLadiesMX</a> is ready!! All <a href="https://twitter.com/hashtag/rladies?src=hash&ref_src=twsrc%5Etfw">#rladies</a> chapters in México met for the first time! 🤩 Spread the word and join this amazing community 💜<a href="https://t.co/evY4Tc18rw">https://t.co/evY4Tc18rw</a> Thanks <a href="https://twitter.com/AnaBetty2304?ref_src=twsrc%5Etfw">@AnaBetty2304</a> <a href="https://twitter.com/Averi_GG?ref_src=twsrc%5Etfw">@Averi_GG</a> and <a href="https://twitter.com/josschavezf1?ref_src=twsrc%5Etfw">@josschavezf1</a> for all your work!</p>— RLadies Cuernavaca (@RLadiesCuerna) <a href="https://twitter.com/RLadiesCuerna/status/1355655180751151107?ref_src=twsrc%5Etfw">January 30, 2021</a></blockquote> <script async src="https://platform.twitter.com/widgets.js" charset="utf-8"></script>
* https://twitter.com/Bioconductor
* https://twitter.com/rOpenSci
* https://twitter.com/LIBDrstats
* https://twitter.com/CDSBMexico
<blockquote class="twitter-tweet"><p lang="es" dir="ltr">¡Chequen el nuevo blog post de Erick <a href="https://twitter.com/ErickCuevasF?ref_src=twsrc%5Etfw">@ErickCuevasF</a>! 💯<br><br>Nos describe sus experiencias en <a href="https://twitter.com/hashtag/BioC2020?src=hash&ref_src=twsrc%5Etfw">#BioC2020</a> y <a href="https://twitter.com/hashtag/CDSB2020?src=hash&ref_src=twsrc%5Etfw">#CDSB2020</a><br><br>Además estamos orgullxs de que Erick se unió a la Junta Directiva de la CDSB 🤩🎉<br><br>👀 <a href="https://t.co/uGpgnqXvVM">https://t.co/uGpgnqXvVM</a><a href="https://twitter.com/hashtag/rstatsES?src=hash&ref_src=twsrc%5Etfw">#rstatsES</a> <a href="https://t.co/O2eIbk5YoZ">pic.twitter.com/O2eIbk5YoZ</a></p>— ComunidadBioInfo (@CDSBMexico) <a href="https://twitter.com/CDSBMexico/status/1296920807105540098?ref_src=twsrc%5Etfw">August 21, 2020</a></blockquote> <script async src="https://platform.twitter.com/widgets.js" charset="utf-8"></script>
## From LCG 2021 students
<blockquote class="twitter-tweet"><p lang="en" dir="ltr"><a href="https://twitter.com/fikandata?ref_src=twsrc%5Etfw">@fikandata</a> <a href="https://twitter.com/MqElizabeth?ref_src=twsrc%5Etfw">@MqElizabeth</a> <br><br>Looking for a solid and useful R 📦, we stumbled upon this <br>beauty 🤩 <a href="https://t.co/KR3twAxqRY">https://t.co/KR3twAxqRY</a><br><br>shoutouts to <a href="https://twitter.com/digitalwright?ref_src=twsrc%5Etfw">@digitalwright</a> 👈!!<a href="https://twitter.com/lcolladotor?ref_src=twsrc%5Etfw">@lcolladotor</a> <a href="https://twitter.com/Bioconductor?ref_src=twsrc%5Etfw">@Bioconductor</a></p>— Axel Zagal Norman (@NormanZagal) <a href="https://twitter.com/NormanZagal/status/1364381133878611968?ref_src=twsrc%5Etfw">February 24, 2021</a></blockquote> <script async src="https://platform.twitter.com/widgets.js" charset="utf-8"></script>
<blockquote class="twitter-tweet"><p lang="en" dir="ltr">In today's lesson of bioinformatics course of undergraduate program in genomic sciences <a href="https://twitter.com/lcgunam?ref_src=twsrc%5Etfw">@lcgunam</a> we created our first personal page using <a href="https://twitter.com/seankross?ref_src=twsrc%5Etfw">@seankross</a>' postcards R package with <a href="https://twitter.com/lcolladotor?ref_src=twsrc%5Etfw">@lcolladotor</a> as our instructor. <a href="https://twitter.com/hashtag/rstats?src=hash&ref_src=twsrc%5Etfw">#rstats</a> <a href="https://t.co/sXUSietCZy">https://t.co/sXUSietCZy</a></p>— Angel Castillo (@angelcaztle13) <a href="https://twitter.com/angelcaztle13/status/1364466027682140162?ref_src=twsrc%5Etfw">February 24, 2021</a></blockquote> <script async src="https://platform.twitter.com/widgets.js" charset="utf-8"></script>
<blockquote class="twitter-tweet"><p lang="en" dir="ltr">Wake up <br>Brew some <a href="https://twitter.com/tyulmalcafe?ref_src=twsrc%5Etfw">@tyulmalcafe</a> beans <br>Attend <a href="https://twitter.com/lcolladotor?ref_src=twsrc%5Etfw">@lcolladotor</a> amazing class on visualizing expression data using ISEE <a href="https://twitter.com/FedeBioinfo?ref_src=twsrc%5Etfw">@FedeBioinfo</a> <a href="https://twitter.com/KevinRUE67?ref_src=twsrc%5Etfw">@KevinRUE67</a> <a href="https://twitter.com/CSoneson?ref_src=twsrc%5Etfw">@CSoneson</a> <br>Am I dreaming? <br>Nah! Mug is empty.</p>— Alfredo Varela (@fikandata) <a href="https://twitter.com/fikandata/status/1364669473634983941?ref_src=twsrc%5Etfw">February 24, 2021</a></blockquote> <script async src="https://platform.twitter.com/widgets.js" charset="utf-8"></script>
<blockquote class="twitter-tweet"><p lang="en" dir="ltr">Today I learned how to visualize data in a cool and easy way in <a href="https://twitter.com/lcolladotor?ref_src=twsrc%5Etfw">@lcolladotor</a> 's class. <br><br>ExploreModelMatrix definitely makes your life easier! <br>Shoutouts to:<a href="https://twitter.com/CSoneson?ref_src=twsrc%5Etfw">@CSoneson</a> <a href="https://twitter.com/FedeBioinfo?ref_src=twsrc%5Etfw">@FedeBioinfo</a> <a href="https://twitter.com/mikelove?ref_src=twsrc%5Etfw">@mikelove</a></p>— Axel Zagal Norman (@NormanZagal) <a href="https://twitter.com/NormanZagal/status/1365034931261349889?ref_src=twsrc%5Etfw">February 25, 2021</a></blockquote> <script async src="https://platform.twitter.com/widgets.js" charset="utf-8"></script>
## From you
??