---
title: 'Comparison of Centrality Measures for the midbrain subregion of two populations of prairie voles'
author: "William Ofosu Agyapong"
date: "`r format(Sys.Date(), '%d/%m/%Y')`"
output:
  pdf_document:
    latex_engine: xelatex
    number_sections: yes
header-includes:
- \usepackage{booktabs}
- \usepackage{float}
- \usepackage{setspace}
- \doublespacing
- \usepackage{bm}
- \usepackage{amsmath}
- \usepackage{amssymb}
- \usepackage{amsfonts}
- \usepackage{amsthm}
- \usepackage{fancyhdr}
- \pagestyle{fancy}
- \fancyhf{}
- \rhead{William O. Agyapong}
- \lhead{DS 6336 - Project I Report}
- \cfoot{\thepage}
geometry: margin = 0.8in
fontsize: 11pt
abstract: |
  In this report, I analyze network data obtained from the brains of male prairie voles from Illinois (IL) and Kansas-Illinois (KI) cross-breeds to determine whether the two populations differ significantly in the centrality measures computed from the midbrain subnetwork. Degree, closeness, and eigenvector centralities exhibited statistically significant differences, while the differences in betweenness centrality were not statistically significant. The reticular formation appeared to be the most central part of the midbrain subregion in both the IL and KI populations of prairie voles.
---
```{r setup, include=FALSE}
# Set global options for output rendering
knitr::opts_chunk$set(echo = FALSE, warning = FALSE, message = FALSE, fig.align = "center")
# Load required packages
# library(summarytools)
library(dplyr)
library(tidyr)
library(stringr)
# library(broom)
library(ggplot2)
library(ggpubr) # interface for adding test results
library(patchwork)
library(readxl) # load package for reading excel files directly into R
library(igraph)
library(CINNA)
library(kableExtra)
library(knitr)
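# Set the working directory to this file's location when run interactively in RStudio
# (when rendering, knitr already evaluates chunks in the document's directory)
if (interactive() && rstudioapi::isAvailable()) {
  setwd(dirname(rstudioapi::getSourceEditorContext()$path))
}
# Print numbers with 4 significant digits by default
options(digits = 4)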
# Define a default ggplot theme
theme_set(theme_classic())
# Global settings
# subregion = "Prefrontal Cortex"
subregion = "midbrain"
IL_col <- c("tomato", "gold")
KI_col <- c("tomato", "dodgerblue")
# Import user-defined helper functions
source("custom_functions.R")
```
```{r "Importing data"}
IL_More_Social_raw <- read_excel("UTEPDataExample.xlsx", sheet = 1)
KI_Less_Social_raw <- read_excel("UTEPDataExample.xlsx", sheet = 2)
Defined_Regions <- read_excel("UTEPDataExample.xlsx", sheet = 3)
# save(IL_More_Social_raw, KI_Less_Social_raw, Defined_Regions, file = "net_data.RData")
```
```{r "Data Munging"}
# select region of interest
subregion_df <- Defined_Regions %>%
  filter(Subregion == stringr::str_to_title(subregion)) %>%
  mutate(ROI = gsub("'", "", ROI))
# prepare data for adjacency matrix
IL_More_Social <- IL_More_Social_raw[, -1]
KI_Less_Social <- KI_Less_Social_raw[, -1]
#-- Create graph objects from the adjacency matrices
# graph_from_adjacency_matrix() is the current name for the deprecated graph.adjacency()
IL_net <- graph_from_adjacency_matrix(as.matrix(IL_More_Social), mode = "undirected", weighted = TRUE)
V(IL_net)$name <- gsub("'", "", V(IL_net)$name) # strip stray quotes from vertex names
KI_net <- graph_from_adjacency_matrix(as.matrix(KI_Less_Social), mode = "undirected", weighted = TRUE)
V(KI_net)$name <- gsub("'", "", V(KI_net)$name)
```
# Introduction
## Background
Translational research involving animals offers a valuable opportunity to understand human behaviors that would be impossible to investigate through direct studies on humans. However, the success of translational research depends largely on the use of appropriate animal models, since typical laboratory animals do not exhibit many of the complex social behaviors observed in humans. This is where the prairie vole (*Microtus ochrogaster*) comes in handy: it provides an extraordinary animal model for studying several aspects of social cognition, because prairie voles establish long-term socially monogamous relationships with mates, provide biparental care, and, in natural habitats, commonly display alloparental behavior. The prairie vole model has already contributed to a better understanding of the neurobiology and genetics of social bonding, parental care, social buffering, the effects of early life experience on later social behavior, and socially related depression (Ortiz et al., 2018).
## Objective
Utilizing functional magnetic resonance imaging (fMRI) data from culturally and behaviorally distinct populations of prairie voles generated by Ortiz et al. (2021), this report statistically compares each of four centrality measures (degree, closeness, betweenness, and eigenvector centrality) between two populations of voles (IL and KI). IL denotes male prairie voles from Illinois, and KI denotes male cross-breed offspring of Kansas dams and Illinois sires. The goal is to determine whether the two populations exhibit significant differences in the centrality measures for the **midbrain** subregion networks, which may reflect differences in prosocial behavior.
Because Ortiz et al. (2021) found Illinois male prairie voles to be more sociable than KI males, and because differential connectivity accounts for differential expression of prosociality and aggression, I hypothesized that differences would be observed in at least some of the centrality measures for the midbrain subregion.
<!-- ## Task: -->
<!-- conduct a statistical comparison of each of the centrality measures (degree, closeness, betweenness, eigenvector centrality), comparing the two populations of voles (IL and KI). Use proper statistical methodology and terminology. Justify your statistical approach by citing relevant assumptions, parametric assumptions, or violations. Visualize the graph effectively to make the information as easy to read as possible. -->
## Source of data and data description
The data were provided by Richard J. Ortiz, a Research Technician in the Department of Biological Sciences at UTEP, as an Excel workbook consisting of three main sheets relevant to our analysis. The first sheet contains the IL voles' brain network data in the form of an adjacency matrix with 111 nodes produced using a vole-specific atlas. The second sheet contains similar data for the KI voles, while the third lists all nodes, which are the regions of interest (ROIs), together with their corresponding subregions. Edges in the network data are weighted by the absolute Pearson's correlation coefficients across all node pairs, where a **2.3** threshold was used to avoid weak node connections and zero means no connection between a pair of nodes.
The results in this report are reproducible not only for the `midbrain` subregion but for all the other subregions as well: first, because all the R code generating the results sits side by side with the report text in an accompanying RMarkdown file, and second, through an online web application with which the user can interact to see results in real time. To replicate the results for any other subregion, simply provide the name of that subregion under `Global settings` at the beginning of the RMarkdown file. Alternatively, select any subregion in the online web application, which is available at the link below:
> [https://william-agyapong.shinyapps.io/vbna/](https://william-agyapong.shinyapps.io/vbna/)
All statistical analyses and visualizations were done in the R statistical software.
The rest of the report is organized as follows. Section 2 presents the methods, covering graph theory concepts and the statistical methods used to compare the two populations of voles by centrality measures. In Section 3, we explore the network data and the centrality measures of interest for the midbrain subregion and conduct statistical comparison tests. We conclude the report with a brief discussion in Section 4.
# Methodology
## Graph Theory Concepts
In graph theory, a network is defined, in its simplest form, as a collection of points joined together in pairs by lines, where a point is referred to as a node or vertex and a line is referred to as an edge. In the mathematical literature, a network is also called a graph. Examples include the internet, in which the nodes are computers and the edges are data connections between them; social networks, in which the nodes are people and the edges are social connections of some kind, such as friendship, communication, or collaboration; and biological networks such as neural networks, which consist of connections between neurons in the brain. This report uses data from neural networks in the brains of male prairie voles.
A network can be directed or undirected, and weighted or unweighted. In a directed network each edge has a direction, pointing from one node to another. Edges that connect nodes to themselves are called *self-edges* or *self-loops*, and two or more edges between the same pair of nodes are called *multiedges*. A network that has neither self-edges nor multiedges is called a *simple network*.
### The adjacency matrix
Network data can be represented in different ways, and the *adjacency matrix* is one of the fundamental mathematical representations of a network. When two nodes are directly connected by an edge they are said to be *adjacent*, hence the name *adjacency matrix*. The adjacency matrix $A$ of an undirected simple network is defined as an $n\times n$ matrix with elements $A_{ij}$ such that
$$ A_{ij} = \begin{cases}
1 \quad \text{if there is an edge between nodes i and j,}\\
0 \quad \text{otherwise.}
\end{cases}$$
where $n$ represents the number of nodes or vertices.
The networks considered in this report have edges that represent the strength of the linear relationship between nodes, measured by the absolute value of the Pearson's correlation between pairs of nodes. Such networks are called *weighted networks*; in our case, the elements $A_{ij}$ are equal to the absolute values of the Pearson's correlations.
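To make the weighted representation concrete, the following minimal sketch builds a small weighted, undirected graph from an adjacency matrix with `igraph`. The four node names and the weights are made up for illustration and are not taken from the vole data.

```r
library(igraph)

# Hypothetical 4-node example; names and weights are invented for illustration
rois <- c("A", "B", "C", "D")
W <- matrix(c(0.0, 0.8, 0.0, 0.5,
              0.8, 0.0, 0.6, 0.0,
              0.0, 0.6, 0.0, 0.9,
              0.5, 0.0, 0.9, 0.0),
            nrow = 4, byrow = TRUE, dimnames = list(rois, rois))

# The adjacency matrix of an undirected network must be symmetric
stopifnot(isSymmetric(W))

# Non-zero entries become edges; their values are stored in the "weight" edge attribute
g <- graph_from_adjacency_matrix(W, mode = "undirected", weighted = TRUE, diag = FALSE)

ecount(g)    # number of edges (non-zero entries above the diagonal)
E(g)$weight  # edge weights taken from W
```

A zero entry produces no edge, which mirrors the convention in the data that zero means no connection between a pair of nodes.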
### Centrality Measures
Centrality measures quantify how important or central nodes are in a network, where the definition of importance often depends on the context from which the network was derived. This report considers four popular centrality measures, which are described in the next subsections.
#### Degree Centrality
Degree is the simplest centrality measure for a node in a network: it counts the number of connections (edges) the node has. In directed networks, nodes have both an *in-degree* and an *out-degree*, and both may be useful as measures of centrality in the appropriate circumstances. The out-degree of a node is the number of outgoing edges it directs to other nodes, and the in-degree is the number of edges it receives from other nodes. The node with the highest degree has the most connections to other nodes in the network. In a social network, for example, degree centrality can help us find highly connected or popular individuals, individuals who are likely to hold the most information, or individuals who can quickly connect with the wider network.
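In terms of the adjacency matrix $A$ of an undirected, unweighted simple network, the degree $k_i$ of node $i$ can be written as a row sum:

$$ k_i = \sum_{j=1}^{n} A_{ij} $$

For a weighted network, the same count is obtained by replacing $A_{ij}$ with an indicator that equals 1 when $A_{ij} \neq 0$ and 0 otherwise; summing the weights themselves gives a different quantity, usually called *strength*. As far as I am aware, this matches the distinction between `degree()` and `strength()` in `igraph`.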
#### Betweenness Centrality
Betweenness centrality measures the number of times a node lies on the shortest path between other nodes. A path is a series of adjacent nodes (equivalently, a series of edges that takes us from one node to another). The shortest path between any two nodes is the path with the fewest steps (edges). If a node C is on a shortest path between A and B, then C is important to the efficient flow between A and B: without C, flows would have to take a longer route to get from A to B. Thus, betweenness effectively counts how many shortest paths each node is on. Nodes with high betweenness are key bridges between different parts of a network. The higher a node’s betweenness, the more important it is for the efficient flow in a network.
Betweenness centrality can be very large, so it is often helpful to normalize it by dividing by the maximum and multiplying by some scalar when plotting.
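The following minimal sketch illustrates betweenness and the rescaling described above on a toy path graph. The graph and the scaling factor of 20 are illustrative assumptions, not values taken from the analysis.

```r
library(igraph)

# Toy path graph A - B - C - D (hypothetical, not from the vole data)
g <- graph_from_literal(A - B, B - C, C - D)

# Unweighted betweenness: number of shortest paths passing through each node
b <- betweenness(g, directed = FALSE)
b
# A and D are endpoints, so no shortest path passes through them;
# B and C each lie on two shortest paths (e.g. B is on A-C and A-D)

# Rescale to a convenient plotting range; the factor 20 is an arbitrary choice
v_size <- b / max(b) * 20
v_size
```

Dividing by the maximum puts the most central node at the chosen upper size, so relative differences between nodes remain visible in a plot.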
#### Closeness Centrality
The closeness centrality also makes use of the shortest paths between nodes. We measure the distance between two nodes as the length of the shortest path between them. Farness, for a given node, is the average distance from that node to all other nodes. Closeness is then the reciprocal of farness (1/farness).
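Written as a formula, with $d_{ij}$ denoting the shortest-path distance between nodes $i$ and $j$ in a connected network of $n$ nodes, the definition above reads:

$$ \ell_i = \frac{1}{n-1}\sum_{j \neq i} d_{ij}, \qquad C_i = \frac{1}{\ell_i} = \frac{n-1}{\sum_{j \neq i} d_{ij}} $$

Here $\ell_i$ is the farness and $C_i$ the closeness of node $i$; the symbols are introduced only for this illustration. Software conventions differ: to my knowledge, `closeness()` in `igraph` returns $1/\sum_{j \neq i} d_{ij}$ by default and applies the factor $n-1$ only when `normalized = TRUE`, so reported values may differ from $C_i$ by a constant factor that does not affect the ranking of nodes.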
#### Eigenvector Centrality
Degree centrality only takes into account the number of edges for each node, but it leaves out information about the relative importance of the neighboring nodes. In many circumstances a node’s importance in a network is increased by having connections to other nodes that are themselves important. For instance, if A and B have the same degree centrality, but A is tied to high-degree people and B is tied to low-degree people, then intuitively we want A to have a higher score than B. Eigenvector centrality extends degree centrality in exactly this way: it takes into account not only how well connected a node is, but also how many edges its neighbors have, and so on through the network. In other words, eigenvector centrality awards each node points proportional to the centrality scores of its neighbors.
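The "points proportional to the neighbors' scores" idea can be stated as an equation. Writing $x_i$ for the eigenvector centrality of node $i$ (notation introduced here for illustration) and $A$ for the adjacency matrix:

$$ x_i = \frac{1}{\lambda} \sum_{j=1}^{n} A_{ij}\, x_j \quad \Longleftrightarrow \quad A\mathbf{x} = \lambda \mathbf{x} $$

where $\lambda$ is the largest eigenvalue of $A$ and $\mathbf{x}$ is the corresponding eigenvector. Because eigenvectors are only defined up to a constant multiple, the scores are usually rescaled; as far as I know, `eigen_centrality()` in `igraph` rescales them by default so that the largest score equals 1.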
### Other concepts
Another measure of how interconnected a network is, is the **average path length**. This is computed as the mean of the lengths of the shortest paths between all pairs of vertices in the network. The length of the longest of these shortest paths is called the **diameter** of the network. The **density** of a network is the proportion of edges that actually exist out of the total possible edges that could be formed. This also indicates how interconnected a network is.
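These three quantities can be checked by hand on a very small graph. The sketch below uses a hypothetical, unweighted four-node path graph, not the vole networks, together with the same `igraph` functions used later in the report.

```r
library(igraph)

# Toy path graph A - B - C - D (hypothetical, unweighted)
g <- graph_from_literal(A - B, B - C, C - D)

edge_density(g)                     # 3 existing edges out of choose(4, 2) = 6 possible
mean_distance(g, directed = FALSE)  # mean shortest-path length over all 6 node pairs
diameter(g, directed = FALSE)       # longest shortest path (from A to D)
```

The six pairwise distances are 1, 2, 3, 1, 2, and 1, so the average path length is 10/6 and the diameter is 3.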
## Statistical test for comparison
I used the Wilcoxon signed rank test to assess whether the centrality measures differ between the IL and KI populations. It is a non-parametric counterpart of the paired samples t-test that compares medians rather than means. The distributions of the various centrality measures are generally skewed with some extreme values (outliers), suggesting that the distributions are not normal, so to be on the safe side, the Wilcoxon signed rank test was used instead of the paired t-test. A further justification is that normality tests are sensitive to sample size: small samples most often pass normality tests, and since the sample size of 12 for the midbrain subregion is small, there is a high chance that a normality test would suggest using the paired t-test when in fact the underlying distribution is not normal. In contrast, non-parametric tests do not assume any particular distribution for the dependent variable.
One assumption inherent in the use of the Wilcoxon signed rank test is that the data from the two groups of voles are dependent, or paired. Here, it is reasonable to assume that the centrality measures for the two populations are paired, since the KI voles are offspring of both Illinois and Kansas parents. Another assumption is independence, which requires that the paired samples are randomly and independently drawn from their respective populations. In general this assumption is difficult to assess, so we reasonably assume that the voles were randomly selected from their respective populations for the study that yielded the network data. The Wilcoxon test also requires the distributions for the two groups to be similar in shape, which appears to be satisfied in our case since the distributions of the centrality measures for the two populations were found to be skewed in the same direction.
The null hypothesis for the Wilcoxon signed rank test is that the true median difference between the paired samples is zero (0), while the alternative hypothesis is that the true median difference between the paired samples is not equal to zero.
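As a minimal sketch of how such a test can be run in base R, the example below applies `wilcox.test()` with `paired = TRUE` to made-up scores for eight matched nodes. The vectors `il` and `ki` are hypothetical and are not the centrality values analyzed in this report.

```r
# Hypothetical centrality scores for 8 matched nodes (made-up numbers)
il <- c(12, 15, 9, 20, 14, 11, 18, 21)
ki <- c(10, 11, 8, 13, 17, 6, 12, 13)

# Paired, two-sided Wilcoxon signed rank test of H0: median difference = 0
res <- wilcox.test(il, ki, paired = TRUE, alternative = "two.sided", conf.int = TRUE)

res$statistic  # V: sum of the ranks of the positive differences
res$p.value    # compare with the chosen significance level, e.g. 0.05
res$conf.int   # confidence interval for the (pseudo)median of the differences

# Decision rule at the 5% level
reject_h0 <- res$p.value < 0.05
```

The made-up differences have no ties or zeros, so R can compute an exact p-value; with ties it falls back to a normal approximation and issues a warning.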
<!-- -->
# Analysis and Results
## Network Exploration
```{r}
measures <- c("Number of Vertices", "Number of Edges", "Density (edges)", "Average path length", "Diameter")
IL_measures <- c(vcount(IL_net), gsize(IL_net), edge_density(IL_net), mean_distance(IL_net, directed = F), diameter(IL_net, directed = F))
KI_measures <- c(vcount(KI_net), gsize(KI_net), edge_density(KI_net), mean_distance(KI_net, directed = F), diameter(KI_net, directed = F))
# display results
data.frame(measures, IL_measures, KI_measures) %>%
  kable(booktabs = TRUE, linesep = "", col.names = c("Characteristic", "IL Network", "KI Network"), align = "lcc",
        caption = "Characteristics of the two full networks") %>%
  kable_paper() %>%
  kable_styling(latex_options = c("HOLD_position"), full_width = FALSE)
```
### Full network with all available edges (connections)
```{r "whole graph"}
par(mfrow= c(1,2))
pgraph(IL_net, main = "IL Voles Network", color = IL_col, vertex.label = NA, legend = T)
pgraph(KI_net, main = "KI Voles Network", color = KI_col, vertex.label = NA, legend = T)
```
The two graphs above (from left to right) display the full networks for the voles from the two populations, including all subregions and all available connections among the `111` vertices. There are a total of `680` edges in the IL network and `465` in the KI network. The twelve vertices belonging to the `midbrain` subregion, which are the main focus of the report, are marked in red on both graphs. **It can be seen that the IL network appears more densely connected than the KI network**.
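As a rough check on this visual impression, the densities can be worked out by hand, assuming both networks are simple undirected graphs (no self-loops or repeated edges) on the same 111 vertices:

$$ \binom{111}{2} = \frac{111 \times 110}{2} = 6105 \ \text{possible edges} $$

$$ \text{density}_{IL} = \frac{680}{6105} \approx 0.111, \qquad \text{density}_{KI} = \frac{465}{6105} \approx 0.076 $$

Under that assumption, roughly 11% of all possible connections are present in the IL network compared with about 8% in the KI network.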
### Full network with edges for only the nodes of interest and their connections
```{r "whole graph 2"}
# Creating sub networks
IL_net2 <- get_subgraph(IL_net)
KI_net2 <- get_subgraph(KI_net)
par(mfrow = c(1,2))
pgraph(IL_net2, legend = T, color = IL_col, vertex.label = NA)
pgraph(KI_net2, legend = T, color = KI_col, vertex.label = NA)
```
These two graphs draw attention to the midbrain subnetwork within the full network: they show the connections among the midbrain vertices as well as their connections to vertices in other subregions. There are 160 such connections for IL and 95 for KI.
### Exploring the midbrain subregion networks
```{r "subregion abbr"}
# include centrality measures
subregion_df %>%
  relocate(Subregion, .before = "ROI") %>%
  mutate(Notation = abbr(ROI)) %>%
  mutate(d1 = degree(IL_net)[ROI],
         d2 = degree(KI_net)[ROI],
         c1 = closeness(IL_net)[ROI],
         c2 = closeness(KI_net)[ROI],
         b1 = betweenness(IL_net)[ROI],
         b2 = betweenness(KI_net)[ROI],
         e1 = eigen_centrality(IL_net)$vector[ROI],
         e2 = eigen_centrality(KI_net)$vector[ROI]
  ) %>%
  kable(booktabs = TRUE, linesep = "", col.names = c("Subregion", "ROI", "Abbr", rep(c("IL", "KI"), times = 4)), align = "lllcccccccc",
        caption = "Vertex labels lookup table with centrality measures") %>%
  add_header_above(c("", "", "", "Degree" = 2, "Closeness" = 2, "Betweenness" = 2, "Eigenvector" = 2)) %>%
  kable_paper() %>%
  kable_styling(latex_options = c("HOLD_position"), full_width = FALSE)
# abbreviate(str_to_title(V(IL_net)$name), 3, method = "both")
```
The above table is provided mainly to link the abbreviated vertex names to their full names for clarity in the subsequent graphs. The centrality measures are appended as additional information.
We explore the `Midbrain` subregion at length in the network graphs that follow. To highlight the most important vertices, the size of each midbrain vertex is proportional to its value of the centrality measure in question, which allows clear visual comparison. Vertices from other subregions to which the midbrain vertices are connected are de-emphasized by fixing their sizes to a small constant. The graphs on the left correspond to IL prairie voles and those on the right to KI prairie voles.
**Emphasizing Degree Centrality**
```{r "midbrain subregion"}
# create subnetworks with unwanted vertices deleted
IL_sub_net <- get_subgraph(IL_net, dvwne = T)
# V(IL_sub_net)$color <- ifelse(colnames(IL_More_Social) %in% subregion_df$ROI, "tomato", "gold") #
KI_sub_net <- get_subgraph(KI_net, dvwne = T)
# V(KI_net2)$color <- ifelse(colnames(KI_Less_Social) %in% subregion_df$ROI, "tomato", "dodgerblue")
par(mfrow = c(1,2))
# pgraph(IL_sub_net, legend = T, v.size = degree(IL_sub_net), color = IL_col)
# pgraph(KI_sub_net, legend = T, v.size = degree(KI_sub_net), color = KI_col)
psubgraph(IL_sub_net, centrality = "degree", color = IL_col)
psubgraph(KI_sub_net, centrality = "degree", color = KI_col)
# plot(IL_sub_net, vertex.label = abbr(V(IL_sub_net)$name),
# vertex.size = degree(IL_sub_net))
# plot(KI_sub_net, vertex.label = abbr(V(KI_sub_net)$name),
# vertex.size = degree(KI_sub_net),
# layout = layout_on_grid(KI_sub_net))
#
# plot(KI_sub_net, vertex.label = abbr(V(KI_sub_net)$name),
# vertex.size = betweenness(KI_sub_net)/max(betweenness(KI_sub_net))*20,
# layout = layout_on_grid(KI_sub_net))
```
**Emphasizing Closeness Centrality**
```{r eval=T}
par(mfrow = c(1,2))
# pgraph(IL_sub_net, legend = T, v.size = closeness(IL_sub_net), color = IL_col, vertex.label=NA)
# pgraph(KI_sub_net, legend = T, v.size = closeness(KI_sub_net), color = KI_col)
psubgraph(IL_sub_net, centrality = "closeness", color = IL_col)
psubgraph(KI_sub_net, centrality = "closeness", color = KI_col)
```
**Emphasizing Betweenness Centrality**
```{r eval=T}
par(mfrow = c(1,2))
psubgraph(IL_sub_net, centrality = "betweenness", color = IL_col)
psubgraph(KI_sub_net, centrality = "betweenness", color = KI_col)
```
**Emphasizing Eigenvector Centrality**
```{r eval=T}
par(mfrow = c(1,2))
psubgraph(IL_sub_net, centrality = "eigenvector", color = IL_col)
psubgraph(KI_sub_net, centrality = "eigenvector", color = KI_col)
```
From the table and the graphs above, the **reticular formation** is the most important brain part in terms of degree, closeness, and eigenvector centralities for both populations of voles under consideration. For betweenness centrality, the **lemniscal nucleus** is the most important node for IL, while the **reticular formation** remains the most important node for KI.
### Comparison between distributions of centrality Measures
```{r "centrality distr"}
(compare_distrib(IL_net, KI_net, "degree") +
compare_distrib(IL_net, KI_net, "degree", plt = "density")) /
(compare_distrib(IL_net, KI_net, "closeness") +
compare_distrib(IL_net, KI_net, "closeness", plt = "density")
)
# theme(legend.position = c(0.3,0.87))
```
```{r "centrality distr 2"}
(compare_distrib(IL_net, KI_net, "betweenness") +
compare_distrib(IL_net, KI_net, "betweenness", plt = "density")) /
(compare_distrib(IL_net, KI_net, "eigenvector") +
compare_distrib(IL_net, KI_net, "eigenvector", plt = "density"))
```
The boxplots provide five-number summaries: the median, 1st quartile, 3rd quartile, minimum, and maximum values. While the boxplots help us see the median (center) clearly, the density plots help reveal the shape of the distributions. All the distributions are skewed (neither symmetric nor normal). The individual points on the boxplots indicate the presence of extreme values or outliers. For skewed distributions with outliers like these, it is best to report the median and interquartile range (or range) as measures of center and spread (variability), respectively.
In general, the medians of the centrality measures for the IL group are considerably larger than those for the KI group. Interestingly, however, the differences are much bigger for degree and closeness. The distributions of betweenness look very similar for the two populations. The heights of the boxes and the widths of the density curves give us an idea of how much the measurements vary, from which we see that there is more variability in the distributions for the KI group than for the IL group.
## Statistical test of differences between distributions
In the previous section, noticeable differences were observed between the distributions of the four centrality measures under discussion. An important question is whether the observed differences are due to chance alone or reflect real differences between our populations of interest. In other words, is the difference in a given centrality measure between the two populations of voles statistically significant? To answer this question, we used the Wilcoxon signed rank test, a non-parametric test. The p-values for the tests that were performed are shown in red on the plots below. To decide whether to reject or fail to reject the null hypothesis ($H_0$) that the true median difference between the two populations is zero (0), we used a significance level of $5\%$: we reject $H_0$ if the p-value is less than $0.05$, and fail to reject it otherwise.
```{r "Test results"}
# install.packages("ggpubr")
compare_distrib(IL_net, KI_net, "degree", annotate.test = T) +
compare_distrib(IL_net, KI_net, "closeness", annotate.test = T) +
compare_distrib(IL_net, KI_net, "betweenness", annotate.test = T) +
compare_distrib(IL_net, KI_net, "eigenvector", annotate.test = T)
# +
# stat_compare_means(comparisons = list(c("Illinios", "Kansas")),
# vjust = 2, color = "red")
```
The p-values of the tests for degree, closeness, and eigenvector centrality are all less than $0.05$. We therefore conclude that the median degree, closeness, and eigenvector centralities for the Illinois male voles are significantly different from those for the KI male voles. On the other hand, the p-value of 0.45 for betweenness is greater than our significance level of $5\%$, so we do not have sufficient evidence that betweenness centrality differs between the two populations of voles.
Tables of the statistical test results, including the test statistics, p-values, and confidence intervals, can be found in the appendix. The confidence intervals, which indicate the magnitude of the differences (effect size), lead to the same conclusions as above, since the confidence interval for betweenness centrality is the only one that contains 0.
```{r "degree"}
IL_deg <- degree(IL_net)
IL_midbrain_deg <- IL_deg[subregion_df$ROI]
KI_deg <- degree(KI_net, mode = "all")
KI_midbrain_deg <- KI_deg[subregion_df$ROI]
# confirm with example in excel file: differences observed
# amygdala <- Defined_Regions %>% filter(Subregion=="Amygdala")
# IL_deg[amygdala$ROI]
# KI_deg[amygdala$ROI]
# IL_amygdala <- IL_More_Social_raw %>%
# filter(`...1` %in% amygdala$ROI) %>%
# select(contains(amygdala$ROI))
# amy_net <- graph.adjacency(as.matrix(IL_amygdala), mode = "directed", weighted = T)
# plot(amy_net)
# degree(amy_net, mode = "all")
# degree distribution
# deg.dist <- degree_distribution(IL_net, cumulative=T, mode="all")
#
# plot( x=0:max(IL_deg), y=1-deg.dist, pch=19, cex=1.2, col="orange",
#
# xlab="Degree", ylab="Cumulative Frequency")
```
# Discussion
The distributions of the centrality measures were found to depart substantially from normality due to skewness and the presence of some extreme values. This served as grounds for using the Wilcoxon signed rank test instead of the paired two-sample t-test. According to the results of the Wilcoxon signed rank tests, we have sufficient evidence to support the claim that the IL and KI male prairie voles differ significantly in degree centrality, closeness centrality, and eigenvector centrality when we consider only the **midbrain** subregion. Given that three of the four centrality measures suggested differences between the midbrain regions of the two populations of voles, it is plausible that IL prairie voles exhibit prosocial behaviors and aggression that differ from those exhibited by their KI counterparts, since parts of the midbrain such as the *dorsal raphe* and the *tegmental nucleus* contribute to the expression of prosocial behaviors.
# References
- Ashtiani, M., Salehzadeh-Yazdi, A., Razaghi-Moghadam, Z., Hennig, H., Wolkenhauer, O., Mirzaie, M., & Jafari, M. (2017). A systematic survey of centrality measures for protein-protein interaction networks. bioRxiv, 149492.
- Ortiz, R. J., Wagler, A. E., Yee, J. R., Kulkarni, P. P., Cai, X., Ferris, C. F., & Cushing, B. S. (2021). Functional connectivity differences between two culturally distinct prairie vole populations: insights into the prosocial network. Biological Psychiatry: Cognitive Neuroscience and Neuroimaging. ISSN 2451-9022. https://doi.org/10.1016/j.bpsc.2021.11.007
- Ortiz, J. J., Portillo, W., Paredes, R. G., et al. (2018). Resting state brain networks in the prairie vole. Scientific Reports, 8, 1231. https://doi.org/10.1038/s41598-017-17610-9
- Network Analysis in R by Dai Shizuka: https://dshizuka.github.io/networkanalysis/tutorials.html
- Centrality measures: https://cambridge-intelligence.com/keylines-faqs-social-network-analysis/
- https://bookdown.org/markhoff/social_network_analysis/your-first-network.html
- https://kateto.net/netscix2016.html
# Appendix
**Wilcoxon signed rank test results**
```{r}
test_diff(IL_net, KI_net, "degree")
test_diff(IL_net, KI_net, "closeness")
test_diff(IL_net, KI_net, "betweenness")
test_diff(IL_net, KI_net, "eigenvector")
```