---
title             : "A Short Demo of R"
shorttitle        : "Demo"
author:
  - name          : "Hu Chuan-Peng"
    affiliation   : "1,2"
    corresponding : yes    # Define only one corresponding author
    address       : "#122 Ninghai Road, Gulou District, Nanjing, China"
    email         : "hcp4715@hotmail.com"
    role: # Contributorship roles (e.g., CRediT, https://credit.niso.org/)
      - "Conceptualization"
      - "Writing - Original Draft Preparation"
      - "Writing - Review & Editing"
      - "Supervision"
  - name          : "Wen Jia Hui"
    affiliation   : "1"
    role:
      - "Writing - Review & Editing"
affiliation:
  - id            : "1"
    institution   : "Nanjing Normal University"
  - id            : "2"
    institution   : "Chinese Open Science Network"
authornote: |
  This is a demonstration of *papaja*.
  Enter author note here.
abstract: |
  One or two sentences providing a **basic introduction** to the field, comprehensible to a scientist in any discipline.
  Two to three sentences of **more detailed background**, comprehensible to scientists in related disciplines.
  One sentence clearly stating the **general problem** being addressed by this particular study.
  One sentence summarizing the main result (with the words "**here we show**" or their equivalent).
  Two or three sentences explaining what the **main result** reveals in direct comparison to what was thought to be the case previously, or how the main result adds to previous knowledge.
  One or two sentences to put the results into a more **general context**.
  Two or three sentences to provide a **broader perspective**, readily comprehensible to a scientist in any discipline.
  <!-- https://tinyurl.com/ybremelq -->
keywords          : "R, Teaching"
wordcount         : "X"
bibliography      : "r-references.bib"
floatsintext      : no
linenumbers       : yes
draft             : no
mask              : no
figurelist        : no
tablelist         : no
footnotelist      : no
classoption       : "man"
output            : papaja::apa6_pdf
---
```{r setup, include = FALSE}
if (!requireNamespace("pacman", quietly = TRUE))
{
install.packages("pacman")
} # Check whether pacman is installed; install it if it is missing
pacman::p_load("papaja","here", "tidyverse","report")
r_refs("r-references.bib")
```
# Introduction
R is a powerful programming language for statistical analyses and more. We can use R for the whole workflow after getting our raw data, from pre-processing to the final manuscript!
Here we demonstrate how to use `papaja` to prepare a manuscript in APA 6th edition style.
```{r analysis-preferences}
# Seed for random number generation
set.seed(42)
knitr::opts_chunk$set(cache.extra = knitr::rand_seed)
```
# Methods
We report how we determined our sample size, all data exclusions (if any), all manipulations, and all measures in the study. <!-- 21-word solution (Simmons, Nelson & Simonsohn, 2012; retrieved from http://ssrn.com/abstract=2160588) -->
```{r pre-processing data, message=FALSE, warning=FALSE}
# Path to the raw data
dataDir <- here::here("data", "match")
# df_raw_match <- dataDir %>%
# Get the file names, keeping only files whose names contain "rep_match"
flist <- list.files(dataDir, pattern = "rep_match.*\\.out", full.names = TRUE)
# Read and merge the data with a plain for loop
# rm(df.L)
for (file in flist){
  temp_dataset <- read.table(file, header=TRUE, sep="", stringsAsFactors=FALSE)
  if (!exists("df.L")){
    # if the merged df.L doesn't exist, create it from the first file
    df.L <- temp_dataset
  } else {
    # if the merged df.L does exist, append to it
    df.L <- rbind(df.L, temp_dataset)
  }
  rm(temp_dataset)
}
# Alternative approach
# do.call("rbind",lapply(as.list(flist),FUN=function(file){read.table(file, header=TRUE, sep="",stringsAsFactors=F)}))
# render the data in numeric format for future analysis
cols.num <- c('Sub',"Age","Block","Bin","Trial","RT","ACC")
df.L[cols.num] <- sapply(df.L[cols.num], as.numeric)
# remove trials without a response
df.L <- df.L[!is.na(df.L$ACC),]
# get the independent variables
# moral valence
df.L$Morality[grepl("moral", df.L$Shape, fixed=TRUE)] <- "Good"
df.L$Morality[grepl("immoral", df.L$Shape, fixed=TRUE)] <- "Bad"
# self-referential
df.L$Identity[grepl("Self", df.L$Shape, fixed=TRUE)] <- "Self"
df.L$Identity[grepl("Other", df.L$Shape, fixed=TRUE)] <- "Other"
# rename columns
colnames(df.L)[colnames(df.L)=="Sub"] <- "Subject"
# order the variables
df.L$Morality <- factor(df.L$Morality, levels = c("Good","Bad"))
df.L$Identity <- factor(df.L$Identity, levels = c("Self","Other"))
df.L$Match <- factor(df.L$Match, levels = c("match","mismatch"))
# change all gender from number to string
df.L$Sex[df.L$Sex == '1'] <- 'male'
df.L$Sex[df.L$Sex == '2'] <- 'female'
# optional
# write.csv(df.L, paste(dataDir,'df_matching_raw.csv', sep = "/"),row.names = F)
### Rule 1: wrong trials numbers because of procedure errors
excldSub1 <- df.L %>%
dplyr::mutate(ACC = ifelse(ACC == 1, 1, 0)) %>% # no response as wrong
dplyr::group_by(Subject, Match, Identity,Morality) %>%
dplyr::summarise(N = length(ACC)) %>% # count the trial # for each condition of each subject
dplyr::ungroup() %>%
dplyr::filter(N != 75) %>% # filter the rows that trial Number is not 75
dplyr::distinct(Subject) %>% # find the unique subject ID
dplyr::pull(Subject) # pull the subj ID as vector
### Rule 2: overall accuracy < 0.5
excldSub2 <- df.L %>%
dplyr::mutate(ACC = ifelse(ACC == 1, 1, 0)) %>% # no response as wrong
dplyr::group_by(Subject) %>%
dplyr::summarise(N = length(ACC),
countN = sum(ACC),
ACC = sum(ACC)/length(ACC)) %>% # overall accuracy of each subject
dplyr::ungroup() %>%
dplyr::filter(ACC < .5) %>% # keep the subjects with overall ACC < 0.5
dplyr::distinct(Subject) %>% # find the unique subject ID
dplyr::pull(Subject) # pull the subj ID as vector
### Rule 3: one condition with zero ACC
excldSub3 <- df.L %>%
dplyr::mutate(ACC = ifelse(ACC == 1, 1, 0)) %>% # no response as wrong
dplyr::group_by(Subject, Match, Identity,Morality) %>%
dplyr::summarise(N = length(ACC),
countN = sum(ACC),
ACC = sum(ACC)/length(ACC)) %>% # count the trial # for each condition of each subject
dplyr::ungroup() %>%
dplyr::filter(ACC == 0) %>% # keep the conditions with zero accuracy
dplyr::distinct(Subject) %>% # find the unique subject ID
dplyr::pull(Subject) # pull the subj ID as vector
# IDs of all excluded subjects (a subject may violate more than one rule)
excldSubs <- unique(c(excldSub1, excldSub2, excldSub3))
df.L.V <- df.L %>%
# dplyr::mutate(ACC = ifelse(ACC == 1, 1, 0)) %>% # no response as wrong
dplyr::filter(!Subject %in% excldSubs) # exclude the invalid subjects
# check that the number of participants is correct
# length(unique(df.L.V$Subject)) + length(excldSubs) == length(unique(df.L$Subject))
# exclude correct trials with RT <= 200 ms (RT is stored in seconds)
ratio.excld.trials <- nrow(df.L.V[df.L.V$RT*1000 <= 200 & df.L.V$ACC == 1,])/nrow(df.L.V) # ratio of invalid trials
df.L.V <- df.L.V %>% dplyr::filter(!(RT <= 0.2 & ACC==1)) # filter invalid trials
## Basic information of the data ####
df.L.T.basic <- df.L %>%
dplyr::select(Subject, Age, Sex) %>%
dplyr::distinct(Subject, Age, Sex) %>%
dplyr::summarise(subj_N = length(Subject),
female_N = sum(Sex == 'female'),
male_N = sum(Sex == 'male'),
Age_mean = round(mean(Age),2),
Age_sd = round(sd(Age),2))
# valid data for the matching task
df.L.V.basic <- df.L.V %>%
dplyr::select(Subject, Age, Sex) %>%
dplyr::distinct(Subject, Age, Sex) %>%
dplyr::summarise(subj_N = length(Subject),
female_N = sum(Sex == 'female'),
male_N = sum(Sex == 'male'),
Age_mean = round(mean(Age),2),
Age_sd = round(sd(Age),2))
```
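The pre-processing chunk mentions an alternative to the `for` loop: read every file into a list and bind the list once. The sketch below illustrates that idea on made-up files. The folder `match_demo`, the file names, and the columns are hypothetical stand-ins for the real data in `data/match`.

```r
# Hypothetical demo files written to a temporary folder (not the real data)
demo_dir <- file.path(tempdir(), "match_demo")
dir.create(demo_dir, showWarnings = FALSE)
for (s in 1:3) {
  demo <- data.frame(Sub = s, Trial = 1:2, RT = c(0.51, 0.62), ACC = c(1, 0))
  write.table(demo, file.path(demo_dir, sprintf("data_rep_match_%d.out", s)),
              row.names = FALSE, quote = FALSE)
}

# Collect the file names, as in the chunk above
demo_files <- list.files(demo_dir, pattern = "rep_match.*\\.out", full.names = TRUE)

# Read each file into a list element, then bind all elements in one step
demo_list <- lapply(demo_files, read.table, header = TRUE, sep = "",
                    stringsAsFactors = FALSE)
demo_all  <- do.call(rbind, demo_list)
```

Binding once avoids growing a data frame inside a loop and does not depend on whether an object already exists in the workspace.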
## Participants
We recruited `r df.L.T.basic$subj_N` participants (`r df.L.T.basic$female_N` females, age = `r df.L.T.basic$Age_mean` $\pm$ `r df.L.T.basic$Age_sd`). ....
## Material
We used *PsychoPy 3* to present stimuli and collect participants' responses. ...
## Procedure
We followed the procedure of Sui et al. (2012).
```{r calculate dprime}
# label each trial as a hit, correct rejection (CR), miss, or false alarm (FA)
df.L.V$sdt <- NA
for (i in 1:nrow(df.L.V)){
if (df.L.V$ACC[i] == 1 & df.L.V$Match[i] == "match"){
df.L.V$sdt[i] <- "hit"
} else if (df.L.V$ACC[i] == 1 & df.L.V$Match[i] == "mismatch" ){
df.L.V$sdt[i] <- "CR"
} else if (df.L.V$ACC[i] == 0 & df.L.V$Match[i] == "match"){
df.L.V$sdt[i] <- "miss"
} else if (df.L.V$ACC[i] == 0 & df.L.V$Match[i] == "mismatch" ){
df.L.V$sdt[i] <- "FA"
}
}
# calculate the number of each for each condition
df.L.V.SDT <- df.L.V %>%
dplyr::group_by(Subject,Age, Sex, Morality, Identity,sdt) %>%
dplyr::summarise(N = length(sdt)) %>%
dplyr::filter(!is.na(sdt)) # no NAs
df.L.V.SDT_w <- reshape2::dcast(df.L.V.SDT, Subject + Age + Sex + Morality + Identity ~ sdt,value.var = "N")
df.L.V.SDT_w <- df.L.V.SDT_w %>%
dplyr::mutate(miss = ifelse(is.na(miss),0,miss), # no miss trials: set the count to 0
FA = ifelse(is.na(FA),0,FA), # no FA trials: set the count to 0
hitR = hit/(hit + miss), # calculate the hit rate
faR = FA/(FA+CR)) # calculate the FA rate
# one way to deal with the extreme values
for (i in 1:nrow(df.L.V.SDT_w)){
if (df.L.V.SDT_w$hitR[i] == 1){
df.L.V.SDT_w$hitR[i] <- 1 - 1/(2*(df.L.V.SDT_w$hit[i] + df.L.V.SDT_w$miss[i]))
}
}
for (i in 1:nrow(df.L.V.SDT_w)){
if (df.L.V.SDT_w$faR[i] == 0){
df.L.V.SDT_w$faR[i] <- 1/(2*(df.L.V.SDT_w$FA[i] + df.L.V.SDT_w$CR[i]))
}
}
# define the d prime function
dprime <- function(hit,fa) {
qnorm(hit) - qnorm(fa)
}
# calculate the d prime for each condition
df.L.V.SDT_w$dprime <- mapply(dprime, df.L.V.SDT_w$hitR,df.L.V.SDT_w$faR)
# ANOVA
```
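The *d* prime computation above can also be written as one vectorised function. The sketch below is only an illustration: `dprime_corrected` is a hypothetical helper, not part of the original analysis. It also adjusts a hit rate of 0 and a false-alarm rate of 1, which the chunk above does not handle, so treat that part as an assumption.

```r
# Hypothetical helper: d' from trial counts, with the 1/(2N) adjustment for extreme rates
dprime_corrected <- function(hit, miss, fa, cr) {
  n_signal <- hit + miss   # number of "match" trials
  n_noise  <- fa + cr      # number of "mismatch" trials
  hit_rate <- hit / n_signal
  fa_rate  <- fa / n_noise
  # Rates are multiples of 1/N, so clamping to [1/(2N), 1 - 1/(2N)]
  # changes only rates that are exactly 0 or 1
  hit_rate <- pmin(pmax(hit_rate, 1 / (2 * n_signal)), 1 - 1 / (2 * n_signal))
  fa_rate  <- pmin(pmax(fa_rate, 1 / (2 * n_noise)), 1 - 1 / (2 * n_noise))
  qnorm(hit_rate) - qnorm(fa_rate)
}

# Usage with made-up counts: a perfect performer and an ordinary one
d_example <- dprime_corrected(hit = c(75, 50), miss = c(0, 25),
                              fa = c(0, 25), cr = c(75, 50))
```

Because the function works on whole vectors, it can be applied to the count columns directly, without a row-wise loop or `mapply()`.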
## Data analysis
We used `r cite_r("r-references.bib")` for all our analyses.
# Results
```{r 'dprime', fig.cap="*d* prime.", fig.height=4.5, fig.width=5, warning=FALSE}
# prepare the data for plotting
df.plot <- df.L.V.SDT_w %>%
dplyr::mutate(Morality =factor(Morality, levels = c('Good', 'Bad')),
# create an extra column for plotting the individual data across conditions.
Conds = dplyr::case_when(
(Morality == 'Good' & Identity == 'Self') ~ "0.8",
(Morality == 'Good' & Identity == 'Other') ~ "1.2",
(Morality == 'Bad' & Identity == 'Self') ~ "1.8",
(Morality == 'Bad' & Identity == 'Other') ~ "2.2"),
Conds = as.numeric(Conds),
Conds_j = jitter(Conds, amount=.06)
)
# get the summary data
df.plot.sum_p <- df.plot %>%
dplyr::group_by(Morality,Identity) %>%
dplyr::summarise(d_mean = mean(dprime),
d_sd = sd(dprime),
se = sd(dprime)/sqrt(n())) %>%
dplyr::ungroup()
pd1 <- position_dodge(0.8)
scaleFUN <- function(x) sprintf("%.2f", x)
scales_y <- list(
# RT = scale_y_continuous(limits = c(400, 900)),
dprime = scale_y_continuous(labels=scaleFUN)
)
p_df_sum <- df.plot %>% # dplyr::filter(DVs== 'RT') %>%
ggplot(., aes(x = Morality, y = dprime)) +
geom_line(aes(x = Conds_j, y = dprime, group = Subject), # link individual's points by transparent grey lines
linetype = 1, size = 0.8, colour = "#000000", alpha = 0.03) +
geom_point(aes(x = Conds_j, y = dprime, group = Subject, colour = as.factor(Identity)), # plot individual points
#colour = "#000000",
size = 3, shape = 20, alpha = 0.15) +
geom_line(data = df.plot.sum_p, aes(x = as.numeric(Morality), # plot the group means
y = d_mean,
group = Identity,
colour = as.factor(Identity)),
linetype = 1, position = pd1, size = 2) +
geom_point(data = df.plot.sum_p, aes(x = as.numeric(Morality), # group mean
y = d_mean,
group = Identity,
colour = as.factor(Identity)),
shape = 18, position = pd1, size = 6) +
geom_errorbar(data = df.plot.sum_p, aes(x = as.numeric(Morality), # group error bar.
y = d_mean, group = Identity,
colour = as.factor(Identity),
ymin = d_mean - 1.96*se,
ymax = d_mean + 1.96*se),
width = .05, position = pd1, size = 2, alpha = 0.75) +
scale_colour_brewer(palette = "Dark2") +
scale_x_continuous(breaks=c(1, 2),
labels=c("Good", "Bad")) +
scale_fill_brewer(palette = "Dark2") +
theme_bw()+
theme(panel.grid.major = element_blank(),
panel.grid.minor = element_blank(),
panel.background = element_blank(),
panel.border = element_blank(),
text=element_text(family='Times'),
legend.title=element_blank(),
#legend.text = element_text(size =6),
legend.text = element_blank(),
legend.position = 'none',
plot.title = element_text(lineheight=.8, face="bold", size = 18, margin=margin(0,0,20,0)),
axis.text = element_text (size = 18, color = 'black'),
axis.title = element_text (size = 18),
axis.title.x = element_blank(),
axis.title.y = element_blank(),
axis.line.x = element_line(color='black', size = 1), # thickness of the x-axis line
axis.line.y = element_line(color='black', size = 1), # thickness of the y-axis line
strip.text = element_text (size = 16, color = 'black'), # size of text in strips, face = "bold"
panel.spacing = unit(1.5, "lines")
)
p_df_sum
```
See Figure \@ref(fig:dprime) for the *d* prime values in each condition of the experiment.
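The error bars in the figure span the group mean plus and minus 1.96 standard errors. The sketch below shows that calculation in base R on simulated values. The data frame `sim` and its numbers are made up for illustration, and the interval treats the condition cells as independent, which ignores the within-subject design.

```r
# Simulated d' values, for illustration only
set.seed(1)
sim <- expand.grid(Subject  = 1:20,
                   Identity = c("Self", "Other"),
                   Morality = c("Good", "Bad"))
sim$dprime <- rnorm(nrow(sim), mean = 2, sd = 0.5)

# Mean and standard error of d' in each Identity x Morality cell
cell_summary <- aggregate(dprime ~ Identity + Morality, data = sim,
                          FUN = function(x) c(mean = mean(x),
                                              se = sd(x) / sqrt(length(x))))
# aggregate() returns a matrix column; flatten it into ordinary columns
cell_summary <- do.call(data.frame, cell_summary)

# Normal-approximation 95% interval: mean +/- 1.96 * SE
cell_summary$ci_low  <- cell_summary$dprime.mean - 1.96 * cell_summary$dprime.se
cell_summary$ci_high <- cell_summary$dprime.mean + 1.96 * cell_summary$dprime.se
```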
```{r ANOVA}
resANOVA <- afex::aov_ez(id="Subject", dv = "dprime", data = df.L.V.SDT_w, within = c("Identity", "Morality"))
resANOVA2 <- afex::aov_4(dprime ~ Identity * Morality + (Identity * Morality|Subject), data = df.L.V.SDT_w )
# bruceR::MANOVA(df.L.V.SDT_w,subID = "Subject", dv = "dprime", within = c("Identity", "Morality"))
apa_anova <- apa_print(resANOVA)
```
Morality had an effect on *d* prime (`r apa_anova$full$Morality`), and there was an interaction between identity and morality (`r apa_anova$full$Identity_Morality`).
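For readers without `afex`, a comparable 2 x 2 repeated-measures ANOVA can be sketched with base R's `aov()` and an `Error()` term. This is an alternative shown on simulated data, not the call used in this manuscript, and its output is not formatted by `apa_print()` here.

```r
# Simulated data: 20 subjects, each measured in all four conditions
set.seed(2)
sim <- expand.grid(Subject  = factor(1:20),
                   Identity = factor(c("Self", "Other")),
                   Morality = factor(c("Good", "Bad")))
sim$dprime <- rnorm(nrow(sim), mean = 2, sd = 0.5)

# Within-subject factors go inside Error(), nested under Subject
fit <- aov(dprime ~ Identity * Morality + Error(Subject / (Identity * Morality)),
           data = sim)
summary(fit)
```

Each within-subject effect is tested against its own subject-by-effect error stratum, which is what makes this a repeated-measures analysis rather than a between-subjects one.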
```{r anovaTable}
apa_table(
apa_anova$table
, caption = "A really beautiful ANOVA table."
, note = "Note that the column names contain beautiful mathematical copy: This is because the table has variable labels."
)
```
# Discussion
Here we show R is powerful.
\newpage
# References
::: {#refs custom-style="Bibliography"}
:::