GitHub repo:
https://github.com/expfactory-experiments/stroop-5min

Pre-registration (approved July 24, 2018):
https://osf.io/9pc46/

Live demo version:
https://expfactory-experiments.github.io/stroop-5min



library(knitr)
library(dplyr)
library(ggplot2)
library(effsize)
library(Rmisc)
library(ggthemr)
library(gmodels)
library(TOSTER)
library(lsr)
ggthemr('dust')

Changelog

- Practice trials reduced from 24 to 18
- Test trials reduced from 96 to 72
- Ratio of congruent:incongruent trials increased from 1:1 to 2:1
- Self-paced (trial ends immediately on response)
- Colors selected to be color-blind safe (http://colororacle.org)
- Color words in bolder font
- Instructions improved with finger placement diagram
- Background color changed to black to minimize eye strain
- Various small improvements to instructions and feedback
- Attention check added
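
To make the new trial mix concrete, here is a minimal sketch of how a 72-trial test block at a 2:1 ratio could be assembled. It assumes the ratio applies exactly to the test block, which the changelog does not state. The variable names are illustrative, and this is not the task's actual implementation.

```r
# Hypothetical sketch: build a shuffled test block at a 2:1 congruent:incongruent ratio.
n_test <- 72
ratio  <- c(congruent = 2, incongruent = 1)

# Split the block in proportion to the ratio (assumes n_test divides evenly).
counts <- n_test * ratio / sum(ratio)

# Repeat each trial type the required number of times, then randomize the order.
trial_types <- sample(rep(names(counts), times = counts))

table(trial_types)
```

Under that assumption the block holds 48 congruent and 24 incongruent trials.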

Data and analysis

We recruited 50 participants from MTurk, of whom 3 failed the attention check and were therefore excluded from the performance analysis. Data collected with the original expfactory Stroop task (n = 277) is freely available and was included in the performance analyses as a baseline (Eisenberg et al., 2018).

Stroop effect (milliseconds)

The Stroop effect is typically measured as the difference in response times between incongruent and congruent stimuli. Because incongruent stimuli create "interference" while congruent stimuli create response "facilitation," individuals are slower to respond to incongruent (vs. congruent) stimuli. Our measure of the Stroop effect is the within-subjects mean difference in response times (incongruent minus congruent), so a higher value corresponds to a larger Stroop effect. The plot below shows that the Stroop effect was as large in the 5 Minute Stroop as it was in the Original Stroop. (With fewer data points the confidence interval was wider, but this is to be expected.)
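
As a minimal sketch of this measure, the snippet below computes the within-subjects Stroop effect and the half-width of its 95% confidence interval in base R. The data are simulated and the names `rt_cong` and `rt_incong` are placeholders, not the study's data. The interval uses the standard t-based formula for the mean of paired differences, which should correspond to what `ci()` reports for a numeric vector.

```r
# Simulated per-participant mean response times in ms (illustrative only).
set.seed(1)
n         <- 47
rt_cong   <- rnorm(n, mean = 650, sd = 80)
rt_incong <- rt_cong + rnorm(n, mean = 100, sd = 60)

# One difference score per participant: incongruent minus congruent.
diffs <- rt_incong - rt_cong

# Stroop effect: the mean of the paired differences.
stroop_effect <- mean(diffs)

# t-based 95% confidence interval for the mean difference.
se         <- sd(diffs) / sqrt(n)
half_width <- qt(0.975, df = n - 1) * se
ci_bounds  <- c(lower = stroop_effect - half_width,
                upper = stroop_effect + half_width)
```

The half-width is the quantity the error bars in the plot add to and subtract from the mean.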

load("/home/rstudio/usr/MTurk/Stroop/m_ref.R")
load("/home/rstudio/usr/MTurk/Stroop/m.R")

# Per-participant Stroop effect (incongruent minus congruent RT) for each task
diff_ref <- m_ref$RT_incong - m_ref$RT_cong
diff_new <- m$RT_incong - m$RT_cong

# ci() returns c(Estimate, CI lower, CI upper, Std. Error);
# Estimate minus CI lower is the error-bar half-width
table <- data.frame(
    Task = factor(c("Original", "5 Minute"), levels = c("Original", "5 Minute")),
    Mean = c(round(mean(diff_ref)), round(mean(diff_new))),
    CI   = c(ci(diff_ref)[[1]] - ci(diff_ref)[[2]],
             ci(diff_new)[[1]] - ci(diff_new)[[2]])
)

ggplot(table, aes(x=Task, y=Mean, fill=Task)) + 
    geom_bar(position=position_dodge(), stat="identity", width=0.5) +
    geom_errorbar(aes(ymin=as.numeric(Mean-CI), ymax=as.numeric(Mean+CI)),
                  width=.1, color="black",
                  position=position_dodge(.9)) + ylab("Stroop Effect (ms)") +
    geom_text(aes(x = Task, y = Mean, label = Mean, group = Task),
              position = position_dodge(width = 1),
              vjust = 7, size = 6
            )

Stroop effect (Cohen’s d)

We can also express the Stroop effect as a standardized effect size. Cohen's d is often interpreted with the following rule of thumb: d = 0.2 is a "small" effect, d = 0.5 is a "medium" effect, and d = 0.8 or greater is a "large" effect. As with the millisecond measure, the effect sizes were similar between the two tasks, and the effect was large for both.
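
For intuition, one common paired-samples definition of Cohen's d divides the mean of the difference scores by their standard deviation (sometimes written d_z). The sketch below uses simulated data and a hypothetical helper, `label_effect()`, that treats the rule-of-thumb values as lower cutoffs. The `cohen.d()` function used in the analysis may standardize differently depending on the package version, so its values need not match this formula exactly.

```r
# Simulated per-participant mean response times in ms (illustrative only).
set.seed(2)
n         <- 47
rt_cong   <- rnorm(n, mean = 650, sd = 80)
rt_incong <- rt_cong + rnorm(n, mean = 100, sd = 60)

# d_z: mean of the paired differences divided by their standard deviation.
diffs <- rt_incong - rt_cong
d_z   <- mean(diffs) / sd(diffs)

# Hypothetical helper: map |d| onto the rule-of-thumb labels.
label_effect <- function(d) {
  d <- abs(d)
  if (d >= 0.8) {
    "large"
  } else if (d >= 0.5) {
    "medium"
  } else if (d >= 0.2) {
    "small"
  } else {
    "below small"
  }
}

label_effect(d_z)
```

Because d_z equals the paired t statistic divided by the square root of n, it can be cross-checked against `t.test(rt_incong, rt_cong, paired = TRUE)`.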

# Paired Cohen's d for each task, computed once and reused
d_ref <- cohen.d(m_ref$RT_incong, m_ref$RT_cong, paired=TRUE)
d_new <- cohen.d(m$RT_incong, m$RT_cong, paired=TRUE)

# Access results by name rather than position, since the order of elements
# in the returned object can differ between effsize versions
table <- data.frame(
    Task = factor(c("Original", "5 Minute"), levels = c("Original", "5 Minute")),
    Mean = c(round(d_ref$estimate, 2), round(d_new$estimate, 2)),
    # Error-bar half-width: estimate minus lower confidence bound (unrounded)
    CI   = c(d_ref$estimate - d_ref$conf.int[[1]],
             d_new$estimate - d_new$conf.int[[1]])
)

ggplot(table, aes(x=Task, y=Mean, fill=Task)) + 
    geom_bar(position=position_dodge(), stat="identity", width=0.5) +
    geom_errorbar(aes(ymin=as.numeric(Mean-CI), ymax=as.numeric(Mean+CI)),
                  width=.1, color="black",
                  position=position_dodge(.9)) + ylab("Stroop Effect (Cohen's d)") +
    geom_text(aes(x = Task, y = Mean, label = Mean, group = Task),
              position = position_dodge(width = 1),
              vjust = 7, size = 6
            )

Accuracy

Analyses of speeded-response tasks like the Stroop often examine accuracy as well as response times. The logic for accuracy is similar to that for response times: because incongruent stimuli create response interference, activating an incorrect response to a greater degree than congruent stimuli do, participants are more likely to respond incorrectly to incongruent stimuli. We would therefore expect a lower proportion of correct responses for incongruent (vs. congruent) stimuli. We see this in the graphs below, and we also see that performance was similar between the two Stroop task versions.
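
The per-participant accuracy scores analyzed here can be derived from trial-level data along the following lines. This is a sketch with a made-up data frame. The column names `participant`, `condition`, and `correct` are assumptions and may not match the actual expfactory output.

```r
# Toy trial-level data: two participants, 4 congruent and 2 incongruent trials each.
trials <- data.frame(
  participant = rep(c("p1", "p2"), each = 6),
  condition   = rep(c(rep("congruent", 4), rep("incongruent", 2)), times = 2),
  correct     = c(1, 1, 1, 1, 1, 0,
                  1, 1, 1, 0, 0, 1)
)

# Proportion correct per participant (rows) and condition (columns).
acc <- with(trials, tapply(correct, list(participant, condition), mean))

# Group-level accuracy: average the per-participant proportions.
group_acc <- colMeans(acc)
```

Averaging per-participant proportions, rather than pooling all trials, gives each participant equal weight regardless of how many trials they completed.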

# Accuracy on congruent trials; CI is the error-bar half-width
# (ci() Estimate minus CI lower)
table <- data.frame(
    Task = factor(c("Original", "5 Minute"), levels = c("Original", "5 Minute")),
    Mean = c(round(mean(m_ref$Acc_cong), 2), round(mean(m$Acc_cong), 2)),
    CI   = c(ci(m_ref$Acc_cong)[[1]] - ci(m_ref$Acc_cong)[[2]],
             ci(m$Acc_cong)[[1]] - ci(m$Acc_cong)[[2]])
)

plot1 <- ggplot(table, aes(x=Task, y=Mean, fill=Task)) + 
    geom_bar(position=position_dodge(), stat="identity", width=0.5) +
    geom_errorbar(aes(ymin=as.numeric(Mean-CI), ymax=as.numeric(Mean+CI)),
                  width=.1, color="black",
                  position=position_dodge(.9)) + ylab("Proportion Correct") +
    geom_text(aes(x = Task, y = Mean, label = Mean, group = Task),
              position = position_dodge(width = 1),
              vjust = 7, size = 6
            ) + ggtitle("Congruent")


# Accuracy on incongruent trials; CI is the error-bar half-width
# (ci() Estimate minus CI lower)
table <- data.frame(
    Task = factor(c("Original", "5 Minute"), levels = c("Original", "5 Minute")),
    Mean = c(round(mean(m_ref$Acc_incong), 2), round(mean(m$Acc_incong), 2)),
    CI   = c(ci(m_ref$Acc_incong)[[1]] - ci(m_ref$Acc_incong)[[2]],
             ci(m$Acc_incong)[[1]] - ci(m$Acc_incong)[[2]])
)

plot2 <- ggplot(table, aes(x=Task, y=Mean, fill=Task)) + 
    geom_bar(position=position_dodge(), stat="identity", width=0.5) +
    geom_errorbar(aes(ymin=as.numeric(Mean-CI), ymax=as.numeric(Mean+CI)),
                  width=.1, color="black",
                  position=position_dodge(.9)) + ylab("Proportion Correct") +
    geom_text(aes(x = Task, y = Mean, label = Mean, group = Task),
              position = position_dodge(width = 1),
              vjust = 7, size = 6
            ) + ggtitle("Incongruent")


plot1