As an example of how accessible resampling can be, here’s a bit of code that resamples an ANOVA and computes an empirical p value.
temp is a dataframe containing the data. Condition is a column with exactly that, the condition for each row. temp_col is the name of a dependent variable; it’s a string to make this easy to reuse. If you haven’t used map before, it’s basically a for loop that returns a list. When the output gets passed into unlist it becomes a vector.
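If map and unlist are new to you, a tiny sketch may help. The squaring function here is just a stand-in for illustration, not part of the analysis:

```r
library(purrr)  # map() and the %>% pipe

# map() calls the function once per element and returns a list
squares_list <- map(1:3, function(i) i^2)

# unlist() flattens that list into a plain numeric vector
squares <- squares_list %>% unlist()
```

The resampling code has the same shape, with the function body replaced by a shuffle and a model fit.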
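library(purrr)  # map() and the %>% pipe

# Fit once on the real data so the observed F value exists outside the loop
fm <- lm(as.formula(paste0(temp_col, " ~ Condition")), data = temp)
observed_f <- car::Anova(fm)[1, 3]

temp_shuffle <- temp
resample_array <- map(1:1000, function(i){
  # Shuffle the condition labels, then refit and keep the F value
  temp_shuffle$Condition <- sample(temp_shuffle$Condition, replace = FALSE)
  fm <- lm(as.formula(paste0(temp_col, " ~ Condition")), data = temp_shuffle)
  return(car::Anova(fm)[1, 3])
}) %>% unlist()
# Empirical p value: share of shuffled F values at least as large as the observed one
ep <- mean(resample_array >= observed_f)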
The downside is that it takes orders of magnitude more time to run, because you’re running the same code hundreds or thousands of times. This is only a problem if you need crazy high precision or have a really complex or hard-to-fit model. For reference, the code above took about 2 seconds per dependent variable for 1000 iterations on my machine.
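To see what the cost looks like on your own machine, a rough sketch along these lines can help. The data frame sim and its columns are simulated purely for illustration, and base anova() stands in for car::Anova to keep the example light on dependencies; timings will differ from machine to machine.

```r
library(purrr)  # map() and the %>% pipe

set.seed(1)  # reproducible shuffles
# Hypothetical data: 3 conditions, 20 observations each, no true effect
sim <- data.frame(
  Condition = factor(rep(c("a", "b", "c"), each = 20)),
  score = rnorm(60)
)

n_iter <- 200
timing <- system.time({
  f_values <- map(1:n_iter, function(i) {
    shuffled <- sim
    shuffled$Condition <- sample(shuffled$Condition, replace = FALSE)
    # Base-R ANOVA table; row 1 is Condition
    anova(lm(score ~ Condition, data = shuffled))[1, "F value"]
  }) %>% unlist()
})

timing["elapsed"]  # seconds spent on n_iter refits
1 / n_iter         # step size of the empirical p value with n_iter shuffles
```

Because the empirical p value is a proportion of the shuffles, it can only move in steps of 1/n_iter, which is why more precision means more iterations and more time.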
A handy pattern is to use map to summarize data and then bind it.
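A minimal sketch of that pattern, assuming dplyr is available for bind_rows and using a made-up data frame sim with two hypothetical dependent variables (score_a and score_b):

```r
library(purrr)  # map() and the %>% pipe
library(dplyr)  # bind_rows()

set.seed(1)
# Hypothetical data, simulated only for this example
sim <- data.frame(
  Condition = rep(c("a", "b"), each = 10),
  score_a = rnorm(20),
  score_b = rnorm(20)
)

dvs <- c("score_a", "score_b")

# Build one small summary data frame per dependent variable,
# then stack them into a single table
summary_table <- map(dvs, function(dv) {
  data.frame(
    dv = dv,
    mean = mean(sim[[dv]]),
    sd = sd(sim[[dv]])
  )
}) %>% bind_rows()
```

Each call returns a one-row data frame, so the bound result has one row per dependent variable and is easy to print or plot.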