# Parallelizing simr::powercurve() in R

The `powercurve`

function from the simr package in R (Green & MacLeod, 2016) can incur very long running times when the method used for the calculation of *p* values is Kenward-Roger or Satterthwaite (see Luke, 2017). Here I suggest three ways for cutting down this time.

Where possible, use a high-performance (or high-end) computing cluster. This removes the need to use personal computers for these long jobs.

In case you’re using the

`fixed()`

parameter of the`powercurve`

function, and calculating the power for different effects, run these at the same time (‘in parallel’) on different machines, rather than one after another.*Parallelize*the`breaks`

argument. The`breaks`

argument of the`powercurve`

function allows the calculation of power for different levels of the grouping factor passed to`along`

. Some grouping factors are*participant*,*trial*and*item*. The`breaks`

argument sets the different sample sizes for which power will be calculated. Parallelizing`breaks`

is done by running each number of levels in a separate function. When each has been run and saved, they are`c`

ombined to allow the plotting. This procedure is demonstrated below.

## Parallelizing `breaks`

Let’s do a minimal example using a toy `lmer`

model. A power curve will be craeted for the fixed effect of `x`

along different sample sizes of the grouping factor `g`

.

Notice that the six sections of the power curve below are serially arranged, one after another. In contrast, to enable parallel processing, each power curve would be placed in a single script, and they would all be run at the same time.

Although the power curves below run in a few minutes, the settings that are often used (e.g., a larger model; `fixed('x', 'sa')`

instead of `fixed('x')`

; `nsim = 500`

instead of `nsim = 50`

) take far longer. That is where parallel processing becomes useful.^{1}

```
library(lme4)
library(simr)
# Toy model
fm = lmer(y ~ x + (x | g), data = simdata)
# Extend sample size of `g`
fm_extended_g = extend(fm, along = 'g', n = 12)
# Parallelize `breaks` by running each number of levels in a separate function.
# 4 levels of g
pwcurve_4g = powerCurve(fm_extended_g, fixed('x'), along = 'g', breaks = 4,
nsim = 50, seed = 123,
# No progress bar
progress = FALSE)
# 6 levels of g
pwcurve_6g = powerCurve(fm_extended_g, fixed('x'), along = 'g', breaks = 6,
nsim = 50, seed = 123,
# No progress bar
progress = FALSE)
# 8 levels of g
pwcurve_8g = powerCurve(fm_extended_g, fixed('x'), along = 'g', breaks = 8,
nsim = 50, seed = 123,
# No progress bar
progress = FALSE)
# 10 levels of g
pwcurve_10g = powerCurve(fm_extended_g, fixed('x'), along = 'g', breaks = 10,
nsim = 50, seed = 123,
# No progress bar
progress = FALSE)
# 12 levels of g
pwcurve_12g = powerCurve(fm_extended_g, fixed('x'), along = 'g', breaks = 12,
nsim = 50, seed = 123,
# No progress bar
progress = FALSE)
```

Having saved each section of the power curve, we must now combine them to be able to plot them together.

```
# Create a destination object using any of the power curves above.
all_pwcurve = pwcurve_4g
# Combine results
all_pwcurve$ps = c(pwcurve_4g$ps[1], pwcurve_6g$ps[1], pwcurve_8g$ps[1],
pwcurve_10g$ps[1], pwcurve_12g$ps[1])
# Combine the different numbers of levels.
all_pwcurve$xval = c(pwcurve_4g$nlevels, pwcurve_6g$nlevels, pwcurve_8g$nlevels,
pwcurve_10g$nlevels, pwcurve_12g$nlevels)
print(all_pwcurve)
```

```
## Power for predictor 'x', (95% confidence interval),
## by number of levels in g:
## 4: 46.00% (31.81, 60.68) - 40 rows
## 6: 74.00% (59.66, 85.37) - 60 rows
## 8: 92.00% (80.77, 97.78) - 80 rows
## 10: 98.00% (89.35, 99.95) - 100 rows
## 12: 100.0% (92.89, 100.0) - 120 rows
##
## Time elapsed: 0 h 0 m 8 s
```

`plot(all_pwcurve, xlab = 'Levels of g')`

```
# For reproducibility purposes
sessionInfo()
```

```
## R version 4.1.2 (2021-11-01)
## Platform: x86_64-apple-darwin17.0 (64-bit)
## Running under: macOS Big Sur 10.16
##
## Matrix products: default
## BLAS: /Library/Frameworks/R.framework/Versions/4.1/Resources/lib/libRblas.0.dylib
## LAPACK: /Library/Frameworks/R.framework/Versions/4.1/Resources/lib/libRlapack.dylib
##
## locale:
## [1] en_GB.UTF-8/en_GB.UTF-8/en_GB.UTF-8/C/en_GB.UTF-8/en_GB.UTF-8
##
## attached base packages:
## [1] stats graphics grDevices utils datasets methods base
##
## other attached packages:
## [1] simr_1.0.5 lme4_1.1-28 Matrix_1.4-0
## [4] knitr_1.37 xaringanExtra_0.5.5
##
## loaded via a namespace (and not attached):
## [1] tidyselect_1.1.2 xfun_0.29 bslib_0.3.1 purrr_0.3.4
## [5] splines_4.1.2 lattice_0.20-45 carData_3.0-5 vctrs_0.3.8
## [9] generics_0.1.2 htmltools_0.5.2 yaml_2.3.5 mgcv_1.8-38
## [13] utf8_1.2.2 rlang_1.0.1 jquerylib_0.1.4 nloptr_2.0.0
## [17] pillar_1.7.0 glue_1.6.1 uuid_1.0-3 plyr_1.8.6
## [21] binom_1.1-1 lifecycle_1.0.1 stringr_1.4.0 blogdown_1.8
## [25] evaluate_0.15 fastmap_1.1.0 RLRsim_3.1-6 parallel_4.1.2
## [29] pbkrtest_0.5.1 fansi_1.0.2 broom_0.7.12 Rcpp_1.0.8
## [33] backports_1.4.1 plotrix_3.8-2 jsonlite_1.8.0 abind_1.4-5
## [37] digest_0.6.29 stringi_1.7.6 bookdown_0.24 dplyr_1.0.8
## [41] grid_4.1.2 cli_3.2.0 tools_4.1.2 magrittr_2.0.2
## [45] sass_0.4.0 tibble_3.1.6 tidyr_1.2.0 pkgconfig_2.0.3
## [49] crayon_1.5.0 car_3.0-12 MASS_7.3-55 ellipsis_0.3.2
## [53] minqa_1.2.4 rmarkdown_2.11 rstudioapi_0.13 iterators_1.0.14
## [57] R6_2.5.1 boot_1.3-28 nlme_3.1-155 compiler_4.1.2
```

### Just the code

```
library(lme4)
library(simr)
# Toy model
fm = lmer(y ~ x + (x | g), data = simdata)
# Extend sample size of `g`
fm_extended_g = extend(fm, along = 'g', n = 12)
# Parallelize `breaks` by running each number of levels in a separate function.
# 4 levels of g
pwcurve_4g = powerCurve(fm_extended_g, fixed('x'), along = 'g', breaks = 4,
nsim = 50, seed = 123,
# No progress bar
progress = FALSE)
# 6 levels of g
pwcurve_6g = powerCurve(fm_extended_g, fixed('x'), along = 'g', breaks = 6,
nsim = 50, seed = 123,
# No progress bar
progress = FALSE)
# 8 levels of g
pwcurve_8g = powerCurve(fm_extended_g, fixed('x'), along = 'g', breaks = 8,
nsim = 50, seed = 123,
# No progress bar
progress = FALSE)
# 10 levels of g
pwcurve_10g = powerCurve(fm_extended_g, fixed('x'), along = 'g', breaks = 10,
nsim = 50, seed = 123,
# No progress bar
progress = FALSE)
# 12 levels of g
pwcurve_12g = powerCurve(fm_extended_g, fixed('x'), along = 'g', breaks = 12,
nsim = 50, seed = 123,
# No progress bar
progress = FALSE)
# Create a destination object using any of the power curves above.
all_pwcurve = pwcurve_4g
# Combine results
all_pwcurve$ps = c(pwcurve_4g$ps[1], pwcurve_6g$ps[1], pwcurve_8g$ps[1],
pwcurve_10g$ps[1], pwcurve_12g$ps[1])
# Combine the different numbers of levels.
all_pwcurve$xval = c(pwcurve_4g$nlevels, pwcurve_6g$nlevels, pwcurve_8g$nlevels,
pwcurve_10g$nlevels, pwcurve_12g$nlevels)
print(all_pwcurve)
plot(all_pwcurve, xlab = 'Levels of g')
```

## References

Brysbaert, M., & Stevens, M. (2018). Power analysis and effect size in mixed effects models: A tutorial. *Journal of Cognition, 1*(1), 9. http://doi.org/10.5334/joc.10

Green, P., & MacLeod, C. J. (2016). SIMR: An R package for power analysis of generalized linear mixed models by simulation. *Methods in Ecology and Evolution 7*(4), 493–498, https://doi.org/10.1111/2041-210X.12504

Kumle, L., Vo, M. L. H., & Draschkow, D. (2021). Estimating power in (generalized) linear mixed models: An open introduction and tutorial in R. *Behavior Research Methods*, 1–16. https://doi.org/10.3758/s13428-021-01546-0

Luke, S. G. (2017). Evaluating significance in linear mixed-effects models in R. *Behavior Research Methods, 49*(4), 1494–1502. https://doi.org/10.3758/s13428-016-0809-y

The number of simulations set by

`nsim`

should be larger (Brysbaert & Stevens, 2018; Green & MacLeod, 2016). In addition, the effect size for`x`

should be adjusted to the value that best fits with the planned study (Kumle et al., 2021).↩︎