Appendix A: Selection of lexical covariates

Lexical covariates are commonly included in conceptual processing studies because lexical and semantic variables are widely interrelated (e.g., Petilli et al., 2021; Pexman & Yap, 2018; Wingfield & Connell, 2022b). Including these covariates, also known as nuisance variables, in the model allows a more rigorous analysis of the predictors of interest (Sassenhagen & Alday, 2016). In each of the present studies, the covariates were selected from a set of five variables that had been used as covariates in Wingfield and Connell (2022b) and are widely used elsewhere (e.g., Petilli et al., 2021). Some of these covariates were highly intercorrelated (\(r\) > .70), as shown below. To avoid multicollinearity, the maximum zero-order correlation allowed between any two covariates was \(r\) = \(\pm\).70 (Dormann et al., 2013; Harrison et al., 2018). When the correlation between covariates exceeded this threshold, the covariate with the largest effect in the model, based on the estimate (β), was retained.
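The selection rule described above can be illustrated with a minimal Python sketch. It is not the code used in the present studies. The function name, the variable names (e.g., `old20`, `pld20`), the correlations and the estimates are all hypothetical, and the greedy order of comparison (strongest covariate first, using absolute estimates) is an assumption about how ties among several correlated covariates would be resolved.

```python
from itertools import combinations

THRESHOLD = 0.70  # maximum absolute zero-order correlation allowed


def select_covariates(correlations, estimates, threshold=THRESHOLD):
    """Keep the stronger member of every pair correlated beyond the threshold.

    correlations: dict mapping frozenset({a, b}) to the zero-order correlation
    estimates: dict mapping each covariate to its estimate in the selection model
    """
    kept = set(estimates)
    # Rank covariates by absolute estimate so that stronger ones take precedence
    # (assumption: effect size is compared in absolute terms)
    ranked = sorted(estimates, key=lambda name: abs(estimates[name]), reverse=True)
    for stronger, weaker in combinations(ranked, 2):
        if stronger in kept and weaker in kept:
            r = correlations[frozenset((stronger, weaker))]
            if abs(r) > threshold:
                kept.discard(weaker)
    return [name for name in ranked if name in kept]


def pair(a, b):
    """Order-independent key for a pair of covariates."""
    return frozenset((a, b))


# Invented values for illustration only; they are not results from any study
estimates = {
    "letters": -0.02,
    "frequency": -0.15,
    "syllables": 0.01,
    "old20": 0.05,
    "pld20": 0.03,
}
correlations = {
    pair("letters", "frequency"): -0.35,
    pair("letters", "syllables"): 0.82,
    pair("letters", "old20"): 0.86,
    pair("letters", "pld20"): 0.80,
    pair("frequency", "syllables"): -0.30,
    pair("frequency", "old20"): -0.40,
    pair("frequency", "pld20"): -0.38,
    pair("syllables", "old20"): 0.72,
    pair("syllables", "pld20"): 0.76,
    pair("old20", "pld20"): 0.88,
}

print(select_covariates(correlations, estimates))  # ['frequency', 'old20']
```

With these invented values, word frequency is retained because it does not exceed the threshold with any other variable, whereas only the strongest of the four intercorrelated length and distance measures survives.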

In Studies 2.1 (semantic priming) and 2.2 (semantic decision), the lexical covariates were selected from five variables, mirroring those used by Wingfield and Connell (2022b): number of letters (i.e., orthographic length, which we computed in R), word frequency and number of syllables (both from Balota et al., 2007), orthographic Levenshtein distance (Yarkoni et al., 2008) and phonological Levenshtein distance (Suárez et al., 2011; Yap & Balota, 2009). In Study 2.3 (lexical decision), the selection served two purposes. First, the variable with the largest effect out of the five was selected as the language-based predictor of interest (see the rationale in Study 2.3 in the main text). Second, one of the remaining four variables was selected as a covariate.
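The two-step selection in Study 2.3 could be sketched in Python as follows. This is an illustration under stated assumptions, not the procedure's actual code. The criterion for choosing the covariate among the remaining four variables is not spelled out in this appendix, so the sketch assumes that the covariate must not exceed the \(\pm\).70 correlation limit with the predictor of interest and that, among the eligible variables, the one with the largest absolute estimate is chosen. All names and values are hypothetical.

```python
def split_predictor_and_covariate(estimates, correlations, threshold=0.70):
    """Return (predictor of interest, covariate) from a set of lexical variables."""
    # Step 1: the variable with the largest absolute estimate becomes the predictor
    predictor = max(estimates, key=lambda name: abs(estimates[name]))

    # Step 2 (assumed criterion): among the remaining variables, consider only
    # those within the correlation limit with the predictor, then take the
    # one with the largest absolute estimate
    eligible = [
        name
        for name in estimates
        if name != predictor
        and abs(correlations[frozenset((predictor, name))]) <= threshold
    ]
    covariate = max(eligible, key=lambda name: abs(estimates[name]), default=None)
    return predictor, covariate


# Invented values for illustration only; they are not results from Study 2.3
estimates = {
    "letters": 0.04,
    "frequency": -0.20,
    "syllables": 0.06,
    "old20": 0.03,
    "pld20": 0.02,
}
# Only the pairs involving the eventual predictor are looked up by the function
correlations = {
    frozenset(("frequency", "letters")): -0.35,
    frozenset(("frequency", "syllables")): -0.30,
    frozenset(("frequency", "old20")): -0.40,
    frozenset(("frequency", "pld20")): -0.38,
}

predictor, covariate = split_predictor_and_covariate(estimates, correlations)
```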

All the models included by-participant and by-word random intercepts, as well as by-participant random slopes for every predictor. Below, the correlations and the selection model are shown for each study.
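As a sketch of the random-effects structure just described, and assuming a linear model with a Gaussian response and \(K\) predictors (the symbols are illustrative and are not taken from the main text), the response of participant \(p\) to word \(w\) could be written as:

\[
y_{pw} = \beta_0 + P_{0p} + W_{0w} + \sum_{k=1}^{K} \left(\beta_k + P_{kp}\right) x_{kpw} + \varepsilon_{pw}
\]

Here, \(\beta_0\) is the fixed intercept, \(P_{0p}\) and \(W_{0w}\) are the by-participant and by-word random intercepts, \(\beta_k\) is the fixed effect of predictor \(k\), \(P_{kp}\) is the by-participant random slope for that predictor, and \(\varepsilon_{pw}\) is the residual.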


Balota, D. A., Yap, M. J., Hutchison, K. A., Cortese, M. J., Kessler, B., Loftis, B., Neely, J. H., Nelson, D. L., Simpson, G. B., & Treiman, R. (2007). The English Lexicon Project. Behavior Research Methods, 39, 445–459.
Dormann, C. F., Elith, J., Bacher, S., Buchmann, C., Carl, G., Carré, G., Marquéz, J. R. G., Gruber, B., Lafourcade, B., Leitão, P. J., Münkemüller, T., McClean, C., Osborne, P. E., Reineking, B., Schröder, B., Skidmore, A. K., Zurell, D., & Lautenbach, S. (2013). Collinearity: A review of methods to deal with it and a simulation study evaluating their performance. Ecography, 36(1), 27–46.
Harrison, X. A., Donaldson, L., Correa-Cano, M. E., Evans, J., Fisher, D. N., Goodwin, C., Robinson, B. S., Hodgson, D. J., & Inger, R. (2018). A brief introduction to mixed effects modelling and multi-model inference in ecology. PeerJ, 6, e4794.
Petilli, M. A., Günther, F., Vergallito, A., Ciapparelli, M., & Marelli, M. (2021). Data-driven computational models reveal perceptual simulation in word processing. Journal of Memory and Language, 117, 104194.
Pexman, P. M., & Yap, M. J. (2018). Individual differences in semantic processing: Insights from the Calgary semantic decision project. Journal of Experimental Psychology: Learning, Memory, and Cognition, 44(7), 1091–1112.
Sassenhagen, J., & Alday, P. M. (2016). A common misapplication of statistical inference: Nuisance control with null-hypothesis significance tests. Brain and Language, 162, 42–45.
Suárez, L., Tan, S. H., Yap, M. J., & Goh, W. D. (2011). Observing neighborhood effects without neighbors. Psychonomic Bulletin & Review, 18(3), 605–611.
Wingfield, C., & Connell, L. (2022b). Understanding the role of linguistic distributional knowledge in cognition. Language, Cognition and Neuroscience, 1–51.
Yap, M. J., & Balota, D. A. (2009). Visual word recognition of multisyllabic words. Journal of Memory and Language, 60(4), 502–529.
Yarkoni, T., Balota, D., & Yap, M. J. (2008). Moving beyond Coltheart’s N: A new measure of orthographic similarity. Psychonomic Bulletin & Review, 15(5), 971–979.

Pablo Bernabeu, 2022. Licence: CC BY 4.0.

Online book created using the R package bookdown.