The present studies

We revisit three larger-than-average studies to investigate the interplay between language and embodiment in conceptual processing. We devote a study to each of the three original studies. Thus, Study 2.1 is centred on Hutchison et al. (2013) and uses the semantic priming paradigm. Study 2.2 is centred on Pexman et al. (2017) and uses the semantic decision paradigm. Study 2.3 is centred on Balota et al. (2007) and uses the lexical decision paradigm. Each of these central studies contained measures of participants’ vocabulary size and gender. Furthermore, the core data sets were expanded by adding variables that captured the language-based information in words (Mandera et al., 2017; Wingfield & Connell, 2022b) and the vision-based information in words (Lynott et al., 2020; Petilli et al., 2021), the latter being used to represent the embodiment system. One of the key questions we investigated using this array of variables was whether individual differences in vocabulary and gender modulated participants’ sensitivity to the language-based and vision-based information in words. Alongside the effects of interest, several covariates were included in the models to allow a rigorous analysis (Sassenhagen & Alday, 2016). These covariates comprised measures of general cognition and lexical characteristics of the stimulus words. Last, in each study, we performed a statistical power analysis to help estimate the sample size needed to investigate a variety of effects in future studies.

Below, we delve into the language and the embodiment components of these studies.


Studies have operationalised the language system at the word level using measures that capture the relationships among words without explicitly drawing on any sensory or affective modalities. Two main types of linguistic measures exist: those based on text corpora—dubbed word co-occurrence measures (Bullinaria & Levy, 2007; Petilli et al., 2021; Wingfield & Connell, 2022b)—and those based on associations collected from human participants—dubbed word association measures (De Deyne et al., 2016, 2019). Notwithstanding the interrelation between word co-occurrence and word association (Planchuelo et al., 2022), co-occurrence is more purely linguistic, whereas association indirectly captures more of the sensory and affective meaning of words (De Deyne et al., 2021).

Operationalisation and hypotheses

In Study 2.1 (semantic priming) and Study 2.2 (semantic decision), co-occurrence measures were used to represent the language system at the word level. Specifically, in Study 2.1, this measure was called language-based similarity, and it was based on the degree of text-based co-occurrence between the prime word and the target word in each trial (Mandera et al., 2017). In Study 2.2, the measure was called word co-occurrence, and it was based on the degree of text-based co-occurrence between each stimulus word and the words ‘abstract’ and ‘concrete’ (Wingfield & Connell, 2022b). In Study 2.3 (lexical decision), a co-occurrence measure could not be used, as the co-occurrence of words in consecutive trials could not be calculated due to the high frequency of nonword trials throughout the lexical decision task. Therefore, a single-word measure had to be used instead. Word frequency was used as it was the lexical variable, among five, that had the largest effect (see Appendix A).

At the individual level, language was represented by participants’ vocabulary size in Studies 2.1 and 2.2, and by participants’ vocabulary age in Study 2.3. Vocabulary size and age did not differ in any consequential way. They both captured the amount of vocabulary knowledge of each participant, by testing their knowledge of a small sample of pre-normed words, and thereby inferring their overall knowledge.

We hypothesised that word co-occurrence, word frequency and vocabulary size would all have facilitatory effects on participants’ performance, with higher values leading to shorter RTs (Pexman & Yap, 2018; Wingfield & Connell, 2022b; Yap et al., 2009).

Embodiment represented by vision-based information

In previous studies, the embodiment system has been represented at the word level by perceptual, motor, affective or social variables (Fernandino et al., 2022; Vigliocco et al., 2009; X. Wang et al., 2021). For instance, the perceptual modalities have often corresponded to the five Aristotelian senses (vision, hearing, touch, taste and smell; Bernabeu et al., 2017, 2021; Louwerse & Connell, 2011) and, less often, to interoception (Connell et al., 2018). Yet, out of all these domains, vision has been most frequently used in research (e.g., Bottini et al., 2021; De Deyne et al., 2021; Pearson & Kosslyn, 2015; Petilli et al., 2021; Yee et al., 2012). The hegemony of vision is likely due to the central position of vision in the human brain (Reilly et al., 2020) as well as in several languages (Bernabeu, 2018; I.-H. Chen et al., 2019; Lynott et al., 2020; Miceli et al., 2021; Morucci et al., 2019; Roque et al., 2015; Speed & Brysbaert, 2021; Speed & Majid, 2020; Vergallito et al., 2020; Winter et al., 2018; Zhong et al., 2022). In the present study, we focussed on vision alone for three reasons. First, we wanted to use a single variable to represent sensorimotor information, just as a single variable would be used to represent linguistic information. Using a single variable for each system facilitates the analysis of interactions with other variables. Second, vision is very prominent in cognition, as we just reviewed. Third, we had planned to use the present research to determine the sample size of a subsequent study that focusses on vision (indeed, the present study grew out of a statistical power analysis).

Operationalisation and hypotheses

At the word level, we operationalised visual information using the visual strength variable from the Lancaster Sensorimotor Norms (Lynott et al., 2020). This variable measures the degree of visual experience associated with concepts. In Study 2.1, we created the variable visual-strength difference by subtracting the visual strength of the prime word from that of the target word, in each trial. Thus, visual-strength difference measured—in each trial—how much the prime word and the target word differed in their degrees of vision-based information. Even though we could not find any previous studies that reported the effect of visual strength (or visual-strength difference) on RT, we hypothesised a priming effect underpinned by this variable, consistent with related research (Petilli et al., 2021). Specifically, we hypothesised that visual-strength difference would have an inhibitory effect on participants’ performance, with higher values leading to longer RTs.
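The computation of visual-strength difference described above can be sketched as follows. This is a minimal illustration only: the ratings in the dictionary are made-up values on the 0 to 5 scale used by the Lancaster Sensorimotor Norms, and the function name `visual_strength_difference` is hypothetical rather than taken from the analysis code of the study.

```python
# Hypothetical visual strength ratings (made-up values on a 0-5 scale).
VISUAL_STRENGTH = {"cloud": 4.5, "rain": 4.0, "idea": 1.0}


def visual_strength_difference(prime, target, norms):
    """Visual strength of the target word minus that of the prime word."""
    return norms[target] - norms[prime]


# Each trial pairs a prime word with a target word.
trials = [("cloud", "rain"), ("idea", "cloud")]

differences = [
    visual_strength_difference(prime, target, VISUAL_STRENGTH)
    for prime, target in trials
]

for (prime, target), diff in zip(trials, differences):
    print(f"{prime} -> {target}: {diff}")
```

In this sketch, a pair of words with similar visual strength yields a difference close to zero, whereas a pair that differs greatly in visual strength yields a value far from zero.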

In Studies 2.2 and 2.3, we used the visual strength score per stimulus word. We hypothesised that this variable would have a facilitatory effect on participants’ performance (i.e., higher values leading to shorter RTs), consistent with related research (Petilli et al., 2021).

Unlike language, vision was not examined at the individual level because the available variables were based on one self-reported value per participant (Balota et al., 2007; Hutchison et al., 2013), contrasting with the greater precision of the vocabulary measures, which consisted of multiple trials. Nonetheless, we recognise the need to investigate the role of perceptual experience (Muraki & Pexman, 2021; Plaut & Booth, 2000) alongside that of linguistic experience in the future.

Levels of analysis

Experimental data in psycholinguistics can be divided into various levels, such as individuals, words and tasks. The simultaneous examination of a theory across several levels is expected to enhance our understanding of the theory (Ostarek & Bottini, 2021)—for instance, by revealing the distribution of explanatory power (that is, effect size) within and across these levels. Several studies have probed more than one level—for instance, word level and individual level (Aujla, 2021; Lim et al., 2020; Pexman & Yap, 2018; Yap et al., 2009), or word level and task level (Al-Azary et al., 2022; Connell & Lynott, 2013, 2014a; Ostarek & Huettig, 2019; Petilli et al., 2021). This multilevel approach is complementary to a different line of research that aims to test the causality of various sources of information in conceptual processing, such as language (Ponari, Norbury, Rotaru, et al., 2018), perception (Stasenko et al., 2014) and action (Speed et al., 2017).

The three levels considered in this study—individual, word and task—are described below.

Individual level

The individual level is concerned with the role of individual differences in domains such as language, perception, mental imagery and physical experience (e.g., Daidone & Darcy, 2021; Davies et al., 2017; Dils & Boroditsky, 2010; Fetterman et al., 2018; Holt & Beilock, 2006; Mak & Willems, 2019; Miceli et al., 2022; Pexman & Yap, 2018; Vukovic & Williams, 2015; Yap et al., 2009, 2012, 2017). Recent studies have revealed important roles of participant-specific variables in topics where these variables have not traditionally been considered (DeLuca et al., 2019; Kos et al., 2012; Montero-Melis, 2021).

Vocabulary size is used to represent the language system at the individual level. It measures the number of words a person can recognise out of a sample. Furthermore, covariates akin to general cognition were included in the models where available (see Covariates section below).

Word level

The word level is concerned with the lexical and semantic information in words (e.g., De Deyne et al., 2021; Lam et al., 2015; Lund et al., 1995; Lund & Burgess, 1996; Lynott et al., 2020; Mandera et al., 2017; Petilli et al., 2021; Pexman et al., 2017; Santos et al., 2011; Wingfield & Connell, 2022b). The word-level variables of interest in this study are language-based and vision-based information (both described above). The covariates are lexical variables and word concreteness. The lexical covariates were selected in each study out of the same five variables (see Covariates section below).

Task level

The task level is concerned with experimental conditions affecting, for instance, processing speed. In Study 2.1 (semantic priming), there is one task-level factor, namely, stimulus onset asynchrony (SOA), which measures the temporal interval between the onset of the prime word and the onset of the target word. In Studies 2.2 and 2.3, there are no task-level variables.

Beyond task-level variables, there is an additional source of task-related information across the three studies—namely, the experimental paradigm used in each study (i.e., semantic priming, semantic decision and lexical decision). Indeed, it is possible to examine how the effects vary across these paradigms (see Wingfield & Connell, 2022b). This comparison, however, must be considered cautiously due to the existence of other non-trivial differences across these studies, such as the numbers of observations. With this caveat noted, the tasks used across these studies likely elicit varying degrees of semantic depth, as ordered below (see Balota & Lorch, 1986; Barsalou et al., 2008; Becker et al., 1997; de Wit & Kinoshita, 2015; Joordens & Becker, 1997; Lam et al., 2015; Muraki & Pexman, 2021; Ostarek & Huettig, 2017; Versace et al., 2021; Wingfield & Connell, 2022b).

  1. Semantic decision (Study 2.2) likely elicits the deepest semantic processing, as the instructions of this task ask for a concreteness judgement. In this task, participants are asked to classify words as abstract or concrete, which elicits deeper semantic processing than the task of identifying word forms—i.e., lexical decision (de Wit & Kinoshita, 2015).

  2. Semantic priming (Study 2.1). The task administered to participants in semantic priming studies is often lexical decision, as in Study 2.1 below. The fundamental characteristic of semantic priming is that, in each trial, a prime word is briefly presented before the target word. The prime word is not directly relevant to the task, as participants respond to the target word. Nonetheless, participants normally process both the prime word and the target word in each trial, and this combination allows researchers to analyse responses based on the prime–target relationship. In this regard, this paradigm could be considered more deeply semantic than lexical decision. Indeed, slower responses in semantic priming studies—reflecting difficult lexical decisions—have been linked to larger priming effects (Balota et al., 2008; Hoedemaker & Gordon, 2014; Yap et al., 2013), revealing a degree of semantic association that has not been identified in the lexical decision task.

  3. Lexical decision (Study 2.3) is likely the semantically-shallowest task of these three, as it focusses solely on the identification of word forms.


The central objective of the present studies is the simultaneous investigation of language-based and vision-based information, along with the interactions between each of those and vocabulary size, gender and presentation speed (i.e., SOA). Previous studies have examined subsets of these effects using the same data sets we are using (Balota et al., 2007; Petilli et al., 2021; Pexman et al., 2017; Pexman & Yap, 2018; Wingfield & Connell, 2022b; Yap et al., 2012, 2017). Out of these studies, only Petilli et al. (2021) investigated both language and vision. However, in contrast to our present study, Petilli et al. did not examine the role of vocabulary size or any other individual differences, instead collapsing the data across participants.

In addition to main effects of the aforementioned variables, our three studies have four interactions in common: (1a) language-based information × vocabulary size, (1b) vision-based information × vocabulary size, (2a) language-based information × participants’ gender, and (2b) vision-based information × participants’ gender. Study 2.1 contained two further interactions: (3a) language-based information × SOA, and (3b) vision-based information × SOA (note that the names of some predictors vary across studies, as detailed in the present studies section above). Each interaction and the corresponding hypotheses are addressed below.

1a. Language-based information × vocabulary size

We outline three hypotheses supported by literature regarding the interaction between language-based information and participants’ vocabulary size.

  • Larger vocabulary, larger effects. Higher-vocabulary participants might be more sensitive to linguistic features than lower-vocabulary participants, thanks to a larger number of semantic associations (Connell, 2019; Landauer et al., 1998; Louwerse et al., 2015; Paivio, 1990; Pylyshyn, 1973). For instance, Yap et al. (2017) revisited the semantic priming study of Hutchison et al. (2013) and observed a larger semantic priming effect in higher-vocabulary participants.

  • Larger vocabulary, smaller effects. Higher-vocabulary participants might be less sensitive to linguistic features, thanks to a more automated language processing (Perfetti & Hart, 2002). Some of the evidence aligned with this hypothesis was obtained by Yap et al. (2009), who observed a smaller semantic priming effect in higher-vocabulary participants. Similarly, Yap et al. (2012) found that higher-vocabulary participants in a lexical decision task (Balota et al., 2007) were less sensitive to a cluster of lexical and semantic features (i.e., word frequency, semantic neighborhood density and number of senses).

  • Larger vocabulary, more task-relevant effects. Higher-vocabulary participants might present a greater sensitivity to task-relevant variables, borne out of their greater linguistic experience, relative to lower-vocabulary participants. This would be consistent with the findings of Pexman and Yap (2018), who revisited the semantic decision study of Pexman et al. (2017). The semantic decision task of Pexman et al. consisted of classifying words as abstract or concrete. Pexman and Yap found that word concreteness, a very relevant source of information for this task, was more influential in higher-vocabulary participants than in lower-vocabulary ones. In contrast, word frequency and age of acquisition, which are not as relevant to the task, were more influential in lower-vocabulary participants (also see Lim et al., 2020). In our present studies, we set our hypotheses regarding the ‘task-relevance advantage’ by working under the assumption that the language-based information in words (represented by one variable in each study) is important for the three tasks, given the large effects of language across tasks (Banks et al., 2021; Kiela & Bottou, 2014; Lam et al., 2015; Louwerse et al., 2015; Pecher et al., 1998; Petilli et al., 2021). Therefore, the relevance hypothesis predicts that higher-vocabulary participants, compared to lower-vocabulary ones, will be more sensitive to language-based information (as represented by language-based similarity in Study 2.1, word co-occurrence in Study 2.2, and word frequency in Study 2.3).

1b. Vision-based information × vocabulary size

To our knowledge, no previous studies have investigated the interaction between vision-based information and participants’ vocabulary size. We entertained two hypotheses. First, lower-vocabulary participants might be more sensitive to visual strength than higher-vocabulary participants. In this way, lower-vocabulary participants might compensate for the disadvantage on the language side. Second, we considered the possibility that there was no interaction effect.

2a. Language-based information × gender

We entertained two hypotheses regarding the interaction between language-based information and participants’ gender: (a) that the language system would be more important in female participants than in males (Burman et al., 2008; Hutchinson & Louwerse, 2013; Jung et al., 2019; Ullman et al., 2008), and (b) that this interaction effect would be absent, as a recent review suggested that gender differences are negligible in the general population (Wallentin, 2020).

2b. Vision-based information × gender

To our knowledge, no previous studies have investigated the interaction between vision-based information and participants’ gender. We entertained two hypotheses. Our first hypothesis was that this interaction would stand opposite to the interaction between language and gender. That is, if female participants were to present a greater role of language-based information, male participants would present a greater role of vision-based information, thereby compensating for the disadvantage on the language side. Our second hypothesis was the absence of this interaction effect (see Wallentin, 2020).

3a. Language-based information × SOA

Previous research predicts that language-based information will have a larger effect with the short SOA than with the long one (Lam et al., 2015; Petilli et al., 2021), which also aligns with research demonstrating the fast activation of language-based information (Louwerse & Connell, 2011; Santos et al., 2011; Simmons et al., 2008).

3b. Vision-based information × SOA

The interaction between vision-based information and SOA allows three hypotheses. First, some previous research predicts that the role of vision-based information will be more prevalent with the long SOA than with the short one (Louwerse & Connell, 2011; Santos et al., 2011; Simmons et al., 2008; also see Barsalou et al., 2008). Second, in contrast, other research (Petilli et al., 2021) based on the same data that we are analysing (Hutchison et al., 2013) predicts vision-based priming only with the short SOA (200 ms), and not with the long one (1,200 ms). Third, other research does not predict any vision-based priming effect (Hutchison, 2003; Ostarek & Huettig, 2017; Pecher et al., 1998; Yee et al., 2012). In this regard, some studies have observed vision-based priming when the task was preceded by another task that required attention to visual features of concepts (Pecher et al., 1998; Yee et al., 2012), but the present data (Hutchison et al., 2013) does not contain such a prior task.

Language and vision across studies

Next, we consider our hypotheses regarding the role of language and vision across studies. Yet, before addressing those, we reiterate that caution is required due to the existence of other differences across these studies, such as the number of observations. First, we hypothesise that language-based information will be relevant in the three studies due to the consistent importance of language observed in past studies (Banks et al., 2021; Kiela & Bottou, 2014; Lam et al., 2015; Louwerse et al., 2015; Pecher et al., 1998; Petilli et al., 2021). Second, the extant evidence regarding vision-based information is less conclusive. Some studies have observed effects of vision-based information (Connell & Lynott, 2014a; Flores d’Arcais et al., 1985; Petilli et al., 2021; Schreuder et al., 1984), whereas others have not (Hutchison, 2003; Ostarek & Huettig, 2017), and a third set of studies have only observed them when the critical task was preceded by a task that required attention to visual features of concepts (Pecher et al., 1998; Yee et al., 2012). Based on these precedents, we hypothesise that vision-based information will be relevant in semantic decision, whereas it might or might not be relevant in semantic priming and in lexical decision.

Statistical power analysis

Statistical power depends on the following factors: (1) sample size (comprising the number of participants, items, trials, etc.), (2) effect size, (3) measurement variability and (4) number of comparisons being performed. Out of these, sample size is the factor that can best be controlled by researchers (Kumle et al., 2021). The three studies we present below, containing larger-than-average sample sizes, offer an opportunity to perform an a-priori power analysis to help determine the sample size of future studies (Albers & Lakens, 2018).
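The dependence of power on sample size and effect size can be illustrated with a small simulation. The sketch below is deliberately simplified and is not the procedure used in the present studies: it assumes a one-sample z-test with a known standard deviation and a two-sided alpha of .05, and the function name `simulated_power`, the effect size of 0.2 and the sample sizes are all illustrative assumptions.

```python
import random
import statistics


def simulated_power(n, effect_size, sd=1.0, n_sims=2000, seed=1):
    """Proportion of simulated samples in which a one-sample z-test
    (known sd, two-sided alpha = .05) detects a true mean of `effect_size`."""
    rng = random.Random(seed)
    critical_z = 1.96  # two-sided critical value for alpha = .05
    standard_error = sd / n ** 0.5
    hits = 0
    for _ in range(n_sims):
        sample = [rng.gauss(effect_size, sd) for _ in range(n)]
        z = statistics.fmean(sample) / standard_error
        if abs(z) > critical_z:
            hits += 1
    return hits / n_sims


# Same true effect, two sample sizes (illustrative values).
power_small = simulated_power(n=20, effect_size=0.2)
power_large = simulated_power(n=200, effect_size=0.2)
print(f"n = 20:  power = {power_small:.2f}")
print(f"n = 200: power = {power_large:.2f}")
```

Under these assumptions, the larger sample detects the same true effect far more often than the smaller one. Power analyses for mixed-effects models follow the same logic but simulate the full design, including participants and items.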


Insufficient statistical power lowers the reliability of effect sizes, and increases the likelihood of false positive results—i.e., Type I errors—as well as the likelihood of false negative results—i.e., Type II errors (Gelman & Carlin, 2014; Loken & Gelman, 2017; Tversky & Kahneman, 1971; von der Malsburg & Angele, 2017). For instance, Vasishth and Gelman (2021) illustrate how, in low-powered studies, effect sizes associated with significant results tend to be overestimated (also see Vasishth, Mertzen, et al., 2018).
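The overestimation of significant effects in low-powered designs, which Gelman and Carlin (2014) call the Type M (magnitude) error, can be sketched with a simulation. The example below is a simplified illustration under stated assumptions (a one-sample z-test with known standard deviation, a true effect of 0.2 and 20 observations per sample); the function name `exaggeration_ratio` is hypothetical.

```python
import random
import statistics


def exaggeration_ratio(n=20, true_effect=0.2, sd=1.0, n_sims=2000, seed=1):
    """Average absolute size of the significant estimates,
    expressed as a multiple of the true effect."""
    rng = random.Random(seed)
    standard_error = sd / n ** 0.5
    significant_estimates = []
    for _ in range(n_sims):
        sample = [rng.gauss(true_effect, sd) for _ in range(n)]
        estimate = statistics.fmean(sample)
        # Keep only estimates passing a two-sided z-test at alpha = .05.
        if abs(estimate) / standard_error > 1.96:
            significant_estimates.append(abs(estimate))
    return statistics.fmean(significant_estimates) / true_effect


ratio = exaggeration_ratio()
print(f"Significant estimates are about {ratio:.1f} times the true effect")
```

In this design, an estimate can only reach significance if its absolute value exceeds 1.96 standard errors (about 0.44), which is more than twice the true effect of 0.2. Every significant estimate is therefore necessarily an overestimate in magnitude.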

Over the past decade, replication studies and power analyses have uncovered insufficient sample sizes in psychology (Brysbaert, 2019; Heyman et al., 2018; Lynott et al., 2014; Montero-Melis et al., 2017, 2022; Rodríguez-Ferreiro et al., 2020; Vasishth, Mertzen, et al., 2018). In one of these studies, Heyman et al. (2018) demonstrated that increasing the sample size resulted in an increase of the reliability of the estimates, which in turn lowered the Type I error rate and the Type II error rate (i.e., false positive and false negative results, respectively). Calls for larger sample sizes have also been voiced in the field of neuroscience. For instance, Marek et al. (2022) estimated the sample size that would be required to reliably study the mapping between individual differences (such as general cognition) and brain structures. The authors found that the current median of 25 participants in each of these studies contrasted with the thousands of participants (around 10,000) that would be needed for a well-powered study (also see Button et al., 2013).

More topic-specific power analyses are necessary for several reasons. First, power analyses provide greater certainty on the reasons behind non-replications (e.g., Open Science Collaboration, 2015), and behind non-significant results at large. Non-replications are not solely explained by methodological differences across studies, questionable research practices and publication bias (C. J. Anderson et al., 2016; Barsalou, 2019; Corker et al., 2014; Gilbert et al., 2016; Williams, 2014; Zwaan, 2014; also see Tiokhin et al., 2021). In addition to these factors, a lack of statistical power can cause non-replications and non-significant results (see Loken & Gelman, 2017; Vasishth & Gelman, 2021).

Regarding non-significant results, it is worthwhile to consider some examples from research on individual differences. In this literature, there is a body of non-significant results, both in behavioural studies (Daidone & Darcy, 2021; Hedge et al., 2018; Muraki & Pexman, 2021; Ponari, Norbury, Rotaru, et al., 2018; Rodríguez-Ferreiro et al., 2020; for a Bayes factor analysis, see Rouder & Haaf, 2019) and in neuroscientific studies (Diaz et al., 2021). A greater availability of power analyses within this topic area and others will at least shed light on the influence of statistical power on the results. Furthermore, power analyses facilitate the identification of sensible sample sizes for future studies. Last, it should be noted that although increasing the statistical power comes at a cost in the short term, power analyses will help maximise the use of research funding in the long term by fostering more replicable research (see Vasishth & Gelman, 2021; also see Open Science Collaboration, 2015).


Al-Azary, H., Yu, T., & McRae, K. (2022). Can you touch the N400? The interactive effects of body-object interaction and task demands on N400 amplitudes and decision latencies. Brain and Language, 231, 105147.
Albers, C., & Lakens, D. (2018). When power analyses based on pilot data are biased: Inaccurate effect size estimators and follow-up bias. Journal of Experimental Social Psychology, 74, 187–195.
Anderson, C. J., Bahník, Š., Barnett-Cowan, M., Bosco, F. A., Chandler, J., Chartier, C. R., Cheung, F., Christopherson, C. D., Cordes, A., Cremata, E. J., Della Penna, N., Estel, V., Fedor, A., Fitneva, S. A., Frank, M. C., Grange, J. A., Hartshorne, J. K., Hasselman, F., Henninger, F., … Zuni, K. (2016). Response to Comment on “Estimating the reproducibility of psychological science.” Science, 351(6277), 1037.
Aujla, H. (2021). Language experience predicts semantic priming of lexical decision. Canadian Journal of Experimental Psychology/Revue Canadienne de Psychologie Expérimentale, 75(3), 235.
Balota, D. A., & Lorch, R. F. (1986). Depth of automatic spreading activation: Mediated priming effects in pronunciation but not in lexical decision. Journal of Experimental Psychology: Learning, Memory, and Cognition, 12(3), 336–345.
Balota, D. A., Yap, M. J., Cortese, M. J., & Watson, J. M. (2008). Beyond mean response latency: Response time distributional analyses of semantic priming. Journal of Memory and Language, 59(4), 495–523.
Balota, D. A., Yap, M. J., Hutchison, K. A., Cortese, M. J., Kessler, B., Loftis, B., Neely, J. H., Nelson, D. L., Simpson, G. B., & Treiman, R. (2007). The English Lexicon Project. Behavior Research Methods, 39, 445–459.
Banks, B., Wingfield, C., & Connell, L. (2021). Linguistic distributional knowledge and sensorimotor grounding both contribute to semantic category production. Cognitive Science, 45(10), e13055.
Barsalou, L. W. (2019). Establishing generalizable mechanisms. Psychological Inquiry, 30(4), 220–230.
Barsalou, L. W., Santos, A., Simmons, W. K., & Wilson, C. D. (2008). Language and simulation in conceptual processing. In Symbols and Embodiment. Oxford University Press.
Becker, S., Moscovitch, M., Behrmann, M., & Joordens, S. (1997). Long-term semantic priming: A computational account and empirical evidence. Journal of Experimental Psychology: Learning, Memory, and Cognition, 23(5), 1059–1082.
Bernabeu, P. (2018). Dutch modality exclusivity norms for 336 properties and 411 concepts. PsyArXiv.
Bernabeu, P., Lynott, D., & Connell, L. (2021). Preregistration: The interplay between linguistic and embodied systems in conceptual processing. OSF.
Bernabeu, P., Willems, R. M., & Louwerse, M. M. (2017). Modality switch effects emerge early and increase throughout conceptual processing: Evidence from ERPs. In G. Gunzelmann, A. Howes, T. Tenbrink, & E. J. Davelaar (Eds.), Proceedings of the 39th Annual Conference of the Cognitive Science Society (pp. 1629–1634). Cognitive Science Society.
Bottini, R., Morucci, P., D’Urso, A., Collignon, O., & Crepaldi, D. (2021). The concreteness advantage in lexical decision does not depend on perceptual simulations. Journal of Experimental Psychology: General.
Brauer, M., & Curtin, J. J. (2018). Linear mixed-effects models and the analysis of nonindependent data: A unified framework to analyze categorical and continuous independent variables that vary within-subjects and/or within-items. Psychological Methods, 23(3), 389–411.
Brysbaert, M. (2019). How many participants do we have to include in properly powered experiments? A tutorial of power analysis with reference tables. Journal of Cognition, 2(1), 16.
Bullinaria, J. A., & Levy, J. P. (2007). Extracting semantic representations from word co-occurrence statistics: A computational study. Behavior Research Methods, 39(3), 510–526.
Burman, D., Bitan, T., & Booth, J. (2008). Sex differences in neural processing of language among children. Neuropsychologia, 46(5), 1349–1362.
Button, K. S., Ioannidis, J. P. A., Mokrysz, C., Nosek, B. A., Flint, J., Robinson, E. S. J., & Munafò, M. R. (2013). Power failure: Why small sample size undermines the reliability of neuroscience. Nature Reviews Neuroscience, 14(5), 365–376.
Chen, I.-H., Zhao, Q., Long, Y., Lu, Q., & Huang, C.-R. (2019). Mandarin Chinese modality exclusivity norms. PLOS ONE, 14(2), e0211336.
Connell, L. (2019). What have labels ever done for us? The linguistic shortcut in conceptual processing. Language, Cognition and Neuroscience, 34(10), 1308–1318.
Connell, L., & Lynott, D. (2013). Flexible and fast: Linguistic shortcut affects both shallow and deep conceptual processing. Psychonomic Bulletin & Review, 20(3), 542–550.
Connell, L., & Lynott, D. (2014a). I see/hear what you mean: Semantic activation in visual word recognition depends on perceptual attention. Journal of Experimental Psychology: General, 143(2), 527–533.
Connell, L., Lynott, D., & Banks, B. (2018). Interoception: The forgotten modality in perceptual grounding of abstract and concrete concepts. Philosophical Transactions of the Royal Society B: Biological Sciences, 373(1752), 20170143.
Corker, K. S., Lynott, D., Wortman, J., Connell, L., Donnellan, M. B., Lucas, R. E., & O’Brien, K. (2014). High quality direct replications matter: Response to Williams (2014). Social Psychology, 45(4), 324–326.
Daidone, D., & Darcy, I. (2021). Vocabulary size is a key factor in predicting second language lexical encoding accuracy. Frontiers in Psychology, 12, 688356.
Davies, R. A., Arnell, R., Birchenough, J. M., Grimmond, D., & Houlson, S. (2017). Reading through the life span: Individual differences in psycholinguistic effects. Journal of Experimental Psychology: Learning, Memory, and Cognition, 43(8), 1298.
De Deyne, S., Navarro, D. J., Collell, G., & Perfors, A. (2021). Visual and affective multimodal models of word meaning in language and mind. Cognitive Science, 45(1), e12922.
De Deyne, S., Navarro, D. J., Perfors, A., Brysbaert, M., & Storms, G. (2019). The Small World of Words English word association norms for over 12,000 cue words. Behavior Research Methods, 51, 987–1006.
De Deyne, S., Perfors, A., & Navarro, D. (2016). Predicting human similarity judgments with distributional models: The value of word associations. Proceedings of COLING 2016, the 26th International Conference on Computational Linguistics: Technical Papers, 1861–1870.
de Wit, B., & Kinoshita, S. (2015). The masked semantic priming effect is task dependent: Reconsidering the automatic spreading activation process. Journal of Experimental Psychology: Learning, Memory, and Cognition, 41(4), 1062–1075.
DeLuca, V., Rothman, J., Bialystok, E., & Pliatsikas, C. (2019). Redefining bilingualism as a spectrum of experiences that differentially affects brain structure and function. Proceedings of the National Academy of Sciences, 116(15), 7565–7574.
Diaz, M. T., Karimi, H., Troutman, S. B. W., Gertel, V. H., Cosgrove, A. L., & Zhang, H. (2021). Neural sensitivity to phonological characteristics is stable across the lifespan. NeuroImage, 225, 117511.
Dils, A. T., & Boroditsky, L. (2010). Visual motion aftereffect from understanding motion language. Proceedings of the National Academy of Sciences, 107(37), 16396–16400.
Fernandino, L., Tong, J.-Q., Conant, L. L., Humphries, C. J., & Binder, J. R. (2022). Decoding the information structure underlying the neural representation of concepts. Proceedings of the National Academy of Sciences, 119(6).
Fetterman, A. K., Wilkowski, B. M., & Robinson, M. D. (2018). On feeling warm and being warm: Daily perceptions of physical warmth fluctuate with interpersonal warmth. Social Psychological and Personality Science, 9(5), 560–567.
Flores d’Arcais, G. B., Schreuder, R., & Glazenborg, G. (1985). Semantic activation during recognition of referential words. Psychological Research, 47(1), 39–49.
Gelman, A., & Carlin, J. (2014). Beyond power calculations: Assessing type S (sign) and type M (magnitude) errors. Perspectives on Psychological Science, 9(6), 641–651.
Gilbert, D. T., King, G., Pettigrew, S., & Wilson, T. D. (2016). Comment on “Estimating the reproducibility of psychological science.” Science, 351(6277), 1037–1037.
Hedge, C., Powell, G., & Sumner, P. (2018). The reliability paradox: Why robust cognitive tasks do not produce reliable individual differences. Behavior Research Methods, 50(3), 1166–1186.
Heyman, T., Bruninx, A., Hutchison, K. A., & Storms, G. (2018). The (un)reliability of item-level semantic priming effects. Behavior Research Methods, 50(6), 2173–2183.
Hoedemaker, R. S., & Gordon, P. C. (2014). It takes time to prime: Semantic priming in the ocular lexical decision task. Journal of Experimental Psychology: Human Perception and Performance, 40(6), 2179–2197.
Holt, L. E., & Beilock, S. L. (2006). Expertise and its embodiment: Examining the impact of sensorimotor skill expertise on the representation of action-related text. Psychonomic Bulletin & Review, 13(4), 694–701.
Hutchinson, S., & Louwerse, M. M. (2013). Language statistics and individual differences in processing primary metaphors. Cognitive Linguistics, 24(4), 667–687.
Hutchison, K. A. (2003). Is semantic priming due to association strength or feature overlap? A microanalytic review. Psychonomic Bulletin & Review, 10(4), 785–813.
Hutchison, K. A., Balota, D. A., Neely, J. H., Cortese, M. J., Cohen-Shikora, E. R., Tse, C.-S., Yap, M. J., Bengson, J. J., Niemeyer, D., & Buchanan, E. (2013). The semantic priming project. Behavior Research Methods, 45, 1099–1114.
Joordens, S., & Becker, S. (1997). The long and short of semantic priming effects in lexical decision. Journal of Experimental Psychology: Learning, Memory, and Cognition, 23(5), 1083–1105.
Jung, M., Mody, M., Fujioka, T., Kimura, Y., Okazawa, H., & Kosaka, H. (2019). Sex differences in white matter pathways related to language ability. Frontiers in Human Neuroscience, 13, 898.
Kiela, D., & Bottou, L. (2014). Learning image embeddings using convolutional neural networks for improved multi-modal semantics. Proceedings of the 2014 Conference on Empirical Methods in Natural Language Processing (EMNLP), 36–45.
Kos, M., Van den Brink, D., & Hagoort, P. (2012). Individual variation in the late positive complex to semantic anomalies. Frontiers in Psychology, 3(318).
Kumle, L., Võ, M. L.-H., & Draschkow, D. (2021). Estimating power in (generalized) linear mixed models: An open introduction and tutorial in R. Behavior Research Methods.
Lam, K. J., Dijkstra, T., & Rueschemeyer, S. A. (2015). Feature activation during word recognition: Action, visual, and associative-semantic priming effects. Frontiers in Psychology, 6, 659.
Lamiell, J. T. (2019). Statistical thinking in psychology: Some needed critical perspective on what “everyone knows.” In J. T. Lamiell (Ed.), Psychology’s Misuse of Statistics and Persistent Dismissal of its Critics (pp. 99–121). Springer International Publishing.
Landauer, T. K., Foltz, P. W., & Laham, D. (1998). An introduction to latent semantic analysis. Discourse Processes, 25(2-3), 259–284.
Lim, R. Y., Yap, M. J., & Tse, C.-S. (2020). Individual differences in Cantonese Chinese word recognition: Insights from the Chinese Lexicon Project. Quarterly Journal of Experimental Psychology, 73(4), 504–518.
Loken, E., & Gelman, A. (2017). Measurement error and the replication crisis. Science, 355(6325), 584–585.
Louwerse, M. M., & Connell, L. (2011). A taste of words: Linguistic context and perceptual simulation predict the modality of words. Cognitive Science, 35(2), 381–398.
Louwerse, M. M., Hutchinson, S., Tillman, R., & Recchia, G. (2015). Effect size matters: The role of language statistics and perceptual simulation in conceptual processing. Language, Cognition and Neuroscience, 30(4), 430–447.
Lund, K., & Burgess, C. (1996). Producing high-dimensional semantic spaces from lexical co-occurrence. Behavior Research Methods, Instruments, & Computers, 28(2), 203–208.
Lund, K., Burgess, C., & Atchley, R. A. (1995). Semantic and associative priming in high-dimensional semantic space. Proceedings of the Cognitive Science Society, 660–665.
Lynott, D., Connell, L., Brysbaert, M., Brand, J., & Carney, J. (2020). The Lancaster Sensorimotor Norms: Multidimensional measures of perceptual and action strength for 40,000 English words. Behavior Research Methods, 52, 1271–1291.
Lynott, D., Corker, K. S., Wortman, J., Connell, L., Donnellan, M. B., Lucas, R. E., & O’Brien, K. (2014). Replication of “Experiencing physical warmth promotes interpersonal warmth” by Williams and Bargh (2008). Social Psychology, 45(3), 216–222.
Mak, M., & Willems, R. M. (2019). Mental simulation during literary reading: Individual differences revealed with eye-tracking. Language, Cognition and Neuroscience, 34(4), 511–535.
Mandera, P., Keuleers, E., & Brysbaert, M. (2017). Explaining human performance in psycholinguistic tasks with models of semantic similarity based on prediction and counting: A review and empirical validation. Journal of Memory and Language, 92, 57–78.
Marek, S., Tervo-Clemmens, B., Calabro, F. J., Montez, D. F., Kay, B. P., Hatoum, A. S., Donohue, M. R., Foran, W., Miller, R. L., Hendrickson, T. J., Malone, S. M., Kandala, S., Feczko, E., Miranda-Dominguez, O., Graham, A. M., Earl, E. A., Perrone, A. J., Cordova, M., Doyle, O., … Dosenbach, N. U. F. (2022). Reproducible brain-wide association studies require thousands of individuals. Nature, 1–7.
Miceli, A., Wauthia, E., Lefebvre, L., Ris, L., & Simoes Loureiro, I. (2021). Perceptual and interoceptive strength norms for 270 French words. Frontiers in Psychology, 12.
Miceli, A., Wauthia, E., Lefebvre, L., Vallet, G. T., Ris, L., & Loureiro, I. S. (2022). Differences related to aging in sensorimotor knowledge: Investigation of perceptual strength and body object interaction. Archives of Gerontology and Geriatrics, 102, 104715.
Montero-Melis, G. (2021). Consistency in motion event encoding across languages. Frontiers in Psychology, 12(625153).
Montero-Melis, G., Eisenbeiss, S., Narasimhan, B., Ibarretxe-Antuñano, I., Kita, S., Kopecka, A., Lüpke, F., Nikitina, T., Tragel, I., Jaeger, T. F., & Bohnemeyer, J. (2017). Satellite- vs. verb-framing underpredicts nonverbal motion categorization: Insights from a large language sample and simulations. Cognitive Semantics, 3(1), 36–61.
Montero-Melis, G., van Paridon, J., Ostarek, M., & Bylund, E. (2022). No evidence for embodiment: The motor system is not needed to keep action verbs in working memory. Cortex, 150, 108–125.
Morucci, P., Bottini, R., & Crepaldi, D. (2019). Augmented modality exclusivity norms for concrete and abstract Italian property words. Journal of Cognition, 2(1), 42.
Muraki, E. J., & Pexman, P. M. (2021). Simulating semantics: Are individual differences in motor imagery related to sensorimotor effects in language processing? Journal of Experimental Psychology: Learning, Memory, and Cognition, 47(12), 1939–1957.
Open Science Collaboration. (2015). Estimating the reproducibility of psychological science. Science, 349(6251), aac4716.
Ostarek, M., & Bottini, R. (2021). Towards strong inference in research on embodiment – Possibilities and limitations of causal paradigms. Journal of Cognition, 4(1), 5.
Ostarek, M., & Huettig, F. (2017). A task-dependent causal role for low-level visual processes in spoken word comprehension. Journal of Experimental Psychology: Learning, Memory, and Cognition, 43(8), 1215–1224.
Ostarek, M., & Huettig, F. (2019). Six challenges for embodiment research. Current Directions in Psychological Science, 28(6), 593–599.
Paivio, A. (1990). Mental representations: A dual coding approach. Oxford University Press.
Pearson, J., & Kosslyn, S. M. (2015). The heterogeneity of mental representation: Ending the imagery debate. Proceedings of the National Academy of Sciences, 112(33), 10089–10092.
Pecher, D., Zeelenberg, R., & Raaijmakers, J. G. W. (1998). Does pizza prime coin? Perceptual priming in lexical decision and pronunciation. Journal of Memory and Language, 38(4), 401–418.
Perfetti, C. A., & Hart, L. (2002). The lexical quality hypothesis. In L. Verhoeven, C. Elbro, & P. Reitsma (Eds.), Studies in Written Language and Literacy (Vol. 11, pp. 189–213). John Benjamins Publishing Company.
Petilli, M. A., Günther, F., Vergallito, A., Ciapparelli, M., & Marelli, M. (2021). Data-driven computational models reveal perceptual simulation in word processing. Journal of Memory and Language, 117, 104194.
Pexman, P. M., Heard, A., Lloyd, E., & Yap, M. J. (2017). The Calgary semantic decision project: Concrete/abstract decision data for 10,000 English words. Behavior Research Methods, 49(2), 407–417.
Pexman, P. M., & Yap, M. J. (2018). Individual differences in semantic processing: Insights from the Calgary semantic decision project. Journal of Experimental Psychology: Learning, Memory, and Cognition, 44(7), 1091–1112.
Planchuelo, C., Buades-Sitjar, F., Hinojosa, J. A., & Duñabeitia, J. A. (2022). The nature of word associations in sentence contexts. Experimental Psychology.
Plaut, D. C., & Booth, J. R. (2000). Individual and developmental differences in semantic priming: Empirical and computational support for a single-mechanism account of lexical processing. Psychological Review, 107(4), 786–823.
Ponari, M., Norbury, C. F., Rotaru, A., Lenci, A., & Vigliocco, G. (2018). Learning abstract words and concepts: Insights from developmental language disorder. Philosophical Transactions of the Royal Society B: Biological Sciences, 373, 20170140.
Pylyshyn, Z. W. (1973). What the mind’s eye tells the mind’s brain: A critique of mental imagery. Psychological Bulletin, 80(1), 1–24.
Reilly, J., Flurie, M., & Peelle, J. E. (2020). The English lexicon mirrors functional brain activation for a sensory hierarchy dominated by vision and audition: Point-counterpoint. Journal of Neurolinguistics, 55, 100895.
Rodríguez-Ferreiro, J., Aguilera, M., & Davies, R. (2020). Semantic priming and schizotypal personality: Reassessing the link between thought disorder and enhanced spreading of semantic activation. PeerJ, 8, e9511.
Roque, L. S., Kendrick, K. H., Norcliffe, E., Brown, P., Defina, R., Dingemanse, M., Dirksmeyer, T., Enfield, N. J., Floyd, S., Hammond, J., Rossi, G., Tufvesson, S., Putten, S. van, & Majid, A. (2015). Vision verbs dominate in conversation across cultures, but the ranking of non-visual verbs varies. Cognitive Linguistics, 26(1), 31–60.
Rouder, J. N., & Haaf, J. M. (2019). A psychometrics of individual differences in experimental tasks. Psychonomic Bulletin & Review, 26(2), 452–467.
Santos, A., Chaigneau, S. E., Simmons, W. K., & Barsalou, L. W. (2011). Property generation reflects word association and situated simulation. Language and Cognition, 3(1), 83–119.
Sassenhagen, J., & Alday, P. M. (2016). A common misapplication of statistical inference: Nuisance control with null-hypothesis significance tests. Brain and Language, 162, 42–45.
Schreuder, R., Flores d’Arcais, G. B., & Glazenborg, G. (1984). Effects of perceptual and conceptual similarity in semantic priming. Psychological Research, 45(4), 339–354.
Simmons, W. K., Hamann, S. B., Harenski, C. L., Hu, X. P., & Barsalou, L. W. (2008). fMRI evidence for word association and situated simulation in conceptual processing. Journal of Physiology-Paris, 102(1), 106–119.
Speed, L. J., & Brysbaert, M. (2021). Dutch sensory modality norms. Behavior Research Methods.
Speed, L. J., & Majid, A. (2020). Grounding language in the neglected senses of touch, taste, and smell. Cognitive Neuropsychology, 37(5-6), 363–392.
Speed, L. J., van Dam, W. O., Hirath, P., Vigliocco, G., & Desai, R. H. (2017). Impaired comprehension of speed verbs in Parkinson’s disease. Journal of the International Neuropsychological Society, 23(5), 412–420.
Stasenko, A., Garcea, F. E., Dombovy, M., & Mahon, B. Z. (2014). When concepts lose their color: A case of object-color knowledge impairment. Cortex, 58, 217–238.
Tiokhin, L., Yan, M., & Morgan, T. J. H. (2021). Competition for priority harms the reliability of science, but reforms can help. Nature Human Behaviour, 5(7), 857–867.
Tversky, A., & Kahneman, D. (1971). Belief in the law of small numbers. Psychological Bulletin, 76(2), 105–110.
Ullman, M. T., Miranda, R. A., & Travers, M. L. (2008). Sex differences in the neurocognition of language. In J. B. Becker, K. J. Berkley, N. Geary, E. Hampson, J. Herman, & E. Young (Eds.), Sex on the brain: From genes to behavior (pp. 291–309). Oxford University Press.
Vasishth, S., & Gelman, A. (2021). How to embrace variation and accept uncertainty in linguistic and psycholinguistic data analysis. Linguistics, 59(5), 1311–1342.
Vasishth, S., Mertzen, D., Jäger, L. A., & Gelman, A. (2018). The statistical significance filter leads to overoptimistic expectations of replicability. Journal of Memory and Language, 103, 151–175.
Vergallito, A., Petilli, M. A., & Marelli, M. (2020). Perceptual modality norms for 1,121 Italian words: A comparison with concreteness and imageability scores and an analysis of their impact in word processing tasks. Behavior Research Methods, 52(4), 1599–1616.
Versace, R., Bailloud, N., Magnan, A., & Ecalle, J. (2021). The impact of embodied simulation in vocabulary learning. The Mental Lexicon, 16(1), 2–22.
Vigliocco, G., Meteyard, L., Andrews, M., & Kousta, S. (2009). Toward a theory of semantic representation. Language and Cognition, 1(2), 219–247.
von der Malsburg, T., & Angele, B. (2017). False positives and other statistical errors in standard analyses of eye movements in reading. Journal of Memory and Language, 94, 119–133.
Vukovic, N., & Williams, J. N. (2015). Individual differences in spatial cognition influence mental simulation of language. Cognition, 142, 110–122.
Wallentin, M. (2020). Chapter 6 - Gender differences in language are small but matter for disorders. In R. Lanzenberger, G. S. Kranz, & I. Savic (Eds.), Handbook of Clinical Neurology (Vol. 175, pp. 81–102). Elsevier.
Wang, X., Li, G., Zhao, G., Li, Y., Wang, B., Lin, C.-P., Liu, X., & Bi, Y. (2021). Social and emotion dimensional organizations in the abstract semantic space: The neuropsychological evidence. Scientific Reports, 11(1), 23572.
Williams, L. E. (2014). Improving psychological science requires theory, data, and caution: Reflections on Lynott et al. (2014). Social Psychology, 45(4), 321–323.
Wingfield, C., & Connell, L. (2022b). Understanding the role of linguistic distributional knowledge in cognition. Language, Cognition and Neuroscience, 1–51.
Winter, B., Perlman, M., & Majid, A. (2018). Vision dominates in perceptual language: English sensory vocabulary is optimized for usage. Cognition, 179, 213–220.
Yap, M. J., Balota, D. A., Sibley, D. E., & Ratcliff, R. (2012). Individual differences in visual word recognition: Insights from the English Lexicon Project. Journal of Experimental Psychology: Human Perception and Performance, 38(1), 53–79.
Yap, M. J., Balota, D. A., & Tan, S. E. (2013). Additive and interactive effects in semantic priming: Isolating lexical and decision processes in the lexical decision task. Journal of Experimental Psychology: Learning, Memory, and Cognition, 39(1), 140–158.
Yap, M. J., Hutchison, K. A., & Tan, L. C. (2017). Individual differences in semantic priming performance: Insights from the semantic priming project. In M. N. Jones (Ed.), Frontiers of cognitive psychology. Big data in cognitive science (pp. 203–226). Routledge/Taylor & Francis Group.
Yap, M. J., Tse, C.-S., & Balota, D. A. (2009). Individual differences in the joint effects of semantic priming and word frequency revealed by RT distributional analyses: The role of lexical integrity. Journal of Memory and Language, 61(3), 303–325.
Yee, E., Ahmed, S. Z., & Thompson-Schill, S. L. (2012). Colorless green ideas (can) prime furiously. Psychological Science, 23(4), 364–369.
Zhong, Y., Wan, M., Ahrens, K., & Huang, C.-R. (2022). Sensorimotor norms for Chinese nouns and their relationship with orthographic and semantic variables. Language, Cognition and Neuroscience, 0(0), 1–23.
Zwaan, R. A. (2014). Replications should be performed with power and precision: A response to Rommers, Meyer, and Huettig (2013). Psychological Science, 25(1), 305–307.

  1. According to Lamiell (2019), ‘individual differences’ is a misnomer in that the analyses used to examine those (e.g., regression) are not participant-specific. While this may partly hold for the current study too, the use of by-participant random effects increases the role of individuals in the analysis.

  2. The names of all variables used in the analyses were slightly adjusted for this text to aid understanding, for instance by replacing underscores with spaces (the conversions are reflected in the analysis scripts). One specific case deserves further comment. In this text, we use the formula of the SOA instead of the ‘interstimulus interval’ (ISI), which we used in the analysis, because the SOA has been more commonly used in previous papers (e.g., Hutchison et al., 2013; Pecher et al., 1998; Petilli et al., 2021; Yap et al., 2017). In our analysis, we used the ISI formula because it was the one present in the data set of Hutchison et al. (2013). The only difference between these formulas is that the ISI does not count the presentation of the prime word. In the current study (Hutchison et al., 2013), the presentation of the prime word lasted 150 ms. Therefore, the 50 ms ISI is equivalent to a 200 ms SOA, and the 1,050 ms ISI is equivalent to a 1,200 ms SOA. The use of either formula in the analysis would not affect our results, as the ISI conditions were recoded as -0.5 and 0.5 (Brauer & Curtin, 2018).

Pablo Bernabeu, 2022. Licence: CC BY 4.0.

Online book created using the R package bookdown.