Publications  

Investigating object orientation effects across 18 languages

Mental simulation theories of language comprehension propose that people automatically create mental representations of objects mentioned in sentences. Mental representation is often measured with the sentence-picture verification task, wherein participants first read a sentence that implies an object property (e.g., shape or orientation). Participants then respond to an image of an object by indicating whether or not it appeared in the sentence. Previous studies have shown matching advantages for shape, but findings concerning object orientation have not been robust across languages. This registered report investigated the match advantage of object orientation across 18 languages in nearly 4,000 participants. The preregistered analysis revealed no compelling evidence for a match advantage for orientation across languages. Additionally, the match advantage was not predicted by mental rotation scores. In light of these findings, we discuss the implications for current theory and methodology surrounding mental simulation.

Multi-region investigation of ‘man’ as default in attitudes

Previous research has studied the extent to which men are the default members of social groups in terms of memory, categorization, and stereotyping, but not in terms of attitudes, which is critical given the relationship between attitudes and behavior. Results from our survey (N > 5,000), collected via a globally distributed laboratory network in over 40 regions, demonstrated that attitudes toward Black people and politicians had a stronger relationship with attitudes toward the men than toward the women of those groups. However, attitudes toward White people had a stronger relationship with attitudes toward White women than White men, whereas attitudes toward East Asian people, police officers, and criminals did not have a stronger relationship with attitudes toward either the men or the women of each respective group. Regional agreement with traditional gender roles was explored as a potential moderator. These findings have implications for understanding the unique forms of prejudice women face around the world.

Starting from the very beginning: Unraveling third language (L3) development with longitudinal data from artificial language learning and related epistemology

The burgeoning field of third language (L3) acquisition has increasingly focused on intermediate stages of language development, aiming to establish the groundwork for comprehensive models of L3 learning that encompass the entire developmental sequence. This article underscores the importance of a robust epistemological foundation, advocating for incremental knowledge building through longitudinal research. In the study presented here, we use artificial languages to investigate L3 acquisition from initial exposure with complete input control, factoring in individual differences in executive functions and history of bilingual exposure/engagement to assess the role of these variables in shaping learning trajectories and modulating cross-linguistic influence (CLI). This approach not only advances our understanding of L3 development under controlled conditions but also links L3 acquisition research to broader cognitive science inquiries.

Language and vision in conceptual processing: Multilevel analysis and statistical power

Research has suggested that conceptual processing depends on both language-based and vision-based information. We tested this interplay at three levels of the experimental structure: individuals, words and tasks. To this end, we drew on three existing, large data sets that implemented the paradigms of semantic priming, semantic decision and lexical decision. We extended these data sets with measures of language-based and vision-based information, and analysed how the latter variables interacted with participants’ vocabulary size and gender, and also with presentation speed in the semantic priming study. We performed the analysis using mixed-effects models that included a comprehensive array of fixed effects—including covariates—and random effects. First, we found that language-based information was more important than vision-based information. Second, in the semantic priming study—whose task required distinguishing between words and nonwords—, both language-based and vision-based information were more influential when words were presented faster. Third, a ‘task-relevance advantage’ was identified in higher-vocabulary participants. Specifically, in lexical decision, higher-vocabulary participants were more sensitive to language-based information than lower-vocabulary participants. In contrast, in semantic decision, higher-vocabulary participants were more sensitive to word concreteness. Fourth, we demonstrated the influence of the analytical method on the results. These findings support the interplay between language and vision in conceptual processing, and demonstrate the influence of measurement instruments on the results. Last, we estimated the sample size required to reliably investigate various effects. We found that 300 participants were sufficient to examine the effect of language-based information contained in words, whereas more than 1,000 participants were necessary to examine the effect of vision-based information and the interactions of both former variables with vocabulary size, gender and presentation speed. In conclusion, this power analysis reveals the need to increase sample sizes when conducting research on perceptual simulation and individual differences.
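
For illustration, the general shape of such an analysis can be sketched in R with the lme4 package. This is a minimal sketch with hypothetical variable names, not the study’s actual code:

    # Minimal sketch of a mixed-effects model with fixed effects, covariates
    # and random effects; all variable names are hypothetical.
    library(lme4)

    model <- lmer(
      RT ~ language_info * vocabulary_size + vision_info * vocabulary_size +
        gender + word_frequency +                      # covariates
        (language_info + vision_info | participant) +  # by-participant slopes
        (vocabulary_size | word),                      # by-word slopes
      data = dataset
    )
    summary(model)

    # Power for a given effect can then be estimated by simulation, e.g.:
    # library(simr)
    # powerSim(model, test = fixed("vision_info"), nsim = 200)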

Language and sensorimotor simulation in conceptual processing: Multilevel analysis and statistical power

Research has suggested that conceptual processing depends on both language-based and sensorimotor information. In this thesis, I investigate the nature of these systems and their interplay at three levels of the experimental structure—namely, individuals, words and tasks. In Study 1, I contributed to a multi-lab replication of the object orientation effect, which has been used to test sensorimotor simulation. The effect did not appear in any of the 18 languages examined, and it was not influenced by individual differences in mental rotation. Next, in Study 2, we drew on three existing data sets that implemented semantic priming, semantic decision and lexical decision. We extended these data sets with measures of language-based and vision-based information, and analysed their interactions with participants’ vocabulary size and gender, and with presentation speed. The analysis had a conservative structure of fixed and random effects. First, we found that language-based information was more important than vision-based information. Second, in the semantic priming study—whose task required distinguishing between words and nonwords—, both language-based and vision-based information were more influential when words were presented faster. Third, a ‘task-relevance advantage’ was identified in higher-vocabulary participants. Specifically, in lexical decision, higher-vocabulary participants were more sensitive to language-based information than lower-vocabulary participants. In contrast, in semantic decision, higher-vocabulary participants were more sensitive to word concreteness. Fourth, we demonstrated the influence of the analytical method on the results. Last, we estimated the sample size required to investigate various effects. We found that 300 participants were sufficient to examine the effect of language-based information in words, whereas more than 1,000 participants were necessary to examine the effect of vision-based information and the interactions of both former variables with vocabulary size, gender and presentation speed. This power analysis reveals the need to increase sample sizes when conducting research on perceptual simulation and individual differences.

Preregistration: The interplay between linguistic and embodied systems in conceptual processing

This preregistration outlines a study that will investigate the dynamic nature of conceptual processing by examining the interplay between linguistic distributional systems—comprising word co-occurrence and word association—and embodied systems—comprising sensorimotor and emotional information. A set of confirmatory research questions are addressed using data from the Calgary Semantic Decision project, along with additional measures for the stimuli corresponding to distributional language statistics, embodied information, and individual differences in vocabulary size.

More refined typology and design in linguistic relativity: The case of motion event encoding

Linguistic relativity is the influence of language on other realms of cognition. For instance, the way movement is expressed in a person’s native language may influence how they perceive movement. Motion event encoding (MEE) is usually framed as a typological dichotomy. Path-in-verb languages tend to encode path information within the verb (e.g., ‘leave’), whereas manner-in-verb languages tend to encode manner (e.g., ‘jump’). The results of MEE-based linguistic relativity experiments range from no effect to effects on verbal and nonverbal cognition. Seeking a more definitive conclusion, we propose linguistic and experimental enhancements. First, we examine state-of-the-art typology, suggesting how a recent MEE classification across twenty languages (Verkerk, 2014) may enable more powerful analyses. Second, we review procedural challenges such as the influence of verbal thought and second-guessing in experiments. To tackle these challenges, we propose distinguishing verbal and nonverbal subgroups, and including enough filler items. Finally, we exemplify these enhancements in an experimental design.

Dutch modality exclusivity norms for 336 properties and 411 concepts

Stimuli that have been rated on various dimensions form part of the toolkit of language researchers. The current study presents modality exclusivity norms for 336 properties and 411 concepts in Dutch. Forty-two respondents rated the auditory, haptic, and visual strength of these words. Mean scores were then computed, yielding acceptable reliability values. Measures of modality exclusivity and perceptual strength were also computed. Furthermore, the data include psycholinguistic variables from other corpora, covering length (e.g., number of phonemes), frequency (e.g., contextual diversity), and distinctiveness (e.g., number of orthographic neighbours), along with concreteness and age of acquisition. To test these norms, Lynott and Connell's (2009, 2013) analyses were replicated. First, unimodal, bimodal, and tri-modal words were found. Vision was the most prevalent modality. Vision and touch were relatively related, leaving a more independent auditory modality. Properties were more strongly perceptual than concepts. Last, sound symbolism was investigated using regression, which revealed that auditory strength predicted lexical properties of the words better than the other modalities did, or in a different direction. All the data and analysis code, including a web application, are available from https://osf.io/brkjw.
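
For reference, modality exclusivity in Lynott and Connell’s framework is the range of a word’s mean ratings across modalities divided by their sum. A minimal sketch in R, with made-up example values:

    # Modality exclusivity (Lynott & Connell, 2009): range of a word's mean
    # ratings across modalities divided by their sum. Example values only.
    ratings <- data.frame(
      word     = c("luid", "ruw", "geel"),  # 'loud', 'rough', 'yellow'
      auditory = c(4.8, 0.9, 0.7),
      haptic   = c(0.6, 4.5, 0.8),
      visual   = c(1.2, 3.1, 4.9)
    )

    modalities <- c("auditory", "haptic", "visual")
    ratings$exclusivity <- apply(ratings[, modalities], 1, function(x)
      (max(x) - min(x)) / sum(x))

    # The dominant modality is the one with the highest mean rating
    ratings$dominant <- modalities[apply(ratings[, modalities], 1, which.max)]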

Modality switch effects emerge early and increase throughout conceptual processing: Evidence from ERPs

We tested whether conceptual processing is modality-specific by tracking the time course of the Conceptual Modality Switch effect. Forty-six participants verified the relation between property words and concept words. The conceptual modality of consecutive trials was manipulated in order to produce an Auditory-to-visual switch condition, a Haptic-to-visual switch condition, and a Visual-to-visual, no-switch condition. Event-Related Potentials (ERPs) were time-locked to the onset of the first word (property) in the target trials so as to measure the effect online and to avoid a within-trial confound. A switch effect was found, characterized by more negative ERP amplitudes for modality switches than no-switches. It proved significant in four typical time windows from 160 to 750 milliseconds post word onset, with greater strength in posterior brain regions, and after 350 milliseconds. These results suggest that conceptual processing may be modality-specific in certain tasks, but also that the early stage of processing is relatively amodal.

Modality switches occur early and extend late in conceptual processing: Evidence from ERPs

The engagement of sensory brain regions during word recognition is widely documented, yet its precise relevance is less clear. It would constitute perceptual simulation only if it has a functional role in conceptual processing. We investigated this in an Event-Related Potential (ERP) experiment implementing the conceptual modality switch paradigm. In each trial, participants verified the relation between a property word and a concept word. Orthogonally, we manipulated the conceptual modality of successive trials, and tested whether switching modalities incurred any processing costs at different stages of word recognition. Unlike previous studies, we time-locked ERPs to the first word of target trials, in order to measure the modality transitions from the beginning, and also to reduce confounds within the target trial. Further, we included different types of switch—one from auditory to visual modality, and one from haptic to visual—, which were compared to the non-switch—visual to visual. Also, one group of participants was asked to respond quickly (n = 21), and another group to respond self-paced (n = 21), whilst a few others received no constraints (n = 5). We found ERP effects in four typical time windows from 160 to 750 ms post word onset. The overall effect is characterized by a negativity for modality-switching relative to not switching, and it increases over time. Further, the effect arises with both types of switch, and influences both participant groups within anterior and posterior brain regions. The emergence of this effect in the first time window particularly suggests that sensory regions may have a functional role in conceptual processing. The increased effect later on converges with previous studies in supporting the compatibility of distributional and embodied processing. On a less conclusive note, more research may be necessary to ascertain the nature of the effect at late stages.

Language evolution: Current status and future directions

The topic of language evolution is characterised by the scarcity of records, but also by a large flow of research produced within multiple subtopics and perspectives. Over the past few decades, significant advances have been made regarding the geographical and temporal origins of language, while current work is rather devoted to the underpinnings of language in the brain, genes, body, and culture of humans. Much of this literature is polarised over the crucial dichotomy of nativism versus emergentism. Our state-of-affairs report also confirms a high degree of speculation, albeit with a decrease in the case of modelling. To tackle the speculation and the large research flow, we propose a more impersonal kind of review, focused on the topic’s questions rather than on particular accounts. Another observation is that novel perspectives are on the rise. One of these highlights the importance of perceptual cognition, often dubbed ‘embodiment,’ in the earlier evolution of language. Following this lead, we adapted a previous experiment that had investigated the correspondence between certain perceptual features of events and the different grammatical orders that arose as participants acted out those events. That design provided an apt basis for adding a further variable, namely the contrast between body-based communication (gestures) and more disembodied communication (symbol matching). Albeit tentative, the results of this pilot experiment reveal a greater effect of the embodiment variable on grammatical preferences, which we see as inviting further exploration of embodied cognition in language evolution.

Web Applications and Dashboards  

Articles republished on R-bloggers

WebVTT caption transcription app

This open-source, R-based web application converts video captions (subtitles) from the Web Video Text Tracks (WebVTT) format into plain text. Users upload a WebVTT file with the extension ‘.vtt’ or ‘.txt’. Metadata such as timestamps are then removed automatically, and the text is formatted.
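
For illustration, the core cleaning step can be sketched in base R (a minimal sketch under assumed WebVTT contents, not the app’s actual code):

    # Read a WebVTT file and keep only the caption text: drop the WEBVTT
    # header, blank lines, bare cue numbers, and timestamp lines such as
    # '00:00:01.000 --> 00:00:04.000'.
    clean_vtt <- function(path) {
      lines <- readLines(path, warn = FALSE)
      keep <- !grepl("^WEBVTT", lines) &
        nzchar(trimws(lines)) &
        !grepl("^[0-9]+$", lines) &
        !grepl("-->", lines, fixed = TRUE)
      paste(lines[keep], collapse = " ")
    }

    # clean_vtt("captions.vtt")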

Web application for the simulation of experimental data

This open-source, R-based web application is suitable for educational and research purposes in the experimental sciences. It allows the creation of varied data sets with specified structures, such as between-group or within-participant variables, which can be categorical or continuous. These features can be selected across the different tabs. In the penultimate tab, a custom summary of the current data set can be constructed. In the last tab, the list of parameters and the data set can be downloaded.
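
For illustration, the kind of data set the app generates can be sketched in base R (a minimal sketch, not the app’s code): a 2 (between-group) × 2 (within-participant) design with a continuous outcome.

    # Simulate a 2 x 2 design: 'group' varies between participants,
    # 'condition' varies within participants.
    set.seed(123)
    n_per_group <- 30

    design <- expand.grid(
      participant = 1:(2 * n_per_group),
      condition   = c("A", "B")
    )
    design$group <- ifelse(design$participant <= n_per_group,
                           "control", "treatment")

    # Outcome: grand mean + group effect + condition effect + noise
    design$score <- 100 +
      ifelse(design$group == "treatment", 5, 0) +
      ifelse(design$condition == "B", 3, 0) +
      rnorm(nrow(design), sd = 10)

    aggregate(score ~ group + condition, data = design, FUN = mean)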

Data dashboard: Butterfly species richness in Los Angeles

Dashboard with open data from a study by Prudic et al. (2018), which compares citizen science with traditional methods in butterfly sampling. Coding tasks included long-transforming, merging and, as ever, wrangling with a table.
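
For illustration, the long-transforming and merging steps can be sketched with tidyr and dplyr (object and column names are hypothetical):

    library(dplyr)
    library(tidyr)

    # Long-transform: one row per site and method, rather than one
    # column per sampling method
    counts_long <- counts_wide %>%
      pivot_longer(
        cols      = c(citizen_science, traditional),
        names_to  = "method",
        values_to = "species_richness"
      )

    # Merge in site metadata before plotting
    counts_long <- left_join(counts_long, site_info, by = "site")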

Dutch modality exclusivity norms

This app presents linguistic data over several tabs. The code combines the front end of Flexdashboard—based on R Markdown and yielding a neat user interface—with the back end of Shiny, which allows users to download the sections of data they select, in various formats. The hardest nuts to crack included modifying the row/column orientation without affecting the functionality of the tables. A cool, recent finding was the reactable package. A nice feature, enabled by Flexdashboard, was the use of quite different formats in different tabs.
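
For illustration, the selection-and-download pattern can be sketched as follows (object names are hypothetical; in Flexdashboard, this code sits in a document with ‘runtime: shiny’):

    library(shiny)
    library(reactable)

    # Render the table with row selection enabled
    output$norms_table <- renderReactable(
      reactable(norms, selection = "multiple")
    )

    # Let users download the rows they selected
    output$download_csv <- downloadHandler(
      filename = function() "selected_norms.csv",
      content  = function(file) {
        selected <- getReactableState("norms_table", "selected")
        write.csv(norms[selected, ], file, row.names = FALSE)
      }
    )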

Modality switch effects emerge early and increase throughout conceptual processing

We tested whether conceptual processing is modality-specific by tracking the time course of the Conceptual Modality Switch effect. Forty-six participants verified the relation between property words and concept words. The conceptual modality of consecutive trials was manipulated in order to produce an Auditory-to-visual switch condition, a Haptic-to-visual switch condition, and a Visual-to-visual, no-switch condition. Event-Related Potentials (ERPs) were time-locked to the onset of the first word (property) in the target trials so as to measure the effect online and to avoid a within-trial confound. A switch effect was found, characterized by more negative ERP amplitudes for modality switches than no-switches. It proved significant in four typical time windows from 160 to 750 milliseconds post word onset, with greater strength in the Slow group, in posterior brain regions, and in the N400 window. The earliest switch effect was located in the language brain region, whereas later it was more prominent in the visual region. In the N400 and Late Positive windows, the Quick group presented the effect especially in the language region, whereas the Slow group presented it mainly in the visual region. These results suggest that contextual factors such as time resources modulate the engagement of linguistic and embodied systems in conceptual processing.

Workshops

Introduction to R, statistics, visualisation, reproducible documents, web applications and dashboards, HTML, CSS, and web hosting.

Date Title Event and location Registration
Aug 2020 Open data and reproducibility: Markdown, data dashboards and Binder v2.1 (co-led with Florencia D'Andrea) CarpentryCon@Home, The Carpentries [online] Link
July 2020 Open data and reproducibility: Markdown, data dashboards and Binder (co-led with Eirini Zormpa) UK Cognitive Linguistics Conference, University of Birmingham [online] Link
May 2020 Markdown Lancaster University [online]

Presentations

Date Format Title Event
Apr 2025 Poster The interplay of cognition and bilingual repertoires in L3 learning (1) Meeting of the Experimental Psychology Society, Lancaster University
Oct, Nov 2024; Mar 2025 Poster Smart starts: Cognitive differences predict prior knowledge involvement in language learning (1) XIV Conference of the Spanish Society for Experimental Psychology (SEPEX), Almería;
(2) 65th Annual Meeting of the Psychonomic Society, New York;
(3) 3rd MULTILINGUA Network Meeting, Barcelona
Feb 2025 Talk Unpacking ERP Responses in Artificial Language Learning Lunch Seminar at the Center for Language, Brain and Learning (C-LaBL), UiT The Arctic University of Norway
Nov 2024 Co-organisation and presentations Tasks tailored for several electroencephalography and behavioural stations. Short presentations given on electroencephalography, language acquisition and executive functions. Public outreach event of the UiT Center for Language, Brain and Learning (C-LaBL), hosted at the Arctic University Museum of Norway.
Oct, Nov 2023; June, July, Aug, Sept 2024 Posters and speed talk Investigating language learning and morphosyntactic transfer longitudinally using artificial languages (1) AcqVA Aurora Centre, UiT The Arctic University of Norway;
(2) PoLaR Lab, UiT The Arctic University of Norway;
(3) 9th Conference of the Scandinavian Association for Language and Cognition, Norwegian University of Science and Technology;
(4) Highlights in the Language Sciences 2024, Radboud University;
(5) 13th International Conference on Third Language Acquisition and Multilingualism, University of Groningen;
(6) 30th Conference on Architectures and Mechanisms for Language Processing (AMLaP), University of Edinburgh
July 2024 Poster Language and vision in conceptual processing: Multilevel analysis and statistical power Highlights in the Language Sciences 2024, Radboud University
Mar, July, Sept 2024 Poster Making research materials Findable, Accessible, Interoperable and Reusable (1) AcqVA Aurora Closing Event, UiT The Arctic University of Norway;
(2) Highlights in the Language Sciences 2024, Radboud University;
(3) 30th Conference on Architectures and Mechanisms for Language Processing (AMLaP), University of Edinburgh
Mar 2024 Poster Is third language learning influenced by working memory, implicit learning and inhibitory control? AcqVA Aurora Closing Event, UiT The Arctic University of Norway
Mar 2024 Poster Effects of cognitive individual differences on cross-linguistic effects in L3 acquisition AcqVA Aurora Closing Event, UiT The Arctic University of Norway
May 2023 Reading group discussion Discussion of Labotka et al. (2023): Testing the effects of congruence in adult multilingual acquisition with implications for creole genesis Reading group of the PoLaR Lab, UiT The Arctic University of Norway
Jan 2023 Reading group discussion Discussion of Jost et al. (2019): Input complexity affects long-term retention of statistically learned regularities in an artificial language learning task Reading group of the PoLaR Lab, UiT The Arctic University of Norway
Oct 2022 Talk Language and sensorimotor simulation in conceptual processing: Multilevel analysis and statistical power AcqVA Aurora Centre, UiT The Arctic University of Norway
Sept 2022 Talk The interplay between linguistic and embodied systems in conceptual processing Presented by Dr. Dermot Lynott at the 22nd Meeting of the European Society for Cognitive Psychology (ESCOP), in Lille, France
Feb 2022 Talk Language and vision in conceptual processing: Multilevel analysis and statistical power Language and Cognition Seminars, Dept. Psychology, Lancaster University
May 2021 Talk Linguistic and embodied systems in conceptual processing: Variation across individuals and items Lancaster University Postgraduate Psychology Conference 2021
May 2021 Talk Towards reproducibility and maximally-open data Open Science Week 2021, Open Scholarship Community Galway
Nov 2020 Talk Mixed-effects models in R and a new tool for data simulation New Tricks Seminars, Dept. Psychology, Lancaster University
Oct 2020 Talk Reproducibilidad en torno a una aplicación web Reprohack en español, LatinR Conference 2020
Apr 2020 Talk Embedding open research and reproducibility in the UG and PGT curricula (with Andrew Stewart and Phil McAleer) Collaborations Workshop, Software Sustainability Institute
Sept 2019 x 2 Talk Presentations at two open days on education and research at our department Department of Psychology, Lancaster University
Dec 2018 Talk Presenting data interactively online using Shiny Research Software Forum, Lancaster University
Nov 2018 Talk Linguistic and embodied systems in conceptual processing: Role of individual differences Psychology postgraduate medley, Lancaster University
Jan 2017 x2; Apr, July, Nov 2017 Poster Modality switch effects emerge early and increase throughout conceptual processing: Evidence from ERPs (1) Event representations in episodic and semantic memory, University of York;
(2) Netherlands Graduate School of Linguistics, Radboud University;
(3) Juniorendag, Utrecht University;
(4) 39th Annual Conference of the Cognitive Science Society, London;
(5) 58th Annual Meeting of the Psychonomic Society, Vancouver
June 2016 Talk Conceptual processing at different speeds: Probing linguistic and embodied systems Synapsium, Radboud University
May 2016 Poster Norming study of modality exclusivity in Dutch and an ongoing EEG study of linguistic and embodied conceptual processing Psycholinguistics in Flanders, University of Antwerp
June 2015 Talk New reviews and insights on language evolution Tenth Language at the University of Essex (LangUE) Conference, University of Essex
Feb, May 2015 Talk Shallow and deep conceptual representation: An ERP design (1) Theme Meetings, Radboud University;
(2) Neurobiology of Language Lab meeting, MPI Psycholinguistics
Jan, Mar 2015 Poster Linguistic relativity in motion (1) Netherlands Graduate School of Linguistics, University of Amsterdam;
(2) Juniorendag, Radboud University

Pablo Bernabeu

Researcher
University of Oxford

pablo.bernabeu@education.ox.ac.uk
pcbernabeu@gmail.com

Interests

  • Education & digital technologies
  • Cognitive psychology & neuroscience
  • Linguistics
  • Data science & programming
  • Research methods & open science

I conduct research and data analysis on digitally-enhanced childhood learning as part of the Learning in Families through Technology (LiFT) project at the Department of Education at the University of Oxford.

Previously, I held a postdoctoral fellowship at UiT The Arctic University of Norway, contributing to a project on language learning, crosslinguistic influence and executive functions. Prior to that, I completed a PhD in Psychology at Lancaster University, having previously earned a research master’s in Language and Communication from Tilburg University.

Beyond education and digital technologies, my interests include cognitive psychology and neuroscience, linguistics, data science, research methods and open science.

I have worked with a wide range of research methods, including behavioural and EEG experiments, corpus analysis and computational modelling.

Materials from my research are available at osf.io/25u3x.

Want to find out more? Please drop me an email (or try this chatbot for fun).

Recent Work

  • Researcher, June 2025 – May 2027

    Department of Education, University of Oxford

    -  Additional service: Assistance at inaugural convening of AI in Education at Oxford University

  • Postdoctoral Fellow, Nov 2022 – Feb 2025

    Center for Language, Brain and Learning, UiT The Arctic University of Norway

    -  I worked at the Department of Language and Culture, and specifically within the PoLaR Lab and C-LaBL. As the local manager of the LESS Project (Language Economy through Transfer Source Selectivity), I worked on a longitudinal study that investigates how bilingual people acquire an additional language, how this process is influenced by the characteristics of the languages, and how the process is instantiated in the brain. As part of this work, I contributed to designing our main study and developing materials in Norwegian and English, as well as creating materials in Spanish and English for a partner study in Spain. After documenting and pretesting these materials, I prepared a preregistration for the studies. Additionally, I recruited participants, designed the protocol for electroencephalography (EEG) sessions, and trained students and research assistants in both the protocol and EEG methodology more broadly. I also established and managed an EEG lab, conducted numerous sessions, supervised those led by research assistants, and monitored the longitudinal progress per participant. Moreover, I presented the study design at conferences and collaborated with research assistants to preprocess, visualise and analyse EEG and behavioural data.

    -  Additional service: co-organisation of multiple events, including the lunch meetings of AcqVA Aurora and C-LaBL, and a public outreach event of C-LaBL. Peer review for Cognition, Cognitive Science and the EuroSLA Conference.

  • Statistical Consultant (25%), Nov–Dec 2022

    AcqVA Aurora Center, UiT The Arctic University of Norway

    -  I worked as a statistical consultant for the CLICK Project (Cross-Linguistic Influence of Competing Knowledge), which investigated multilingualism in heritage speakers. I worked with questionnaire and eye-tracking data.

Education

  • PhD Psychology, 2018–2022

    Lancaster University (United Kingdom)

    -  Additional service: peer review for Cognitive Science and for the Psychological Science Accelerator; development of a website for the open science group in my department.

  • Research Master Language and Communication, 2013–2017

    Tilburg University and Radboud University (the Netherlands)

    -  Grade: 7.54 out of 10 (Distinction)

    -  In my thesis, conducted at Tilburg University and at the Max Planck Institute for Psycholinguistics, I investigated how word comprehension during reading is modulated by modality-specific information (i.e., visual, auditory, and haptic). Consistent with a large body of research, I observed that conceptual processing is not restricted to abstract linguistic representations, but is modulated by the perceptual information in words and by people's individual experiences in perceptual domains. Outside of my thesis, I investigated the influence of specific languages on nonverbal cognition. Specifically, I reviewed research examining how and why people's perception of motion may be modulated by the way in which their first language encodes motion events in sentences. Furthermore, I investigated the co-evolution of language and other cognitive systems. Throughout this research, I used a range of multidisciplinary methods including word classification surveys, corpus analysis and electroencephalography.

    -  Student member, Master’s curriculum and accreditation committee

  • BA English, 2007–2013

    Autonomous University of Madrid (Spain)

    -  Grade: 7.30 out of 10 (2:1 Hons)

    -  One-year Erasmus exchange at University of Jyväskylä, Finland

    -  One-year exchange at University of Barcelona, Spain

    -  Six-month Spanish teaching placement in Kaunas, Lithuania

Teaching, Supervision and Advice

Since 2018, I have advised numerous students and colleagues on designing and conducting behavioural and EEG experiments, as well as on the management, preprocessing and analysis of data. For instance, during my PhD, I supervised an undergraduate internship. During my postdoctoral fellowship at UiT, I supervised three research assistantships and co-supervised a master's thesis. Furthermore, I am a certified Carpentries Instructor, and have designed and led several workshops on data analysis using R. Earlier in my career, I taught English to secondary-education students in Spain, and taught Spanish to adults in Lithuania.

During my PhD, I held a graduate teaching assistantship that involved 180 hours of teaching annually, covering seminars, essay marking and lab sessions. Each year, I led 30 seminars and marked 80 essays in developmental, cognitive and social psychology, while also helping in 30 statistics lab sessions (activities summarised below). Furthermore, I was a representative for graduate teaching assistants in the department for a year.

Course and remit
2021–22 Introduction to developmental psychology (115) — Seminars and essay marking
Introduction to neuroscience (112) — Seminars
Introduction to cognitive psychology (111) — Seminars and essay marking
Social psychology in the digital age (113) — Seminars
Statistics for psychologists I (121) — Lab sessions
2020–21 Introduction to developmental psychology (115) — Seminars and essay marking
Introduction to neuroscience (112) — Seminars
Introduction to cognitive psychology (111) — Seminars and essay marking
Social psychology in the digital age (113) — Seminars
Statistics for psychologists I and II (121 and 122) — Lab sessions
2019–20 Understanding psychology (101) — Seminars and essay marking
Cognitive psychology (201) — Seminars and essay marking
Master's statistics (401) — Lab sessions
2018–19 Understanding psychology (101) — Seminars and essay marking
Investigating psychology: Analysis (102) — Lab sessions

Lastly, I have created two web applications intended for educational contexts. One supports the simulation of data, enhancing the teaching of statistical principles, while the other streamlines the transcription of video captions for use in multimedia learning environments.

Teaching Philosophy

My teaching experiences have honed my ability to create a collaborative and engaging learning environment, where students are encouraged to think critically and apply their knowledge effectively. As a result, my teaching approach ensures that students not only acquire foundational knowledge but also develop the skills necessary to excel in their academic and professional endeavours. To this end, I draw on a range of applications that foster participation and collaboration, such as MS Teams, Google Docs, GitHub, Mentimeter, Vevox and Miro.

I have been guided by a few core principles that are outlined below.

Situating Cognitive Faculties in Meaningful Contexts

I strive to situate the concepts I teach in the appropriate contexts. For instance, language is produced in the brain and in society. These contexts shape language as a human faculty and languages as human products. In the same vein, language shares space with other cognitive faculties and other cultural products, which often help us understand language. By pointing out these contexts, I encourage students to explore perspectives beyond traditional boundaries. Indeed, my teaching incorporates insights from psychology, neuroscience, linguistics and cross-cultural research.

Connecting Theory to Practice

I strive to connect theoretical concepts to the methods that are used for their study. This helps prevent the disconnects that are occasionally experienced by students and academics, where there can be an unhelpful focus on a method without theory or a theory without method.

Promoting Scientific Rigour and Reproducibility

My commitment to open science and reproducibility informs my teaching. By embedding these principles into research workflows, I help students produce reliable and sustainable scholarship that can withstand the test of time. In practical terms, these standards are designed to (1) enhance the quality of research, (2) optimise the use of academic resources in the medium and long term by facilitating access to and reuse of research materials, and (3) enhance students’ professional prospects by equipping them with a high-value, translatable set of skills.

I would like to continue honing these principles, aided by the advice and inspiration from more experienced colleagues and by the regular feedback from students.

Some Possible Courses

Below, I present some examples of courses that I would like to teach. Blending interdisciplinary perspectives with rigorous methodological training, these courses include explorations of language and cognition, cutting-edge research techniques, and best practices in reproducibility and data visualisation.

1. Introduction to the Psychology and Neurobiology of Language

This course explores the intricate relationship between language, cognition and neurobiology, providing students with a foundational understanding of how language is processed and represented in the mind and brain. Topics include the historical and evolutionary development of language, mechanisms of comprehension and production, and the cognitive processes underpinning bilingualism and multilingualism. Additionally, the course examines the interactions between language and cognitive functions like executive control and sensorimotor simulation, culminating in an in-depth discussion of linguistic relativity.

2. Research Methods in the Psychology and Neurobiology of Language

Focusing on the methodological challenges and opportunities in studying language and cognition, this course provides students with the tools to design and conduct world-class crosslinguistic research. Students will tackle issues such as overcoming biases inherent in WEIRD (Western, Educated, Industrialised, Rich and Democratic) samples and identifying meaningful crosslinguistic patterns. The curriculum integrates theoretical frameworks, such as modularity versus holism, with practical training in experimental paradigms and methods that have become essential. Emphasis is placed on the use of psychophysical and neuroimaging techniques, including electroencephalography, magnetoencephalography, functional magnetic resonance imaging, eye-tracking and pupillometry, to provide a comprehensive understanding of the methods driving the field forward.

3. Electroencephalography

This course immerses students in the theory and application of electroencephalography (EEG) for studying cognitive processes, with a focus on language and decision-making. Students will gain a historical perspective on EEG research and a practical understanding of its implementation in Psychology and Linguistics. Key topics include event-related potentials, time-frequency analysis and experimental designs. Through a combination of lectures and laboratory sessions, students will gain the theoretical and technical skills needed to design and conduct EEG studies. As part of this work, students will practice how to search for solutions reliably and responsibly by drawing on community forums, business support services and artificial intelligence applications.

4. Increasing Reproducibility Throughout the Workflow of a Study

Reproducibility is a cornerstone of scientific integrity, and this course empowers students to embed reproducible practices in their research workflows. Grounded in open science principles, students will examine the role of Psychology in the replication crisis, and become familiar with methodological frameworks like FAIR, which helps create more Findable, Accessible, Interoperable and Reusable data. Practical sessions will focus on implementing tools such as the Open Science Framework and R Markdown, while providing training in reproducible experiment design, data analysis and manuscript preparation. As part of this work, students will practice how to search for solutions reliably and responsibly by drawing on community forums and artificial intelligence applications. By the end of the course, students will be equipped to produce transparent, replicable research that meets the highest standards of scientific rigour.

5. Data Visualisation

Effective data visualisation is crucial for interpreting and communicating our research, and this course teaches students how to achieve this using R. With an emphasis on clarity and accessibility, students will learn to create a variety of visualisations, from static plots to interactive web applications. The course covers best practices for summarising data, combining plots and integrating tables into reports. Practical sessions provide experience with advanced visualisation techniques, ensuring students can present complex data in an engaging and professional manner. As part of this work, students will practice how to search for solutions reliably and responsibly by drawing on community forums and artificial intelligence applications.

6. Introduction to Statistical Analysis

This course provides a foundational understanding of statistical reasoning and methods, with an emphasis on transparent and defensible research practices. Using R as the primary tool, students will explore key concepts such as descriptive statistics, probability theory, hypothesis testing, correlation and basic linear models. Emphasis is placed on integrating statistical considerations into the early stages of research design, ensuring that methods are appropriately aligned with research questions and data structures. Through a combination of theoretical instruction and practical exercises, students will learn to justify their analytical choices and interpret statistical results with care and precision.

7. Advanced Statistical Modelling

Building on introductory concepts, this course equips students with the skills to conduct sophisticated statistical analyses using R. Topics include generalised linear models, multilevel (mixed-effects) modelling, model selection and comparison, approaches to handling non-normal and hierarchical data, and dimensionality reduction techniques such as principal component analysis and factor analysis. Throughout, students will be encouraged to consider statistical design and analysis as an integrated part of the research workflow, making principled methodological decisions in response to the specific demands of their data and hypotheses. The course also addresses challenges such as overfitting, multicollinearity and multiple comparisons, with a strong focus on transparent reporting and justification of analytical strategies. Practical sessions will provide hands-on experience with real-world data and complex experimental designs.

These courses collectively emphasise the integration of theory and practice, reproducibility and methodological innovation. My academic experience in linguistics, psychology, statistics and research methods directly informs the design of these courses, ensuring that students gain both foundational knowledge and practical skills to excel in their academic and professional endeavours.

Funding and Awards

Year Grant/Award Purpose/Reason
2021 Joint second place in the Open Scholarship Prize Competition organised by Open Scholarship Community Galway Prize obtained after a final series of presentations.
2020 RepliCATS Grant, University of Melbourne Obtained for completing 20 RepliCATS research assessments.
2020 Gorilla Grant from Gorilla and Prolific Conducting a large-sample experiment on the internet.
2020 Software Sustainability Institute Fellowship Organising training and practice activities in research software, focussed on data presentation using R.
Apr 2019 Travel grant, UK Open Science Working Group. Aston University, UK Attendance at first meeting of the UK Open Science Working Group.
2018 – 2022 Scholarship for PhD and graduate teaching assistantship, Lancaster University, UK See research and other activities.
Nov 2017 Psychonomic Society Graduate Travel Award for 58th Annual Meeting Presenting a poster on research from my master's degree.
July 2017 Student Volunteer, Cognitive Science Society Conference Presenting a poster on research from my master's degree.
July 2017 Grindley Grant from Experimental Psychology Society Presenting a poster on research from my master's degree at Cognitive Science Society Conference.
Jan 2017 Grindley Grant from Experimental Psychology Society Presenting a poster on research from my master's degree at conference ‘Event representations in episodic and semantic memory’.
May – June 2016 Funding for experiment from Neurobiology of Language Dept., Max Planck Institute for Psycholinguistics Conducting an EEG experiment for my master's thesis.

Other Work

May – July 2018
Service Analyst. Onfido, London, UK
I was responsible for verifying identity checks through random sampling and data analysis, leveraging insights retrieved from Power BI. I designed and implemented a self-updating Excel dashboard with dynamic tables and visualisations to streamline data reporting. My role involved close collaboration with service analysts and engineers to ensure accurate and actionable insights. Additionally, I utilised tools such as Jira, Confluence and Zendesk to support project management, documentation and customer service processes.
Dec 2015 – Feb 2016
Data Science Market Researcher and Student Recruiter (part-time). Tilburg University
I contributed to the production of a leaflet, and informed prospective students at open days.
2015 – 2016
Student Representative at Master’s Degree Fairs in Spain (part-time). Radboud University, Tilburg University
I worked at three fairs with Radboud University, and at one with Tilburg University.
2013 – 2016
Presenter of my Master's Degree at Open Days (part-time). Tilburg University
2013 – 2016
Communication, Website and Student Recruiter (part-time). Academia Bravosol, Madrid, Spain
2011 – 2013
Teacher of English and Spanish (part-time). Academia Bravosol, Madrid, Spain

Skills

Science Best practices for research integrity, data protection, data management, technical documentation, open science, effective collaboration and project management. Electroencephalography (lab setup, recording, preprocessing and analysis), BrainVision, transcranial brain stimulation, eye tracking, online experiments, jsPsych, OpenSesame, Gorilla, ELAN, Turnitin, Zotero
Statistics Frequentist and Bayesian statistics. Linear mixed-effects models, principal component analysis, etc.
Programming in R Statistics, modular programming, Markdown, web applications and dashboards, Rhino, Crosstalk, Plotly, Leaflet, bibliometrics, Binder environments, big data, natural language processing, Tidyverse
Web, data, Artificial Intelligence and typesetting HTML, CSS, JavaScript, regular expressions, Git, Linux, bash, containerisation, Python, machine learning models including MediaPipe and Whisper, SQL, AI-assisted development (incl. GitHub Copilot), high-performance computing (incl. Slurm), LaTeX
Business and administration Project management, technical documentation, quantitative and qualitative service analysis. Power BI, Zendesk, Jira, Confluence, Asana, Google Analytics
Languages Proficient: English, Spanish. Intermediate: Catalan, French. Basic: Dutch, Italian, Norwegian

Videos and Podcasts

Not the most riveting channel on YouTube—much less on Spotify, Apple Podcasts or iVoox.

2025 ·  The modular mini-grammar: Building testable and reproducible artificial languages using FAIR principles

Created using Google Gemini and NotebookLM.

In the high-stakes world of scientific inquiry, methods and findings are inextricable. Yet, issues of reproducibility remain a challenge, especially in experimental linguistics and cognitive science. As the old adage goes, “To err is human”, but when creating research materials, adhering to best practices can significantly reduce mistakes and enhance long-term efficiency.

In this episode of Codex Mentis, we explore the crucial application of the FAIR Guiding Principles—making materials Findable, Accessible, Interoperable, and Reusable—to the creation of stimuli and experimental workflows.

Drawing on research presented by Bernabeu and colleagues, we delve into a complex study on multilingualism using artificial languages, designed specifically to ensure the materials are reproducible, testable, modifiable, and expandable. Unlike many previous artificial language studies that showed low to medium accessibility, this methodology emphasizes high standards for scientific data management.

What you will learn:

• The Power of Open Source: We discuss the importance of using free, script-based, open-source software, such as R and OpenSesame, to augment the credibility and reliability of research.

• Modular Frameworks: Discover how creating a modular workflow based on minimal components in R facilitates the expansion of materials to new languages or within the same language set.

• Rigour and Reproducibility: We examine crucial testing steps applied throughout the preparation workflow—including checking that all experimental elements appear equally often—to prevent blatant disparities and spurious effects.

• Detailed Experimentation: Hear how custom Python code within OpenSesame was implemented to manage complex procedures across multiple sessions, including assigning participant-specific parameters (like mini-language or resting-state order).

• Measuring the Brain: We look at the technical challenge of accurately time-locking electroencephalographic (EEG) measurements. The episode details the custom Python script used in OpenSesame to send triggers to the serial port, enabling precise Event-Related Potential (ERP) recording.

• Generous Documentation: Why detailed documentation, using formats like README.txt that are universally accessible, is essential for allowing collaborators and future researchers (or even your future self) to understand, reproduce, and reuse the materials.

Adhering to FAIR standards ensures that the investment in research materials facilitates researchers’ work beyond the shortest term, contributing to the best use of resources and increasing scientific reliability.

View sources and related content.

2025 ·  Third language learning and morphosyntactic transfer

Created using Google Gemini and NotebookLM.

Many of us know how difficult it is to master a second language (L2). But what happens when you decide to go for a third? You might assume the process gets easier once your brain is “warmed up,” but the reality is far more complex and far more fascinating.

In this insightful episode of Codex Mentis, we explore the burgeoning science of Third Language Acquisition, or L3 acquisition. We reveal why learning an L3 presents a fundamentally different cognitive puzzle than learning an L2.

The Two-Blueprint Problem: When an L3 learner approaches a new language, their brain has two prior linguistic blueprints—the native language (L1) and the second language (L2)—instead of just one. This means they already have experience managing two co-existing, often competing, language systems. This difference has profound, measurable consequences on the learning process, documented clearly in studies like the one involving L1 English/L2 Spanish speakers learning French, where they preferentially borrowed the complicating Spanish grammar instead of the helpful English one. This phenomenon, known as Cross-Linguistic Influence or ‘transfer,’ forces the L3 learner's brain to run a rapid, high-stakes cost-benefit analysis about which existing knowledge base to deploy. This effort reflects a fundamental principle of human cognition: cognitive economy, where the brain avoids redundancy by reusing existing knowledge.

The Great Debate: How Does the Brain Choose its Blueprint? The field is split over how transfer occurs:

  1. Typological Primacy Model: Argues for a ‘wholesale’ transfer—the brain makes a quick-and-dirty assessment of the new language's overall structure (its typology) and copies the entire grammatical system of the most similar known language (L1 or L2). This is the ‘big picture first’ approach.

  2. Linguistic Proximity Model and Scalpel Model: Suggest a continuous, granular, property-by-property negotiation. Influence is exerted by the language (L1 or L2) that has the most similar feature to the specific feature currently being processed in the L3.

Building Languages in the Lab: To test these competing theories and study the initial state of learning, scientists employ ingenious methodology: the artificial language paradigm. These miniature, custom-designed languages provide total control over input and allow researchers to create perfectly unambiguous contrasts between the learner's L1, L2, and the new L3. By using familiar words but new grammar (semi-artificial languages), researchers bypass the time-consuming process of memorizing vocabulary (the ‘lexico-semantic bottleneck’) and get straight to processing morphosyntax.

Learning vs. Acquisition: The Neural Evidence: This leads to a critical question rooted in Stephen Krashen’s work: are these lab studies capturing subconscious, intuitive acquisition (like a child absorbing their native tongue) or conscious, effortful learning (like cramming rules for an exam)?

Using EEG brain scans to measure neural activity, researchers look for the P600—the brain's automatic, implicit signature for grammatical errors in a native language. Surprisingly, early studies on artificial languages did not find the P600. Instead, they observed the P300. The P300 is a domain-general signal linked to attention, working memory, and processing unexpected patterns.

This means the brain’s initial response to a new grammar is not an automatic ‘copy-and-paste’ of a prior language; rather, L3 acquisition begins with the conscious recruitment of domain-general pattern-matching and attention.

The Next Frontier: We detail the sophisticated, large-scale, longitudinal study currently underway, designed to bridge the gap between conscious learning and subconscious acquisition. This research tracks participants over months to see if the P300 evolves into the automatic P600, while systematically measuring individual differences in working memory, inhibitory control, and implicit learning aptitude.

The study of the third tongue is evolving beyond linguistics; it has become a privileged window into one of the most fundamental questions about the human mind: how we manage, integrate, and reuse complex systems of knowledge.

Join us and delve into the science of the multilingual mind!

View sources and related content.

2025 ·  Behind the curtains: Methods used to investigate conceptual processing

Created using Google Gemini and NotebookLM.

How do scientists measure a thought? While the great philosophical questions about the nature of meaning have been debated for centuries, the last few decades have seen the development of a sophisticated scientific toolkit designed to turn these abstract queries into concrete, measurable data. In this episode of Codex Mentis, we go behind the curtains of cognitive science to explore the very methods used to investigate how the human brain processes language and constructs meaning.

Moving from the ‘what’ to the ‘how’, this programme offers a detailed review of the modern psycholinguist's toolkit. The journey begins with the foundational behavioural paradigms that capture cognition in milliseconds. Discover the logic behind the Lexical Decision Task, where a simple button press reveals the speed of word recognition, and the Semantic Priming paradigm, which uses subtle manipulations of time to dissociate the mind's automatic reflexes from its controlled, strategic operations.

From there, the discussion delves into the neuro-cognitive instruments that allow us to eavesdrop on the brain at work. Learn how Electroencephalography (EEG) and its famous N400 component provide a precise electrical timestamp for the brain's “sense-making” effort. Explore how Functional Magnetic Resonance Imaging (fMRI) creates detailed maps of the brain's “semantic system,” showing us where meaning is processed. And see how Eye-Tracking in the Visual World Paradigm provides a direct, observable trace of the brain making predictions in real time.

Finally, the episode demystifies the complex statistical techniques required to analyse this intricate data. We delve into the shift from older statistical methods to modern Linear Mixed-Effects Models, which are designed to handle the inherent variability between people and words. The conversation concludes with a crucial look at the foundations of trustworthy research, examining how scientists determine the sensitivity of their experiments and calculate the required sample sizes to ensure their findings are robust and reproducible. This episode provides a comprehensive guide to the ingenious procedures scientists employ to understand one of the most fundamental aspects of human experience: how we make sense of the world, one word at a time.

View sources and related content.

2025 ·  The architecture of meaning: Inside the words we use

Created using Google Gemini and NotebookLM.

What happens in your brain when you understand a simple word? It seems instantaneous, but this seemingly simple act is at the heart of one of the deepest mysteries of the human mind and has sparked one of the longest-running debates in cognitive science.

In this episode of Codex Mentis, we journey deep into the architecture of meaning to explore the battle between two powerful ideas. For decades, scientists were divided. Is your brain a vast, abstract dictionary, processing words like ‘kick’ by looking up amodal symbols and their connections to other symbols? Or is it a sophisticated simulator, where understanding ‘kick’ involves partially re-enacting the physical experience in your motor cortex?

We begin with a landmark finding—the ‘object orientation effect’—that seemed to provide a knockout punch for the simulation theory, only to see this cornerstone of embodied cognition crumble under the immense rigour of a massive, multi-lab replication study involving thousands of participants across 18 languages. This ‘failed’ replication didn't end the debate; it forced the entire field to evolve, moving beyond simple dichotomies and toward a more nuanced and profound understanding of the mind.

This episode unpacks the state-of-the-art ‘hybrid’ model of conceptual processing, which is at the forefront of modern cognitive science. Discover how your brain pragmatically and flexibly uses two complementary systems in a dynamic partnership. The first is a fast, efficient language system that operates on statistical patterns, much like a modern AI, providing a ‘shallow’ but rapid understanding of a word's context. The second is a slower, more resource-intensive sensorimotor system that provides ‘deep’ grounding by simulating a word's connection to our lived, bodily experience.

The episode delves into the groundbreaking research from Pablo Bernabeu's 2022 thesis, which reveals that the interplay between these two systems is not fixed but constantly adapts based on three critical levels:

  1. The task: The brain strategically deploys simulation only when a task demands deep semantic processing, conserving cognitive energy for shallower tasks.

  2. The word: Concrete concepts like ‘hammer’ rely more heavily on sensorimotor simulation than abstract concepts like ‘justice’.

  3. The individual: We explore the fascinating ‘task-relevance advantage,’ a consistent finding that a larger vocabulary isn't just about knowing more words, but about possessing the cognitive finesse to flexibly and efficiently deploy the right mental system for the job at hand.

We also pull back the curtain on the science itself, discussing the ‘replication crisis’ and the immense statistical power needed to reliably detect these subtle cognitive effects—often requiring over 1,000 participants for a single experiment. This methodological deep dive reveals why the science of the mind requires massive, collaborative efforts to move forward.

Finally, we look to the future, exploring how the recent explosion of Large Language Models (LLMs) provides a fascinating test case for these theories, and how new frontiers like interoception—our sense of our body's internal state—are expanding the very definition of embodiment to help explain our grasp of abstract concepts like ‘anxiety’ or ‘hope’.

This is a comprehensive exploration of the intricate, context-dependent dance between language and body that creates meaning in every moment. It will fundamentally change the way you think about the words you use every day.


2025 ·  Segmentation of ERPs involving several markers and time adjustments in BrainVision Analyzer

This live demonstration guides you through the process of segmenting event-related potentials (ERPs) in BrainVision Analyzer. The events of interest are represented by several markers, requiring some thought to time-lock each segmentation to the event onset.

2025 ·  Visualising EEG effects with topographic mapping in BrainVision Analyzer

This tutorial walks through the key steps: creating grand averages across participants, computing difference waves between experimental conditions, selecting appropriate map types, and defining time windows for visualisation.

2025 ·  Naming results files exported from Gorilla Experiment Builder

2024 ·  Reducing the impedance in electroencephalography using a blunt needle, electrolyte gel and wiggling

2024 ·  Briefing participants to prevent muscle artifacts in electroencephalography sessions

2021 ·  Linguistic and embodied systems in conceptual processing: Variation across individuals and items

2020 ·  Reproducibilidad en torno a una aplicación web [Reproducibility around a web application]

2020 ·  Workshop on Markdown, dashboards and Binder (see programme and materials)

2020 ·  Personal profile and experience at Lancaster University Department of Psychology

2020 ·  Embedding open research and reproducibility in the UG and PGT curricula

2019 ·  Part of application for Gorilla Grant

2019 ·  Part of application for Software Sustainability Institute Fellowship

2019 ·  Demonstration of procedure for bundled PSA Studies 002 and 003

Blog and Resources

Short essays, tutorials, inquiries, and functions for the implementation of experiments, data analysis and other purposes.
Some of the posts involving code were republished on R-bloggers, R Weekly, Data Science Central and dev.to.

Secure and scalable speech transcription for local and HPC

A production-ready local transcription workflow that leverages OpenAI's Whisper models and addresses the limitations of cloud-based solutions through complete data sovereignty, unlimited scale, reproducible processing and advanced quality control, while maintaining GDPR compliance.

4authors-year-doi-url: Minimal, numeric CSL style for documents with extreme space constraints

This style is designed to be as compact as possible while retaining the three most critical pieces of information for a reference: who (authors), when (year), and where to find it (DOI/URL).

Prototype workflow for semi-automatic processing of speech and co-speech gestures

Understanding the interplay between speech and gesture is crucial for linguistic and cognitive research. The current prototype, available on GitHub, aims to automate the analysis of temporal alignment between spoken demonstrative pronouns and pointing gestures in video recordings. By integrating computer vision (via Google’s MediaPipe) and speech recognition (using language-specific Vosk models) using Python, the workflow provides enriched video annotations and alignment data, offering valuable insights into deictic communication.

Reducing the impedance in electroencephalography using a blunt needle, electrolyte gel and wiggling

Reducing the impedance in electroencephalography (EEG) is crucial for capturing high-quality brain activity signals. This process involves ensuring that electrodes make optimal contact with the skin without harming the participant. Below are a few tips to achieve this using a blunt needle, electrolyte gel and gentle wiggling.

Passive constructions and asymmetries between languages

Researchers often make participants jump through hoops. Due to our personal blind spots, it seems easier to realise the full extent of these acrobatics when we consider the work of other researchers. In linguistic research, the acrobatics are often spurred by unnatural grammatical constructions.

A makeshift EEG lab

Say, you need to set up a makeshift EEG lab in an office? Easy-peasy—only, try to move the hardware as little as possible, especially laptops with dongles sticking out. The rest is a trail of snapshots devoid of captions, a sink, a shower room and other paraphernalia, as this is only an ancillary, temporary, extraordinary little lab, and all those staples are within reach in our mainstream lab (see Ledwidge et al., 2018; Luck, 2014).

R functions for checking and fixing vmrk files from BrainVision

Electroencephalography (EEG) has become a cornerstone for understanding the intricate workings of the human brain in the field of neuroscience. However, EEG software and hardware come with their own set of constraints, particularly in the management of markers, also known as triggers. This article aims to shed light on these limitations and future prospects of marker management in EEG studies, while also introducing R functions that can help deal with vmrk files from BrainVision.

Preventing muscle artifacts in electroencephalography sessions

Electroencephalographic (EEG) signals are often contaminated by muscle artifacts such as blinks, jaw clenching and (of course) yawns, which generate electrical activity that can obscure the brain signals of interest. These artifacts typically manifest as large, abrupt changes in the EEG signal, complicating data interpretation and analysis. To mitigate these issues, participants can be instructed during the preparatory phase of the session to minimize blinking and to keep their facial muscles relaxed. Additionally, researchers can emphasize the importance of staying still and provide practice sessions to help participants become aware of their movements, thereby reducing the likelihood of muscle artifacts affecting the EEG recordings.

Job: Part-time research assistant in experimental research

We are seeking to appoint a part-time research assistant to help us recruit participants and conduct an experiment. In the current project, led by Jorge González Alonso and funded by the Research Council of Norway, we investigate language learning and the neurophysiological basis of multilingualism. To this end, we are conducting an electroencephalography (EEG) experiment. Your work as a research assistant will be mentored and supervised primarily by Pablo Bernabeu, and secondarily by the head of our project and the directors of our lab.

rscopus_plus: An extension of the rscopus package

Sometimes it’s useful to do a bibliometric analysis. To this end, the rscopus_plus functions (Bernabeu, 2024) extend the R package rscopus (Muschelli, 2022) to administer the search quota and enable specific searches and comparisons. scopus_search_plus runs rscopus::scopus_search as many times as necessary based on the number of results and the search quota. scopus_search_DOIs gets DOIs from scopus_search_plus, which can then be imported into a reference manager, such as Zotero, to create a list of references.
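A hypothetical usage sketch (the query is made up, and the exact arguments of the extension functions are assumptions for illustration rather than documented signatures):

    # Load rscopus and register the API key (key value is a placeholder).
    library(rscopus)
    set_api_key("YOUR_SCOPUS_API_KEY")

    # Assumed usage: run a search that respects the quota...
    res <- scopus_search_plus(query = "TITLE-ABS-KEY(conceptual processing)")

    # ...and extract the DOIs, e.g., for import into Zotero.
    dois <- scopus_search_DOIs(query = "TITLE-ABS-KEY(conceptual processing)")
    writeLines(dois, "dois.txt")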

How to end trial after timeout in jsPsych

I would like to ask for advice regarding a custom plugin for a serial reaction time task that was created by @vekteo and is available in Gorilla, where the code can be edited and tested. By default, trials are self-paced, but I would need them to time out after 2,000 ms. I am struggling to achieve this, and would be very grateful if someone could please advise me.

A session logbook for a longitudinal study using conditional formatting in Excel

Longitudinal studies consist of several sessions, and often involve several session conductors. To facilitate the planning, registration and tracking of sessions, a session logbook becomes even more necessary than usual. To this end, an Excel workbook with conditional formatting can help automatise some formats and visualise the progress. Below is an example that is available on OneDrive. To fully access this workbook, it may be downloaded via File > Save as > Download a copy.

Motivating a preregistration (especially in experimental linguistics)

The best argument to motivate a preregistration may be that it doesn’t take any extra time. It just requires frontloading an important portion of the work. As a reward, the paper will receive greater trust from the reviewers and the readers at large. Preregistration is not perfect, but it is a lesser evil that reduces the misuse of statistical analysis in science.

Do you speak one or more Scandinavian languages and English, but no other languages? Take part in an EEG experiment (Delta i et EEG-eksperiment)

By taking part in our experiment and doing some simple tasks on a computer, you can contribute to research and earn 250 kr per hour (gift card). EEG is completely painless. The experiment takes place in Tromsø, at UiT Norges Arktiske Universitet. We are looking for participants with the following characteristics: ☑ Aged 18–45; ☑ Norwegian as a first language and fluent English. Apart from these languages, participants may also speak Swedish and Danish, but no other languages (beyond a few words).

Learning how to use Zotero

Is it worth learning how to use a reference management system such as Zotero? Maybe. The hours you invest in learning how to use Zotero (approx. 10 hours) are likely to pay off, as they will save you a lot of time that you would otherwise spend formatting, revising and correcting references. In addition, this skill would become part of your skill set. Resources include a great guide and free, online webinars in which you can participate and ask questions.

FAQs on mixed-effects models

I am dealing with nested data, and I remember from an article by Clark (1973) that nested data should be analysed using special models. I’ve looked into mixed-effects models, and I’ve reached a structure with random intercepts by subjects and by items. Is this fine? In the early days, researchers would aggregate the data across these repeated measures to prevent the violation of the assumption of independence of observations, which is one of the most important assumptions in statistics.
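As a minimal sketch in lme4 (hypothetical variable names), the structure described in the question, followed by an alternative that also includes random slopes, which is often recommended when the predictor varies within subjects and within items:

    library(lme4)

    # Random intercepts by subjects and by items:
    m1 <- lmer(RT ~ condition + (1 | subject) + (1 | item), data = dat)

    # Random intercepts and random slopes for a predictor that varies
    # within subjects and within items:
    m2 <- lmer(RT ~ condition +
                 (1 + condition | subject) + (1 + condition | item),
               data = dat)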

FAIR standards for the creation of research materials, with examples

In the fast-paced world of scientific research, establishing minimum standards for the creation of research materials is essential. Whether it's stimuli, custom software for data collection, or scripts for statistical analysis, the quality and transparency of these materials significantly impact the reproducibility and credibility of research. This blog post explores the importance of adhering to FAIR (Findable, Accessible, Interoperable, Reusable) principles, and offers practical examples for researchers, with a focus on the cognitive sciences.

Two-second delay after logger in OpenSesame

The result shows a varying delay of around 2 seconds on average. It would be very helpful for us if we could cut down this delay, as it adds up. To try to achieve this, I reduced the number of variables logged, from the default 363 to 34 important variables. Unfortunately, this change did not result in a reduction of the delay.

Preprocessing the Norwegian Web as Corpus (NoWaC) in R

The present script can be used to preprocess data from a frequency list of the Norwegian Web as Corpus (NoWaC; Guevara, 2010). Before using the script, the frequency list should be downloaded from this URL. The list is described as ‘frequency list sorted primary alphabetic and secondary by frequency within each character’, and this is the direct URL. The download requires signing in to an institutional network. Last, the downloaded file should be unzipped.
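As a rough sketch of the first preprocessing steps (the file name and the two-column layout, frequency followed by word, are assumptions for illustration, not a copy of the script):

    # Read the unzipped frequency list, assuming space-separated columns.
    library(readr)
    library(dplyr)

    nowac <- read_delim("nowac_frequency_list.txt", delim = " ",
                        col_names = c("frequency", "word"))

    # Normalise case, merge duplicate word forms and sort by frequency.
    nowac_clean <- nowac %>%
      mutate(word = tolower(word)) %>%
      group_by(word) %>%
      summarise(frequency = sum(frequency)) %>%
      arrange(desc(frequency))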

A Python inline script for OpenSesame to send EEG triggers via serial port

The OpenSesame user base is skyrocketing but—of course—remains small in comparison to many other user bases that we are used to. Therefore, when developing an experiment in OpenSesame, there are still many opportunities to break the mould. When you need to do something beyond the standard operating procedure, it may take longer to find suitable resources than it takes when a more widespread tool is used. So, why would you still want to use OpenSesame?

How to correctly encode triggers in Python and send them to BrainVision through serial port (useful for OpenSesame and PsychoPy)

I'm sending the triggers in a binary format because Python requires this. For instance, to send the trigger 1, I run the code serialport.write(b'1'). I have succeeded in sending triggers in this way. However, I encounter two problems. First, the triggers are converted in a way I cannot entirely decipher. For instance, when I run the code serialport.write(b'1'), the trigger displayed in BrainVision Recorder is S 49, not S 1 as I would hope (please see Appendix below). Second, I cannot send two triggers with the same code one after the other. For instance, if I run serialport.write(b'1'), a trigger appears in BrainVision Recorder, but if I run the same afterwards (no matter how many times), no trigger appears. I tried to solve these problems by opening the parallel port in addition to the serial port, but the problems persist.

ggplotting power curves from the simr package

The R package ‘simr’ has greatly facilitated power analysis for mixed-effects models using Monte Carlo simulation (which involves running hundreds or thousands of tests under slight variations of the data). The powerCurve function is used to estimate the statistical power for various sample sizes in one go. Since the tests are run serially, they can take a VERY long time; approximately, the time it takes to run the model supplied once (say, a few hours) times the number of simulations (nsim, which should be higher than 200), and times the number of different sample sizes examined.
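A minimal sketch, assuming a fitted lme4 model called fit that contains a grouping variable named subject:

    library(simr)

    # Extend the data along subjects to explore larger sample sizes.
    fit_ext <- extend(fit, along = "subject", n = 200)

    # Estimate power at several sample sizes. Slow: each break entails
    # nsim simulations, so the total running time scales accordingly.
    pc <- powerCurve(fit_ext, along = "subject",
                     breaks = c(50, 100, 150, 200), nsim = 200)
    print(pc)
    plot(pc)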

How to break down colour variable in sjPlot::plot_model into equally-sized bins

Whereas the direction of main effects can be interpreted from the sign of the estimate, the interpretation of interaction effects often requires plots. This task is facilitated by the R package sjPlot. For instance, using the plot_model function, I plotted the interaction between two continuous variables.

    library(lme4)
    library(sjPlot)
    library(ggplot2)
    theme_set(theme_sjplot())

    # Create data partially based on code by Ben Bolker
    # from https://stackoverflow.

How to map more informative values onto fill argument of sjPlot::plot_model

Whereas the direction of main effects can be interpreted from the sign of the estimate, the interpretation of interaction effects often requires plots. This task is facilitated by the R package sjPlot. For instance, using the plot_model function, I plotted the interaction between a continuous variable and a categorical variable. The categorical variable was passed to the fill argument of plot_model.

    library(lme4)
    library(sjPlot)

How to visually assess the convergence of a mixed-effects model by plotting various optimizers

To assess whether convergence warnings render the results invalid or, on the contrary, the results can be deemed valid in spite of the warnings, Bates et al. (2023) suggest refitting models affected by convergence warnings with a variety of optimizers. The authors argue that, if the different optimizers produce practically equivalent results, the results are valid. The allFit function from the ‘lme4’ package allows the refitting of models using a number of optimizers.
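A minimal sketch of this check, assuming a fitted lmer model called fit that produced a convergence warning:

    library(lme4)

    # Refit the model with all available optimizers.
    all_fits <- allFit(fit)
    ss <- summary(all_fits)

    # Compare the fixed effects across optimizers. Practically equivalent
    # estimates suggest that the warning can be deemed benign.
    ss$fixef

    # Convergence messages produced by each optimizer.
    ss$msgs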

Intermixing stimuli from two loops randomly in OpenSesame

I’m developing a slightly tricky design in OpenSesame (a Python-based experiment builder). My stimuli comprise two kinds of sentences that contain different elements, and different numbers of elements. These sentences must be presented word by word. Furthermore, I need to attach triggers to some words in the first kind of sentences but not in the second kind. Last, these kinds of sentences must be intermixed within a block (or a sequence) of trials, because the first kind are targets and the second kind are fillers.

Simultaneously sampling from two variables in jsPsych

I am using jsPsych to create an experiment and I am struggling to sample from two variables simultaneously. Specifically, in each trial, I would like to present a primeWord and a targetWord by randomly sampling each of them from its own variable. I have looked into several resources—such as sampling without replacement, custom sampling and position indices—but to no avail. I’m a beginner at this, so it’s possible that one of these resources was relevant (especially the last one, I think).

Table joins with conditional “fuzzy” string matching in R

Here’s an example of fuzzy-matching strings in R that I shared on StackOverflow. In stringdist_join, the max_dist argument is used to constrain the degree of fuzziness.

    library(fuzzyjoin)
    library(dplyr)
    library(knitr)

    small_tab = data.frame(Food.Name = c('Corn', 'Squash', 'Peppers'),
                           Food.Code = c(NA, NA, NA))
    large_tab = data.
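Below is a self-contained variant of this example; the contents of large_tab are hypothetical here, as the excerpt above is cut off:

    library(fuzzyjoin)
    library(dplyr)

    small_tab <- data.frame(Food.Name = c('Corn', 'Squash', 'Peppers'),
                            Food.Code = c(NA, NA, NA))

    # Hypothetical lookup table.
    large_tab <- data.frame(Food.Name = c('Corn', 'Squash, Summer', 'Red Peppers'),
                            Food.Code = c(101, 102, 103))

    # max_dist constrains the fuzziness: only strings within an edit
    # distance of 2 are matched; unmatched rows are kept by mode = 'left'.
    stringdist_join(small_tab, large_tab, by = 'Food.Name',
                    mode = 'left', max_dist = 2) %>%
      select(Food.Name = Food.Name.x, Food.Code = Food.Code.y)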

A new function to plot convergence diagnostics from lme4::allFit()

When a model has struggled to find enough information in the data to account for every predictor—especially for every random effect—convergence warnings appear (Brauer & Curtin, 2018; Singmann & Kellen, 2019). In this article, I review the issue of convergence before presenting a new plotting function in R that facilitates the visualisation of the fixed effects fitted by different optimization algorithms (also dubbed optimizers).

Assigning participant-specific parameters automatically in OpenSesame

OpenSesame offers options to counterbalance properties of the stimulus across participants. However, in cases of more involved assignments of session parameters across participants, it becomes necessary to write a bit of Python code in an inline script, which should be placed at the top of the timeline. In such a script, the participant-specific parameters are loaded in from a csv file. Below is a minimal example of the csv file.

Pronominal object clitics in preverbal position are a hard nut to crack for Google Translate

Unlike English, some Romance languages not only allow—but sometimes require—pronominal object clitics in preverbal position (Hanson & Carlson, 2014; Labotka et al., 2023). That is, instead of saying La maestra ha detto il nome (Italian) ‘The teacher has said the name’, Italian allows Il nome lo ha detto la maestra (literally, ‘The name it has said the teacher’), which could translate as ‘The name has been said by the teacher’, ‘The teacher has said the name’, or even ‘It is the teacher that has said the name’.

Specifying version number in OSF download links

In the preparation of projects, files are often downloaded from OSF. It is good to document the URLs that were used for the downloads. These URLs can be provided in a code script (see example) or in a README file. Better yet, it’s possible to specify the version of each file in the URL. This specification helps reduce the possibility of inaccuracies later, should any files be modified afterwards.
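As a hypothetical illustration (the five-character file ID is made up, and the exact form of OSF download URLs should be checked against the current OSF documentation), such a versioned URL can be used directly in a script:

    # Download version 2 of a (hypothetical) file from OSF.
    download.file("https://osf.io/abcde/download?version=2",
                  destfile = "stimuli.csv", mode = "wb")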

Covariates are necessary to validate the variables of interest and to prevent bogus theories

The need for covariates—or nuisance variables—in statistical analyses is twofold. The first reason is purely statistical and the second reason is academic. First, the use of covariates is often necessary when the variable(s) of interest in a study may be connected to, and affected by, some satellite variables (Bottini et al., 2022; Elze et al., 2017; Sassenhagen & Alday, 2016). This complex scenario is the most common one due to the multivariate, dynamic, interactive nature of the real world.

Cannot open plots created with brms::mcmc_plot due to lack of discrete_range function

I would like to ask for advice regarding some plots that were created using brms::mcmc_plot(), and cannot be opened in R now. The plots were created last year using brms 2.17.0, and were saved in RDS objects. The problem I have is that I cannot open the plots in R now because I get an error related to a missing function. I would be very grateful if someone could please advise me if they can think of a possible reason or solution.

A table of results for Bayesian mixed-effects models: Grouping variables and specifying random slopes

Here I share the format applied to tables presenting the results of Bayesian models in Bernabeu (2022). The sample table presents a mixed-effects model that was fitted using the R package ‘brms’ (Bürkner et al., 2022).

A table of results for frequentist mixed-effects models: Grouping variables and specifying random slopes

Here I share the format applied to tables presenting the results of frequentist models in Bernabeu (2022). The sample table presents a mixed-effects model that was fitted using the R package ‘lmerTest’ (Kuznetsova et al., 2022).

Plotting two-way interactions from mixed-effects models using alias variables

Whereas the direction of main effects can be interpreted from the sign of the estimate, the interpretation of interaction effects often requires plots. This task is facilitated by the R package sjPlot (Lüdecke, 2022). In Bernabeu (2022), the sjPlot function called plot_model served as the basis for the creation of some custom functions. One of these functions is alias_interaction_plot, which allows the plotting of interactions between a continuous variable and a categorical variable.

Plotting two-way interactions from mixed-effects models using ten or six bins

Whereas the direction of main effects can be interpreted from the sign of the estimate, the interpretation of interaction effects often requires plots. This task is facilitated by the R package sjPlot (Lüdecke, 2022). In Bernabeu (2022), the sjPlot function called plot_model served as the basis for the creation of some custom functions. Two of these functions are deciles_interaction_plot and sextiles_interaction_plot. These functions allow the plotting of interactions between two continuous variables.

Why can't we be friends? Plotting frequentist (lmerTest) and Bayesian (brms) mixed-effects models

Frequentist and Bayesian statistics are sometimes regarded as fundamentally different philosophies. Indeed, can both qualify as philosophies or is one of them just a pointless ritual? Is frequentist statistics only about p values? Are frequentist estimates diametrically opposed to Bayesian posterior distributions? Are confidence intervals and credible intervals irreconcilable? Will R crash if lmerTest and brms are simultaneously loaded?

Bayesian workflow: Prior determination, predictive checks and sensitivity analyses

This post presents a run-through of a Bayesian workflow in R. The content is closely based on Bernabeu (2022), which was in turn based on lots of other references, also cited here.
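A minimal sketch of the opening steps of such a workflow in brms, with hypothetical data, formula and priors:

    library(brms)

    # Determine (weakly informative) priors.
    priors <- c(set_prior("normal(0, 1)", class = "b"),
                set_prior("normal(0, 2)", class = "Intercept"))

    # Prior predictive check: sample from the priors only, ignoring the data.
    fit_prior <- brm(RT ~ condition + (1 | subject), data = dat,
                     prior = priors, sample_prior = "only")
    pp_check(fit_prior)

    # Fit the model and run a posterior predictive check.
    fit <- brm(RT ~ condition + (1 | subject), data = dat, prior = priors)
    pp_check(fit)

    # A sensitivity analysis would refit the model with narrower and wider
    # priors, and compare the resulting posterior distributions.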

Avoiding (R) Markdown knitting errors using knit_deleting_service_files()

The function knit_deleting_service_files() helps avoid (R) Markdown knitting errors caused by files and folders remaining from previous knittings (e.g., manuscript.tex, ZHJhZnQtYXBhLlJtZA==.Rmd, manuscript.synctex.gz). The only obligatory argument for this function is the name of a .Rmd or .md file. The optional argument is a path to a directory containing this file. The function first offers to delete potential service files and folders in the directory. A confirmation is required in the console (see screenshot below). Next, the document is knitted. Last, the function offers to delete potential service files and folders again.
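Usage, following the description above:

    # Knit manuscript.Rmd, offering to delete service files before and after.
    knit_deleting_service_files("manuscript.Rmd")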

Walking the line between reproducibility and efficiency in R Markdown: Three methods

As technology and research methods advance, the data sets tend to be larger and the methods more exhaustive. Consequently, the analyses take longer to run. This poses a challenge when the results are to be presented using R Markdown. One has to balance reproducibility and efficiency. On the one hand, it is desirable to keep the R Markdown document as self-contained as possible, so that those who may later examine the document can easily test and edit the code.
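One generic way of striking this balance (a sketch of the caching idea, not necessarily one of the three methods discussed in the post) is to save slow results to disk and reload them when they exist:

    # Run the slow computation only once; afterwards, reload the saved result.
    if (file.exists("model_fit.rds")) {
      fit <- readRDS("model_fit.rds")
    } else {
      fit <- lme4::lmer(RT ~ condition + (1 | subject), data = dat)
      saveRDS(fit, "model_fit.rds")
    }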

Tackling knitting errors in R Markdown

When knitting an R Markdown document after the first time, errors may sometimes appear. Three tips are recommended below.

  1. Close PDF reader window. When the document is knitted through the ‘Knit’ button, a PDF reader window opens to present the result. Closing this window can help resolve errors.

  2. Delete service files. Every time the Rmd is knitted, some service files are created. Some of these files have the ‘.

Parallelizing simr::powercurve() in R

The powercurve function from the R package ‘simr’ (Green & MacLeod, 2016) can incur very long running times when the method used for the calculation of p values is Kenward-Roger or Satterthwaite (see Luke, 2017). Here I suggest three ways for cutting down this time. Where possible, use a high-performance (or high-end) computing cluster. This removes the need to use personal computers for these long jobs. In case you’re using the fixed() parameter of the powercurve function, and calculating the power for different effects, run these at the same time (‘in parallel’) on different machines, rather than one after another.

Brief Clarifications, Open Questions: Commentary on Liu et al. (2018)

Liu et al. (2018) present a study that implements the conceptual modality switch (CMS) paradigm, which has been used to investigate the modality-specific nature of conceptual representations (Pecher et al., 2003). Liu et al.’s experiment uses event-related potentials (ERPs; similarly, see Bernabeu et al., 2017; Collins et al., 2011; Hald et al., 2011, 2013). In the design of the switch conditions, the experiment implements a corpus analysis to distinguish between purely-embodied modality switches and switches that are more liable to linguistic bootstrapping (also see Bernabeu et al.

Collaboration while using R Markdown

In a highly recommendable presentation available on YouTube, Michael Frank walks us through R Markdown. Below, I loosely summarise and partly elaborate on Frank's advice regarding collaboration among colleagues, some of whom may not be used to R Markdown (see the relevant time point in Frank's presentation). The first way is using GitHub, which has a great version control system, and even allows the rendering of Markdown text, if the file is given the extension ‘.

Notes about punctuation in formal writing

When writing formal pieces, some pitfalls in the punctuation are easy to avoid once you know them. Punctuation marks such as the comma, the semi-colon, the colon and the period are useful for organising phrases and clauses, facilitating the reading, and disambiguating. However, these marks are also liable to underuse, as in the case of run-on sentences; misuse, as in the comma splice; and overuse, as often happens with the Oxford comma.

Stray meetings in Microsoft Teams

Unwanted, stranded meetings, overlapping with a general one in a channel, can occur when people click on the Meet (now)/📷 button, instead of clicking on the same Join button in the chat field. This may especially happen to those who reach the channel first, or who cannot see the Join button in the chat field because this field has been taken up by messages.

R Markdown amidst Madison parks

This document is part of teaching materials created for the workshop ‘Open data and reproducibility v2.1: R Markdown, dashboards and Binder’, delivered at the CarpentryCon 2020 conference. The purpose of this specific document is to practise R Markdown, including basic features such as Markdown markup and code chunks, along with more special features such as cross-references for figures, tables, code chunks, etc. Since this conference was originally going to take place in Madison, let's look at some open data from the City of Madison.

How to engage Research Group Leaders in sustainable software practices

There is an increasing number of training courses introducing early career researchers to sustainable software practices but relatively little aimed at Research Group Leaders and Principal Investigators. Expecting group leaders to personally acquire such skills through training such as a two-day Carpentries workshop is unrealistic, as these require a significant time investment and are less directly applicable in the role of research director. In addition, many group leaders would not consider their group as outputting software, or are less aware of the full range of benefits that sustainable practice brings and will thus be less likely to signpost such training to their team members. Even where they do identify benefits, they may have concerns about releasing group software or may feel overwhelmed by the potential scale of the task, especially with respect to legacy projects.

Incentives for good research software practices

Software is increasingly becoming recognised as fundamental to research. In a 2014 survey of UK researchers undertaken by the Institute, 7 out of 10 researchers supported the view that it would be impossible to conduct research without software. As software continues to underpin more research activities, we must engage a variety of stakeholders to incentivise the uptake of best practice in software development to ensure the quality of research software keeps pace with the research it supports.

Data is present: Workshops and datathons

This project offers free activities to learn and practise reproducible data presentation. Pablo Bernabeu organises these events in the context of a Software Sustainability Institute Fellowship. Programming languages such as R and Python offer free, powerful resources for data processing, visualisation and analysis. Experience in these programs is highly valued in data-intensive disciplines. Original data has become a public good in many research fields thanks to cultural and technological advances. On the internet, we can find innumerable data sets from sources such as scientific journals and repositories (e.g., OSF), local and national governments, non-governmental organisations (e.g., data.world), etc. Activities comprise free workshops and datathons.

Event-related potentials: Why and how I used them

Event-related potentials (ERPs) offer a unique insight into the study of human cognition. Let's look at their reason-to-be for the purposes of research, and how they are defined and processed. Most of this content is based on my master's thesis, which I could fortunately conduct at the Max Planck Institute for Psycholinguistics (see thesis or conference paper). Electroencephalography: the brain produces electrical activity all the time, which can be measured via electrodes on the scalp—a method known as electroencephalography (EEG).

Naive principal component analysis in R

Principal Component Analysis (PCA) is a technique used to find the core components that underlie different variables. It comes in very useful whenever doubts arise about the true origin of three or more variables. There are two main methods for performing a PCA: naive or less naive. In the naive method, you first check some conditions in your data which will determine the essentials of the analysis. In the less-naive method, you set those yourself based on whatever prior information or purposes you had. The ‘naive’ approach is characterized by a first stage that checks whether the PCA should actually be performed with your current variables, or if some should be removed. The variables that are accepted are taken to a second stage which identifies the number of principal components that seem to underlie your set of variables.
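A minimal sketch of both stages using the psych package (a generic illustration with a hypothetical data frame df of numeric variables, not the post's exact code):

    library(psych)

    # First stage: check whether the variables warrant a PCA.
    KMO(df)                                   # sampling adequacy per variable
    cortest.bartlett(cor(df), n = nrow(df))   # sphericity test

    # Second stage: number of components, then the PCA itself.
    fa.parallel(df, fa = "pc")
    pca <- principal(df, nfactors = 2, rotate = "varimax")
    print(pca$loadings, cutoff = 0.3)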

Review of the Landscape Model of reading: Composition, dynamics and application

Throughout the 1990s, two opposing theories were used to explain how people understand texts, later bridged by the Landscape Model of reading (van den Broek, Young, Tzeng, & Linderholm, 1999). A review is offered below, including a schematic representation of the Landscape Model. Memory-based view: the memory-based view presented reading as an autonomous, unconscious, effortless process. Readers were purported to achieve an understanding of a text as a whole by combining the concepts and implications readily afforded in the text with their own background knowledge (Myers & O’Brien, 1998; O’Brien & Myers, 1999).

Denken met woorden [Thinking with words]

The human brain understands the world around us in a linguistic and sensory manner. Pablo Bernabeu (Language and Communication) investigated why that is.

At Greg, 8 am

The single dependent variable, RT, was accompanied by other variables which could be analyzed as independent variables. These included Group, Trial Number, and a within-subjects Condition. What had to be done first off, in order to obtain the usual table? The trials!

Modality switch effects emerge early and increase throughout conceptual processing: Evidence from ERPs

Research has extensively investigated whether conceptual processing is modality-specific—that is, whether meaning is processed to a large extent on the basis of perceptual and motor affordances (Barsalou, 2016). This possibility challenges long-established theories. It suggests a strong link between physical experience and language which is not borne out of the paradigmatic arbitrariness of words (see Lockwood, Dingemanse, & Hagoort, 2016). Modality-specificity also clashes with models of language that have no link to sensory and motor systems (Barsalou, 2016).

The case for data dashboards: First steps in R Shiny

Dashboards for data visualisation, such as R Shiny and Tableau, allow the interactive exploration of data by means of drop-down lists and checkboxes, with no coding required from the final users. These web applications run on internet browsers, allowing for three viewing modes, catered to both analysts and the public at large: (1) private viewing (useful during analysis), (2) selective sharing (used within work groups), and (3) internet publication. Among the available platforms, R Shiny and Tableau stand out due to being relatively accessible to new users. Apps serve a broad variety of purposes. In science and beyond, these apps allow us to go the extra mile in sharing data. Alongside files and code shared in repositories, we can present the data in a website, in the form of plots or tables. This facilitates the public exploration of each section of the data (groups, participants, trials…) to anyone interested, and allows researchers to account for their proceeding in the analysis.
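A minimal sketch of such an app in R Shiny, using a built-in data set:

    library(shiny)
    library(ggplot2)

    # Drop-down list to filter the data; no coding required from final users.
    ui <- fluidPage(
      selectInput("species", "Species:", choices = unique(iris$Species)),
      plotOutput("scatter")
    )

    server <- function(input, output) {
      output$scatter <- renderPlot({
        ggplot(subset(iris, Species == input$species),
               aes(Sepal.Length, Sepal.Width)) +
          geom_point()
      })
    }

    shinyApp(ui, server)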

EEG error: datasets missing channels

Most of the recordings are perfectly fine, but a few present a big error. Out of 64 original electrodes, only two appear. These are the right mastoid (RM) and the left eye sensor (LEOG). Both are bipolar electrodes. RM is to be re-referenced to the online reference electrode, while LEOG is to be re-referenced to the right eye electrode.

Modality exclusivity norms for 747 properties and concepts in Dutch: A replication of English

This study is a cross-linguistic, conceptual replication of Lynott and Connell’s (2009, 2013) modality exclusivity norms. The properties and concepts tested therein were translated into Dutch, and independently rated and analyzed (Bernabeu, 2018).

Conceptual modality switch effect measured at first word?

Traditionally, the second word presented (whether noun or adjective) has been the point of measure, both for RTs and ERPs. Yet, could it be better to measure at the first word?