Avoiding (R) Markdown knitting errors using knit_deleting_service_files()

The function knit_deleting_service_files() helps avoid (R) Markdown knitting errors caused by files and folders remaining from previous knittings (e.g., manuscript.tex, ZHJhZnQtYXBhLlJtZA==.Rmd, manuscript.synctex.gz). The only obligatory argument for this function is the name of a .Rmd or .md file. The optional argument is a path to a directory containing this file. The function first offers deleting potential service files and folders in the directory. A confirmation is required in the console (see screenshot below). Next, the document is knitted. Last, the function offers deleting potential service files and folders again.

A new function to plot convergence diagnostics from lme4::allFit()

When a model has struggled to find enough information in the data to account for every predictor---especially for every random effect---, convergence warnings appear (Brauer & Curtin, 2018; Singmann & Kellen, 2019). In this article, I review the issue of convergence before presenting a new plotting function in R that facilitates the visualisation of the fixed effects fitted by different optimization algorithms (also dubbed optimizers).

Walking the line between reproducibility and efficiency in R Markdown: Three methods

As technology and research methods advance, the data sets tend to be larger and the methods more exhaustive. Consequently, the analyses take longer to run. This poses a challenge when the results are to be presented using R Markdown. One has to balance reproducibility and efficiency. On the one hand, it is desirable to keep the R Markdown document as self-contained as possible, so that those who may later examine the document can easily test and edit the code.

Tackling knitting errors in R Markdown

When knitting an R Markdown document after the first time, errors may sometimes appear. Three tips are recommended below. 1. Close PDF reader window When the document is knitted through the ‘Knit’ button, a PDF reader window opens to present the result. Closing this window can help resolve errors. 2. Delete service files Every time the Rmd is knitted, some service files are created. Some of these files have the ‘.

Parallelizing simr::powercurve() in R

The powercurve function from the simr package in R (Green & MacLeod, 2016) can incur very long running times when the method used for the calculation of p values is Kenward-Roger or Satterthwaite (see Luke, 2017). Here I suggest three ways for cutting down this time. Where possible, use a high-performance (or high-end) computing cluster. This removes the need to use personal computers for these long jobs. In case you’re using the fixed() parameter of the powercurve function, and calculating the power for different effects, run these at the same time (‘in parallel’) on different machines, rather than one after another.

Brief Clarifications, Open Questions: Commentary on Liu et al. (2018)

Liu et al. (2018) present a study that implements the conceptual modality switch (CMS) paradigm, which has been used to investigate the modality-specific nature of conceptual representations (Pecher et al., 2003). Liu et al.‘s experiment uses event-related potentials (ERPs; similarly, see Bernabeu et al., 2017; Collins et al., 2011; Hald et al., 2011, 2013). In the design of the switch conditions, the experiment implements a corpus analysis to distinguish between purely-embodied modality switches and switches that are more liable to linguistic bootstrapping (also see Bernabeu et al.

Collaboration while using R Markdown

In a highly recommendable presentation available on Youtube, Michael Frank walks us through R Markdown. Below, I loosely summarise and partly elaborate on Frank's advice regarding collaboration among colleagues, some of whom may not be used to R Markdown (see relevant time point in Frank's presentation). The first way is using GitHub, which has a great version control system, and even allows the rendering of Markdown text, if the file is given the extension ‘.

Notes about punctuation in formal writing

When writing formal pieces, some pitfalls in the punctuation are easy to avoid once you know them. Punctuation marks such as the comma, the semi-colon, the colon and the period are useful for organising phrases and clauses, facilitating the reading, and disambiguating. However, these marks are also liable to underuse, as in the case of run-on sentences; misuse, as in the comma splice; and overuse, as it often happens with the Oxford comma.

Stray meetings in Microsoft Teams

Unwanted, stranded meetings, overlapping with a general one in a channel, can occur when people click on the Meet (now)/📷 button, instead of clicking on the same Join button in the chat field. This may especially happen to those who reach the channel first, or who cannot see the Join button in the chat field because this field has been taken up by messages.

R Markdown amidst Madison parks

This document is part of teaching materials created for the workshop 'Open data and reproducibility v2.1: R Markdown, dashboards and Binder', delivered at the CarpentryCon 2020 conference. The purpose of this specific document is to practise R Markdown, including basic features such as Markdown markup and code chunks, along with more special features such as cross-references for figures, tables, code chunks, etc. Since this conference was originally going to take place in Madison, let's look at some open data from the City of Madison.

What's in a fluke? The problem of trust and distrust

The label 'fluke' may in principle be skewed by the eye of the beholder, the mind of the perceiver and the availability or lack of data.

How to engage Research Group Leaders in sustainable software practices

Incentives for good research software practices

Data is present: Workshops and datathons

This project offers free activities to learn and practise reproducible data presentation. Pablo Bernabeu organises these events in the context of a Software Sustainability Institute Fellowship. Programming languages such as R and Python offer free, powerful resources for data processing, visualisation and analysis. Experience in these programs is highly valued in data-intensive disciplines. Original data has become a public good in many research fields thanks to cultural and technological advances. On the internet, we can find innumerable data sets from sources such as scientific journals and repositories (e.g., OSF), local and national governments, non-governmental organisations (e.g., data.world), etc. Activities comprise free workshops and datathons.

Event-related potentials: Why and how I used them

Event-related potentials (ERPs) offer a unique insight in the study of human cognition. Let's look at their reason-to-be for the purposes of research, and how they are defined and processed. Most of this content is based on my master's thesis, which I could fortunately conduct at the Max Planck Institute for Psycholinguistics (see thesis or conference paper). Electroencephalography The brain produces electrical activity all the time, which can be measured via electrodes on the scalp—a method known as electroencephalography (EEG).

Naive principal component analysis in R

Principal Component Analysis (PCA) is a technique used to find the core components that underlie different variables. It comes in very useful whenever doubts arise about the true origin of three or more variables. There are two main methods for performing a PCA: naive or less naive. In the naive method, you first check some conditions in your data which will determine the essentials of the analysis. In the less-naive method, you set those yourself based on whatever prior information or purposes you had. The 'naive' approach is characterized by a first stage that checks whether the PCA should actually be performed with your current variables, or if some should be removed. The variables that are accepted are taken to a second stage which identifies the number of principal components that seem to underlie your set of variables.

Review of the Landscape Model of reading: Composition, dynamics and application

Throughout the 1990s, two opposing theories were used to explain how people understand texts, later bridged by the Landscape Model of reading (van den Broek, Young, Tzeng, & Linderholm, 1999). A review is offered below, including a schematic representation of the Landscape Model. Memory-based view The memory-based view presented reading as an autonomous, unconscious, effortless process. Readers were purported to achieve an understanding of a text as a whole by combining the concepts, and implications readily afforded, in the text with their own background knowledge (Myers & O’Brien, 1998; O’Brien & Myers, 1999).

At Greg, 8 am

The single dependent variable, RT, was accompanied by other variables which could be analyzed as independent variables. These included Group, Trial Number, and a within-subjects Condition. What had to be done first off, in order to take the usual table? The trials!

Modality switch effects emerge early and increase throughout conceptual processing: Evidence from ERPs

Research has extensively investigated whether conceptual processing is modality-specific—that is, whether meaning is processed to a large extent on the basis of perceptual and motor affordances (Barsalou, 2016). This possibility challenges long-established theories. It suggests a strong link between physical experience and language which is not borne out of the paradigmatic arbitrariness of words (see Lockwood, Dingemanse, & Hagoort, 2016). Modality-specificity also clashes with models of language that have no link to sensory and motor systems (Barsalou, 2016).

The case for data dashboards: First steps in R Shiny

Dashboards for data visualisation, such as R Shiny and Tableau, allow the interactive exploration of data by means of drop-down lists and checkboxes, with no coding required from the final users. These web applications run on internet browsers, allowing for three viewing modes, catered to both analysts and the public at large: (1) private viewing (useful during analysis), (2) selective sharing (used within work groups), and (3) internet publication. Among the available platforms, R Shiny and Tableau stand out due to being relatively accessible to new users. Apps serve a broad variety of purposes. In science and beyond, these apps allow us to go the extra mile in sharing data. Alongside files and code shared in repositories, we can present the data in a website, in the form of plots or tables. This facilitates the public exploration of each section of the data (groups, participants, trials...) to anyone interested, and allows researchers to account for their proceeding in the analysis.

Modality exclusivity norms for 747 properties and concepts in Dutch: A replication of English

The Louisiana-Minnesota-Texas crisis across media and time: A big data exercise

Racism has long been ingrained in human societies. Ancient Greek Aristotle already claimed that non-Greeks were slaves by nature, as they easily submitted to despotic government (Reilly, Kaufman, & Bodino, 2002). This study focuses on racism in the United States, which extends from the foundation of the country, when black people were generally born into slavery, and were at any rate regarded as an inferior people. US racism stands out globally for two reasons. First, the country has played a hegemonic part in the World since soon after its foundation. Second, the US is regarded as the most advanced society technology-wise, as it sets the minutes for the technology sector worldwide. In spite of these advantages, the country has long suffered the plague of widespread racism. Indeed, the abolition of slavery in the mid-nineteenth century did not grant equal citizen rights to the black population. Over time, the black population started to confront this situation. Especially the mid-nineteenth century saw large uprisings and a patent division of different societal sectors, as reflected in literary works such as Ellison’s 'Invisible Man' (1952). Inequality and confrontation about racism has extended to date, and the costs thereof have been large in terms of lives and otherwise (Feagin, 2004).