WebVTT caption transcription app

This open-source, R-based web application converts video captions (subtitles) from the Web Video Text Tracks (WebVTT) format into plain text. Users upload a WebVTT file with the extension .vtt or .txt (examples available here and here). Metadata such as timestamps are removed automatically, and the text is formatted into a single paragraph. The result is displayed on the website and can be downloaded as a .docx or .txt document. Overall, the application serves to improve the accessibility of video captions.

🌐  The web application can be launched here.

Uploaded data is available only to the user and is deleted when the website is closed.

Questions and suggestions can be submitted as issues or sent by email. The app can be extended via pull requests.

Developer: Pablo Bernabeu (Dept. Psychology, Lancaster University). Licence: Creative Commons Attribution 4.0 International.

Code details

The core of the application is the index.Rmd script, which uses regular expressions to process the VTT file. That script draws on a second script to enable the download of .docx documents, and the second script in turn uses a Word template.
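To illustrate the kind of regular-expression processing described above, here is a minimal sketch. It is not the app's code: the app does this in R inside index.Rmd, whereas this sketch is written in Python, and the function name `vtt_to_paragraph` and both patterns are assumptions made for illustration. It treats any block that contains a cue timing line as a caption, keeps only the text after that line, strips inline tags, and joins everything into one paragraph.

```python
import re

# Assumed pattern for a cue timing line, e.g. "00:00:01.000 --> 00:00:04.000".
# The hours field is optional; cue settings after the end time are ignored.
TIMING = re.compile(
    r"^\s*(?:\d{2,}:)?\d{2}:\d{2}\.\d{3}\s+-->\s+(?:\d{2,}:)?\d{2}:\d{2}\.\d{3}"
)
# Assumed pattern for inline markup such as <b>, <i> or <v Speaker>.
TAG = re.compile(r"<[^>]+>")


def vtt_to_paragraph(vtt_text: str) -> str:
    """Return the caption text of a WebVTT string as a single paragraph."""
    # Drop a leading byte-order mark and normalise line endings.
    text = vtt_text.lstrip("\ufeff").replace("\r\n", "\n").replace("\r", "\n")
    kept = []
    # Blocks are separated by one or more blank lines.
    for block in re.split(r"\n\s*\n", text):
        lines = [ln.strip() for ln in block.split("\n") if ln.strip()]
        # Locate the timing line; blocks without one (the WEBVTT header,
        # NOTE, STYLE, REGION) are skipped entirely.
        for i, ln in enumerate(lines):
            if TIMING.match(ln):
                payload = lines[i + 1:]
                break
        else:
            continue
        for ln in payload:
            cleaned = TAG.sub("", ln).strip()
            if cleaned:
                kept.append(cleaned)
    # Collapse any remaining runs of whitespace into single spaces.
    return re.sub(r"\s+", " ", " ".join(kept)).strip()


sample = (
    "WEBVTT\n\n"
    "1\n00:00:01.000 --> 00:00:04.000\nHello <b>world</b>,\n\n"
    "NOTE a comment\n\n"
    "00:00:04.500 --> 00:00:06.000 align:start\nthis is a\ncaption.\n"
)
print(vtt_to_paragraph(sample))  # Hello world, this is a caption.
```

Detecting cues by their timing line, rather than by listing every kind of metadata to delete, keeps the sketch short and means cue identifiers and comment blocks are dropped without separate rules.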


Pablo Bernabeu
Postdoctoral fellow at UiT The Arctic University of Norway

After doing a research master's, I became a PhD student and graduate teaching assistant in Psychology at Lancaster University, where I investigated how conceptual processing—that is, the comprehension of the meaning of words—is supported by linguistic and sensorimotor brain systems, and how research on this topic is influenced by methodological aspects such as the operationalisation of variables and the sample size of experiments. Currently, I am a postdoctoral fellow at UiT The Arctic University of Norway, where I am investigating the behavioural and neural underpinnings of multilingualism. Throughout my research, I have used methods such as behavioural and electroencephalographic experiments, corpus analysis, statistics and programming. Research materials at https://osf.io/25u3x. CV available here.
