Chapter 8 The rock R package

The rock R package implements the ROCK standard for qualitative data analysis. It is an extension to R, which started out as a statistical programming language. R is not only open source, but also has a flexible infrastructure that allows easy extension with user-contributed packages. As a result, R is quickly becoming a multipurpose scientific toolkit, and one of its tools is the rock package.

When using R, most people use RStudio, a so-called integrated development environment (IDE). It has many features that make using R much more user-friendly and efficient. In this book, where we refer to using R, we actually mean using R through RStudio. Both R and RStudio are Free/Libre Open Source Software (FLOSS) solutions. This means that they are free to download and install in perpetuity.

8.1 Downloading and installing R and RStudio

Because RStudio makes using R considerably more user-friendly (and pretty), we always use R through RStudio in this book, so you will need to install both.

R can be downloaded from https://cloud.r-project.org/ [1]: click the “Download R for …” link that matches your operating system, and follow the instructions to download the right version. You don’t have to start R; it just needs to be installed on your system. RStudio will normally find it on its own.

RStudio can be downloaded from https://www.rstudio.com/products/rstudio/download/. Once it is installed, you can start it, in which case you should see something similar to what is shown in Figure 8.1. [2]

Figure 8.1: The RStudio integrated development environment (IDE).

R itself lives in the bottom-left pane, the console. Here, you can interact directly with R. You can open R scripts in the top-left pane: these are text files with the commands you want R to execute. The top-right pane contains the Environment tab, which shows all loaded datasets and variables; the History tab, which shows the commands you used; and the Connections and Build tabs, which you will not need. The bottom-right pane contains a Files tab, showing files on your computer; a Plots tab, which shows plots you created; a Packages tab, which shows the packages you have installed; a Help tab, which shows help pages about specific functions; and a Viewer tab, which can show HTML content that was generated in R.

8.2 Downloading and installing the rock package

The rock package can be installed by going to the console (the bottom-left pane) and typing:
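For a package hosted on CRAN, this is done with base R's install.packages function:

```r
# Download and install the rock package from CRAN
# (requires an internet connection)
install.packages("rock")
```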

This will connect to the Comprehensive R Archive Network (CRAN) and download and install the rock package. If you feel adventurous, you can instead install one of two more recent versions: the most current production version or the development (‘dev’) version. The most current production version will generally be as stable as the versions on CRAN, and will contain more features, including all features discussed in this book. The dev version contains work on new features, which also means that it may contain bugs.

To conveniently install the most recent production and dev versions, you can use another package, called remotes. You can install it with this command:
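Since remotes is itself distributed through CRAN, the usual install.packages call applies:

```r
# Install the remotes package from CRAN; it provides functions for
# installing packages directly from code repositories
install.packages("remotes")
```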

Then, to install the most up-to-date production version, use:
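A sketch of that command, assuming the production version lives in the default branch of a GitLab repository named r-packages/rock (a path inferred from the address of the package's pkgdown website, not confirmed here):

```r
# ASSUMPTION: "r-packages/rock" is the GitLab repository path, inferred
# from the pkgdown website address; adjust it if the repository differs.
remotes::install_gitlab("r-packages/rock")
```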

And to install the current dev version, use:
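A sketch of that command, assuming the same GitLab repository path (r-packages/rock, inferred from the pkgdown website address) and assuming the development work is kept in a branch named dev:

```r
# ASSUMPTIONS: the repository path "r-packages/rock" and the branch name
# "dev" are both inferred, not confirmed. The "@" suffix tells remotes
# which branch, tag, or commit to install.
remotes::install_gitlab("r-packages/rock@dev")
```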

More information about the rock package can be found at its so-called pkgdown website, which is located at http://r-packages.gitlab.io/rock.

8.3 Functions in the rock package

8.3.1 clean_source and clean_sources

Sometimes, sources are a bit messy. [3] In such cases, it can be efficient to preprocess them and perform some search and replace actions. This can be done using clean_source (for one file) or clean_sources (for multiple files; it basically just calls clean_source for each file).

For example, a researcher will often want every sentence, as transcribed, to be on its own line (as lines correspond to utterances). In fact, this is the basic function of clean_source and clean_sources: by default, if used without other arguments, they try to (more or less smartly) split a transcript such that each transcribed sentence (as marked by a period (.), a question mark (?), an exclamation mark (!), or an ellipsis (…)) ends up on its own line. Before doing this, clean_source replaces all occurrences of exactly two consecutive periods (..) with one period, all occurrences of four or more consecutive periods with three periods, and all occurrences of three or more newlines (\n) with two newlines.
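To make these default steps concrete, here is a minimal sketch in base R. It is not the rock implementation: the two helper functions are hypothetical, the sentence splitting is far cruder than what clean_source does, and the ellipsis character is left out for simplicity.

```r
# Hypothetical helper (not part of rock): the three normalisations
# described above, expressed as regular expressions.
normalize_periods_and_newlines <- function(x) {
  # Exactly two consecutive periods -> one period
  x <- gsub("(?<!\\.)\\.{2}(?!\\.)", ".", x, perl = TRUE)
  # Four or more consecutive periods -> three periods
  x <- gsub("\\.{4,}", "...", x, perl = TRUE)
  # Three or more consecutive newlines -> two newlines
  x <- gsub("\\n{3,}", "\n\n", x, perl = TRUE)
  x
}

# Hypothetical helper (not part of rock): put each sentence on its own
# line by replacing the spaces after ., ? or ! with a newline.
split_sentences <- function(x) {
  gsub("([.?!]+)[ \\t]+", "\\1\n", x, perl = TRUE)
}

messy <- "Hello.. How are you?\n\n\n\nFine..... Thanks!"
cleaned <- split_sentences(normalize_periods_and_newlines(messy))
cat(cleaned)
```

Here, the doubled period after "Hello" becomes a single period, the five periods after "Fine" become three, the four newlines become two, and every sentence ends up on its own line.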

But this function can also be used to perform additional (or other) replacements. For example, imagine that a transcriber used a dash at the beginning of a line, followed by a space, to indicate when a person starts talking, like this:

- Something said by one speaker
- Something said by another speaker

To easily group all utterances by the same person together, it would be convenient if this was expressed in the source file in a way that fits with ROCK’s conventions. That sequence of characters (actually a newline character (\n) followed by a dash (-) followed by a whitespace character (\s)) can be converted into the section break ‘---turn-of-talk---’ with this command:
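A minimal sketch of this replacement follows. The runnable part uses base R's gsub, so it works without rock; the commented-out rock call shows how the same pattern and replacement might be passed through the replacementsPre argument, where the list-of-pairs format is an assumption rather than confirmed usage.

```r
transcript <- "\n- Something said by one speaker\n- Something said by another speaker\n"

# Replace "newline, dash, whitespace" with a turn-of-talk section break
converted <- gsub("\\n-\\s", "\n---turn-of-talk---\n", transcript, perl = TRUE)

# cat() prints newline characters as actual line breaks
cat(converted)

# ASSUMPTION: with rock installed, the same pair would be supplied as
# c(pattern, replacement) inside a list; this format is not confirmed here.
# rock::clean_source(
#   transcript,
#   replacementsPre = list(c("\\n-\\s", "\n---turn-of-talk---\n"))
# )
```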

This will change that bit of transcript into:

---turn-of-talk---
Something said by one speaker
---turn-of-talk---
Something said by another speaker

(You can copy-paste the command above into R and test this, assuming you have the rock package installed. Note that by default, R doesn’t print newline characters as newline characters. To show newline characters as newlines, wrap the command in the cat command.)

To keep the default replacements as well, specify the additional ones in argument extraReplacements instead of replacementsPre (or replacementsPost). For clean_source, the first argument (input) can be either a character vector (like in the example above) or a path to a file, in which case the file’s contents will be read. If the second argument (outputFile) is specified, the result is saved to that file; if not, it is returned (and printed by R).

8.3.1.1 A word of caution

If you use this function to clean one or more transcripts, make sure that whenever you edit the outputFile, you save it under another name! Otherwise, rerunning the script to clean the transcripts will overwrite your edits. By default, the rock option “preventOverwriting” is set to TRUE, so by default, if a file already exists on disk, it is never overwritten. You can change this behavior for one function by specifying preventOverwriting=FALSE as a function argument. You can also change this for all functions by changing the option, with the following command:
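A sketch of such a command, assuming that rock exposes its options through an object called opts with a set method (an assumption about the package's interface; check the package documentation for the exact form):

```r
# ASSUMPTION: rock::opts$set() is the interface for changing rock options.
# The requireNamespace() check makes this a no-op if rock is not installed.
if (requireNamespace("rock", quietly = TRUE)) {
  # Allow all rock functions to overwrite existing files
  rock::opts$set(preventOverwriting = FALSE)
}
```

Keep in mind that with this option set to FALSE, rerunning a cleaning script will silently replace any output file you edited by hand.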


  1. Yes, that page looks a bit outdated.

  2. It is easy to change RStudio’s appearance; simply open the options dialog by opening the Tools menu and then selecting the Global Options; in section Appearance, the theme can be selected.

  3. Well, they are messy more often than not, unfortunately.