Chapter 8 The ROCK standard
The ROCK standard has been developed as an open standard for qualitative data analysis. It follows the principles that also guided the development of the Markdown and YAML standards: prioritizing human-readability while retaining machine-readability. The aim of the ROCK is to provide a standard that enables researchers to exchange data and analyses in a format that is readable even without running any specific software. In other words, coded transcripts should be readable as is.
This open standard enables development of programs or scripts to perform specific functions that are not yet present in any of the existing applications that support the ROCK format. In addition, this enables all existing qualitative data analysis programs to import data files in this format and to export to this format.
In this chapter, the vocabulary explained in Chapter 7 is used to describe the ROCK standard. Qualitative data files that implement the ROCK standard can be recognized by their extension, .rock. These files normally follow the conventions set out in this chapter.
By default, a code is any string of characters (specifically, lowercase or uppercase letters, digits, periods, underscores, greater-than signs, and dashes) between two pairs of square brackets ([[ and ]]). This is described by the regular expression \[\[([a-zA-Z0-9._>-]+)\]\]; note that when specifying this regular expression in R, each escaping backslash must itself be escaped by prepending a second backslash. Codes are designated per utterance, or in other words, per line. A line can carry as many codes as one wishes. For example, see these two lines (utterances):
So what went right [[reflection-positive]]
What went wrong [[reflection-negative]]
The first line is coded with reflection-positive, and the second line with reflection-negative.
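The default code pattern can be tried out directly. The following is a minimal sketch in Python, used here only for illustration; it is not how the rock package itself is implemented, and the function name codes_per_utterance is hypothetical. It applies the regular expression quoted above to each line and returns the utterance text together with its codes.

```python
import re

# Default ROCK code pattern as quoted in this chapter.
CODE_REGEX = re.compile(r"\[\[([a-zA-Z0-9._>-]+)\]\]")


def codes_per_utterance(source):
    """Return a (text, codes) pair for every non-empty line (utterance).

    Hypothetical helper for illustration; not part of the rock package.
    """
    result = []
    for line in source.splitlines():
        if not line.strip():
            continue  # skip empty lines
        codes = CODE_REGEX.findall(line)         # all codes on this line
        text = CODE_REGEX.sub("", line).strip()  # utterance without codes
        result.append((text, codes))
    return result


source = (
    "So what went right [[reflection-positive]]\n"
    "What went wrong [[reflection-negative]]\n"
)

for text, codes in codes_per_utterance(source):
    print(text, codes)
```

Because the equals sign is not part of the allowed character set, this pattern does not match identifiers such as [[cid=1]].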
8.0.2 Structuring inductive codes
When engaging in inductive coding (i.e. when not working with a prespecified code structure, but instead developing the code structure as one goes along; see the section on deductive coding below), it can be desirable to structure the codes hierarchically. For example, perhaps a researcher wants to specify a parent code such as reflection with two child codes such as positive and negative. This helps one to identify patterns in the data, and makes it possible to easily extract all utterances coded as any type of reflection. By default, the marker that can be used to structure inductive codes is the greater-than sign (specified by the regular expression >). For example, see the same fragment but coded in two levels:
So what went right [[reflection>positive]]
What went wrong [[reflection>negative]]
When this source is parsed by rock, it will recognize these inductive codes and their structure, and it will generate the corresponding hierarchical coding structure, as illustrated in the more extensive example below.
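To make the idea of deriving a hierarchy from the greater-than marker concrete, here is a small Python sketch. It is an illustration under stated assumptions, not the rock package's own parser: the names build_code_tree and utterances_coded_under are hypothetical, and the tree is represented as nested dictionaries purely for convenience.

```python
import re

# Default ROCK code pattern and default inductive hierarchy marker.
CODE_REGEX = re.compile(r"\[\[([a-zA-Z0-9._>-]+)\]\]")
HIERARCHY_MARKER = ">"


def build_code_tree(source):
    """Nest every code found in `source` into a dict-of-dicts tree."""
    tree = {}
    for code in CODE_REGEX.findall(source):
        node = tree
        # Walk down the path parent > child > grandchild, creating nodes.
        for part in code.split(HIERARCHY_MARKER):
            node = node.setdefault(part, {})
    return tree


def utterances_coded_under(source, code_id):
    """Return utterances carrying `code_id` or any code nested under it."""
    hits = []
    for line in source.splitlines():
        for code in CODE_REGEX.findall(line):
            if code_id in code.split(HIERARCHY_MARKER):
                hits.append(CODE_REGEX.sub("", line).strip())
                break  # count each utterance once
    return hits


source = (
    "So what went right [[reflection>positive]]\n"
    "What went wrong [[reflection>negative]]\n"
)

print(build_code_tree(source))
print(utterances_coded_under(source, "reflection"))
```

For the two-line fragment above, the tree has a single root, reflection, with the children positive and negative, and both utterances are returned when asking for anything coded as reflection.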
8.0.3 Specifying identifiers
It is often desirable to attach specific attributes to utterances. For example, one may want to compare the patterns in codes between different categories of participants, such as those who do and do not own a car, or those who listen to progressive metal versus those who listen to psychedelic trance. Rather than coding every utterance with all relevant attributes, it is possible to link utterances to characteristics of the data provision, such as the data provider (for example, a participant), the moment of data collection (for example, daytime or nighttime, or winter or summer), or the location of data collection (for example, a busy place or a silent office).
This can be done by specifying identifiers. These are again specified using regular expressions. By default, two types of identifiers are specified: case identifiers and stanza identifiers. They are again specified using two pairs of square brackets, but this time, the opening brackets are immediately followed by a string of identifying characters (the ‘identifier identifier’, so to speak), followed by an equals sign, and then by the unique identifier. This may seem a bit abstract; it will become clearer as we look at the first example.
8.0.3.1 Case identifiers
Case identifiers can be used to link utterances to data providers, such as participants. Their ‘identifier identifier’ is cid, and by default, their full regular expression is \[\[cid=([a-zA-Z0-9._-]+)\]\]. A source excerpt coded with only case identifiers may look like this:
CAIAPHAS: No, wait! We need a more permanent solution to our problem. [[cid=1]]
ANNAS: What then to do about Jesus of Nazareth? Miracle wonderman, hero of fools. [[cid=2]]
PRIEST THREE: No riots, no army, no fighting, no slogans. [[cid=3]]
CAIAPHAS: One thing I'll say for him -- Jesus is cool. [[cid=1]]
ANNAS: We dare not leave him to his own devices. His half-witted fans will get out of control. [[cid=2]]
(Note that in this example, the names of the participants were retained; normally, the researcher would anonymize the transcripts so as to allow publication of the coded transcripts.)
When rock parses this source, it will know that the first and fourth utterances belong to the same case, as do the second and fifth. The attributes specified for these cases will then be attached to these utterances (see the section about metadata below).
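The grouping of utterances by case can be sketched with the default case identifier pattern. The Python snippet below is an illustration only, not the rock package's implementation; the function name utterances_by_case is hypothetical, and utterances are numbered by their position among the non-empty lines.

```python
import re
from collections import defaultdict

# Default case identifier pattern as quoted in this chapter.
CID_REGEX = re.compile(r"\[\[cid=([a-zA-Z0-9._-]+)\]\]")


def utterances_by_case(source):
    """Map each case identifier to the numbers of the utterances it marks."""
    cases = defaultdict(list)
    utterances = [line for line in source.splitlines() if line.strip()]
    for number, line in enumerate(utterances, start=1):
        for case_id in CID_REGEX.findall(line):
            cases[case_id].append(number)
    return dict(cases)


source = "\n".join([
    "CAIAPHAS: No, wait! We need a more permanent solution to our problem. [[cid=1]]",
    "ANNAS: What then to do about Jesus of Nazareth? Miracle wonderman, hero of fools. [[cid=2]]",
    "PRIEST THREE: No riots, no army, no fighting, no slogans. [[cid=3]]",
    "CAIAPHAS: One thing I'll say for him -- Jesus is cool. [[cid=1]]",
    "ANNAS: We dare not leave him to his own devices. His half-witted fans will get out of control. [[cid=2]]",
])

print(utterances_by_case(source))
```

Case 1 is linked to utterances 1 and 4, case 2 to utterances 2 and 5, and case 3 to utterance 3, which matches the grouping described above.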
8.0.3.2 Stanza identifiers
A stanza is a unit of analysis in Epistemic Network Analysis (ENA); see the glossary for the exact definition.
8.0.4 Specifying deductive coding structures
When a researcher works with a prespecified coding structure (i.e. engages in deductive coding), they only use codes that were determined a priori. As in inductive coding, there are often multiple levels in such a coding structure, with the codes organised hierarchically. To be able to collapse codes to higher levels efficiently, rock needs to know the deductive coding structure. This can be specified using YAML fragments in the sources. YAML fragments are, by default, delimited by two lines that each contain only three dashes (---). Between those delimiters, YAML (a recursive acronym that stands for ‘YAML ain’t markup language’) can be specified. Specifically, in YAML terminology, each fragment should be a sequence of mappings that is named codes.
The code tree specified in the section on inductive coding, for example, can be efficiently specified as a deductive coding structure like this:
---
codes:
  - id: reflection
    children:
      - id: positive
      - id: negative
---
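The delimiter rule (two lines that each contain only three dashes) can be illustrated with a short Python sketch. This is not the rock package's parser; the function name extract_yaml_fragments is hypothetical, and the sketch assumes that an opening delimiter without a matching closing delimiter is simply ignored.

```python
def extract_yaml_fragments(source):
    """Return the text between each pair of lines containing only '---'."""
    fragments = []
    current = None  # None means we are outside a fragment
    for line in source.splitlines():
        if line.strip() == "---":
            if current is None:
                current = []  # opening delimiter
            else:
                fragments.append("\n".join(current))
                current = None  # closing delimiter
        elif current is not None:
            current.append(line)
    return fragments


source = "\n".join([
    "So what went right [[reflection>positive]]",
    "---",
    "codes:",
    "  - id: reflection",
    "    children:",
    "      - id: positive",
    "      - id: negative",
    "---",
    "What went wrong [[reflection>negative]]",
])

for fragment in extract_yaml_fragments(source):
    print(fragment)
```

Each returned string could then be handed to a YAML parser of one's choice (for instance the third-party PyYAML package and its yaml.safe_load function) to obtain the codes sequence as a data structure.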
If all children of a code are so-called ‘leaves’ (i.e. in the code tree, they have no children of their own), they can be specified more efficiently:
---
codes:
  - id: reflection
    children: ["positive", "negative"]
---
When rock parses the sources, it will collect all such code specifications and combine them into one coding tree using each code’s identifier. It is also possible to specify a code’s parent from another code specification fragment by adding the field parentId. For example, in another source, we could add this fragment:
---
codes:
  - id: neutral
    parentId: reflection
---
This would add neutral as a sibling to positive and negative.
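How code specifications from several fragments might be combined into one tree can be sketched as follows. This is a Python illustration under stated assumptions, not the rock package's algorithm: the fragments are assumed to have already been parsed from YAML into dictionaries, the function name merge_code_specs is hypothetical, and the tree is represented as a mapping from each code identifier to the list of its children.

```python
def merge_code_specs(fragments):
    """Combine parsed `codes` sequences into {code_id: [child_ids]}."""
    children_of = {}

    def add(spec, parent=None):
        # A leaf may be given as a bare string instead of a mapping.
        if isinstance(spec, str):
            spec = {"id": spec}
        code_id = spec["id"]
        # An explicit parentId takes precedence over the nesting parent.
        parent = spec.get("parentId", parent)
        children_of.setdefault(code_id, [])
        if parent is not None:
            siblings = children_of.setdefault(parent, [])
            if code_id not in siblings:
                siblings.append(code_id)
        for child in spec.get("children", []):
            add(child, parent=code_id)

    for fragment in fragments:
        for spec in fragment["codes"]:
            add(spec)
    return children_of


# Parsed equivalents of the YAML fragments shown in this section.
fragment_a = {"codes": [{"id": "reflection",
                         "children": ["positive", "negative"]}]}
fragment_b = {"codes": [{"id": "neutral", "parentId": "reflection"}]}

tree = merge_code_specs([fragment_a, fragment_b])
print(tree["reflection"])
```

After merging, reflection has three children: positive and negative from the first fragment, and neutral from the second.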
8.0.5 Specifying metadata
8.1.1 Section breaks
So what went right
What went wrong
---paragraph-break---
Was it a story or was it a song
---paragraph-break---
Was it over night
Or did it take you long
---paragraph-break---
Was knowing your weakness what made you strong
Source excerpt as example of section breaks (lyrics from Smiley Faces by Gnarls Barkley)
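The excerpt above suggests that a section break is a line consisting of the marker ---paragraph-break---. Under that assumption (the marker on a line of its own, taken from the excerpt rather than from a formal definition), splitting a source into sections could look like the Python sketch below. The function name split_into_sections is hypothetical, and this is not the rock package's implementation.

```python
# Marker as it appears in the excerpt; assumed to occupy a line of its own.
SECTION_BREAK = "---paragraph-break---"


def split_into_sections(source):
    """Split a source into lists of utterances, one list per section."""
    sections = []
    current = []
    for line in source.splitlines():
        stripped = line.strip()
        if stripped == SECTION_BREAK:
            sections.append(current)  # close the current section
            current = []
        elif stripped:
            current.append(stripped)
    sections.append(current)  # the last section has no trailing marker
    return sections


source = "\n".join([
    "So what went right",
    "What went wrong",
    "---paragraph-break---",
    "Was it a story or was it a song",
    "---paragraph-break---",
    "Was it over night",
    "Or did it take you long",
    "---paragraph-break---",
    "Was knowing your weakness what made you strong",
])

for number, section in enumerate(split_into_sections(source), start=1):
    print(number, section)
```

With three markers, the excerpt is divided into four sections.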
CAIAPHAS No, wait! We need a more permanent solution to our problem.
ANNAS What then to do about Jesus of Nazareth? Miracle wonderman, hero of fools.
PRIEST THREE No riots, no army, no fighting, no slogans.
CAIAPHAS One thing I'll say for him -- Jesus is cool.
ANNAS We dare not leave him to his own devices. His half-witted fans will get out of control.
PRIESTS But how can we stop him? His glamour increases by leaps every moment; he's top of the poll.
CAIAPHAS I see bad things arising. The crowd crown him king; which the Romans would ban. I see blood and destruction, Our elimination because of one man. Blood and destruction because of one man.
ALL (inside) Because, because, because of one man.
CAIAPHAS Our elimination because of one man.
ALL (inside) Because, because, because of one, 'cause of one, 'cause of one man.
PRIEST THREE What then to do about this Jesus-mania?
ANNAS How do we deal with a carpenter king?
PRIESTS Where do we start with a man who is bigger Than John was when John did his baptism thing?
CAIAPHAS Fools, you have no perception! The stakes we are gambling are frighteningly high! We must crush him completely, So like John before him, this Jesus must die. For the sake of the nation, this Jesus must die.
This Jesus Must Die by Andrew Lloyd Webber