Extended multilevel formula syntax and types of clusters
So far we have looked at simple hierachical structures. For example, we have considered repeated measures on subjects at two time points, or over the course of several days.
Multilevel models allow us take complex hierarchical structures into account when modelling data—for instance. we might have data from pupils nested within classes and schools, we might have two separate grouping variable that are crossed, such as subjects and word stimuli in an experiment.
Imagine that we are testing 3 subjects in an experiment, and each subject is shown each of 10 stimuli.
subject_id <- str_c("subj_", 1:3)
item_id <- 1:10
crossing(subject_id, item_id)
# A tibble: 30 x 2
subject_id item_id
<chr> <int>
1 subj_1 1
2 subj_1 2
3 subj_1 3
4 subj_1 4
5 subj_1 5
6 subj_1 6
7 subj_1 7
8 subj_1 8
9 subj_1 9
10 subj_1 10
# … with 20 more rows
In this case we might want to take into account that both subjects and items can be considered as sources of variance, and that the observations can be clustered by both subjects and item. Every subject and every item appears multiple times in the dataset, and each subject is presented with all the items. In this case, subjects and items are crossed.
In a traditional analysis, it is commonplace to aggregate over one of the clusters while computing estimates for the other. For instance, when computing subject-specific means, we might simply ignore the items by aggregating over them.
However, this can lead to aggregation artefacts, and systematically ignores aources of variation (Baayen, Davidson, and Bates 2008).
In a multilevel model, we can instead simultaneously model both sources of variability as separate cluster.
If we are predicting a response
variable in different conditions, we can include these crossed varying effects using the following notation.
response ~ 1 + condition + (1 | subject) + (1 | item)
or, equivalently
response ~ 1 + condition + (1 | subject + item)
Note that here the intercept is automatically added to the formula.
response ~ condition + (condition | subject) + (condition | item)
For example, Wagenmakers and Brown (2007) had subjects perform a lexical decision task in two conditions, speeded
and accuracy
, in which the instructions were to decide either as quickly or as accurately as posssible whether a string of letters was a word or a non-word. The response variable were response times and choices.
library(rtdists)
data(speed_acc)
speed_acc <- speed_acc %>%
as_tibble() |>
select(-censor, -block)
df_speed_acc <- speed_acc %>%
# remove rt under 180 ms and above 3000 ms
filter(rt > 0.18, rt < 3) %>%
# convert to character
mutate(across(c(stim_cat, response), as.character),
correct = as.numeric(stim_cat == response),
across(c(stim_cat, response), as_factor))
df_speed_acc
# A tibble: 31,355 x 8
id condition stim stim_cat frequency response rt correct
<fct> <fct> <fct> <fct> <fct> <fct> <dbl> <dbl>
1 1 speed 5015 nonword nw_low nonword 0.7 1
2 1 speed 3623 word very_low nonword 0.392 0
3 1 speed 6481 nonword nw_very_low nonword 0.46 1
4 1 speed 3305 word very_low word 0.455 1
5 1 speed 2244 word low nonword 0.505 0
6 1 speed 4468 nonword nw_high nonword 0.773 1
7 1 speed 1047 word high word 0.39 1
8 1 speed 5711 nonword nw_low word 0.587 0
9 1 speed 5036 nonword nw_low nonword 0.603 1
10 1 speed 1111 word high word 0.435 1
# … with 31,345 more rows
We are interested in the subject and stimulus IDs as sources of variability, and the differences in accuracy between instruction conditions.
Varying intercept model
Varying intercept and slope model
Scholastic aptitude tests are given multiple times to students (repeated observations nested within students), and students are nested within schools. Schools in turn can be nested within districts. Here, students are said to be nested within schools, meaning that each school contains students unique to that school.
A further example could be therapists working with several different patients.
For example, to model varying effects for therapists nested within patients, we can can use the following syntax.
response ~ condition + (condition | therapist/patient)
This is expanded to
response ~ condition + (condition | therapist) + (condition | therapist:patient)
The :
operator creates a new grouping factor that consists of the combined levels of therapist and patient.
This is explained in Bürkner (2018) and in the vignettes for the brms
package.
The syntax in brms
is very close to that introduced by lme4
(Bates et al. 2015). However, brms
extend the syntax so that nore complex models cab be expressed.
In general, the syntax has the form
response ~ pterms + (gterms | group)
The pterms
part contains the population-level effects. The gterms
part contains so called group-level effects that are assumed to vary within levels of a grouping variable
For example, (1 | group)
, is the simplest possible mixed-model formula, where each level of the grouping factor, group
, has its own varying intercept, which can be thought of in two ways:
as a deviation \(\alpha_j\) from the average intercept \(\alpha\). The \(\alpha_j\) terms have the distribution \(\alpha_j \sim \mathcal{N}(0, \sigma_{\alpha})\).
as a random variate drawn from the distribution \(\alpha_j \sim \mathcal{N}(\alpha, \sigma_{\alpha})\)
The two are equivalent (you can always remove the mean from a normal distribution and add it to a zero mean random variate), but 1) emphasizes the decomposition into two components, whereas 2) emphasizes the hierarchical structure.
In addition to the lme4
inspired syntax, brms
offers several extensions to the formula syntax.
Varying effects can be grouped using the gr()
function, with the argument by
specifying sub-populations of the groups. For each level of the by variable, a separate variance-covariance matrix will be fitted.
For example, the following code fits varying intercepts for patients, but separately for each treatment.
response ~ 1 + (1 | gr(patient, by = "treatment"))
Varying effects can also be members of several groups—this is taken care of with the mm()
function. As an example, pupils might change schools during the term, meaning that they are member of both school1 and school2. We can also take into account the amount of time spent at each school using the weights
argument.
~ x1 + (1 | mm(school1, school2, weights = cbind(w1, w2) response
If you are predicting family-specific (distributional) parameters, and you are including varying effects for a grouping variable, you might want to model a correlation between these effects. You can do by including the term |s|
, where s
can be any arbitrary identifier.
bf(response ~ 1 + (1 |s| subject),
sigma ~ 1 + (1 |s| subject))
If on the other hand you don’t want to estimate the correlation between varying intercepts and varying slopes, you can omit this by leaving out the identifier between the | |
.
response ~ 1 + condition + (1 || subject)
If you see mistakes or want to suggest changes, please create an issue on the source repository.
Text and figures are licensed under Creative Commons Attribution CC BY 4.0. Source code is available at https://github.com/awellis/learnmultilevelmodels, unless otherwise noted. The figures that have been reused from other sources don't fall under this license and can be recognized by a note in their caption: "Figure from ...".
For attribution, please cite this work as
Ellis (2021, May 29). Learn multilevel models: Multilevel formula syntax. Retrieved from https://awellis.github.io/learnmultilevelmodels/more-multilevel-models.html
BibTeX citation
@misc{ellis2021multilevel, author = {Ellis, Andrew}, title = {Learn multilevel models: Multilevel formula syntax}, url = {https://awellis.github.io/learnmultilevelmodels/more-multilevel-models.html}, year = {2021} }