class: center, middle, inverse, title-slide # Intro to Bayesian Statistics ## Part 3
Hierarchical models ### Andrew Ellis ### Bayesian multilevel modelling workshop 2021 ### 2021-05-28 --- layout: true <!-- Home icon --> <div class="my-footer"> <span> <a href="https://awellis.github.io/learnmultilevelmodels/" target="_blank">
</a> Graduate School workshop 2021 </span> </div> <!-- Name (left) --> <!-- <div class="my-footer"> --> <!-- <span> --> <!-- Andrew Ellis - <a href="https://kogpsy.github.io/neuroscicomplab" target="_blank">kogpsy.github.io/neuroscicomplab</a> --> <!-- </span> --> <!-- </div> --> <!-- slide separator (for xaringan) --> --- ## Hierarchical Models <!-- library(lmerTest) --> <!-- options(contrasts=c('contr.sum', 'contr.poly')) --> <!-- anova(lmer(y ~ x + (1|subj),data=myDat),ddf="Kenward-Roger") --> <!-- library(afex) --> <!-- mixed(y ~ x + (1|subj), type=3,method="KR",data=myDat) --> How can repeated measurements be taken into account? -- --- class: middle .pull-left-narrow[ .huge-blue-number[1] ] .pull-right-wide[ .larger[ Estimating Means ] ] --- ## Another IQ Example This time we have 3 subjects, and 3 measurements from each subject. .footnote[ Lee, Michael D., and Eric-Jan Wagenmakers. Bayesian Cognitive Modeling: A Practical Course. Cambridge: Cambridge University Press, 2014. https://doi.org/10.1017/CBO9781139087759. ] .panelset[ .panel[.panel-name[Create data] ```r IQwide <- tribble( ~A, ~B, ~C, 110, 105, 115, 105, 112, 108, 102, 113, 130 ) ``` ] .panel[.panel-name[Pivot longer] ```r IQdata <- IQwide |> pivot_longer(everything(), names_to = "Person", values_to = "IQ") |> mutate(Person = as_factor(Person)) |> arrange(Person) ``` ] .panel[.panel-name[Data] ```r library(kableExtra) IQdata |> kbl() |> scroll_box(width = "500px", height = "200px") ``` <div style="border: 1px solid #ddd; padding: 0px; overflow-y: scroll; height:200px; overflow-x: scroll; width:500px; "><table> <thead> <tr> <th style="text-align:left;position: sticky; top:0; background-color: #FFFFFF;"> Person </th> <th style="text-align:right;position: sticky; top:0; background-color: #FFFFFF;"> IQ </th> </tr> </thead> <tbody> <tr> <td style="text-align:left;"> A </td> <td style="text-align:right;"> 110 </td> </tr> <tr> <td style="text-align:left;"> A </td> <td style="text-align:right;"> 105 </td> </tr> <tr> <td style="text-align:left;"> A </td> <td style="text-align:right;"> 102 </td> </tr> <tr> <td style="text-align:left;"> B </td> <td style="text-align:right;"> 105 </td> </tr> <tr> <td style="text-align:left;"> B </td> <td style="text-align:right;"> 112 </td> </tr> <tr> <td style="text-align:left;"> B </td> <td style="text-align:right;"> 113 </td> </tr> <tr> <td style="text-align:left;"> C </td> <td style="text-align:right;"> 115 </td> </tr> <tr> <td style="text-align:left;"> C </td> <td style="text-align:right;"> 108 </td> </tr> <tr> <td style="text-align:left;"> C </td> <td style="text-align:right;"> 130 </td> </tr> </tbody> </table></div> ] ] --- ## Point Estimates ```r se <- function(x) sd(x)/sqrt(length(x)) IQdata |> group_by(Person) |> summarise(mean = mean(IQ), sd = sd(IQ), se = se(IQ)) ``` ``` ## # A tibble: 3 x 4 ## Person mean sd se ## <fct> <dbl> <dbl> <dbl> ## 1 A 106. 4.04 2.33 ## 2 B 110 4.36 2.52 ## 3 C 118. 11.2 6.49 ``` --- ```r IQdata |> ggplot(aes(Person, IQ)) + geom_point() ``` <img src="01-hierarchical-models-slides_files/figure-html/unnamed-chunk-6-1.png" width="100%" /> --- .discussion[ - How would you analyse these data? ]
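---

## A classical baseline

A minimal classical sketch, added here for comparison (it is not part of the original workshop code): the "no pooling" answer to the discussion question is simply `lm()` with one mean per person. Its coefficients reproduce the per-person sample means from the point-estimates slide, and no information is shared across people.

```r
# no pooling, classical version: one independent mean per person
m_lm <- lm(IQ ~ 0 + Person, data = IQdata)
coef(m_lm)     # identical to the per-person sample means computed above
confint(m_lm)  # 95% confidence intervals; nothing is shared across people
```

The Bayesian models on the next slides start from the same formula (`IQ ~ 0 + Person`) and then add complete and partial pooling.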
--- ## Parameter estimation Our goal is to estimate the 3 latent IQs and the group mean. We assume that for each person, the measurements are normally distributed around that person's latent IQ. For simplicity, we assume a common standard deviation. $$ IQ_{i} \sim \mathcal{N}(\mu_j, \sigma^2) $$ We have 3 possibilities: - __No pooling:__ We can pretend that all people are independent of each other. - __Complete pooling:__ We can pretend that all observations are from the same person. - __Partial pooling:__ We can take into account that we have multiple observations for each person. We can use these to estimate each person's mean, but we assume that these 3 people are similar to each other, i.e., they are samples from a super-population of people. Therefore, information about one person can inform the estimates for the others. --- ## Partial Pooling Model The partial pooling model can be written as $$ `\begin{aligned} \operatorname{IQ}_{i} &\sim N \left(\alpha_{j}, \sigma^2 \right) \\ \alpha_{j} &\sim N \left(\mu_{\alpha}, \sigma^2_{\alpha} \right) \text{, for person j = 1,} \dots \text{,J} \end{aligned}` $$ where `\(\mu_{\alpha}\)` is the population-level effect, and the `\(\alpha_{j}\)` are the latent IQ scores for each person. --- ## No pooling ```r library(brms) get_prior(IQ ~ 0 + Person, data = IQdata) ``` ``` ## prior class coef group resp dpar nlpar bound source ## (flat) b default ## (flat) b PersonA (vectorized) ## (flat) b PersonB (vectorized) ## (flat) b PersonC (vectorized) ## student_t(3, 0, 7.4) sigma default ``` --- ## No pooling ```r m_no_pool <- brm(IQ ~ 0 + Person, data = IQdata, file = "models/m_no_pool") ``` --- ## Complete pooling ```r get_prior(IQ ~ 1, data = IQdata) ``` ``` ## prior class coef group resp dpar nlpar bound source ## student_t(3, 110, 7.4) Intercept default ## student_t(3, 0, 7.4) sigma default ``` --- ## Complete pooling ```r m_comp_pool <- brm(IQ ~ 1, data = IQdata, file = "models/m_comp_pool") ``` --- ## Partial pooling ```r get_prior(IQ ~ 1 + (1 | Person), data = IQdata) ``` ``` ## prior class coef group resp dpar nlpar bound ## student_t(3, 110, 7.4) Intercept ## student_t(3, 0, 7.4) sd ## student_t(3, 0, 7.4) sd Person ## student_t(3, 0, 7.4) sd Intercept Person ## student_t(3, 0, 7.4) sigma ## source ## default ## default ## (vectorized) ## (vectorized) ## default ``` --- ## Partial pooling <img src="01-hierarchical-models-slides_files/figure-html/unnamed-chunk-14-1.png" width="100%" /> --- ## Partial pooling ```r m_part_pool <- brm(IQ ~ 1 + (1 | Person), data = IQdata, file = "models/m_part_pool") ``` --- .panelset[ .panel[.panel-name[No Pooling] ```r m_no_pool ``` ``` ## Family: gaussian ## Links: mu = identity; sigma = identity ## Formula: IQ ~ 0 + Person ## Data: IQdata (Number of observations: 9) ## Samples: 4 chains, each with iter = 2000; warmup = 1000; thin = 1; ## total post-warmup samples = 4000 ## ## Population-Level Effects: ## Estimate Est.Error l-95% CI u-95% CI Rhat Bulk_ESS Tail_ESS ## PersonA 105.67 4.90 95.63 115.37 1.00 2708 1960 ## PersonB 110.03 4.96 100.05 119.93 1.00 3531 2408 ## PersonC 117.66 4.92 107.87 127.13 1.00 3645 2285 ## ## Family Specific Parameters: ## Estimate Est.Error l-95% CI u-95% CI Rhat Bulk_ESS Tail_ESS ## sigma 8.17 2.55 4.76 14.42 1.00 2152 2189 ## ## Samples were drawn using sampling(NUTS). For each parameter, Bulk_ESS ## and Tail_ESS are effective sample size measures, and Rhat is the potential ## scale reduction factor on split chains (at convergence, Rhat = 1). 
``` ] .panel[.panel-name[Complete Pooling] ```r m_comp_pool ``` ``` ## Family: gaussian ## Links: mu = identity; sigma = identity ## Formula: IQ ~ 1 ## Data: IQdata (Number of observations: 9) ## Samples: 4 chains, each with iter = 2000; warmup = 1000; thin = 1; ## total post-warmup samples = 4000 ## ## Population-Level Effects: ## Estimate Est.Error l-95% CI u-95% CI Rhat Bulk_ESS Tail_ESS ## Intercept 110.93 2.71 105.68 116.47 1.00 2390 2126 ## ## Family Specific Parameters: ## Estimate Est.Error l-95% CI u-95% CI Rhat Bulk_ESS Tail_ESS ## sigma 8.78 2.22 5.64 14.04 1.00 2397 2354 ## ## Samples were drawn using sampling(NUTS). For each parameter, Bulk_ESS ## and Tail_ESS are effective sample size measures, and Rhat is the potential ## scale reduction factor on split chains (at convergence, Rhat = 1). ``` ] .panel[.panel-name[Partial Pooling] ```r m_part_pool ``` ``` ## Family: gaussian ## Links: mu = identity; sigma = identity ## Formula: IQ ~ 1 + (1 | Person) ## Data: IQdata (Number of observations: 9) ## Samples: 4 chains, each with iter = 2000; warmup = 1000; thin = 1; ## total post-warmup samples = 4000 ## ## Group-Level Effects: ## ~Person (Number of levels: 3) ## Estimate Est.Error l-95% CI u-95% CI Rhat Bulk_ESS Tail_ESS ## sd(Intercept) 5.00 3.90 0.18 15.33 1.00 1187 1793 ## ## Population-Level Effects: ## Estimate Est.Error l-95% CI u-95% CI Rhat Bulk_ESS Tail_ESS ## Intercept 111.03 3.52 104.03 118.26 1.00 1537 1491 ## ## Family Specific Parameters: ## Estimate Est.Error l-95% CI u-95% CI Rhat Bulk_ESS Tail_ESS ## sigma 8.09 2.29 4.88 13.45 1.00 1733 2515 ## ## Samples were drawn using sampling(NUTS). For each parameter, Bulk_ESS ## and Tail_ESS are effective sample size measures, and Rhat is the potential ## scale reduction factor on split chains (at convergence, Rhat = 1). ``` ] ] --- ## Partial Pooling ```r f <- fixef(m_part_pool, summary = FALSE) r <- ranef(m_part_pool, summary = FALSE) ``` ```r library(tidybayes) get_variables(m_part_pool) ``` ``` ## [1] "b_Intercept" "sd_Person__Intercept" "sigma" ## [4] "r_Person[A,Intercept]" "r_Person[B,Intercept]" "r_Person[C,Intercept]" ## [7] "lp__" "accept_stat__" "stepsize__" ## [10] "treedepth__" "n_leapfrog__" "divergent__" ## [13] "energy__" ``` ```r person_effects <- m_part_pool |> spread_draws(b_Intercept, r_Person[Person, Intercept]) |> # add the grand mean to the person-specific deviations mutate(mu = b_Intercept + r_Person) ``` --- ```r person_effects |> median_qi(mu) ``` ``` ## # A tibble: 3 x 8 ## # Groups: Person [3] ## Person Intercept mu .lower .upper .width .point .interval ## <chr> <chr> <dbl> <dbl> <dbl> <dbl> <chr> <chr> ## 1 A Intercept 109. 100. 116. 0.95 median qi ## 2 B Intercept 111. 104. 118. 0.95 median qi ## 3 C Intercept 114. 106. 122. 
0.95 median qi ``` ```r fixef(m_no_pool) ``` ``` ## Estimate Est.Error Q2.5 Q97.5 ## PersonA 105.6677 4.903999 95.63449 115.3670 ## PersonB 110.0306 4.962230 100.04958 119.9328 ## PersonC 117.6587 4.917794 107.87280 127.1344 ``` --- .panelset[ .panel[.panel-name[Plot Code] ```r person_effects |> ggplot(aes(mu, Person)) + stat_halfeye(fill = "#A2BF8A") + geom_vline(xintercept = fixef(m_part_pool)[1, 1], color = "#839496", size = 1) + geom_vline(xintercept = fixef(m_part_pool)[1, 3:4], color = "#839496", linetype = 2) + labs(x = expression("Subject specific means"), y = "Subjects") + theme(panel.grid = element_blank(), axis.ticks.y = element_blank(), axis.text.y = element_text(hjust = 0)) ``` ] .panel[.panel-name[Plot] <img src="01-hierarchical-models-slides_files/figure-html/unnamed-chunk-24-1.png" width="100%" /> ] ] --- ## Partial Pooling vs No Pooling Due to the hierarchical nature of the model, individual estimates are pulled toward the group mean. This means that they will tend to be less extreme. This is known as "shrinkage". .panelset[ .panel[.panel-name[Plot Code] ```r col <- viridis::viridis(3, begin = 0.2, end = 0.8) person_effects |> ggplot(aes(mu, Person, fill = Person)) + stat_halfeye(alpha = 0.6) + geom_vline(xintercept = fixef(m_part_pool)[1, 1], color = "#ECCC87", size = 1) + geom_vline(xintercept = fixef(m_no_pool)[1, 1], color = col[1], size = 1) + geom_vline(xintercept = fixef(m_no_pool)[2, 1], color = col[2], size = 1) + geom_vline(xintercept = fixef(m_no_pool)[3, 1], color = col[3], size = 1) + scale_fill_viridis_d(begin = 0.2, end = 0.8) + labs(x = expression("Subject specific means"), y = "Subjects") + theme(panel.grid = element_blank(), axis.ticks.y = element_blank(), axis.text.y = element_text(hjust = 0)) ``` ] .panel[.panel-name[Plot] <img src="01-hierarchical-models-slides_files/figure-html/unnamed-chunk-26-1.png" width="100%" /> ] ] --- class: middle .pull-left-narrow[ .huge-blue-number[2] ] .pull-right-wide[ .larger[ Mixed Models ] ] --- .panelset[ .panel[.panel-name[Create dataset] ```r library(tidyverse) intervention <- rep(c('treat', 'control'), each = 5) pre <- c(20, 10, 60, 20, 10, 50, 10, 40, 20, 10) post <- c(70, 50, 90, 60, 50, 20, 10, 30, 50, 10) ``` ```r dwide <- tibble(id = factor(1:10), intervention, pre, post) |> mutate(diff = post - pre, id = as_factor(id), intervention = factor(intervention, levels = c("control", "treat"))) ``` ] .panel[.panel-name[Dataframe] ```r dwide |> rmarkdown::paged_table() ``` <div data-pagedtable="false"> <script data-pagedtable-source type="application/json"> {"columns":[{"label":["id"],"name":[1],"type":["fct"],"align":["left"]},{"label":["intervention"],"name":[2],"type":["fct"],"align":["left"]},{"label":["pre"],"name":[3],"type":["dbl"],"align":["right"]},{"label":["post"],"name":[4],"type":["dbl"],"align":["right"]},{"label":["diff"],"name":[5],"type":["dbl"],"align":["right"]}],"data":[{"1":"1","2":"treat","3":"20","4":"70","5":"50"},{"1":"2","2":"treat","3":"10","4":"50","5":"40"},{"1":"3","2":"treat","3":"60","4":"90","5":"30"},{"1":"4","2":"treat","3":"20","4":"60","5":"40"},{"1":"5","2":"treat","3":"10","4":"50","5":"40"},{"1":"6","2":"control","3":"50","4":"20","5":"-30"},{"1":"7","2":"control","3":"10","4":"10","5":"0"},{"1":"8","2":"control","3":"40","4":"30","5":"-10"},{"1":"9","2":"control","3":"20","4":"50","5":"30"},{"1":"10","2":"control","3":"10","4":"10","5":"0"}],"options":{"columns":{"min":{},"max":[10]},"rows":{"min":[10],"max":[10]},"pages":{}}} </script> </div> ]] --- ```r d <- dwide |>
select(-diff) |> pivot_longer(cols = pre:post, names_to = "time", values_to = "score") |> mutate(time = as_factor(time)) d |> rmarkdown::paged_table() ``` <div data-pagedtable="false"> <script data-pagedtable-source type="application/json"> {"columns":[{"label":["id"],"name":[1],"type":["fct"],"align":["left"]},{"label":["intervention"],"name":[2],"type":["fct"],"align":["left"]},{"label":["time"],"name":[3],"type":["fct"],"align":["left"]},{"label":["score"],"name":[4],"type":["dbl"],"align":["right"]}],"data":[{"1":"1","2":"treat","3":"pre","4":"20"},{"1":"1","2":"treat","3":"post","4":"70"},{"1":"2","2":"treat","3":"pre","4":"10"},{"1":"2","2":"treat","3":"post","4":"50"},{"1":"3","2":"treat","3":"pre","4":"60"},{"1":"3","2":"treat","3":"post","4":"90"},{"1":"4","2":"treat","3":"pre","4":"20"},{"1":"4","2":"treat","3":"post","4":"60"},{"1":"5","2":"treat","3":"pre","4":"10"},{"1":"5","2":"treat","3":"post","4":"50"},{"1":"6","2":"control","3":"pre","4":"50"},{"1":"6","2":"control","3":"post","4":"20"},{"1":"7","2":"control","3":"pre","4":"10"},{"1":"7","2":"control","3":"post","4":"10"},{"1":"8","2":"control","3":"pre","4":"40"},{"1":"8","2":"control","3":"post","4":"30"},{"1":"9","2":"control","3":"pre","4":"20"},{"1":"9","2":"control","3":"post","4":"50"},{"1":"10","2":"control","3":"pre","4":"10"},{"1":"10","2":"control","3":"post","4":"10"}],"options":{"columns":{"min":{},"max":[10]},"rows":{"min":[10],"max":[10]},"pages":{}}} </script> </div> --- We have 10 subjects. .panelset[ .panel[.panel-name[Subjects] ```r d |> summarize(id = n_distinct(id)) ``` ``` ## # A tibble: 1 x 1 ## id ## <int> ## 1 10 ``` ] .panel[.panel-name[Trials per subject] with 2 observations per subject. ```r d |> group_by(id, intervention) |> count() |> rmarkdown::paged_table() ``` <div data-pagedtable="false"> <script data-pagedtable-source type="application/json"> {"columns":[{"label":["id"],"name":[1],"type":["fct"],"align":["left"]},{"label":["intervention"],"name":[2],"type":["fct"],"align":["left"]},{"label":["n"],"name":[3],"type":["int"],"align":["right"]}],"data":[{"1":"1","2":"treat","3":"2"},{"1":"2","2":"treat","3":"2"},{"1":"3","2":"treat","3":"2"},{"1":"4","2":"treat","3":"2"},{"1":"5","2":"treat","3":"2"},{"1":"6","2":"control","3":"2"},{"1":"7","2":"control","3":"2"},{"1":"8","2":"control","3":"2"},{"1":"9","2":"control","3":"2"},{"1":"10","2":"control","3":"2"}],"options":{"columns":{"min":{},"max":[10]},"rows":{"min":[10],"max":[10]},"pages":{}}} </script> </div> ] ] --- .discussion[ How would you analyse these data? ]
--- ## Summarize Data .panelset[ .panel[.panel-name[Standard error function] ```r se <- function(x) sd(x)/sqrt(length(x)) ``` ] .panel[.panel-name[Summarize (long)] ```r d |> group_by(intervention, time) |> summarise(mean = mean(score), sd = sd(score), se = se(score)) ``` ``` ## # A tibble: 4 x 5 ## # Groups: intervention [2] ## intervention time mean sd se ## <fct> <fct> <dbl> <dbl> <dbl> ## 1 control pre 26 18.2 8.12 ## 2 control post 24 16.7 7.48 ## 3 treat pre 24 20.7 9.27 ## 4 treat post 64 16.7 7.48 ``` ] .panel[.panel-name[Summarize (wide)] ```r dwide |> group_by(intervention) |> summarise(mean = mean(diff), sd = sd(diff), se = se(diff)) ``` ``` ## # A tibble: 2 x 4 ## intervention mean sd se ## <fct> <dbl> <dbl> <dbl> ## 1 control -2 21.7 9.70 ## 2 treat 40 7.07 3.16 ``` ] ] --- .panelset[ .panel[.panel-name[Plot Code] ```r d |> ggplot(aes(time, score, color = intervention)) + geom_line(aes(group = id), linetype = 1, size = 1) + geom_point(size = 4) + scale_color_viridis_d(end = 0.8) + theme_bw() ``` ] .panel[.panel-name[Plot] <img src="01-hierarchical-models-slides_files/figure-html/unnamed-chunk-38-1.png" width="100%" /> ] ] --- ## Welch Test ```r t.test(diff ~ intervention, data = dwide, var.equal = FALSE) ``` ``` ## ## Welch Two Sample t-test ## ## data: diff by intervention ## t = -4.1184, df = 4.8415, p-value = 0.009835 ## alternative hypothesis: true difference in means between group control and group treat is not equal to 0 ## 95 percent confidence interval: ## -68.47507 -15.52493 ## sample estimates: ## mean in group control mean in group treat ## -2 40 ``` --- ## Mixed Model .panelset[ .panel[.panel-name[Model Code] ```r library(lme4) lme_model <- lmer(score ~ intervention * time + (1|id), data = d) ``` ] .panel[.panel-name[Model summary] ```r lme_model |> sjPlot::tab_model() ``` <table style="border-collapse:collapse; border:none;"> <tr> <th style="border-top: double; text-align:center; font-style:normal; font-weight:bold; padding:0.2cm; text-align:left; "> </th> <th colspan="3" style="border-top: double; text-align:center; font-style:normal; font-weight:bold; padding:0.2cm; ">score</th> </tr> <tr> <td style=" text-align:center; border-bottom:1px solid; font-style:italic; font-weight:normal; text-align:left; ">Predictors</td> <td style=" text-align:center; border-bottom:1px solid; font-style:italic; font-weight:normal; ">Estimates</td> <td style=" text-align:center; border-bottom:1px solid; font-style:italic; font-weight:normal; ">CI</td> <td style=" text-align:center; border-bottom:1px solid; font-style:italic; font-weight:normal; ">p</td> </tr> <tr> <td style=" padding:0.2cm; text-align:left; vertical-align:top; text-align:left; ">(Intercept)</td> <td style=" padding:0.2cm; text-align:left; vertical-align:top; text-align:center; ">26.00</td> <td style=" padding:0.2cm; text-align:left; vertical-align:top; text-align:center; ">10.08 – 41.92</td> <td style=" padding:0.2cm; text-align:left; vertical-align:top; text-align:center; "><strong>0.001</strong></td> </tr> <tr> <td style=" padding:0.2cm; text-align:left; vertical-align:top; text-align:left; ">intervention [treat]</td> <td style=" padding:0.2cm; text-align:left; vertical-align:top; text-align:center; ">-2.00</td> <td style=" padding:0.2cm; text-align:left; vertical-align:top; text-align:center; ">-24.52 – 20.52</td> <td style=" padding:0.2cm; text-align:left; vertical-align:top; text-align:center; ">0.862</td> </tr> <tr> <td style=" padding:0.2cm; text-align:left; vertical-align:top; text-align:left; ">time 
[post]</td> <td style=" padding:0.2cm; text-align:left; vertical-align:top; text-align:center; ">-2.00</td> <td style=" padding:0.2cm; text-align:left; vertical-align:top; text-align:center; ">-16.13 – 12.13</td> <td style=" padding:0.2cm; text-align:left; vertical-align:top; text-align:center; ">0.782</td> </tr> <tr> <td style=" padding:0.2cm; text-align:left; vertical-align:top; text-align:left; ">intervention [treat] *<br>time [post]</td> <td style=" padding:0.2cm; text-align:left; vertical-align:top; text-align:center; ">42.00</td> <td style=" padding:0.2cm; text-align:left; vertical-align:top; text-align:center; ">22.01 – 61.99</td> <td style=" padding:0.2cm; text-align:left; vertical-align:top; text-align:center; "><strong><0.001</strong></td> </tr> <tr> <td colspan="4" style="font-weight:bold; text-align:left; padding-top:.8em;">Random Effects</td> </tr> <tr> <td style=" padding:0.2cm; text-align:left; vertical-align:top; text-align:left; padding-top:0.1cm; padding-bottom:0.1cm;">σ<sup>2</sup></td> <td style=" padding:0.2cm; text-align:left; vertical-align:top; padding-top:0.1cm; padding-bottom:0.1cm; text-align:left;" colspan="3">130.00</td> <tr> <td style=" padding:0.2cm; text-align:left; vertical-align:top; text-align:left; padding-top:0.1cm; padding-bottom:0.1cm;">τ<sub>00</sub> <sub>id</sub></td> <td style=" padding:0.2cm; text-align:left; vertical-align:top; padding-top:0.1cm; padding-bottom:0.1cm; text-align:left;" colspan="3">200.00</td> <tr> <td style=" padding:0.2cm; text-align:left; vertical-align:top; text-align:left; padding-top:0.1cm; padding-bottom:0.1cm;">ICC</td> <td style=" padding:0.2cm; text-align:left; vertical-align:top; padding-top:0.1cm; padding-bottom:0.1cm; text-align:left;" colspan="3">0.61</td> <tr> <td style=" padding:0.2cm; text-align:left; vertical-align:top; text-align:left; padding-top:0.1cm; padding-bottom:0.1cm;">N <sub>id</sub></td> <td style=" padding:0.2cm; text-align:left; vertical-align:top; padding-top:0.1cm; padding-bottom:0.1cm; text-align:left;" colspan="3">10</td> <tr> <td style=" padding:0.2cm; text-align:left; vertical-align:top; text-align:left; padding-top:0.1cm; padding-bottom:0.1cm; border-top:1px solid;">Observations</td> <td style=" padding:0.2cm; text-align:left; vertical-align:top; padding-top:0.1cm; padding-bottom:0.1cm; text-align:left; border-top:1px solid;" colspan="3">20</td> </tr> <tr> <td style=" padding:0.2cm; text-align:left; vertical-align:top; text-align:left; padding-top:0.1cm; padding-bottom:0.1cm;">Marginal R<sup>2</sup> / Conditional R<sup>2</sup></td> <td style=" padding:0.2cm; text-align:left; vertical-align:top; padding-top:0.1cm; padding-bottom:0.1cm; text-align:left;" colspan="3">0.481 / 0.796</td> </tr> </table> ] .panel[.panel-name[Model equations] $$ `\begin{aligned} \operatorname{score}_{i} &\sim N \left(\alpha_{j[i]} + \beta_{1}(\operatorname{time}_{\operatorname{post}}), \sigma^2 \right) \\ \alpha_{j} &\sim N \left(\gamma_{0}^{\alpha} + \gamma_{1}^{\alpha}(\operatorname{intervention}_{\operatorname{treat}}) + \gamma_{2}^{\alpha}(\operatorname{intervention}_{\operatorname{treat}} \times \operatorname{time}_{\operatorname{post}}), \sigma^2_{\alpha_{j}} \right) \text{, for id j = 1,} \dots \text{,J} \end{aligned}` $$ ]] --- ```r library(afex) mixed(score ~ intervention * time + (1|id), type = 3, method = "KR", data = d) ``` ``` ## Fitting one lmer() model. [DONE] ## Calculating p-values. 
[DONE] ``` ``` ## Mixed Model Anova Table (Type 3 tests, KR-method) ## ## Model: score ~ intervention * time + (1 | id) ## Data: d ## Effect df F p.value ## 1 intervention 1, 8 3.41 .102 ## 2 time 1, 8 13.88 ** .006 ## 3 intervention:time 1, 8 16.96 ** .003 ## --- ## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '+' 0.1 ' ' 1 ``` --- ## What is going on here? ```r mm <- model.matrix(~ intervention * time, data = d) head(mm) ``` ``` ## (Intercept) interventiontreat timepost interventiontreat:timepost ## 1 1 1 0 0 ## 2 1 1 1 1 ## 3 1 1 0 0 ## 4 1 1 1 1 ## 5 1 1 0 0 ## 6 1 1 1 1 ``` --- ## Explanation - We are predicting the mean of the outcome variable `score` with a linear model. - The mean is decomposed into + population-level effects: `intervention * time` + group-level effects (a deviation from the average for every person): `(1 | id)` --- ## Using `brms` .panelset[ .panel[.panel-name[get_prior] ```r library(brms) priors1 <- get_prior(score ~ intervention * time + (1|id), data = d) ``` ] .panel[.panel-name[Priors] ``` ## # A tibble: 9 x 4 ## prior class coef group ## <chr> <chr> <chr> <chr> ## 1 "" b "" "" ## 2 "" b "interventiontreat" "" ## 3 "" b "interventiontreat:timepost" "" ## 4 "" b "timepost" "" ## 5 "student_t(3, 25, 22.2)" Intercept "" "" ## 6 "student_t(3, 0, 22.2)" sd "" "" ## 7 "" sd "" "id" ## 8 "" sd "Intercept" "id" ## 9 "student_t(3, 0, 22.2)" sigma "" "" ``` ] ] --- ```r m2 <- brm(score ~ intervention*time + (1 | id), prior = prior(normal(0, 50), class = b), data = d, control = list(adapt_delta = 0.9), file = "models/04-treat-time") ``` --- ```r summary(m2) ``` ``` ## Family: gaussian ## Links: mu = identity; sigma = identity ## Formula: score ~ intervention * time + (1 | id) ## Data: d (Number of observations: 20) ## Samples: 4 chains, each with iter = 2000; warmup = 1000; thin = 1; ## total post-warmup samples = 4000 ## ## Group-Level Effects: ## ~id (Number of levels: 10) ## Estimate Est.Error l-95% CI u-95% CI Rhat Bulk_ESS Tail_ESS ## sd(Intercept) 14.08 6.31 2.05 27.59 1.00 806 955 ## ## Population-Level Effects: ## Estimate Est.Error l-95% CI u-95% CI Rhat Bulk_ESS ## Intercept 24.54 8.99 6.19 41.83 1.00 1699 ## interventiontreat -0.65 12.66 -25.31 24.65 1.00 1711 ## timepost -0.75 8.50 -17.23 17.00 1.00 2771 ## interventiontreat:timepost 39.38 11.61 15.30 61.64 1.00 2381 ## Tail_ESS ## Intercept 2366 ## interventiontreat 2017 ## timepost 2203 ## interventiontreat:timepost 2059 ## ## Family Specific Parameters: ## Estimate Est.Error l-95% CI u-95% CI Rhat Bulk_ESS Tail_ESS ## sigma 13.55 3.75 8.16 21.99 1.00 921 1514 ## ## Samples were drawn using sampling(NUTS). For each parameter, Bulk_ESS ## and Tail_ESS are effective sample size measures, and Rhat is the potential ## scale reduction factor on split chains (at convergence, Rhat = 1). ``` --- ```r m2 |> mcmc_plot("b_") ``` <img src="01-hierarchical-models-slides_files/figure-html/unnamed-chunk-48-1.png" width="100%" /> --- ```r ce <- conditional_effects(m2, effects = "intervention:time") ``` ```r ce ``` <img src="01-hierarchical-models-slides_files/figure-html/unnamed-chunk-50-1.png" width="100%" /> --- ```r plot(ce, plot = FALSE)[[1]] + scale_color_brewer(type = "qual") + scale_fill_brewer(type = "qual") ``` <img src="01-hierarchical-models-slides_files/figure-html/unnamed-chunk-51-1.png" width="100%" />
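---

## Querying the posterior directly

The estimate of interest is the `intervention:time` interaction, i.e. the treatment effect on the change from pre to post. In the change-score summary it is 40 - (-2) = 42, which is exactly the `lmer()` estimate, while the posterior mean from `m2` (about 39) is a little smaller, in part because of the `normal(0, 50)` prior on the coefficients. A minimal sketch of how to summarise this effect from the posterior draws (added for illustration, not part of the original slides; it uses `as_draws_df()` rather than the `tidybayes` helpers shown earlier):

```r
# population-level summary of the interaction (the treatment effect on change)
fixef(m2)["interventiontreat:timepost", ]

# posterior probability that the treatment effect is positive
draws <- as_draws_df(m2)
mean(draws$`b_interventiontreat:timepost` > 0)
```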