class: center, middle, inverse, title-slide # Intro to Bayesian Statistics ## Part 3
Hierarchical models ### Andrew Ellis ### Bayesian multilevel modelling workshop 2021 ### 2021-05-28 --- layout: true <!-- Home icon --> <div class="my-footer"> <span> <a href="https://awellis.github.io/learnmultilevelmodels/" target="_blank">
</a> Graduate School workshop 2021 </span> </div> <!-- Name (left) --> <!-- <div class="my-footer"> --> <!-- <span> --> <!-- Andrew Ellis - <a href="https://kogpsy.github.io/neuroscicomplab" target="_blank">kogpsy.github.io/neuroscicomplab</a> --> <!-- </span> --> <!-- </div> --> <!-- slide separator (for xaringan) --> --- ## Hierarchical Models <!-- library(lmerTest) --> <!-- options(contrasts=c('contr.sum', 'contr.poly')) --> <!-- anova(lmer(y ~ x + (1|subj),data=myDat),ddf="Kenward-Roger") --> <!-- library(afex) --> <!-- mixed(y ~ x + (1|subj), type=3,method="KR",data=myDat) --> How can repeated measurements be taken into account? -- --- class: middle .pull-left-narrow[ .huge-blue-number[1] ] .pull-right-wide[ .larger[ Estimating Means ] ] --- ## Another IQ Example This time we have 3 subjects, and 3 measurements from each subject. .footnote[ Lee, Michael D., and Eric-Jan Wagenmakers. Bayesian Cognitive Modeling: A Practical Course. Cambridge: Cambridge University Press, 2014. https://doi.org/10.1017/CBO9781139087759. ] .panelset[ .panel[.panel-name[Create data] ```r IQwide <- tribble( ~A, ~B, ~C, 110, 105, 115, 105, 112, 108, 102, 113, 130 ) ``` ] .panel[.panel-name[Pivot longer] ```r IQdata <- IQwide |> pivot_longer(everything(), names_to = "Person", values_to = "IQ") |> mutate(Person = as_factor(Person)) |> arrange(Person) ``` ] .panel[.panel-name[Data] ```r library(kableExtra) IQdata |> kbl() |> scroll_box(width = "500px", height = "200px") ``` <div style="border: 1px solid #ddd; padding: 0px; overflow-y: scroll; height:200px; overflow-x: scroll; width:500px; "><table> <thead> <tr> <th style="text-align:left;position: sticky; top:0; background-color: #FFFFFF;"> Person </th> <th style="text-align:right;position: sticky; top:0; background-color: #FFFFFF;"> IQ </th> </tr> </thead> <tbody> <tr> <td style="text-align:left;"> A </td> <td style="text-align:right;"> 110 </td> </tr> <tr> <td style="text-align:left;"> A </td> <td style="text-align:right;"> 105 </td> </tr> <tr> <td style="text-align:left;"> A </td> <td style="text-align:right;"> 102 </td> </tr> <tr> <td style="text-align:left;"> B </td> <td style="text-align:right;"> 105 </td> </tr> <tr> <td style="text-align:left;"> B </td> <td style="text-align:right;"> 112 </td> </tr> <tr> <td style="text-align:left;"> B </td> <td style="text-align:right;"> 113 </td> </tr> <tr> <td style="text-align:left;"> C </td> <td style="text-align:right;"> 115 </td> </tr> <tr> <td style="text-align:left;"> C </td> <td style="text-align:right;"> 108 </td> </tr> <tr> <td style="text-align:left;"> C </td> <td style="text-align:right;"> 130 </td> </tr> </tbody> </table></div> ] ] --- ## Point Estimates ```r se <- function(x) sd(x)/sqrt(length(x)) IQdata |> group_by(Person) |> summarise(mean = mean(IQ), sd = sd(IQ), se = se(IQ)) ``` ``` ## # A tibble: 3 x 4 ## Person mean sd se ## <fct> <dbl> <dbl> <dbl> ## 1 A 106. 4.04 2.33 ## 2 B 110 4.36 2.52 ## 3 C 118. 11.2 6.49 ``` --- ```r IQdata |> ggplot(aes(Person, IQ)) + geom_point() ``` <img src="01-hierarchical-models-slides_files/figure-html/unnamed-chunk-6-1.png" width="100%" /> --- .discussion[ - How would you analyse these data? ]
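---

## A classical baseline

A minimal classical sketch, added here for comparison (it is not part of the original workshop code): the "no pooling" answer to the discussion question is simply `lm()` with one mean per person. Its coefficients reproduce the per-person sample means from the point-estimates slide, and no information is shared across people.

```r
# no pooling, classical version: one independent mean per person
m_lm <- lm(IQ ~ 0 + Person, data = IQdata)
coef(m_lm)     # identical to the per-person sample means computed above
confint(m_lm)  # 95% confidence intervals; nothing is shared across people
```

The Bayesian models on the next slides start from the same formula (`IQ ~ 0 + Person`) and then add complete and partial pooling.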
--- ## Parameter estimation Our goal is to estimate the 3 latent IQs and the group mean. We assume that for each person, the measurements are normally distributed around that person's latent IQ. For simplicity, we assume a common standard deviation. $$ IQ_{i} \sim \mathcal{N}(\mu_j, \sigma^2) $$ We have 3 possibilities: - __No pooling:__ We can pretend that all people are independent of each other. - __Complete pooling:__ We can pretend that all observations are from the same person. - __Partial pooling:__ We can take into account that we have multiple observations for each person. We can use these to estimate each person's mean, but we assume that these 3 people are similar to each other, i.e., they are samples from a super-population of people. Therefore, information about one person can inform the estimates for the others. --- ## Partial Pooling Model The partial pooling model can be written as $$ `\begin{aligned} \operatorname{IQ}_{i} &\sim N \left(\alpha_{j}, \sigma^2 \right) \\ \alpha_{j} &\sim N \left(\mu_{\alpha}, \sigma^2_{\alpha} \right) \text{, for person j = 1,} \dots \text{,J} \end{aligned}` $$ where `\(\mu_{\alpha}\)` is the population-level effect, and the `\(\alpha_{j}\)` are the latent IQ scores for each person. --- ## No pooling ```r library(brms) get_prior(IQ ~ 0 + Person, data = IQdata) ``` ``` ## prior class coef group resp dpar nlpar bound source ## (flat) b default ## (flat) b PersonA (vectorized) ## (flat) b PersonB (vectorized) ## (flat) b PersonC (vectorized) ## student_t(3, 0, 7.4) sigma default ``` --- ## No pooling ```r m_no_pool <- brm(IQ ~ 0 + Person, data = IQdata, file = "models/m_no_pool") ``` --- ## Complete pooling ```r get_prior(IQ ~ 1, data = IQdata) ``` ``` ## prior class coef group resp dpar nlpar bound source ## student_t(3, 110, 7.4) Intercept default ## student_t(3, 0, 7.4) sigma default ``` --- ## Complete pooling ```r m_comp_pool <- brm(IQ ~ 1, data = IQdata, file = "models/m_comp_pool") ``` --- ## Partial pooling ```r get_prior(IQ ~ 1 + (1 | Person), data = IQdata) ``` ``` ## prior class coef group resp dpar nlpar bound ## student_t(3, 110, 7.4) Intercept ## student_t(3, 0, 7.4) sd ## student_t(3, 0, 7.4) sd Person ## student_t(3, 0, 7.4) sd Intercept Person ## student_t(3, 0, 7.4) sigma ## source ## default ## default ## (vectorized) ## (vectorized) ## default ``` --- ## Partial pooling <img src="01-hierarchical-models-slides_files/figure-html/unnamed-chunk-14-1.png" width="100%" /> --- ## Partial pooling ```r m_part_pool <- brm(IQ ~ 1 + (1 | Person), data = IQdata, file = "models/m_part_pool") ``` --- .panelset[ .panel[.panel-name[No Pooling] ```r m_no_pool ``` ``` ## Family: gaussian ## Links: mu = identity; sigma = identity ## Formula: IQ ~ 0 + Person ## Data: IQdata (Number of observations: 9) ## Samples: 4 chains, each with iter = 2000; warmup = 1000; thin = 1; ## total post-warmup samples = 4000 ## ## Population-Level Effects: ## Estimate Est.Error l-95% CI u-95% CI Rhat Bulk_ESS Tail_ESS ## PersonA 105.67 4.90 95.63 115.37 1.00 2708 1960 ## PersonB 110.03 4.96 100.05 119.93 1.00 3531 2408 ## PersonC 117.66 4.92 107.87 127.13 1.00 3645 2285 ## ## Family Specific Parameters: ## Estimate Est.Error l-95% CI u-95% CI Rhat Bulk_ESS Tail_ESS ## sigma 8.17 2.55 4.76 14.42 1.00 2152 2189 ## ## Samples were drawn using sampling(NUTS). For each parameter, Bulk_ESS ## and Tail_ESS are effective sample size measures, and Rhat is the potential ## scale reduction factor on split chains (at convergence, Rhat = 1). 
``` ] .panel[.panel-name[Complete Pooling] ```r m_comp_pool ``` ``` ## Family: gaussian ## Links: mu = identity; sigma = identity ## Formula: IQ ~ 1 ## Data: IQdata (Number of observations: 9) ## Samples: 4 chains, each with iter = 2000; warmup = 1000; thin = 1; ## total post-warmup samples = 4000 ## ## Population-Level Effects: ## Estimate Est.Error l-95% CI u-95% CI Rhat Bulk_ESS Tail_ESS ## Intercept 110.93 2.71 105.68 116.47 1.00 2390 2126 ## ## Family Specific Parameters: ## Estimate Est.Error l-95% CI u-95% CI Rhat Bulk_ESS Tail_ESS ## sigma 8.78 2.22 5.64 14.04 1.00 2397 2354 ## ## Samples were drawn using sampling(NUTS). For each parameter, Bulk_ESS ## and Tail_ESS are effective sample size measures, and Rhat is the potential ## scale reduction factor on split chains (at convergence, Rhat = 1). ``` ] .panel[.panel-name[Partial Pooling] ```r m_part_pool ``` ``` ## Family: gaussian ## Links: mu = identity; sigma = identity ## Formula: IQ ~ 1 + (1 | Person) ## Data: IQdata (Number of observations: 9) ## Samples: 4 chains, each with iter = 2000; warmup = 1000; thin = 1; ## total post-warmup samples = 4000 ## ## Group-Level Effects: ## ~Person (Number of levels: 3) ## Estimate Est.Error l-95% CI u-95% CI Rhat Bulk_ESS Tail_ESS ## sd(Intercept) 5.00 3.90 0.18 15.33 1.00 1187 1793 ## ## Population-Level Effects: ## Estimate Est.Error l-95% CI u-95% CI Rhat Bulk_ESS Tail_ESS ## Intercept 111.03 3.52 104.03 118.26 1.00 1537 1491 ## ## Family Specific Parameters: ## Estimate Est.Error l-95% CI u-95% CI Rhat Bulk_ESS Tail_ESS ## sigma 8.09 2.29 4.88 13.45 1.00 1733 2515 ## ## Samples were drawn using sampling(NUTS). For each parameter, Bulk_ESS ## and Tail_ESS are effective sample size measures, and Rhat is the potential ## scale reduction factor on split chains (at convergence, Rhat = 1). ``` ] ] --- ## Partial Pooling ```r f <- fixef(m_part_pool, summary = FALSE) r <- ranef(m_part_pool, summary = FALSE) ``` ```r library(tidybayes) get_variables(m_part_pool) ``` ``` ## [1] "b_Intercept" "sd_Person__Intercept" "sigma" ## [4] "r_Person[A,Intercept]" "r_Person[B,Intercept]" "r_Person[C,Intercept]" ## [7] "lp__" "accept_stat__" "stepsize__" ## [10] "treedepth__" "n_leapfrog__" "divergent__" ## [13] "energy__" ``` ```r person_effects <- m_part_pool |> spread_draws(b_Intercept, r_Person[Person, Intercept]) |> # add the grand mean to the person-specific deviations mutate(mu = b_Intercept + r_Person) ``` --- ```r person_effects |> median_qi(mu) ``` ``` ## # A tibble: 3 x 8 ## # Groups: Person [3] ## Person Intercept mu .lower .upper .width .point .interval ## <chr> <chr> <dbl> <dbl> <dbl> <dbl> <chr> <chr> ## 1 A Intercept 109. 100. 116. 0.95 median qi ## 2 B Intercept 111. 104. 118. 0.95 median qi ## 3 C Intercept 114. 106. 122. 
0.95 median qi ``` ```r fixef(m_no_pool) ``` ``` ## Estimate Est.Error Q2.5 Q97.5 ## PersonA 105.6677 4.903999 95.63449 115.3670 ## PersonB 110.0306 4.962230 100.04958 119.9328 ## PersonC 117.6587 4.917794 107.87280 127.1344 ``` --- .panelset[ .panel[.panel-name[Plot Code] ```r person_effects |> ggplot(aes(mu, Person)) + stat_halfeye(fill = "#A2BF8A") + geom_vline(xintercept = fixef(m_part_pool)[1, 1], color = "#839496", size = 1) + geom_vline(xintercept = fixef(m_part_pool)[1, 3:4], color = "#839496", linetype = 2) + labs(x = expression("Subject specific means"), y = "Subjects") + theme(panel.grid = element_blank(), axis.ticks.y = element_blank(), axis.text.y = element_text(hjust = 0)) ``` ] .panel[.panel-name[Plot] <img src="01-hierarchical-models-slides_files/figure-html/unnamed-chunk-24-1.png" width="100%" /> ] ] --- ## Partial Pooling vs No Pooling Due to the hierarchical nature of the model, individual estimates are pulled toward the group mean. This means that they will tend to be less extreme. This is known as "shrinkage". .panelset[ .panel[.panel-name[Plot Code] ```r col <- viridis::viridis(3, begin = 0.2, end = 0.8) person_effects |> ggplot(aes(mu, Person, fill = Person)) + stat_halfeye(alpha = 0.6) + geom_vline(xintercept = fixef(m_part_pool)[1, 1], color = "#ECCC87", size = 1) + geom_vline(xintercept = fixef(m_no_pool)[1, 1], color = col[1], size = 1) + geom_vline(xintercept = fixef(m_no_pool)[2, 1], color = col[2], size = 1) + geom_vline(xintercept = fixef(m_no_pool)[3, 1], color = col[3], size = 1) + scale_fill_viridis_d(begin = 0.2, end = 0.8) + labs(x = expression("Subject specific means"), y = "Subjects") + theme(panel.grid = element_blank(), axis.ticks.y = element_blank(), axis.text.y = element_text(hjust = 0)) ``` ] .panel[.panel-name[Plot] <img src="01-hierarchical-models-slides_files/figure-html/unnamed-chunk-26-1.png" width="100%" /> ] ] --- class: middle .pull-left-narrow[ .huge-blue-number[2] ] .pull-right-wide[ .larger[ Mixed Models ] ] --- .panelset[ .panel[.panel-name[Create dataset] ```r library(tidyverse) intervention <- rep(c('treat', 'control'), each = 5) pre <- c(20, 10, 60, 20, 10, 50, 10, 40, 20, 10) post <- c(70, 50, 90, 60, 50, 20, 10, 30, 50, 10) ``` ```r dwide <- tibble(id = factor(1:10), intervention, pre, post) |> mutate(diff = post - pre, id = as_factor(id), intervention = factor(intervention, levels = c("control", "treat"))) ``` ] .panel[.panel-name[Dataframe] ```r dwide |> rmarkdown::paged_table() ``` <div data-pagedtable="false"> <script data-pagedtable-source type="application/json"> {"columns":[{"label":["id"],"name":[1],"type":["fct"],"align":["left"]},{"label":["intervention"],"name":[2],"type":["fct"],"align":["left"]},{"label":["pre"],"name":[3],"type":["dbl"],"align":["right"]},{"label":["post"],"name":[4],"type":["dbl"],"align":["right"]},{"label":["diff"],"name":[5],"type":["dbl"],"align":["right"]}],"data":[{"1":"1","2":"treat","3":"20","4":"70","5":"50"},{"1":"2","2":"treat","3":"10","4":"50","5":"40"},{"1":"3","2":"treat","3":"60","4":"90","5":"30"},{"1":"4","2":"treat","3":"20","4":"60","5":"40"},{"1":"5","2":"treat","3":"10","4":"50","5":"40"},{"1":"6","2":"control","3":"50","4":"20","5":"-30"},{"1":"7","2":"control","3":"10","4":"10","5":"0"},{"1":"8","2":"control","3":"40","4":"30","5":"-10"},{"1":"9","2":"control","3":"20","4":"50","5":"30"},{"1":"10","2":"control","3":"10","4":"10","5":"0"}],"options":{"columns":{"min":{},"max":[10]},"rows":{"min":[10],"max":[10]},"pages":{}}} </script> </div> ]] --- ```r d <- dwide |>
select(-diff) |> pivot_longer(cols = pre:post, names_to = "time", values_to = "score") |> mutate(time = as_factor(time)) d |> rmarkdown::paged_table() ``` <div data-pagedtable="false"> <script data-pagedtable-source type="application/json"> {"columns":[{"label":["id"],"name":[1],"type":["fct"],"align":["left"]},{"label":["intervention"],"name":[2],"type":["fct"],"align":["left"]},{"label":["time"],"name":[3],"type":["fct"],"align":["left"]},{"label":["score"],"name":[4],"type":["dbl"],"align":["right"]}],"data":[{"1":"1","2":"treat","3":"pre","4":"20"},{"1":"1","2":"treat","3":"post","4":"70"},{"1":"2","2":"treat","3":"pre","4":"10"},{"1":"2","2":"treat","3":"post","4":"50"},{"1":"3","2":"treat","3":"pre","4":"60"},{"1":"3","2":"treat","3":"post","4":"90"},{"1":"4","2":"treat","3":"pre","4":"20"},{"1":"4","2":"treat","3":"post","4":"60"},{"1":"5","2":"treat","3":"pre","4":"10"},{"1":"5","2":"treat","3":"post","4":"50"},{"1":"6","2":"control","3":"pre","4":"50"},{"1":"6","2":"control","3":"post","4":"20"},{"1":"7","2":"control","3":"pre","4":"10"},{"1":"7","2":"control","3":"post","4":"10"},{"1":"8","2":"control","3":"pre","4":"40"},{"1":"8","2":"control","3":"post","4":"30"},{"1":"9","2":"control","3":"pre","4":"20"},{"1":"9","2":"control","3":"post","4":"50"},{"1":"10","2":"control","3":"pre","4":"10"},{"1":"10","2":"control","3":"post","4":"10"}],"options":{"columns":{"min":{},"max":[10]},"rows":{"min":[10],"max":[10]},"pages":{}}} </script> </div> --- We have 10 subjects. .panelset[ .panel[.panel-name[Subjects] ```r d |> summarize(id = n_distinct(id)) ``` ``` ## # A tibble: 1 x 1 ## id ## <int> ## 1 10 ``` ] .panel[.panel-name[Trials per subject] with 2 observations per subject. ```r d |> group_by(id, intervention) |> count() |> rmarkdown::paged_table() ``` <div data-pagedtable="false"> <script data-pagedtable-source type="application/json"> {"columns":[{"label":["id"],"name":[1],"type":["fct"],"align":["left"]},{"label":["intervention"],"name":[2],"type":["fct"],"align":["left"]},{"label":["n"],"name":[3],"type":["int"],"align":["right"]}],"data":[{"1":"1","2":"treat","3":"2"},{"1":"2","2":"treat","3":"2"},{"1":"3","2":"treat","3":"2"},{"1":"4","2":"treat","3":"2"},{"1":"5","2":"treat","3":"2"},{"1":"6","2":"control","3":"2"},{"1":"7","2":"control","3":"2"},{"1":"8","2":"control","3":"2"},{"1":"9","2":"control","3":"2"},{"1":"10","2":"control","3":"2"}],"options":{"columns":{"min":{},"max":[10]},"rows":{"min":[10],"max":[10]},"pages":{}}} </script> </div> ] ] --- .discussion[ How would you analyse these data? ]
--- ## Summarize Data .panelset[ .panel[.panel-name[Standard error function] ```r se <- function(x) sd(x)/sqrt(length(x)) ``` ] .panel[.panel-name[Summarize (long)] ```r d |> group_by(intervention, time) |> summarise(mean = mean(score), sd = sd(score), se = se(score)) ``` ``` ## # A tibble: 4 x 5 ## # Groups: intervention [2] ## intervention time mean sd se ## <fct> <fct> <dbl> <dbl> <dbl> ## 1 control pre 26 18.2 8.12 ## 2 control post 24 16.7 7.48 ## 3 treat pre 24 20.7 9.27 ## 4 treat post 64 16.7 7.48 ``` ] .panel[.panel-name[Summarize (wide)] ```r dwide |> group_by(intervention) |> summarise(mean = mean(diff), sd = sd(diff), se = se(diff)) ``` ``` ## # A tibble: 2 x 4 ## intervention mean sd se ## <fct> <dbl> <dbl> <dbl> ## 1 control -2 21.7 9.70 ## 2 treat 40 7.07 3.16 ``` ] ] --- .panelset[ .panel[.panel-name[Plot Code] ```r d |> ggplot(aes(time, score, color = intervention)) + geom_line(aes(group = id), linetype = 1, size = 1) + geom_point(size = 4) + scale_color_viridis_d(end = 0.8) + theme_bw() ``` ] .panel[.panel-name[Plot] <img src="01-hierarchical-models-slides_files/figure-html/unnamed-chunk-38-1.png" width="100%" /> ] ] --- ## Welch Test ```r t.test(diff ~ intervention, data = dwide, var.equal = FALSE) ``` ``` ## ## Welch Two Sample t-test ## ## data: diff by intervention ## t = -4.1184, df = 4.8415, p-value = 0.009835 ## alternative hypothesis: true difference in means between group control and group treat is not equal to 0 ## 95 percent confidence interval: ## -68.47507 -15.52493 ## sample estimates: ## mean in group control mean in group treat ## -2 40 ``` --- ## Mixed Model .panelset[ .panel[.panel-name[Model Code] ```r library(lme4) lme_model <- lmer(score ~ intervention * time + (1|id), data = d) ``` ] .panel[.panel-name[Model summary] ```r lme_model |> sjPlot::tab_model() ``` <table style="border-collapse:collapse; border:none;"> <tr> <th style="border-top: double; text-align:center; font-style:normal; font-weight:bold; padding:0.2cm; text-align:left; "> </th> <th colspan="3" style="border-top: double; text-align:center; font-style:normal; font-weight:bold; padding:0.2cm; ">score</th> </tr> <tr> <td style=" text-align:center; border-bottom:1px solid; font-style:italic; font-weight:normal; text-align:left; ">Predictors</td> <td style=" text-align:center; border-bottom:1px solid; font-style:italic; font-weight:normal; ">Estimates</td> <td style=" text-align:center; border-bottom:1px solid; font-style:italic; font-weight:normal; ">CI</td> <td style=" text-align:center; border-bottom:1px solid; font-style:italic; font-weight:normal; ">p</td> </tr> <tr> <td style=" padding:0.2cm; text-align:left; vertical-align:top; text-align:left; ">(Intercept)</td> <td style=" padding:0.2cm; text-align:left; vertical-align:top; text-align:center; ">26.00</td> <td style=" padding:0.2cm; text-align:left; vertical-align:top; text-align:center; ">10.08 – 41.92</td> <td style=" padding:0.2cm; text-align:left; vertical-align:top; text-align:center; "><strong>0.001</strong></td> </tr> <tr> <td style=" padding:0.2cm; text-align:left; vertical-align:top; text-align:left; ">intervention [treat]</td> <td style=" padding:0.2cm; text-align:left; vertical-align:top; text-align:center; ">-2.00</td> <td style=" padding:0.2cm; text-align:left; vertical-align:top; text-align:center; ">-24.52 – 20.52</td> <td style=" padding:0.2cm; text-align:left; vertical-align:top; text-align:center; ">0.862</td> </tr> <tr> <td style=" padding:0.2cm; text-align:left; vertical-align:top; text-align:left; ">time 
[post]</td> <td style=" padding:0.2cm; text-align:left; vertical-align:top; text-align:center; ">-2.00</td> <td style=" padding:0.2cm; text-align:left; vertical-align:top; text-align:center; ">-16.13 – 12.13</td> <td style=" padding:0.2cm; text-align:left; vertical-align:top; text-align:center; ">0.782</td> </tr> <tr> <td style=" padding:0.2cm; text-align:left; vertical-align:top; text-align:left; ">intervention [treat] *<br>time [post]</td> <td style=" padding:0.2cm; text-align:left; vertical-align:top; text-align:center; ">42.00</td> <td style=" padding:0.2cm; text-align:left; vertical-align:top; text-align:center; ">22.01 – 61.99</td> <td style=" padding:0.2cm; text-align:left; vertical-align:top; text-align:center; "><strong><0.001</strong></td> </tr> <tr> <td colspan="4" style="font-weight:bold; text-align:left; padding-top:.8em;">Random Effects</td> </tr> <tr> <td style=" padding:0.2cm; text-align:left; vertical-align:top; text-align:left; padding-top:0.1cm; padding-bottom:0.1cm;">σ<sup>2</sup></td> <td style=" padding:0.2cm; text-align:left; vertical-align:top; padding-top:0.1cm; padding-bottom:0.1cm; text-align:left;" colspan="3">130.00</td> <tr> <td style=" padding:0.2cm; text-align:left; vertical-align:top; text-align:left; padding-top:0.1cm; padding-bottom:0.1cm;">τ<sub>00</sub> <sub>id</sub></td> <td style=" padding:0.2cm; text-align:left; vertical-align:top; padding-top:0.1cm; padding-bottom:0.1cm; text-align:left;" colspan="3">200.00</td> <tr> <td style=" padding:0.2cm; text-align:left; vertical-align:top; text-align:left; padding-top:0.1cm; padding-bottom:0.1cm;">ICC</td> <td style=" padding:0.2cm; text-align:left; vertical-align:top; padding-top:0.1cm; padding-bottom:0.1cm; text-align:left;" colspan="3">0.61</td> <tr> <td style=" padding:0.2cm; text-align:left; vertical-align:top; text-align:left; padding-top:0.1cm; padding-bottom:0.1cm;">N <sub>id</sub></td> <td style=" padding:0.2cm; text-align:left; vertical-align:top; padding-top:0.1cm; padding-bottom:0.1cm; text-align:left;" colspan="3">10</td> <tr> <td style=" padding:0.2cm; text-align:left; vertical-align:top; text-align:left; padding-top:0.1cm; padding-bottom:0.1cm; border-top:1px solid;">Observations</td> <td style=" padding:0.2cm; text-align:left; vertical-align:top; padding-top:0.1cm; padding-bottom:0.1cm; text-align:left; border-top:1px solid;" colspan="3">20</td> </tr> <tr> <td style=" padding:0.2cm; text-align:left; vertical-align:top; text-align:left; padding-top:0.1cm; padding-bottom:0.1cm;">Marginal R<sup>2</sup> / Conditional R<sup>2</sup></td> <td style=" padding:0.2cm; text-align:left; vertical-align:top; padding-top:0.1cm; padding-bottom:0.1cm; text-align:left;" colspan="3">0.481 / 0.796</td> </tr> </table> ] .panel[.panel-name[Model equations] $$ `\begin{aligned} \operatorname{score}_{i} &\sim N \left(\alpha_{j[i]} + \beta_{1}(\operatorname{time}_{\operatorname{post}}), \sigma^2 \right) \\ \alpha_{j} &\sim N \left(\gamma_{0}^{\alpha} + \gamma_{1}^{\alpha}(\operatorname{intervention}_{\operatorname{treat}}) + \gamma_{2}^{\alpha}(\operatorname{intervention}_{\operatorname{treat}} \times \operatorname{time}_{\operatorname{post}}), \sigma^2_{\alpha_{j}} \right) \text{, for id j = 1,} \dots \text{,J} \end{aligned}` $$ ]] --- ```r library(afex) mixed(score ~ intervention * time + (1|id), type = 3, method = "KR", data = d) ``` ``` ## Fitting one lmer() model. [DONE] ## Calculating p-values. 
[DONE] ``` ``` ## Mixed Model Anova Table (Type 3 tests, KR-method) ## ## Model: score ~ intervention * time + (1 | id) ## Data: d ## Effect df F p.value ## 1 intervention 1, 8 3.41 .102 ## 2 time 1, 8 13.88 ** .006 ## 3 intervention:time 1, 8 16.96 ** .003 ## --- ## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '+' 0.1 ' ' 1 ``` --- ## What is going on here? ```r mm <- model.matrix(~ intervention * time, data = d) head(mm) ``` ``` ## (Intercept) interventiontreat timepost interventiontreat:timepost ## 1 1 1 0 0 ## 2 1 1 1 1 ## 3 1 1 0 0 ## 4 1 1 1 1 ## 5 1 1 0 0 ## 6 1 1 1 1 ``` --- ## Explanation - We are predicting the mean of the outcome variable `score` with a linear model. - The mean is decomposed into + population-level effects: `intervention * time` + group-level effects (a deviation from the average for every person): `(1 | id)` --- ## Using `brms` .panelset[ .panel[.panel-name[get_prior] ```r library(brms) priors1 <- get_prior(score ~ intervention * time + (1|id), data = d) ``` ] .panel[.panel-name[Priors] ``` ## # A tibble: 9 x 4 ## prior class coef group ## <chr> <chr> <chr> <chr> ## 1 "" b "" "" ## 2 "" b "interventiontreat" "" ## 3 "" b "interventiontreat:timepost" "" ## 4 "" b "timepost" "" ## 5 "student_t(3, 25, 22.2)" Intercept "" "" ## 6 "student_t(3, 0, 22.2)" sd "" "" ## 7 "" sd "" "id" ## 8 "" sd "Intercept" "id" ## 9 "student_t(3, 0, 22.2)" sigma "" "" ``` ] ] --- ```r m2 <- brm(score ~ intervention*time + (1 | id), prior = prior(normal(0, 50), class = b), data = d, control = list(adapt_delta = 0.9), file = "models/04-treat-time") ``` --- ```r summary(m2) ``` ``` ## Family: gaussian ## Links: mu = identity; sigma = identity ## Formula: score ~ intervention * time + (1 | id) ## Data: d (Number of observations: 20) ## Samples: 4 chains, each with iter = 2000; warmup = 1000; thin = 1; ## total post-warmup samples = 4000 ## ## Group-Level Effects: ## ~id (Number of levels: 10) ## Estimate Est.Error l-95% CI u-95% CI Rhat Bulk_ESS Tail_ESS ## sd(Intercept) 14.08 6.31 2.05 27.59 1.00 806 955 ## ## Population-Level Effects: ## Estimate Est.Error l-95% CI u-95% CI Rhat Bulk_ESS ## Intercept 24.54 8.99 6.19 41.83 1.00 1699 ## interventiontreat -0.65 12.66 -25.31 24.65 1.00 1711 ## timepost -0.75 8.50 -17.23 17.00 1.00 2771 ## interventiontreat:timepost 39.38 11.61 15.30 61.64 1.00 2381 ## Tail_ESS ## Intercept 2366 ## interventiontreat 2017 ## timepost 2203 ## interventiontreat:timepost 2059 ## ## Family Specific Parameters: ## Estimate Est.Error l-95% CI u-95% CI Rhat Bulk_ESS Tail_ESS ## sigma 13.55 3.75 8.16 21.99 1.00 921 1514 ## ## Samples were drawn using sampling(NUTS). For each parameter, Bulk_ESS ## and Tail_ESS are effective sample size measures, and Rhat is the potential ## scale reduction factor on split chains (at convergence, Rhat = 1). ``` --- ```r m2 |> mcmc_plot("b_") ``` <img src="01-hierarchical-models-slides_files/figure-html/unnamed-chunk-48-1.png" width="100%" /> --- ```r ce <- conditional_effects(m2, effects = "intervention:time") ``` ```r ce ``` <img src="01-hierarchical-models-slides_files/figure-html/unnamed-chunk-50-1.png" width="100%" /> --- ```r plot(ce, plot = FALSE)[[1]] + scale_color_brewer(type = "qual") + scale_fill_brewer(type = "qual") ``` <img src="01-hierarchical-models-slides_files/figure-html/unnamed-chunk-51-1.png" width="100%" />
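---

## Querying the posterior directly

The estimate of interest is the `intervention:time` interaction, i.e. the treatment effect on the change from pre to post. In the change-score summary it is 40 - (-2) = 42, which is exactly the `lmer()` estimate, while the posterior mean from `m2` (about 39) is a little smaller, in part because of the `normal(0, 50)` prior on the coefficients. A minimal sketch of how to summarise this effect from the posterior draws (added for illustration, not part of the original slides; it uses `as_draws_df()` rather than the `tidybayes` helpers shown earlier):

```r
# population-level summary of the interaction (the treatment effect on change)
fixef(m2)["interventiontreat:timepost", ]

# posterior probability that the treatment effect is positive
draws <- as_draws_df(m2)
mean(draws$`b_interventiontreat:timepost` > 0)
```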