Goal: we want to estimate the mean and standard deviation of a normal distribution.

We have 3 data points, $y_1 = 85$, $y_2 = 100$, and $y_3 = 115$. We can assume that they are normally distributed:

$$p(y \mid \mu, \sigma) = \frac{1}{Z} \exp\left(-\frac{1}{2} \frac{(y - \mu)^2}{\sigma^2}\right)$$

where $Z = \sigma \sqrt{2\pi}$ is a normalising constant.

We can compute the probability of one data point, and if the data points are i.i.d., then $P(\text{data} \mid \mu, \sigma)$ is the product of the individual probabilities.
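For example, for one candidate pair of parameter values (the choices mu = 100 and sigma = 15 here are just an illustrative guess, not estimates from the slides), the likelihood is a product of normal densities:

y <- c(85, 100, 115)

# candidate parameter values (illustrative guesses)
mu <- 100
sigma <- 15

# likelihood: product of the individual normal densities
prod(dnorm(y, mean = mu, sd = sigma))

# equivalently on the log scale, which avoids numerical underflow
sum(dnorm(y, mean = mu, sd = sigma, log = TRUE))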
But how do we find $\mu$ and $\sigma$?
library(tidyverse)
sequence_length <- 100

d1 <- crossing(y = seq(from = 50, to = 150, length.out = sequence_length),
               mu = c(87.8, 100, 112),
               sigma = c(7.35, 12.2, 18.4)) %>%
  mutate(density = dnorm(y, mean = mu, sd = sigma),
         mu = factor(mu, labels = str_c("mu==", c(87.8, 100, 112))),
         sigma = factor(sigma, labels = str_c("sigma==", c(7.35, 12.2, 18.4))))
theme_set(
  theme_bw() +
    theme(text = element_text(color = "white"),
          axis.text = element_text(color = "black"),
          axis.ticks = element_line(color = "white"),
          legend.background = element_blank(),
          legend.box.background = element_rect(fill = "white", color = "transparent"),
          legend.key = element_rect(fill = "white", color = "transparent"),
          legend.text = element_text(color = "black"),
          legend.title = element_text(color = "black"),
          panel.background = element_rect(fill = "white", color = "white"),
          panel.grid = element_blank()))
d1 %>%
  ggplot(aes(x = y)) +
  geom_ribbon(aes(ymin = 0, ymax = density), fill = "steelblue") +
  geom_vline(xintercept = c(85, 100, 115), linetype = 3, color = "white") +
  geom_point(data = tibble(y = c(85, 100, 115)),
             aes(y = 0.002), size = 2, color = "red") +
  scale_y_continuous(expression(italic(p)(italic(y)*"|"*mu*", "*sigma)),
                     expand = expansion(mult = c(0, 0.05)), breaks = NULL) +
  ggtitle("Which distribution?") +
  coord_cartesian(xlim = c(60, 140)) +
  facet_grid(sigma ~ mu, labeller = label_parsed) +
  theme_bw() +
  theme(panel.grid = element_blank())
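The facets invite eyeballing which distribution fits best. As a complement (a sketch added here, not part of the original code), we can rank the nine candidate (mu, sigma) pairs by the joint log-likelihood of the three data points; the middle panel (mu = 100, sigma = 12.2) comes out on top:

# joint log-likelihood of y = (85, 100, 115) for each candidate pair
crossing(mu = c(87.8, 100, 112),
         sigma = c(7.35, 12.2, 18.4)) %>%
  mutate(loglik = map2_dbl(mu, sigma,
                           ~ sum(dnorm(c(85, 100, 115),
                                       mean = .x, sd = .y, log = TRUE)))) %>%
  arrange(desc(loglik))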
We will use some simulated data:
We have two groups: one group was given a smart drug, the other is a control (placebo) group. IQ values have a mean of 100 and an SD of 15.
We would like to know whether the smart-drug group is smarter than the control group.
Kruschke, John. Doing Bayesian Data Analysis (2nd Ed.). Boston: Academic Press, 2015.
smart <- tibble(
  IQ = c(101, 100, 102, 104, 102, 97, 105, 105, 98, 101, 100, 123, 105, 103,
         100, 95, 102, 106, 109, 102, 82, 102, 100, 102, 102, 101, 102, 102,
         103, 103, 97, 97, 103, 101, 97, 104, 96, 103, 124, 101, 101, 100,
         101, 101, 104, 100, 101),
  Group = "SmartDrug")

placebo <- tibble(
  IQ = c(99, 101, 100, 101, 102, 100, 97, 101, 104, 101, 102, 102, 100, 105,
         88, 101, 100, 104, 100, 100, 100, 101, 102, 103, 97, 101, 101, 100, 101,
         99, 101, 100, 100, 101, 100, 99, 101, 100, 102, 99, 100, 99),
  Group = "Placebo")
TwoGroupIQ <- bind_rows(smart, placebo) %>% mutate(Group = fct_relevel(as.factor(Group), "Placebo"))
library(kableExtra)

TwoGroupIQ %>%
  group_by(Group) %>%
  summarise(mean = mean(IQ), sd = sd(IQ)) %>%
  mutate(across(where(is.numeric), round, 2)) %>%
  kbl() %>%
  kable_styling(bootstrap_options = c("striped", "hover"))
| Group     | mean   | sd   |
|-----------|--------|------|
| Placebo   | 100.36 | 2.52 |
| SmartDrug | 101.91 | 6.02 |
tibble(x = seq(from = 0, to = 10, by = .025)) %>%
  mutate(d = dt(x, df = 3)) %>%
  ggplot(aes(x = x, ymin = 0, ymax = d)) +
  geom_ribbon(fill = "skyblue3") +
  scale_x_continuous(expand = expansion(mult = c(0, 0.05))) +
  scale_y_continuous(expand = expansion(mult = c(0, 0.05)), breaks = NULL) +
  coord_cartesian(xlim = c(0, 8), ylim = c(0, 0.35)) +
  xlab(expression(sigma)) +
  labs(subtitle = "Half-student-t prior on standard deviation") +
  theme_bw(base_size = 14)
tibble(x = seq(from = 0, to = 200, by = .025)) %>%
  mutate(d = dnorm(x, mean = 102, sd = 3)) %>%
  ggplot(aes(x = x, ymin = 0, ymax = d)) +
  geom_ribbon(fill = "skyblue3") +
  scale_x_continuous(expand = expansion(mult = c(0, 0.05))) +
  scale_y_continuous(expand = expansion(mult = c(0, 0.05)), breaks = NULL) +
  coord_cartesian(xlim = c(50, 150), ylim = c(0, 0.15)) +
  xlab(expression(mu)) +
  labs(subtitle = "Normal prior on mean") +
  theme_bw(base_size = 14)
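The prior-only model m1_prior summarised next is not defined in the code shown. A minimal sketch of how it could be built with brms, assuming d is the SmartDrug subset (47 observations, matching the output below), a Normal(102, 3) prior on the mean as plotted above, and a half-Student-t prior on sigma whose scale of 5 is an assumed value:

library(brms)

# assumption: d is the smart-drug subset (the summary below reports 47 observations)
d <- TwoGroupIQ %>% filter(Group == "SmartDrug")

priors <- c(prior(normal(102, 3), class = Intercept),
            prior(student_t(3, 0, 5), class = sigma))  # scale 5 is an assumption

# sample_prior = "only" ignores the likelihood and samples from the prior alone
m1_prior <- brm(IQ ~ 1,
                prior = priors,
                data = d,
                sample_prior = "only")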
summary(m1_prior)
##  Family: gaussian 
##   Links: mu = identity; sigma = identity 
## Formula: IQ ~ 1 
##    Data: d (Number of observations: 47) 
## Samples: 4 chains, each with iter = 2000; warmup = 1000; thin = 1;
##          total post-warmup samples = 4000
## 
## Population-Level Effects: 
##           Estimate Est.Error l-95% CI u-95% CI Rhat Bulk_ESS Tail_ESS
## Intercept   101.95      6.20    92.34   111.71 1.00     2048     1510
## 
## Family Specific Parameters: 
##       Estimate Est.Error l-95% CI u-95% CI Rhat Bulk_ESS Tail_ESS
## sigma     3.75     17.71     0.10    12.94 1.00     1757     1199
## 
## Samples were drawn using sampling(NUTS). For each parameter, Bulk_ESS
## and Tail_ESS are effective sample size measures, and Rhat is the potential
## scale reduction factor on split chains (at convergence, Rhat = 1).
library(tidybayes)

prior_pred_1 <- d %>%
  modelr::data_grid(Group) %>%
  add_predicted_draws(m1_prior) %>%
  ggplot(aes(y = Group, x = .prediction)) +
  stat_interval(.width = c(.50, .80, .95, .99)) +
  geom_point(aes(x = IQ), alpha = 0.4, data = d) +
  scale_color_brewer() +
  theme_tidybayes()
m1 <- brm(IQ ~ 1, prior = priors, data = d, file = "models/twogroupiq-1")
plot(m1)
summary(m1)
##  Family: gaussian 
##   Links: mu = identity; sigma = identity 
## Formula: IQ ~ 1 
##    Data: d (Number of observations: 47) 
## Samples: 4 chains, each with iter = 2000; warmup = 1000; thin = 1;
##          total post-warmup samples = 4000
## 
## Population-Level Effects: 
##           Estimate Est.Error l-95% CI u-95% CI Rhat Bulk_ESS Tail_ESS
## Intercept   101.92      0.83   100.31   103.53 1.00     3134     2075
## 
## Family Specific Parameters: 
##       Estimate Est.Error l-95% CI u-95% CI Rhat Bulk_ESS Tail_ESS
## sigma     6.04      0.64     4.96     7.47 1.00     3060     2458
## 
## Samples were drawn using sampling(NUTS). For each parameter, Bulk_ESS
## and Tail_ESS are effective sample size measures, and Rhat is the potential
## scale reduction factor on split chains (at convergence, Rhat = 1).
samples <- posterior_samples(m1) %>% transmute(mu = b_Intercept, sigma = sigma)
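Newer brms releases deprecate posterior_samples() in favour of the posterior-package interface; a drop-in equivalent (a sketch, not part of the original slides) would be:

# as_draws_df() returns one row per posterior draw, with b_Intercept and sigma columns
samples <- as_draws_df(m1) %>%
  transmute(mu = b_Intercept, sigma = sigma)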
library(tidybayes)

samples %>%
  select(mu) %>%
  median_qi(.width = c(.50, .80, .95, .99))
##         mu    .lower   .upper .width .point .interval
## 1 101.9132 101.36505 102.4841   0.50 median        qi
## 2 101.9132 100.86039 102.9730   0.80 median        qi
## 3 101.9132 100.30502 103.5255   0.95 median        qi
## 4 101.9132  99.71696 104.1686   0.99 median        qi
samples %>%
  select(mu) %>%
  median_qi(.width = c(.50, .80, .95, .99)) %>%
  ggplot(aes(x = mu, xmin = .lower, xmax = .upper)) +
  geom_pointinterval() +
  ylab("")
samples %>% select(mu) %>% ggplot(aes(x = mu)) + stat_halfeye()
samples %>% select(sigma) %>% ggplot(aes(x = sigma)) + stat_halfeye(point_interval = mode_hdi)
post_pred_1 <- d %>%
  modelr::data_grid(Group) %>%
  add_predicted_draws(m1) %>%
  ggplot(aes(y = Group, x = .prediction)) +
  stat_interval(.width = c(.50, .80, .95, .99)) +
  geom_point(aes(x = IQ), alpha = 0.4, data = d) +
  scale_color_brewer()
post_pred_1
levels(TwoGroupIQ$Group)
## [1] "Placebo" "SmartDrug"
Using R's formula notation:
fit_ols <- lm(IQ ~ Group, data = TwoGroupIQ)
summary(fit_ols)
## 
## Call:
## lm(formula = IQ ~ Group, data = TwoGroupIQ)
## 
## Residuals:
##      Min       1Q   Median       3Q      Max 
## -19.9149  -0.9149   0.0851   1.0851  22.0851 
## 
## Coefficients:
##                Estimate Std. Error t value Pr(>|t|)    
## (Intercept)    100.3571     0.7263 138.184   <2e-16 ***
## GroupSmartDrug   1.5578     0.9994   1.559    0.123    
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## Residual standard error: 4.707 on 87 degrees of freedom
## Multiple R-squared:  0.02717, Adjusted R-squared:  0.01599 
## F-statistic:  2.43 on 1 and 87 DF,  p-value: 0.1227
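With this dummy coding, the intercept is the Placebo mean and the GroupSmartDrug coefficient is the SmartDrug-minus-Placebo difference. A quick sanity check (added here, not in the original code) recovers the group means from the summary table above:

coef(fit_ols)[["(Intercept)"]]                                      # Placebo mean, ~100.36
coef(fit_ols)[["(Intercept)"]] + coef(fit_ols)[["GroupSmartDrug"]]  # SmartDrug mean, ~101.91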
contrasts(TwoGroupIQ$Group)
##           SmartDrug
## Placebo           0
## SmartDrug         1
mm1 <- model.matrix(~ Group, data = TwoGroupIQ)
head(mm1)
##   (Intercept) GroupSmartDrug
## 1           1              1
## 2           1              1
## 3           1              1
## 4           1              1
## 5           1              1
## 6           1              1
as_tibble(mm1) %>%
  group_by(GroupSmartDrug) %>%
  slice_sample(n = 3)

## # A tibble: 6 x 2
## # Groups:   GroupSmartDrug [2]
##   `(Intercept)` GroupSmartDrug
##           <dbl>          <dbl>
## 1             1              0
## 2             1              0
## 3             1              0
## 4             1              1
## 5             1              1
## 6             1              1
mm2 <- model.matrix(~ 0 + Group, data = TwoGroupIQ)

as_tibble(mm2) %>%
  group_by(GroupSmartDrug) %>%
  slice_sample(n = 3)

## # A tibble: 6 x 2
## # Groups:   GroupSmartDrug [2]
##   GroupPlacebo GroupSmartDrug
##          <dbl>          <dbl>
## 1            1              0
## 2            1              0
## 3            1              0
## 4            0              1
## 5            0              1
## 6            0              1
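Dropping the intercept (~ 0 + Group) switches to cell-means coding: each column indicates one group, so each fitted coefficient is simply that group's mean. A quick check (added here, not in the original code):

# with cell-means coding, the coefficients are the raw group means,
# matching the summary table above: ~100.36 and ~101.91
coef(lm(IQ ~ 0 + Group, data = TwoGroupIQ))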
priors2 <- get_prior(IQ ~ 1 + Group, data = TwoGroupIQ)
priors2
##                   prior     class           coef group resp dpar nlpar bound       source
##                  (flat)         b                                                 default
##                  (flat)         b GroupSmartDrug                             (vectorized)
##  student_t(3, 101, 2.5) Intercept                                                 default
##    student_t(3, 0, 2.5)     sigma                                                 default
priors3 <- get_prior(IQ ~ 0 + Group, data = TwoGroupIQ)
priors3
##                 prior class           coef group resp dpar nlpar bound       source
##                (flat)     b                                                 default
##                (flat)     b   GroupPlacebo                             (vectorized)
##                (flat)     b GroupSmartDrug                             (vectorized)
##  student_t(3, 0, 2.5) sigma                                                 default
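As with m1_prior, the prior-predictive models m2_prior and m3_prior used below are not defined in the code shown. A minimal sketch, assuming they are prior-only fits of the two parameterizations with proper priors replacing the flat defaults (the Normal scales here are illustrative assumptions):

m2_prior <- brm(IQ ~ 1 + Group,
                prior = c(prior(normal(100, 10), class = Intercept),  # assumed scale
                          prior(normal(0, 10), class = b),            # assumed scale
                          prior(student_t(3, 0, 2.5), class = sigma)),
                data = TwoGroupIQ,
                sample_prior = "only")

m3_prior <- brm(IQ ~ 0 + Group,
                prior = c(prior(normal(100, 10), class = b),          # assumed scale
                          prior(student_t(3, 0, 2.5), class = sigma)),
                data = TwoGroupIQ,
                sample_prior = "only")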
prior_pred_2 <- TwoGroupIQ %>%
  modelr::data_grid(Group) %>%
  add_predicted_draws(m2_prior) %>%
  ggplot(aes(y = Group, x = .prediction)) +
  stat_interval(.width = c(.50, .80, .95, .99)) +
  geom_point(aes(x = IQ), alpha = 0.4, data = TwoGroupIQ) +
  scale_color_brewer() +
  theme_tidybayes()

prior_pred_3 <- TwoGroupIQ %>%
  modelr::data_grid(Group) %>%
  add_predicted_draws(m3_prior) %>%
  ggplot(aes(y = Group, x = .prediction)) +
  stat_interval(.width = c(.50, .80, .95, .99)) +
  geom_point(aes(x = IQ), alpha = 0.4, data = TwoGroupIQ) +
  scale_color_brewer() +
  theme_tidybayes()
summary(m2)
##  Family: gaussian 
##   Links: mu = identity; sigma = identity 
## Formula: IQ ~ 1 + Group 
##    Data: TwoGroupIQ (Number of observations: 89) 
## Samples: 4 chains, each with iter = 2000; warmup = 1000; thin = 1;
##          total post-warmup samples = 4000
## 
## Population-Level Effects: 
##                Estimate Est.Error l-95% CI u-95% CI Rhat Bulk_ESS Tail_ESS
## Intercept        100.51      0.69    99.18   101.90 1.00     3735     2921
## GroupSmartDrug     1.23      0.88    -0.49     2.95 1.00     3633     3160
## 
## Family Specific Parameters: 
##       Estimate Est.Error l-95% CI u-95% CI Rhat Bulk_ESS Tail_ESS
## sigma     4.72      0.36     4.09     5.52 1.00     3002     2088
## 
## Samples were drawn using sampling(NUTS). For each parameter, Bulk_ESS
## and Tail_ESS are effective sample size measures, and Rhat is the potential
## scale reduction factor on split chains (at convergence, Rhat = 1).
summary(m3)
##  Family: gaussian 
##   Links: mu = identity; sigma = identity 
## Formula: IQ ~ 0 + Group 
##    Data: TwoGroupIQ (Number of observations: 89) 
## Samples: 4 chains, each with iter = 2000; warmup = 1000; thin = 1;
##          total post-warmup samples = 4000
## 
## Population-Level Effects: 
##                Estimate Est.Error l-95% CI u-95% CI Rhat Bulk_ESS Tail_ESS
## GroupPlacebo     100.35      0.73    98.95   101.75 1.00     3914     3163
## GroupSmartDrug   101.92      0.69   100.62   103.28 1.00     4057     2954
## 
## Family Specific Parameters: 
##       Estimate Est.Error l-95% CI u-95% CI Rhat Bulk_ESS Tail_ESS
## sigma     4.71      0.36     4.07     5.45 1.00     3342     2697
## 
## Samples were drawn using sampling(NUTS). For each parameter, Bulk_ESS
## and Tail_ESS are effective sample size measures, and Rhat is the potential
## scale reduction factor on split chains (at convergence, Rhat = 1).
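Because m3 estimates a separate mean per group, the question "is the smart-drug group smarter?" becomes a comparison of the two mean parameters. One way to summarise that difference from the posterior draws (a sketch, not shown in the original code):

# posterior distribution of the difference in group means
posterior_samples(m3) %>%
  mutate(diff = b_GroupSmartDrug - b_GroupPlacebo) %>%
  median_qi(diff, .width = 0.95)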
# fit_eqvar is the equal-variance model fitted above; it is not defined in the
# code shown, presumably the IQ ~ 0 + Group model m3 or an equivalent refit
grid <- TwoGroupIQ %>% modelr::data_grid(Group)

fits_IQ <- grid %>% add_fitted_draws(fit_eqvar)
preds_IQ <- grid %>% add_predicted_draws(fit_eqvar)

pp_eqvar <- TwoGroupIQ %>%
  ggplot(aes(x = IQ, y = Group)) +
  stat_halfeye(aes(x = .value), scale = 0.7, position = position_nudge(y = 0.1),
               data = fits_IQ, .width = c(.66, .95, 0.99)) +
  stat_interval(aes(x = .prediction), data = preds_IQ, .width = c(.66, .95, 0.99)) +
  scale_x_continuous(limits = c(75, 125)) +
  geom_point(data = TwoGroupIQ) +
  scale_color_brewer() +
  labs(title = "Equal variance model predictions") +
  theme_tidybayes()