class: center, middle, title-slide, inverse, no-scribble layout: false <audio controls> <source src="media/primus_too_many_puppies.mp3" type="audio/mpeg"> <source src="media/primus_too_many_puppies.ogg" type="audio/ogg"/> </audio> # Comparing means adjusted for other predictors (analysis of covariance) ## Professor Andy Field <div> <img style="vertical-align:middle; width:30px; height:30px" src="media/twitter_60.png"> <span style="line-height:40px;">@profandyfield</span> </div> <div> <img style="vertical-align:middle; width:60px" src="media/youtube.png"> <span style="line-height:40px;">www.youtube.com/user/ProfAndyField/</span> </div> <div> <img style="vertical-align:middle; width:30px; height:30px" src="media/ds_com_fav.png"> <span style="line-height:40px;">www.discoveringstatistics.com</span> </div> <div> <img style="vertical-align:middle; width:30px; height:30px" src="media/milton_grey_fav.png"> <span style="line-height:40px;">www.milton-the-cat.rocks</span> </div> <div> <img style="vertical-align:middle; width:30px; height:30px" src="media/discovr_fav.png"> <span style="line-height:40px;">www.discovr.rocks</span> </div> ??? Music: Primus: Too many puppies h or ?: Toggle the help window j: Jump to next slide k: Jump to previous slide b: Toggle blackout mode m: Toggle mirrored mode. p: Toggle PresenterMode f: Toggle Fullscreen t: Reset presentation timer <number> + <Return>: Jump to slide <number> c: Create a clone presentation on a new window --- class: center class: center  ??? We've seen this map of the process of fitting models before --- class: center  ??? Today we focus back on the model itself to look at the form of the model we're fitting. The faded stuff still applies though - we'll look at bias, robust models, and of course samples and estimation and so on. But they are the same as for other models, what is different is the form of the model we're fitting. 
---

# Learning outcomes

* Explain how to compare means adjusting for other predictors using a linear model
    + Linear model with a categorical and continuous predictor
    + a.k.a. analysis of covariance (ANCOVA)

--

* Type I vs. Type III sums of squares

--

* Interpreting the model
    + Main effects
    + Covariates

---

# When and why

## Generally

* To test for differences between group means when we know that an extraneous variable affects the outcome variable
* Used to adjust the means for extraneous and confounding variables

--

## In experimental research

* Reduce error variance (sometimes)
    + By explaining some of the unexplained variance (SS<sub>R</sub>) the error variance in the model can be reduced
* Greater experimental control
    + By adjusting for known confounds, we can gain greater insight into the effect of the predictor variable(s)

---
background-image: url("media/milton_bed_crop_2018.jpg")
background-size: cover
class: middle

--

.whitebox[

# Extending the puppy example

* A puppy therapy RCT
    + A no puppies control group
    + 15 minutes of puppy therapy
    + 30 minutes of puppy therapy
* Outcome variable
    + Happiness (0 = unhappy to 10 = happy).
* Covariate
    + Love of puppies (0 = no love, 7 = all the love)

]

---
class: center

# The statistical model

---
class: center

# The statistical model

<br>

.ong[.center[.eq_lrge[
$$ `\begin{aligned} \text{Happiness}_i &= \hat{b}_0 + \hat{b}_1\text{Long}_i + \hat{b}_2\text{Short}_i + e_i \end{aligned}` $$
]]]

---
class: center

# The statistical model

<br>

.ong[.center[.eq_lrge[
$$ `\begin{aligned} \text{Happiness}_i &= \hat{b}_0 + \hat{b}_1\text{Long}_i + \hat{b}_2\text{Short}_i + \hat{b}_3\text{Puppy love}_i + e_i \end{aligned}` $$
]]]

---
class: center

# Partitioning variance

---
class: center

# Partitioning variance

---
class: center

# Partitioning variance

---
background-image: url("media/milton_20180308_202134.JPG")
background-size: cover
class: inverse

# The data
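---

# Dummy coding in R: a sketch

The Long and Short dummy variables in the statistical model do not need to be created by hand: when the grouping variable is a factor, R builds them. The sketch below shows this with `model.matrix()`. It assumes a toy data frame, `toy_tib`, whose values are invented for illustration and are not the study data; only the variable names (`dose`, `puppy_love`, `happiness`) mirror the ones used in this lecture.

```r
# Toy data: values are invented for illustration, NOT the study data
toy_tib <- data.frame(
  dose = factor(
    c("No puppies", "No puppies", "15 mins", "15 mins", "30 mins", "30 mins"),
    levels = c("No puppies", "15 mins", "30 mins")  # first level = reference
  ),
  puppy_love = c(2, 4, 3, 5, 1, 3),
  happiness  = c(3, 4, 5, 6, 5, 7)
)

# The design matrix that lm() would build for a model with
# a continuous covariate and a three-level categorical predictor
design <- model.matrix(~ puppy_love + dose, data = toy_tib)
design
```

With R's default (treatment) contrasts, each dummy column is 1 for its own group and 0 otherwise, so the first factor level (no puppies) acts as the reference category.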
--- # Data summary ## By group <table> <thead> <tr> <th style="text-align:left;"> Therapy group </th> <th style="text-align:center;"> Mean (happiness) </th> <th style="text-align:center;"> SD (happiness) </th> <th style="text-align:center;"> Mean (puppy love) </th> <th style="text-align:center;"> SD (puppy love) </th> </tr> </thead> <tbody> <tr> <td style="text-align:left;"> No puppies </td> <td style="text-align:center;"> 3.22 </td> <td style="text-align:center;"> 1.79 </td> <td style="text-align:center;"> 3.44 </td> <td style="text-align:center;"> 2.07 </td> </tr> <tr> <td style="text-align:left;"> 15 mins </td> <td style="text-align:center;"> 4.88 </td> <td style="text-align:center;"> 1.46 </td> <td style="text-align:center;"> 3.12 </td> <td style="text-align:center;"> 1.73 </td> </tr> <tr> <td style="text-align:left;"> 30 mins </td> <td style="text-align:center;"> 4.85 </td> <td style="text-align:center;"> 2.12 </td> <td style="text-align:center;"> 2.00 </td> <td style="text-align:center;"> 1.63 </td> </tr> </tbody> </table> -- ## Overall <table> <thead> <tr> <th style="text-align:center;"> Mean (happiness) </th> <th style="text-align:center;"> SD (happiness) </th> <th style="text-align:center;"> Mean (puppy love) </th> <th style="text-align:center;"> SD (puppy love) </th> </tr> </thead> <tbody> <tr> <td style="text-align:center;"> 4.37 </td> <td style="text-align:center;"> 1.96 </td> <td style="text-align:center;"> 2.73 </td> <td style="text-align:center;"> 1.86 </td> </tr> </tbody> </table> --- background-image: none background-color: #000000 class: no-scribble <video width="100%" height="100%" controls id="my_video"> <source src="media/milton_just_as_boring.mp4" type="video/mp4"> </video> --- background-image: url("media/milton_20180606_113722_crop.jpg") background-size: cover # The model -- .whitebox[.center[.eq_lrge[ $$ `\begin{aligned} \hat{\text{Happiness}}_i &= \hat{b}_0 + \hat{b}_1\text{Long}_i + \hat{b}_2\text{Short}_i + \hat{b}_3\text{Puppy love}_i 
\end{aligned}` $$ ]]]

<br>

--

<table>
 <thead>
  <tr>
   <th style="text-align:left;"> Therapy group </th>
   <th style="text-align:center;"> Long (30 mins vs. no puppies) </th>
   <th style="text-align:center;"> Short (15 mins vs. no puppies) </th>
  </tr>
 </thead>
<tbody>
  <tr>
   <td style="text-align:left;background-color: white !important;"> No puppies </td>
   <td style="text-align:center;background-color: white !important;"> 0 </td>
   <td style="text-align:center;background-color: white !important;"> 0 </td>
  </tr>
  <tr>
   <td style="text-align:left;background-color: white !important;"> 15 mins </td>
   <td style="text-align:center;background-color: white !important;"> 0 </td>
   <td style="text-align:center;background-color: white !important;"> 1 </td>
  </tr>
  <tr>
   <td style="text-align:left;background-color: white !important;"> 30 mins </td>
   <td style="text-align:center;background-color: white !important;"> 1 </td>
   <td style="text-align:center;background-color: white !important;"> 0 </td>
  </tr>
</tbody>
</table>

---

# What you'd expect the dummy variables to represent

<table>
 <thead>
  <tr>
   <th style="text-align:left;"> Therapy group </th>
   <th style="text-align:center;"> Mean (happiness) </th>
  </tr>
 </thead>
<tbody>
  <tr>
   <td style="text-align:left;"> No puppies </td>
   <td style="text-align:center;"> 3.22 </td>
  </tr>
  <tr>
   <td style="text-align:left;"> 15 mins </td>
   <td style="text-align:center;"> 4.88 </td>
  </tr>
  <tr>
   <td style="text-align:left;"> 30 mins </td>
   <td style="text-align:center;"> 4.85 </td>
  </tr>
</tbody>
</table>

--

<br>

.ong[.eq_lrge[
$$ `\begin{aligned} \hat{b}_0 &= \bar{X}_\text{No puppies} = 3.22 \\ \hat{b}_1 &= 4.85-3.22 = 1.63 \\ \hat{b}_2 &= 4.88-3.22 = 1.66 \end{aligned}` $$
]]

---

# What you actually get

```r
pupluv_lm <- lm(happiness ~ puppy_love + dose, data = pupluv_tib)
broom::tidy(pupluv_lm, conf.int = TRUE)
```

<br>

--

.center[
<table>
 <thead>
  <tr>
   <th style="text-align:left;"> term </th>
   <th style="text-align:right;"> estimate </th>
   <th style="text-align:right;">
std.error </th> <th style="text-align:right;"> statistic </th> <th style="text-align:right;"> p.value </th> <th style="text-align:right;"> conf.low </th> <th style="text-align:right;"> conf.high </th> </tr> </thead> <tbody> <tr> <td style="text-align:left;"> (Intercept) </td> <td style="text-align:right;background-color: yellow !important;"> 1.789 </td> <td style="text-align:right;"> 0.867 </td> <td style="text-align:right;"> 2.063 </td> <td style="text-align:right;"> 0.049 </td> <td style="text-align:right;"> 0.007 </td> <td style="text-align:right;"> 3.572 </td> </tr> <tr> <td style="text-align:left;"> puppy_love </td> <td style="text-align:right;background-color: yellow !important;"> 0.416 </td> <td style="text-align:right;"> 0.187 </td> <td style="text-align:right;"> 2.227 </td> <td style="text-align:right;"> 0.035 </td> <td style="text-align:right;"> 0.032 </td> <td style="text-align:right;"> 0.800 </td> </tr> <tr> <td style="text-align:left;"> dose15 mins </td> <td style="text-align:right;background-color: yellow !important;"> 1.786 </td> <td style="text-align:right;"> 0.849 </td> <td style="text-align:right;"> 2.102 </td> <td style="text-align:right;"> 0.045 </td> <td style="text-align:right;"> 0.040 </td> <td style="text-align:right;"> 3.532 </td> </tr> <tr> <td style="text-align:left;"> dose30 mins </td> <td style="text-align:right;background-color: yellow !important;"> 2.225 </td> <td style="text-align:right;"> 0.803 </td> <td style="text-align:right;"> 2.771 </td> <td style="text-align:right;"> 0.010 </td> <td style="text-align:right;"> 0.575 </td> <td style="text-align:right;"> 3.875 </td> </tr> </tbody> </table> ] -- <br> .center[.ong[ $$ `\begin{aligned} \hat{\text{Happiness}}_i &= 1.789 + 2.225\text{ Long}_i + 1.786\text{ Short}_i + 0.416\text{ Puppy love}_i \end{aligned}` $$ ]] --- # Adjusting means .center[.ong[ $$ `\begin{aligned} \hat{\text{Happiness}}_i &= 1.789 + 2.225\text{ Long}_i + 1.786\text{ Short}_i + 0.416\text{ Puppy love}_i 
\end{aligned}` $$ ]] <br> <table> <thead> <tr> <th style="text-align:center;"> Mean (happiness) </th> <th style="text-align:center;"> SD (happiness) </th> <th style="text-align:center;"> Mean (puppy love) </th> <th style="text-align:center;"> SD (puppy love) </th> </tr> </thead> <tbody> <tr> <td style="text-align:center;"> 4.37 </td> <td style="text-align:center;"> 1.96 </td> <td style="text-align:center;"> 2.73 </td> <td style="text-align:center;"> 1.86 </td> </tr> </tbody> </table> -- ## Control group .center[.ong[ $$ `\begin{aligned} \hat{\text{Happiness}}_i &= 1.789 + 2.225\text{ Long}_i + 1.786\text{ Short}_i + 0.416\text{ Puppy love}_i \\ &= 1.789 + (2.225\times0) + (1.786\times0) + (0.416\text{ Puppy love}_i) \\ &= 1.789 + (0.416\times\bar{X}_\text{Puppy love}) \\ &= 1.789 + (0.416\times2.73) \\ &= 2.925 \end{aligned}` $$ ]] --- # Adjusting means ## 15 minute group .center[.ong[ $$ `\begin{aligned} \hat{\text{Happiness}}_i &= 1.789 + 2.225\text{ Long}_i + 1.786\text{ Short}_i + 0.416\text{ Puppy love}_i \\ &= 1.789 + (2.225\times0) + (1.786\times1) + (0.416\text{ Puppy love}_i) \\ &= 1.789 + 1.786 + (0.416\times2.73) \\ &= 4.71 \end{aligned}` $$ ]] -- ## 30 minute group .center[.ong[ $$ `\begin{aligned} \hat{\text{Happiness}}_i &= 1.789 + 2.225\text{ Long}_i + 1.786\text{ Short}_i + 0.416\text{ Puppy love}_i \\ &= 1.789 + (2.225\times1) + (1.786\times0) + (0.416\text{ Puppy love}_i) \\ &= 1.789 + 2.225 + (0.416\times2.73) \\ &= 5.15 \end{aligned}` $$ ]] --- class: center # The unadjusted model .ong[.center[ $$ `\begin{aligned} \hat{\text{Happiness}}_i &= \hat{b}_0 + \hat{b}_1\text{Long}_i + \hat{b}_2\text{Short}_i \end{aligned}` $$ ]] <!-- --> --- class: center # The unadjusted model .ong[.center[ $$ `\begin{aligned} \hat{\text{Happiness}}_i &= \hat{b}_0 + \hat{b}_1\text{Long}_i + \hat{b}_2\text{Short}_i \end{aligned}` $$ ]] <!-- --> --- class: center # The unadjusted model .ong[.center[ $$ `\begin{aligned} \hat{\text{Happiness}}_i &= \hat{b}_0 + 
\hat{b}_1\text{Long}_i + \hat{b}_2\text{Short}_i \end{aligned}` $$ ]] <!-- --> --- class: center # The unadjusted model .ong[.center[ $$ `\begin{aligned} \hat{\text{Happiness}}_i &= \hat{b}_0 + \hat{b}_1\text{Long}_i + \hat{b}_2\text{Short}_i \end{aligned}` $$ ]] <!-- --> --- class: center # The adjusted model .ong[.center[ $$ `\begin{aligned} \hat{\text{Happiness}}_i &= \hat{b}_0 + \hat{b}_1\text{Long}_i + \hat{b}_2\text{Short}_i + \hat{b}_3\text{Puppy love}_i \end{aligned}` $$ ]] <!-- --> --- class: center # The adjusted model .ong[.center[ $$ `\begin{aligned} \hat{\text{Happiness}}_i &= \hat{b}_0 + \hat{b}_1\text{Long}_i + \hat{b}_2\text{Short}_i + \hat{b}_3\text{Puppy love}_i \end{aligned}` $$ ]] <!-- --> --- # Model parameters .pull-left[ <table> <thead> <tr> <th style="text-align:left;"> Therapy group </th> <th style="text-align:center;"> Mean (happiness) </th> <th style="text-align:center;"> Adjusted Mean (happiness) </th> </tr> </thead> <tbody> <tr> <td style="text-align:left;"> No puppies </td> <td style="text-align:center;"> 3.22 </td> <td style="text-align:center;"> 2.93 </td> </tr> <tr> <td style="text-align:left;"> 15 mins </td> <td style="text-align:center;"> 4.88 </td> <td style="text-align:center;"> 4.71 </td> </tr> <tr> <td style="text-align:left;"> 30 mins </td> <td style="text-align:center;"> 4.85 </td> <td style="text-align:center;"> 5.15 </td> </tr> </tbody> </table> ] .pull-right[ .ong[.eq_lrge[ $$ `\begin{aligned} \hat{b}_1 &= 5.15-2.93 = 2.22 \\ \hat{b}_2 &= 4.71-2.93 = 1.78 \end{aligned}` $$ ]] ] -- <br> .center[ <table> <thead> <tr> <th style="text-align:left;"> term </th> <th style="text-align:right;"> estimate </th> <th style="text-align:right;"> std.error </th> <th style="text-align:right;"> statistic </th> <th style="text-align:right;"> p.value </th> </tr> </thead> <tbody> <tr> <td style="text-align:left;background-color: white !important;"> (Intercept) </td> <td style="text-align:right;background-color: yellow 
!important;background-color: white !important;"> 1.789 </td> <td style="text-align:right;background-color: white !important;"> 0.867 </td> <td style="text-align:right;background-color: white !important;"> 2.063 </td> <td style="text-align:right;background-color: white !important;"> 0.049 </td> </tr> <tr> <td style="text-align:left;background-color: white !important;"> puppy_love </td> <td style="text-align:right;background-color: yellow !important;background-color: white !important;"> 0.416 </td> <td style="text-align:right;background-color: white !important;"> 0.187 </td> <td style="text-align:right;background-color: white !important;"> 2.227 </td> <td style="text-align:right;background-color: white !important;"> 0.035 </td> </tr> <tr> <td style="text-align:left;background-color: rgba(19, 108, 185, 0.4) !important;"> dose15 mins </td> <td style="text-align:right;background-color: rgba(19, 108, 185, 0.4) !important;background-color: yellow !important;"> 1.786 </td> <td style="text-align:right;background-color: rgba(19, 108, 185, 0.4) !important;"> 0.849 </td> <td style="text-align:right;background-color: rgba(19, 108, 185, 0.4) !important;"> 2.102 </td> <td style="text-align:right;background-color: rgba(19, 108, 185, 0.4) !important;"> 0.045 </td> </tr> <tr> <td style="text-align:left;background-color: rgba(19, 108, 185, 0.4) !important;"> dose30 mins </td> <td style="text-align:right;background-color: rgba(19, 108, 185, 0.4) !important;background-color: yellow !important;"> 2.225 </td> <td style="text-align:right;background-color: rgba(19, 108, 185, 0.4) !important;"> 0.803 </td> <td style="text-align:right;background-color: rgba(19, 108, 185, 0.4) !important;"> 2.771 </td> <td style="text-align:right;background-color: rgba(19, 108, 185, 0.4) !important;"> 0.010 </td> </tr> </tbody> </table> ] --- background-image: none background-color: #000000 class: no-scribble <video width="100%" height="100%" controls id="my_video"> <source 
src="media/milton_even_prettier_as_a_puppy.mp4" type="video/mp4"> </video> --- # The *F*-statistic with multiple predictors * The *F*-statistic is calculated using sums of squares * Type I (sequential) + The default in R + Each predictor is evaluated taking account of previous predictors + **The order of predictors matters!** * Type III + Each predictor is evaluated taking account of all other predictors + The order of predictors doesn’t matter * Type II and IV + These exist too but let's not confuse things ... --- background-image: url("media/milton_20180721_195630_crop.jpg") background-size: cover # .center[Type I sums of squares] .pull-left[ ```r lm(happiness ~ `puppy_love + dose`, data = pupluv_tib) |> anova() ``` .whitebox[ <table> <thead> <tr> <th style="text-align:left;"> </th> <th style="text-align:right;"> Df </th> <th style="text-align:right;"> Sum Sq </th> <th style="text-align:right;"> Mean Sq </th> <th style="text-align:right;"> F value </th> <th style="text-align:right;"> Pr(>F) </th> </tr> </thead> <tbody> <tr> <td style="text-align:left;"> puppy_love </td> <td style="text-align:right;"> 1 </td> <td style="text-align:right;"> 6.73 </td> <td style="text-align:right;"> 6.73 </td> <td style="text-align:right;"> 2.22 </td> <td style="text-align:right;"> 0.15 </td> </tr> <tr> <td style="text-align:left;"> dose </td> <td style="text-align:right;"> 2 </td> <td style="text-align:right;"> 25.19 </td> <td style="text-align:right;"> 12.59 </td> <td style="text-align:right;"> 4.14 </td> <td style="text-align:right;"> 0.03 </td> </tr> <tr> <td style="text-align:left;"> Residuals </td> <td style="text-align:right;"> 26 </td> <td style="text-align:right;"> 79.05 </td> <td style="text-align:right;"> 3.04 </td> <td style="text-align:right;"> NA </td> <td style="text-align:right;"> NA </td> </tr> </tbody> </table> ]] -- .pull-right[ ```r lm(happiness ~ `dose + puppy_love`, data = pupluv_tib) |> anova() ``` .whitebox[ <table> <thead> <tr> <th style="text-align:left;"> 
</th> <th style="text-align:right;"> Df </th> <th style="text-align:right;"> Sum Sq </th> <th style="text-align:right;"> Mean Sq </th> <th style="text-align:right;"> F value </th> <th style="text-align:right;"> Pr(>F) </th> </tr> </thead> <tbody> <tr> <td style="text-align:left;"> dose </td> <td style="text-align:right;"> 2 </td> <td style="text-align:right;"> 16.84 </td> <td style="text-align:right;"> 8.42 </td> <td style="text-align:right;"> 2.77 </td> <td style="text-align:right;"> 0.08 </td> </tr> <tr> <td style="text-align:left;"> puppy_love </td> <td style="text-align:right;"> 1 </td> <td style="text-align:right;"> 15.08 </td> <td style="text-align:right;"> 15.08 </td> <td style="text-align:right;"> 4.96 </td> <td style="text-align:right;"> 0.03 </td> </tr> <tr> <td style="text-align:left;"> Residuals </td> <td style="text-align:right;"> 26 </td> <td style="text-align:right;"> 79.05 </td> <td style="text-align:right;"> 3.04 </td> <td style="text-align:right;"> NA </td> <td style="text-align:right;"> NA </td> </tr> </tbody> </table> ]] -- <br> .whitebox[ .warning[ .txt_lrge[<svg style="height: 1em; top:.04em; position: relative; fill: #CA3E34;" viewBox="0 0 576 512"><path 
d="M192,320h32V224H192Zm160,0h32V224H352ZM544,112H512a32.03165,32.03165,0,0,0-32,32v16H416V128h32a32.03165,32.03165,0,0,0,32-32V64a32.03165,32.03165,0,0,0-32-32H416a32.03165,32.03165,0,0,0-32,32H352a32.03165,32.03165,0,0,0-32,32v32H256V96a32.03165,32.03165,0,0,0-32-32H192a32.03165,32.03165,0,0,0-32-32H128A32.03165,32.03165,0,0,0,96,64V96a32.03165,32.03165,0,0,0,32,32h32v32H96V144a32.03165,32.03165,0,0,0-32-32H32A32.03165,32.03165,0,0,0,0,144V288a32.03165,32.03165,0,0,0,32,32H64v32a32.03165,32.03165,0,0,0,32,32h32v64a32.03165,32.03165,0,0,0,32,32h80a32.03165,32.03165,0,0,0,32-32V416a32.03165,32.03165,0,0,0-32-32h96a32.03165,32.03165,0,0,0-32,32v32a32.03165,32.03165,0,0,0,32,32h80a32.03165,32.03165,0,0,0,32-32V384h32a32.03165,32.03165,0,0,0,32-32V320h32a32.03165,32.03165,0,0,0,32-32V144A32.03165,32.03165,0,0,0,544,112ZM416,64h32V96H416ZM128,96V64h32V96ZM240,448H160V384h32v32h48Zm176,0H336V416h48V384h32ZM544,288H480v64H96V288H32V144H64V256H96V192h96V96h32v64H352V96h32v96h96v64h32V144h32Z"/></svg> WARNING! 
**Type I sums of squares: variable order matters!**] ]] --- background-image: url("media/milton_20180720_100533_crop.jpg") background-size: cover # Type III sums of squares -- .pull-left[ ```r lm(happiness ~ `puppy_love + dose`, data = pupluv_tib) |> car::Anova(type = 3) ``` .whitebox[ <table> <thead> <tr> <th style="text-align:left;"> </th> <th style="text-align:right;"> Sum Sq </th> <th style="text-align:right;"> Df </th> <th style="text-align:right;"> F value </th> <th style="text-align:right;"> Pr(>F) </th> </tr> </thead> <tbody> <tr> <td style="text-align:left;"> (Intercept) </td> <td style="text-align:right;"> 12.94 </td> <td style="text-align:right;"> 1 </td> <td style="text-align:right;"> 4.26 </td> <td style="text-align:right;"> 0.05 </td> </tr> <tr> <td style="text-align:left;"> puppy_love </td> <td style="text-align:right;"> 15.08 </td> <td style="text-align:right;"> 1 </td> <td style="text-align:right;"> 4.96 </td> <td style="text-align:right;"> 0.03 </td> </tr> <tr> <td style="text-align:left;"> dose </td> <td style="text-align:right;"> 25.19 </td> <td style="text-align:right;"> 2 </td> <td style="text-align:right;"> 4.14 </td> <td style="text-align:right;"> 0.03 </td> </tr> <tr> <td style="text-align:left;"> Residuals </td> <td style="text-align:right;"> 79.05 </td> <td style="text-align:right;"> 26 </td> <td style="text-align:right;"> NA </td> <td style="text-align:right;"> NA </td> </tr> </tbody> </table> ] ] -- .pull-right[ ```r lm(happiness ~ `dose + puppy_love`, data = pupluv_tib) |> car::Anova(type = 3) ``` .whitebox[ <table> <thead> <tr> <th style="text-align:left;"> </th> <th style="text-align:right;"> Sum Sq </th> <th style="text-align:right;"> Df </th> <th style="text-align:right;"> F value </th> <th style="text-align:right;"> Pr(>F) </th> </tr> </thead> <tbody> <tr> <td style="text-align:left;"> (Intercept) </td> <td style="text-align:right;"> 12.94 </td> <td style="text-align:right;"> 1 </td> <td style="text-align:right;"> 4.26 </td> <td 
style="text-align:right;"> 0.05 </td> </tr> <tr> <td style="text-align:left;"> dose </td> <td style="text-align:right;"> 25.19 </td> <td style="text-align:right;"> 2 </td> <td style="text-align:right;"> 4.14 </td> <td style="text-align:right;"> 0.03 </td> </tr> <tr> <td style="text-align:left;"> puppy_love </td> <td style="text-align:right;"> 15.08 </td> <td style="text-align:right;"> 1 </td> <td style="text-align:right;"> 4.96 </td> <td style="text-align:right;"> 0.03 </td> </tr> <tr> <td style="text-align:left;"> Residuals </td> <td style="text-align:right;"> 79.05 </td> <td style="text-align:right;"> 26 </td> <td style="text-align:right;"> NA </td> <td style="text-align:right;"> NA </td> </tr> </tbody> </table> ]] -- <br> .whitebox[ .tip[ .txt_lrge[<svg style="height:1.5em; top:.04em; position: relative; fill: #2C5577;" viewBox="0 0 640 512"><path d="M512,176a16,16,0,1,0-16-16A15.9908,15.9908,0,0,0,512,176ZM576,32.72461V32l-.46094.3457C548.81445,12.30469,515.97461,0,480,0s-68.81445,12.30469-95.53906,32.3457L384,32v.72461C345.35156,61.93164,320,107.82422,320,160c0,.38086.10938.73242.11133,1.11328A272.01015,272.01015,0,0,0,96,304.26562V176A80.08413,80.08413,0,0,0,16,96a16,16,0,0,0,0,32,48.05249,48.05249,0,0,1,48,48V432a80.08413,80.08413,0,0,0,80,80H352a32.03165,32.03165,0,0,0,32-32,64.0956,64.0956,0,0,0-57.375-63.65625L416,376.625V480a32.03165,32.03165,0,0,0,32,32h32a32.03165,32.03165,0,0,0,32-32V316.77539A160.036,160.036,0,0,0,640,160C640,107.82422,614.64844,61.93164,576,32.72461ZM480,32a126.94015,126.94015,0,0,1,68.78906,20.4082L512,80H448L411.21094,52.4082A126.94015,126.94015,0,0,1,480,32Zm64,64v64a64,64,0,0,1-128,0V96l21.334,16h85.332ZM480,480H448V351.99609A15.99929,15.99929,0,0,0,425.5,337.377L303.1875,391.75a100.1169,100.1169,0,0,0-67.25-84.89062,7.96929,7.96929,0,0,0-10.09375,5.76562l-3.875,15.5625a8.16346,8.16346,0,0,0,5.375,9.5625C252,346.875,272,375.625,272,401.90625V448h48a32.03165,32.03165,0,0,1,32,32H144c-26.94531,0-48.13086-22.27344-47.99609-49.21875.
63671-127.52734,101.31054-231.53516,227.36914-238.14063A160.02931,160.02931,0,0,0,480,320Zm0-192A128.14414,128.14414,0,0,1,352,160c0-32.16992,12.334-61.25391,32-83.76367V160a96,96,0,0,0,192,0V76.23633C595.666,98.74609,608,127.83008,608,160A128.14414,128.14414,0,0,1,480,288ZM432,160a16,16,0,1,0,16-16A15.9908,15.9908,0,0,0,432,160ZM162.94531,68.76953l39.71094,16.56055,16.5625,39.71094a5.32345,5.32345,0,0,0,9.53906,0l16.5586-39.71094,39.71484-16.56055a5.336,5.336,0,0,0,0-9.541l-39.71484-16.5586L228.75781,2.957a5.325,5.325,0,0,0-9.53906,0l-16.5625,39.71289-39.71094,16.5586a5.336,5.336,0,0,0,0,9.541Z"/></svg> TIP! If you want *F*-statistics and have several predictors, use Type III sums of squares] ]] --- # Bias in the *F*-statistic .pull-left[ * For the significance of *F*-statistics to be accurate we assume that the relationship between the covariate and outcome is similar across groups. * Known as **homogeneity of regression slopes** * When the assumption is met the resulting *F*-statistic can be assumed to follow the *F*-distribution and the corresponding *p*-value is accurate. * When the assumption is not met the *F*-statistic might not follow the *F*-distribution and the corresponding *p*-value is inaccurate * Only relevant for the *F*-statistic ] .pull-right[ <!-- --> ] ??? When the assumption of homogeneity of regression slopes is met the resulting F-statistic can be assumed to have the corresponding F-distribution; however, when the assumption is not met it can’t, meaning that the resulting F-statistic is being evaluated against a distribution different than the one that it actually has. Consequently, the Type I error rate of the test is inflated and the power to detect effects is not maximized (Hollingsworth, 1980). This is especially true when group sizes are unequal (Hamilton, 1977) and when the standardized regression slopes differ by more than 0.4 (Wu, 1984). 
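---

# Differing slopes: a simulated sketch

To make the idea of heterogeneous regression slopes concrete, the sketch below simulates two groups in which the covariate relates to the outcome with different slopes, then compares a common-slope model with a separate-slopes model. Everything here is an assumption for illustration: the data frame `sim`, its variables `group`, `x` and `y`, and the true slopes (0.5 and 2) are made up, and the comparison uses base R's `anova()` on two nested models rather than the study data.

```r
set.seed(42)  # make the simulated noise reproducible

n <- 50
sim <- data.frame(
  group = factor(rep(c("A", "B"), each = n)),
  x     = rep(seq(0, 7, length.out = n), times = 2)
)

# Hypothetical truth: slope of 0.5 in group A and 2 in group B
true_slope <- ifelse(sim$group == "A", 0.5, 2)
sim$y <- 1 + true_slope * sim$x + rnorm(2 * n, sd = 0.5)

common_lm   <- lm(y ~ x + group, data = sim)  # one slope shared by both groups
separate_lm <- lm(y ~ x * group, data = sim)  # a separate slope for each group

# F-test of whether allowing separate slopes improves the fit
slopes_test <- anova(common_lm, separate_lm)
slopes_test

# Estimated difference between the two groups' slopes
coef(separate_lm)[["x:groupB"]]
```

When the slopes really do differ, as in this simulation, the common-slope model describes neither group well, which is why the *F*-statistic from that model becomes hard to interpret.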
--- ## Homogeneity of regression slopes .pull-left[ * For the significance of *F*-statistics to be accurate we assume that the relationship between the covariate and outcome is similar across groups. * Known as **homogeneity of regression slopes** * When the assumption is met the resulting *F*-statistic can be assumed to follow the *F*-distribution and the corresponding *p*-value is accurate. * When the assumption is not met the *F*-statistic might not follow the *F*-distribution and the corresponding *p*-value is inaccurate * Only relevant for the *F*-statistic ] .pull-right[ <!-- --> ] --- ## Heterogeneity of regression slopes .pull-left[ * For the significance of *F*-statistics to be accurate we assume that the relationship between the covariate and outcome is similar across groups. * Known as **homogeneity of regression slopes** * When the assumption is met the resulting *F*-statistic can be assumed to follow the *F*-distribution and the corresponding *p*-value is accurate. * When the assumption is not met the *F*-statistic might not follow the *F*-distribution and the corresponding *p*-value is inaccurate * Only relevant for the *F*-statistic ] .pull-right[ <!-- --> ] --- # Fitting the model ## Overall fit of each predictor ```r pupluv_lm <- lm(happiness ~ dose + puppy_love, data = pupluv_tib) car::Anova(pupluv_lm, type = 3) ``` .center[ <table> <thead> <tr> <th style="text-align:left;"> </th> <th style="text-align:right;"> Sum Sq </th> <th style="text-align:right;"> Df </th> <th style="text-align:right;"> F value </th> <th style="text-align:right;"> Pr(>F) </th> </tr> </thead> <tbody> <tr> <td style="text-align:left;"> (Intercept) </td> <td style="text-align:right;"> 12.943 </td> <td style="text-align:right;"> 1 </td> <td style="text-align:right;"> 4.257 </td> <td style="text-align:right;"> 0.049 </td> </tr> <tr> <td style="text-align:left;"> puppy_love </td> <td style="text-align:right;"> 15.076 </td> <td style="text-align:right;"> 1 </td> <td 
style="text-align:right;"> 4.959 </td> <td style="text-align:right;"> 0.035 </td> </tr> <tr> <td style="text-align:left;"> dose </td> <td style="text-align:right;"> 25.185 </td> <td style="text-align:right;"> 2 </td> <td style="text-align:right;"> 4.142 </td> <td style="text-align:right;"> 0.027 </td> </tr> <tr> <td style="text-align:left;"> Residuals </td> <td style="text-align:right;"> 79.047 </td> <td style="text-align:right;"> 26 </td> <td style="text-align:right;"> NA </td> <td style="text-align:right;"> NA </td> </tr> </tbody> </table> ] -- <br> .infobox[
Love of puppies significantly predicted happiness, *F*(1, 26) = 4.96, *p* = 0.035.
The dose of puppy therapy had a significant effect on happiness, *F*(2, 26) = 4.14, *p* = 0.027. ] --- ## Breaking down the overall effects (parameter estimates) ```r broom::tidy(pupluv_lm, conf.int = TRUE) ``` .center[ <table> <thead> <tr> <th style="text-align:left;"> term </th> <th style="text-align:right;"> estimate </th> <th style="text-align:right;"> std.error </th> <th style="text-align:right;"> statistic </th> <th style="text-align:right;"> p.value </th> <th style="text-align:right;"> conf.low </th> <th style="text-align:right;"> conf.high </th> </tr> </thead> <tbody> <tr> <td style="text-align:left;"> (Intercept) </td> <td style="text-align:right;"> 1.789 </td> <td style="text-align:right;"> 0.867 </td> <td style="text-align:right;"> 2.063 </td> <td style="text-align:right;"> 0.049 </td> <td style="text-align:right;"> 0.007 </td> <td style="text-align:right;"> 3.572 </td> </tr> <tr> <td style="text-align:left;"> puppy_love </td> <td style="text-align:right;"> 0.416 </td> <td style="text-align:right;"> 0.187 </td> <td style="text-align:right;"> 2.227 </td> <td style="text-align:right;"> 0.035 </td> <td style="text-align:right;"> 0.032 </td> <td style="text-align:right;"> 0.800 </td> </tr> <tr> <td style="text-align:left;"> dose15 mins </td> <td style="text-align:right;"> 1.786 </td> <td style="text-align:right;"> 0.849 </td> <td style="text-align:right;"> 2.102 </td> <td style="text-align:right;"> 0.045 </td> <td style="text-align:right;"> 0.040 </td> <td style="text-align:right;"> 3.532 </td> </tr> <tr> <td style="text-align:left;"> dose30 mins </td> <td style="text-align:right;"> 2.225 </td> <td style="text-align:right;"> 0.803 </td> <td style="text-align:right;"> 2.771 </td> <td style="text-align:right;"> 0.010 </td> <td style="text-align:right;"> 0.575 </td> <td style="text-align:right;"> 3.875 </td> </tr> </tbody> </table> ] -- <br> .infobox[
Love of puppies significantly predicted happiness, *b* = 0.42 [0.03, 0.80], *t* = 2.23, *p* = 0.035. For every one-unit increase in love of puppies, predicted happiness increased by 0.42 units, holding the dose of puppy therapy constant.
The dose of puppy therapy also significantly predicted happiness. Compared to no puppy controls, happiness was significantly higher after both 15 minutes, *b* = 1.79 [0.04, 3.53], *t* = 2.10, *p* = 0.045, and 30 minutes, *b* = 2.22 [0.57, 3.88], *t* = 2.77, *p* = 0.010, of therapy. ] --- ## Adjusted means ```r modelbased::estimate_means(pupluv_lm, fixed = "puppy_love") ``` .center[ <table> <thead> <tr> <th style="text-align:left;"> dose </th> <th style="text-align:right;"> puppy_love </th> <th style="text-align:right;"> Mean </th> <th style="text-align:right;"> SE </th> <th style="text-align:right;"> CI_low </th> <th style="text-align:right;"> CI_high </th> </tr> </thead> <tbody> <tr> <td style="text-align:left;"> No puppies </td> <td style="text-align:right;"> 2.73 </td> <td style="text-align:right;"> 2.93 </td> <td style="text-align:right;"> 0.60 </td> <td style="text-align:right;"> 1.70 </td> <td style="text-align:right;"> 4.15 </td> </tr> <tr> <td style="text-align:left;"> 15 mins </td> <td style="text-align:right;"> 2.73 </td> <td style="text-align:right;"> 4.71 </td> <td style="text-align:right;"> 0.62 </td> <td style="text-align:right;"> 3.44 </td> <td style="text-align:right;"> 5.99 </td> </tr> <tr> <td style="text-align:left;"> 30 mins </td> <td style="text-align:right;"> 2.73 </td> <td style="text-align:right;"> 5.15 </td> <td style="text-align:right;"> 0.50 </td> <td style="text-align:right;"> 4.12 </td> <td style="text-align:right;"> 6.18 </td> </tr> </tbody> </table> ] -- <br> .infobox[
At average levels of love of puppies, the mean happiness in the no puppy control group was *M* = 2.93 [1.70, 4.15] compared to *M* = 4.71 [3.44, 5.99] in the 15-minute group and *M* = 5.15 [4.12, 6.18] in the 30-minute group. ] --- background-image: none background-color: #000000 class: no-scribble <video width="100%" height="100%" controls id="my_video"> <source src="media/milton_tickly_tummy.mp4" type="video/mp4"> </video> --- class: inverse background-image: url("media/milton_20180906_163838_crop.jpg") background-size: cover # Testing assumptions -- .pull-left[ ```r library(ggfortify) ggplot2::autoplot(pupluv_lm, which = c(1, 3, 2, 4), colour = "#5c97bf", alpha = 0.5, size = 1) + theme_minimal() ``` ] .pull-right[ <!-- --> ] --- # Homogeneity of regression slopes ```r hors_lm <- lm(happiness ~ `puppy_love*dose`, data = pupluv_tib) car::Anova(hors_lm, type = 3) ``` -- <table> <thead> <tr> <th style="text-align:left;"> </th> <th style="text-align:right;"> Sum Sq </th> <th style="text-align:right;"> Df </th> <th style="text-align:right;"> F value </th> <th style="text-align:right;"> Pr(>F) </th> </tr> </thead> <tbody> <tr> <td style="text-align:left;"> (Intercept) </td> <td style="text-align:right;"> 0.771 </td> <td style="text-align:right;"> 1 </td> <td style="text-align:right;"> 0.316 </td> <td style="text-align:right;"> 0.579 </td> </tr> <tr> <td style="text-align:left;"> puppy_love </td> <td style="text-align:right;"> 19.922 </td> <td style="text-align:right;"> 1 </td> <td style="text-align:right;"> 8.157 </td> <td style="text-align:right;"> 0.009 </td> </tr> <tr> <td style="text-align:left;"> dose </td> <td style="text-align:right;"> 36.558 </td> <td style="text-align:right;"> 2 </td> <td style="text-align:right;"> 7.484 </td> <td style="text-align:right;"> 0.003 </td> </tr> <tr> <td style="text-align:left;background-color: yellow !important;"> puppy_love:dose </td> <td style="text-align:right;background-color: yellow !important;"> 20.427 </td> <td 
style="text-align:right;background-color: yellow !important;"> 2 </td> <td style="text-align:right;background-color: yellow !important;"> 4.181 </td> <td style="text-align:right;background-color: yellow !important;"> 0.028 </td> </tr> <tr> <td style="text-align:left;"> Residuals </td> <td style="text-align:right;"> 58.621 </td> <td style="text-align:right;"> 24 </td> <td style="text-align:right;"> NA </td> <td style="text-align:right;"> NA </td> </tr> </tbody> </table> .tip[ <svg style="height:1.5em; top:.04em; position: relative; fill: #2C5577;" viewBox="0 0 640 512"><path d="M512,176a16,16,0,1,0-16-16A15.9908,15.9908,0,0,0,512,176ZM576,32.72461V32l-.46094.3457C548.81445,12.30469,515.97461,0,480,0s-68.81445,12.30469-95.53906,32.3457L384,32v.72461C345.35156,61.93164,320,107.82422,320,160c0,.38086.10938.73242.11133,1.11328A272.01015,272.01015,0,0,0,96,304.26562V176A80.08413,80.08413,0,0,0,16,96a16,16,0,0,0,0,32,48.05249,48.05249,0,0,1,48,48V432a80.08413,80.08413,0,0,0,80,80H352a32.03165,32.03165,0,0,0,32-32,64.0956,64.0956,0,0,0-57.375-63.65625L416,376.625V480a32.03165,32.03165,0,0,0,32,32h32a32.03165,32.03165,0,0,0,32-32V316.77539A160.036,160.036,0,0,0,640,160C640,107.82422,614.64844,61.93164,576,32.72461ZM480,32a126.94015,126.94015,0,0,1,68.78906,20.4082L512,80H448L411.21094,52.4082A126.94015,126.94015,0,0,1,480,32Zm64,64v64a64,64,0,0,1-128,0V96l21.334,16h85.332ZM480,480H448V351.99609A15.99929,15.99929,0,0,0,425.5,337.377L303.1875,391.75a100.1169,100.1169,0,0,0-67.25-84.89062,7.96929,7.96929,0,0,0-10.09375,5.76562l-3.875,15.5625a8.16346,8.16346,0,0,0,5.375,9.5625C252,346.875,272,375.625,272,401.90625V448h48a32.03165,32.03165,0,0,1,32,32H144c-26.94531,0-48.13086-22.27344-47.99609-49.21875.63671-127.52734,101.31054-231.53516,227.36914-238.14063A160.02931,160.02931,0,0,0,480,320Zm0-192A128.14414,128.14414,0,0,1,352,160c0-32.16992,12.334-61.25391,32-83.76367V160a96,96,0,0,0,192,0V76.23633C595.666,98.74609,608,127.83008,608,160A128.14414,128.14414,0,0,1,480,288ZM432,160
a16,16,0,1,0,16-16A15.9908,15.9908,0,0,0,432,160ZM162.94531,68.76953l39.71094,16.56055,16.5625,39.71094a5.32345,5.32345,0,0,0,9.53906,0l16.5586-39.71094,39.71484-16.56055a5.336,5.336,0,0,0,0-9.541l-39.71484-16.5586L228.75781,2.957a5.325,5.325,0,0,0-9.53906,0l-16.5625,39.71289-39.71094,16.5586a5.336,5.336,0,0,0,0,9.541Z"/></svg> Homogeneity of regression slopes **cannot be assumed** because: 1. The dose*puppy_love interaction effect is significant 2. The plots (see earlier) show that the relationship between puppy love and happiness is different in the 30 minute group to the other two groups. ] --- background-image: url("media/milton_20190627_072357_crop.jpg") background-size: cover # Robust parameter estimates ```r pupluv_rob <- robust::lmRob(happiness ~ puppy_love + dose, data = pupluv_tib) summary(pupluv_rob) ``` .whitebox[.center[ <table> <thead> <tr> <th style="text-align:left;"> </th> <th style="text-align:right;"> Estimate </th> <th style="text-align:right;"> Std. Error </th> <th style="text-align:right;"> t value </th> <th style="text-align:right;"> Pr(>|t|) </th> </tr> </thead> <tbody> <tr> <td style="text-align:left;"> (Intercept) </td> <td style="text-align:right;"> 1.041 </td> <td style="text-align:right;"> 2.103 </td> <td style="text-align:right;"> 0.495 </td> <td style="text-align:right;"> 0.625 </td> </tr> <tr> <td style="text-align:left;"> puppy_love </td> <td style="text-align:right;"> 0.633 </td> <td style="text-align:right;"> 0.556 </td> <td style="text-align:right;"> 1.139 </td> <td style="text-align:right;"> 0.265 </td> </tr> <tr> <td style="text-align:left;"> dose15 mins </td> <td style="text-align:right;"> 1.855 </td> <td style="text-align:right;"> 2.642 </td> <td style="text-align:right;"> 0.702 </td> <td style="text-align:right;"> 0.489 </td> </tr> <tr> <td style="text-align:left;"> dose30 mins </td> <td style="text-align:right;"> 1.400 </td> <td style="text-align:right;"> 1.750 </td> <td style="text-align:right;"> 0.800 </td> <td 
style="text-align:right;"> 0.431 </td> </tr> </tbody> </table> ]] -- <br> .infobox[
Robust estimates showed that love of puppies **did not** have a significant effect on happiness, *b* = 0.63, *t* = 1.14, *p* = 0.265. For every unit increase in puppy love (on the 0-7 scale), predicted happiness (on the 0-10 scale) increased by 0.63 units.
The dose of puppy therapy also **did not** significantly predict happiness. Robust estimates showed that, compared to no puppy controls, happiness was not significantly higher after 15 minutes, *b* = 1.86, *t* = 0.70, *p* = 0.489, or 30 minutes, *b* = 1.40, *t* = 0.80, *p* = 0.431, of therapy. ] --- background-image: url("media/milton_20190627_072357_crop.jpg") background-size: cover # Heteroscedasticity consistent standard errors ```r parameters::model_parameters(pupluv_lm, vcov = "HC4") ``` .whitebox[.center[ <table> <thead> <tr> <th style="text-align:left;"> Parameter </th> <th style="text-align:right;"> Coefficient </th> <th style="text-align:right;"> SE </th> <th style="text-align:right;"> CI </th> <th style="text-align:right;"> CI_low </th> <th style="text-align:right;"> CI_high </th> <th style="text-align:right;"> t </th> <th style="text-align:right;"> df_error </th> <th style="text-align:right;"> p </th> </tr> </thead> <tbody> <tr> <td style="text-align:left;"> (Intercept) </td> <td style="text-align:right;"> 1.789 </td> <td style="text-align:right;"> 0.544 </td> <td style="text-align:right;"> 0.95 </td> <td style="text-align:right;"> 0.671 </td> <td style="text-align:right;"> 2.908 </td> <td style="text-align:right;"> 3.288 </td> <td style="text-align:right;"> 26 </td> <td style="text-align:right;"> 0.003 </td> </tr> <tr> <td style="text-align:left;"> puppy_love </td> <td style="text-align:right;"> 0.416 </td> <td style="text-align:right;"> 0.190 </td> <td style="text-align:right;"> 0.95 </td> <td style="text-align:right;"> 0.025 </td> <td style="text-align:right;"> 0.807 </td> <td style="text-align:right;"> 2.187 </td> <td style="text-align:right;"> 26 </td> <td style="text-align:right;"> 0.038 </td> </tr> <tr> <td style="text-align:left;"> dose15 mins </td> <td style="text-align:right;"> 1.786 </td> <td style="text-align:right;"> 0.493 </td> <td style="text-align:right;"> 0.95 </td> <td style="text-align:right;"> 0.772 </td> <td 
style="text-align:right;"> 2.800 </td> <td style="text-align:right;"> 3.619 </td> <td style="text-align:right;"> 26 </td> <td style="text-align:right;"> 0.001 </td> </tr> <tr> <td style="text-align:left;"> dose30 mins </td> <td style="text-align:right;"> 2.225 </td> <td style="text-align:right;"> 0.690 </td> <td style="text-align:right;"> 0.95 </td> <td style="text-align:right;"> 0.807 </td> <td style="text-align:right;"> 3.642 </td> <td style="text-align:right;"> 3.226 </td> <td style="text-align:right;"> 26 </td> <td style="text-align:right;"> 0.003 </td> </tr> </tbody> </table> ]] -- <br> .infobox[
Love of puppies significantly predicted happiness, *b* = 0.42 [0.03, 0.81], *t* = 2.19, *p* = 0.038. For every unit increase in puppy love, predicted happiness increased by 0.42 units.
The dose of puppy therapy also significantly predicted happiness. Compared to no puppy controls, happiness was significantly higher after both 15 minutes, *b* = 1.79 [0.77, 2.80], *t* = 3.62, *p* = 0.001, and 30 minutes, *b* = 2.22 [0.81, 3.64], *t* = 3.23, *p* = 0.003, of therapy. ] --- # Summary * When we include both a categorical and continuous predictor, the categorical predictor compares means adjusted for the effect of the continuous predictor. + The effect of the categorical variable at average levels of the continuous predictor * Test the overall effect of categorical predictors using the *F*-statistic + Use Type III sums of squares (other things being equal) + Test for homogeneity of regression slopes * Break down the effects of categorical predictors using parameter estimates and their associated tests + Interpret in the same way as in previous lectures * Test the usual assumptions for the linear model * Apply a robust test if necessary
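---

# Adjusted means from the parameter estimates

The summary point that the categorical predictor compares means "at average levels of the continuous predictor" can be checked by hand. The sketch below is a rough illustration, not the method `modelbased::estimate_means()` uses internally: it plugs the rounded parameter estimates from the `broom::tidy()` table and the rounded mean of `puppy_love` (2.73, from the adjusted means table) into the model equation. The object names `b`, `mean_puppy_love`, `baseline` and `adjusted_means` are made up for this example.

```r
# Rounded parameter estimates copied from the broom::tidy() table
# (hypothetical object name; the values are rounded, so results are approximate)
b <- c(intercept = 1.789, puppy_love = 0.416, dose15 = 1.786, dose30 = 2.225)

# Rounded mean of puppy_love, as reported in the adjusted means table
mean_puppy_love <- 2.73

# Predicted happiness for the no puppies group at the mean of puppy_love
baseline <- b[["intercept"]] + b[["puppy_love"]] * mean_puppy_love

# Each dummy-coded group adds its own b to the baseline prediction
adjusted_means <- c(
  "No puppies" = baseline,
  "15 mins"    = baseline + b[["dose15"]],
  "30 mins"    = baseline + b[["dose30"]]
)

round(adjusted_means, 2)
```

Because the inputs are rounded to three decimal places, the values agree with the adjusted means table only to within rounding (the no puppies group comes out a fraction below the tabled 2.93). The differences between the adjusted means are exactly the *b*s for the two dose dummy variables, which is why those parameters compare each therapy group to the control group.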