  • Linear mixed model with -xtmixed- vs. Repeated measures Anova

    Hi guys

    I have one general question – when is it “better” to choose LMM vs. RM-Anova for data with repeated measurements? I have one particular problem, and hope somebody can explain it to me using this example.

    I am interested in the influence of the level of complexity (1=medium, 2=high) and configuration (1=average, 2=modified) on the perceived level of aesthetics of an art piece. Each subject evaluates 2 pictures with different components, which were modified across the dimensions described above, so each subject has 8 evaluations (2 complexity × 2 configuration × 2 components), resulting in the following data structure:


    [Attachment: screenshot of the data structure (ID 245337)]


    I ran the following model:



    after that I run:



    Do you think this approach is suitable? I have trouble explaining why the combination of complexity=2 and configuration=2 has a mean of M = 5.87 while in the model it has a negative effect of b = -.45. I thought 2 2 means “relative to the reference category 1 1”, but that does not make sense, as the mean of 1 1 is M = 4.67. How would you interpret this effect then? Is it that the effect of increasing complexity “overshadows” the interaction effect?

    Furthermore, what is the difference between running -xtmixed- and RM anova in this case? I know that LMM would be more suitable if I had a different number of evaluations per ID, right?

    so if I run RM anova, it looks as following:



    Now I am struggling with the question of which method I should use here. Is there anything I am not accounting for? Any help would be much appreciated! Thanks, Anna
    Last edited by Anna Zakharova; 19 Sep 2014, 05:32.

  • #2
    Whatever it was you attempted to insert after "I ran the following model", "after that I run:" and "so if I run RM anova, it looks as following:" did not show up at all. The screen shot of your output at the end is completely unreadable. Please, please put exhibits of code or results in code blocks using the advanced editor. (Press the underlined A button, then press the # button, and paste your code/results between the delimiters that appear.) Right now we simply can't see what you're trying to show us, and we can't help you.



    • #3
      Hi Clyde, I apologize for uploading unreadable screenshots; it looked fine on my screen. Thank you for the hint.

      so one more time:

      I have one general question – when is it “better” to choose LMM vs. RM-Anova for data with repeated measurements? I have one particular problem, and hope somebody can explain it to me using this example.

      I am interested in the influence of the level of complexity (1=medium, 2=high) and configuration (1=average, 2=modified) on the perceived level of aesthetics of an art piece. Each subject evaluates 2 pictures with different components, which were modified across the dimensions described above, so each subject has 8 evaluations (2 complexity × 2 configuration × 2 components), resulting in the following data structure:
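      To make the design explicit, the 2 × 2 × 2 structure of one subject's evaluations can be enumerated (an illustrative sketch, not the poster's data; variable names mirror the listing below):

```python
from itertools import product

# One subject's evaluations:
# 2 complexity levels x 2 configurations x 2 components = 8 rows per ID
design = [
    {"complexity": cx, "configuration": cf, "component": co}
    for cx, cf, co in product((1, 2), (1, 2), (1, 2))
]

print(len(design))  # 8 evaluations per subject
```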


      Code:
       list ID complexity configuration component aesthetics, sep (8)
      
            +-------------------------------------------------+
            |  ID   comple~y   config~n   compon~t   aesthe~s |
            |-------------------------------------------------|
         1. |   4          1          1          1          4 |
         2. |   4          1          1          2          3 |
         3. |   4          1          2          1          5 |
         4. |   4          1          2          2          2 |
         5. |   4          2          1          1          5 |
         6. |   4          2          1          2          5 |
         7. |   4          2          2          1          6 |
         8. |   4          2          2          2          5 |
            |-------------------------------------------------|
         9. |   6          2          2          .          . |
        10. |   6          1          1          2          7 |
        11. |   6          1          1          .          . |
        12. |   6          1          2          .          . |
        13. |   6          1          2          .          . |
        14. |   6          2          1          2          7 |
        15. |   6          2          1          .          . |
        16. |   6          2          2          .          . |
            |-------------------------------------------------|
        17. |   7          1          1          .          . |
      To see whether a change in complexity/configuration has an influence on aesthetics perception, I ran:

      Code:
      xtmixed aesthetics complexity##configuration component || ID :, var
      
      Performing EM optimization:
      
      Performing gradient-based optimization:
      
      Iteration 0:   log likelihood = -3201.3606 
      Iteration 1:   log likelihood = -3201.3606 
      
      Computing standard errors:
      
      Mixed-effects ML regression                     Number of obs      =      1961
      Group variable: ID                              Number of groups   =       250
      
                                                      Obs per group: min =         1
                                                                     avg =       7.8
                                                                     max =         8
      
      
                                                      Wald chi2(4)       =    651.89
      Log likelihood = -3201.3606                     Prob > chi2        =    0.0000
      
      ------------------------------------------------------------------------------------------
                    aesthetics |      Coef.   Std. Err.      z    P>|z|     [95% Conf. Interval]
      -------------------------+----------------------------------------------------------------
                  2.complexity |   1.531644   .0732134    20.92   0.000     1.388148    1.675139
               2.configuration |   .2503679   .0733646     3.41   0.001      .106576    .3941598
                               |
      complexity#configuration |
                          2 2  |  -.4573005   .1037747    -4.41   0.000    -.6606952   -.2539057
                               |
                     component |  -.0343586   .0519733    -0.66   0.509    -.1362243    .0675071
                         _cons |   4.604247    .101381    45.42   0.000     4.405544     4.80295
      ------------------------------------------------------------------------------------------
      
      ------------------------------------------------------------------------------
        Random-effects Parameters  |   Estimate   Std. Err.     [95% Conf. Interval]
      -----------------------------+------------------------------------------------
      ID: Identity                 |
                        var(_cons) |   .3812044   .0497715      .2951357    .4923727
      -----------------------------+------------------------------------------------
                     var(Residual) |   1.318926    .045096      1.233436    1.410341
      ------------------------------------------------------------------------------
      LR test vs. linear regression: chibar2(01) =   201.38 Prob >= chibar2 = 0.0000
      and then:

      Code:
       margins complexity##configuration
      
      Predictive margins                                Number of obs   =       1961
      
      Expression   : Linear prediction, fixed portion, predict()
      
      ------------------------------------------------------------------------------------------
                               |            Delta-method
                               |     Margin   Std. Err.      z    P>|z|     [95% Conf. Interval]
      -------------------------+----------------------------------------------------------------
                    complexity |
                            1  |   4.677415   .0536867    87.12   0.000     4.572191    4.782639
                            2  |   5.981458   .0536852   111.42   0.000     5.876237    6.086679
                               |
                 configuration |
                            1  |   5.318237   .0535888    99.24   0.000     5.213205    5.423269
                            2  |   5.340071   .0537934    99.27   0.000     5.234638    5.445504
                               |
      complexity#configuration |
                          1 1  |   4.552805   .0648777    70.18   0.000     4.425647    4.679963
                          1 2  |   4.803173   .0651668    73.71   0.000     4.675449    4.930898
                          2 1  |   6.084449   .0649192    93.72   0.000      5.95721    6.211689
                          2 2  |   5.877517   .0651479    90.22   0.000     5.749829    6.005204
      ------------------------------------------------------------------------------------------
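      To check my own understanding, the cell margins above can be tied back to the coefficient table by hand (plain arithmetic on the numbers shown, not Stata output): the difference between configurations at complexity=1 should equal the 2.configuration main effect, and at complexity=2 it should equal the main effect plus the interaction.

```python
# Coefficients from the xtmixed output
b_config = 0.2503679    # 2.configuration
b_inter = -0.4573005    # 2.complexity#2.configuration

# Cell margins from -margins complexity##configuration-
m11, m12 = 4.552805, 4.803173
m21, m22 = 6.084449, 5.877517

# Effect of configuration at complexity = 1: the main-effect coefficient alone
print(m12 - m11)        # ~ 0.250368 = b_config
# Effect of configuration at complexity = 2: main effect plus interaction
print(m22 - m21)        # ~ -0.206932 = b_config + b_inter
```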
      Do you think this approach is suitable? I have trouble explaining why the combination of complexity=2 and configuration=2 has a mean of M = 5.87 while in the model it has a negative effect of b = -.45. I thought 2 2 means “relative to the reference category 1 1”, but that does not make sense, as the mean of 1 1 is M = 4.67. How would you interpret this effect then? Is it that the effect of increasing complexity “overshadows” the interaction effect?

      Furthermore, what is the difference between running -xtmixed- and RM anova in this case? I know that LMM would be more suitable if I had a different number of evaluations per ID, right?

      so if I run RM anova, it looks as follows:
      Code:
      anova aesthetics complexity##configuration component ID, repeated (complexity configuration component)
      
                                 Number of obs =    1961     R-squared     =  0.4616
                                 Root MSE      = 1.14927     Adj R-squared =  0.3818
      
                        Source |  Partial SS    df       MS           F     Prob > F
        -----------------------+----------------------------------------------------
                         Model |  1933.17466   253  7.64100658       5.79     0.0000
                               |
                    complexity |  835.180245     1  835.180245     632.32     0.0000
                   configura~n |  .250595654     1  .250595654       0.19     0.6632
        complexity#configura~n |  26.3041669     1  26.3041669      19.91     0.0000
                     component |  .551460671     1  .551460671       0.42     0.5183
                            ID |  1076.43375   249  4.32302709       3.27     0.0000
                               |
                      Residual |  2254.64481  1707  1.32082297  
        -----------------------+----------------------------------------------------
                         Total |  4187.81948  1960  2.13664259  
      
      
      Between-subjects error term:  ID
                           Levels:  250       (249 df)
           Lowest b.s.e. variable:  ID
      
      Repeated variable: complexity
                                                Huynh-Feldt epsilon        =  1.0000
                                                Greenhouse-Geisser epsilon =  1.0000
                                                Box's conservative epsilon =  1.0000
      
                                                  ------------ Prob > F ------------
                        Source |     df      F    Regular    H-F      G-G      Box
        -----------------------+----------------------------------------------------
                    complexity |      1   632.32   0.0000   0.0000   0.0000   0.0000
                      Residual |   1707
        ----------------------------------------------------------------------------
      
      Repeated variable: configura~n
                                                Huynh-Feldt epsilon        =  1.0000
                                                Greenhouse-Geisser epsilon =  1.0000
                                                Box's conservative epsilon =  1.0000
      
                                                  ------------ Prob > F ------------
                        Source |     df      F    Regular    H-F      G-G      Box
        -----------------------+----------------------------------------------------
                   configura~n |      1     0.19   0.6632   0.6632   0.6632   0.6632
                      Residual |   1707
        ----------------------------------------------------------------------------
      
      Repeated variables: complexity#configura~n
                                                Huynh-Feldt epsilon        =  1.0000
                                                Greenhouse-Geisser epsilon =  1.0000
                                                Box's conservative epsilon =  1.0000
      
                                                  ------------ Prob > F ------------
                        Source |     df      F    Regular    H-F      G-G      Box
        -----------------------+----------------------------------------------------
        complexity#configura~n |      1    19.91   0.0000   0.0000   0.0000   0.0000
                      Residual |   1707
        ----------------------------------------------------------------------------
      
      Repeated variable: component
                                                Huynh-Feldt epsilon        =  1.0000
                                                Greenhouse-Geisser epsilon =  1.0000
                                                Box's conservative epsilon =  1.0000
      
                                                  ------------ Prob > F ------------
                        Source |     df      F    Regular    H-F      G-G      Box
        -----------------------+----------------------------------------------------
                     component |      1     0.42   0.5183   0.5183   0.5183   0.5183
                      Residual |   1707
        ----------------------------------------------------------------------------
      Now I am struggling with the question of which method I should use here. Is there anything I am missing? Any help would be much appreciated! Thanks, Anna





      • #4
        I have trouble explaining why the combination of complexity=2 and configuration=2 has a mean of M = 5.87 while in the model it has a negative effect of b = -.45. I thought 2 2 means “relative to the reference category 1 1”, but that does not make sense, as the mean of 1 1 is M = 4.67. How would you interpret this effect then?
        The effect of complexity&configuration 2 2 relative to the reference category 1 1 includes not just the 2.complexity#2.configuration interaction term but also the 2.complexity and 2.configuration main effects. So if you look at the difference in the margins, 5.877517 - 4.552805 = 1.324712. And if you sum the coefficients 2.complexity + 2.configuration + 2.complexity#2.configuration, you will find you get 1.3247114. So everything aligns properly.
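        That decomposition is easy to verify with the numbers from the output above (plain arithmetic, not Stata code):

```python
# Fixed-effect coefficients from the xtmixed output
b_complexity = 1.531644        # 2.complexity
b_configuration = 0.2503679    # 2.configuration
b_interaction = -0.4573005     # 2.complexity#2.configuration

# Margins for cells 2 2 and 1 1
m22, m11 = 5.877517, 4.552805

# The 2 2 vs 1 1 contrast is the sum of both main effects plus the interaction
coef_sum = b_complexity + b_configuration + b_interaction
margin_diff = m22 - m11
print(coef_sum, margin_diff)  # both ~ 1.32471
```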

        As for the choice between RM anova and mixed regression, I have a strong bias in favor of mixed regression because it is tolerant of missing data (though apparently you don't have this problem, as both models ran with the same N), and because it dispenses with stringent assumptions such as compound symmetry (sphericity), and therefore does not need approximate adjustments such as Greenhouse-Geisser and Huynh-Feldt. Moreover, mixed effects modeling can be applied to much more complicated situations than RM anova can--so if you are going to invest your time and energy learning one technique or the other, the payoff of knowing mixed regression seems much larger.

        That said, when the restrictive assumptions of RM anova are met and the data are complete, the two underlying models are algebraically the same. Differences in the results they produce are due to different methods of estimation and different parameterizations of the model. And in some disciplines there may be a strong preference for one or the other.
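        For intuition on that equivalence: the random-intercept model implies a compound-symmetry covariance among a subject's 8 evaluations, which is exactly the structure RM anova assumes. Using the variance components from the xtmixed output (a hand check, not part of the original posts):

```python
# Variance components from the xtmixed output
var_id = 0.3812044      # var(_cons): between-subject variance
var_resid = 1.318926    # var(Residual): within-subject variance

# Under the random-intercept model, any two evaluations of the same subject
# share covariance var_id, and each evaluation has variance var_id + var_resid,
# so the within-subject correlation (ICC) is constant for all pairs.
icc = var_id / (var_id + var_resid)
print(round(icc, 3))  # intraclass correlation, ~0.224
```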

        All else equal, though, I lean towards mixed regression. There are others on this forum, however, who have a deeper understanding of the underpinnings of these models and might offer a more authoritative opinion.
