  • How does the 'mixed' command process the group without variation in an indicator when estimating the fixed effect of this indicator?

    Hi all,

    I have a question regarding the coefficient estimation for multilevel linear models.

    Consider a nested dataset in which all observations in group A have the same value of the indicator X, while all other groups show variation in this indicator. When estimating an MLM using the 'mixed' command, how is the fixed effect of the indicator X calculated? Are the observations of group A (which all share the same X value) included when calculating the coefficient of X?

    I checked the Stata manual but did not find answers. I appreciate any response and references!

    Best,
    Hong

  • #2
    Yes, those observations are included. -mixed- does not estimate within-group effects. So the absence of variation of X within a group of observations (or several groups of observations, as long as there is some variation in X somewhere in the data) calls for no special treatment. The presence of groups where X does not vary does diminish the effective sample size that powers the estimation of X, so your standard error is likely to be larger than it would be without such groups in the data. But that's just the natural consequence of the usual estimation. This X is not treated differently from any other.
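
A quick way to check this is to simulate a small nested dataset in which one group has no variation in X and then inspect the estimation sample. This is only a sketch: the variable names (group, x, y, u) and all numeric values are made up for illustration.

```stata
clear
set seed 2024
set obs 40                          // 40 hypothetical groups
generate int group = _n
generate double u = rnormal()       // group-level random intercept
expand 8                            // 8 observations per group
generate double x = rnormal()
replace x = 0 if group == 1         // group 1: X does not vary
generate double y = 2 + 0.5*x + u + rnormal()

mixed y x || group:

// e(N) is the number of observations used; e(sample) flags each of them
display "Observations used by -mixed-: " e(N) " of " _N
count if e(sample) & group == 1     // observations from the no-variation group
```

If the no-variation group were dropped, e(N) would be smaller than _N and the final count would be zero.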

    • #3
      I agree with Clyde's response here. If you wanted mixed to give you a purely within-group estimate of X, you can simply include the group mean of X as a second predictor. Having done so, the coefficient for X is now the within-group estimate of Y on X adjusting for covariates. The coefficient for the group mean of X is the difference between the within and between group effect of Y on X.
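
A minimal sketch of this approach on simulated data follows. The variable names and parameter values are assumptions for illustration. The -xtreg, fe- call is added only as a cross-check: its coefficient on x should match the coefficient on x from -mixed- once the group mean is in the model.

```stata
clear
set seed 2024
set obs 40                              // 40 hypothetical groups
generate int group = _n
generate double u = rnormal()           // group-level random intercept
expand 8                                // 8 observations per group
generate double x = 0.5*u + rnormal()   // X correlated with the group effect
replace x = 0 if group == 1             // one group without variation in X
generate double y = 2 + 0.5*x + u + rnormal()

// group mean of X as a second predictor
bysort group: egen double x_mean = mean(x)
mixed y x x_mean || group:
estimates store m_groupmean

// cross-check: fixed-effects (within) estimator
xtset group
xtreg y x, fe
```

Because x is built to be correlated with the group effect here, the coefficient on x_mean should come out clearly different from zero, signalling that the within and between effects differ in this simulated data.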

      • #4
        Originally posted by Clyde Schechter
        Yes, those observations are included. -mixed- does not estimate within-group effects. So the absence of variation of X within a group of observations (or several groups of observations, as long as there is some variation in X somewhere in the data) calls for no special treatment. The presence of groups where X does not vary does diminish the effective sample size that powers the estimation of X, so your standard error is likely to be larger than it would be without such groups in the data. But that's just the natural consequence of the usual estimation. This X is not treated differently from any other.
        Hi Clyde,

        Many thanks for your quick answer.

        I am still a bit confused. Observations within groups with unvaried X are ineffective samples and result in a reduced effective sample size. Does it mean that these observations do not contribute to the coefficient calculation of X, or are they the same as other observations for calculating the coefficient but are not counted as effective sample size, which makes the standard error bigger (only influences the significance)?

        Best,
        Hong

        • #5
          Originally posted by Erik Ruzek
          I agree with Clyde's response here. If you wanted mixed to give you a purely within-group estimate of X, you can simply include the group mean of X as a second predictor. Having done so, the coefficient for X is now the within-group estimate of Y on X adjusting for covariates. The coefficient for the group mean of X is the difference between the within and between group effect of Y on X.
          Thank you Erik!

          If I understand right, without adding the group mean of X as a second predictor, the coefficient of X is a combination of both within-group effects and between-group effects. Although observations within groups without varied X do not contribute to the within-group effects, they are still necessary for the between-group effects and overall effect (the coefficient) of X.

          Best,
          Hong
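
One way to look at this blending is to put the within, between, and -mixed- estimates side by side on simulated data. This is a sketch under assumed names and values, not a statement about any real dataset. In this balanced, single-predictor setup the -mixed- coefficient should fall between the other two.

```stata
clear
set seed 2024
set obs 40                              // 40 hypothetical groups
generate int group = _n
generate double u = rnormal()           // group-level random intercept
expand 8                                // 8 observations per group
generate double x = 0.5*u + rnormal()   // makes within and between effects differ
replace x = 0 if group == 1             // one group without variation in X
generate double y = 2 + 0.5*x + u + rnormal()

mixed y x || group:
scalar b_mixed = _b[x]

xtset group
xtreg y x, fe                           // within-group estimator
scalar b_within = _b[x]
xtreg y x, be                           // between-group estimator
scalar b_between = _b[x]

display "within = " b_within "   mixed = " b_mixed "   between = " b_between
```

Group 1 adds nothing to the within estimate, since x minus its group mean is zero there, but its group mean still enters the between estimate.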

          • #6
            Observations within groups with unvaried X are ineffective samples and result in a reduced effective sample size. Does it mean that these observations do not contribute to the coefficient calculation of X, or are they the same as other observations for calculating the coefficient but are not counted as effective sample size, which makes the standard error bigger (only influences the significance)?
            Sorry for not being clearer, and especially for using that language about "diminishing the effective sample size"--that's really not the right way of thinking about it. The observations in non-X-varying groups are absolutely the same as other observations for calculating the coefficient. And there is no actual "counting" for effective sample size: it's just that if you have many observations with the same value of X (whether they are in identified groups or not), that will make the variance of X smaller than if there were many different values, and the smaller variance of X, all else being equal, leads to higher standard errors and lower power.

            This would be true even in one-level models where there is no explicit grouping. Here's an illustration:
            Code:
            clear*
            sysuse auto
            
            //    USING THE ORIGINAL DATA
            summ mpg
            regress price mpg
            
            //    PICK HALF OF THE OBSERVATIONS AT RANDOM AND REPLACE THEM ALL BY THEIR
            //    MEAN VALUE
            set seed 1234
            gen double shuffle = runiform()
            sort shuffle
            summ mpg if _n*2 <= _N, meanonly
            replace mpg = r(mean) if _n*2 <= _N
            
            //    REPEAT THE CALCULATIONS WITH THE MODIFIED DATA
            summ mpg
            regress price mpg
            You can see that by replacing half of the values of mpg with the mean of those values, the mean of mpg remains unchanged, but the standard deviation has decreased considerably. In the regression of price on mpg, we see that while the coefficient has changed slightly, a decrease in magnitude of about 9%, the standard error has gone up by about 60%.
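
For reference, the link between the spread of X and the standard error can be read off the textbook formula for simple linear regression. This is a one-level sketch assuming homoskedastic errors, not the exact expression -mixed- uses, and the symbols are introduced here only for illustration: \sigma^2 is the error variance, s_x the sample standard deviation of X, and n the number of observations.

```latex
\operatorname{Var}(\hat\beta_1)
  = \frac{\sigma^2}{\sum_{i=1}^{n}(x_i-\bar x)^2}
  = \frac{\sigma^2}{(n-1)\,s_x^2},
\qquad
\operatorname{SE}(\hat\beta_1) = \frac{\hat\sigma}{s_x\sqrt{n-1}}
```

With n fixed, collapsing many values of X onto their mean shrinks s_x, so the standard error rises unless the residual standard deviation falls enough to compensate.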

            • #7
              It is very clear, thank you very much!
