  • Margins for unbalanced panel giving inaccurate estimates

    After xtreg, fe on an unbalanced panel, margins, by(year) returns year-specific predictions that do not match the raw year means of the outcome, whereas the two agree when the panel is balanced. Does anyone know why? Here is an example in Stata 17:

    Code:
    *1. Create unbalanced panel
    webuse nlswork.dta, clear
    drop if year>69 | hours==.
    xtset idcode year
    drop if year==69 & L.year==.
    tab year
    
    *2. Test margins for unbalanced panel
    xtreg hours year, fe
    margins, by(year)
    bysort year: su hours
    
    *3. Create balanced panel
    xtset idcode year
    drop if year==68 & F.year==.
    
    *4. Test margins for balanced panel
    xtreg hours year, fe
    margins, by(year)
    bysort year: su hours

    Below is an excerpt from the output for the unbalanced panel. The margins by year do not match the year means reported by summarize.

    Code:
    . margins, by(year)
    
    Predictive margins                                       Number of obs = 2,224
    Model VCE: Conventional
    
    Expression: Linear prediction, predict()
    Over:       year
    
    ------------------------------------------------------------------------------
                 |            Delta-method
                 |     Margin   std. err.      z    P>|z|     [95% conf. interval]
    -------------+----------------------------------------------------------------
            year |
             68  |   37.54182   .1774929   211.51   0.000     37.19394     37.8897
             69  |    38.1724   .2313402   165.01   0.000     37.71899    38.62582
    ------------------------------------------------------------------------------
    
    . bysort year: su hours
    
    ------------------------------------------------------------------------------------------------
    -> year = 68
    
        Variable |        Obs        Mean    Std. dev.       Min        Max
    -------------+---------------------------------------------------------
           hours |      1,374    37.35007    9.303987          1         84
    
    ------------------------------------------------------------------------------------------------
    -> year = 69
    
        Variable |        Obs        Mean    Std. dev.       Min        Max
    -------------+---------------------------------------------------------
           hours |        850    38.48235    7.161952          2         70

  • #2
    Code:
    *1. Create unbalanced panel
    webuse nlswork.dta, clear
    drop if year>69 | hours==.
    xtset idcode year
    drop if year==69 & L.year==.
    tab year
    
    *2. Test margins for unbalanced panel
    xtreg hours year, fe
    predict hhat
    margins, by(year)
    bysort year: su hours hhat
    Code:
    . predict hhat
    (option xb assumed; fitted values)
    
    . margins, by(year)
    
    Predictive margins                                       Number of obs = 2,224
    Model VCE: Conventional
    
    Expression: Linear prediction, predict()
    Over:       year
    
    ------------------------------------------------------------------------------
                 |            Delta-method
                 |     Margin   std. err.      z    P>|z|     [95% conf. interval]
    -------------+----------------------------------------------------------------
            year |
             68  |   37.54182   .1774929   211.51   0.000     37.19394     37.8897
             69  |    38.1724   .2313402   165.01   0.000     37.71899    38.62582
    ------------------------------------------------------------------------------
    
    . bysort year: su hours hhat
    
    ------------------------------------------------------------------------------------------------
    -> year = 68
    
        Variable |        Obs        Mean    Std. dev.       Min        Max
    -------------+---------------------------------------------------------
           hours |      1,374    37.35007    9.303987          1         84
            hhat |      1,374    37.54182           0   37.54182   37.54182
    
    ------------------------------------------------------------------------------------------------
    -> year = 69
    
        Variable |        Obs        Mean    Std. dev.       Min        Max
    -------------+---------------------------------------------------------
           hours |        850    38.48235    7.161952          2         70
            hhat |        850    38.17241           0   38.17241   38.17241
    
    
    .

    • #3
      Dear William, thank you very much for your response. Perhaps I can put the question a little differently: why does the mean of hhat for 1968 (37.54182) differ from the actual mean of hours (37.35007) when the panel is unbalanced, but not when it is balanced? Thank you!

      • #4
        The short answer is, because mathematics.

        There's no reason to expect the means of the predictions to equal the means of the data within separate subgroups in general. The math works out that way in balanced panels. I'm sure my linear models textbooks provide a more complete answer, but I'm about 50 years away from them so I can't do better than that.
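
        As a rough sketch of the algebra for the two-year example in this thread (the notation is introduced here for illustration and is not from the original posts): assume the constant reported by xtreg, fe is chosen so that the predictions average to the overall mean of hours, and that the slope on year equals the average within-person change among people observed in both years, so people observed only in 1968 do not affect it. Under those assumptions the gap between the predicted and the raw 1968 mean is:

        ```latex
        % Notation (illustrative, not from the original posts):
        %   \hat\beta                        slope on year from xtreg, fe
        %   \bar y, \bar y_{68}, \bar y_{69}  overall and year-specific sample means of hours
        %   \hat y_{68}, \hat y_{69}          predicted values (xb) in each year
        %   p                                share of observations that fall in year 69
        \begin{align*}
        \hat y_{68} &= \bar y - p\,\hat\beta, \qquad
        \hat y_{69} = \bar y + (1-p)\,\hat\beta, \\
        \bar y &= (1-p)\,\bar y_{68} + p\,\bar y_{69}, \\
        \hat y_{68} - \bar y_{68} &= p\,\bigl[(\bar y_{69} - \bar y_{68}) - \hat\beta\bigr].
        \end{align*}
        ```

        In a balanced two-year panel the same people appear in both years, so the average within-person change equals the difference in year means and the bracket is zero. In the unbalanced panel the people observed only in 1968 move the raw 1968 mean but not the slope, so the bracket is generally nonzero.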

        Note that even with the unbalanced panel, the mean of the predictions is equal to the mean of the data across the entire population.
        Code:
        . margins
        
        Predictive margins                                       Number of obs = 2,224
        Model VCE: Conventional
        
        Expression: Linear prediction, predict()
        
        ------------------------------------------------------------------------------
                     |            Delta-method
                     |     Margin   std. err.      z    P>|z|     [95% conf. interval]
        -------------+----------------------------------------------------------------
               _cons |   37.78282   .1336238   282.76   0.000     37.52093    38.04472
        ------------------------------------------------------------------------------
        
        . su hours hhat
        
            Variable |        Obs        Mean    Std. dev.       Min        Max
        -------------+---------------------------------------------------------
               hours |      2,224    37.78282    8.564909          1         84
                hhat |      2,224    37.78282    .3064868   37.54182   38.17241
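
        As an arithmetic check using only the numbers reported in this thread, the observation-weighted averages of the raw year means and of the predicted year means both come out at the overall mean, even though the year-to-year gaps differ:

        ```latex
        \begin{align*}
        \frac{1374 \times 37.35007 + 850 \times 38.48235}{2224} &\approx 37.7828
          && \text{(raw year means of hours)} \\
        \frac{1374 \times 37.54182 + 850 \times 38.17241}{2224} &\approx 37.7828
          && \text{(predicted year means, hhat)} \\
        38.48235 - 37.35007 &= 1.13228
          && \text{(raw gap between years)} \\
        38.17241 - 37.54182 &= 0.63059
          && \text{(predicted gap between years)}
        \end{align*}
        ```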

        • #5
          Hi William, thank you so much, that helps to clear things up.
