  • Hausman test with year dummies, V_b-V_B is not positive definite

    Dear Statalist,

    I am doing a Hausman test in order to test the RE estimator against the FE estimator.
    The dependent variable in the model is log(hourly_wage) and I have a set of explanatory variables (22 regressors in the FE model and 29 in the RE model) - among these variables there are nine year dummies.
    My dataset is an unbalanced panel with approximately 10,000 individuals and T=10. The panel is unbalanced because I only have observations for employed people whose hourly wage is available in the data.

    When testing the RE against FE estimator I write the following in Stata:

    (Command 1)
    xtreg y d2002 d2003 d2004 d2005 d2006 d2007 d2008 d2009 x1-x22, fe
    estimates store fixed
    xtreg y d2002 d2003 d2004 d2005 d2006 d2007 d2008 d2009 x1-x13, re
    estimates store random
    hausman fixed random, sigmamore

    When I run this I get chi2(22)=959.86 and Prob>chi2=0.0000 and the message "V_b-V_B is not positive definite".

    In Wooldridge, "Econometric Analysis of Cross Section and Panel Data" (2010), page 333, I found the following:
    "To summarize, we can estimate models that include aggregate time effects, time constant variables, and regressors that change across both i and t, by RE and FE estimation. But no matter how we compute a test statistic, we can only compare the coefficients on the regressors that change across both i and t. "

    I ran a joint F test and found that the nine year dummies are jointly significantly different from 0 and hence should be included in the FE and RE models.
    My reading of the Wooldridge passage is that I cannot include my year dummies in the two models when performing a Hausman test. Or is this the wrong way to read it?
    If I run the same commands as above, but without the nine year dummies, I get the following:

    (Command 2)
    xtreg y x1-x22, fe
    estimates store fixed
    xtreg y x1-x13, re
    estimates store random
    hausman fixed random, sigmamore

    chi2(13)=623.32 and Prob>chi2=0.0000.
    Hence the "not positive definite" message is gone.

    My question is how to handle this situation? Do I:
    (1) Keep the year dummies in the model as in (Command 1) above and report that the variance difference is not positive definite, so that I cannot use the test? OR
    (2) Remove the dummy variables from the models as in (Command 2), use the test, and thereby go for the FE estimator?

    I hope that someone can help me answer this question or maybe just help me realize something relevant that I have missed in the process.
    Please let me know if there is anything crucial you need to know in order to help me solve this problem.

    Thank you.

    Kind Regards,
    Cath










  • #2
    Cathrine:
    you may want to take a look at the following thread: http://www.statalist.org/forums/foru...d-with-haumsna
    Otherwise, you may want to search the web for the following string: -augmented regression AND vince wiggins- for a possible solution to your problem.
    To save you some searching, the Stata thread I meant can be found at: http://www.stata.com/statalist/archi.../msg00053.html.
    Last edited by Carlo Lazzaro; 13 May 2015, 12:24.
    Kind regards,
    Carlo
    (StataNow 18.5)



    • #3
      I should've been clearer in my book. You should obtain the FE and RE estimates including the year dummies, but you cannot include them in the comparison -- just like you cannot include the RE coefficients on the time-constant variables. It can be a bit cumbersome to do that using the traditional Hausman approach. An easier approach is to use the Mundlak device. Compute the time averages of all variables that have some variation across i and t, and add these to the RE estimation. Then do a joint test on just the time averages. It is just a test of exclusion restrictions. A good reason for using the regression-based test is that it is easy to make it cluster robust. It makes no sense to compute robust standard errors for FE and RE but then use a nonrobust Hausman test.

      My best discussions of this over the past couple of years (in my courses) do not appear on line, but the link contains something close. The relevant material starts on slide 43. In my example, I have only one variable that changes across i and t, and so only one time average is included, and then a cluster-robust t statistic can be used. In general, it is a cluster-robust Wald statistic. Note that the coefficients on the time-constant variables and year dummies are not tested.

      I hope this helps. JW
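
      For readers following along, the regression-based (Mundlak) test described above could be sketched roughly as follows. This is only a minimal sketch under stated assumptions, not the exact procedure from the slides: the names y, x1, x2, id and year are placeholders that do not come from the thread's data, and x1 and x2 stand for the regressors that vary across both i and t.

      Code:
      * Placeholder names (not from the thread): y, x1, x2, id, year
      xtset id year

      * FE fit, used here only to pin down the estimation sample
      xtreg y x1 x2 i.year, fe
      generate byte insample = e(sample)

      * Time averages of the regressors that vary across both i and t,
      * computed over the estimation sample because the panel is unbalanced
      foreach v of varlist x1 x2 {
          bysort id: egen double `v'_bar = mean(cond(insample, `v', .))
      }

      * RE with the time averages added, cluster-robust at the individual level
      xtreg y x1 x2 x1_bar x2_bar i.year if insample, re vce(cluster id)

      * Joint Wald test on the time averages only (exclusion restrictions);
      * the year dummies and any time-constant variables are not tested
      test x1_bar x2_bar

      A rejection of the joint null on the time averages plays the same role as a rejection in the traditional Hausman test, that is, it is evidence against RE. The year dummies stay in the model but are left out of the test.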



      • #4
        Dear Wooldridge,

        Thank you for your comprehensive response.
        I followed your guidelines carefully and found the slideshow in your link very helpful - it was a great solution to my problem.
        Also, the section in your book ended up being very helpful after I got a deeper understanding of the problem.

        Thank you for taking the time to help me. And thank you for a great book.

        Kind regards,
        Cathrine



        • #5
          Originally posted by Jeff Wooldridge:
          I should've been clearer in my book. You should obtain the FE and RE estimates including the year dummies, but you cannot include them in the comparison -- just like you cannot include the RE coefficients on the time-constant variables. It can be a bit cumbersome to do that using the traditional Hausman approach. An easier approach is to use the Mundlak device. Compute the time averages of all variables that have some variation across i and t, and add these to the RE estimation. Then do a joint test on just the time averages. It is just a test of exclusion restrictions. A good reason for using the regression-based test is that it is easy to make it cluster robust. It makes no sense to compute robust standard errors for FE and RE but then use a nonrobust Hausman test.

          My best discussions of this over the past couple of years (in my courses) do not appear on line, but the link contains something close. The relevant material starts on slide 43. In my example, I have only one variable that changes across i and t, and so only one time average is included, and then a cluster-robust t statistic can be used. In general, it is a cluster-robust Wald statistic. Note that the coefficients on the time-constant variables and year dummies are not tested.

          I hope this helps. JW
          Dear Mr Wooldridge, just to clarify: you say we can use time dummies in both the FE and RE models, but that we cannot include them in the comparison.

          Do you mean:

          1) we run the Hausman test, including the time dummy in both RE and FE, but disregard the coefficients of the time dummy when we display the comparative output between FE and RE

          Code:
          xtreg lntobinsq lnassets FXDerivatives10 IRDerivatives10  bookleverage_w1 roa_w1 zscore_w1 cratio_w1 rnd_rev cas
          > h_to_totalassets div_yield roce_w1 year2016 if inlist(year,2015,2016), fe
          
          Fixed-effects (within) regression               Number of obs     =        539
          Group variable: firmid                          Number of groups  =        282
          
          R-sq:                                           Obs per group:
               within  = 0.3234                                         min =          1
               between = 0.2051                                         avg =        1.9
               overall = 0.2155                                         max =          2
          
                                                          F(12,245)         =       9.76
          corr(u_i, Xb)  = -0.6938                        Prob > F          =     0.0000
          
          -------------------------------------------------------------------------------------
                    lntobinsq |      Coef.   Std. Err.      t    P>|t|     [95% Conf. Interval]
          --------------------+----------------------------------------------------------------
                     lnassets |  -.4007886   .0731027    -5.48   0.000    -.5447786   -.2567987
              FXDerivatives10 |   .1024007    .074275     1.38   0.169    -.0438983    .2486997
              IRDerivatives10 |  -.1037169     .07521    -1.38   0.169    -.2518575    .0444237
              bookleverage_w1 |    .207787   .1239924     1.68   0.095    -.0364401    .4520141
                       roa_w1 |   .0210163   .0056622     3.71   0.000     .0098636    .0321691
                    zscore_w1 |    .009197   .0148501     0.62   0.536    -.0200532    .0384471
                    cratio_w1 |  -.0691649   .0227068    -3.05   0.003    -.1138904   -.0244394
                      rnd_rev |  -.0119345    .008226    -1.45   0.148    -.0281371    .0042681
          cash_to_totalassets |   .3159089   .2088175     1.51   0.132    -.0953977    .7272156
                    div_yield |  -.0212811   .0032873    -6.47   0.000    -.0277562   -.0148061
                      roce_w1 |   .0004484   .0004762     0.94   0.347    -.0004894    .0013863
                     year2016 |   .0223946   .0147759     1.52   0.131    -.0067093    .0514986
                        _cons |   3.311415   .5163692     6.41   0.000     2.294325    4.328504
          --------------------+----------------------------------------------------------------
                      sigma_u |  .68593601
                      sigma_e |  .12732622
                          rho |   .9666914   (fraction of variance due to u_i)
          -------------------------------------------------------------------------------------
          F test that all u_i=0: F(281, 245) = 7.06                    Prob > F = 0.0000
          
          . estimates store fixed
          
          . xtreg lntobinsq lnassets FXDerivatives10 IRDerivatives10  bookleverage_w1 roa_w1 zscore_w1 cratio_w1 rnd_rev cas
          > h_to_totalassets div_yield roce_w1 year2016 if inlist(year,2015,2016), re
          
          Random-effects GLS regression                   Number of obs     =        539
          Group variable: firmid                          Number of groups  =        282
          
          R-sq:                                           Obs per group:
               within  = 0.2057                                         min =          1
               between = 0.7959                                         avg =        1.9
               overall = 0.7772                                         max =          2
          
                                                          Wald chi2(12)     =    1007.29
          corr(u_i, X)   = 0 (assumed)                    Prob > chi2       =     0.0000
          
          -------------------------------------------------------------------------------------
                    lntobinsq |      Coef.   Std. Err.      z    P>|z|     [95% Conf. Interval]
          --------------------+----------------------------------------------------------------
                     lnassets |  -.0324741   .0110764    -2.93   0.003    -.0541835   -.0107648
              FXDerivatives10 |   .0633603   .0341651     1.85   0.064     -.003602    .1303226
              IRDerivatives10 |  -.0627061   .0355131    -1.77   0.077    -.1323106    .0068983
              bookleverage_w1 |   .2832981   .0710122     3.99   0.000     .1441168    .4224794
                       roa_w1 |   .0412027   .0038409    10.73   0.000     .0336746    .0487307
                    zscore_w1 |   .0689105   .0054544    12.63   0.000       .05822     .079601
                    cratio_w1 |  -.1147203   .0125482    -9.14   0.000    -.1393142   -.0901263
                      rnd_rev |   .0113815   .0029564     3.85   0.000      .005587     .017176
          cash_to_totalassets |   .3379642   .1460328     2.31   0.021     .0517453    .6241831
                    div_yield |  -.0313023   .0028731   -10.89   0.000    -.0369335   -.0256712
                      roce_w1 |   .0006734   .0003966     1.70   0.089    -.0001039    .0014507
                     year2016 |  -.0000337   .0122384    -0.00   0.998    -.0240206    .0239532
                        _cons |   .3886327   .0832307     4.67   0.000     .2255037    .5517618
          --------------------+----------------------------------------------------------------
                      sigma_u |  .22941333
                      sigma_e |  .12732622
                          rho |  .76450624   (fraction of variance due to u_i)
          -------------------------------------------------------------------------------------
          
          . hausman fixed .
          
                           ---- Coefficients ----
                       |      (b)          (B)            (b-B)     sqrt(diag(V_b-V_B))
                       |     fixed          .          Difference          S.E.
          -------------+----------------------------------------------------------------
              lnassets |   -.4007886    -.0324741       -.3683145        .0722587
          FXDerivat~10 |    .1024007     .0633603        .0390404        .0659509
          IRDerivat~10 |   -.1037169    -.0627061       -.0410108        .0662975
          booklevera~1 |     .207787     .2832981       -.0755111        .1016435
                roa_w1 |    .0210163     .0412027       -.0201863        .0041602
             zscore_w1 |     .009197     .0689105       -.0597135        .0138121
             cratio_w1 |   -.0691649    -.1147203        .0455554        .0189247
               rnd_rev |   -.0119345     .0113815        -.023316        .0076763
          cash_to_to~s |    .3159089     .3379642       -.0220553        .1492622
             div_yield |   -.0212811    -.0313023        .0100212        .0015975
               roce_w1 |    .0004484     .0006734        -.000225        .0002635
              year2016 |    .0223946    -.0000337        .0224283        .0082794
          ------------------------------------------------------------------------------
                                     b = consistent under Ho and Ha; obtained from xtreg
                      B = inconsistent under Ha, efficient under Ho; obtained from xtreg
          
              Test:  Ho:  difference in coefficients not systematic
          
                           chi2(12) = (b-B)'[(V_b-V_B)^(-1)](b-B)
                                    =      112.45
                          Prob>chi2 =      0.0000
                          (V_b-V_B is not positive definite)
          
          .


          and then remove the year dummy (year2016 in my case) when comparing the coefficient estimates of the two models

          or

          2) run the hausman without the time dummies in the first place

          Code:
          . xtreg lntobinsq lnassets FXDerivatives10 IRDerivatives10  bookleverage_w1 roa_w1 zscore_w1 cratio_w1 rnd_rev cas
          > h_to_totalassets div_yield roce_w1 if inlist(year,2015,2016), re
          
          Random-effects GLS regression                   Number of obs     =        539
          Group variable: firmid                          Number of groups  =        282
          
          R-sq:                                           Obs per group:
               within  = 0.2055                                         min =          1
               between = 0.7960                                         avg =        1.9
               overall = 0.7773                                         max =          2
          
                                                          Wald chi2(11)     =    1013.56
          corr(u_i, X)   = 0 (assumed)                    Prob > chi2       =     0.0000
          
          -------------------------------------------------------------------------------------
                    lntobinsq |      Coef.   Std. Err.      z    P>|z|     [95% Conf. Interval]
          --------------------+----------------------------------------------------------------
                     lnassets |  -.0323747   .0109744    -2.95   0.003     -.053884   -.0108653
              FXDerivatives10 |   .0632071   .0340612     1.86   0.063    -.0035517     .129966
              IRDerivatives10 |  -.0625031   .0353601    -1.77   0.077    -.1318076    .0068013
              bookleverage_w1 |   .2834209   .0708196     4.00   0.000     .1446171    .4222247
                       roa_w1 |    .041279   .0038357    10.76   0.000     .0337612    .0487969
                    zscore_w1 |   .0689087   .0054195    12.71   0.000     .0582867    .0795307
                    cratio_w1 |  -.1146159   .0125258    -9.15   0.000    -.1391661   -.0900658
                      rnd_rev |   .0113927   .0029476     3.87   0.000     .0056156    .0171699
          cash_to_totalassets |    .337644   .1457822     2.32   0.021     .0519162    .6233718
                    div_yield |  -.0313312   .0028682   -10.92   0.000    -.0369528   -.0257095
                      roce_w1 |   .0006744   .0003958     1.70   0.088    -.0001013    .0014501
                        _cons |    .387308   .0829653     4.67   0.000     .2246989    .5499171
          --------------------+----------------------------------------------------------------
                      sigma_u |  .22881994
                      sigma_e |  .12766146
                          rho |  .76262169   (fraction of variance due to u_i)
          -------------------------------------------------------------------------------------
          
          . hausman fixed .
          
                           ---- Coefficients ----
                       |      (b)          (B)            (b-B)     sqrt(diag(V_b-V_B))
                       |     fixed          .          Difference          S.E.
          -------------+----------------------------------------------------------------
              lnassets |   -.3381877    -.0323747        -.305813        .0594705
          FXDerivat~10 |    .0952812     .0632071         .032074        .0660569
          IRDerivat~10 |   -.1214566    -.0625031       -.0589535        .0655615
          booklevera~1 |    .1692325     .2834209       -.1141884        .0989407
                roa_w1 |     .020826      .041279       -.0204531        .0041834
             zscore_w1 |     .007085     .0689087       -.0618237        .0137973
             cratio_w1 |   -.0695064    -.1146159        .0451095        .0190097
               rnd_rev |   -.0109449     .0113927       -.0223376         .007675
          cash_to_to~s |     .347916      .337644         .010272        .1487747
             div_yield |   -.0214717    -.0313312        .0098594        .0016189
               roce_w1 |    .0003598     .0006744       -.0003146        .0002605
          ------------------------------------------------------------------------------
                                     b = consistent under Ho and Ha; obtained from xtreg
                      B = inconsistent under Ha, efficient under Ho; obtained from xtreg
          
              Test:  Ho:  difference in coefficients not systematic
          
                           chi2(11) = (b-B)'[(V_b-V_B)^(-1)](b-B)
                                    =      108.52
                          Prob>chi2 =      0.0000

          I would prefer to keep the time dummies in, though, i.e. option 1). I just want to know if that's OK?

          Thanks so much, I'm a big fan of your work.
