Hausman test with year dummies, V_b-V_B is not positive definite

Cathrine Hansen

Join Date: May 2015

Posts: 2
#1

13 May 2015, 07:11

Dear Statalist,

I am running a Hausman test to compare the RE estimator with the FE estimator.
The dependent variable in the model is log(hourly_wage), and I have a set of explanatory variables (22 regressors in the FE model and 29 in the RE model), including nine year dummies.
My panel is unbalanced, with approximately 10,000 individuals (i) and T=10. It is unbalanced because I only have observations for employed people with an hourly wage available in the data.

When testing the RE against FE estimator I write the following in Stata:

(Command 1)
xtreg y d2002 d2003 d2004 d2005 d2006 d2007 d2008 d2009 x1-x22, fe
estimates store fixed
xtreg y d2002 d2003 d2004 d2005 d2006 d2007 d2008 d2009 x1-x13, re
estimates store random
hausman fixed random, sigmamore

When I run this I get chi2(22)=959.86 and Prob>chi2=0.0000 and the message "V_b-V_B is not positive definite".

In Wooldridge, "Econometric Analysis of Cross Section and Panel Data" (2010), page 333, I found the following:
"To summarize, we can estimate models that include aggregate time effects, time constant variables, and regressors that change across both i and t, by RE and FE estimation. But no matter how we compute a test statistic, we can only compare the coefficients on the regressors that change across both i and t. "

I ran a joint F test and found that the nine year dummies are jointly significantly different from 0, so they should be included in the FE and RE models.
What I take from the Wooldridge passage is that I cannot include my year dummies in the two models when performing a Hausman test. Or am I misreading it?
If I run the same commands as above, but without the nine year dummies, I get the following:

(Command 2)
xtreg y x1-x22, fe
estimates store fixed
xtreg y x1-x13, re
estimates store random
hausman fixed random, sigmamore

chi2(13)=623.32 and Prob>chi2=0.0000.
So the "not positive definite" message is gone.

My question is how to handle this situation. Do I:
(1) Keep the year dummies in the model as in (Command 1) above, and note that the variance matrix is not positive definite and that I therefore can't use the test? OR
(2) Remove the year dummies from the models as in (Command 2), use the test, and on that basis go with the FE estimator?

I hope that someone can help me answer this question or maybe just help me realize something relevant that I have missed in the process.
Please let me know if there is anything crucial you need to know in order to help me solve this problem.

Thank you.

Kind Regards,
Cath
Tags: hausman test, panel data, positiv definit, year dummy
Carlo Lazzaro

Join Date: Apr 2014

Posts: 17671
#2

13 May 2015, 11:12

Cathrine:
you may want to take a look at the following thread: http://www.statalist.org/forums/foru...d-with-haumsna
Otherwise, you may want to google with the following string: -augmented regression AND vince wiggins- for a possible solution to your problem.
To save you some time searching the web, the Stata thread I meant can be found at: http://www.stata.com/statalist/archi.../msg00053.html.

Last edited by Carlo Lazzaro; 13 May 2015, 11:24.

Kind regards,
Carlo
(StataNow 18.5)
Jeff Wooldridge

Join Date: Apr 2014

Posts: 2120
#3

13 May 2015, 12:57

I should've been clearer in my book. You should obtain the FE and RE estimates including the year dummies, but you cannot include them in the comparison -- just like you cannot include the RE coefficients on the time-constant variables. It can be a bit cumbersome to do that using the traditional Hausman approach. An easier approach is to use the Mundlak device. Compute the time averages of all variables that have some variation across i and t, and add these to the RE estimation. Then do a joint test on just the time averages. It is just a test of exclusion restrictions. A good reason for using the regression-based test is that it is easy to make it cluster robust. It makes no sense to compute robust standard errors for FE and RE but then use a nonrobust Hausman test.

My best discussions of this over the past couple of years (in my courses) do not appear on line, but the link contains something close. The relevant material starts on slide 43. In my example, I have only one variable that changes across i and t, and so only one time average is included, and then a cluster-robust t statistic can be used. In general, it is a cluster-robust Wald statistic. Note that the coefficients on the time-constant variables and year dummies are not tested.

I hope this helps. JW

http://www.iza.org
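
A minimal sketch of the regression-based (Mundlak) test described in this post. It is only an illustration under stated assumptions: the nlswork example data that can be loaded with -webuse- stands in for the wage panel, the regressor list and the touse marker are hypothetical choices, and the time averages are computed over the estimation sample because the panel is unbalanced.

```stata
* Illustrative data only (assumption): NLS young women panel from the Stata website
webuse nlswork, clear
xtset idcode year

* Hypothetical regressors that vary across both i and t
local xvars tenure hours union south

* Mark the estimation sample so the time averages use the same observations
generate byte touse = !missing(ln_wage, tenure, hours, union, south)

* Time average of each time-varying regressor within person
local xbars
foreach v of local xvars {
    bysort idcode (year): egen double `v'_bar = mean(`v') if touse
    local xbars `xbars' `v'_bar
}

* RE estimation with year dummies plus the time averages, cluster-robust by person
xtreg ln_wage `xvars' `xbars' i.year if touse, re vce(cluster idcode)

* Joint cluster-robust Wald test on the time averages only;
* the year dummies stay in the model but are not tested
test `xbars'
```

If the joint test rejects, the time averages belong in the model, which points towards FE. The year dummies, and any time-constant variables, remain in the regression but are not part of the test.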
Cathrine Hansen

Join Date: May 2015

Posts: 2
#4

02 Jun 2015, 06:41

Dear Wooldridge,

Thank you for your comprehensive response.
I followed your guidelines carefully and found the slideshow in your link very helpful - it was a great solution to my problem.
Also, the section in your book ended up being very helpful after I got a deeper understanding of the problem.

Thank you for taking the time to help me. And thank you for a great book.

Kind regards,
Cathrine

Prathvajeeth Rajmohan

Join Date: Aug 2017
Posts: 70

28 Aug 2017, 06:28

Originally posted by Jeff Wooldridge

I should've been clearer in my book. You should obtain the FE and RE estimates including the year dummies, but you cannot include them in the comparison -- just like you cannot include the RE coefficients on the time-constant variables. It can be a bit cumbersome to do that using the traditional Hausman approach. An easier approach is to use the Mundlak device. Compute the time averages of all variables that have some variation across i and t, and add these to the RE estimation. Then do a joint test on just the time averages. It is just a test of exclusion restrictions. A good reason for using the regression-based test is that it is easy to make it cluster robust. It makes no sense to compute robust standard errors for FE and RE but then use a nonrobust Hausman test.

My best discussions of this over the past couple of years (in my courses) do not appear on line, but the link contains something close. The relevant material starts on slide 43. In my example, I have only one variable that changes across i and t, and so only one time average is included, and then a cluster-robust t statistic can be used. In general, it is a cluster-robust Wald statistic. Note that the coefficients on the time-constant variables and year dummies are not tested.

I hope this helps. JW

Dear Mr Wooldridge, just to clarify: I understand that we can use time dummies in both the FE and RE models. But when you say we cannot include them in the comparison, do you mean:

1) we run the Hausman test, including the time dummy in both RE and FE, but disregard the coefficient on the time dummy when we display the comparative output between FE and RE

Code:

xtreg lntobinsq lnassets FXDerivatives10 IRDerivatives10  bookleverage_w1 roa_w1 zscore_w1 cratio_w1 rnd_rev cas
> h_to_totalassets div_yield roce_w1 year2016 if inlist(year,2015,2016), fe

Fixed-effects (within) regression               Number of obs     =        539
Group variable: firmid                          Number of groups  =        282

R-sq:                                           Obs per group:
     within  = 0.3234                                         min =          1
     between = 0.2051                                         avg =        1.9
     overall = 0.2155                                         max =          2

                                                F(12,245)         =       9.76
corr(u_i, Xb)  = -0.6938                        Prob > F          =     0.0000

-------------------------------------------------------------------------------------
          lntobinsq |      Coef.   Std. Err.      t    P>|t|     [95% Conf. Interval]
--------------------+----------------------------------------------------------------
           lnassets |  -.4007886   .0731027    -5.48   0.000    -.5447786   -.2567987
    FXDerivatives10 |   .1024007    .074275     1.38   0.169    -.0438983    .2486997
    IRDerivatives10 |  -.1037169     .07521    -1.38   0.169    -.2518575    .0444237
    bookleverage_w1 |    .207787   .1239924     1.68   0.095    -.0364401    .4520141
             roa_w1 |   .0210163   .0056622     3.71   0.000     .0098636    .0321691
          zscore_w1 |    .009197   .0148501     0.62   0.536    -.0200532    .0384471
          cratio_w1 |  -.0691649   .0227068    -3.05   0.003    -.1138904   -.0244394
            rnd_rev |  -.0119345    .008226    -1.45   0.148    -.0281371    .0042681
cash_to_totalassets |   .3159089   .2088175     1.51   0.132    -.0953977    .7272156
          div_yield |  -.0212811   .0032873    -6.47   0.000    -.0277562   -.0148061
            roce_w1 |   .0004484   .0004762     0.94   0.347    -.0004894    .0013863
           year2016 |   .0223946   .0147759     1.52   0.131    -.0067093    .0514986
              _cons |   3.311415   .5163692     6.41   0.000     2.294325    4.328504
--------------------+----------------------------------------------------------------
            sigma_u |  .68593601
            sigma_e |  .12732622
                rho |   .9666914   (fraction of variance due to u_i)
-------------------------------------------------------------------------------------
F test that all u_i=0: F(281, 245) = 7.06                    Prob > F = 0.0000

. estimates store fixed

. xtreg lntobinsq lnassets FXDerivatives10 IRDerivatives10  bookleverage_w1 roa_w1 zscore_w1 cratio_w1 rnd_rev cas
> h_to_totalassets div_yield roce_w1 year2016 if inlist(year,2015,2016), re

Random-effects GLS regression                   Number of obs     =        539
Group variable: firmid                          Number of groups  =        282

R-sq:                                           Obs per group:
     within  = 0.2057                                         min =          1
     between = 0.7959                                         avg =        1.9
     overall = 0.7772                                         max =          2

                                                Wald chi2(12)     =    1007.29
corr(u_i, X)   = 0 (assumed)                    Prob > chi2       =     0.0000

-------------------------------------------------------------------------------------
          lntobinsq |      Coef.   Std. Err.      z    P>|z|     [95% Conf. Interval]
--------------------+----------------------------------------------------------------
           lnassets |  -.0324741   .0110764    -2.93   0.003    -.0541835   -.0107648
    FXDerivatives10 |   .0633603   .0341651     1.85   0.064     -.003602    .1303226
    IRDerivatives10 |  -.0627061   .0355131    -1.77   0.077    -.1323106    .0068983
    bookleverage_w1 |   .2832981   .0710122     3.99   0.000     .1441168    .4224794
             roa_w1 |   .0412027   .0038409    10.73   0.000     .0336746    .0487307
          zscore_w1 |   .0689105   .0054544    12.63   0.000       .05822     .079601
          cratio_w1 |  -.1147203   .0125482    -9.14   0.000    -.1393142   -.0901263
            rnd_rev |   .0113815   .0029564     3.85   0.000      .005587     .017176
cash_to_totalassets |   .3379642   .1460328     2.31   0.021     .0517453    .6241831
          div_yield |  -.0313023   .0028731   -10.89   0.000    -.0369335   -.0256712
            roce_w1 |   .0006734   .0003966     1.70   0.089    -.0001039    .0014507
           year2016 |  -.0000337   .0122384    -0.00   0.998    -.0240206    .0239532
              _cons |   .3886327   .0832307     4.67   0.000     .2255037    .5517618
--------------------+----------------------------------------------------------------
            sigma_u |  .22941333
            sigma_e |  .12732622
                rho |  .76450624   (fraction of variance due to u_i)
-------------------------------------------------------------------------------------

. hausman fixed .

                 ---- Coefficients ----
             |      (b)          (B)            (b-B)     sqrt(diag(V_b-V_B))
             |     fixed          .          Difference          S.E.
-------------+----------------------------------------------------------------
    lnassets |   -.4007886    -.0324741       -.3683145        .0722587
FXDerivat~10 |    .1024007     .0633603        .0390404        .0659509
IRDerivat~10 |   -.1037169    -.0627061       -.0410108        .0662975
booklevera~1 |     .207787     .2832981       -.0755111        .1016435
      roa_w1 |    .0210163     .0412027       -.0201863        .0041602
   zscore_w1 |     .009197     .0689105       -.0597135        .0138121
   cratio_w1 |   -.0691649    -.1147203        .0455554        .0189247
     rnd_rev |   -.0119345     .0113815        -.023316        .0076763
cash_to_to~s |    .3159089     .3379642       -.0220553        .1492622
   div_yield |   -.0212811    -.0313023        .0100212        .0015975
     roce_w1 |    .0004484     .0006734        -.000225        .0002635
    year2016 |    .0223946    -.0000337        .0224283        .0082794
------------------------------------------------------------------------------
                           b = consistent under Ho and Ha; obtained from xtreg
            B = inconsistent under Ha, efficient under Ho; obtained from xtreg

    Test:  Ho:  difference in coefficients not systematic

                 chi2(12) = (b-B)'[(V_b-V_B)^(-1)](b-B)
                          =      112.45
                Prob>chi2 =      0.0000
                (V_b-V_B is not positive definite)

.

and then remove the year dummy (year2016 in my case) when comparing the coefficient estimates of the two models

or

2) run the Hausman test without the time dummies in the first place

Code:

. xtreg lntobinsq lnassets FXDerivatives10 IRDerivatives10  bookleverage_w1 roa_w1 zscore_w1 cratio_w1 rnd_rev cas
> h_to_totalassets div_yield roce_w1 if inlist(year,2015,2016), re

Random-effects GLS regression                   Number of obs     =        539
Group variable: firmid                          Number of groups  =        282

R-sq:                                           Obs per group:
     within  = 0.2055                                         min =          1
     between = 0.7960                                         avg =        1.9
     overall = 0.7773                                         max =          2

                                                Wald chi2(11)     =    1013.56
corr(u_i, X)   = 0 (assumed)                    Prob > chi2       =     0.0000

-------------------------------------------------------------------------------------
          lntobinsq |      Coef.   Std. Err.      z    P>|z|     [95% Conf. Interval]
--------------------+----------------------------------------------------------------
           lnassets |  -.0323747   .0109744    -2.95   0.003     -.053884   -.0108653
    FXDerivatives10 |   .0632071   .0340612     1.86   0.063    -.0035517     .129966
    IRDerivatives10 |  -.0625031   .0353601    -1.77   0.077    -.1318076    .0068013
    bookleverage_w1 |   .2834209   .0708196     4.00   0.000     .1446171    .4222247
             roa_w1 |    .041279   .0038357    10.76   0.000     .0337612    .0487969
          zscore_w1 |   .0689087   .0054195    12.71   0.000     .0582867    .0795307
          cratio_w1 |  -.1146159   .0125258    -9.15   0.000    -.1391661   -.0900658
            rnd_rev |   .0113927   .0029476     3.87   0.000     .0056156    .0171699
cash_to_totalassets |    .337644   .1457822     2.32   0.021     .0519162    .6233718
          div_yield |  -.0313312   .0028682   -10.92   0.000    -.0369528   -.0257095
            roce_w1 |   .0006744   .0003958     1.70   0.088    -.0001013    .0014501
              _cons |    .387308   .0829653     4.67   0.000     .2246989    .5499171
--------------------+----------------------------------------------------------------
            sigma_u |  .22881994
            sigma_e |  .12766146
                rho |  .76262169   (fraction of variance due to u_i)
-------------------------------------------------------------------------------------

. hausman fixed .

                 ---- Coefficients ----
             |      (b)          (B)            (b-B)     sqrt(diag(V_b-V_B))
             |     fixed          .          Difference          S.E.
-------------+----------------------------------------------------------------
    lnassets |   -.3381877    -.0323747        -.305813        .0594705
FXDerivat~10 |    .0952812     .0632071         .032074        .0660569
IRDerivat~10 |   -.1214566    -.0625031       -.0589535        .0655615
booklevera~1 |    .1692325     .2834209       -.1141884        .0989407
      roa_w1 |     .020826      .041279       -.0204531        .0041834
   zscore_w1 |     .007085     .0689087       -.0618237        .0137973
   cratio_w1 |   -.0695064    -.1146159        .0451095        .0190097
     rnd_rev |   -.0109449     .0113927       -.0223376         .007675
cash_to_to~s |     .347916      .337644         .010272        .1487747
   div_yield |   -.0214717    -.0313312        .0098594        .0016189
     roce_w1 |    .0003598     .0006744       -.0003146        .0002605
------------------------------------------------------------------------------
                           b = consistent under Ho and Ha; obtained from xtreg
            B = inconsistent under Ha, efficient under Ho; obtained from xtreg

    Test:  Ho:  difference in coefficients not systematic

                 chi2(11) = (b-B)'[(V_b-V_B)^(-1)](b-B)
                          =      108.52
                Prob>chi2 =      0.0000

I would prefer to keep the time dummies in, though, i.e. option 1). I just want to know if that's OK?

Thanks so much, I'm a big fan of your work.
