Hausman error test for fixed/random effects: (V_b-V_B is not positive definite)

Brunn Aguilar

Join Date: Oct 2023

Posts: 3
#1

22 Oct 2023, 11:43

Hello dear community, I am currently struggling to make some corrections to my static panel regression:

Previous work in the same field uses a fixed-effects model as the go-to model. However, whenever I run my tests I get this:

I need some help because, according to this test, I should go with random effects (opening the door to unobserved heterogeneity and more complex issues that should not arise in this analysis).
Tags: fixed effects, hausman, panel data, random effects, stata
Andrew Musau

Join Date: Oct 2014

Posts: 10195
#2

22 Oct 2023, 15:42

Previous work in the same field uses a fixed-effects model as the go-to model. However, whenever I run my tests

You do not have to justify using the fixed effects estimator. It is valid under the random effects assumption as well. See the discussion and matrix in https://www.stata.com/support/faqs/s...effects-model/.
Carlo Lazzaro

Join Date: Apr 2014

Posts: 17709
#3

23 Oct 2023, 01:55

Brunn:
welcome to this forum.
As an aside to Andrew's helpful reply, you can give the community-contributed -xtoverid- a shot after -xtreg, re-. If the null is rejected (and provided that a panel-wise effect does exist), you should go -fe-.
On a more general note, the fact that the VCE matrix is not positive definite may stem from a poor specification of the model.
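
A minimal sketch of that check, assuming the -nlswork- example data used elsewhere in this thread rather than Brunn's data; the regressors (age, age2, tenure) are placeholders. -xtoverid- is community-contributed and has to be installed from SSC; as far as I know it also needs -ivreg2- and -ranktest- installed. Plain variables are used here because -xtoverid- may not accept factor-variable notation:

```stata
* Sketch only: variables are placeholders, not Brunn's model
* ssc install xtoverid    // one-time install (community-contributed)
* ssc install ivreg2      // assumed dependency of xtoverid
* ssc install ranktest    // assumed dependency of ivreg2
webuse nlswork, clear
xtset idcode year
generate age2 = age^2
xtreg ln_wage age age2 tenure, re vce(cluster idcode)
xtoverid                  // H0: the extra orthogonality conditions imposed by RE hold
```

A rejection points toward -fe-; a non-rejection does not make -fe- invalid.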

Kind regards,
Carlo
(Stata 19.0)
Jeff Wooldridge

Join Date: Apr 2014

Posts: 2168
#4

23 Oct 2023, 08:08

The lack of positive definiteness is almost always because of the presence of time dummies or other variables that do not vary by unit. In these cases, the degrees of freedom are often incorrectly calculated. As Carlo said, you can apply xtoverid or you can use the Mundlak equation. But I'm with Andrew: you should not let the outcome of the Hausman test (even when properly computed) "tell you" to use RE. The default should be fixed effects, and if you can learn something from FE you should go with it. You typically try to argue for RE when your FE estimates are imprecise.
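
A rough sketch of the Mundlak approach mentioned above, again on the -nlswork- example data with placeholder regressors (an assumption, not Brunn's specification): add the unit-level means of the time-varying regressors to the RE model and test them jointly with cluster-robust standard errors.

```stata
* Sketch only: variables are placeholders, not Brunn's model
webuse nlswork, clear
xtset idcode year
generate age2 = age^2
* fix the estimation sample so the means use only the observations actually estimated on
quietly xtreg ln_wage age age2 tenure, re
generate byte insample = e(sample)
foreach v of varlist age age2 tenure {
    bysort idcode: egen double mean_`v' = mean(`v') if insample
}
xtreg ln_wage age age2 tenure mean_age mean_age2 mean_tenure if insample, re vce(cluster idcode)
* joint test of the unit means: a robust, regression-based analogue of the Hausman test
test mean_age mean_age2 mean_tenure
```

If the means are jointly significant, the unit effects appear correlated with the regressors, which argues against plain RE.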
Brunn Aguilar

Join Date: Oct 2023

Posts: 3
#5

23 Oct 2023, 19:20

Thanks everyone for the response.
I'm still confused, because my adviser has told me that I should go with FE too. My main concern is that, other than the model's author, I don't have any other evidence to support my argument for using FE over RE (considering that the Hausman test recommended RE).
Brunn Aguilar

Join Date: Oct 2023

Posts: 3
#6

23 Oct 2023, 19:35

In response to Carlo, these are my results: I see both the F test and the chi-squared test with a p-value below 0.05.
I thought that meant that my model was appropriate. What do you mean by poor specification?

Also, these are the three tests the authors used in their model:

Is there any way I can do something similar? My advisor told me I should test for unobserved heterogeneity in order to bypass the RE result of the Hausman test.
Daniel Schaefer

Join Date: Mar 2020

Posts: 814
#7

23 Oct 2023, 20:28

This has already been said, but I just want to reiterate: the Hausman test does not "recommend" RE. That is not how the test works. If the Hausman test is not significant, RE may be more precise with less data than FE, but there won't be any systematic differences in the results between the two models, so you should conclude that either estimator would be appropriate. The precision of the two models will converge as your sample size increases, and should still be pretty good for FE with smaller sample sizes.

Please take another look at your model results in #6. Notice there aren't any substantive differences in the estimated results or the conclusions you might draw. As a practical matter, you usually have to justify RE to your reviewers, but FE remains consistent under weaker assumptions than RE and doesn't require the same kind of justification (see #2). Why make the case for RE if FE gives almost exactly the same results? Alternatively, if they give different results, you should be worried about RE, not FE.

FE is the gold standard, and the Hausman test assumes as much. That is why people keep telling you to use it.
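
For reference, the comparison the test makes can be sketched as follows, using the -nlswork- example data with placeholder regressors (an assumption, not the model in #6). The -sigmamore- option is not mentioned in this thread; it is a documented -hausman- option that bases both covariance matrices on a common disturbance variance estimate, which often avoids the "not positive definite" message.

```stata
* Sketch only: variables are placeholders, not Brunn's model
webuse nlswork, clear
xtset idcode year
generate age2 = age^2
xtreg ln_wage age age2 tenure, fe
estimates store fe
xtreg ln_wage age age2 tenure, re
estimates store re
* consistent estimator first, efficient estimator second
hausman fe re, sigmamore
```

Note that -hausman- needs conventional (non-robust) variance estimates, so it cannot be run after vce(cluster ...).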

Last edited by Daniel Schaefer; 23 Oct 2023, 20:31.

Carlo Lazzaro

Join Date: Apr 2014
Posts: 17709
#8

24 Oct 2023, 00:05

Brunn:
the only relevant issue with using -fe- when -hausman- does not reject the null (where the null is: -re- is the way to go) is the lower efficiency of -fe- (that is, larger standard errors). The -fe- estimator, as Daniel highlighted, remains consistent even if -hausman- does not reject the null.
As far as the (mis)specification of the functional form of the regressand is concerned, I meant something like -linktest- (which works automatically after -regress- only):

Code:

. use "https://www.stata-press.com/data/r17/nlswork.dta"
(National Longitudinal Survey of Young Women, 14-24 years old in 1968)

. xtreg ln_wage c.age##c.age, fe vce(cluster idcode)

Fixed-effects (within) regression               Number of obs     =     28,510
Group variable: idcode                          Number of groups  =      4,710

R-squared:                                      Obs per group:
     Within  = 0.1087                                         min =          1
     Between = 0.1006                                         avg =        6.1
     Overall = 0.0865                                         max =         15

                                                F(2,4709)         =     507.42
corr(u_i, Xb) = 0.0440                          Prob > F          =     0.0000

                             (Std. err. adjusted for 4,710 clusters in idcode)
------------------------------------------------------------------------------
             |               Robust
     ln_wage | Coefficient  std. err.      t    P>|t|     [95% conf. interval]
-------------+----------------------------------------------------------------
         age |   .0539076    .004307    12.52   0.000     .0454638    .0623515
             |
 c.age#c.age |  -.0005973    .000072    -8.30   0.000    -.0007384   -.0004562
             |
       _cons |    .639913   .0624195    10.25   0.000     .5175415    .7622845
-------------+----------------------------------------------------------------
     sigma_u |   .4039153
     sigma_e |  .30245467
         rho |  .64073314   (fraction of variance due to u_i)
------------------------------------------------------------------------------

. predict fitted, xb
(24 missing values generated)

. g sq_fitted=fitted^2
(24 missing values generated)

. xtreg ln_wage fitted sq_fitted , fe vce(cluster idcode)

Fixed-effects (within) regression               Number of obs     =     28,510
Group variable: idcode                          Number of groups  =      4,710

R-squared:                                      Obs per group:
     Within  = 0.1092                                         min =          1
     Between = 0.1033                                         avg =        6.1
     Overall = 0.0881                                         max =         15

                                                F(2,4709)         =     523.09
corr(u_i, Xb) = 0.0467                          Prob > F          =     0.0000

                             (Std. err. adjusted for 4,710 clusters in idcode)
------------------------------------------------------------------------------
             |               Robust
     ln_wage | Coefficient  std. err.      t    P>|t|     [95% conf. interval]
-------------+----------------------------------------------------------------
      fitted |   2.569185   .7085064     3.63   0.000     1.180181    3.958189
   sq_fitted |    -.47432   .2153021    -2.20   0.028    -.8964128   -.0522272
       _cons |  -1.290258    .580562    -2.22   0.026    -2.428431   -.1520844
-------------+----------------------------------------------------------------
     sigma_u |    .403403
     sigma_e |  .30238578
         rho |  .64025357   (fraction of variance due to u_i)
------------------------------------------------------------------------------

. test sq_fitted

 ( 1)  sq_fitted = 0

       F(  1,  4709) =    4.85
            Prob > F =    0.0276

.

In this toy-example the model is clearly misspecified (too few predictors) and the test on -sq_fitted-, as expected, rejects the null.

Kind regards,
Carlo
(Stata 19.0)

Jeff Wooldridge

Join Date: Apr 2014

Posts: 2168
#9

24 Oct 2023, 12:39

Brunn: I see you have N = 12 and T = 119. This is much more like multiple time series than a traditional large-N, small-T panel. I recommend using xtscc with fixed effects, and computing the Newey-West standard errors with a lag of maybe 4-10. Random effects isn't even statistically justified. The specification tests you reported in the paper simply reveal that there is heterogeneity (F test) but do not tell you whether it's correlated with x(i,t). The B-P test gives zero because the estimated variance is negative, so that's not informative, either. And I suspect a nonrobust Hausman test was used, so that's also of limited value.

With small N, large T, you really should include cross-sectional fixed effects. No test can tell you that.
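
A minimal sketch of this suggestion, assuming the Grunfeld example data (10 firms over 20 years) as a stand-in for a small-N panel, since Brunn's variables are not shown. -xtscc- is community-contributed (installable from SSC); its -fe- option requests the within estimator and -lag()- sets the lag length for the Driscoll-Kraay standard errors it reports, which are of the Newey-West type.

```stata
* Sketch only: example data, not Brunn's panel
* ssc install xtscc       // one-time install (community-contributed)
webuse grunfeld, clear
xtset company year
* fixed effects with Driscoll-Kraay standard errors; lag(4) is illustrative
xtscc invest mvalue kstock, fe lag(4)
```

With T = 119, a longer lag within the 4-10 range suggested above may be more appropriate than in this short example panel.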