
  • Hausman Test result

    [Attached image: hausman.JPG (Hausman test output)]

    This is the output of my Hausman test. What does it mean that the model fitted on these data does not meet the asymptotic assumptions?

  • #2
    It means what the error message says. The statistic is supposed to be asymptotically Chi-squared distributed (a non-negative random variable), but the calculated statistic is negative.
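
For reference, a sketch of the textbook form of the statistic. The notation is an assumption and is not taken from the thread; $k$ is the number of coefficients being compared.

```latex
H = (\hat\beta_{FE} - \hat\beta_{RE})'
    \left[\widehat{V}(\hat\beta_{FE}) - \widehat{V}(\hat\beta_{RE})\right]^{-1}
    (\hat\beta_{FE} - \hat\beta_{RE})
    \;\overset{a}{\sim}\; \chi^2_k
    \quad \text{under } H_0
```

The variance difference in the middle is guaranteed to be positive semidefinite only asymptotically, under the random-effects assumptions. In a given sample the estimated difference need not be positive definite, which is how the quadratic form can come out negative.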



    • #3
      There is an issue here, then. How do I take care of this problem?
      Is using xtoverid the solution?



      • #4
        Anuradha:
        yes, the community-contributed command -xtoverid- is a good alternative (checking the -re- specification only is enough; if the -xtoverid- outcome reaches statistical significance, you should switch to the -fe- specification).
        Please remember that it does not support -fvvarlist- notation.
        Kind regards,
        Carlo
        (StataNow 18.5)
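
A minimal sketch of the workflow described above, assuming the nlswork example data used later in this thread. The indicator name `black` is illustrative, and depending on the version installed, -xtoverid- may rely on other SSC packages (its help file lists them).

```stata
* Install the community-contributed command from SSC
ssc install xtoverid, replace

webuse nlswork, clear
xtset idcode

* -xtoverid- does not accept factor-variable notation such as 2.race,
* so the indicator is created by hand (the name black is illustrative)
generate byte black = (race == 2) if !missing(race)

* Fit the random-effects specification only, then run the test
xtreg ln_wage age ttl_exp tenure black grade, re
xtoverid
```

A statistically significant result from -xtoverid- points toward the -fe- specification, as noted above.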



        • #5
          Okay, Sir Carlo Lazzaro. How do I report it in the methodology part of my paper? The Hausman test did not work, and it's a standard test. I can't write that I used xtoverid, as it's only a Stata command.



          • #6
            Anuradha:
            the -xtoverid- helpfile gives you the full reference of this community-contributed command, that you can well include in your research report/paper:

            Schaffer, M.E., Stillman, S. 2010. xtoverid: Stata module to calculate tests of overidentifying restrictions after xtreg, xtivreg, xtivreg2 and xthtaylor
            http://ideas.repec.org/c/boc/bocode/s456779.html
            Kind regards,
            Carlo
            (StataNow 18.5)



            • #7
              Apart from citing the user-written command -xtoverid- as Carlo explained above (user-written commands should be cited; they are research like any other), you can also find in the help file, and use, the key reference on which -xtoverid- is based.

              Arellano, M. 1993. On the testing of correlated effects with panel data. Journal of Econometrics, Vol. 59, Nos. 1-2, pp. 87-97.



              • #8
                Wow. Thank you so much, Joro Kolev and Carlo Lazzaro.



                • #9
                  The xtoverid is a bit of a black box. I like using the Mundlak approach, where one includes the time averages of all time-varying variables, estimates the equation by random effects, and tests the time averages. This reproduces the fixed effects estimates on all time-varying variables. Plus, one can see which time averages are important. I cover the general unbalanced case in my 2019 Journal of Econometrics paper on correlated random effects models.
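
As a hedged sketch, the regression described above can be written as follows. The symbols are assumptions introduced for illustration: $z_i$ collects the time-constant covariates, $s_{it}$ flags complete cases, and $T_i$ is the number of complete cases for unit $i$.

```latex
y_{it} = x_{it}\beta + \bar{x}_i\xi + z_i\gamma + c_i + u_{it},
\qquad
\bar{x}_i = \frac{1}{T_i}\sum_{t:\,s_{it}=1} x_{it},
\qquad
H_0:\ \xi = 0
```

Estimating this equation by random effects gives the fixed-effects estimate of $\beta$, and a Wald test of $\xi = 0$ plays the role of the Hausman test. The individual elements of $\hat\xi$ show which time averages matter.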



                  • #10
                    Indeed Professor Wooldridge, this would be the easier approach here. And I was about to propose this to Anuradha Saikia, but then after I tried it, I remembered that one does not achieve exact equivalence for non-balanced panels.

                    What I tried is another version of Mundlak's approach, which I explain in the attached paper (I submitted it to the Stata Journal in 2008; I think I got a referee who did not understand the issue, and I was too young to care about fighting the powers that be, as I had more fun things to do in those days). This other version simply estimates the equation that you describe by OLS. The slopes on the time averages then show the difference between the Fixed Effects and the Between estimators, and the slopes on the time-varying covariates have to be equal to the Fixed Effects estimates (but only in balanced panels, so I got quite some difference). A test of joint significance of the slopes on the time averages is therefore a Hausman test of the FE vs BE model.

                    Anyways, even if we go with your version of the Mundlak's approach, we still get some slight differences for unbalanced panels:

                    Code:
                    .  webuse nlswork, clear
                    (National Longitudinal Survey.  Young Women 14-26 years of age in 1968)
                    
                    . xtset idcode
                           panel variable:  idcode (unbalanced)
                    
                    .  xtreg ln_w  age ttl_exp tenure 2.race grade, fe
                    note: 2.race omitted because of collinearity
                    note: grade omitted because of collinearity
                    
                    Fixed-effects (within) regression               Number of obs     =     28,099
                    Group variable: idcode                          Number of groups  =      4,697
                    
                    R-sq:                                           Obs per group:
                         within  = 0.1443                                         min =          1
                         between = 0.2745                                         avg =        6.0
                         overall = 0.1924                                         max =         15
                    
                                                                    F(3,23399)        =    1315.26
                    corr(u_i, Xb)  = 0.1651                         Prob > F          =     0.0000
                    
                    ------------------------------------------------------------------------------
                         ln_wage |      Coef.   Std. Err.      t    P>|t|     [95% Conf. Interval]
                    -------------+----------------------------------------------------------------
                             age |  -.0030427   .0008644    -3.52   0.000    -.0047369   -.0013484
                         ttl_exp |    .029036   .0014505    20.02   0.000      .026193     .031879
                          tenure |   .0116574   .0009249    12.60   0.000     .0098444    .0134704
                                 |
                            race |
                          black  |          0  (omitted)
                           grade |          0  (omitted)
                           _cons |   1.547951   .0181798    85.15   0.000     1.512317    1.583584
                    -------------+----------------------------------------------------------------
                         sigma_u |   .3751722
                         sigma_e |  .29556813
                             rho |  .61703248   (fraction of variance due to u_i)
                    ------------------------------------------------------------------------------
                    F test that all u_i=0: F(4696, 23399) = 7.64                 Prob > F = 0.0000
                    
                    . qui for var age ttl_exp tenure: egen meanX = mean(X), by(idcode)
                    
                    .  xtreg ln_w  age ttl_exp tenure mean* 2.race grade, re
                    
                    Random-effects GLS regression                   Number of obs     =     28,099
                    Group variable: idcode                          Number of groups  =      4,697
                    
                    R-sq:                                           Obs per group:
                         within  = 0.1443                                         min =          1
                         between = 0.4329                                         avg =        6.0
                         overall = 0.3250                                         max =         15
                    
                                                                    Wald chi2(8)      =    7538.32
                    corr(u_i, X)   = 0 (assumed)                    Prob > chi2       =     0.0000
                    
                    ------------------------------------------------------------------------------
                         ln_wage |      Coef.   Std. Err.      z    P>|z|     [95% Conf. Interval]
                    -------------+----------------------------------------------------------------
                             age |  -.0030268   .0008614    -3.51   0.000    -.0047152   -.0013385
                         ttl_exp |   .0290337   .0014457    20.08   0.000     .0262003    .0318672
                          tenure |   .0116424   .0009222    12.62   0.000     .0098349      .01345
                         meanage |  -.0026319     .00142    -1.85   0.064    -.0054151    .0001513
                     meanttl_exp |  -.0008391   .0025701    -0.33   0.744    -.0058764    .0041982
                      meantenure |   .0165731   .0024676     6.72   0.000     .0117366    .0214095
                                 |
                            race |
                          black  |   -.062727   .0103071    -6.09   0.000    -.0829286   -.0425254
                           grade |   .0701835   .0020152    34.83   0.000     .0662339    .0741332
                           _cons |    .709563   .0346377    20.49   0.000     .6416744    .7774516
                    -------------+----------------------------------------------------------------
                         sigma_u |  .27539065
                         sigma_e |  .29556813
                             rho |  .46470444   (fraction of variance due to u_i)
                    ------------------------------------------------------------------------------
                    So what we see above is that the estimates for unbalanced panels are a bit different, and this is hard to explain to a novice. (That asymptotic results do not exactly hold in finite samples.)

                    For my interpretation of Mundlak's approach, the differences were even harder to explain:

                    Code:
                    . reg ln_w  age ttl_exp tenure mean* 2.race grade, noheader
                    ------------------------------------------------------------------------------
                         ln_wage |      Coef.   Std. Err.      t    P>|t|     [95% Conf. Interval]
                    -------------+----------------------------------------------------------------
                             age |   -.002911   .0011444    -2.54   0.011    -.0051542   -.0006679
                         ttl_exp |    .028964    .001921    15.08   0.000     .0251989    .0327292
                          tenure |   .0115805   .0012274     9.44   0.000     .0091748    .0139861
                         meanage |  -.0022723   .0013232    -1.72   0.086    -.0048658    .0003212
                     meanttl_exp |  -.0023782   .0022615    -1.05   0.293    -.0068108    .0020545
                      meantenure |   .0171518   .0017178     9.98   0.000     .0137848    .0205188
                                 |
                            race |
                          black  |  -.0806084   .0052841   -15.25   0.000    -.0909655   -.0702513
                           grade |   .0708902   .0011003    64.43   0.000     .0687336    .0730468
                           _cons |   .7061531   .0197196    35.81   0.000     .6675017    .7448045
                    ------------------------------------------------------------------------------






                    • #11
                      Joro Kolev: with the following I get exactly the same coefficients with FE and with Correlated Random Effects.
                      I think that you have forgotten to select only the "complete" observations: those for which no value is missing for any of the variables concerned.
                      This is done with the selection indicator.
                      Code:
                      webuse nlswork
                      xtset idcode
                      
                      xtreg ln_w  age ttl_exp tenure 2.race grade, fe
                      
                      gen s = (ln_wage != .) & (age != .) & (ttl_exp != .) & (tenure != .) & (race != .) & (grade != .)
                      egen agebar = mean(age) if s, by(idcode)
                      egen ttl_expbar = mean(ttl_exp) if s, by(idcode)
                      egen tenurebar = mean(tenure) if s, by(idcode)
                      egen racebar = mean(race) if s, by(idcode)
                      egen gradebar = mean(grade) if s, by(idcode)
                      
                      xtreg ln_wage  age ttl_exp tenure 2.race grade agebar ttl_expbar tenurebar racebar gradebar, re
                      On Edit:
                      Code:
                      egen racebar = mean(race) if s, by(idcode)
                      should be
                      Code:
                      egen racebar = mean(2.race) if s, by(idcode)
                      Last edited by Eric de Souza; 21 Sep 2020, 08:46.



                      • #12
                        Thank you, Eric. I was about to write the same thing. I emphasize this point in my paper. I admit that it tripped me up for several years. Also important is that, if the model includes time dummies or any aggregate time variables, their time averages must also be included.

                        In fact, using different observations to compute the time averages is not even consistent, in general, when the complete cases FE estimator is.



                        • #13
                          Addition to #11 (posted by me):
                          Since 2.race and grade are time constant, racebar and gradebar can be dropped and, in fact, are dropped. It was because racebar was not dropped that I realised my mistake and edited my previous post (#11).
                          I have also added time dummies to illustrate the point made by Jeff Wooldridge (#12).
                          Code:
                          log using CRE_unbalanced_panel.log, replace
                          webuse nlswork
                          keep if (year == 68) | (year == 69) | (year ==70) | (year == 71)
                          tab year, gen(year)
                          xtset idcode
                          
                          xtreg ln_w  age ttl_exp tenure 2.race grade year2-year4, fe
                          
                          gen s = (ln_wage != .) & (age != .) & (ttl_exp != .) & (tenure != .) & (race != .) & (grade != .)
                          egen agebar = mean(age) if s, by(idcode)
                          egen ttl_expbar = mean(ttl_exp) if s, by(idcode)
                          egen tenurebar = mean(tenure) if s, by(idcode)
                          egen year1bar = mean(year1) if s, by (idcode)
                          egen year2bar = mean(year2) if s, by (idcode)
                          egen year3bar = mean(year3) if s, by (idcode)
                          egen year4bar = mean(year4) if s, by (idcode)
                          
                          
                          xtreg ln_wage  age ttl_exp tenure 2.race grade year2-year4 agebar ttl_expbar tenurebar year2bar year3bar year4bar, re
                          
                          
                          clear
                          log close
                          Last edited by Eric de Souza; 21 Sep 2020, 10:39.



                          • #14
                            Thank you, Professor Jeff Wooldridge and Eric de Souza, for showing me new ways of looking at the problem. I am quite a novice and still learning. Prof Wooldridge, could you share the paper you mentioned?
                            In my case I have a strongly balanced panel, though. Everything goes smoothly until the Hausman test, and I guess the iteration is not done properly in the command.



                            • #15
                              @Jeff Wooldridge @Eric de Souza @Joro Kolev I added the "if e(sample)" qualifier to the code of @Eric de Souza to ensure that the same observations are used in xtreg, fe robust and xtreg, re robust, and I also get exactly the same coefficients with FE and with Correlated Random Effects. Adding "if e(sample)" may be a simpler way to guarantee the complete observations, as @Jeff Wooldridge said. This method also shows that the observations used for FE and CRE should be exactly the same.
                              Code:
                              webuse nlswork
                              xtset idcode
                              xtreg ln_w  age ttl_exp tenure 2.race grade, fe r
                              Fixed-effects (within) regression               Number of obs     =     28,099
                              Group variable: idcode                          Number of groups  =      4,697
                              
                              R-sq:                                           Obs per group:
                                   within  = 0.1443                                         min =          1
                                   between = 0.2745                                         avg =        6.0
                                   overall = 0.1924                                         max =         15
                              
                                                                              F(3,4696)         =     544.06
                              corr(u_i, Xb)  = 0.1651                         Prob > F          =     0.0000
                              
                                                           (Std. Err. adjusted for 4,697 clusters in idcode)
                              ------------------------------------------------------------------------------
                                           |               Robust
                                   ln_wage |      Coef.   Std. Err.      t    P>|t|     [95% Conf. Interval]
                              -------------+----------------------------------------------------------------
                                       age |     -0.003      0.001    -2.35   0.019       -0.006      -0.001
                                   ttl_exp |      0.029      0.002    12.72   0.000        0.025       0.034
                                    tenure |      0.012      0.001     7.93   0.000        0.009       0.015
                                           |
                                      race |
                                    black  |      0.000  (omitted)
                                     grade |      0.000  (omitted)
                                     _cons |      1.548      0.027    56.78   0.000        1.494       1.601
                              -------------+----------------------------------------------------------------
                                   sigma_u |   .3751722
                                   sigma_e |  .29556813
                                       rho |  .61703248   (fraction of variance due to u_i)
                              ------------------------------------------------------------------------------
                              
                              egen agebar = mean(age) , by(idcode)
                              egen ttl_expbar = mean(ttl_exp) , by(idcode)
                              egen tenurebar = mean(tenure) , by(idcode)
                              egen racebar = mean(race) , by(idcode)
                              egen gradebar = mean(grade) , by(idcode)
                              
                              xtreg ln_wage  age ttl_exp tenure 2.race grade agebar ttl_expbar tenurebar racebar gradebar if e(sample), re r
                              Random-effects GLS regression                   Number of obs     =     28,099
                              Group variable: idcode                          Number of groups  =      4,697
                              
                              R-sq:                                           Obs per group:
                                   within  = 0.1443                                         min =          1
                                   between = 0.4339                                         avg =        6.0
                                   overall = 0.3252                                         max =         15
                              
                                                                              Wald chi2(9)      =    4529.55
                              corr(u_i, X)   = 0 (assumed)                    Prob > chi2       =     0.0000
                              
                                                           (Std. Err. adjusted for 4,697 clusters in idcode)
                              ------------------------------------------------------------------------------
                                           |               Robust
                                   ln_wage |      Coef.   Std. Err.      z    P>|z|     [95% Conf. Interval]
                              -------------+----------------------------------------------------------------
                                       age |     -0.003      0.001    -2.34   0.019       -0.006      -0.000
                                   ttl_exp |      0.029      0.002    12.72   0.000        0.025       0.034
                                    tenure |      0.012      0.001     7.92   0.000        0.009       0.015
                                           |
                                      race |
                                    black  |     -0.114      0.025    -4.50   0.000       -0.163      -0.064
                                     grade |      0.070      0.002    31.31   0.000        0.066       0.075
                                    agebar |     -0.003      0.002    -1.57   0.116       -0.006       0.001
                                ttl_expbar |     -0.001      0.003    -0.28   0.777       -0.007       0.005
                                 tenurebar |      0.017      0.003     6.12   0.000        0.011       0.022
                                   racebar |      0.053      0.024     2.22   0.027        0.006       0.099
                                  gradebar |      0.000  (omitted)
                                     _cons |      0.655      0.044    14.86   0.000        0.569       0.742
                              -------------+----------------------------------------------------------------
                                   sigma_u |  .27510497
                                   sigma_e |  .29556813
                                       rho |   .4641881   (fraction of variance due to u_i)
                              ------------------------------------------------------------------------------
                              Best regards.

                              Raymond Zhang
                              Stata 17.0,MP
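
A hedged sketch of the same idea with the e(sample) restriction applied at the averaging step as well, since #12 stresses that the time averages themselves must be computed over the complete cases. The variable name `s` and the `*bar` names are illustrative, and only the time-varying regressors are averaged.

```stata
webuse nlswork, clear
xtset idcode

* Fixed-effects fit; e(sample) marks the observations it actually used
xtreg ln_wage age ttl_exp tenure 2.race grade, fe vce(cluster idcode)
generate byte s = e(sample)

* Time averages of the time-varying regressors over those observations only
foreach v of varlist age ttl_exp tenure {
    egen `v'bar = mean(`v') if s, by(idcode)
}

* Correlated random effects on the same sample
xtreg ln_wage age ttl_exp tenure 2.race grade agebar ttl_expbar tenurebar if s, re vce(cluster idcode)

* Joint test of the time averages (the regression-based Hausman-type test from #9)
test agebar ttl_expbar tenurebar
```

Saving e(sample) in `s` right after the -fe- fit keeps the indicator available after later estimation commands overwrite e(sample).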
