  • Hausman test problem

    Hi everyone, I'm doing a panel data analysis and I have a problem with the Hausman test while trying to decide which model to use (fixed or random effects). I have a panel dataset of 36 enterprises observed over 11 years, and some variables do not change over the period (for example, age or country). After estimating the two models, the Hausman test returns the message "(V_b-V_B is not positive definite)". I searched this forum and found that this can be addressed with "hausman fe re, sigmamore". However, when I do this I get another message: "Note: the rank of the differenced variance matrix (5) does not equal the number of coefficients being tested (6); be sure this is what you expect, or there may be problems computing the test. Examine the output of your estimators for anything unexpected and possibly consider scaling your variables so that the coefficients are on a similar scale."
    How can I fix this and find out which model I should use? I will paste here the models and the Hausman test:

    FE MODEL
    . xtreg FATTURATO PRICEEARNINGSRATIO DIPENDENTI EBITDAMARGINCALCOLATO DECALCOLATO AGE SIZE INTERNAZIONALIZZAZIONE APPAREL ACTIVEWEAR PRODUTTORE RETAILER USA CANADA SPAIN GERMANY UK SWEDEN CHINA JAPAN BRAZIL ESG, fe
    note: AGE omitted because of collinearity.
    note: INTERNAZIONALIZZAZIONE omitted because of collinearity.
    note: APPAREL omitted because of collinearity.
    note: ACTIVEWEAR omitted because of collinearity.
    note: PRODUTTORE omitted because of collinearity.
    note: RETAILER omitted because of collinearity.
    note: USA omitted because of collinearity.
    note: CANADA omitted because of collinearity.
    note: SPAIN omitted because of collinearity.
    note: GERMANY omitted because of collinearity.
    note: UK omitted because of collinearity.
    note: SWEDEN omitted because of collinearity.
    note: CHINA omitted because of collinearity.
    note: JAPAN omitted because of collinearity.
    note: BRAZIL omitted because of collinearity.

    Fixed-effects (within) regression Number of obs = 348
    Group variable: IMPRESE Number of groups = 36

    R-squared: Obs per group:
    Within = 0.5051 min = 1
    Between = 0.4472 avg = 9.7
    Overall = 0.4280 max = 11

    F(6,306) = 52.06
    corr(u_i, Xb) = -0.4286 Prob > F = 0.0000

    ----------------------------------------------------------------------------------------
    FATTURATO | Coefficient Std. err. t P>|t| [95% conf. interval]
    -----------------------+----------------------------------------------------------------
    PRICEEARNINGSRATIO | -197.9916 3100.351 -0.06 0.949 -6298.696 5902.713
    DIPENDENTI | 57.43419 6.08029 9.45 0.000 45.46972 69.39866
    EBITDAMARGINCALCOLATO | 31113.91 24054.5 1.29 0.197 -16219.25 78447.07
    DECALCOLATO | 4304.45 2474.183 1.74 0.083 -564.116 9173.016
    AGE | 0 (omitted)
    SIZE | 3105437 420402.1 7.39 0.000 2278192 3932682
    INTERNAZIONALIZZAZIONE | 0 (omitted)
    APPAREL | 0 (omitted)
    ACTIVEWEAR | 0 (omitted)
    PRODUTTORE | 0 (omitted)
    RETAILER | 0 (omitted)
    USA | 0 (omitted)
    CANADA | 0 (omitted)
    SPAIN | 0 (omitted)
    GERMANY | 0 (omitted)
    UK | 0 (omitted)
    SWEDEN | 0 (omitted)
    CHINA | 0 (omitted)
    JAPAN | 0 (omitted)
    BRAZIL | 0 (omitted)
    ESG | 36822.1 10378.83 3.55 0.000 16399.19 57245.01
    _cons | -4.52e+07 5945794 -7.61 0.000 -5.69e+07 -3.35e+07
    -----------------------+----------------------------------------------------------------
    sigma_u | 6010216.5
    sigma_e | 1585529
    rho | .9349347 (fraction of variance due to u_i)
    ----------------------------------------------------------------------------------------
    F test that all u_i=0: F(35, 306) = 40.51 Prob > F = 0.0000

    .
    . est store fe

    RE MODEL
    . xtreg FATTURATO PRICEEARNINGSRATIO DIPENDENTI EBITDAMARGINCALCOLATO DECALCOLATO AGE SIZE INTERNAZIONALIZZAZIONE APPAREL ACTIVEWEAR PRODUTTORE RETAILER USA CANADA SPAIN GERMANY UK SWEDEN CHINA JAPAN BRAZIL ESG, re

    Random-effects GLS regression Number of obs = 348
    Group variable: IMPRESE Number of groups = 36

    R-squared: Obs per group:
    Within = 0.4906 min = 1
    Between = 0.7545 avg = 9.7
    Overall = 0.7447 max = 11

    Wald chi2(21) = 394.23
    corr(u_i, X) = 0 (assumed) Prob > chi2 = 0.0000

    ----------------------------------------------------------------------------------------
    FATTURATO | Coefficient Std. err. z P>|z| [95% conf. interval]
    -----------------------+----------------------------------------------------------------
    PRICEEARNINGSRATIO | 301.8645 3269.911 0.09 0.926 -6107.043 6710.772
    DIPENDENTI | 38.03384 5.38719 7.06 0.000 27.47514 48.59254
    EBITDAMARGINCALCOLATO | 36073.14 25064.33 1.44 0.150 -13052.03 85198.32
    DECALCOLATO | 3243.409 2527.883 1.28 0.199 -1711.151 8197.969
    AGE | -708.8511 30974.06 -0.02 0.982 -61416.89 59999.19
    SIZE | 3589315 420789.4 8.53 0.000 2764583 4414047
    INTERNAZIONALIZZAZIONE | 15850.67 29495.34 0.54 0.591 -41959.14 73660.48
    APPAREL | 2198274 1793538 1.23 0.220 -1316995 5713544
    ACTIVEWEAR | -557853.2 2131438 -0.26 0.794 -4735395 3619688
    PRODUTTORE | -7137317 2759723 -2.59 0.010 -1.25e+07 -1728359
    RETAILER | -3437686 1647709 -2.09 0.037 -6667136 -208235.5
    USA | 2193952 2846215 0.77 0.441 -3384527 7772430
    CANADA | 3989085 3644156 1.09 0.274 -3153329 1.11e+07
    SPAIN | -1091059 6702710 -0.16 0.871 -1.42e+07 1.20e+07
    GERMANY | 1292573 4132827 0.31 0.754 -6807620 9392765
    UK | -168793.8 3778204 -0.04 0.964 -7573938 7236350
    SWEDEN | 7001420 4384444 1.60 0.110 -1591932 1.56e+07
    CHINA | -1464027 3316651 -0.44 0.659 -7964544 5036490
    JAPAN | -715719.3 3480870 -0.21 0.837 -7538098 6106660
    BRAZIL | -769948.4 3529017 -0.22 0.827 -7686694 6146798
    ESG | 34389.96 10735.18 3.20 0.001 13349.4 55430.52
    _cons | -5.14e+07 7051590 -7.28 0.000 -6.52e+07 -3.75e+07
    -----------------------+----------------------------------------------------------------
    sigma_u | 3061539.5
    sigma_e | 1585529
    rho | .78851537 (fraction of variance due to u_i)
    ----------------------------------------------------------------------------------------

    .
    . est store re

    . hausman fe re

    ---- Coefficients ----
    | (b) (B) (b-B) sqrt(diag(V_b-V_B))
    | fe re Difference Std. err.
    -------------+----------------------------------------------------------------
    PRICEEARNI~O | -197.9916 301.8645 -499.8561 .
    DIPENDENTI | 57.43419 38.03384 19.40035 2.819239
    EBITDAMARG~O | 31113.91 36073.14 -4959.234 .
    DECALCOLATO | 4304.45 3243.409 1061.041 .
    SIZE | 3105437 3589315 -483878.4 .
    ESG | 36822.1 34389.96 2432.141 .
    ------------------------------------------------------------------------------
    b = Consistent under H0 and Ha; obtained from xtreg.
    B = Inconsistent under Ha, efficient under H0; obtained from xtreg.

    Test of H0: Difference in coefficients not systematic

    chi2(6) = (b-B)'[(V_b-V_B)^(-1)](b-B)
    = 65.82
    Prob > chi2 = 0.0000
    (V_b-V_B is not positive definite)

    . hausman fe re, sigmamore

    Note: the rank of the differenced variance matrix (5) does not equal the number of coefficients being tested (6); be sure this
    is what you expect, or there may be problems computing the test. Examine the output of your estimators for anything
    unexpected and possibly consider scaling your variables so that the coefficients are on a similar scale.

    ---- Coefficients ----
    | (b) (B) (b-B) sqrt(diag(V_b-V_B))
    | fe re Difference Std. err.
    -------------+----------------------------------------------------------------
    PRICEEARNI~O | -197.9916 301.8645 -499.8561 322.2244
    DIPENDENTI | 57.43419 38.03384 19.40035 3.535794
    EBITDAMARG~O | 31113.91 36073.14 -4959.234 4655.019
    DECALCOLATO | 4304.45 3243.409 1061.041 696.7127
    SIZE | 3105437 3589315 -483878.4 146436.8
    ESG | 36822.1 34389.96 2432.141 2396.759
    ------------------------------------------------------------------------------
    b = Consistent under H0 and Ha; obtained from xtreg.
    B = Inconsistent under Ha, efficient under H0; obtained from xtreg.

    Test of H0: Difference in coefficients not systematic

    chi2(5) = (b-B)'[(V_b-V_B)^(-1)](b-B)
    = 16.50
    Prob > chi2 = 0.0056
    Last edited by Chiara Vigano; 24 Jul 2021, 04:58.

  • #2
    Chiara:
    welcome to this forum.
    As we know, the -fe- estimator wipes out time-invariant variables (like countries).
    As you did not use CODE delimiters (that is, the #-shaped button available from the Advanced editor toolbar), your copy and paste is hard to read. That said, it seems that you can go for a more parsimonious model (check the quasi-extreme collinearity of your predictors via the command -estat vce, corr- after -xtreg-).
    As far as the -hausman- test is concerned, its output is often puzzling: the reason for the last warning message that you received may actually rest on the (too) different scale of your coefficients.
    Try to re-scale and re-run -xtreg- and -hausman-.
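    A minimal sketch of that re-scaling step, assuming the variable names from the original post and that the data are already -xtset- with IMPRESE as the panel variable. The z_ prefix and the stored-estimate names fe_z and re_z are illustrative choices, not something from the thread, and the local macros only work if the lines are run together (for example from a do-file):

    ```stata
    * Time-varying regressors (kept by -fe-) and time-invariant ones (dropped by -fe-)
    local tv   PRICEEARNINGSRATIO DIPENDENTI EBITDAMARGINCALCOLATO DECALCOLATO SIZE ESG
    local tinv AGE INTERNAZIONALIZZAZIONE APPAREL ACTIVEWEAR PRODUTTORE RETAILER ///
               USA CANADA SPAIN GERMANY UK SWEDEN CHINA JAPAN BRAZIL

    * Standardize the outcome and the time-varying regressors (mean 0, sd 1)
    egen double z_FATTURATO = std(FATTURATO)
    local ztv
    foreach v of local tv {
        egen double z_`v' = std(`v')
        local ztv `ztv' z_`v'
    }

    * Re-estimate both models on the rescaled variables and repeat the test
    xtreg z_FATTURATO `ztv' `tinv', fe
    estimates store fe_z
    xtreg z_FATTURATO `ztv' `tinv', re
    estimates store re_z
    hausman fe_z re_z, sigmamore
    ```

    Only the coefficients common to both models (the time-varying regressors) enter the comparison, so putting those on a similar scale is what matters for the warning.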
    Finally, I assume that you've already found no evidence of heteroskedasticity and/or autocorrelation of the epsilon and, as such, you can safely go with default standard errors.
    Admittedly, my guess is that with 36 panels and 11 years, you should go -vce(cluster panelid)-. That said, if you go clustered, you should switch from -hausman- to the community-contributed command -xtoverid- (just type -search xtoverid- from within Stata to spot and install it), as only the latter supports non-default standard errors.
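    A hedged sketch of the clustered route, again assuming the variable names from the original post and IMPRESE as the panel identifier. -xtoverid- is community-contributed; the install line is commented out, and it may also require the -ivreg2- and -ranktest- packages:

    ```stata
    * One-time install of the community-contributed command (uncomment if needed)
    * ssc install xtoverid

    local tv   PRICEEARNINGSRATIO DIPENDENTI EBITDAMARGINCALCOLATO DECALCOLATO SIZE ESG
    local tinv AGE INTERNAZIONALIZZAZIONE APPAREL ACTIVEWEAR PRODUTTORE RETAILER ///
               USA CANADA SPAIN GERMANY UK SWEDEN CHINA JAPAN BRAZIL

    * Random-effects model with cluster-robust standard errors
    xtreg FATTURATO `tv' `tinv', re vce(cluster IMPRESE)

    * Test of fixed vs. random effects after the clustered -re- fit
    xtoverid
    ```

    A rejection here is read the same way as a rejection from -hausman-: the random-effects assumptions are in doubt and -fe- is preferred.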
    Last but not least, what is your supervisor's take/advice/guidance about all this stuff, since, if I'm not mistaken, this panel data regression is part of your dissertation?
    Kind regards,
    Carlo
    (Stata 19.0)


    • #3
      Thank you very much Carlo. I think I solved the problem for the Hausman test. After standardizing the variables, the Hausman test (using fe re, sigmamore) is now this:

      Code:
      . hausman fe re, sigmamore
      
                       ---- Coefficients ----
                   |      (b)          (B)            (b-B)     sqrt(diag(V_b-V_B))
                   |       fe           re         Difference       Std. err.
      -------------+----------------------------------------------------------------
              zesg |     .111642     .1042679        .0073741        .0072668
      zpriceearn~s |   -.0008808     .0013428       -.0022236        .0014334
       zdipendenti |    .6954335     .4605272        .2349064        .0428126
      zebitdamar~n |     .037722     .0437345       -.0060125        .0056437
      zdecalcolato |    .0475317     .0358152        .0117165        .0076934
             zsize |    .4227441     .4886146       -.0658705        .0199345
      ------------------------------------------------------------------------------
                                b = Consistent under H0 and Ha; obtained from xtreg.
                 B = Inconsistent under Ha, efficient under H0; obtained from xtreg.
      
      Test of H0: Difference in coefficients not systematic
      
          chi2(6) = (b-B)'[(V_b-V_B)^(-1)](b-B)
                  =  41.03
      Prob > chi2 = 0.0000
      This means that the fixed-effects model is the one I should use.
      No, I haven't checked for heteroskedasticity because I saw that many people do this after choosing the right model with the Hausman test. I tried xttest3:

      Code:
      . xttest3
      
      Modified Wald test for groupwise heteroskedasticity
      in fixed effect regression model
      
      H0: sigma(i)^2 = sigma^2 for all i
      
      chi2 (36)  =   19636.03
      Prob>chi2 =      0.0000
      and it clearly indicates signs of heteroskedasticity. I read this thread https://www.statalist.org/forums/for...d-effect-model and saw that the way to fix this is to use xtreg .... fe, cluster (imprese). However, it does not work and says "invalid cluster".
      Code:
      xtreg zfatturato zesg zpriceearnings zdipendenti zebitdamargin zdecalcolato zage zsize zinternazionalizzazione zapparel zactivewear zproduttore zretailer zusa zcanada zspain zgermany zuk zsweden zchina zjapan zbrazil zsouthafrica, fe, cluster (IMPRESE)
      invalid 'cluster' 
      r(198);
      Is this the right code to use? Or am I missing something? I cannot find anything on the Internet.


      • #4
        Chiara:
        checking for heteroskedasticity/autocorrelation after -hausman- is not correct, as the diagnosis of these nuisances should be done before -hausman-.
        As per the outcome of the community-contributed module (see the FAQ on how you're kindly requested to declare it in your post) -xttest3-, there's evidence of heteroskedasticity.
        Unlike with -regress-, under -xtreg- both -robust- and -vce(cluster panelid)- do the very same job (therefore you do not have to worry about autocorrelation, because the fix would be the same).
        Thanks for using CODE delimiters, which make it apparent why you've received the error message from Stata about -cluster-.
        The right code is:
        Code:
        xtreg zfatturato zesg zpriceearnings zdipendenti zebitdamargin zdecalcolato zage zsize zinternazionalizzazione zapparel zactivewear zproduttore zretailer zusa zcanada zspain zgermany zuk zsweden zchina zjapan zbrazil zsouthafrica, fe vce(cluster IMPRESE)
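        To spell out the difference with a shortened, purely illustrative variable list (the names are taken from the thread): the failing command most likely tripped on the second comma, since Stata expects a single comma followed by all options separated by spaces.

        ```stata
        * Fails with r(198): options are split by a second comma
        * xtreg zfatturato zesg zdipendenti zsize, fe, cluster (IMPRESE)

        * Works: one comma, then every option after it
        xtreg zfatturato zesg zdipendenti zsize, fe vce(cluster IMPRESE)

        * With -xtreg, fe- this should report the same standard errors as above
        xtreg zfatturato zesg zdipendenti zsize, fe vce(robust)
        ```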
        Kind regards,
        Carlo
        (Stata 19.0)


        • #5
          Thank you, now the code works, however the xttest3 outcome is the same. How is this possible?
          I'm sorry, but I'm very new to this and this is my first panel analysis. I thought the procedure was to identify the right model with the Hausman test, and then check for heteroskedasticity and cross sectional independence. Is it possible to fix the model for heteroskedasticity after I chose the fixed effects one? Or should I go back and do other tests on the normal regression?


          • #6
            Chiara:
            because the heteroskedasticity test focuses on the residuals (and does not consider the cluster-robust standard errors).
            Hence, there's no gain in repeating it after invoking non-default standard errors.
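            A small sketch of why the test result cannot change, using a shortened, illustrative variable list from the thread. The names e_default, e_cluster and e_diff are hypothetical. Clustering changes only the standard errors, not the point estimates, so the residuals that a residual-based test works on are the same:

            ```stata
            * Residuals from -fe- with default standard errors
            xtreg zfatturato zesg zdipendenti zsize, fe
            predict double e_default, e

            * Residuals from -fe- with cluster-robust standard errors
            xtreg zfatturato zesg zdipendenti zsize, fe vce(cluster IMPRESE)
            predict double e_cluster, e

            * The difference should be zero (up to rounding) for every observation
            generate double e_diff = e_default - e_cluster
            summarize e_diff
            ```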
            As previously replied, you have to test for heteroskedasticity/autocorrelation before comparing -fe- vs -re-.
            Again, what is your supervisor's guidance on that?
            Kind regards,
            Carlo
            (Stata 19.0)
