Panel regression: Confusion about FE vs RE estimation

Ingo Brooks

Join Date: Jan 2018

Posts: 13
#1

27 Jan 2021, 10:27

I would like to better understand the bias in the coefficient estimates of panel regressions when the RE assumption is violated. For this purpose, I consider the following setup:
The dependent variable Y_it is given by Y_it = b₁X_t + u_it.

X_t is a macro variable that only changes over time t but not across subjects i.

The disturbance u_it = a_i + v_it is strictly exogenous to X_t. It contains a subject-specific fixed effect a_i and an orthogonal white-noise disturbance term v_it.

Furthermore, there is a subject-specific variable Z_it = c₁a_i + w_it, which is correlated with the subject-specific fixed effects.

For the situation described above, I would like to analyze under what circumstances the panel regression Y_it = b₁X_t + b₂Z_it + a_i + v_it can be estimated consistently with the RE estimator, and when the FE estimator is needed instead.
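A short substitution, using only the definitions above and assuming c₁ ≠ 0, makes explicit how Z_it relates to Y_it in this setup. It is a sketch of the algebra, not a statement about any particular estimator:

```latex
\begin{align*}
Y_{it} &= b_1 X_t + a_i + v_{it}, \qquad Z_{it} = c_1 a_i + w_{it} \\
a_i &= \frac{Z_{it} - w_{it}}{c_1} \\
Y_{it} &= b_1 X_t + \frac{1}{c_1} Z_{it} + \Bigl( v_{it} - \frac{w_{it}}{c_1} \Bigr)
\end{align*}
```

Two things follow from the algebra alone: Z_it enters Y_it only through a_i, and once a_i is replaced by Z_it the coefficient on Z_it is 1/c₁, while the new error term contains w_it, which is itself a component of Z_it. With c₁ = 0.5, as in the simulation code of this post, 1/c₁ = 2.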

To study this, I perform a single simulation run as follows:

Code:

cls
clear all

* Panel setting
local N = 50
local T = 100

* Setting for sigma_z
local sigma_z = 0.05

* Span the panel
tempfile Data_N
set obs `N'
gen ID = _n
gen a_i = rnormal(0, 1)   // fixed effects a_i
save "`Data_N'"
drop _all
set obs `T'
gen t = _n
gen X_t = rnormal(2, 1)   // c0=2, sigma(epsilon)=1
cross using "`Data_N'"
order ID t
xtset ID t, generic

* Compute Y_it
gen Y_it = 2*X_t + a_i + rnormal(0,1)

* Generate subject characteristic Z_it
gen Z_it = 0.5*a_i + rnormal(0, `sigma_z')

* Panel regression & Hausman test
xtreg Y_it X_t Z_it, fe
estimates store FE
xtreg Y_it X_t Z_it, re
estimates store RE
predict a_i_hat, u
predict resid, ue
hausman FE RE, sigmamore
corr X_t Z_it a_i_hat resid

What confuses me is that the Hausman test rejects the null hypothesis of the RE assumption at any conventional level, yet the RE estimate of b₂ is much closer to the true value of 2 than the FE estimate. Put differently, the RE estimator appears to be less biased than the FE estimator here, even though the Hausman test clearly favors FE estimation. Why is this? When facing such a situation in practice, does it really pay off to rely on FE estimation? Am I missing something?
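Since a single run may be unrepresentative, the same design can be repeated many times to inspect the sampling distribution of both estimates of b₂. The following is only a minimal sketch using -simulate-; the program name fe_re_sim, the tempfile name, the seed, and the number of replications are illustrative assumptions, and the design values (N = 50, T = 100, sigma_z = 0.05) are copied from the code above.

```stata
capture program drop fe_re_sim
program define fe_re_sim, rclass
    * One replication of the design above (hypothetical helper program)
    drop _all
    set obs 50
    gen ID = _n
    gen a_i = rnormal(0, 1)
    tempfile subjects
    save "`subjects'"
    drop _all
    set obs 100
    gen t = _n
    gen X_t = rnormal(2, 1)
    cross using "`subjects'"
    xtset ID t
    gen Y_it = 2*X_t + a_i + rnormal(0, 1)
    gen Z_it = 0.5*a_i + rnormal(0, 0.05)
    * Store the coefficient on Z_it from each estimator
    xtreg Y_it X_t Z_it, fe
    return scalar b2_fe = _b[Z_it]
    xtreg Y_it X_t Z_it, re
    return scalar b2_re = _b[Z_it]
end

set seed 12345
simulate b2_fe = r(b2_fe) b2_re = r(b2_re), reps(200) nodots: fe_re_sim
summarize b2_fe b2_re
```

Comparing the means and standard deviations reported by -summarize- shows whether the gap between the FE and RE estimates is systematic or specific to one draw.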

Thanks a lot for your thoughts and insights on this.
Carlo Lazzaro

Join Date: Apr 2014

Posts: 17678
#2

27 Jan 2021, 10:50

Ingo:
I find it difficult to reply positively without taking a look at what Stata gave you back (as the FAQ recommends that posters do).
That said, your code assumes that there are no heteroskedasticity, autocorrelation, or cross-panel correlation issues.
More substantively, since you decided to deal with a T>N panel dataset, -xtreg- is not the way to go, as it was conceived for short panel datasets (i.e., those with N>T).
Take a look at -xtgls- and -xtregar-.

Kind regards,
Carlo
(Stata 19.0)

Ingo Brooks

Join Date: Jan 2018
Posts: 13

27 Jan 2021, 11:32

Thanks, Carlo, for your thoughts. I thought that, since my code block can easily be run in Stata, that would do the trick. However, here are the results from a specific simulation run in which I set N=500 and T=100, so that N>T:

Code:


. clear all

. 
. 
. * Panel setting
.   local N = 500  

.   local T = 100

.   
. * Setting for sigma_z
.   local sigma_z = 0.05  

.   
. * Span the panel
.   tempfile Data_N

.   set obs `N'
number of observations (_N) was 0, now 500

.   gen ID = _n

.   gen a_i = rnormal(0, 1)   // fixed effects a_i

.   save "`Data_N'"
file C:\Users\INGO~1.BRO\AppData\Local\Temp\ST_6b94_000001.tmp saved

.   drop _all

.   set obs `T'
number of observations (_N) was 0, now 100

.   gen t = _n

.   gen X_t = rnormal(2, 1)  // c0=2, sigma(epsilon)=1

.   cross using "`Data_N'"

.   order ID t

.   xtset ID t, generic
       panel variable:  ID (strongly balanced)
        time variable:  t, 1 to 100
                delta:  1 unit

.   
. * Compute Y_it
.   gen Y_it = 2*X_t + a_i + rnormal(0,1)

.   
. * Generate subject characteristic Z_it
.   gen Z_it = 0.5*a_i + rnormal(0, `sigma_z')

.   
. * Panel regression & Hausman test
.   xtreg Y_it X_t Z_it, fe

Fixed-effects (within) regression               Number of obs     =     50,000
Group variable: ID                              Number of groups  =        500

R-sq:                                           Obs per group:
     within  = 0.8090                                         min =        100
     between = 0.9882                                         avg =      100.0
     overall = 0.6751                                         max =        100

                                                F(2,49498)        =  104835.26
corr(u_i, Xb)  = -0.0071                        Prob > F          =     0.0000

------------------------------------------------------------------------------
        Y_it |      Coef.   Std. Err.      t    P>|t|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
         X_t |   1.996059   .0043592   457.90   0.000     1.987515    2.004603
        Z_it |  -.0298832   .0897779    -0.33   0.739    -.2058491    .1460826
       _cons |   .0026067   .0097796     0.27   0.790    -.0165614    .0217748
-------------+----------------------------------------------------------------
     sigma_u |  1.0041056
     sigma_e |  .99609093
         rho |  .50400688   (fraction of variance due to u_i)
------------------------------------------------------------------------------
F test that all u_i=0: F(499, 49498) = 2.19                  Prob > F = 0.0000

.   estimates store FE

.   xtreg Y_it X_t Z_it, re

Random-effects GLS regression                   Number of obs     =     50,000
Group variable: ID                              Number of groups  =        500

R-sq:                                           Obs per group:
     within  = 0.8071                                         min =        100
     between = 0.9882                                         avg =      100.0
     overall = 0.8360                                         max =        100

                                                Wald chi2(2)      =  248320.19
corr(u_i, X)   = 0 (assumed)                    Prob > chi2       =     0.0000

------------------------------------------------------------------------------
        Y_it |      Coef.   Std. Err.      z    P>|z|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
         X_t |   1.995729   .0043813   455.51   0.000     1.987141    2.004316
        Z_it |   1.985372   .0098336   201.90   0.000     1.966098    2.004645
       _cons |   .0038504   .0099983     0.39   0.700     -.015746    .0234468
-------------+----------------------------------------------------------------
     sigma_u |  .04075747
     sigma_e |  .99609093
         rho |  .00167144   (fraction of variance due to u_i)
------------------------------------------------------------------------------

.   estimates store RE

.   predict a_i_hat, u

.   predict resid, ue

.   hausman FE RE, sigmamore

Note: the rank of the differenced variance matrix (1) does not equal the number of coefficients being tested (2); be sure this is what you expect, or there may be problems computing the test.  Examine the output of your estimators for
        anything unexpected and possibly consider scaling your variables so that the coefficients are on a similar scale.

                 ---- Coefficients ----
             |      (b)          (B)            (b-B)     sqrt(diag(V_b-V_B))
             |       FE           RE         Difference          S.E.
-------------+----------------------------------------------------------------
         X_t |    1.996059     1.995729        .0003308        .0000147
        Z_it |   -.0298832     1.985372       -2.015255        .0896963
------------------------------------------------------------------------------
                           b = consistent under Ho and Ha; obtained from xtreg
            B = inconsistent under Ha, efficient under Ho; obtained from xtreg

    Test:  Ho:  difference in coefficients not systematic

                  chi2(1) = (b-B)'[(V_b-V_B)^(-1)](b-B)
                          =      504.79
                Prob>chi2 =      0.0000
                (V_b-V_B is not positive definite)

.   corr X_t Z_it a_i_hat resid
(obs=50,000)

             |      X_t     Z_it  a_i_hat    resid
-------------+------------------------------------
         X_t |   1.0000
        Z_it |   0.0003   1.0000
     a_i_hat |   0.0000   0.1091   1.0000
       resid |   0.0000   0.0017   0.1078   1.0000

The true coefficient on the explanatory variable Z_it is 2. Therefore, the RE estimate is much closer to the true value than the FE estimate. Yet the Hausman test rejects the null hypothesis of the RE assumption, and hence I am supposed to favor the FE estimator because Z_it is endogenous, being correlated with the fixed effect. Is this a general issue with the Hausman test?


Carlo Lazzaro

Join Date: Apr 2014

Posts: 17678
#4

27 Jan 2021, 11:53

Ingo:
some comments about your post:
1) T=100 and N=500 make yours a panel that is long in both N and T; hence, -xtreg- is not the way to go. An example of a short panel would be N=500, T=8;
2) the properties of -hausman- are asymptotic; hence, it frequently gives back unexpected results. In your case, as

(V_b-V_B is not positive definite)

the results of the -hausman- test are not that reliable.
3) the correlation between the panel-wise effect and the vector of regressors is really low in your -fe- specification.
4) you may give it a shot with the community-contributed programme -xtoverid- from SSC (just type -search xtoverid- to spot and install it).
After installing -xtoverid- (and its prerequisite community-contributed companions), you can code as follows:

Code:

xtreg Y_it X_t Z_it, re
xtoverid

If -xtoverid- rejects the null, you should go -fe-.
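A regression-based alternative that needs no extra package is the Mundlak (correlated random effects) approach: add the panel mean of the time-varying regressor to the -re- specification and test its coefficient. This is a different technique from -xtoverid-, sketched here under the assumption that the variables are named as in the code above; Z_bar is a hypothetical variable name.

```stata
* Panel mean of Z_it (Z_bar is a hypothetical variable name)
bysort ID: egen Z_bar = mean(Z_it)

* X_t does not vary across subjects, so its panel mean is the same for
* every ID and would be collinear with the constant; it is left out
xtreg Y_it X_t Z_it Z_bar, re vce(cluster ID)

* Ho: the coefficient on Z_bar is zero (RE assumption not rejected)
test Z_bar
```

If -test- rejects the null, the panel mean of Z_it carries information about the panel-wise effect, which points towards -fe-.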

Kind regards,
Carlo
(Stata 19.0)