Stata - Fixed effects or Random effects with panel data and time invariant interaction dummies

Joeri Goulooze

Join Date: Aug 2022

Posts: 2
#1


01 Sep 2022, 10:10

Hello Statalist forum,

I have been using Stata for around a year and am currently working on the following research hypothesis and regression equation: the difference in the financial performance of family firms relative to non-family firms increased during the COVID-19 pandemic.

I plan to answer that question by running the regression equation below in a random effects model.
ROE % = B₁Family_ist + B₂COVIDcrisis_t + B₃Family_ist * COVIDcrisis_t + B₄Size_ist + B₅Age_ist + B₆Leverage_ist + Z_s + t_t + e_ist

Where firm performance = ROE in %, Family = family firm dummy variable (1 for family, 0 for non-family), COVIDcrisis = crisis dummy variable (1 for year 2020, 0 for years 2014-2019), Family * COVIDcrisis = interaction term between Family and COVIDcrisis, Size/Age/Leverage are control variables, Z = industry fixed effect and t = time fixed effect. Subscripts: i = individual firm, s = industry, t = year; e = error term.
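A short derivation may help to see which coefficient carries the hypothesis. It is only a sketch under the specification above, writing \beta_k for the B_k coefficients and holding the controls, the industry effect, and the year effect fixed:

\[
\begin{aligned}
\text{gap}_{\text{crisis}} &= E[\text{ROE} \mid \text{Family}=1,\ \text{COVIDcrisis}=1] - E[\text{ROE} \mid \text{Family}=0,\ \text{COVIDcrisis}=1] = \beta_1 + \beta_3 \\
\text{gap}_{\text{pre}} &= E[\text{ROE} \mid \text{Family}=1,\ \text{COVIDcrisis}=0] - E[\text{ROE} \mid \text{Family}=0,\ \text{COVIDcrisis}=0] = \beta_1 \\
\text{gap}_{\text{crisis}} - \text{gap}_{\text{pre}} &= \beta_3
\end{aligned}
\]

Read this way, the change in the family/non-family gap is the interaction coefficient B₃ alone, while B₁ is the gap in the non-crisis years. The interaction Family_ist * COVIDcrisis_t changes over time within family firms, so B₃ is not a time-invariant coefficient.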

Originally I wanted to use a fixed-effects model, however after doing some research I came to the conclusion that a random effects model would be more suitable (random effects allows for having time-invariant dummy variables, such as family firm and COVIDcrisis in my case). I have 8 years of data for 259 firms in my panel (2072 obs.).

However, as suggested on this forum, I also ran the Hausman test to check whether RE or FE should be used. The FE model dropped my family firm dummy variable, since it is time invariant, and also Age and Industry. Based on the remaining variables, the test concluded that I should use fixed effects instead of random effects. This result is causing some confusion for me, so I was wondering whether the expertise on this forum could assist me.

Question: Would it be better to use FE or RE when looking at the above described variables and question of interest? Also, is the code below correct for using my RE/FE model?

Code:

egen ID = group(GlobalCompanyKey)
egen ID_industry = group(Division)
gsort ID Year
xtset ID Year, yearly
xtreg ROE_percent Family COVIDcrisis Family#COVIDcrisis Size Age Leverage i.ID_industry i.Year, re
xtreg ROE_percent Family COVIDcrisis Family#COVIDcrisis Size Age Leverage ID_industry Year, fe
estimates store fe
xtreg ROE_percent Family COVIDcrisis Family#COVIDcrisis Size Age Leverage i.ID_industry i.Year, re
estimates store re
hausman fe re

Thank you in advance for your advice and learnings!

Last edited by Joeri Goulooze; 01 Sep 2022, 10:15.
Tags: fixed effects, interaction, panel data, random effects
Carlo Lazzaro

Join Date: Apr 2014

Posts: 17673
#2

02 Sep 2022, 01:16

Joeri:
welcome to this forum.
I suspect that you switched to a random effects model because the -fe- estimator cannot give you back the time-invariant coefficients you're interested in (and I think that, if your research goal deals with within-panel differences across time, -fe- is the way to go, despite its limitations).
That said, your code looks fine to me, except for the predictor -Year-, which I would treat as categorical in the -fe- specification, too.
I also assume that you've already explored the non-default standard error issue and decided that the default standard errors work well in your case (are you really sure about that?).
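As a rough illustration of the point about -Year-, the sketch below fits both models with the same regressors and i.Year in each before calling -hausman-. It runs on a simulated balanced panel, so every number in it is made up; only the variable names are borrowed from the first post, and Age and the industry dummies are left out to keep it short.

```stata
* Simulated balanced panel: 259 firms x 8 years (illustrative values only)
clear
set seed 2022
set obs 259
generate ID = _n
generate Family = runiform() < 0.4
generate u_i = rnormal()                  // unobserved firm effect
expand 8
bysort ID: generate Year = 2012 + _n      // years 2013-2020
generate COVIDcrisis = (Year == 2020)
generate Size = 0.5*u_i + rnormal()       // correlated with the firm effect
generate Leverage = rnormal()
generate ROE_percent = 1*Family - 2*COVIDcrisis + 0.8*Family*COVIDcrisis ///
    + 0.9*Size - 0.2*Leverage + u_i + rnormal()
xtset ID Year

* Same right-hand side in both models, with Year categorical in each
xtreg ROE_percent i.Family##i.COVIDcrisis Size Leverage i.Year, fe
estimates store fe
xtreg ROE_percent i.Family##i.COVIDcrisis Size Leverage i.Year, re
estimates store re
hausman fe re
```

Both models are fit with the default standard errors here because -hausman- is built on the conventional variance estimates. One year dummy is dropped for collinearity with COVIDcrisis, as in any specification that includes both.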

Kind regards,
Carlo
(Stata 19.0)

Joeri Goulooze

Join Date: Aug 2022
Posts: 2

03 Sep 2022, 05:27

Good morning Carlo, thank you for welcoming me and for your reply!

Regarding the standard errors, I have now added the option for clustering the standard errors by firm ID

Code:

vce(cluster ID)

for both models. Additionally, I have now also included the i.Year categorical specification for the FE model to be in line with my other regressions.

However, regarding the FE/RE choice: thank you for your input, but I am still undecided about why FE would be a better option than RE.

I do believe that I really need all the coefficients (including family firm). To determine whether the difference in performance between family and non-family firms increased, stayed the same, or decreased during the COVID-19 pandemic, I will need the family firm coefficient to calculate that difference. Besides, I have read in the Wooldridge econometrics book and on this forum that the random effects model is more suitable when you are looking for such time-invariant coefficients. As this is research for my Master's thesis, I am wondering whether I am overthinking this.

When I ran the first RE model, the statistics looked good and the model seems fine (in my understanding); see the results below:

Code:

xtreg ROA_percent i.Family##i.COVIDcrisis Size Age Leverage i.ID_industry i.Year, re vce(cluster ID)
note: 2020.Year omitted because of collinearity.

Random-effects GLS regression                   Number of obs     =      2,072
Group variable: ID                              Number of groups  =        259

R-squared:                                      Obs per group:
     Within  = 0.0179                                         min =          8
     Between = 0.0986                                         avg =        8.0
     Overall = 0.0637                                         max =          8

                                                Wald chi2(19)     =      57.07
corr(u_i, X) = 0 (assumed)                      Prob > chi2       =     0.0000

                                         (Std. err. adjusted for 259 clusters in ID)
------------------------------------------------------------------------------------
                   |               Robust
       ROA_percent | Coefficient  std. err.      z    P>|z|     [95% conf. interval]
-------------------+----------------------------------------------------------------
          1.Family |   1.020816   .8851688     1.15   0.249    -.7140828    2.755715
     1.COVIDcrisis |  -1.465001   1.099918    -1.33   0.183    -3.620802     .690799
                   |
Family#COVIDcrisis |
              1 1  |   .7447068   1.031939     0.72   0.471    -1.277857     2.76727
                   |
              Size |   .8604789   .2477941     3.47   0.001     .3748115    1.346146
               Age |   .4582726   .5961799     0.77   0.442    -.7102185    1.626764
          Leverage |  -.1832703   .1476763    -1.24   0.215    -.4727104    .1061699
                   |
       ID_industry |
                2  |  -4.519942   1.653934    -2.73   0.006    -7.761592   -1.278291
                3  |    .451344   .7667724     0.59   0.556    -1.051502     1.95419
                4  |  -2.279793   4.883535    -0.47   0.641    -11.85135    7.291758
                5  |    -3.8276   3.771097    -1.01   0.310    -11.21882    3.563614
                6  |   .7528745   1.079892     0.70   0.486    -1.363675    2.869424
                7  |  -.5669462     1.5353    -0.37   0.712    -3.576079    2.442187
                8  |  -3.040993   1.598288    -1.90   0.057     -6.17358    .0915953
                   |
              Year |
             2014  |   .0855472   .4719153     0.18   0.856    -.8393898    1.010484
             2015  |   .1853412   .6891949     0.27   0.788    -1.165456    1.536138
             2016  |    1.02752   .7373443     1.39   0.163    -.4176487    2.472688
             2017  |   .8071915    .767292     1.05   0.293    -.6966733    2.311056
             2018  |  -.0050738   .7869919    -0.01   0.995     -1.54755    1.537402
             2019  |  -.8805421   .8395657    -1.05   0.294    -2.526061    .7649764
             2020  |          0  (omitted)
                   |
             _cons |   1.534652   3.423545     0.45   0.654    -5.175374    8.244678
-------------------+----------------------------------------------------------------
           sigma_u |  6.7219152
           sigma_e |  6.7597676
               rho |  .49719233   (fraction of variance due to u_i)

For comparison, this is the FE model

Code:

 xtreg ROA_percent i.Family##i.COVIDcrisis Size Age Leverage i.ID_industry i.Year, fe vce(cluster ID)
note: 1.Family omitted because of collinearity.
note: Age omitted because of collinearity.
note: 2.ID_industry omitted because of collinearity.
note: 3.ID_industry omitted because of collinearity.
note: 4.ID_industry omitted because of collinearity.
note: 5.ID_industry omitted because of collinearity.
note: 6.ID_industry omitted because of collinearity.
note: 7.ID_industry omitted because of collinearity.
note: 8.ID_industry omitted because of collinearity.
note: 2020.Year omitted because of collinearity.

Fixed-effects (within) regression               Number of obs     =      2,072
Group variable: ID                              Number of groups  =        259

R-squared:                                      Obs per group:
     Within  = 0.0205                                         min =          8
     Between = 0.0598                                         avg =        8.0
     Overall = 0.0389                                         max =          8

                                                F(10,258)         =       3.52
corr(u_i, Xb) = -0.4710                         Prob > F          =     0.0002

                                         (Std. err. adjusted for 259 clusters in ID)
------------------------------------------------------------------------------------
                   |               Robust
       ROA_percent | Coefficient  std. err.      t    P>|t|     [95% conf. interval]
-------------------+----------------------------------------------------------------
          1.Family |          0  (omitted)
     1.COVIDcrisis |  -2.115584   1.114962    -1.90   0.059    -4.311168         .08
                   |
Family#COVIDcrisis |
              1 1  |     .83765    1.02926     0.81   0.416     -1.18917     2.86447
                   |
              Size |   2.413395   1.452916     1.66   0.098    -.4476892     5.27448
               Age |          0  (omitted)
          Leverage |  -.1936553   .1512975    -1.28   0.202    -.4915905    .1042799
                   |
       ID_industry |
                2  |          0  (omitted)
                3  |          0  (omitted)
                4  |          0  (omitted)
                5  |          0  (omitted)
                6  |          0  (omitted)
                7  |          0  (omitted)
                8  |          0  (omitted)
                   |
              Year |
             2014  |  -.0104053   .4377067    -0.02   0.981    -.8723379    .8515273
             2015  |  -.0148713    .638139    -0.02   0.981    -1.271496    1.241753
             2016  |   .7371309   .6931752     1.06   0.289    -.6278706    2.102132
             2017  |   .4349871   .7436228     0.58   0.559    -1.029356     1.89933
             2018  |  -.4709956   .7796995    -0.60   0.546    -2.006381     1.06439
             2019  |  -1.468117   .9000626    -1.63   0.104    -3.240521    .3042879
             2020  |          0  (omitted)
                   |
             _cons |  -5.420082   8.891037    -0.61   0.543    -22.92832    12.08816
-------------------+----------------------------------------------------------------
           sigma_u |  8.1153951
           sigma_e |  6.7597676
               rho |  .59038296   (fraction of variance due to u_i)


Carlo Lazzaro

Join Date: Apr 2014

Posts: 17673
#4

03 Sep 2022, 13:46

Joeri:
1) both your models show low within Rsq (relevant for the -fe- specification) and between Rsq (relevant for the -re- specification);
2) you can check the appropriateness of the functional form of your regressand via the same procedure detailed in the -linktest- entry, Stata .pdf manual (unfortunately, you should replicate it by hand, as -linktest- does not work after -xtreg-);
3) what does -xttest0- give you back after -xtreg, re-?
4) in order to estimate time-invariant coefficients when -fe- is the way to go, you may want to consider Mundlak's approach (detailed in one of the Stata Blog entries).
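Points 3) and 4) could be tried along the following lines. This is a minimal sketch on a simulated balanced panel, so all numbers are made up; the variable names come from the thread, the mean_* variables are hypothetical helpers, and Age and the industry dummies are left out for brevity. The Mundlak-style step adds the firm means of the time-varying regressors to the -re- model, so Family keeps its coefficient.

```stata
* Simulated balanced panel: 259 firms x 8 years (illustrative values only)
clear
set seed 2022
set obs 259
generate ID = _n
generate Family = runiform() < 0.4
generate u_i = rnormal()                  // unobserved firm effect
expand 8
bysort ID: generate Year = 2012 + _n      // years 2013-2020
generate COVIDcrisis = (Year == 2020)
generate Size = 0.5*u_i + rnormal()       // correlated with the firm effect
generate Leverage = rnormal()
generate ROA_percent = 1*Family - 2*COVIDcrisis + 0.8*Family*COVIDcrisis ///
    + 0.9*Size - 0.2*Leverage + u_i + rnormal()
xtset ID Year

* Point 3: Breusch-Pagan LM test for random effects after -xtreg, re-
xtreg ROA_percent i.Family##i.COVIDcrisis Size Leverage i.Year, re
xttest0

* Point 4: Mundlak-style device, firm means of the time-varying regressors
foreach v of varlist Size Leverage {
    bysort ID: egen mean_`v' = mean(`v')
}
xtreg ROA_percent i.Family##i.COVIDcrisis Size Leverage ///
    mean_Size mean_Leverage i.Year, re vce(cluster ID)

* Joint test of the added means
test mean_Size mean_Leverage
```

A joint rejection in the last line points toward -fe-, much like a Hausman test, and it can be run with clustered standard errors. In a balanced panel the firm means of the year dummies are the same for every firm, and the firm mean of the interaction is proportional to Family, so only the means of Size and Leverage are added.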

Kind regards,
Carlo
(Stata 19.0)
