
  • Stata - Fixed effects or Random effects with panel data and time invariant interaction dummies

    Hello Statalist forum,

    I have been using Stata for about a year, and I am currently working on the following research hypothesis: the difference in financial performance between family firms and non-family firms increased during the COVID-19 pandemic.

    I want to test this hypothesis by estimating the regression equation below with a random-effects model.
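    $$\text{ROE}_{ist}\,(\%) = \beta_1 \text{Family}_{ist} + \beta_2 \text{COVIDcrisis}_{ist} + \beta_3 \left(\text{Family}_{ist} \times \text{COVIDcrisis}_{ist}\right) + \beta_4 \text{Size}_{ist} + \beta_5 \text{Age}_{ist} + \beta_6 \text{Leverage}_{ist} + Z_s + t_t + e_{ist}$$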

    Here firm performance is ROE in %, Family is the family-firm dummy (1 for family, 0 for non-family), COVIDcrisis is a dummy (1 for year 2020, 0 for years 2014-2019), Family * COVIDcrisis is their interaction term, Size, Age and Leverage are control variables, Z is the industry fixed effect and t is the time fixed effect. The subscripts are i for the individual firm, s for the industry and t for the year; e is the error term.
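
To make the role of the interaction term explicit, here is a short worked derivation based on the equation above. It is only a sketch: it assumes the controls (Size, Age, Leverage), the industry and year effects, and the error term are held constant across the groups being compared, with $\beta_1$, $\beta_2$ and $\beta_3$ denoting the coefficients on Family, COVIDcrisis and Family * COVIDcrisis.

```latex
\begin{aligned}
\text{Gap}_{\text{pre}}
  &= E[\text{ROE} \mid \text{Family}=1,\ \text{COVIDcrisis}=0]
   - E[\text{ROE} \mid \text{Family}=0,\ \text{COVIDcrisis}=0]
   = \beta_1 \\
\text{Gap}_{\text{crisis}}
  &= E[\text{ROE} \mid \text{Family}=1,\ \text{COVIDcrisis}=1]
   - E[\text{ROE} \mid \text{Family}=0,\ \text{COVIDcrisis}=1]
   = (\beta_1 + \beta_2 + \beta_3) - \beta_2
   = \beta_1 + \beta_3 \\
\text{Gap}_{\text{crisis}} - \text{Gap}_{\text{pre}}
  &= \beta_3
\end{aligned}
```

On this reading, the hypothesized increase in the performance gap corresponds to $\beta_3 > 0$, and $\beta_1$ cancels out of the change in the gap.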

    Originally I wanted to use a fixed-effects model. However, after doing some research I concluded that a random-effects model would be more suitable (random effects allow for time-invariant dummy variables, such as family firm and COVIDcrisis in my case). I have 8 years of data for 259 firms in my panel (2,072 obs.).

    However, as suggested on this forum, I also ran the Hausman test to check whether RE or FE should be used. The FE model dropped my family firm dummy variable because it is time-invariant, and it also dropped Age and Industry. Based on the remaining variables, the test concluded that I should use fixed effects instead of random effects. This result confuses me, so I was wondering whether the experts on this forum could help.

    Question: Would it be better to use FE or RE given the variables and research question described above? Also, is the code below correct for my RE/FE models?

    Code:
    egen ID = group(GlobalCompanyKey)
    egen ID_industry = group(Division)
    gsort ID Year
    xtset ID Year, yearly
    xtreg ROE_percent Family COVIDcrisis Family#COVIDcrisis Size Age Leverage i.ID_industry i.Year, re
    
    xtreg ROE_percent Family COVIDcrisis Family#COVIDcrisis Size Age Leverage ID_industry Year, fe
    estimates store fe
    xtreg ROE_percent Family COVIDcrisis Family#COVIDcrisis Size Age Leverage i.ID_industry i.Year, re
    estimates store re
    hausman fe re
    Thank you in advance for your advice!
    Last edited by Joeri Goulooze; 01 Sep 2022, 10:15.

  • #2
    Joeri:
    welcome to this forum.
    I suspect that you switched to a random-effects model because the -fe- estimator cannot give you back the time-invariant coefficients you're interested in (and I think that, if your research goal deals with within-panel differences across time, -fe- is the way to go, despite its limitations).
    That said, your code looks fine to me, except for the predictor -Year-, which I would treat as categorical in the -fe- specification, too.
    I also assume that you've already explored the non-default standard error issue and decided that the default standard errors work well in your case (are you really sure about that?).
    Kind regards,
    Carlo
    (Stata 19.0)
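
A minimal sketch of the suggestion above, with the year entered as a factor variable in both the -fe- and the -re- model before running -hausman-. The data are simulated only so that the block runs on its own: the variable names (id, year, family, covid, size, roe) and every parameter value are assumptions made for illustration, not the poster's data. Both models use the default standard errors here.

```stata
* Simulated balanced panel: 259 firms x 8 years (all names and values are hypothetical)
clear
set seed 12345
set obs 259
generate id = _n
generate family = runiform() < 0.4        // time-invariant firm dummy
generate u = rnormal(0, 2)                // unobserved firm effect
expand 8
bysort id: generate year = 2012 + _n      // years 2013-2020
generate covid = year == 2020
generate size = 5 + 0.5*u + rnormal()     // correlated with u by construction
generate roe = 1 + family - 2*covid + 0.8*family*covid + 0.9*size + u + rnormal(0, 3)
xtset id year

* Same specification in both models, year as a factor variable
xtreg roe i.family##i.covid size i.year, fe
estimates store fe
xtreg roe i.family##i.covid size i.year, re
estimates store re

* Compare the coefficients the two models have in common
hausman fe re
```

Because size is built to be correlated with the firm effect in this simulation, the sketch is set up so that the two estimators should disagree; with real data the outcome depends on the data.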



    • #3
      Good morning Carlo, thank you for welcoming me and for your reply!

      Regarding the standard errors, I have now added the option to cluster the standard errors by firm ID
      Code:
      vce(cluster ID)
      for both models. Additionally, I have now also included the i.Year categorical specification for the FE model to be in line with my other regressions.

      However, regarding the FE/RE choice, thank you for your input, but I am still undecided about why FE would be a better option than RE.

      I do believe that I really need all the coefficients (including family firm). To determine whether the performance difference between family and non-family firms increased, stayed the same or decreased during the COVID-19 pandemic, I will need the family firm coefficient to calculate that difference. Besides, I have read in the Wooldridge econometrics book and on this forum that the random-effects model is more suitable when you are looking for such time-invariant coefficients. As this is research for my Master's thesis, I am wondering whether I am overthinking this.

      When I ran the first RE model, the statistics looked good and the model seems fine (as far as I understand); see the results below:

      Code:
      xtreg ROA_percent i.Family##i.COVIDcrisis Size Age Leverage i.ID_industry i.Year, re vce(cluster ID)
      note: 2020.Year omitted because of collinearity.
      
      Random-effects GLS regression                   Number of obs     =      2,072
      Group variable: ID                              Number of groups  =        259
      
      R-squared:                                      Obs per group:
           Within  = 0.0179                                         min =          8
           Between = 0.0986                                         avg =        8.0
           Overall = 0.0637                                         max =          8
      
                                                      Wald chi2(19)     =      57.07
      corr(u_i, X) = 0 (assumed)                      Prob > chi2       =     0.0000
      
                                               (Std. err. adjusted for 259 clusters in ID)
      ------------------------------------------------------------------------------------
                         |               Robust
             ROA_percent | Coefficient  std. err.      z    P>|z|     [95% conf. interval]
      -------------------+----------------------------------------------------------------
                1.Family |   1.020816   .8851688     1.15   0.249    -.7140828    2.755715
           1.COVIDcrisis |  -1.465001   1.099918    -1.33   0.183    -3.620802     .690799
                         |
      Family#COVIDcrisis |
                    1 1  |   .7447068   1.031939     0.72   0.471    -1.277857     2.76727
                         |
                    Size |   .8604789   .2477941     3.47   0.001     .3748115    1.346146
                     Age |   .4582726   .5961799     0.77   0.442    -.7102185    1.626764
                Leverage |  -.1832703   .1476763    -1.24   0.215    -.4727104    .1061699
                         |
             ID_industry |
                      2  |  -4.519942   1.653934    -2.73   0.006    -7.761592   -1.278291
                      3  |    .451344   .7667724     0.59   0.556    -1.051502     1.95419
                      4  |  -2.279793   4.883535    -0.47   0.641    -11.85135    7.291758
                      5  |    -3.8276   3.771097    -1.01   0.310    -11.21882    3.563614
                      6  |   .7528745   1.079892     0.70   0.486    -1.363675    2.869424
                      7  |  -.5669462     1.5353    -0.37   0.712    -3.576079    2.442187
                      8  |  -3.040993   1.598288    -1.90   0.057     -6.17358    .0915953
                         |
                    Year |
                   2014  |   .0855472   .4719153     0.18   0.856    -.8393898    1.010484
                   2015  |   .1853412   .6891949     0.27   0.788    -1.165456    1.536138
                   2016  |    1.02752   .7373443     1.39   0.163    -.4176487    2.472688
                   2017  |   .8071915    .767292     1.05   0.293    -.6966733    2.311056
                   2018  |  -.0050738   .7869919    -0.01   0.995     -1.54755    1.537402
                   2019  |  -.8805421   .8395657    -1.05   0.294    -2.526061    .7649764
                   2020  |          0  (omitted)
                         |
                   _cons |   1.534652   3.423545     0.45   0.654    -5.175374    8.244678
      -------------------+----------------------------------------------------------------
                 sigma_u |  6.7219152
                 sigma_e |  6.7597676
                     rho |  .49719233   (fraction of variance due to u_i)
      For comparison, this is the FE model
      Code:
       xtreg ROA_percent i.Family##i.COVIDcrisis Size Age Leverage i.ID_industry i.Year, fe vce(cluster ID)
      note: 1.Family omitted because of collinearity.
      note: Age omitted because of collinearity.
      note: 2.ID_industry omitted because of collinearity.
      note: 3.ID_industry omitted because of collinearity.
      note: 4.ID_industry omitted because of collinearity.
      note: 5.ID_industry omitted because of collinearity.
      note: 6.ID_industry omitted because of collinearity.
      note: 7.ID_industry omitted because of collinearity.
      note: 8.ID_industry omitted because of collinearity.
      note: 2020.Year omitted because of collinearity.
      
      Fixed-effects (within) regression               Number of obs     =      2,072
      Group variable: ID                              Number of groups  =        259
      
      R-squared:                                      Obs per group:
           Within  = 0.0205                                         min =          8
           Between = 0.0598                                         avg =        8.0
           Overall = 0.0389                                         max =          8
      
                                                      F(10,258)         =       3.52
      corr(u_i, Xb) = -0.4710                         Prob > F          =     0.0002
      
                                               (Std. err. adjusted for 259 clusters in ID)
      ------------------------------------------------------------------------------------
                         |               Robust
             ROA_percent | Coefficient  std. err.      t    P>|t|     [95% conf. interval]
      -------------------+----------------------------------------------------------------
                1.Family |          0  (omitted)
           1.COVIDcrisis |  -2.115584   1.114962    -1.90   0.059    -4.311168         .08
                         |
      Family#COVIDcrisis |
                    1 1  |     .83765    1.02926     0.81   0.416     -1.18917     2.86447
                         |
                    Size |   2.413395   1.452916     1.66   0.098    -.4476892     5.27448
                     Age |          0  (omitted)
                Leverage |  -.1936553   .1512975    -1.28   0.202    -.4915905    .1042799
                         |
             ID_industry |
                      2  |          0  (omitted)
                      3  |          0  (omitted)
                      4  |          0  (omitted)
                      5  |          0  (omitted)
                      6  |          0  (omitted)
                      7  |          0  (omitted)
                      8  |          0  (omitted)
                         |
                    Year |
                   2014  |  -.0104053   .4377067    -0.02   0.981    -.8723379    .8515273
                   2015  |  -.0148713    .638139    -0.02   0.981    -1.271496    1.241753
                   2016  |   .7371309   .6931752     1.06   0.289    -.6278706    2.102132
                   2017  |   .4349871   .7436228     0.58   0.559    -1.029356     1.89933
                   2018  |  -.4709956   .7796995    -0.60   0.546    -2.006381     1.06439
                   2019  |  -1.468117   .9000626    -1.63   0.104    -3.240521    .3042879
                   2020  |          0  (omitted)
                         |
                   _cons |  -5.420082   8.891037    -0.61   0.543    -22.92832    12.08816
      -------------------+----------------------------------------------------------------
                 sigma_u |  8.1153951
                 sigma_e |  6.7597676
                     rho |  .59038296   (fraction of variance due to u_i)
      ---------------------------------------------------------------------



      • #4
        Joeri:
        1) both your models show a low within Rsq (relevant for the -fe- specification) and a low between Rsq (relevant for the -re- specification);
        2) you can check the appropriateness of the functional form of your regressand via the same procedure detailed in the -linktest- entry of the Stata .pdf manual (unfortunately, you have to replicate it by hand, as -linktest- does not work after -xtreg-);
        3) what does -xttest0- give you back after -xtreg, re-?
        4) in order to estimate time-invariant coefficients when -fe- is the way to go, you may want to consider Mundlak's approach (detailed in one of the Stata blog entries).
        Kind regards,
        Carlo
        (Stata 19.0)
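
Points 3) and 4) above can be sketched roughly as follows. This is an illustration on simulated data, not the poster's model: the variable names (id, year, family, covid, size, leverage, roe, mean_size, mean_leverage) and all parameter values are assumptions, and the year dummies are left out to keep the block short. The Mundlak step here simply adds the firm-level means of the time-varying regressors to the -re- model.

```stata
* Simulated balanced panel: 259 firms x 8 years (all names and values are hypothetical)
clear
set seed 2022
set obs 259
generate id = _n
generate family = runiform() < 0.4        // time-invariant firm dummy
generate u = rnormal(0, 2)                // unobserved firm effect
expand 8
bysort id: generate year = 2012 + _n      // years 2013-2020
generate covid = year == 2020
generate size = 5 + 0.5*u + rnormal()     // correlated with u by construction
generate leverage = 2 + rnormal()
generate roe = 1 + family - 2*covid + 0.8*family*covid + 0.9*size ///
    - 0.2*leverage + u + rnormal(0, 3)
xtset id year

* Point 3: Breusch-Pagan LM test after -xtreg, re- (default standard errors)
xtreg roe i.family##i.covid size leverage, re
xttest0

* Point 4: Mundlak-style model, firm means of the time-varying regressors added
bysort id: egen mean_size = mean(size)
bysort id: egen mean_leverage = mean(leverage)
xtreg roe i.family##i.covid size leverage mean_size mean_leverage, re vce(cluster id)

* Joint test of the firm means; the family coefficient stays in the model
test mean_size mean_leverage
```

In this setup the coefficient on 1.family remains estimable, and the joint test on the firm means indicates whether the added means matter.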
