  • Panel data regression techniques for company data

    Hello network,

    I am currently writing a master's thesis on the integration of the European insurance market. I would like to measure how different countries' insurance markets react to a global shock and how country-level variables affect the outcome. I have a panel dataset of 942 life insurance companies from 18 countries over 9 years (2014-2022), with some missing values, mostly in 2022. I have tried xtreg with fixed effects as well as random effects. Both give me some results, but the F-statistic is very low.
    Code used: xtreg GWP_growth GWP c4_ratio GDP_growth inflation economic_downturn_dummy
    GWP = gross written premiums, and c4_ratio is a measure of market concentration. I have tried other variables, but the F-statistic does not seem to improve. What better techniques or model structures can I use to get better results?
    Thanks in advance for your help!

  • #2
    Wout:
    welcome to this forum.
    As per FAQ, please share what you typed and what Stata gave you back to increase your chances of getting helpful replies. Thanks.
    Kind regards,
    Carlo
    (StataNow 18.5)


    • #3
      My code: xtreg GWP_growth_life_w log_total_revenue_life solvency_ratio_w marketshare GDP_growth log_population inflation c4_ratio, fe
      Response:
      Fixed-effects (within) regression               Number of obs     =      3,863
      Group variable: company1                        Number of groups  =        523

      R-squared:                                      Obs per group:
           Within  = 0.1069                                         min =          1
           Between = 0.0003                                         avg =        7.4
           Overall = 0.0015                                         max =          9

                                                      F(7, 3333)        =      57.02
      corr(u_i, Xb) = -0.9713                         Prob > F          =     0.0000

      ----------------------------------------------------------------------------------
      GWP_growth_lif~w | Coefficient  Std. err.      t    P>|t|     [95% conf. interval]
      -----------------+----------------------------------------------------------------
      log_total_reve~e |   33.41725   1.754276    19.05   0.000     29.97768    36.85681
      solvency_ratio_w |   .2176293   .0400664     5.43   0.000     .1390721    .2961865
           marketshare |   77.81642   24.82696     3.13   0.002      29.1388     126.494
            GDP_growth |   .1456874   .2500398     0.58   0.560    -.3445597    .6359345
        log_population |  -52.45524   50.05718    -1.05   0.295    -150.6012    45.69067
             inflation |    .610816   .4248622     1.44   0.151    -.2222011    1.443833
              c4_ratio |  -13.78162   19.56075    -0.70   0.481    -52.13391    24.57067
                 _cons |   514.8184   870.9144     0.59   0.554    -1192.763    2222.399
      -----------------+----------------------------------------------------------------
               sigma_u |  111.70597
               sigma_e |  44.605294
                   rho |  .86247912   (fraction of variance due to u_i)
      ----------------------------------------------------------------------------------
      F test that all u_i=0: F(522, 3333) = 2.39                   Prob > F = 0.0000


      • #4
        It helps others read your output if you put it within CODE tags (see the # button on the ribbon).
        There are two F statistics reported, which report different things. The first, under the R-squared values, is the overall model fit, which is 57 and statistically significant. The second is the last line and it has to do with the panel structure of the data. It is 2.39, which is low, but still significant.

        In your panel model, you have Y(i,t) = b0 + b1*X1(i,t) + ... + u(i) + e(i,t)

        The Xs and Ys are for each firm (i) at time (t), and u(i) is the firm-specific, time-invariant effect for each firm. The second F test has the null that all the u(i) are 0; i.e., there are no firm-specific effects in the model. The test is rejected, but with a fairly low F-stat, indicating there is at least one firm with a statistically significant u(i). Also, the sigma_u and sigma_e values give rho = 0.86; as the output line itself states, this is the fraction of variance due to u(i), so about 86% of the error variance is attributed to the firm-specific effects as opposed to the idiosyncratic error e(i,t). I suspect, but feel free to verify, that your fixed effect results will be similar to the random effects results. Additionally, I would not be surprised if a pooled OLS model yielded similar results as well.

        Not sure what you mean by getting "better results". It appears there are only small differences between your panels (insurance companies), at least given the current variables in the model.
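
        To make the distinction between the two F statistics concrete, here is a minimal sketch on the public nlswork data (not the insurance panel; the regressors are placeholders chosen only for illustration). It reads the two F statistics and rho from the stored results after -xtreg, fe-, then lines up the FE, RE, and pooled OLS estimates so that any conjecture about their similarity can be checked directly rather than assumed.

        ```stata
        * Illustrative only: public dataset and placeholder regressors,
        * not the variables from this thread.
        use "https://www.stata-press.com/data/r18/nlswork.dta", clear
        xtset idcode year

        * Fixed effects with default standard errors, so that the
        * F test of "all u_i = 0" is reported and stored.
        xtreg ln_wage c.age##c.age tenure, fe
        display "Overall model F      = " e(F)
        display "F test, all u_i = 0  = " e(F_f)
        * rho is the share of the error variance attributed to u_i.
        display "rho recomputed       = " e(sigma_u)^2 / (e(sigma_u)^2 + e(sigma_e)^2)
        estimates store fe

        xtreg ln_wage c.age##c.age tenure, re
        estimates store re

        regress ln_wage c.age##c.age tenure
        estimates store ols

        * Side-by-side coefficients and standard errors.
        estimates table fe re ols, b(%9.4f) se(%9.4f)
        ```

        If the three columns differ a lot, that is a sign the firm effects are correlated with the regressors and the choice of estimator matters.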


        • #5
          Wout:
          as an aside to JJ's helpful reply (your output is difficult to read), you should look at the within R-sq when dealing with the -fe- estimator.
          With such a large sample, you should use -robust- or -vce(cluster panelid)- standard errors.
          I would also recommend checking the functional form of your regression (that is, testing whether it is correctly specified).
          Basically, you have to calculate -linktest- by hand, because it is not supported by -xt- commands.
          Let's show it with a toy example:
          Code:
          . use "https://www.stata-press.com/data/r18/nlswork.dta"
          (National Longitudinal Survey of Young Women, 14-24 years old in 1968)
          
          . xtreg ln_wage c.age##c.age, fe vce(cluster idcode)
          
          Fixed-effects (within) regression               Number of obs     =     28,510
          Group variable: idcode                          Number of groups  =      4,710
          
          R-squared:                                      Obs per group:
               Within  = 0.1087                                         min =          1
               Between = 0.1006                                         avg =        6.1
               Overall = 0.0865                                         max =         15
          
                                                          F(2, 4709)        =     507.42
          corr(u_i, Xb) = 0.0440                          Prob > F          =     0.0000
          
                                       (Std. err. adjusted for 4,710 clusters in idcode)
          ------------------------------------------------------------------------------
                       |               Robust
               ln_wage | Coefficient  std. err.      t    P>|t|     [95% conf. interval]
          -------------+----------------------------------------------------------------
                   age |   .0539076    .004307    12.52   0.000     .0454638    .0623515
                       |
           c.age#c.age |  -.0005973    .000072    -8.30   0.000    -.0007384   -.0004562
                       |
                 _cons |    .639913   .0624195    10.25   0.000     .5175415    .7622845
          -------------+----------------------------------------------------------------
               sigma_u |   .4039153
               sigma_e |  .30245467
                   rho |  .64073314   (fraction of variance due to u_i)
          ------------------------------------------------------------------------------
          
          . predict fitted, xb
          (24 missing values generated)
          
          . g sq_fitted=fitted^2
          (24 missing values generated)
          
          . xtreg ln_wage fitted sq_fitted , fe vce(cluster idcode)
          
          Fixed-effects (within) regression               Number of obs     =     28,510
          Group variable: idcode                          Number of groups  =      4,710
          
          R-squared:                                      Obs per group:
               Within  = 0.1092                                         min =          1
               Between = 0.1033                                         avg =        6.1
               Overall = 0.0881                                         max =         15
          
                                                          F(2, 4709)        =     523.09
          corr(u_i, Xb) = 0.0467                          Prob > F          =     0.0000
          
                                       (Std. err. adjusted for 4,710 clusters in idcode)
          ------------------------------------------------------------------------------
                       |               Robust
               ln_wage | Coefficient  std. err.      t    P>|t|     [95% conf. interval]
          -------------+----------------------------------------------------------------
                fitted |   2.569185   .7085064     3.63   0.000     1.180181    3.958189
             sq_fitted |    -.47432   .2153021    -2.20   0.028    -.8964128   -.0522272
                 _cons |  -1.290258    .580562    -2.22   0.026    -2.428431   -.1520844
          -------------+----------------------------------------------------------------
               sigma_u |    .403403
               sigma_e |  .30238578
                   rho |  .64025357   (fraction of variance due to u_i)
          ------------------------------------------------------------------------------
          
          . test sq_fitted
          
           ( 1)  sq_fitted = 0
          
                 F(  1,  4709) =    4.85
                      Prob > F =    0.0276
          
          .
          As expected, the outcome of the (redundant, in this case) -test- rejects the null that -sq_fitted- has no explanatory power. Therefore, the regression is (as expected again) misspecified.
          Kind regards,
          Carlo
          (StataNow 18.5)


          • #6
            Thanks to both of you for your insights. Here is the output, first for the RE regression and then for the FE regression. Is it now fair to conclude that the bigger the company, the higher the growth? The fixed-effects regression with vce(cluster panelid) gave me a lower F-statistic.

            Code:
             xtreg GWP_growth_life_w log_total_revenue_life solvency_ratio_w benefits_paid_to_
            > NPW_life_w GDP_growth log_population inflation c4_ratio,re
            
            Random-effects GLS regression                   Number of obs     =      3,863
            Group variable: company1                        Number of groups  =        523
            
            R-squared:                                      Obs per group:
                 Within  = 0.0692                                         min =          1
                 Between = 0.0013                                         avg =        7.4
                 Overall = 0.0057                                         max =          9
            
                                                            Wald chi2(7)      =      47.95
            corr(u_i, X) = 0 (assumed)                      Prob > chi2       =     0.0000
            
            ----------------------------------------------------------------------------------
            GWP_growth_lif~w | Coefficient  Std. err.      z    P>|z|     [95% conf. interval]
            -----------------+----------------------------------------------------------------
            log_total_reve~e |   2.293654   .5268653     4.35   0.000     1.261017    3.326291
            solvency_ratio_w |   .0478744   .0144962     3.30   0.001     .0194624    .0762864
            benefits_paid_~w |  -.0059781    .001456    -4.11   0.000    -.0088319   -.0031243
                  GDP_growth |    .171903   .2344935     0.73   0.464    -.2876958    .6315018
              log_population |  -1.462672   1.010944    -1.45   0.148    -3.444087     .518742
                   inflation |  -.4290087   .3704173    -1.16   0.247    -1.155013    .2969958
                    c4_ratio |  -10.11063   6.526214    -1.55   0.121    -22.90178    2.680512
                       _cons |   11.90004   21.25739     0.56   0.576    -29.76367    53.56375
            -----------------+----------------------------------------------------------------
                     sigma_u |  17.849728
                     sigma_e |  43.809415
                         rho |  .14237275   (fraction of variance due to u_i)
            ----------------------------------------------------------------------------------
            Code:
            . xtreg GWP_growth_life_w log_total_revenue_life solvency_ratio_w benefits_paid_to_NPW_life_w GDP_growth log_population inflation c4_r
            > atio,fe 
            
            Fixed-effects (within) regression               Number of obs     =      3,863
            Group variable: company1                        Number of groups  =        523
            
            R-squared:                                      Obs per group:
                 Within  = 0.1385                                         min =          1
                 Between = 0.0003                                         avg =        7.4
                 Overall = 0.0024                                         max =          9
            
                                                            F(7, 3333)        =      76.56
            corr(u_i, Xb) = -0.9521                         Prob > F          =     0.0000
            
            ---------------------------------------------------------------------------------------------
                      GWP_growth_life_w | Coefficient  Std. err.      t    P>|t|     [95% conf. interval]
            ----------------------------+----------------------------------------------------------------
                 log_total_revenue_life |   36.62378   1.731628    21.15   0.000     33.22862    40.01894
                       solvency_ratio_w |   .2232736   .0393541     5.67   0.000     .1461129    .3004343
            benefits_paid_to_NPW_life_w |  -.0354331   .0030796   -11.51   0.000    -.0414711    -.029395
                             GDP_growth |   .1329537   .2454861     0.54   0.588    -.3483649    .6142724
                         log_population |  -16.23394   49.27217    -0.33   0.742    -112.8407    80.37281
                              inflation |   .7954277   .4128897     1.93   0.054    -.0141152    1.604971
                               c4_ratio |  -4.406103   19.05757    -0.23   0.817    -41.77183    32.95963
                                  _cons |  -148.8261   857.4759    -0.17   0.862    -1830.059    1532.406
            ----------------------------+----------------------------------------------------------------
                                sigma_u |  94.024101
                                sigma_e |  43.809415
                                    rho |  .82162629   (fraction of variance due to u_i)
            ---------------------------------------------------------------------------------------------
            F test that all u_i=0: F(522, 3333) = 2.70                   Prob > F = 0.0000


            • #7
              Wout:
              1) use -robust- standard errors;
              2) test whether -re- is the way to go via the community-contributed module -xtoverid- (type -search xtoverid- and follow the instructions to install it).
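
              As a rough sketch of both steps on the public nlswork data (placeholder regressors, not your model): the install lines assume the SSC versions of the packages, and my understanding is that -xtoverid- relies on -ivreg2- and -ranktest- and does not accept factor-variable notation, which is why the squared term is generated by hand here. Please check the help file if any of this does not match your installation.

              ```stata
              * One-time installs from SSC (community-contributed commands).
              ssc install xtoverid, replace
              * Assumption: xtoverid needs these two; skip if already installed.
              ssc install ivreg2, replace
              ssc install ranktest, replace

              * Illustrative data and regressors only.
              use "https://www.stata-press.com/data/r18/nlswork.dta", clear
              xtset idcode year
              generate age2 = age^2

              * Random effects with cluster-robust standard errors ...
              xtreg ln_wage age age2 tenure, re vce(cluster idcode)
              * ... followed by the robust fixed-vs-random effects test.
              xtoverid
              ```

              A small p-value rejects the random-effects restrictions, which points to -fe-.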
              Kind regards,
              Carlo
              (StataNow 18.5)


              • #8
                Dear Carlo, I have tried the robust standard errors you suggested and tested the RE regression with xtoverid. However, the p-value of my Sargan-Hansen test is very low. It is supposed to be above 0.10, but I don't know how to get it up. Do you have any ideas?
                Code:
                . xtreg GWP_growth_life_w log_total_revenue_life solvency_ratio_w GDP_growth log_po
                > p inflation c4_ratio2,re robust
                (1 missing value generated)
                
                Random-effects GLS regression                   Number of obs     =      6,462
                Group variable: company1                        Number of groups  =        868
                
                R-squared:                                      Obs per group:
                     Within  = 0.0215                                         min =          1
                     Between = 0.0007                                         avg =        7.4
                     Overall = 0.0048                                         max =          9
                
                                                                Wald chi2(6)      =      53.98
                corr(u_i, X) = 0 (assumed)                      Prob > chi2       =     0.0000
                
                                                 (Std. err. adjusted for 868 clusters in company1)
                ----------------------------------------------------------------------------------
                                 |               Robust
                GWP_growth_lif~w | Coefficient  std. err.      z    P>|z|     [95% conf. interval]
                -----------------+----------------------------------------------------------------
                log_total_reve~e |   1.526335   .2983307     5.12   0.000     .9416179    2.111053
                solvency_ratio_w |   .0183528   .0080698     2.27   0.023     .0025364    .0341693
                      GDP_growth |   .4378522   .1663199     2.63   0.008     .1118711    .7638333
                         log_pop |   -1.85549   .8487064    -2.19   0.029    -3.518924   -.1920561
                       inflation |  -.3907189   .2042236    -1.91   0.056    -.7909898     .009552
                       c4_ratio2 |  -9.849762   4.300189    -2.29   0.022    -18.27798   -1.421546
                           _cons |   23.63569   16.96632     1.39   0.164    -9.617693    56.88908
                -----------------+----------------------------------------------------------------
                         sigma_u |  14.216874
                         sigma_e |  39.373681
                             rho |  .11533827   (fraction of variance due to u_i)
                ----------------------------------------------------------------------------------
                
                . 
                end of do-file
                
                . do "C:\Users\woutb\AppData\Local\Temp\STD34c8_000000.tmp"
                
                . xtoverid
                
                Test of overidentifying restrictions: fixed vs random effects
                Cross-section time-series model: xtreg re  robust cluster(company1)
                Sargan-Hansen statistic  78.108  Chi-sq(6)    P-value = 0.0000


                • #9
                  Wout:
                  I do not understand the reasons for your concern: the -xtoverid- outcome clearly rejects the null that -re- is the way to go.
                  Therefore, you should stick with -fe- and explore whether your regression is correctly specified (see my previous example).
                  Kind regards,
                  Carlo
                  (StataNow 18.5)


                  • #10
                    The Hausman test also indicates that the FE model is preferred over RE.
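
                    For reference, a minimal sketch of the classical Hausman test on the public nlswork data (placeholder regressors, not the model from this thread). Note that -hausman- needs both models fitted with default standard errors; it is not available after vce(robust) or vce(cluster), which is why -xtoverid- was suggested above for the clustered case.

                    ```stata
                    * Illustrative data and regressors only.
                    use "https://www.stata-press.com/data/r18/nlswork.dta", clear
                    xtset idcode year

                    * Both models with default (non-robust) standard errors.
                    xtreg ln_wage c.age##c.age tenure, fe
                    estimates store fe

                    xtreg ln_wage c.age##c.age tenure, re
                    estimates store re

                    * Consistent estimator (fe) first, efficient estimator (re) second.
                    * sigmamore bases both covariance matrices on the same
                    * disturbance variance estimate (from the efficient estimator).
                    hausman fe re, sigmamore
                    ```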


                    • #11
                      Okay, thank you for the clarification.


                      • #12
                        Dear Carlo,

                        In response to your comment quoted below:
                        Originally posted by Carlo Lazzaro:
                        2) test whether -re- is the way to go via the community-contributed module -xtoverid- (type -search xtoverid- and follow the instructions to install it).
                        If one has the potential to include time-invariant controls in their RE model, should the xtoverid test be implemented using a model that includes these time invariant controls, or without them?

                        As an example, I have one time invariant dummy that represents different regional groupings, should my xtoverid test include the region dummy, or not?

                        WithOUT region dummy:

                        Code:
                         xi: xtreg price_dispersion_use i.TS_ce2 E unem lnGDPPC i.year, re cluster(id)
                        i.TS_ce2          _ITS_ce2_1-10       (naturally coded; _ITS_ce2_1 omitted)
                        i.year            _Iyear_2014-2022    (naturally coded; _Iyear_2014 omitted)
                        
                        Random-effects GLS regression                   Number of obs     =        664
                        Group variable: id                              Number of groups  =        165
                        
                        R-squared:                                      Obs per group:
                             Within  = 0.0915                                         min =          1
                             Between = 0.4706                                         avg =        4.0
                             Overall = 0.4428                                         max =          5
                        
                                                                        Wald chi2(14)     =    1547.38
                        corr(u_i, X) = 0 (assumed)                      Prob > chi2       =     0.0000
                        
                                                           (Std. err. adjusted for 165 clusters in id)
                        ------------------------------------------------------------------------------
                                     |               Robust
                        price_disp~e | Coefficient  std. err.      z    P>|z|     [95% conf. interval]
                        -------------+----------------------------------------------------------------
                          _ITS_ce2_2 |  -15.40622   3.801623    -4.05   0.000    -22.85727   -7.955176
                          _ITS_ce2_3 |  -12.90865   4.646619    -2.78   0.005    -22.01586   -3.801448
                          _ITS_ce2_4 |   4.656375   3.178191     1.47   0.143    -1.572766    10.88552
                          _ITS_ce2_5 |  -2.627148    4.08374    -0.64   0.520    -10.63113    5.376835
                          _ITS_ce2_6 |  -10.30967   3.339777    -3.09   0.002    -16.85552   -3.763831
                          _ITS_ce2_8 |  -13.72805    3.66635    -3.74   0.000    -20.91397   -6.542139
                         _ITS_ce2_10 |  -22.86692   3.457971    -6.61   0.000    -29.64442   -16.08942
                                   E |   -.456633   1.104486    -0.41   0.679    -2.621386     1.70812
                                unem |  -.3757004   .1861485    -2.02   0.044    -.7405448    -.010856
                             lnGDPPC |   7.160454   1.115432     6.42   0.000     4.974248    9.346661
                         _Iyear_2016 |   1.473102   1.239416     1.19   0.235    -.9561087    3.902312
                         _Iyear_2018 |   2.016894   1.545326     1.31   0.192    -1.011889    5.045677
                         _Iyear_2020 |   2.913825   1.559089     1.87   0.062    -.1419326    5.969584
                         _Iyear_2022 |   2.054701   1.688859     1.22   0.224    -1.255403    5.364804
                               _cons |  -6.286011   11.86555    -0.53   0.596    -29.54207    16.97004
                        -------------+----------------------------------------------------------------
                             sigma_u |  13.232832
                             sigma_e |  11.041641
                                 rho |  .58953776   (fraction of variance due to u_i)
                        ------------------------------------------------------------------------------
                        
                        . xtoverid
                        
                        Test of overidentifying restrictions: fixed vs random effects
                        Cross-section time-series model: xtreg re  robust cluster(id)
                        Sargan-Hansen statistic 225.588  Chi-sq(14)   P-value = 0.0000
                        With region dummy:

                        Code:
                        xi: xtreg price_dispersion_use i.TS_ce2 E unem lnGDPPC i.region_id i.year, re cluster(id)
                        i.TS_ce2          _ITS_ce2_1-10       (naturally coded; _ITS_ce2_1 omitted)
                        i.region_id       _Iregion_id_1-6     (naturally coded; _Iregion_id_1 omitted)
                        i.year            _Iyear_2014-2022    (naturally coded; _Iyear_2014 omitted)
                        
                        Random-effects GLS regression                   Number of obs     =        664
                        Group variable: id                              Number of groups  =        165
                        
                        R-squared:                                      Obs per group:
                             Within  = 0.0918                                         min =          1
                             Between = 0.5101                                         avg =        4.0
                             Overall = 0.4556                                         max =          5
                        
                                                                        Wald chi2(19)     =    1682.55
                        corr(u_i, X) = 0 (assumed)                      Prob > chi2       =     0.0000
                        
                                                            (Std. err. adjusted for 165 clusters in id)
                        -------------------------------------------------------------------------------
                                      |               Robust
                        price_dispe~e | Coefficient  std. err.      z    P>|z|     [95% conf. interval]
                        --------------+----------------------------------------------------------------
                           _ITS_ce2_2 |  -13.51359   4.018199    -3.36   0.001    -21.38912   -5.638067
                           _ITS_ce2_3 |  -10.14077   5.144088    -1.97   0.049      -20.223   -.0585472
                           _ITS_ce2_4 |   5.253318   3.835672     1.37   0.171    -2.264461     12.7711
                           _ITS_ce2_5 |  -1.953641   4.079841    -0.48   0.632    -9.949982    6.042701
                           _ITS_ce2_6 |  -8.720096   3.455647    -2.52   0.012    -15.49304   -1.947152
                           _ITS_ce2_8 |  -11.38548   4.023333    -2.83   0.005    -19.27107   -3.499896
                          _ITS_ce2_10 |   -21.0094   4.130606    -5.09   0.000    -29.10524   -12.91356
                                    E |   .0301852   1.129646     0.03   0.979     -2.18388     2.24425
                                 unem |  -.3046913   .1797088    -1.70   0.090    -.6569141    .0475315
                              lnGDPPC |   6.373911   1.309296     4.87   0.000     3.807737    8.940085
                        _Iregion_id_2 |   6.161709   4.126013     1.49   0.135    -1.925129    14.24855
                        _Iregion_id_3 |  -7.405391   5.823265    -1.27   0.203    -18.81878    4.007998
                        _Iregion_id_4 |   3.858108   4.960603     0.78   0.437    -5.864495    13.58071
                        _Iregion_id_5 |  -8.662549   6.677738    -1.30   0.195    -21.75067    4.425577
                        _Iregion_id_6 |   6.337647   5.404951     1.17   0.241    -4.255861    16.93116
                          _Iyear_2016 |   1.585421   1.242743     1.28   0.202    -.8503104    4.021153
                          _Iyear_2018 |   2.133654    1.55553     1.37   0.170    -.9151293    5.182437
                          _Iyear_2020 |   3.042491   1.573126     1.93   0.053    -.0407788     6.12576
                          _Iyear_2022 |    2.29141       1.72     1.33   0.183    -1.079727    5.662548
                                _cons |   -4.39099   12.18405    -0.36   0.719    -28.27129     19.4893
                        --------------+----------------------------------------------------------------
                              sigma_u |  12.989944
                              sigma_e |  11.041641
                                  rho |  .58054322   (fraction of variance due to u_i)
                        -------------------------------------------------------------------------------
                        
                        . xtoverid
                        
                        Test of overidentifying restrictions: fixed vs random effects
                        Cross-section time-series model: xtreg re  robust cluster(id)
                        Sargan-Hansen statistic 142.171  Chi-sq(14)   P-value = 0.0000
                        From both, it is clear that fixed effects is the way to go. I also noticed that the test is Chi-sq(14) in both models, even though the second model has five more regressors, while the Sargan-Hansen statistics differ. I don't know which version is the most correct.
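
                        One way to probe this on a public dataset is sketched below. It is only an illustration under stated assumptions: nlswork stands in for my data, -black- is a hand-made dummy built from -race- (which is constant within person in that dataset), and -xtoverid- is assumed to be installed from SSC. If the degrees of freedom count only the time-varying regressors, as the Chi-sq(14) in both of my runs suggests, the two tests here should report the same degrees of freedom while the statistic itself may change.

                        ```stata
                        * Illustrative data and regressors only.
                        use "https://www.stata-press.com/data/r18/nlswork.dta", clear
                        xtset idcode year
                        generate age2 = age^2
                        * Time-invariant dummy, built by hand from race.
                        generate black = (race == 2) if !missing(race)

                        * RE without the time-invariant control.
                        xtreg ln_wage age age2 tenure, re vce(cluster idcode)
                        xtoverid

                        * RE with the time-invariant control added.
                        xtreg ln_wage age age2 tenure black, re vce(cluster idcode)
                        xtoverid
                        ```

                        Comparing the reported Chi-sq degrees of freedom across the two runs shows whether the time-invariant control enters the tested restrictions.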

                        If dataex helps with answering my question, here it is:

                        Code:
                        * Example generated by -dataex-. For more info, type help dataex
                        clear
                        input str56 country float id double price_dispersion_use float TS_ce2 byte E double(coc unem) float lnGDPPC str3 region float region_id int year
                        "Afghanistan"          1                 15 . 4   -1.36474287509918               7.91  8.014661 "EMR" 3 2014
                        "Afghanistan"          1                  . . 5   -1.54035270214081             10.092  7.994392 "EMR" 3 2016
                        "Afghanistan"          1 13.333333333333334 . 5   -1.50288057327271             11.131  7.974823 "EMR" 3 2018
                        "Afghanistan"          1  11.76470588235294 . 5   -1.49369978904724              11.71  7.928968 "EMR" 3 2020
                        "Afghanistan"          1                  . . 5   -1.18377649784088               14.1         . "EMR" 3 2022
                        "Albania"              2  44.44444444444444 1 5   -.586141347885132              18.05  9.465752 "EUR" 4 2014
                        "Albania"              2 56.666666666666664 1 5   -.471469223499298              15.42  9.524819 "EUR" 4 2016
                        "Albania"              2               62.5 1 5   -.545840263366699               12.3  9.604934 "EUR" 4 2018
                        "Albania"              2  60.60606060606061 1 5   -.572924494743347             12.833   9.60202 "EUR" 4 2020
                        "Albania"              2                 60 1 5   -.407875537872314             11.629  9.756207 "EUR" 4 2022
                        "Algeria"              3  33.33333333333333 6 4    -.61265641450882              10.21  9.511575 "AFR" 1 2014
                        "Algeria"              3 35.714285714285715 . 4    -.67341673374176               10.2  9.539472 "AFR" 1 2016
                        "Algeria"              3                 15 . 5   -.658660113811493             12.145  9.525714 "AFR" 1 2018
                        "Algeria"              3                 50 . 5    -.66646021604538             14.036  9.447598 "AFR" 1 2020
                        "Algeria"              3  48.57142857142857 5 5   -.637929856777191             12.491 9.4796715 "AFR" 1 2022
                        "Andorra"              4  72.85714285714285 6 2    1.22070860862732 3.4574213637138524 11.030043 "EUR" 4 2014
                        "Andorra"              4  72.85714285714285 6 2    1.15955591201782 3.4620623503549632 11.067958 "EUR" 4 2016
                        "Andorra"              4  77.77777777777777 1 2    1.17916560173035 3.4856245150448624 11.053652 "EUR" 4 2018
                        "Andorra"              4  68.44993141289439 1 2    1.26600527763367 3.5770777680298496  10.91981 "EUR" 4 2020
                        "Andorra"              4  69.86301369863014 1 2    1.27020359039307  3.890483345078273 11.056888 "EUR" 4 2022
                        "Angola"               5                 75 . 2   -1.45779824256897             16.317  9.236285 "AFR" 1 2014
                        "Angola"               5                  . . 2   -1.48333728313446             16.577  9.147498 "AFR" 1 2016
                        "Angola"               5                  . . 4   -1.19925093650818             16.626   9.06262 "AFR" 1 2018
                        "Angola"               5                 25 2 4   -.938672542572021             16.698 8.9309025 "AFR" 1 2020
                        "Angola"               5                 25 . 4   -.601941287517548             14.478  8.910195 "AFR" 1 2022
                        "Antigua and Barbuda"  6                 75 . 2    .634897768497467 3.4574213637138524 10.143725 "AMR" 2 2014
                        "Antigua and Barbuda"  6                 50 . 2    .645558714866638 3.4620623503549632  10.18348 "AMR" 2 2016
                        "Antigua and Barbuda"  6             66.875 . 5    .236239701509476 3.4856245150448624 10.263378 "AMR" 2 2018
                        "Antigua and Barbuda"  6  62.05673758865249 . 5    .238533273339272 3.5770777680298496 10.073401 "AMR" 2 2020
                        "Antigua and Barbuda"  6 63.829787234042556 . 5    .310604453086853  3.890483345078273  10.23125 "AMR" 2 2022
                        "Argentina"            7 41.935483870967744 2 4   -.549443066120148               7.27  10.25563 "AMR" 2 2014
                        "Argentina"            7              37.75 3 4   -.298964887857437              8.085 10.240202 "AMR" 2 2016
                        "Argentina"            7  45.34920634920635 3 4   -.098668172955513               9.22 10.220944 "AMR" 2 2018
                        "Argentina"            7 18.726114649681527 3 4    -.16378065943718              11.46 10.076843 "AMR" 2 2020
                        "Argentina"            7 13.384615384615383 3 4   -.447030484676361              6.805   10.2083 "AMR" 2 2022
                        "Armenia"              8                 30 6 2   -.565155386924744             11.989  9.488754 "EUR" 4 2014
                        "Armenia"              8 26.666666666666668 6 2   -.659123718738556             12.625  9.530623 "EUR" 4 2016
                        "Armenia"              8 42.857142857142854 3 2   -.408891350030899              13.21  9.663906 "EUR" 4 2018
                        "Armenia"              8               47.5 1 5 -.00343869999051094              12.18  9.673404 "EUR" 4 2020
                        "Armenia"              8  48.23529411764706 1 5   .0280352365225554              8.588  9.857456 "EUR" 4 2022
                        "Australia"            9  78.93318965517241 1 4    1.84946465492249               6.08  10.90798 "WPR" 6 2014
                        "Australia"            9  73.84341637010677 1 4    1.77200365066528               5.71  10.92716 "WPR" 6 2016
                        "Australia"            9  82.34126984126985 1 4    1.76737761497498                5.3 10.947303 "WPR" 6 2018
                        "Australia"            9  71.02189781021899 6 4    1.63295590877533               6.46 10.938417 "WPR" 6 2020
                        "Australia"            9  68.45524542829644 6 4    1.76448953151703                3.7 10.987324 "WPR" 6 2022
                        "Austria"             10  80.61224489795919 4 4    1.46674907207489               5.67 11.040983 "EUR" 4 2014
                        "Austria"             10                 80 4 4    1.49696803092957               6.06 11.048753 "EUR" 4 2016
                        "Austria"             10                 80 4 4    1.56836605072021               4.93 11.083235 "EUR" 4 2018
                        "Austria"             10  82.45614035087719 4 4    1.47778916358948                5.2 11.020405 "EUR" 4 2020
                        "Austria"             10  68.35820895522387 4 4    1.25861942768097               4.99 11.094935 "EUR" 4 2022
                        "Azerbaijan"          11                 24 6 4   -1.02249026298523               4.91  9.938668 "EUR" 4 2014
                        "Azerbaijan"          11              56.25 1 4   -.852654457092285                  5  9.894967 "EUR" 4 2016
                        "Azerbaijan"          11 23.076923076923077 6 5   -.852769494056702                4.9  9.893378 "EUR" 4 2018
                        "Azerbaijan"          11  47.05882352941177 6 5   -1.07708406448364               7.24  9.858809 "EUR" 4 2020
                        "Azerbaijan"          11  55.55555555555556 6 5   -1.04057228565216               5.65  9.953777 "EUR" 4 2022
                        "Bahamas"             12 48.658536585365916 1 2    1.30873775482178                  . 10.381657 "AMR" 2 2014
                        "Bahamas"             12  40.22346368715088 1 2    1.06738793849945               12.7 10.366473 "AMR" 2 2016
                        "Bahamas"             12                  . . 2    1.09553563594818                 10 10.405302 "AMR" 2 2018
                        "Bahamas"             12  61.08949416342412 1 2    1.10620594024658             12.563 10.118558 "AMR" 2 2020
                        "Bahamas"             12                  . 1 2    1.25618994235992             10.089 10.401076 "AMR" 2 2022
                        "Bahrain"             13                 50 . 5    .273521840572357              1.147 10.890368 "EMR" 3 2014
                        "Bahrain"             13  33.33333333333333 . 5  -.0476647540926933              1.193 10.877423 "EMR" 3 2016
                        "Bahrain"             13                 40 2 5   -.176231503486633              1.198  10.88669 "EMR" 3 2018
                        "Bahrain"             13  34.78260869565218 3 5  -.0935939401388168              1.786 10.867227 "EMR" 3 2020
                        "Bahrain"             13 58.333333333333336 3 5    .139385640621185              1.339 10.944588 "EMR" 3 2022
                        "Bangladesh"          14 15.789473684210526 . 4   -.892129957675934              4.405  8.543592 "SEA" 5 2014
                        "Bangladesh"          14 22.727272727272727 . 4    -.88687801361084               4.35  8.651562 "SEA" 5 2016
                        "Bangladesh"          14  33.33333333333333 . 4   -.926946818828583              4.373  8.761912 "SEA" 5 2018
                        "Bangladesh"          14 32.142857142857146 . 4   -1.00367724895477              5.316  8.849105 "SEA" 5 2020
                        "Bangladesh"          14                 25 . 4    -1.0755273103714              4.271   8.96254 "SEA" 5 2022
                        "Barbados"            15  79.32850559578671 1 2    1.13345634937286              12.17  9.704554 "AMR" 2 2014
                        "Barbados"            15              81.25 1 2     1.2135511636734               8.25    9.7494 "AMR" 2 2016
                        "Barbados"            15  45.23433385992628 1 2    1.37191247940063               8.32  9.741538 "AMR" 2 2018
                        "Barbados"            15                  . . 2    1.19406688213348              9.743  9.604329 "AMR" 2 2020
                        "Barbados"            15  78.84615384615384 1 2    1.28457343578339              8.501  9.700481 "AMR" 2 2022
                        "Belarus"             16             35.625 6 4    -.23470650613308              5.908 10.187328 "EUR" 4 2014
                        "Belarus"             16 31.914893617021278 6 4   -.224086627364159               5.84  10.12049 "EUR" 4 2016
                        "Belarus"             16 30.645161290322577 6 4    -.15480200946331               4.76 10.179738 "EUR" 4 2018
                        "Belarus"             16  25.71428571428572 6 4   -.133964225649834               4.05 10.193598 "EUR" 4 2020
                        "Belarus"             16 23.958333333333332 6 4    -.57967621088028               3.57 10.185905 "EUR" 4 2022
                        "Belgium"             17  80.82901554404145 4 4    1.51295030117035               8.52  10.96869 "EUR" 4 2014
                        "Belgium"             17  81.64556962025317 4 4    1.53148806095123               7.83  10.99063 "EUR" 4 2016
                        "Belgium"             17  83.33333333333334 4 4    1.42942035198212               5.95 11.016062 "EUR" 4 2018
                        "Belgium"             17  85.29411764705883 4 4    1.44595634937286               5.55 10.974466 "EUR" 4 2020
                        "Belgium"             17               72.5 4 4    1.49504864215851               5.56  11.05771 "EUR" 4 2022
                        "Belize"              18  41.66666666666667 . 2   -.159106820821762               8.24 9.4002905 "AMR" 2 2014
                        "Belize"              18  41.66666666666667 1 2   -.229891732335091                  7  9.390294 "AMR" 2 2016
                        "Belize"              18                 40 1 2   -.169436514377594              7.899  9.343116 "AMR" 2 2018
                        "Belize"              18                 50 1 2   -.193349361419678             10.619  9.203805 "AMR" 2 2020
                        "Belize"              18 50.391644908616186 1 2   -.237028583884239              8.672  9.426018 "AMR" 2 2022
                        "Benin"               19                  . 2 4   -.669143795967102              1.808  8.040519 "AFR" 1 2014
                        "Benin"               19                 20 2 4   -.529120028018951              1.843  8.031984 "AFR" 1 2016
                        "Benin"               19               22.5 2 5   -.391388416290283               1.47  8.093288 "AFR" 1 2018
                        "Benin"               19 47.368421052631575 2 5   -.040327787399292              1.616  8.140294 "AFR" 1 2020
                        "Benin"               19                  . 2 5   -.124255605041981              1.476  8.215443 "AFR" 1 2022
                        "Bhutan"              20                  . . 4    1.30612897872925               2.63  9.345343 "SEA" 5 2014
                        "Bhutan"              20                  . . 4    1.09102046489716              2.747  9.469749 "SEA" 5 2016
                        "Bhutan"              20                  . . 4    1.59051811695099               3.35  9.533319 "SEA" 5 2018
                        "Bhutan"              20                  . . 4    1.61823654174805               5.03  9.467918 "SEA" 5 2020
                        "Bhutan"              20                  . 2 4    1.51425933837891               5.95         . "SEA" 5 2022
                        end
                        label values TS_ce2 TS_ce2_l
                        label def TS_ce2_l 1 "specific uniform", modify
                        label def TS_ce2_l 2 "adv un NO min", modify
                        label def TS_ce2_l 3 "adv uni WITH min", modify
                        label def TS_ce2_l 4 "mixed uni NO min", modify
                        label def TS_ce2_l 5 "mixed uni WITH min", modify
                        label def TS_ce2_l 6 "specific tiered", modify
                        label values region_id region_id_l
                        label def region_id_l 1 "AFR", modify
                        label def region_id_l 2 "AMR", modify
                        label def region_id_l 3 "EMR", modify
                        label def region_id_l 4 "EUR", modify
                        label def region_id_l 5 "SEA", modify
                        label def region_id_l 6 "WPR", modify
                        Thank you!

                        Sam



                        • #13
                          Sam:
                          "As an example, I have one time invariant dummy that represents different regional groupings, should my xtoverid test include the region dummy, or not?"
                          Short answer: yes, you should include it.

                          In addition, you coded two different specifications in the two regressions. Therefore, while both -xtoverid- outcomes point you towards -fe-, it is no wonder that the Sargan statistics differ.
                          That said, if your time-invariant predictors are nonetheless crucial for your research goal, you may want to explore The Stata Blog » Fixed effects or random effects: The Mundlak approach
                          Kind regards,
                          Carlo
                          (StataNow 18.5)



                          • #14
                            Carlo Lazzaro thank you for the advice. I read the article that you shared a link to and tried to implement it.

                            Code:
                            . ** Step 1: gen means of time-varying controls:
                            .
                            . ** TS_ce_2 is a categorical variable:
                            
                            . bysort id: egen mean_TS_ce2 = mean(TS_ce2)
                            (205 missing values generated)
                            
                            .
                            . ** E is a continuous variable:
                            . bysort id: egen mean_E = mean(E)
                            (110 missing values generated)
                            
                            .
                            . ** unem is a continuous variable:
                            . bysort id: egen mean_unem = mean(unem)
                            (120 missing values generated)
                            
                            .
                            . ** lnGDPPC is a continuous variable:
                            . bysort id: egen mean_lnGDPPC = mean(lnGDPPC)
                            (160 missing values generated)
                            
                            .
                            . ** year
                            . bysort id: egen mean_year = mean(year)
                            (110 missing values generated)
                            
                            .
                            . ** Step 2:  run regression including time-invariant region_id, all time-varying controls and their panel means:
                            .
                            . quietly xtreg price_dispersion_use region_id i.TS_ce2 E unem lnGDPPC i.year mean_TS_ce2 mean_E mean_unem mean_lnGDPPC mean_year, vce(robust)
                            
                            .
                            . estimates store mundlak
                            
                            .
                            . ** Step 3: do the test:
                            .
                            . test mean_TS_ce2 mean_E mean_unem mean_lnGDPPC mean_year
                            
                             ( 1)  mean_TS_ce2 = 0
                             ( 2)  mean_E = 0
                             ( 3)  mean_unem = 0
                             ( 4)  mean_lnGDPPC = 0
                             ( 5)  o.mean_year = 0
                                   Constraint 5 dropped
                            
                                       chi2(  4) =    6.60
                                     Prob > chi2 =    0.1585
                            According to this, we fail to reject the null, so there is no evidence of correlation between the time-invariant unobservable and my regressors; that is, the test does not contradict the random effects assumptions.

                            However, in step 3, my year mean gets omitted, and I am not sure if this is right. I also included region_id in my regression, and I am not sure if I was meant to do this (I treated region_id as being the same as x1 in the example article you shared).

                            Moreover, it is not clear to me whether this test is meant to supplement the Hausman test implemented with xtoverid or to replace it. The Hausman test in my original post pointed to FE being the consistent estimator. But if I have implemented this Mundlak test correctly, it tells me that RE is the way to go. Which test am I to follow?

                            Thank you!

                            Sam



                            • #15
                              Sam:
                              your panel is severely unbalanced. Therefore, the Mundlak outcome is less reliable.
                              I would trust the -xtoverid- outcome more than the Mundlak one and go -fe-.
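On the omitted year mean raised in #14: the data shown run over the same years (2014, 2016, ..., 2022) for each country, so the panel mean of year is most likely the same value for every country and therefore collinear with the constant, which would explain why Stata dropped it. Two further points about that code may be worth checking. The mean of the TS_ce2 category codes is not a meaningful regressor for a categorical variable, and region_id entered as a single linear term rather than as a set of indicators. What follows is only a sketch of one way to set this up, not a definitive implementation: the variable names come from #14, while touse and the m_* names are hypothetical, and it assumes TS_ce2 takes the values 1 to 6 as in the value labels.

```stata
* Sketch only. Variable names follow post #14; touse and the m_* names are hypothetical.
* Mark the rows that can enter the regression, so the means use the same sample.
mark touse
markout touse price_dispersion_use region_id TS_ce2 E unem lnGDPPC

* Panel means of the continuous time-varying regressors.
foreach v of varlist E unem lnGDPPC {
    bysort id: egen m_`v' = mean(`v') if touse
}

* TS_ce2 is categorical: average each category indicator rather than the codes.
* Category 1 is left out as the base, matching i.TS_ce2.
forvalues k = 2/6 {
    bysort id: egen m_TS`k' = mean(TS_ce2 == `k') if touse
}

* No mean of year: it is constant when every country is observed in the same years.
xtreg price_dispersion_use i.region_id i.TS_ce2 E unem lnGDPPC i.year ///
    m_E m_unem m_lnGDPPC m_TS2 m_TS3 m_TS4 m_TS5 m_TS6 if touse, re vce(cluster id)

* Joint test of the panel means.
test m_E m_unem m_lnGDPPC m_TS2 m_TS3 m_TS4 m_TS5 m_TS6
```

If some category of TS_ce2 never occurs in the estimation sample, the corresponding mean will be dropped and -test- will report a dropped constraint, as happened with mean_year. Given how unbalanced the panel is, this does not change the advice above.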
                              Kind regards,
                              Carlo
                              (StataNow 18.5)
