  • effect size after regression of panel data

    I am working on a study that estimates a fixed-effects panel regression on data for 118 countries over the past 30 years.

    I want to know how GDP and population influence a country's use of renewable energy.

    One reviewer commented that it would be better to report effect sizes.

    However, after doing some research, I found that it is difficult to calculate an effect size (e.g., the eta-squared statistic) for the variables in a panel regression.

    For example, if the regression command is: xtreg renewable gdp population, fe

    What might be some post-estimation commands that can be used to calculate the effect sizes of gdp and population?

    I would appreciate any suggestions that help address this issue.

  • #2
    What I have seen done is to use the -margins- command to show the difference in the predicted value for some standard change in the right-hand-side variables. Many conventional effect-size statistics are basically measures of fit; however, a right-hand-side variable can have a substantial influence on the dependent variable (that is, changes in that variable dramatically change the predicted value) even when the overall explained variance is not that high.
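
    For example, with the model from the original post, a minimal sketch (the gdp values in at() are placeholders for whatever change is of substantive interest):

    Code:
    xtreg renewable gdp population, fe
    * predicted level of renewable at two hypothetical gdp values;
    * the contrast between them is one way to present the effect
    margins, at(gdp=(1000 2000))
    * average marginal effect of gdp (in a linear model, simply
    * the coefficient)
    margins, dydx(gdp)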

    • #3
      Thanks Phil. I used the -margins- and -marginsplot- commands in the paper. However, the reviewer seems more interested in explanatory power. For example, the reviewer wants to know whether independent variable A explains more of the variance in the dependent variable than independent variables B and C. Is there a way to respond to this request? I have tried the "estat esize" command, but it is only valid after -regress-, not -xtreg-. Thanks!

      • #4
        Dear Stata members,
        Though this thread is about 3 years old, I have some doubts along these lines: the calculation of effect sizes, power calculations, and confidence intervals. I have run a model (similar to a classical DiD), and the output is below.

        Code:
        reghdfe lever_w i.treat_time##i.treat2 size_w nfa_ta_w cash_ta_w trade_credit_w sales_grow_w roa_w pb_w cfo_ta_w
        >  rddcc_dum div_dum nw_ta_w age , absorb(ff48##i.year id) cluster (id)
        (dropped 50 singleton observations)
        note: 1bn.treat_time is probably collinear with the fixed effects (all partialled-out values are close to zero; to
        > l = 1.0e-09)
        note: 1bn.treat2 is probably collinear with the fixed effects (all partialled-out values are close to zero; tol = 
        > 1.0e-09)
        (MWFE estimator converged in 11 iterations)
        note: 1.treat_time omitted because of collinearity
        note: 1.treat2 omitted because of collinearity
        
        HDFE Linear regression                            Number of obs   =      4,559
        Absorbing 2 HDFE groups                           F(  13,    743) =      56.10
        Statistics robust to heteroskedasticity           Prob > F        =     0.0000
                                                          R-squared       =     0.9194
                                                          Adj R-squared   =     0.8970
                                                          Within R-sq.    =     0.4583
        Number of clusters (id)      =        744         Root MSE        =     0.0631
        
                                                (Std. err. adjusted for 744 clusters in id)
        -----------------------------------------------------------------------------------
                          |               Robust
                  lever_w | Coefficient  std. err.      t    P>|t|     [95% conf. interval]
        ------------------+----------------------------------------------------------------
             1.treat_time |          0  (omitted)
                 1.treat2 |          0  (omitted)
                          |
        treat_time#treat2 |
                     1 1  |   .0117077   .0075814     1.54   0.123    -.0031759    .0265912
                          |
                   size_w |  -.0060164   .0098228    -0.61   0.540    -.0253001    .0132674
                 nfa_ta_w |   .1207329   .0255879     4.72   0.000     .0704998     .170966
                cash_ta_w |   .0750381   .0684868     1.10   0.274    -.0594125    .2094888
           trade_credit_w |   .2226657   .0428537     5.20   0.000      .138537    .3067944
             sales_grow_w |  -.0120943   .0055367    -2.18   0.029    -.0229637   -.0012248
                    roa_w |  -.0890246    .040655    -2.19   0.029     -.168837   -.0092122
                     pb_w |  -.0008145   .0017106    -0.48   0.634    -.0041727    .0025437
                 cfo_ta_w |  -.1217877   .0219773    -5.54   0.000    -.1649327   -.0786426
                rddcc_dum |  -.0116847   .0057897    -2.02   0.044    -.0230508   -.0003186
                  div_dum |  -.0039561   .0051014    -0.78   0.438    -.0139709    .0060587
                  nw_ta_w |  -.5682216   .0251936   -22.55   0.000    -.6176807   -.5187624
                      age |  -.0168279   .0321798    -0.52   0.601    -.0800021    .0463463
                    _cons |   .6086971   .1328404     4.58   0.000       .34791    .8694843
        -----------------------------------------------------------------------------------
        
        Absorbed degrees of freedom:
        -----------------------------------------------------+
         Absorbed FE | Categories  - Redundant  = Num. Coefs |
        -------------+---------------------------------------|
           ff48#year |       234           0         234     |
                  id |       744         744           0    *|
        -----------------------------------------------------+
        * = FE nested within cluster; treated as redundant for DoF computation
        
        .
        To put it bluntly, the interaction is not significant at conventional levels of significance. I have read that for a null result one should elaborate on effect sizes, power calculations, confidence intervals, and so on. In other words, how can one explain the above results in detail with an effect size and power? I have read that "The effect size is the main finding of a quantitative study. While a P value can inform the reader whether an effect exists, the P value will not reveal the size of the effect. In reporting and interpreting studies, both the substantive significance (effect size) and statistical significance (P value) are essential results to be reported." (Source: https://online225.psych.wisc.edu/wp-..._JGME_2012.pdf)
        Since null results are quite informative in medicine and epidemiology, I would like to know how to report null results in the most proper way. Can someone help in this regard?

        • #5
          Dear Members,
          In the meantime, I tried estat esize, but it does not work with panel regression models, so I instead used the -regress- command and ran the analysis below.

          Code:
          reg lever_w i.treat_time##i.treat2 size_w nfa_ta_w cash_ta_w trade_credit_w sales_grow_w roa_w pb_w cfo_ta_w rdd
          > cc_dum div_dum nw_ta_w age  ff48##i.year i.id
          note: 2019.year omitted because of collinearity.
          note: 37.ff48#2012b.year identifies no observations in the sample.
          note: 37.ff48#2013.year identifies no observations in the sample.
          note: 37.ff48#2016.year identifies no observations in the sample.
          note: 37.ff48#2017.year identifies no observations in the sample.
          note: 37.ff48#2018.year omitted because of collinearity.
          note: 37.ff48#2019.year identifies no observations in the sample.
          note: 38.ff48#2012b.year identifies no observations in the sample.
          note: 38.ff48#2019.year omitted because of collinearity.
          note: 175007.id omitted because of collinearity.
          note: 185711.id omitted because of collinearity.
          note: 196667.id omitted because of collinearity.
          note: 216993.id omitted because of collinearity.
          note: 248823.id omitted because of collinearity.
          note: 265165.id omitted because of collinearity.
          note: 265397.id omitted because of collinearity.
          note: 346089.id omitted because of collinearity.
          note: 356060.id omitted because of collinearity.
          note: 356683.id omitted because of collinearity.
          note: 369886.id omitted because of collinearity.
          note: 373645.id omitted because of collinearity.
          note: 375271.id omitted because of collinearity.
          note: 377259.id omitted because of collinearity.
          note: 380647.id omitted because of collinearity.
          note: 385965.id omitted because of collinearity.
          note: 386273.id omitted because of collinearity.
          note: 386916.id omitted because of collinearity.
          note: 387851.id omitted because of collinearity.
          note: 389433.id omitted because of collinearity.
          note: 393826.id omitted because of collinearity.
          note: 397515.id omitted because of collinearity.
          note: 406350.id omitted because of collinearity.
          note: 416232.id omitted because of collinearity.
          note: 443711.id omitted because of collinearity.
          note: 444974.id omitted because of collinearity.
          note: 467986.id omitted because of collinearity.
          note: 468995.id omitted because of collinearity.
          note: 489164.id omitted because of collinearity.
          note: 502255.id omitted because of collinearity.
          note: 506401.id omitted because of collinearity.
          note: 507748.id omitted because of collinearity.
          
                Source |       SS           df       MS      Number of obs   =     4,609
          -------------+----------------------------------   F(1010, 3598)   =     41.42
                 Model |  165.427272     1,010  .163789379   Prob > F        =    0.0000
              Residual |  14.2271171     3,598  .003954174   R-squared       =    0.9208
          -------------+----------------------------------   Adj R-squared   =    0.8986
                 Total |  179.654389     4,608  .038987498   Root MSE        =    .06288
          
          -----------------------------------------------------------------------------------
                    lever_w | Coefficient  Std. err.      t    P>|t|     [95% conf. interval]
          ------------------+----------------------------------------------------------------
               1.treat_time |   .0061319   .0334774     0.18   0.855    -.0595048    .0717686
                   1.treat2 |   .3217427   .0541092     5.95   0.000      .215655    .4278305
                            |
          treat_time#treat2 |
                       1 1  |   .0117077   .0042506     2.75   0.006     .0033739    .0200414
                            |
                     size_w |  -.0060164    .003838    -1.57   0.117    -.0135412    .0015084
                   nfa_ta_w |   .1207329   .0130274     9.27   0.000     .0951911    .1462747
                  cash_ta_w |   .0750381   .0371278     2.02   0.043     .0022445    .1478318
             trade_credit_w |   .2226657   .0184463    12.07   0.000     .1864994     .258832
               sales_grow_w |  -.0120943   .0044287    -2.73   0.006    -.0207774   -.0034112
                      roa_w |  -.0890246   .0290391    -3.07   0.002    -.1459594   -.0320898
                       pb_w |  -.0008145   .0009951    -0.82   0.413    -.0027655    .0011365
                   cfo_ta_w |  -.1217877   .0181431    -6.71   0.000    -.1573596   -.0862158
                  rddcc_dum |  -.0116847    .005259    -2.22   0.026    -.0219957   -.0013737
                    div_dum |  -.0039561    .003912    -1.01   0.312    -.0116259    .0037138
                    nw_ta_w |  -.5682216   .0118374   -48.00   0.000    -.5914303   -.5450128
                        age |  -.0168279   .0180267    -0.93   0.351    -.0521715    .0185157
                            |
          For the above model I could use estat esize, but I suspect that doing so is wrong:

          Code:
           estat esize, epsilon
          
          Effect sizes for linear models
          
          ------------------------------------------------
                          Source | Epsilon-squared      df
          -----------------------+------------------------
                           Model |     .8985784       1.0e+03
                                 |
                      treat_time |    -.0002422          1
                          treat2 |     .0097874          1
               treat_time#treat2 |     .0018268          1
                          size_w |     .0004048          1
                        nfa_ta_w |     .0230433          1
                       cash_ta_w |     .0008564          1
                  trade_credit_w |      .038654          1
                    sales_grow_w |     .0017911          1
                           roa_w |     .0023281          1
                            pb_w |    -.0000917          1
                        cfo_ta_w |      .012094          1
                       rddcc_dum |     .0010926          1
                         div_dum |     6.30e-06          1
                         nw_ta_w |      .390227          1
                             age |    -.0000357          1
                            ff48 |     .1292017         31
                            year |      .002501          6
                       ff48#year |    -.0042017        211
                              id |     .7071139        747
          ------------------------------------------------
          Note: Epsilon-squared values for individual
                model terms are partial.
          .

          Moreover, the interaction coefficient has become significant here, which I suspect is due to incorrect estimation. So, in this context, I would like to ask:
          1. How does one estimate effect sizes with panel/IV/DiD regressions, and does it make sense to do so with panel data?

          • #6
            Dear Members,
            I am not quite sure whether my question made sense, or whether I should add more details, so let me ask for help in a different way: can the Stata members point me to some good articles and pieces of writing that report a null result? I would be delighted if any were longitudinal studies. That way I may learn how null results should be presented and what tests can strengthen the null.

            • #7
              You are on the right track in thinking of your fixed-effects regression as an OLS regression with fixed-effect indicator variables. You can use Stata's power analysis commands for this. The -power- command is very complicated, and since it is typically not used very frequently, it is hard to remember its details. So this is one of the few situations where I recommend using the graphical user interface.

              In Stata, open the Statistics drop-down menu and select "Power, precision, and sample size." Expand the Hypothesis test option and select R-squared test. Then, in the right panel, select "R-squared test of a subset of coefficients in a multiple linear regression" and fill out the form that opens up. For the "Compute:" combo box, select "Effect size and target R2," since your sample size is already known and immutable.

              Bear in mind that you are looking to test only 1 covariate, namely treat_time#treat2. All the other covariates, including the fixed-effect indicators, are "control covariates" for this purpose. I suggest running several different values of R-squared for the reduced model (the reduced model meaning the model with every RHS variable except treat_time#treat2). You might also want to do this with a few different values of power, so that in the end you will know what effect sizes (changes in R2) you had .8, .85, .9, and .95 power to detect with your sample.
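
              For reference, the dialog described here builds a -power rsquared- command, so after filling in the form once, the scenarios can be varied from the command line. A minimal sketch under stated assumptions: n(4559) is the estimation sample from the reghdfe output in #4, ntested(1) is the single interaction term, ncontrol(1009) is a placeholder for the count of all other covariates including the fixed-effect indicators, and 0.90 is one assumed reduced-model R-squared to be varied:

              Code:
              * solve for the detectable effect size (change in R2) at
              * several powers, given n and an assumed reduced-model R2
              power rsquared 0.90, n(4559) power(0.8 0.85 0.9 0.95) ///
                  ntested(1) ncontrol(1009)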

              • #8
                The reason the pooled regression result is now significant is that you didn't cluster your standard errors. With -reg- you should still use cluster(id).
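
                For example, a sketch of the pooled specification from #5 with clustered standard errors (same variable list as before):

                Code:
                reg lever_w i.treat_time##i.treat2 size_w nfa_ta_w cash_ta_w ///
                    trade_credit_w sales_grow_w roa_w pb_w cfo_ta_w rddcc_dum ///
                    div_dum nw_ta_w age ff48##i.year i.id, vce(cluster id)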

                • #9
                  Thanks a lot Clyde Schechter for the detailed exposition on -power-; it was effortless to take a thorough Stata tour through the GUI. And thanks for walking me through effect sizes, which I haven't used before.

                  Dear Jeff Wooldridge, I didn't use cluster(id) because "estat esize" doesn't work with the cluster option, I guess:

                  Code:
                  reg lever_w cash_ta_w,cluster (id )
                  
                  Linear regression                               Number of obs     =     10,634
                                                                  F(1, 1794)        =     223.95
                                                                  Prob > F          =     0.0000
                                                                  R-squared         =     0.0712
                                                                  Root MSE          =     .20158
                  
                                                   (Std. err. adjusted for 1,795 clusters in id)
                  ------------------------------------------------------------------------------
                               |               Robust
                       lever_w | Coefficient  std. err.      t    P>|t|     [95% conf. interval]
                  -------------+----------------------------------------------------------------
                     cash_ta_w |  -1.036097   .0692343   -14.97   0.000    -1.171885   -.9003088
                         _cons |   .3621274   .0061325    59.05   0.000     .3500997    .3741551
                  ------------------------------------------------------------------------------
                  
                  . estat esize
                  estat esize only works with vce(ols)
                  r(198);
                  I am not sure whether estat esize is available after "xtreg" at all.

                  Thanks to you both once again

                  • #10
                    Dear Clyde Schechter,
                    I have a doubt (a dumb one) about this, regarding the fixed effects. When we use firm and year fixed effects, with, say, 10 firms and 5 years, should we treat them as 2 controls (firm FE + year FE, counting each set as one) or as 13 controls (9 + 4, excluding the 2 base-level dummies)?

                    • #11
                      Treat them as 13.

                      • #12
                        Dear Members,

                        I have a question regarding panel regression. I ran one to find out whether the circular economy indicators have an impact on the SDGI, and received the following results:
                        X6 = Recycling rate of municipal waste
                        X7 = Recycling rate of packaging waste by type of packaging
                        X8 = Recycling rate of waste of electrical and electronic equipment (WEEE) separately collected

                             Y | Coefficient   Std. err.       t    P>|t|   [95% conf. interval]
                        -------+---------------------------------------------------------------
                            X6 |    0.111388    0.010814   10.30    0.000    0.09008    0.132696
                            X7 |    0.005514    0.013992    0.39    0.694   -0.02206    0.033087
                            X8 |   -0.04221     0.009389   -4.50    0.000   -0.06071   -0.02371
                          Cons |   77.9081      1.140163   68.33    0.000   75.66139   80.1548

                        The reviewer asked me to calculate f^2 (I suppose in order to report the effect size). Could you help me with this? How can I calculate f^2 for a panel data regression?
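
                        For reference, Cohen's f^2 for a single predictor is (R2_full - R2_reduced)/(1 - R2_full). A minimal sketch for a fixed-effects panel model, using the within R-squared that xtreg stores in e(r2_w); the names sdgi, x6, x7, and x8 are hypothetical stand-ins for the variables in the table above, and the target here is the f^2 of X7 (whether the within or the overall R-squared is the appropriate basis is itself a judgment call):

                        Code:
                        * full model
                        xtreg sdgi x6 x7 x8, fe
                        scalar r2_full = e(r2_w)
                        * reduced model, dropping x7
                        xtreg sdgi x6 x8, fe
                        scalar r2_red = e(r2_w)
                        * Cohen's f^2 for x7
                        display "f2 = " (r2_full - r2_red)/(1 - r2_full)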
