
  • #31
    This is the output of the heteroskedasticity check:
    [Attached image: Heteroskedasticity.jpg]



    • #32
      Francesca:
      the model looks ok to me.
      However, you might have a minor heteroskedasticity issue: just impose cluster-robust SEs and see whether the 95% CIs differ from those obtained with the default SEs.
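      A minimal sketch of that comparison (assuming the model and variable names from your earlier posts, and that the data are already -xtset-) might be:

      Code:
      * same -fe- model with default and with cluster-robust SEs
      quietly xtreg TobinsQ ROA DE LNTA YoYSales RDCS i.Years, fe
      estimates store fe_default
      quietly xtreg TobinsQ ROA DE LNTA YoYSales RDCS i.Years, fe vce(cluster Company1)
      estimates store fe_cluster
      * wider standard errors imply wider 95% confidence intervals
      estimates table fe_default fe_cluster, b(%9.4f) se(%9.4f)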
      Kind regards,
      Carlo
      (Stata 19.0)



      • #33
        Carlo, thank you very much again. I invoked robust standard errors; below you can find the output with and without them. It seems to me that the CIs are narrower in the model without robust standard errors; do you agree?

        Code:
        xtreg TobinsQ ROA DE LNTA YoYSales RDCS i.Years, fe vce(cluster Company1)
        
        Fixed-effects (within) regression               Number of obs     =      1,060
        Group variable: Company1                        Number of groups  =        212
        
        R-sq:                                           Obs per group:
             within  = 0.1893                                         min =          5
             between = 0.0031                                         avg =        5.0
             overall = 0.0070                                         max =          5
        
                                                        F(9,211)          =      10.82
        corr(u_i, Xb)  = -0.7982                        Prob > F          =     0.0000
        
                                     (Std. Err. adjusted for 212 clusters in Company1)
        ------------------------------------------------------------------------------
                     |               Robust
             TobinsQ |      Coef.   Std. Err.      t    P>|t|     [95% Conf. Interval]
        -------------+----------------------------------------------------------------
                 ROA |   1.677901   .5152768     3.26   0.001     .6621507    2.693651
                  DE |  -.2066595   .0689203    -3.00   0.003    -.3425202   -.0707989
                LNTA |  -1.350005   .2896686    -4.66   0.000     -1.92102   -.7789899
            YoYSales |   .7732031   .2754733     2.81   0.005     .2301706    1.316236
                RDCS |  -.7317888   .3324677    -2.20   0.029    -1.387173    -.076405
                     |
               Years |
               2014  |   .1331407    .082901     1.61   0.110    -.0302795    .2965609
               2015  |   .0866756   .1132684     0.77   0.445     -.136607    .3099582
               2016  |   .2196538   .1044505     2.10   0.037     .0137535    .4255541
               2017  |    .679877   .1435643     4.74   0.000     .3968728    .9628811
                     |
               _cons |   30.76325   6.044455     5.09   0.000     18.84799     42.6785
        -------------+----------------------------------------------------------------
             sigma_u |  3.1345762
             sigma_e |  1.0474911
                 rho |  .89954618   (fraction of variance due to u_i)
        ------------------------------------------------------------------------------
        
        . 
        . xtreg TobinsQ ROA DE LNTA YoYSales RDCS i.Years, fe
        
        Fixed-effects (within) regression               Number of obs     =      1,060
        Group variable: Company1                        Number of groups  =        212
        
        R-sq:                                           Obs per group:
             within  = 0.1893                                         min =          5
             between = 0.0031                                         avg =        5.0
             overall = 0.0070                                         max =          5
        
                                                        F(9,839)          =      21.76
        corr(u_i, Xb)  = -0.7982                        Prob > F          =     0.0000
        
        ------------------------------------------------------------------------------
             TobinsQ |      Coef.   Std. Err.      t    P>|t|     [95% Conf. Interval]
        -------------+----------------------------------------------------------------
                 ROA |   1.677901   .4205996     3.99   0.000     .8523498    2.503452
                  DE |  -.2066595   .0667669    -3.10   0.002    -.3377094   -.0756097
                LNTA |  -1.350005    .129184   -10.45   0.000    -1.603567   -1.096443
            YoYSales |   .7732031   .1465645     5.28   0.000      .485527    1.060879
                RDCS |  -.7317888   .1967754    -3.72   0.000    -1.118019   -.3455589
                     |
               Years |
               2014  |   .1331407   .1034784     1.29   0.199    -.0699663    .3362476
               2015  |   .0866756   .1058546     0.82   0.413    -.1210953    .2944466
               2016  |   .2196538   .1098901     2.00   0.046     .0039619    .4353456
               2017  |    .679877   .1150542     5.91   0.000     .4540491    .9057048
                     |
               _cons |   30.76325   2.699273    11.40   0.000     25.46513    36.06137
        -------------+----------------------------------------------------------------
             sigma_u |  3.1345762
             sigma_e |  1.0474911
                 rho |  .89954618   (fraction of variance due to u_i)
        ------------------------------------------------------------------------------
        F test that all u_i=0: F(211, 839) = 14.60                   Prob > F = 0.0000
        
        .

        However, I would like to ask your opinion on the following:

        - Why, if my observation years run from 2013 to 2017 (both included), do I see only the years from 2014 onwards when I add the year dummy variables?
        - How can I control for the company effect if I don't invoke robust standard errors?
        - When I do the Hausman test, do I need to include the year dummy variables, or is that not necessary?
        - If my F value increases, does that mean the model becomes more significant?

        Many thanks in advance,

        Francesca



        • #34
          Francesca:
          - I would keep the model with cluster-robust SEs (and this also answers your question #2), because those SEs are actually higher than the default ones (for some coefficients they almost double);
          - 2013 is omitted by default to shelter your regression from the so-called dummy variable trap (https://en.wikipedia.org/wiki/Dummy_...(statistics));
          - yes, include the year dummy variables in both the -fe- and -re- regressions, which should now be compared via the community-contributed command -xtoverid-, since you invoked non-default SEs, which -hausman- does not support (see the sketch below this list);
          - not quite: the F-test investigates whether your coefficients jointly differ from zero.
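          A minimal sketch of the -xtoverid- route (assuming the community-contributed command is installed, e.g. via -ssc install xtoverid-, and the same specification as in #33):

          Code:
          * estimate the -re- counterpart with the same cluster-robust SEs, then test
          xtreg TobinsQ ROA DE LNTA YoYSales RDCS i.Years, re vce(cluster Company1)
          xtoverid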
          Kind regards,
          Carlo
          (Stata 19.0)



          • #35
            Thank you very much as always, Carlo, and sorry for the delayed response!

            First of all, merry Christmas and happy holidays! I just realized that I did not take into account potential outliers that should perhaps be eliminated from my model. Do you think this is a major issue, or can I avoid checking for outliers?

            Many thanks again and my best wishes,

            Francesca



            • #36
              Francesca:
              I do reciprocate best wishes for the Xmas season to you and your dears.
              Sticking with statistics, eliminating outliers (unless you're 100% sure that they are the result of mistaken data entry) is, in general, a very bad idea. What we call outliers are, in general, expressions of the data-generating process underlying our samples. For instance, the statistical distribution of the total cost of a given activity is frequently positively skewed (gamma distribution).
              In sum, think very carefully about eliminating "weird" observations: personally, I do not advise that approach.
              Kind regards,
              Carlo
              (Stata 19.0)



              • #37
                Thank you very much Carlo! However, I have another concern. When I apply robust standard errors and plot the residuals again to check for heteroskedasticity, the plot seems to stay the same as before invoking robust standard errors (I attach the graph below). Is that normal?



                • #38
                  This is the output after invoking robust standard errors
                  Attached Files



                  • #39
                    Would it be better to apply the natural logarithm to some of the variables? If I do so and plot for heteroskedasticity again, this is the output I get:
                    Code:
                    xtreg LNTobinsQ LNTA ROA LNDE lnYoYSales i.Years, fe
                    
                    Fixed-effects (within) regression               Number of obs     =      1,049
                    Group variable: Company1                        Number of groups  =        210
                    
                    R-sq:                                           Obs per group:
                         within  = 0.2717                                         min =          4
                         between = 0.0011                                         avg =        5.0
                         overall = 0.0058                                         max =          5
                    
                                                                    F(8,831)          =      38.76
                    corr(u_i, Xb)  = -0.7684                        Prob > F          =     0.0000
                    
                    ------------------------------------------------------------------------------
                       LNTobinsQ |      Coef.   Std. Err.      t    P>|t|     [95% Conf. Interval]
                    -------------+----------------------------------------------------------------
                            LNTA |  -.3862058   .0367041   -10.52   0.000    -.4582493   -.3141622
                             ROA |   .9756964   .1211722     8.05   0.000     .7378569    1.213536
                            LNDE |  -.0886246   .0141089    -6.28   0.000    -.1163178   -.0609314
                      lnYoYSales |   .0513431    .009779     5.25   0.000     .0321487    .0705375
                                 |
                           Years |
                           2014  |   .0703752   .0295889     2.38   0.018     .0122975    .1284529
                           2015  |   .0508165    .030604     1.66   0.097    -.0092537    .1108866
                           2016  |   .1006498   .0316692     3.18   0.002     .0384888    .1628109
                           2017  |    .266613   .0331686     8.04   0.000     .2015089    .3317171
                                 |
                           _cons |   8.532238   .7666167    11.13   0.000     7.027505    10.03697
                    -------------+----------------------------------------------------------------
                         sigma_u |  .98246626
                         sigma_e |  .29886225
                             rho |  .91530234   (fraction of variance due to u_i)
                    ------------------------------------------------------------------------------
                    F test that all u_i=0: F(209, 831) = 19.11                   Prob > F = 0.0000
                    The R-squared increases from 17% to 27% and the heteroskedasticity improves:
                    Attached Files



                    • #40
                      Thank you very much in advance.



                      • #41
                        Francesca:
                        what you experienced (#37) happens to everybody when getting familiar with -regress- (and I lead the queue of those frightened by seeing the same graph before and after invoking robust standard errors).
                        Indeed, re-checking for heteroskedasticity after invoking cluster-robust standard errors is not useful at all: the graph will remain the same because the -robust- option corrects the standard errors, not the residuals.
                        As far as your second question is concerned, there actually does seem to be heteroskedasticity in your last graph (the spread of the residuals seems to widen as the fitted values increase). It is worth checking whether the heteroskedasticity depends on misspecification (a rough way to do that is sketched below).
                        Logging some variables at random makes little methodological sense (by the way, from your code I cannot tell which variables, if any, have already been logged): logging the regressand and/or the predictors creates different regression models, whose coefficients can be difficult to understand and/or disseminate.
                        If the heteroskedasticity is not due to model misspecification, invoking cluster-robust standard errors is, in general, enough.
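                        A rough sketch of such a check (my usual manual RESET-style approach, an assumption on my part rather than a formal panel RESET, using the -fe- specification from your #33):

                        Code:
                        quietly xtreg TobinsQ ROA DE LNTA YoYSales RDCS i.Years, fe vce(cluster Company1)
                        predict double fitted, xb
                        generate double fitted2 = fitted^2
                        generate double fitted3 = fitted^3
                        * if the powers of the fitted values are jointly significant, the functional form is suspect
                        xtreg TobinsQ ROA DE LNTA YoYSales RDCS i.Years fitted2 fitted3, fe vce(cluster Company1)
                        testparm fitted2 fitted3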
                        Kind regards,
                        Carlo
                        (Stata 19.0)



                        • #42
                          Just to add on to Carlo's helpful comments, you're fixating way too much on heteroskedasticity. It's much more likely that serial correlation is a bigger issue, if you think of these problems the way they are traditionally taught. How come you're testing for heteroskedasticity and not serial correlation?

                          The point is, you should test for neither. The clustering accounts for any kind of heteroskedasticity and any kind of serial correlation. You're not changing the estimates -- it's still standard fixed effects -- but you're computing robust standard errors. They're robust to heteroskedasticity and serial correlation, and that's why almost all empirical researchers compute them now and do not even bother to check if either is a problem. The standard errors work in either case.

                          Two things about taking the log of TobinsQ. First, note that you've lost 11 observations, which is due to TobinsQ <=0 in 11 cases. And you've lost two firms entirely. So you don't want to do this. And even if you did, it makes no sense to compare R-squareds across different transformations of the dependent variable. It's possible to compute an R-squared that is comparable, but, since you are losing data taking the logs, you shouldn't do that. It is often true that the R-squared using logs is higher but that doesn't mean you should do it, especially when the samples aren't comparable.

                          Do you have any negative values of TobinsQ, or just zeros?

                          Above I suggested that you stop with the results in post #33, using the results with clustered standard errors. That's still my suggestion. Is someone insisting you test for heteroskedasticity?



                          • #43
                            Dear Carlo and Jeff, thank you very much for your support, it is really important to me! First of all, apologies for my late response and wish you all the best for the new year!

                            I understand that heteroskedasticity is definitely not a big issue (especially after applying robust standard errors), and thank you for your patience in explaining to me why. I was using previous theses with regression analyses as a reference, and I noticed that they checked for heteroskedasticity before and after some transformations; that's why I was erroneously focusing on it too much. The minimum value of my Tobin's Q is 0.18, and now I also understand why it doesn't make sense to apply the log. So, really, thank you.

                            However, I would like to kindly ask you another clarification on my analysis:

                            - When I run the regression including time effects, i.e. the year dummy variables, what is a good explanation for the fact that only one year is statistically significant?

                            Below I provide an output for your reference:

                            Code:
                            . xtreg TobinsQ LNTA ROA DE YoYSales RDS i.Years, fe vce(cluster Company1)
                            
                            Fixed-effects (within) regression               Number of obs     =      1,060
                            Group variable: Company1                        Number of groups  =        212
                            
                            R-sq:                                           Obs per group:
                                 within  = 0.1768                                         min =          5
                                 between = 0.0030                                         avg =        5.0
                                 overall = 0.0071                                         max =          5
                            
                                                                            F(9,211)          =      10.54
                            corr(u_i, Xb)  = -0.7614                        Prob > F          =     0.0000
                            
                                                         (Std. Err. adjusted for 212 clusters in Company1)
                            ------------------------------------------------------------------------------
                                         |               Robust
                                 TobinsQ |      Coef.   Std. Err.      t    P>|t|     [95% Conf. Interval]
                            -------------+----------------------------------------------------------------
                                    LNTA |  -1.201734   .2806209    -4.28   0.000    -1.754914   -.6485543
                                     ROA |   1.799148   .4931036     3.65   0.000      .827107    2.771189
                                      DE |  -.2142444   .0701898    -3.05   0.003    -.3526075   -.0758813
                                YoYSales |   .8832866   .2885457     3.06   0.002     .3144848    1.452088
                                     RDS |  -.9375915   1.432355    -0.65   0.513     -3.76115    1.885967
                                         |
                                   Years |
                                   2014  |   .1154523   .0818341     1.41   0.160    -.0458648    .2767695
                                   2015  |   .0553842   .1094215     0.51   0.613    -.1603152    .2710835
                                   2016  |   .1661806   .0992611     1.67   0.096      -.02949    .3618511
                                   2017  |   .6106977   .1347841     4.53   0.000     .3450019    .8763936
                                         |
                                   _cons |   27.29121   5.722012     4.77   0.000     16.01158    38.57085
                            -------------+----------------------------------------------------------------
                                 sigma_u |  2.9067784
                                 sigma_e |  1.0554858
                                     rho |  .88350911   (fraction of variance due to u_i)
                            ------------------------------------------------------------------------------
                            
                            .
                            Many thanks in advance!

                            Francesca



                            • #44
                              Francesca:
                              best wishes for the newly begun 2020 to you, too.
                              You should rather consider the joint statistical significance of -i.Years- via:
                              Code:
                              testparm i.Years

                              It may well be that -i.Years-, once adjusted for the remaining predictors, does not play a relevant (jointly speaking) role in explaining the within-panel variation of the regressand (as you're dealing with an -fe- specification).
                              Kind regards,
                              Carlo
                              (Stata 19.0)



                              • #45
                                Thank you very much Carlo!

                                Now that I am going to interpret the coefficients, I am wondering if my interpretation is correct. This is the output I get, for example, from my first model:

                                Code:
                                . xtreg TobinsQ LNTA ROA DE YoYSales RDS i.Years, fe vce(cluster Company1)
                                
                                Fixed-effects (within) regression               Number of obs     =      1,060
                                Group variable: Company1                        Number of groups  =        212
                                
                                R-sq:                                           Obs per group:
                                     within  = 0.1768                                         min =          5
                                     between = 0.0030                                         avg =        5.0
                                     overall = 0.0071                                         max =          5
                                
                                                                                F(9,211)          =      10.54
                                corr(u_i, Xb)  = -0.7614                        Prob > F          =     0.0000
                                
                                                             (Std. Err. adjusted for 212 clusters in Company1)
                                ------------------------------------------------------------------------------
                                             |               Robust
                                     TobinsQ |      Coef.   Std. Err.      t    P>|t|     [95% Conf. Interval]
                                -------------+----------------------------------------------------------------
                                        LNTA |  -1.201734   .2806209    -4.28   0.000    -1.754914   -.6485543
                                         ROA |   1.799148   .4931036     3.65   0.000      .827107    2.771189
                                          DE |  -.2142444   .0701898    -3.05   0.003    -.3526075   -.0758813
                                    YoYSales |   .8832866   .2885457     3.06   0.002     .3144848    1.452088
                                         RDS |  -.9375915   1.432355    -0.65   0.513     -3.76115    1.885967
                                             |
                                       Years |
                                       2014  |   .1154523   .0818341     1.41   0.160    -.0458648    .2767695
                                       2015  |   .0553842   .1094215     0.51   0.613    -.1603152    .2710835
                                       2016  |   .1661806   .0992611     1.67   0.096      -.02949    .3618511
                                       2017  |   .6106977   .1347841     4.53   0.000     .3450019    .8763936
                                             |
                                       _cons |   27.29121   5.722012     4.77   0.000     16.01158    38.57085
                                -------------+----------------------------------------------------------------
                                     sigma_u |  2.9067784
                                     sigma_e |  1.0554858
                                         rho |  .88350911   (fraction of variance due to u_i)
                                ------------------------------------------------------------------------------
                                
                                .
                                Is it correct to state the following (holding the other regressors fixed)?
                                • When the natural logarithm of total assets increases by one unit, Tobin's Q decreases by about 1.2 units;
                                • When the return on assets increases by one unit, Tobin's Q increases by about 1.8 units;
                                • When the debt-to-equity ratio increases by one unit, Tobin's Q decreases by about 0.2 units;
                                • When year-over-year sales growth increases by one unit, Tobin's Q increases by about 0.9 units;
                                • When research and development expenditure intensity increases by one unit, Tobin's Q decreases by about 0.9 units.
                                Or should I consider the increases/decreases in percentages rather than in units?
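                                For example (just my attempt, assuming -lincom- is the right tool here), to express the LNTA effect for a 1% increase in total assets I could run, right after the regression above:

                                Code:
                                * a 1% increase in total assets raises LNTA by roughly ln(1.01), i.e. about 0.01,
                                * so the implied change in Tobin's Q is about 0.01 times the LNTA coefficient
                                lincom 0.01*LNTA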

                                Many thanks in advance,

                                Francesca

