How to deal with ratios as dependent and independent variables in a panel data regression

Carlo Lazzaro

Join Date: Apr 2014

Posts: 17675
#16

17 Dec 2019, 07:27

Francesca:
unfortunately, I cannot see any graph!

Kind regards,
Carlo
(Stata 19.0)
Comment
Francesca Sossella

Join Date: Nov 2019

Posts: 29
#17

17 Dec 2019, 07:28

Code:
Comment
Francesca Sossella

Join Date: Nov 2019

Posts: 29
#18

17 Dec 2019, 07:31

Apologies Carlo, I attached the graph, since I was not able to copy it in the proper way.

Thanks,

Francesca

Comment
Jeff Wooldridge

Join Date: Apr 2014

Posts: 2121
#19

17 Dec 2019, 07:34

Francesca: Of this list,

1-Test for multicollinearity
2-Test for normality
3-Test for heteroskedasticity
4-Insert dummy variable for years

You should only do the last one. It's essentially necessary to control for aggregate time effects. You also need to add vce(cluster firmid) to the end of both the RE and FE commands in Stata, where firmid is the firm identifier (you didn't show this part of your code).

None of the other issues is a problem. Multicollinearity can make standard errors large, but that's the way it is. Using robust standard errors accounts for (3) along with any serial correlation. Normality is not an issue because everything is based on asymptotics, anyway.

Code:

xtreg TobinsQ ROA DE LNTA YoYSales i.year, re vce(cluster firmid)
xtoverid
xtreg TobinsQ ROA DE LNTA YoYSales i.year, fe vce(cluster firmid)
Comment
Carlo Lazzaro

Join Date: Apr 2014

Posts: 17675
#20

17 Dec 2019, 07:47

Jeff:
enlightening as usual.
It took me tons of words to try to say the same things you wrapped up in a few lines: that's talent!

Kind regards,
Carlo
(Stata 19.0)
Comment
Carlo Lazzaro

Join Date: Apr 2014

Posts: 17675
#21

17 Dec 2019, 07:52

Francesca:
the scatter plot seems to widen for values of the linear predictor >=4.
Hence, you might have a heteroskedasticity issue that you can easily manage with robust/clustered standard errors, as Jeff said.

Kind regards,
Carlo
(Stata 19.0)
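
A minimal sketch of this kind of visual check, assuming Stata's bundled nlswork example data rather than the firm panel discussed in this thread. The variables ln_wage, tenure, hours and idcode belong to that dataset, and xbhat and ehat are hypothetical names chosen for illustration. It plots the idiosyncratic residuals against the linear prediction after -xtreg, fe-, then refits the same model with clustered standard errors.

```stata
* Sketch only: nlswork is Stata's example data, not the thread's firm panel
webuse nlswork, clear
xtset idcode year

* Fixed-effects fit with default standard errors
xtreg ln_wage tenure hours i.year, fe

* Linear prediction and idiosyncratic residuals (hypothetical variable names)
predict double xbhat if e(sample), xb
predict double ehat if e(sample), e

* A fan shape in this plot hints at heteroskedasticity
scatter ehat xbhat, msize(tiny) yline(0)

* Same coefficients; standard errors now robust to heteroskedasticity
* and to serial correlation within each panel unit
xtreg ln_wage tenure hours i.year, fe vce(cluster idcode)
```

The point estimates do not change between the two fits; only the standard errors and the inference built on them do.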
Comment
Francesca Sossella

Join Date: Nov 2019

Posts: 29
#22

17 Dec 2019, 08:42

Thank you very much both!

Jeff: when I copy paste the code you kindly suggested, I get this:

Code:

xtreg TobinsQ ROA DE LNTA YoYSales i.Years, re vce(cluster Company1) xtoverid xtreg TobinsQ ROA DE LNTA YoYSales i.Years, fe vce(clus
> ter Company1)
invalid 'fe'
r(198);

Am I doing it correctly? Or should I use the commands in a different order?

Many thanks in advance,

Francesca
Comment

Carlo Lazzaro

Join Date: Apr 2014
Posts: 17675

#23

17 Dec 2019, 08:51

Francesca:
the issue is that -xtoverid- supports the -re- specification only:

Code:

. use "http://www.stata-press.com/data/r15/nlswork.dta"
(National Longitudinal Survey.  Young Women 14-26 years of age in 1968)

. quietly xtreg ln_wage age i.race, fe

. xtoverid
xtoverid not compatible with xtreg model fe
r(198);

. quietly xi: xtreg ln_wage age i.race, re

. xtoverid

Test of overidentifying restrictions: fixed vs random effects
Cross-section time-series model: xtreg re  
Sargan-Hansen statistic  14.662  Chi-sq(1)    P-value = 0.0001

.

In this toy example, the -xtoverid- outcome points towards the -fe- specification.

Kind regards,
Carlo
(Stata 19.0)

Comment

Francesca Sossella

Join Date: Nov 2019

Posts: 29
#24

17 Dec 2019, 09:05

Thank you Carlo!
This is the output I get:

Code:

quietly xi: xtreg TobinsQ ROA DE LNTA YoYSales i.Years, re vce(cluster Company1)

. xtoverid

Test of overidentifying restrictions: fixed vs random effects
Cross-section time-series model: xtreg re  robust cluster(Company1)
Sargan-Hansen statistic  19.674  Chi-sq(4)    P-value = 0.0006

Correct me if I am wrong, please:

-In the above-mentioned example I invoked robust standard errors to correct for heteroskedasticity
-For this reason I can no longer use the Hausman test, and I apply the Sargan-Hansen test instead
-The output I get leads me to conclude that I should use fixed effects

Many thanks again,

Francesca
Comment
Carlo Lazzaro

Join Date: Apr 2014

Posts: 17675
#25

17 Dec 2019, 09:19

Francesca:
perfect.
One minor correction only: since you invoked non-default standard errors, you switched from -hausman- to the community-contributed command -xtoverid-, which reports the Sargan-Hansen statistic (for more details, see the comprehensive help file that comes with -xtoverid-).
Needless to say, you can also use -xtoverid- without putting the -quietly- prefix before -xtreg-.

Kind regards,
Carlo
(Stata 19.0)
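
A minimal sketch of the two routes mentioned here, assuming Stata's bundled nlswork example data instead of the thread's firm panel. The stored-estimate names fe_default and re_default are hypothetical, and -xtoverid- is assumed to be already installed from SSC, possibly along with dependencies it relies on. With default standard errors the classical -hausman- comparison is available; with clustered standard errors the RE model is followed by -xtoverid-.

```stata
* Sketch only: nlswork is Stata's example data, not the thread's firm panel
webuse nlswork, clear
xtset idcode year

* Route 1: default standard errors, classical Hausman test
quietly xtreg ln_wage age tenure, fe
estimates store fe_default
quietly xtreg ln_wage age tenure, re
estimates store re_default
hausman fe_default re_default

* Route 2: clustered standard errors, so -hausman- is replaced by -xtoverid-
* (community-contributed; assumed installed, e.g. via: ssc install xtoverid)
xtreg ln_wage age tenure, re vce(cluster idcode)
xtoverid
```

In both routes a small p-value is evidence against the random-effects assumptions and therefore in favour of fixed effects.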
Comment

Francesca Sossella

Join Date: Nov 2019
Posts: 29

#26

17 Dec 2019, 10:11

Thank you Carlo! So I can drop -quietly-, since it is not necessary, right?

What I did next is:
-Use -xtreg- (invoking robust standard errors and inserting dummy variables for years)
-Test the joint statistical significance of i.Years, as you suggested before

This is the output I get:

Code:

 xtreg TobinsQ ROA DE LNTA YoYSales i.Years, fe vce(cluster Company1)

Fixed-effects (within) regression               Number of obs     =      1,060
Group variable: Company1                        Number of groups  =        212

R-sq:                                           Obs per group:
     within  = 0.1759                                         min =          5
     between = 0.0037                                         avg =        5.0
     overall = 0.0080                                         max =          5

                                                F(8,211)          =      11.51
corr(u_i, Xb)  = -0.7620                        Prob > F          =     0.0000

                             (Std. Err. adjusted for 212 clusters in Company1)
------------------------------------------------------------------------------
             |               Robust
     TobinsQ |      Coef.   Std. Err.      t    P>|t|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
         ROA |   1.936935   .5375583     3.60   0.000     .8772617    2.996607
          DE |  -.2134501   .0699562    -3.05   0.003    -.3513527   -.0755475
        LNTA |  -1.207298   .2799577    -4.31   0.000     -1.75917   -.6554253
    YoYSales |   .8969609   .2943407     3.05   0.003     .3167357    1.477186
             |
       Years |
       2014  |   .1150691    .081846     1.41   0.161    -.0462715    .2764097
       2015  |   .0570657   .1090352     0.52   0.601    -.1578721    .2720035
       2016  |   .1695347   .1001073     1.69   0.092    -.0278038    .3668732
       2017  |   .6175279   .1378719     4.48   0.000     .3457451    .8893106
             |
       _cons |   27.23429   5.761046     4.73   0.000     15.87771    38.59087
-------------+----------------------------------------------------------------
     sigma_u |  2.9082317
     sigma_e |  1.0554605
         rho |  .88361688   (fraction of variance due to u_i)
------------------------------------------------------------------------------

. 
. 
. testparm(i.Years)

 ( 1)  2014.Years = 0
 ( 2)  2015.Years = 0
 ( 3)  2016.Years = 0
 ( 4)  2017.Years = 0

       F(  4,   211) =   10.41
            Prob > F =    0.0000

.

I would like to kindly ask your opinion on the following:

-Is it okay that only year 2017 is individually significant, given that the year dummies are jointly significant?
-So far I am testing the model that includes only the control variables.
So,
-When I include in the model the independent variable R&D/Net sales, which is the one I want to test, will I have to repeat the same checks again? Or can I, for example, use fixed effects directly, since I have already tested for them in the initial model?

Many thanks again,

Francesca

Comment

Jeff Wooldridge

Join Date: Apr 2014

Posts: 2121
#27

17 Dec 2019, 11:12

Francesca: Don't overwork model selection. Use the output above with the full set of year dummies. You're essentially done now. You have company FEs, year FEs, and you've clustered your standard errors. RE is clearly rejected. There's not much more to do except put in squares and interactions to see if there might be nonlinearities. There are some more advanced things, like testing for strict exogeneity.
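
A minimal sketch of these two follow-ups, assuming Stata's bundled nlswork example data rather than the thread's firm panel (ln_wage, tenure, hours and idcode are that dataset's variables). The first model adds a square and an interaction through factor-variable notation and tests them jointly. The second adds a one-period lead of a regressor, which is one common and simple way to probe strict exogeneity and not necessarily the test intended above.

```stata
* Sketch only: nlswork is Stata's example data, not the thread's firm panel
webuse nlswork, clear
xtset idcode year

* Unit FE, year FE, clustered SEs, plus a square and an interaction
xtreg ln_wage c.tenure##c.tenure c.hours c.tenure#c.hours i.year, ///
    fe vce(cluster idcode)

* Joint test of the nonlinear terms
test c.tenure#c.tenure c.tenure#c.hours

* Strict-exogeneity probe: under strict exogeneity the lead of a
* regressor should have no explanatory power in the FE regression
xtreg ln_wage tenure hours F.tenure i.year, fe vce(cluster idcode)
test F.tenure
```

A significant lead suggests feedback from the outcome to future values of that regressor, which would call the fixed-effects estimates into question.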
Comment
Francesca Sossella

Join Date: Nov 2019

Posts: 29
#28

17 Dec 2019, 11:41

Thank you very much Jeff! I really appreciate it.
So, if I understand correctly, when I include in the model the variable that I want to test, that is R&D intensity, I don't need to repeat the checks and I can go directly to considering squares and interactions?

Many thanks,

Francesca
Comment
Carlo Lazzaro

Join Date: Apr 2014

Posts: 17675
#29

18 Dec 2019, 00:08

Francesca:
the main issue is to use the model that gives the truest and fairest view of the data generating process you're analyzing via your sample.
Introducing squared terms and interactions makes sense in this respect.
If the R&D predictor is part of the data generating process, you should have included it from the start.
As far as your other questions are concerned:
- it makes perfect sense that only 2017 is statistically significant on its own while the years, taken together, are jointly significant;
- the -quietly- prefix is useful when you're not interested in the Stata output (because you already know it from a previous Stata session and/or your main goal is to investigate the outcome of a post-regression test).

Kind regards,
Carlo
(Stata 19.0)
Comment

Francesca Sossella

Join Date: Nov 2019
Posts: 29

#30

21 Dec 2019, 07:07

Carlo, thank you very much! Sorry for the delay in my response. I included R&D capital as well in my model, did the Hausman test and accepted fixed effects. When I check for heteroskedasticity, the output I get makes me think that there is not a big issue. I will attach the graph below. Do you suggest applying robust standard errors anyway, or is there no need? Also, I don't see problems of multicollinearity. Please see below the output I get.

Many thanks again,

Francesca

Code:

xtreg TobinsQ ROA DE LNTA YoYSales RDCS, fe

Fixed-effects (within) regression               Number of obs     =      1,060
Group variable: Company1                        Number of groups  =        212

R-sq:                                           Obs per group:
     within  = 0.1456                                         min =          5
     between = 0.0041                                         avg =        5.0
     overall = 0.0083                                         max =          5

                                                F(5,843)          =      28.73
corr(u_i, Xb)  = -0.6989                        Prob > F          =     0.0000

------------------------------------------------------------------------------
     TobinsQ |      Coef.   Std. Err.      t    P>|t|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
         ROA |   1.488559   .4294244     3.47   0.001     .6456922    2.331425
          DE |  -.1817562   .0676505    -2.69   0.007    -.3145394    -.048973
        LNTA |  -1.015895   .1177476    -8.63   0.000    -1.247008   -.7847824
    YoYSales |   .8005307   .1497289     5.35   0.000     .5066454    1.094416
        RDCS |  -.5583195    .198884    -2.81   0.005    -.9486854   -.1679536
       _cons |   23.90587   2.478005     9.65   0.000     19.04209    28.76966
-------------+----------------------------------------------------------------
     sigma_u |   2.620492
     sigma_e |  1.0727754
         rho |  .85646395   (fraction of variance due to u_i)
------------------------------------------------------------------------------
F test that all u_i=0: F(211, 843) = 13.77                   Prob > F = 0.0000

. 
. estat vce, corr

Correlation matrix of coefficients of xtreg model

        e(V) |      ROA        DE      LNTA  YoYSales      RDCS     _cons 
-------------+------------------------------------------------------------
         ROA |   1.0000                                                   
          DE |   0.1280    1.0000                                         
        LNTA |  -0.0210   -0.2125    1.0000                               
    YoYSales |  -0.0635   -0.0338   -0.0491    1.0000                     
        RDCS |   0.1805   -0.0482    0.2550    0.2306    1.0000           
       _cons |   0.0066    0.1970   -0.9977    0.0266   -0.3171    1.0000 

. 
. 
. vif, unc

    Variable |       VIF       1/VIF  
-------------+----------------------
        LNTA |      4.41    0.226612
        RDCS |      3.39    0.295221
          DE |      1.53    0.651736
         ROA |      1.24    0.809279
    YoYSales |      1.22    0.816649
-------------+----------------------
    Mean VIF |      2.36
