Dear Statalisters, I am struggling to build a two-step estimation procedure.
My dependent variable is the count of new entrants at the regional level (NUTS-2) from 1990 to 2020. I am interested in the effect of two regressors: the knowledge stock at the NUTS-2 level (proxied by a fractional count of inventors for each patent, so it is a continuous variable rather than a proper count) and R&D expenditures aggregated at the NUTS-2 level. I have estimated a panel Poisson model with fixed effects, "xtpoisson new_entrants L1.k_stock L1.R&D L1.controls, fe", following Wooldridge (1999), thereby avoiding the fixed-effects panel specification of the negative binomial.
However, as you may imagine, R&D expenditures and the knowledge stock are highly correlated, so including both regressors in the same model raises multicollinearity concerns. I would therefore like to implement a first step in which the knowledge stock is regressed on lagged R&D expenditures (y = knowledge stock at time t, x = R&D expenditures at time t-1), and then take the residuals. In the second step, I would re-estimate my main model, replacing the knowledge stock with the first-step residuals (which can be interpreted as the part of the knowledge stock NOT explained by R&D expenditures undertaken by large corporations, but due to serendipitous inventions by individuals): "xtpoisson new_entrants L1.res L1.controls, fe".
For the FIRST STEP, I considered estimating a tobit model, since the knowledge stock ranges from 0 to 860.35 and many corner solutions occur at zero: "tobit k_stock L1.R&D, nocons ll(0)" or even "xttobit k_stock L1.R&D, nocons ll(0)".
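For concreteness, the two steps described above could be sketched in Stata roughly as follows. This is only a minimal sketch under assumptions, not a settled specification: region_id and year are hypothetical panel identifiers, rd is a hypothetical name standing in for the R&D variable (an ampersand is not allowed in a Stata variable name), controls is a placeholder for the control variables, and the residual is taken as the observed knowledge stock minus the tobit linear prediction, which is one possible definition and is debatable for the observations censored at zero.

```stata
* Hypothetical names: region_id, year, rd (stands in for "R&D"),
* controls (placeholder for the actual control variables).
xtset region_id year

* Step 1: knowledge stock on lagged R&D, left-censored at zero (as in the post)
tobit k_stock L1.rd, nocons ll(0)

* Assumed residual definition: observed k_stock minus the fitted linear index
predict double k_xb if e(sample), xb
generate double res = k_stock - k_xb if e(sample)

* Step 2: fixed-effects Poisson with the lagged residual replacing k_stock
xtpoisson new_entrants L1.res L1.controls, fe vce(robust)
```

Because res is itself estimated, the second-step standard errors reported here do not reflect the first-step sampling error; bootstrapping both steps together is one common way to address that.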
Do you know how I can proceed from here? Do you have any suggestions?