Dear Professors,
I have panel data with 120 countries and 10 time periods, so N=120 and T=10.
The equation I need to estimate is:
y_it = β1 x1_it + β2 x2_it + c_i + δ_t + u_it
using the TWFE estimator. My concern lies in the nature (suggested by economic theory) of the regressors with respect to the idiosyncratic error term: x1 (capital, measured at the beginning of the period) is predetermined, while x2 (investment) is endogenous. I could take the first-order lag of both variables, which still makes sense since it is reasonable to assume that a change in either investment or capital takes time before affecting GDP (y).
Apologies for the naive question, but since the TWFE estimator time-demeans the data, the time-demeaned regressors become correlated with the time-demeaned idiosyncratic error term. This correlation could bias the point estimates when the regressors are endogenous or predetermined, even if they are lagged.
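To make the concern concrete, here is how I understand the transformation, written for a balanced panel (the double-dot notation and the generic variable z are my own shorthand, not taken from any particular textbook):

```latex
\ddot{y}_{it} = \beta_1 \ddot{x}_{1,it} + \beta_2 \ddot{x}_{2,it} + \ddot{u}_{it},
\qquad
\ddot{z}_{it} = z_{it} - \bar{z}_{i\cdot} - \bar{z}_{\cdot t} + \bar{z}_{\cdot\cdot},
\qquad
\bar{u}_{i\cdot} = \frac{1}{T}\sum_{s=1}^{T} u_{is}
```

Because the transformed error contains the country mean of u, which includes the shocks u_is for s < t, a predetermined x1_it that responds to past shocks would be correlated with the transformed error even if it is uncorrelated with u_it itself. If I understand correctly, this correlation shrinks as T grows, but with T = 10 I am not sure it is negligible.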
However, the TWFE estimator is equivalent to POLS when we add country and time dummies, which avoids time-demeaning the data. Is it incorrect to run the regression using this approach?
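To check my own understanding of this equivalence, I tried a small numerical sketch in Python (not Stata, only because it is easy to share). The data are simulated and purely hypothetical: the coefficients 0.5 and 0.3, the seed, and the helper `two_way_demean` are my own choices, and the panel is balanced with N=120 and T=10 as in my data.

```python
import numpy as np

rng = np.random.default_rng(0)
N, T = 120, 10  # panel dimensions as in the post

# Hypothetical simulated balanced panel; 0.5 and 0.3 are arbitrary values.
c = rng.normal(size=(N, 1))        # country effects
d = rng.normal(size=(1, T))        # time effects
x1 = rng.normal(size=(N, T)) + c   # regressor correlated with country effects
x2 = rng.normal(size=(N, T)) + d   # regressor correlated with time effects
u = rng.normal(size=(N, T))
y = 0.5 * x1 + 0.3 * x2 + c + d + u


def two_way_demean(a):
    """Two-way within transformation for a balanced (N, T) array."""
    return (a - a.mean(axis=1, keepdims=True)
              - a.mean(axis=0, keepdims=True) + a.mean())


# (1) Within estimator: OLS on the two-way demeaned data.
Xw = np.column_stack([two_way_demean(x1).ravel(), two_way_demean(x2).ravel()])
beta_within = np.linalg.lstsq(Xw, two_way_demean(y).ravel(), rcond=None)[0]

# (2) Pooled OLS with N country dummies and T-1 time dummies (no constant).
country = np.repeat(np.arange(N), T)   # matches row-major ravel of (N, T)
period = np.tile(np.arange(T), N)
D_country = (country[:, None] == np.arange(N)).astype(float)
D_time = (period[:, None] == np.arange(1, T)).astype(float)
X_lsdv = np.column_stack([x1.ravel(), x2.ravel(), D_country, D_time])
beta_lsdv = np.linalg.lstsq(X_lsdv, y.ravel(), rcond=None)[0][:2]

print("within:", beta_within)
print("LSDV:  ", beta_lsdv)
print("identical up to rounding:", np.allclose(beta_within, beta_lsdv))
```

In this sketch the slope coefficients from the two approaches coincide up to rounding, which makes me doubt that including the dummies explicitly really avoids whatever problem the time-demeaning creates.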
The code I used is:

    xtreg y L.x1 L.x2 i.year, fe vce(cluster country)
I’m also concerned about regressing capital and investment together. The former is a stock variable, and the latter is a flow variable. Investment at time t will affect capital formation in the next period, t+1. Strong correlation between the two predictors could badly affect the precision of my estimates.
Should I consider an alternative estimator such as Arellano and Bond (although, as far as I know, it is used for dynamic panels)?
Thanks in advance for your help!