TWFE Estimator with endogenous regressor

Frank Giaquinto

Join Date: Dec 2023

Posts: 18
#1

07 Nov 2024, 14:35

Dear Professors,

I have panel data with 120 countries and 10 time periods, so N=120 and T=10.

The equation I need to estimate is:

y_it = β₁ x_1,it + β₂ x_2,it + c_i + δ_t + u_it

using the TWFE estimator. My concern is how the regressors relate to the idiosyncratic error term. Economic theory suggests that x₁ (capital, measured at the beginning of the period) is predetermined, while x₂ (investment) is endogenous. I could take the first lag of both variables, which still makes sense because a change in either investment or capital plausibly takes time to affect GDP (y).

Apologies for the naive question. Because the TWFE estimator time-demeans the data, the time-demeaned regressors become correlated with the time-demeaned idiosyncratic error term. This correlation could bias the point estimates when the variables are endogenous or predetermined, even if they are lagged.

However, the TWFE estimator is equivalent to POLS when we add country and time dummies, which avoids time-demeaning the data. Is it incorrect to run the regression using this approach?

The code I used is:

xtreg y L.x1 L.x2 i.year, fe vce(cluster country)
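The equivalence between the within (two-way demeaned) estimator and pooled OLS with country and time dummies can be checked numerically. The following is only a minimal sketch in plain Python on a small simulated balanced panel. The data-generating process, the panel size (6 units, 4 periods), and all names (solve, two_way_demean, b_within, b_lsdv) are assumptions made for illustration and have nothing to do with the actual GDP data.

```python
import random

def solve(A, b):
    """Solve the linear system A x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    M = [row[:] + [rhs] for row, rhs in zip(A, b)]  # augmented matrix
    for col in range(n):
        piv = max(range(col, n), key=lambda r: abs(M[r][col]))
        M[col], M[piv] = M[piv], M[col]
        for r in range(col + 1, n):
            f = M[r][col] / M[col][col]
            for k in range(col, n + 1):
                M[r][k] -= f * M[col][k]
    sol = [0.0] * n
    for r in range(n - 1, -1, -1):
        s = sum(M[r][k] * sol[k] for k in range(r + 1, n))
        sol[r] = (M[r][n] - s) / M[r][r]
    return sol

def two_way_demean(z, N, T):
    """Return z_it - mean_i - mean_t + grand mean for a balanced N x T panel."""
    unit_mean = [sum(z[i]) / T for i in range(N)]
    time_mean = [sum(z[i][t] for i in range(N)) / N for t in range(T)]
    grand = sum(unit_mean) / N
    return [[z[i][t] - unit_mean[i] - time_mean[t] + grand for t in range(T)]
            for i in range(N)]

# Hypothetical balanced panel: one regressor, unit effects and time effects.
random.seed(0)
N, T, beta = 6, 4, 0.5
unit_fe = [random.gauss(0, 1) for _ in range(N)]
time_fe = [random.gauss(0, 1) for _ in range(T)]
x = [[unit_fe[i] + time_fe[t] + random.gauss(0, 1) for t in range(T)] for i in range(N)]
y = [[beta * x[i][t] + unit_fe[i] + time_fe[t] + random.gauss(0, 1) for t in range(T)]
     for i in range(N)]

# (a) Within estimator: OLS slope on two-way demeaned data.
xd, yd = two_way_demean(x, N, T), two_way_demean(y, N, T)
num = sum(xd[i][t] * yd[i][t] for i in range(N) for t in range(T))
den = sum(xd[i][t] ** 2 for i in range(N) for t in range(T))
b_within = num / den

# (b) Dummy-variable regression: x, N unit dummies, T-1 time dummies (first period omitted).
rows, ys = [], []
for i in range(N):
    for t in range(T):
        unit_d = [1.0 if j == i else 0.0 for j in range(N)]
        time_d = [1.0 if s == t else 0.0 for s in range(1, T)]
        rows.append([x[i][t]] + unit_d + time_d)
        ys.append(y[i][t])
K = len(rows[0])
XtX = [[sum(r[a] * r[c] for r in rows) for c in range(K)] for a in range(K)]
Xty = [sum(r[a] * yv for r, yv in zip(rows, ys)) for a in range(K)]
b_lsdv = solve(XtX, Xty)[0]  # coefficient on x

print(b_within, b_lsdv)  # the two slopes agree up to floating-point error
```

Because the two slopes coincide, the dummy-variable regression inherits whatever bias the within estimator has. Writing the model with dummies does not sidestep the demeaning.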

I’m also concerned about regressing capital and investment together. The former is a stock variable, and the latter is a flow variable. Investments at time t will affect capital formation in the next period, at t+1. Strong correlation between the two predictors could badly affect the precision of my estimator.

Should I consider an alternative estimator such as Arellano and Bond (although, as far as I know, that is used for dynamic panels)?

Thanks in advance for your help!

Last edited by Frank Giaquinto; 07 Nov 2024, 14:43.
Jeff Wooldridge

Join Date: Apr 2014

Posts: 2048
#2

07 Nov 2024, 20:58

Frank: That the TWFE and dummy variable regressions are the same is an algebraic result. And you are correct that, with T treated as fixed, the estimator is inconsistent when the explanatory variables are not strictly exogenous. As you surmised, lagging the explanatory variables can render them predetermined but not strictly exogenous. Now, under fairly weak assumptions, the inconsistency in the TWFE estimator is on the order of 1/T, and you have about T = 10. So the bias might be tolerable. But you can never know for sure. You might try differencing followed by IV -- that is, Arellano and Bond -- but the instruments could be weak.
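The 1/T point can be illustrated with a small Monte Carlo. This is a rough sketch under an assumed design, not a statement about the GDP application. The regressor is predetermined because it responds to last period's shock: x_it = c_i + gamma*u_i,t-1 + e_it, with y_it = beta*x_it + c_i + d_t + u_it. The function names, gamma = 1, beta = 1, and N = 2000 are all hypothetical choices.

```python
import random

def two_way_demean(z, N, T):
    """Return z_it - mean_i - mean_t + grand mean for a balanced N x T panel."""
    unit_mean = [sum(z[i]) / T for i in range(N)]
    time_mean = [sum(z[i][t] for i in range(N)) / N for t in range(T)]
    grand = sum(unit_mean) / N
    return [[z[i][t] - unit_mean[i] - time_mean[t] + grand for t in range(T)]
            for i in range(N)]

def within_bias(N, T, beta=1.0, gamma=1.0, seed=0):
    """Simulate one panel and return (TWFE slope estimate - true beta).

    Assumed design: x_it depends on the lagged shock u_{i,t-1}, so x is
    predetermined (uncorrelated with current and future shocks) but not
    strictly exogenous.
    """
    rng = random.Random(seed)
    time_fe = [rng.gauss(0, 1) for _ in range(T)]
    x, y = [], []
    for i in range(N):
        c_i = rng.gauss(0, 1)
        u_prev = rng.gauss(0, 1)  # pre-sample shock u_{i,0}
        xi, yi = [], []
        for t in range(T):
            x_it = c_i + gamma * u_prev + rng.gauss(0, 1)
            u_it = rng.gauss(0, 1)
            xi.append(x_it)
            yi.append(beta * x_it + c_i + time_fe[t] + u_it)
            u_prev = u_it  # feedback from today's shock to tomorrow's regressor
        x.append(xi)
        y.append(yi)
    xd, yd = two_way_demean(x, N, T), two_way_demean(y, N, T)
    num = sum(xd[i][t] * yd[i][t] for i in range(N) for t in range(T))
    den = sum(xd[i][t] ** 2 for i in range(N) for t in range(T))
    return num / den - beta

# Large N so that sampling noise is small relative to the fixed-T bias.
bias_T5 = within_bias(N=2000, T=5)
bias_T20 = within_bias(N=2000, T=20)
print("bias with T=5: ", bias_T5)
print("bias with T=20:", bias_T20)
```

In this design the bias is negative and shrinks as T grows, roughly in proportion to 1/T. How large it is in a real application depends on the strength of the feedback from shocks to future regressors, which is unknown, so the simulation only shows the mechanism.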
Frank Giaquinto

Join Date: Dec 2023

Posts: 18
#3

08 Nov 2024, 03:06

Dear Professor Wooldridge,

Thank you very much for answering my question. I have a further question, if I may ask:

I tried the one-step and two-step difference GMM estimators, but both the Hansen test and the Difference-in-Hansen test yield p-values below 0.1, which suggests that the internal instruments are invalid. As you pointed out, the instruments may also be weak, perhaps because the series are persistent. In static panels, would it be reasonable to explore additional moment conditions and use a system GMM (SYS-GMM) estimator?

I did try SYS-GMM, and the Hansen test for overidentifying restrictions produced a p-value above 0.2. The Difference-in-Hansen test also yielded favorable p-values (between 0.25 and 0.8), and autocorrelation does not appear to be an issue. However, I haven’t found any applications of SYS-GMM to static panels in the literature. The point estimates are slightly higher than those obtained with TWFE. If I understand correctly, this suggests that the Nickell bias in TWFE is small, consistent with your point that it is on the order of 1/T.

Thank you for your insights.