2sls, Poisson first stage, linear second stage

junlei luo

Join Date: Feb 2022

Posts: 6
#1

08 May 2023, 14:59

Dear Statalisters,

After reading through many posts, I would like to run a 2SLS regression in which the first stage is Poisson and the second is a regular linear regression. This is because my endogenous variable is bounded between 0 and 1 (a percentage) and looks like a Poisson distribution (mass close to zero; no observations reach 1, but a few extremes get close). I have unbalanced firm panel data with T = 10 and a few hundred thousand firms. The variable of interest, y2, is at the industry level, as is the instrumental variable, z, while the outcome in the second stage is at the firm level. Would the following be a correct way to implement the strategy and also obtain correct standard errors? If so, would the estimated coefficient be interpreted as usual for a linear regression?

Code:

ppmlhdfe y2 x1 x2 z, absorb(firm year) cluster(industry)
predict y2_hat, xb
reghdfe ln_y y2_hat x1 x2, absorb(firm year) cluster(industry)

I appreciate any help!
Tags: 2SLS, poisson
Joao Santos Silva

Join Date: Apr 2014

Posts: 3000
#2

08 May 2023, 22:47

Dear junlei luo,

No, you need to use y2_hat as an instrument in a 2SLS regression. Also, consider using a fractional logit in the first step.

Best wishes,

Joao
junlei luo

Join Date: Feb 2022

Posts: 6
#3

20 May 2023, 16:36

Dear Joao Santos Silva ,

Thank you very much for your help. So, to clarify, the best way to do that in Stata would be

Code:

fracreg logit y2 x1 x2 z i.firm i.year
predict y2_hat, xb
ivreghdfe ln_y (y2 = y2_hat) x1 x2, absorb(firm year) cluster(industry)

and the standard errors will be correctly computed?

Last edited by junlei luo; 20 May 2023, 16:49.
junlei luo

Join Date: Feb 2022

Posts: 6
#4

21 May 2023, 01:55

Dear Joao Santos Silva

Perhaps I should also have said that the endogenous variable y2 is the firm's market share, so it is a percentage of the market and does not come from a binomial. Is that a problem for fracreg logit?

I appreciate your help!
Joao Santos Silva

Join Date: Apr 2014

Posts: 3000
#5

21 May 2023, 04:29

Dear junlei luo

This all looks fine.

Best wishes,

Joao
Jeff Wooldridge

Join Date: Apr 2014

Posts: 2121
#6

21 May 2023, 09:06

I'll have to cast doubt on this procedure -- unless someone can point to a published source. I believe this issue has been covered before on this site. There's a general problem with these nonlinear fixed effects procedures unless T is pretty large -- and even then, there's some work to do to justify it. The first problem is that almost all nonlinear estimation methods suffer from the incidental parameters problem -- including fractional logit. Putting in firm fixed effects is what causes the problem. So you are using fitted values as instruments from a procedure that is not estimating anything in particular. One needs the first-stage estimators to behave well asymptotically to justify using the fitted values as IVs.

Even in the Poisson case there's an issue. In the Poisson case, there's no IP problem for estimating beta2, the coefficients in the y2 equation. But using the fitted values as IVs when including firm fixed effects in the first stage means you are including the unit-specific averages, y2bar(i), as part of your instrument! That's because the estimates of the firm FEs necessarily depend on the y2(i,t), t = 1, ..., T (and in the unbalanced case, too). I hadn't thought of this aspect before, but it's bad news. The y2bar(i) is the average over the T(i) time periods of the endogenous explanatory variable, and it will necessarily be correlated with the errors u1(i,t) in the main equation. The IVs should not depend on y2!
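
The dependence on y2bar(i) can be made explicit. The following is a sketch of the algebra in notation that does not appear in the post and is assumed here for illustration: w(i,t) collects x1, x2, z, and the year dummies, delta is its coefficient vector, and a(i) = exp(c2(i)) is the multiplicative firm effect in the exponential mean.

```latex
\begin{align*}
% Poisson first-order condition for firm i's effect a_i = exp(c_{2i})
\sum_{t=1}^{T_i} \bigl( y_{2it} - \hat a_i \exp(w_{it}\hat\delta) \bigr) = 0
\quad &\Longrightarrow \quad
\hat a_i = \frac{\sum_{t=1}^{T_i} y_{2it}}{\sum_{t=1}^{T_i} \exp(w_{it}\hat\delta)}
         = \frac{T_i \, \bar y_{2i}}{\sum_{t=1}^{T_i} \exp(w_{it}\hat\delta)} , \\
% fitted value that would be used as the instrument
\hat y_{2it} = \hat a_i \exp(w_{it}\hat\delta)
  &= T_i \, \bar y_{2i} \,
     \frac{\exp(w_{it}\hat\delta)}{\sum_{r=1}^{T_i} \exp(w_{ir}\hat\delta)} .
\end{align*}
```

Each fitted value is therefore proportional to y2bar(i), which is how the instrument inherits the endogeneity of y2.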

By the way, there is nothing wrong with the usual TWFE 2SLS estimator with a linear first stage, where you simply instrument y2 with z.

Code:

ivreghdfe ln_y (y2 = z) x1 x2, absorb(firm year) cluster(industry)

This estimator is always consistent if the IV is valid and the rank condition holds. The only reason to go beyond that is efficiency. But you shouldn't use what I believe is an inconsistent procedure in trying to be more efficient.

There is a consistent solution that uses fracreg in the first stage. Namely, use the correlated random effects approach as in Papke and Wooldridge (2008, Journal of Econometrics). In Wooldridge (2019, Journal of Econometrics), I showed how to extend it to the unbalanced case. You want to include the time averages of x1, x2, z for the complete cases in the fractional logit -- not firm dummies. Then use the fitted values as the IVs. The frac logit model does not have to be correctly specified for this to produce consistent estimators. The easiest way to implement the procedure is to drop all observations without complete cases -- and they would be dropped, anyway. Let tobs(i) be the number of complete cases for unit i. Let s be the complete cases indicator, which I assume has been defined.

Code:

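* keep complete cases only
keep if s == 1
* firm-level time averages (Mundlak/CRE terms)
egen x1bar = mean(x1), by(firm)
egen x2bar = mean(x2), by(firm)
egen zbar = mean(z), by(firm)
* first stage: fractional logit with time averages instead of firm dummies
fracreg logit y2 x1 x2 z x1bar x2bar zbar i.tobs i.year, vce(cluster firm)
* fitted conditional mean, used as the instrument below
predict y2_hat
ivreghdfe ln_y (y2 = y2_hat) x1 x2, absorb(firm year) cluster(industry)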

Note that you don't even have to include (x1bar x2bar zbar i.tobs) in the flogit because you're not assuming a correct model. But it seems reasonable to do so because you'd like the first stage to mimic what happens in a linear model. Inclusion of i.tobs is discussed in W (2019) as a way to allow correlation between heterogeneity and selection. Also, clustering in the fracreg isn't needed but you should look at the t statistic on z to assess strength of the IV.
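
The code above takes s and tobs as given. A minimal sketch of one way they might be constructed, assuming the estimation variables are exactly ln_y, y2, x1, x2, and z (that variable list is an assumption; adjust it to the actual model), to be run before the `keep if s == 1` line:

```stata
* s = 1 if every variable used in estimation is nonmissing for this firm-year
* (the variable list here is assumed for illustration)
generate byte s = !missing(ln_y, y2, x1, x2, z)

* tobs = number of complete cases for each firm
egen tobs = total(s), by(firm)
```

Because `missing()` returns 1 if any of its arguments is missing, s flags only rows that are usable in both stages, and tobs counts those rows within each firm.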

Speaking of a linear model, how come the same problem with using fitted values doesn't occur? There are several ways to think of it, but the one most relevant for this discussion is how the fitted values are calculated from a first stage FE estimation. They are

y2^(i,t) = c2^(i) + x(i,t)*d2^ + z(i,t)*g2^ + theta(t)^

where c2^(i) is a linear function of y2bar(i), xbar(i), and zbar(i). The key is that c2^(i) gets eliminated in the second step by the within (fixed effects) transformation. You get the same estimator whether c2^(i) is included or not. As usual, the linear model is special. I don't think putting in unit fixed effects in the first stage works for any nonlinear model. Typically, if it won't work for Poisson FE with an exponential mean, it won't work for other nonlinear models (that also suffer from the IP problem in addition to generating endogenous instruments).
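
A sketch of why the linear case is special, again in assumed notation: bars denote averages over firm i's observed periods, and the barred theta term is the average of the estimated year effects over those same periods.

```latex
\begin{align*}
% OLS with firm dummies: residuals sum to zero within each firm, so
\hat c_{2i} &= \bar y_{2i} - \bar x_i \hat d_2 - \bar z_i \hat g_2 - \bar{\hat\theta}_i , \\
% within (fixed effects) transformation of the first-stage fitted values
\hat y_{2it} - \bar{\hat y}_{2i}
  &= (x_{it} - \bar x_i)\hat d_2 + (z_{it} - \bar z_i)\hat g_2
     + (\hat\theta_t - \bar{\hat\theta}_i) .
\end{align*}
```

The additive term c2^(i), and with it y2bar(i), cancels from the demeaned fitted values. With an exponential mean the firm effect enters multiplicatively, so the same demeaning does not remove it.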
Tao HanT

Join Date: Mar 2025

Posts: 1
#7

24 Mar 2025, 07:42

Dear Jeff Wooldridge,

Thank you very much for your insightful perspective and detailed explanation! I have encountered a similar question, where my Y_{it} (number of orders) and X_{it} (number of posts) are both count data (with many zeros) in an online retailing context. To examine the effect of X_{it} on Y_{it}, I use a policy Z_{it} (a time-varying dummy) as an instrumental variable. The data are a product-day panel, and Z_{it} changes from 0 to 1 at a time point T0.

My initial thought was to run a two-stage IV regression model with two-way fixed effects, simply changing OLS to Poisson regression, because I learned that running OLS on log-transformed X or Y is not recommended.

Code:

* stage 1: Poisson
xtpoisson X_{it} Z_{it} i.days, fe
predict X_{it}^{fit}
* stage 2: Poisson
xtpoisson Y_{it} X_{it}^{fit} i.days, fe

After reading your post, I realize that things could be more complicated. However, I have a question about your explanation of why linear FE does not suffer from generating endogenous instruments. Please let me quote what you wrote:

c2^(i) gets eliminated in the second step by the within (fixed effects) transformation.

Thus the fixed effect c2^(i) is free of averaging over Y_{it}. But according to my understanding, the within transformation in the linear FE model needs to subtract the average \bar{Y}_{i} from the original Y_{it}.
Although the fixed effects term disappears and does not need to be solved for explicitly, it is still determined by the average of Y_{it}. So the predicted X_{it}^{fit} will include the information in the average of Y_{it} as well.

Finally, as to the intuition for individual fixed effects: each one captures the "average" level of a unit across the whole time period, so it should be determined by the endogenous Y_{it} across the whole time period. Therefore, I guess the OLS FE model might not be a special case compared with Poisson here?

I am not sure whether I have understood your point or where I am making a mistake. I hope to have your reply. Thank you again for your fantastic analysis!

Best regards,
Tao