ivpoisson with panel-data fixed effects

Alvaro Zarzoso

Join Date: Mar 2021

Posts: 1
#31

22 Mar 2021, 14:03

Originally posted by Jeff Wooldridge

Sorry for the delay. I'm attaching a link to the paper that proposed the following method. The paper is published in a book but it is behind a pay wall.

The method is very simple, but you have to compute the proper standard errors. We did it via the panel bootstrap. Note that every variable is allowed to be correlated with the so-called fixed effect. This allows y2 to be correlated with idiosyncratic shocks, too. The z are assumed exogenous with respect to idiosyncratic shocks.

Code:

xtreg y2 z1 ... zJ zJp1 ... zM i.year, fe
predict double v2h_fe, e
xtpoisson y1 y2 v2h_fe z1 ... zJ i.year, fe vce(robust)

The t statistic on v2h_fe is a valid test of the null that lfare is exogenous. The first FE estimation is the first stage or reduced form for the endogenous variable y2. The second is Poisson FE with a control function, v2h_fe.

Incidentally, the test is always valid when you make it robust. But for the correction to be justified, the y2 variable should be roughly continuous, probably. If you decide to include the control function, you need to adjust the standard errors.

Once people like you start to use this, it will really catch on. ;-)

link_to_paper

Dear Jeff Wooldridge

Thanks for sharing those sound tips. Only one short question:
How should I interpret the sign (+/-) and the (non)significance of the residuals (v2h_fe, in your example) in the second regression?

Thanks in advance,

Kind regards,

Álvaro Zarzoso.
Comment
Jeff Wooldridge

Join Date: Apr 2014

Posts: 2115
#32

28 Mar 2021, 13:23

Álvaro: What it means is that allowing for spending to be correlated with heterogeneity appears to be enough in this application. In other words, there seems to be no remaining endogeneity (with respect to the idiosyncratic errors) once the Chamberlain-Mundlak device is used for the lavgexp variable.

JW
Comment
Dennis Wajda

Join Date: Apr 2019

Posts: 7
#33

16 Apr 2021, 05:57

Jeff: Can you employ this panel Poisson approach if y2 is binary (but time varying) and is interacted with another exogenous variable (X2)? If not, do you have suggestions for correcting the endogeneity of the y2 x X2 interaction?

I found one of your older posts along the same lines: https://www.stata.com/statalist/arch.../msg00188.html

Thank you!
Comment
Angie Wu

Join Date: Aug 2021

Posts: 1
#34

23 Aug 2021, 09:33

Originally posted by Jeff Wooldridge

Álvaro: What it means is that allowing for spending to be correlated with heterogeneity appears to be enough in this application. In other words, there seems to be no remaining endogeneity (with respect to the idiosyncratic errors) once the Chamberlain-Mundlak device is used for the lavgexp variable.

JW

Dear Prof. Jeff,

I have a question similar to this thread's, and my situation might be harder. I have panel (firm-year) data. The dependent variable is a count and the endogenous explanatory variable is also a count. The IV that I am considering is time-invariant. Is it appropriate to apply a control function with firm and year fixed effects? If not, do you have any recommendations? Thanks a lot!
Comment
Miguel Henry

Join Date: Oct 2015

Posts: 9
#35

06 Dec 2021, 14:11

Dear Jeff Wooldridge

What would be the case in your #12 above, applying the Lin/Wooldridge approach with xtreg, fe, when the data are indexed by, say, states and municipalities for each time period?

Thank you!
Best,

Miguel
Comment

Sumedha Gupta

Join Date: May 2016
Posts: 278

#36

16 Mar 2022, 11:18

Dear All,
I am trying to follow Prof. Jeff Wooldridge's advice to estimate a two-step fixed effects Poisson IV. I am having some difficulty in programming the bootstrap procedure to get the correct standard errors. I would be very grateful for help in correcting the following code:

Code:

.         capture program drop myboot

.         program define myboot, rclass
  1.                 preserve
  2.                         bsample 100, cluster(id)
  3.                         * first stage
.                         xtset id year
  4.                         xtreg LgrcxtotC i.year if (numbersibs>1), fe /*cluster(id)*/
  5.                         predict double LgrcxtotChat_fe, e
  6.                         * second stage                  
.                         xtpoisson LgAneedc      Lgm      LgrcxtotChat_fe i.year if (numbersibs>1), fe vce(robust
> ) 
  7.                         
.                         return scalar bLgAneedc = _b[LgAneedc]
  8.                         return scalar bLgm = _b[Lgm]
  9.                         return scalar bLgrcxtotChat_fe = _b[LgrcxtotChat_fe]
 10.                         
.                         return scalar seLgAneedc = _se[LgAneedc]
 11.                         return scalar seLgm = _se[Lgm]
 12.                         return scalar seLgrcxtotChat_fe = _se[LgrcxtotChat_fe]
 13.                         
.                 restore
 14.         end

. 
.         bootstrap r(bLgAneedc) r(bLgm) r(bLgrcxtotChat_fe) r(seLgAneedc) r(seLgm) r(seLgrcxtotChat_fe), reps(500
> ) seed(123) cluster(id) idcluster(newid): myboot
(running myboot on estimation sample)
repeated time values within panel
an error occurred when bootstrap executed myboot
r(451);

I want a panel bootstrap, so in line 2 above I specify bsample 100, cluster(id). Yet I get an error stating "repeated time values within panel". I am presumably making some basic error, and I would be very grateful for help with a solution.

Many thanks in advance for any help you may be able to offer.
Sincerely,
Sumedha.
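
A possible way around the r(451) error, sketched with the placeholder names from the code quoted earlier in the thread (y1, y2, z1, z2, z3, id and year are hypothetical stand-ins, with z3 playing the role of the excluded instrument; cfboot is a made-up program name). The error typically arises because a panel that is drawn twice keeps the same id, so xtset sees repeated id-year pairs. One common approach is to let the bootstrap prefix do the resampling with cluster(id) idcluster(newid), drop the inner preserve/bsample/restore, and xtset on newid inside the program. This is an untested sketch under those assumptions, not a definitive implementation:

```stata
capture program drop cfboot
program define cfboot, rclass
    * newid comes from bootstrap's idcluster() option: a panel drawn twice
    * gets two distinct newid values, so xtset sees no repeated time values
    xtset newid year
    * first stage: FE reduced form for the endogenous regressor y2
    xtreg y2 z1 z2 z3 i.year, fe
    * drop any residual left over from an earlier call before predicting
    capture drop v2h_fe
    predict double v2h_fe, e
    * second stage: FE Poisson with the control function v2h_fe
    xtpoisson y1 y2 v2h_fe z1 z2 i.year, fe
    return scalar b_y2 = _b[y2]
    return scalar b_v2 = _b[v2h_fe]
end

* panel bootstrap: whole panels are resampled and both stages are re-estimated
bootstrap b_y2=r(b_y2) b_v2=r(b_v2), reps(500) seed(123) cluster(id) idcluster(newid): cfboot
```

Only the coefficients are returned, because the bootstrap standard errors come from the spread of the replicated coefficients; the second-stage _se values are not needed.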

Comment

Reeju Guha

Join Date: May 2021

Posts: 12
#37

08 May 2022, 17:29

Originally posted by Jeff Wooldridge

Sorry for the delay. I'm attaching a link to the paper that proposed the following method. The paper is published in a book but it is behind a pay wall.

The method is very simple, but you have to compute the proper standard errors. We did it via the panel bootstrap. Note that every variable is allowed to be correlated with the so-called fixed effect. This allows y2 to be correlated with idiosyncratic shocks, too. The z are assumed exogenous with respect to idiosyncratic shocks.

Code:

xtreg y2 z1 ... zJ zJp1 ... zM i.year, fe
predict double v2h_fe, e
xtpoisson y1 y2 v2h_fe z1 ... zJ i.year, fe vce(robust)

The t statistic on v2h_fe is a valid test of the null that lfare is exogenous. The first FE estimation is the first stage or reduced form for the endogenous variable y2. The second is Poisson FE with a control function, v2h_fe.

Incidentally, the test is always valid when you make it robust. But for the correction to be justified, the y2 variable should be roughly continuous, probably. If you decide to include the control function, you need to adjust the standard errors.

Once people like you start to use this, it will really catch on. ;-)

link_to_paper

Dear Jeff Wooldridge,

Would you have any suggestions as to how to tweak this approach when there is an interaction term present?
My model is as follows:

xtreg y x1 x2 c.x1#c.x2 controls i.time, fe

where y is a count variable, and x1, x2 are both continuous.
My instruments are z1 for my endogenous regressor x1, and c.z1#c.x2 for the interaction term "c.x1#c.x2".

Would the following approach work?
gen z_int = x1*x2
gen z_iv = z1*x2

xtreg x1 z1 x2 controls i.time, fe vce(robust)
predict double v2h1_fe, e
xtreg z_int z_iv x2 controls i.time, fe vce(robust)
predict double v2h2_fe, e
xtpoisson y x1 z_int v2h1_fe v2h2_fe x2 controls i.time, fe vce(robust)

If not, what would be the best way to address this concern?

Thanks!
Comment
Juan Quicana

Join Date: Dec 2019

Posts: 32
#38

25 Jun 2022, 20:35

Hi Jeff Wooldridge,

I have the same question as Reeju. I have an interaction term in my model (the potentially endogenous variable is interacted with an independent variable). How can I deal with this?

My basic model is as follows:

Code:

ppmlhdfe Y X1 X1#X2 controls, a(importers HSsection) vce(cluster distance)

where Y is a continuous variable, and X1 is the potentially endogenous variable.

When I apply the test, I am not sure whether to include this interaction term in the testing procedure for X1.

Do you have any solution for this?
Comment
Mauricio Carvalho

Join Date: May 2018

Posts: 22
#39

10 Jan 2023, 15:39

Originally posted by Jeff Wooldridge

Sorry for the delay. I'm attaching a link to the paper that proposed the following method. The paper is published in a book but it is behind a pay wall.

The method is very simple, but you have to compute the proper standard errors. We did it via the panel bootstrap. Note that every variable is allowed to be correlated with the so-called fixed effect. This allows y2 to be correlated with idiosyncratic shocks, too. The z are assumed exogenous with respect to idiosyncratic shocks.

Code:

xtreg y2 z1 ... zJ zJp1 ... zM i.year, fe
predict double v2h_fe, e
xtpoisson y1 y2 v2h_fe z1 ... zJ i.year, fe vce(robust)

The t statistic on v2h_fe is a valid test of the null that lfare is exogenous. The first FE estimation is the first stage or reduced form for the endogenous variable y2. The second is Poisson FE with a control function, v2h_fe.

Incidentally, the test is always valid when you make it robust. But for the correction to be justified, the y2 variable should be roughly continuous, probably. If you decide to include the control function, you need to adjust the standard errors.

Once people like you start to use this, it will really catch on. ;-)

link_to_paper

Does someone have the paper mentioned by Prof. Wooldridge? The link is not working.
Comment
Andrew Bernal

Join Date: Feb 2022

Posts: 31
#40

27 Feb 2023, 08:51

Mauricio Carvalho I believe this is the paper Prof. Wooldridge was referring to. It is a 2019 paper by Lin and Wooldridge included as a chapter in a book, which addresses the issue of non-linear models with endogeneity.

I have a question that may be elementary, but is crucial to my current project and relevant to this thread. Would it be problematic to have a binary (dummy) instrumental variable? The dummy IV I have satisfies the first stage, but is then omitted due to collinearity in the Poisson FE with control
Comment
Mauricio Carvalho

Join Date: May 2018

Posts: 22
#41

28 Feb 2023, 15:53

Thank you very much for posting the link, Andrew!

I also have a question following this topic of PPML-IV with FE. I have been trying to implement the PPML FE IV with the control function and I am running into the "variance matrix is nonsymmetric or highly singular" problem. I don't know if it is also elementary, but it only occurs in the second-stage ppmlhdfe, where the "v2hat" from the first stage is plugged in. That is, on its own, each estimation works just fine without the warning, but when I combine them I get the warning. If I understand older Statalist posts correctly, the cause might be that many categories of the fixed effects (municipalities and/or time#sectors) contain only one observation. However, I could not find any such categories.

My code is something like

Y = dependent variable (count)
X2 = EEV
Z = instrument
X1, X3 and X4 = control variables (all of them are continuous)

* First-stage
reghdfe X2 Z X1 X3 X4, absorb(municipality i.sector#i.year) res
predict double v2hat, r
* Second-stage
ppmlhdfe Y v2hat X2 X1 X3 X4, absorb(municipality i.sector#i.year)

If I run, for instance,

* First-stage
reghdfe X2 Z X1 X3 X4, absorb(municipality) res
predict double v2hat, r
* Second-stage
ppmlhdfe Y v2hat X2 X1 X3 X4, absorb(municipality)

or

* First-stage
reghdfe X2 Z X1 X3 X4, absorb(i.sector#i.year) res
predict double v2hat, r
* Second-stage
ppmlhdfe Y v2hat X2 X1 X3 X4, absorb(i.sector#i.year)

It works fine. Only when I absorb both FE I get the warning.

Thank you all in advance!

Last edited by Mauricio Carvalho; 28 Feb 2023, 16:10.
Comment
Andrew Bernal

Join Date: Feb 2022

Posts: 31
#42

03 Mar 2023, 10:19

Originally posted by Mauricio Carvalho

Thank you very much for posting the link, Andrew!

I also have a question following this topic of PPML-IV with FE. I have been trying to implement the PPML FE IV with the control function and I am having the "variance matrix is nonsymmetric or highly singular" problem. I don't know if it is also elementary, but it only occurs in the second-stage ppmlhdfe where the "v2hat" from the first stage is plugged in.

I wonder, does this same issue occur if you were to run the PPML estimation without the v2hat regressor? If so, then the issue may have something to do with the fixed effects you are absorbing and the other explanatory variables you are using. I.e. perhaps X3 is already being taken into account when you absorb both municipality and the interaction between sector and year. I would imagine this would just lead to X3 being omitted due to collinearity, however.

Those are my two cents, but hopefully someone with more expertise can help out.

P.S. When using the absorb option for the reghdfe and ppmlhdfe packages, I don't believe it is necessary to absorb by creating dummies (using i.year, for example). It should work just as well using the variables themselves.
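
As a small illustration of the P.S., here is the two-step code from #41 with the i. prefixes dropped inside absorb(). It assumes municipality, sector and year are numeric identifiers (an assumption; string identifiers would need to be encoded first), and it is a sketch rather than a tested fix for the warning:

```stata
* First stage: absorb municipality and sector-by-year effects, keep residuals
reghdfe X2 Z X1 X3 X4, absorb(municipality sector#year) resid
predict double v2hat, residuals
* Second stage: same absorbed effects, with the control function v2hat added
ppmlhdfe Y v2hat X2 X1 X3 X4, absorb(municipality sector#year)
```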
Comment
Mauricio Carvalho

Join Date: May 2018

Posts: 22
#43

07 Mar 2023, 07:25

Hi, Andrew

Once more, thank you very much for your answer!

No, it does not occur when I run the PPML estimation without the v2hat regressor. It only happens when v2hat is plugged in. Anyway, I think you have a good point.
Looking for an answer to this issue I think the way to solve this might be a theoretical one.

From the -reghdfe- help file:

Warning: in a FE panel regression, using robust will lead to inconsistent standard errors if, for every fixed effect, the other dimension is fixed. For instance, in a standard panel with individual and time fixed effects, we require both the number of individuals and periods to grow asymptotically. If that is not the case, an alternative may be to use clustered errors, which as discussed below will still have their own asymptotic requirements. For a discussion, see Stock and Watson, "Heteroskedasticity-robust standard errors for fixed-effects panel-data regression," Econometrica 76 (2008): 155-174.

I have only T=3. I am not sure, but I believe the same argument is valid when using -ppmlhdfe-.

P.S: Thank you!
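
If clustered standard errors are the route taken, a minimal sketch could look like the following. Clustering on municipality is an assumption made purely for illustration, and these standard errors still treat v2hat as known, so a bootstrap over both stages would still be needed for the control-function correction:

```stata
* Second stage with cluster-robust standard errors (cluster level is assumed)
ppmlhdfe Y v2hat X2 X1 X3 X4, absorb(municipality sector#year) vce(cluster municipality)
```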
Comment
Tivea Vorn

Join Date: May 2023

Posts: 3
#44

24 May 2023, 23:12

Dear Jeff Wooldridge

I am very happy to have read your discussion and find it very helpful for understanding my case.
I have a similar problem to Dante Donati's, but my dependent variable is a discrete count and my independent variable is endogenous.

Is there any proposed solution to the matter, Jeff Wooldridge?

Thank you so much for your consideration.

Best regards,
Tivea VORN
Comment
dolcampb

Join Date: Aug 2014

Posts: 4
#45

13 Jul 2023, 08:36

Thank you Jeff, this was extremely helpful to read. (Also thanks for your amazing research and textbooks, but let me get to the point...)

I have two questions which an actual econometrician is sure to find pretty ignorant, but here goes.

1. In the control function regression (xtpoisson), do we definitely still need to include the instrument? I've seen it done without. For example, the example in these Cameron notes seems to omit it, although it's included in the equation on page 17. https://cameron.econ.ucdavis.edu/nhh...ount_part2.pdf

2. What if y2 and v2h_fe are pretty highly correlated here? Especially once we control for, say, year and firm fixed effects in a panel setting? In my case, the coefficient on y2 increases 4x (from .007 to .028), and the coefficient on v2h_fe is almost as large and negative (-.024). The correlation between y2 and v2h_fe is .79, and when I run an OLS reg with y2 as the dependent variable, the R2 is .98, and the variance inflation factor (VIF), computed after the poisson regression using "vif, uncentered" in Stata, is 296.

Isn't this problematic? And, what can/should be done?

Huge thanks in advance.
Comment
