IV regression with different clustering at first and second stage

Al Gardiner

#1

27 Jun 2024, 04:38

Hello all

I am running an IV regression with two-way fixed effects on panel data for Australian states. Because I have only a small number of states/clusters (8), I would like to use wild bootstrapped standard errors. The problem is that my data are clustered differently at the first and second stages of the IV regression, and I cannot work out an easy way to tell Stata to handle this. The variables are as follows:

X= expected election outcome

Y=public debt

Z= presence of political scandal in term of office

My instrument, scandals, is clustered at the election-term level (in my data a term typically runs on a 4-year cycle, with some anomalies). It takes the value 1 in every year of a term in which there is a scandal, and 0 otherwise. Expected election results are also at the term level. Public debt is annual. I cannot convert all variables to term format because the election periods differ between states.

TL;DR: how do I cluster differently at the first and second stages of an IV regression, while also including two-way fixed effects (TWFE) and wild bootstrap SEs?

Many thanks for any assistance!
Al Gardiner

#2

27 Jun 2024, 04:41

I have drafted the following code, but I am not sure whether it will give me what I want:

* First stage
xtset indicator_term year, yearly
xtivreg2 prob_of_defeatold scandals_term, fe cluster(indicator_term) first

* Second stage
xtset state_id year, yearly
xtivreg2 borrowing_tps (prob_of_defeatold = scandals_term), fe cluster(state_id) first

* Wild bootstrap
boottest prob_of_defeatold, cluster(state_id) reps(999)
outreg2 using regression_table1.doc, replace ctitle(borrowing_tps) addtext(State FE, YES, Year FE, YES, Time period, 1999-2022) dec(0)
Andrew Musau

#3

27 Jun 2024, 06:06

Originally posted by Al Gardiner:

I have drafted the following code, but not sure if this will give me what I want:

* First stage
xtset indicator_term year, yearly

* Second stage
xtset state_id year, yearly

If your data allow you to xtset with either indicator_term or state_id as the panel variable, then the two variables are equivalent. The first-stage predicted values do not depend on the level of clustering, and those are all you need from the first stage. Note that what you call the second stage in your code actually implements both stages, because you are using xtivreg2 (from SSC).
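To make the point concrete, here is a minimal sketch of a single-call workflow, not a definitive specification. The variable names are taken from the code in #2. The yr_ stub for the year dummies, the seed value, and the weighttype(webb) option are my own assumptions: the draft in #2 has state fixed effects via fe but no year effects, and with only 8 clusters Rademacher weights allow at most 2^8 = 256 distinct bootstrap draws, which is why Webb's six-point weights are often suggested in that setting.

```stata
* Sketch only: variable names come from the thread; yr_ is a hypothetical stub.
* Community-contributed packages required (all on SSC):
*   ssc install ranktest
*   ssc install ivreg2
*   ssc install xtivreg2
*   ssc install boottest

xtset state_id year, yearly

* Year dummies supply the "time" half of the two-way fixed effects.
* Drop one dummy to serve as the base year.
tabulate year, generate(yr_)
drop yr_1

* A single xtivreg2 call estimates both stages.
* State FE come from the fe option, year FE from the dummies;
* the first option prints the first-stage regression.
xtivreg2 borrowing_tps yr_* (prob_of_defeatold = scandals_term), ///
    fe cluster(state_id) first

* Wild bootstrap test of H0: the coefficient on prob_of_defeatold is zero,
* clustering at the state level. The seed makes the result reproducible.
boottest prob_of_defeatold, cluster(state_id) reps(999) ///
    weighttype(webb) seed(12345)
```

Run boottest directly after the estimation command, before outreg2 or anything else that could replace the stored estimation results.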