Difference in differences - two-way fixed effects problem

Daniel Mercer

Join Date: Feb 2024

Posts: 13
#1

11 Feb 2024, 04:20

Hello,

I am currently running a DID model in Stata and I am facing an issue that I can't wrap my head around.

I have panel data with daily bond yields and the corresponding bond IDs. I have already created treat and post dummies as well as an interaction variable (treat x post).
I ran the simple DID model without fixed effects using the following code:

Code:

xtset id date
xtreg yield treat post interaction, robust

This worked without any problems and the results are as I expected. I then tried running the model with two-way fixed effects, using time (days) and the bond IDs as fixed effects. I dropped treat and post and used the following code:

Code:

xtreg yield interaction i.id i.date, fe vce(cluster id)

Now the problem is that the dummies for the bond IDs (the unit fixed effects) are all omitted because of collinearity. I do not understand why this is the case.
I should note that there are several bonds in the dataset belonging to the same firm. Could this be the cause of the problem? Should I rather try firm fixed effects?

Best regards

Daniel
Tags: data, fixed effects, interaction, panel data, regression
Jeff Wooldridge

Join Date: Apr 2014

Posts: 2121
#2

11 Feb 2024, 08:21

Daniel: The TWFE estimator effectively includes dummies for id and date, so your putting them into the second command is redundant. xtreg is working exactly as it should.
Daniel Mercer

Join Date: Feb 2024

Posts: 13
#3

11 Feb 2024, 08:49

Jeff Wooldridge Thank you for your response. But I am not quite sure how the TWFE estimator in regression 1 controls for id and date. For example, if American bonds are the treatment group and European bonds the control group, how can I make sure that individual bonds do not distort my results? As an example, let's assume I have 10 Ford bonds in my US sample and these may carry higher credit risk than their European counterparts. This would not be taken into account in model 1, right? Furthermore, some events during the sample period could alter yields (e.g. regulation). Would you still rely only on model 1?
Jeff Wooldridge

Join Date: Apr 2014

Posts: 2121
#4

11 Feb 2024, 09:15

Your first command uses random effects because you didn't specify the -fe- option. Your second command is fixed effects, and then the id and date are redundant. Now, as I showed in my 2021 working paper, the two will be equivalent if you have a balanced panel at the id level, but not in general. I would use the second command and drop i.id and i.date.
Daniel Mercer

Join Date: Feb 2024

Posts: 13
#5

11 Feb 2024, 09:47

Jeff Wooldridge Thanks again! I didn't know that. So is this the command you would suggest?

Code:

xtreg yield interaction, fe vce(cluster id)

Does this imply that Stata automatically "knows" that id and date should be used as fixed effects because they were previously specified in xtset? Or how does Stata know which fixed effects I am referring to? In my case R^2 decreased significantly. Is this potentially due to reasons I have pointed out in post #3?
Jeff Wooldridge

Join Date: Apr 2014

Posts: 2121
#6

11 Feb 2024, 10:38

Okay, I got something wrong. You need to include i.date because fe only accounts for the unit-specific dummy variables. I was thinking of the user-written command reghdfe, where you specify the id and time effects to absorb.

You should pretty much ignore the R-squared. Of course including i.id will result in a much larger R-squared, but you shouldn't get credit for that. Do one of the following:

Code:

xtset id date
xtreg yield interaction i.date, fe vce(cluster id)
reg yield interaction i.id i.date, vce(cluster id)
reg yield interaction treat post, vce(cluster id)

The reg version with i.id will give you a much higher R-squared, but it doesn't change anything you care about. Also, the clustered standard errors for the i.id coefficients are useless. In the balanced case, the third command will give the same estimates as the first two.
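
For reference, the reghdfe route mentioned above could look roughly like the following. This is only a minimal sketch: it assumes the community-contributed packages reghdfe and ftools have been installed from SSC, and that the variable names (yield, interaction, id, date) are the ones used earlier in this thread.

```stata
* One-time installation of the community-contributed packages
* (uncomment if they are not installed yet):
* ssc install ftools
* ssc install reghdfe

* Absorb both the bond (id) and day (date) fixed effects,
* clustering the standard errors by bond
reghdfe yield interaction, absorb(id date) vce(cluster id)
```

The coefficient on interaction should match the one from xtreg with i.date and the fe option; the reported standard errors may differ slightly because of degrees-of-freedom conventions.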
Daniel Mercer

Join Date: Feb 2024

Posts: 13
#7

11 Feb 2024, 11:24

Jeff Wooldridge OK, understood. Thank you. My code now works. What I do not completely understand is why the interaction coefficient, its standard error, and the test statistic are identical in the fixed effects and random effects models. Why is this the case?
Jeff Wooldridge

Join Date: Apr 2014

Posts: 2121
#8

11 Feb 2024, 16:14

It follows from the equivalence of TWFE and the Mundlak regression. See Wooldridge (2021).
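
One way to see this equivalence numerically is a small simulation. The sketch below is purely illustrative: the data are simulated, the design (50 hypothetical bonds, 20 days, a common treatment date, a true effect of 0.5) is an assumption made for the example, and u_i is a made-up name for the bond-specific effect.

```stata
clear
set seed 12345

* 50 hypothetical bonds, the second half treated
set obs 50
gen id = _n
gen treat = id > 25
gen u_i = rnormal()

* 20 days per bond gives a balanced panel
expand 20
bysort id: gen date = _n
gen post = date > 10
gen interaction = treat * post

* Simulated outcome: bond effect, common time trend, treatment effect of 0.5
gen yield = 1 + u_i + 0.05*date + 0.5*interaction + rnormal()

xtset id date

* (1) Two-way fixed effects
xtreg yield interaction i.date, fe vce(cluster id)

* (2) Pooled OLS with the treat and post dummies
reg yield interaction treat post, vce(cluster id)

* (3) Random effects with the treat and post dummies
xtreg yield interaction treat post, re vce(cluster id)
```

In this balanced design with a common treatment date, the coefficient on interaction should coincide across the three commands. With an unbalanced panel the estimates will generally differ.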
Mukesh Punia

Join Date: May 2020

Posts: 67
#9

12 Feb 2024, 00:53

Thank you, Jeff Wooldridge! That is a very insightful clarification 🤗

Best regards,
Mukesh