Probit or Logit for a RE diff in diff model

Hadi Kahalzadeh

Join Date: Nov 2019

Posts: 23
#1


28 Dec 2021, 17:39

Greetings, Stata specialists,

I would like to run a diff-in-diff model to estimate the impact of a policy on the incidence of catastrophic health expenditure. I am using 10 years of a Household Income and Expenditure Survey, in which the households are not the same from year to year. So I plan to use random effects. The outcome is a dummy variable (incidence of catastrophic health spending). But I am not sure whether I should use a logit model (xtlogit) or a probit model (xtprobit), because both are used in previous studies.

xtlogit outcome i.treat##i.time
or
xtprobit outcome i.treat##i.time

I would greatly appreciate your advice in this regard.

Last edited by Hadi Kahalzadeh; 28 Dec 2021, 17:42.
Fei Wang

Join Date: Oct 2021

Posts: 726
#2

28 Dec 2021, 18:54

As you have pooled cross-sectional data, logit or probit would be sufficient. According to Puhani's definition of treatment effect in non-linear settings, I would suggest separately defining the interaction, like gen treat_time = treat * time, and then run, for example, logit y treat time i.treat_time x. After the estimation, calculate the partial effect of treat_time using margins, dydx(treat_time) on the treated group at treated periods for ATT.
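The steps in this post can be sketched in Stata as follows. This is only an illustration on simulated data, meant to be run as a do-file: the sample size, the seed, the covariate x, and every coefficient in the data-generating line are made up and do not come from the survey discussed in this thread.

```stata
* Hypothetical simulated pooled cross section (all numbers are made up)
clear
set seed 12345
set obs 5000
generate byte treat = runiform() < 0.4       // 1 = treated group
generate byte time  = runiform() < 0.5       // 1 = post-policy period
generate double x   = rnormal()              // one illustrative covariate
generate byte treat_time = treat * time      // separately defined interaction

* Binary outcome drawn from a logit index with invented coefficients
generate byte y = runiform() < invlogit(-1 + 0.3*treat + 0.2*time - 0.5*treat_time + 0.4*x)

* Plain logit: no panel structure is needed for pooled cross sections
logit y treat time i.treat_time x

* Change in Pr(y = 1) when treat_time moves from 0 to 1,
* averaged over treated observations in the post period
margins, dydx(treat_time) subpop(if treat == 1 & time == 1)
```

Because treat_time enters as a factor variable (i.treat_time), margins reports a discrete 0-to-1 change rather than a derivative.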

Last edited by Fei Wang; 28 Dec 2021, 19:14.
1 like
Hadi Kahalzadeh

Join Date: Nov 2019

Posts: 23
#3

28 Dec 2021, 19:09

Dear Fei,

Many thanks for your advice. I think it will also solve my other problem. With xtlogit outcome treat time i.treat##i.time, the results show that the interaction terms are omitted because of collinearity. I hope your advice will fix that.
Fei Wang

Join Date: Oct 2021

Posts: 726
#4

28 Dec 2021, 19:23

I revised #2 by replacing treat_time with i.treat_time in the logit estimation, because only then will margins estimate the partial effect of treat_time as a discrete change from 0 to 1.
Hadi Kahalzadeh

Join Date: Nov 2019

Posts: 23
#5

28 Dec 2021, 19:31

So you advise me to use the following?
xtlogit outcome treat time i.treat_time x

Last edited by Hadi Kahalzadeh; 28 Dec 2021, 19:33.
Fei Wang

Join Date: Oct 2021

Posts: 726
#6

28 Dec 2021, 20:01

Originally posted by Hadi Kahalzadeh:

So you advise me to use the following?
xtlogit outcome treat time i.treat_time x

Yes, but logit would be enough.
Jeff Wooldridge

Join Date: Apr 2014

Posts: 2168
#7

28 Dec 2021, 22:21

An addendum to Fei's helpful advice: To obtain the ATT, you will want to set treat to unity and then you can set the time to unity. With 10 years, I wonder why you're not including a full set of year dummies. You could even compute a different effect for each post-period intervention. But the simplest thing to do is this:

Code:

logit y treat i.year i.treat_time
margins, dydx(treat_time) at(treat = 1 time = 1) noestimcheck

You can't use random effects because that would require seeing the same households over time.
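The idea of a full set of year dummies with a separate effect for each post-policy year could be sketched as below. This is a hypothetical illustration on simulated data: the years 2010 to 2014, the assumed policy start in 2013, the indicator names d2013 and d2014, and all coefficients are invented for the example.

```stata
* Hypothetical pooled cross sections for 2010-2014; the policy is assumed
* to start in 2013 (an illustrative choice, not a fact from this thread)
clear
set seed 2021
set obs 10000
generate int  year  = 2010 + floor(5 * runiform())
generate byte treat = runiform() < 0.4

* One treatment indicator per post-policy year
generate byte d2013 = treat * (year == 2013)
generate byte d2014 = treat * (year == 2014)

* Binary outcome drawn from a logit index with invented coefficients
generate byte y = runiform() < invlogit(-1 + 0.3*treat + 0.05*(year - 2010) - 0.4*d2013 - 0.6*d2014)

* Full set of year dummies plus year-specific treatment indicators
logit y treat i.year i.d2013 i.d2014

* Year-specific effects on Pr(y = 1), each averaged over the treated
* households observed in that year
margins, dydx(d2013) subpop(if treat == 1 & year == 2013)
margins, dydx(d2014) subpop(if treat == 1 & year == 2014)
```

Using subpop() here restricts each average to treated households in the relevant year, so no at() values for the year dummies are needed.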
1 like
Hadi Kahalzadeh

Join Date: Nov 2019

Posts: 23
#8

28 Dec 2021, 23:57

Dear Jeff,

Thank you so much for your time and advice.

I decided to consider only the 5 years from 2010 to 2014. My outcome is the risk of catastrophic health expenditure (CHE), a dummy.
Time is a dummy (pre = 0, post = 1) for the policy, and my treatment is a dummy too: those who had no insurance.

logit CHE treat policy i.treat_policy x

The dataset pools 5 years of a household survey, but the households (id) are not the same across years. Every year, the survey collected information on more than 38K households, and the IDs are unique.

Please tell me if I am wrong: I thought that because the households are not the same, I can't use FE, and that is why I used RE.
Fei Wang

Join Date: Oct 2021

Posts: 726
#9

29 Dec 2021, 01:43

@Jeff Wooldridge Thanks, Jeff, for your detailed advice. I'm wondering which of the commands below calculates the average treatment effect on the treated group in the treated periods.

Code:

margins, dydx(treat_time) at(treat = 1 time = 1)
margins, dydx(treat_time) subpop(if treat == 1 & time == 1)

I thought the second code does the ATT, because the first seems to include observations from the control group. Or am I misunderstanding something? Many thanks.
Hadi Kahalzadeh

Join Date: Nov 2019

Posts: 23
#10

29 Dec 2021, 01:55

The first one gave me -.0068476:

margins, dydx(treat_time) at(treat = 1 time = 1)

The second gave me -.0066343:

margins, dydx(treat_time) subpop(if treat == 1 & time == 1)

How do you interpret these results?

Many thanks for your valuable advice
Jeff Wooldridge

Join Date: Apr 2014

Posts: 2168
#11

29 Dec 2021, 10:14

With a balanced panel and no covariates, the two should be the same. If you have covariates, then you want to actually use both options. The first one, without covariates, does give the proper ATT.

Hadi: You're finding that the estimated effect on the probability seems pretty small: less than .7 percentage points. But without context I don't know if this is practically small.
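The difference between at() and subpop() when covariates are present can be seen in a small simulation. This is a sketch under stated assumptions: the data are simulated, the covariate x is deliberately made to differ between treated and control households, and all coefficients are invented.

```stata
* Hypothetical simulated data; x is shifted upward for the treated group
clear
set seed 4321
set obs 5000
generate byte treat = runiform() < 0.4
generate byte time  = runiform() < 0.5
generate double x   = rnormal() + 0.8*treat
generate byte treat_time = treat * time
generate byte y = runiform() < invlogit(-1 + 0.3*treat + 0.2*time - 0.5*treat_time + 0.6*x)

logit y treat time i.treat_time x

* (a) at() only: treat and time are fixed at 1, but x is averaged over
*     the whole sample, control households included
margins, dydx(treat_time) at(treat = 1 time = 1) noestimcheck

* (b) subpop() only: averages over treated households in the post period
margins, dydx(treat_time) subpop(if treat == 1 & time == 1)

* (c) both options together
margins, dydx(treat_time) at(treat = 1 time = 1) subpop(if treat == 1 & time == 1) noestimcheck
```

In this setup (b) and (c) should coincide, because treat and time already equal 1 for every observation in the subpopulation, while (a) can differ because it averages over the covariate distribution of the full sample.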
2 likes
Hadi Kahalzadeh

Join Date: Nov 2019

Posts: 23
#12

06 Jan 2022, 01:33

@Jeff Wooldridge, if I want to add an IV to my logit model, would ivprobit be enough? I reviewed your amazing books, Econometric Analysis of Cross Section and Panel Data and Introductory Econometrics, 6e.

But neither example (Problems 6.8 and 6.11) has a binary outcome. Since my dataset is pooled cross sections over time and I use a DID model, is there a source I can consult?

Is the code correct?

ivprobit CHE Notertiary policy i.Notertiary##i.policy age ageSqr rural state married_st Seniors Kids female (Expenditure = wealth_rank) if year >= 2009 & year <= 2014, nolog
margins, dydx(*) predict(pr)
Hadi Kahalzadeh

Join Date: Nov 2019

Posts: 23
#13

09 Jan 2022, 23:43

Originally posted by Jeff Wooldridge:

With a balanced panel and no covariates, the two should be the same. If you have covariates, then you want to actually use both options. The first one, without covariates, does give the proper ATT.

Hadi: You're finding that the estimated effect on the probability seems pretty small: less than .7 percentage points. But without context I don't know if this is practically small.

Thank you so much for your advice. Your comments and @Fei Wang's were very helpful. I have two questions.

Based on your advice, I ran ivprobit:

ivprobit CHE treatment i.time i.treat_time X1 X2 (Totalequ = IV1 IV2 IV3 IV4 IV5), twostep

margins, dydx(treat_time)

The margins result for the interaction term is:

--------------------------------------------------------------------------------
               |            Delta-method
               |      dy/dx   std. err.      z    P>|z|     [95% conf. interval]
---------------+----------------------------------------------------------------
  1.treat_time |  -.1748487   .0575814    -3.04   0.002    -.2877062   -.0619912
--------------------------------------------------------------------------------

I expected the policy to increase the risk of CHE (catastrophic health expenditure) for the treatment group (those who have no insurance), but here the margin for the interaction term is negative.

Q1: I used twostep because I had more than one IV. Is that correct, or can I use "ml"?
Q2: How can I interpret the margin for the interaction term? I expected the policy to increase the risk of CHE for those who have no health insurance (the treatment group), but here the margin suggests the treatment group is 17 percent less likely to face CHE. Is this correct?

Thank you in advance again for your advice.
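A minimal sketch comparing the two ivprobit estimators, assuming (as the ivprobit postestimation documentation indicates) that both accept several instruments and that predict(pr) is available only after the maximum likelihood fit. The data are simulated, and the names endog, z1, z2, and v, along with every coefficient, are hypothetical.

```stata
* Hypothetical simulated data with one endogenous regressor and two instruments
clear
set seed 2022
set obs 4000
generate double z1 = rnormal()
generate double z2 = rnormal()
generate double v  = rnormal()               // source of endogeneity
generate double endog = 0.5*z1 + 0.5*z2 + v
generate byte treat_time = runiform() < 0.25
generate byte y = (0.2 - 0.4*treat_time + 0.3*endog + 0.5*v + rnormal()) > 0

* Default maximum likelihood estimator with two instruments
ivprobit y i.treat_time (endog = z1 z2), nolog
* Effect of treat_time on the predicted probability
margins, dydx(treat_time) predict(pr)

* Two-step estimator with the same two instruments
ivprobit y i.treat_time (endog = z1 z2), twostep
* Without predict(pr), margins works with the default prediction
margins, dydx(treat_time)
```

Under that assumption, a margins result obtained after twostep would be on the probit index scale rather than a change in probability, so it is worth checking the Expression line in the margins header before reading the number as percentage points.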