  • Probit or Logit for a RE diff in diff model

    Greetings, Stata specialists,

    I would like to run a diff-in-diff model to estimate the impact of a policy on the incidence of catastrophic health expenditure. I am using 10 years of a Household Income and Expenditure Survey, in which the households are not the same from year to year, so I plan to use random effects. The outcome is a dummy variable (incidence of catastrophic health spending). I am not sure whether I should use a logit model (xtlogit) or a probit model (xtprobit), because both are used in previous studies.

    xtlogit outcome i.treat##i.time
    or
    xtprobit outcome i.treat##i.time

    I would greatly appreciate your advice in this regard.
    Last edited by Hadi Kahalzadeh; 28 Dec 2021, 18:42.

  • #2
    As you have pooled cross-sectional data, logit or probit would be sufficient. Following Puhani's definition of the treatment effect in nonlinear settings, I would suggest defining the interaction separately, e.g. gen treat_time = treat * time, and then running, for example, logit y treat time i.treat_time x. After estimation, calculate the partial effect of treat_time using margins, dydx(treat_time) for the treated group in the treated periods to obtain the ATT.
    Last edited by Fei Wang; 28 Dec 2021, 20:14.

    • #3
      Dear Fei,

      Many thanks for your advice. I think it will also solve my other problem. With xtlogit outcome treat time i.treat##i.time, the results show that the interaction term is omitted because of collinearity. I hope that with your advice I can fix the problem.
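One likely source of the collinearity message is that i.treat##i.time already expands to the two main effects plus their interaction, so listing treat and time again duplicates columns. A minimal sketch on simulated data (all variable names and parameter values are hypothetical, and plain logit stands in for the random-effects command) showing that the factor-variable form and the hand-made interaction dummy fit the same model:

```stata
* Simulated pooled cross-section; all names and values are hypothetical.
clear
set seed 12345
set obs 4000
generate byte treat = runiform() < 0.5
generate byte time  = runiform() < 0.5
generate byte outcome = runiform() < invlogit(-1 + 0.3*treat + 0.2*time + 0.4*treat*time)

* i.treat##i.time expands to i.treat i.time i.treat#i.time, so the main
* effects should not be listed a second time.
logit outcome i.treat##i.time, nolog
scalar ll_factor = e(ll)

* Same model with a hand-made interaction dummy, as suggested in #2.
generate byte treat_time = treat * time
logit outcome treat time i.treat_time, nolog
scalar ll_manual = e(ll)

* The two log likelihoods should agree up to numerical tolerance.
display "log likelihoods: " ll_factor "  " ll_manual
```

Both specifications are saturated in the four treat-by-time cells, which is why their fits coincide.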

      • #4
        I revised #2 by replacing treat_time with i.treat_time in the logit estimation, because only then will margins estimate the partial effect of treat_time as a change from 0 to 1.

        • #5
          So you advise me to use
          xtlogit outcome treat time i.treat_time x
          Last edited by Hadi Kahalzadeh; 28 Dec 2021, 20:33.

          • #6
            Originally posted by Hadi Kahalzadeh
            So you advise me to use
            xtlogit outcome treat time i.treat_time x
            Yes, but logit would be enough.

            • #7
              An addendum to Fei's helpful advice: to obtain the ATT, you will want to set treat to unity and then set time to unity. With 10 years, I wonder why you're not including a full set of year dummies. You could even compute a different effect for each post-intervention period. But the simplest thing to do is this:

              Code:
              logit y treat i.year i.treat_time
              margins, dydx(treat_time) at(treat = 1 time = 1) noestimcheck
              You can't use random effects because that would require seeing the same households over time.
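A minimal sketch of the ATT calculation on simulated data, assuming a two-period setup with no covariates. All variable names and parameter values are hypothetical, and the sketch keeps time itself in the model (rather than year dummies) so that at(time = 1) refers to a model covariate. It also recomputes the margins result by hand from the logit coefficients:

```stata
* Simulated pooled cross-section; all names and values are hypothetical.
clear
set seed 2021
set obs 10000
generate byte treat = runiform() < 0.4
generate byte time  = runiform() < 0.5
generate byte treat_time = treat * time
generate byte y = runiform() < invlogit(-2 + 0.5*treat + 0.3*time + 0.6*treat_time)

logit y treat time i.treat_time, nolog

* Effect of switching treat_time from 0 to 1, holding treat = 1 and time = 1.
margins, dydx(treat_time) at(treat = 1 time = 1)
matrix m = r(b)
scalar att_margins = m[1, colsof(m)]

* The same quantity by hand: treated-post probability with and without
* the interaction coefficient.
scalar att_manual = invlogit(_b[_cons] + _b[treat] + _b[time] + _b[1.treat_time]) ///
                  - invlogit(_b[_cons] + _b[treat] + _b[time])
display "ATT (margins): " att_margins "   ATT (by hand): " att_manual
```

The hand calculation makes explicit that the ATT is the predicted probability for treated households in the post period minus the counterfactual prediction with the interaction switched off.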

              • #8
                Dear Jeff,

                Thank you so much for your time and advice.

                I decided to consider only five years, 2010 to 2014. My outcome is the risk of catastrophic health expenditure (CHE), a dummy.
                Time is a dummy for the policy (pre = 0, post = 1), and my treatment is a dummy too: those who had no insurance.

                Code:
                logit CHE treat policy i.treat_policy x

                The dataset pools five years of a household survey, but the households (id) are not the same across years. Every year, the survey collects information on more than 38K households, and the IDs are unique.

                Please correct me if I am wrong: I thought that because the households are not the same, I can't use FE, which is why I used RE.

                • #9
                  Jeff Wooldridge: Thanks, Jeff, for your detailed advice. I'm wondering which of the two commands below calculates the average treatment effect on the treated group in the treated periods.

                  Code:
                  margins, dydx(treat_time) at(treat = 1 time = 1)
                  margins, dydx(treat_time) subpop(if treat == 1 & time == 1)
                  I thought the second code does the ATT, because the first seems to include observations from the control group. Or am I misunderstanding something? Many thanks.

                  • #10
                    The first one gave me -.0068476:

                    Code:
                    margins, dydx(treat_time) at(treat = 1 time = 1)

                    The second gave me -.0066343:

                    Code:
                    margins, dydx(treat_time) subpop(if treat == 1 & time == 1)

                    How do you interpret these results?

                    Many thanks for your valuable advice

                    • #11
                      With a balanced panel and no covariates, the two should be the same. If you have covariates, then you want to actually use both options. The first one, without covariates, does give the proper ATT.

                      Hadi: You're finding that the estimated effect on the probability seems pretty small: less than .7 percentage points. But without context I don't know if this is practically small.
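A minimal sketch of combining both options when a covariate is present, on simulated data (the covariate x and all parameter values are hypothetical). The at() option fixes treat and time, while subpop() restricts the averaging to the covariate values of treated households in the post period:

```stata
* Simulated pooled cross-section; all names and values are hypothetical.
clear
set seed 99
set obs 10000
generate byte treat = runiform() < 0.4
generate byte time  = runiform() < 0.5
generate byte treat_time = treat * time
generate x = rnormal() + 0.5*treat     // hypothetical covariate
generate byte y = runiform() < invlogit(-2 + 0.5*treat + 0.3*time + 0.6*treat_time + 0.4*x)

logit y treat time i.treat_time x, nolog

* at() sets treat = 1 and time = 1; subpop() averages over the observed x
* of the treated group in the treated period only.
margins, dydx(treat_time) at(treat = 1 time = 1) subpop(if treat == 1 & time == 1)
matrix m = r(b)
scalar att_both = m[1, colsof(m)]
display "ATT with covariates: " att_both
```

Without subpop(), the average would also run over the covariate values of control households and pre-period observations, which is why the two commands in #9 can differ once covariates enter the model.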

                      • #12
                        @Jeff Wooldridge, if I want to use an IV in my logit model, would ivprobit be enough? I reviewed your amazing books, Econometric Analysis of Cross Section and Panel Data and Introductory Econometrics, 6e.

                        But neither example (Problems 6.8 and 6.11) has a binary outcome. Since my dataset is pooled cross sections over time and I use a DID model, is there any source I could use?

                        Is the code correct?

                        Code:
                        ivprobit CHE Notertiary policy i.Notertiary##i.policy age ageSqr rural state married_st Seniors Kids female (Expenditure = wealth_rank) if year >= 2009 & year <= 2014, nolog
                        margins, dydx(*) predict(pr)

                        • #13
                          Originally posted by Jeff Wooldridge
                          With a balanced panel and no covariates, the two should be the same. If you have covariates, then you want to actually use both options. The first one, without covariates, does give the proper ATT.

                          Hadi: You're finding that the estimated effect on the probability seems pretty small: less than .7 percentage points. But without context I don't know if this is practically small.
                          Thank you so much for your advice. Your comments and @Fei Wang's were very helpful. I have two questions.

                          Based on your advice, I ran ivprobit:

                          Code:
                          ivprobit CHE treatment i.time i.treat_time X1 X2 (Totalequ = IV1 IV2 IV3 IV4 IV5), twostep
                          margins, dydx(treat_time)


                          The margins result for the interaction term is:

                          --------------------------------------------------------------------------------
                                         |            Delta-method
                                         |      dy/dx   std. err.      z    P>|z|     [95% conf. interval]
                          ---------------+----------------------------------------------------------------
                          1.treat_time   |  -.1748487   .0575814    -3.04   0.002    -.2877062   -.0619912
                          --------------------------------------------------------------------------------

                          I expected the policy to increase the risk of catastrophic health expenditure (CHE) for the treatment group (those who have no insurance), but here the margin for the interaction term is negative.

                          Q1: I used twostep because I had more than one IV. Is that correct, or can I use "ml"?
                          Q2: How can I interpret the margin for the interaction term? I expected the policy to increase the risk of CHE for those who have no health insurance (the treatment group), but here the margin suggests the treatment group is 17 percent less likely to face CHE. Is this correct?

                          Thank you in advance again for your advice.
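A minimal sketch, on simulated data, of how the two ivprobit estimators can be compared before interpreting the margin (all variable names and parameter values are hypothetical). Both estimators accept more than one instrument. As far as I know, predict(pr) is documented as unavailable after the two-step estimator, so margins there would report effects on the linear index rather than on the probability; this is worth checking against the ivprobit postestimation manual entry.

```stata
* Simulated data with one endogenous regressor and two instruments;
* all names and values are hypothetical.
clear
set seed 42
set obs 5000
generate z1 = rnormal()
generate z2 = rnormal()
generate u  = rnormal()
generate byte d = runiform() < 0.5
generate endog = 0.5*z1 + 0.5*z2 + 0.5*u + rnormal()
generate byte y = (0.3*d - 0.5*endog + u) > 0

* Maximum likelihood (the default): probability-scale effects can be requested.
ivprobit y i.d (endog = z1 z2), nolog
margins, dydx(d) predict(pr)

* Two-step estimator: the effect is requested on the linear-index scale.
ivprobit y i.d (endog = z1 z2), twostep
margins, dydx(d) predict(xb)
```

If the reported margin comes from the linear index, its size should not be read as a change in probability.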
