  • DID designs for repeated cross sections, a binary response variable, and common timing.

    Hi all,

    I am trying to figure out what my options are for difference-in-differences designs with common timing, repeated cross sections, and a binary outcome.

    I have individual patient-level repeated cross sections from 85 clinics over T = 26 months (and counting). On average, each clinic has about 75 patients per month, but there is wide variation. The two treated clinics are much larger, averaging ~900 patients per month. I am hoping to stratify by specialty (I have 4). On average, within each specialty-clinic-month cell there are about 22 patients (but among treated clinics there are close to 240 on average). The two large clinics I mentioned were treated on the exact same date, so I'm dealing with common timing. I have 6 months of post-treatment observations and 20 months of pre-treatment observations.

    I am planning on estimating LPMs with clinic and month dummies, using both a static TWFE specification (i.e. the T = 2 case) and a dynamic one, but I was also hoping to estimate a non-linear model.
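
For concreteness, the two linear specifications can be written as below. The notation is an assumption introduced here for illustration and does not come from the post: $y_{ict}$ is the binary outcome for patient $i$ at clinic $c$ in month $t$, $D_c$ flags the two treated clinics, months run from 1 to 26 with treatment starting in month 21, and month 20 is taken as the omitted reference period.

```latex
% Static TWFE LPM: a single post-treatment effect \tau
y_{ict} = \alpha_c + \lambda_t + \tau \, D_c \cdot \mathbf{1}[t \ge 21] + u_{ict}

% Dynamic (event-study) TWFE LPM: one coefficient per month, omitting reference month 20
y_{ict} = \alpha_c + \lambda_t
        + \sum_{s = 1,\, s \ne 20}^{26} \beta_s \, D_c \cdot \mathbf{1}[t = s] + u_{ict}
```

In this sketch the $\beta_s$ for $s \le 19$ act as pre-trend checks, and those for $s \ge 21$ trace out the month-by-month post-treatment effects.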

    I am confused about a few things:

    (1) Can I apply pooled QMLE as described by Wooldridge (2023; see link) to repeated cross sections, given that I don't have staggered timing?

    (2) If not, will a GLM with a logit or probit link function yield consistent estimates? I am concerned about, and don't fully understand, two things: the incidental parameters problem I might run into with the clinic and time dummies, and the interpretation of interaction terms (computing cross-partial effects) in a non-linear setting.
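
To make the interaction-term concern concrete, here is a sketch for the simplest logit case with a treated-group indicator $d$ and a post-period indicator $p$. The notation and the two-group, two-period simplification are assumptions for illustration and are not taken from the post.

```latex
% Logit conditional mean with a group dummy d, a post dummy p, and their interaction
P(y = 1 \mid d, p) = \Lambda(\alpha + \beta d + \gamma p + \tau \, d p),
\qquad \Lambda(z) = \frac{e^{z}}{1 + e^{z}}

% Cross difference on the probability scale (discrete analogue of the cross-partial effect)
\Delta^{2} = \bigl[\Lambda(\alpha + \beta + \gamma + \tau) - \Lambda(\alpha + \beta)\bigr]
           - \bigl[\Lambda(\alpha + \gamma) - \Lambda(\alpha)\bigr]

% Effect on the treated if parallel trends is assumed to hold for the index inside \Lambda
\mathrm{ATT} = \Lambda(\alpha + \beta + \gamma + \tau) - \Lambda(\alpha + \beta + \gamma)
```

Neither quantity equals $\tau$ itself, and the two generally differ from each other: $\Delta^{2}$ can be nonzero even when $\tau = 0$, whereas the ATT expression is zero exactly when $\tau = 0$.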

    Thank you!

    Jeffrey M Wooldridge, Simple approaches to nonlinear difference-in-differences with panel data, The Econometrics Journal, Volume 26, Issue 3, September 2023, Pages C31–C66, https://doi.org/10.1093/ectj/utad016
    Summary. I derive simple, flexible strategies for difference-in-differences settings where the nature of the response variable may warrant a nonlinear mode
    Last edited by Daniel Lipsey; 21 Nov 2024, 16:01.

  • #2
    I can help with your second question. I would probably go with an LPM in your case, then perhaps apply the trimmed estimator of Horrace and Oaxaca (2006) if many of the predicted values fall outside the unit interval.
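
A rough sketch of the trimming idea, assuming it amounts to refitting OLS after dropping observations whose fitted values fall outside [0, 1] and repeating until none remain. This has not been checked against the exact procedure in Horrace and Oaxaca (2006); the function name `trimmed_lpm` and the simulated data are hypothetical and only meant to show the mechanics.

```python
import numpy as np

def trimmed_lpm(X, y):
    """Refit OLS repeatedly, dropping rows whose fitted value leaves [0, 1].

    Hypothetical helper: returns the final coefficients and a boolean mask
    of the rows that were kept. The loop must stop because the kept set
    strictly shrinks on every pass that does not return.
    """
    keep = np.ones(len(y), dtype=bool)
    while True:
        if keep.sum() <= X.shape[1]:
            raise RuntimeError("too few observations left after trimming")
        beta, *_ = np.linalg.lstsq(X[keep], y[keep], rcond=None)
        fitted = X @ beta
        inside = (fitted >= 0.0) & (fitted <= 1.0)
        if inside[keep].all():
            return beta, keep
        keep &= inside  # rows dropped once stay dropped

# Simulated data (assumption): a binary outcome driven by one continuous covariate.
rng = np.random.default_rng(0)
n = 5000
x = rng.normal(size=n)
prob = 1.0 / (1.0 + np.exp(-(0.5 + 2.0 * x)))
y = (rng.uniform(size=n) < prob).astype(float)
X = np.column_stack([np.ones(n), x])

beta_ols, *_ = np.linalg.lstsq(X, y, rcond=None)  # untrimmed LPM
beta_trim, keep = trimmed_lpm(X, y)               # trimmed LPM
n_dropped = int((~keep).sum())

print("OLS coefficients:    ", beta_ols)
print("Trimmed coefficients:", beta_trim)
print("Observations dropped:", n_dropped)
```

In a specification with only clinic and month dummies plus the treatment interaction, fitted values can still leave the unit interval because the model is not fully saturated, so counting them first is a cheap diagnostic before deciding whether trimming is worth it.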

    I would not use probit. Logit, on the other hand, has a sufficient statistic for the incidental parameters (here, the clinic fixed effects). Conditioning on it eliminates the IPP, all else equal; this would be xtlogit, fe if my memory serves.
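
The sufficient-statistic argument can be sketched as follows. The notation is assumed for illustration: $y_{ic}$ is the outcome for patient $i$ at clinic $c$, $x_{ic}$ collects the regressors that vary within a clinic, $\alpha_c$ is the clinic effect, and $n_c$ is the number of patients observed at clinic $c$.

```latex
% Fixed-effects logit
P(y_{ic} = 1 \mid x_{ic}, \alpha_c) = \Lambda(\alpha_c + x_{ic}'\beta)

% Conditioning on the clinic total S_c = \sum_i y_{ic} removes \alpha_c:
P\bigl(y_{1c}, \dots, y_{n_c c} \mid S_c\bigr)
  = \frac{\exp\bigl(\sum_{i} y_{ic} \, x_{ic}'\beta\bigr)}
         {\sum_{b \in B_c} \exp\bigl(\sum_{i} b_i \, x_{ic}'\beta\bigr)},
\qquad
B_c = \Bigl\{ b \in \{0,1\}^{n_c} : \textstyle\sum_i b_i = S_c \Bigr\}
```

The factor $\exp(\alpha_c S_c)$ appears in both the numerator and every term of the denominator, so it cancels and the conditional likelihood depends on $\beta$ only. Probit has no analogous sufficient statistic.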

    You are correct: interaction terms in nonlinear models are very complicated. Here are two references that should help:

    Interaction Terms in Poisson and Log Linear Regression Models, Shang, 2018, Bulletin of Economic Research (Wiley Online Library)

    Interaction terms in logit and probit models (ScienceDirect)


    • #3
      Thank you, Maxence! Will read up on all of the above.

      If anyone has thoughts about the potential for using pooled QMLE with repeated cross sections, please let me know. Thanks again!


      • #4
        I think pooled QMLE would be suitable. Do both LPM and QMLE and compare. I doubt there will be much difference between the two.
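
A minimal simulated sketch of that comparison, under several simplifying assumptions that are not from the thread: a single treated-group dummy stands in for the clinic dummies, a single post dummy stands in for the month dummies, the data are generated from a logit model, and all parameter values are invented. The pooled Bernoulli QMLE is fit with a hand-written Newton-Raphson routine (`fit_logit`, a hypothetical helper), and the effect on the treated is computed as the average difference between fitted probabilities with and without the treatment interaction. No standard errors are computed here.

```python
import numpy as np

def logistic(z):
    return 1.0 / (1.0 + np.exp(-z))

def fit_logit(X, y, tol=1e-10, max_iter=100):
    """Pooled Bernoulli (quasi-)MLE with a logistic mean, via Newton-Raphson."""
    beta = np.zeros(X.shape[1])
    for _ in range(max_iter):
        mu = logistic(X @ beta)
        grad = X.T @ (y - mu)                           # score
        hess = (X * (mu * (1.0 - mu))[:, None]).T @ X   # negative Hessian
        step = np.linalg.solve(hess, grad)
        beta += step
        if np.max(np.abs(step)) < tol:
            break
    return beta

# Simulated repeated cross sections (all values below are assumptions).
rng = np.random.default_rng(42)
n = 400_000
d = (rng.uniform(size=n) < 0.3).astype(float)      # treated-group indicator
p = (rng.uniform(size=n) < 6 / 26).astype(float)   # post-period indicator
alpha, beta_d, gamma, tau = -1.0, 0.5, 0.3, 0.6
index = alpha + beta_d * d + gamma * p + tau * d * p
y = (rng.uniform(size=n) < logistic(index)).astype(float)

X = np.column_stack([np.ones(n), d, p, d * p])

# LPM: the interaction coefficient is the difference-in-differences in means.
b_lpm, *_ = np.linalg.lstsq(X, y, rcond=None)
att_lpm = b_lpm[3]

# Pooled logit: average, over treated post-period rows, of the fitted
# probability minus the fitted probability with the interaction switched off.
b_logit = fit_logit(X, y)
treated_post = (d == 1) & (p == 1)
X1 = X[treated_post]
X0 = X1.copy()
X0[:, 3] = 0.0
att_logit = np.mean(logistic(X1 @ b_logit) - logistic(X0 @ b_logit))

# Effect on the treated implied by the simulated data-generating process.
att_true = logistic(alpha + beta_d + gamma + tau) - logistic(alpha + beta_d + gamma)

print("LPM estimate:         ", att_lpm)
print("Pooled logit estimate:", att_logit)
print("Simulated truth:      ", att_true)
```

Because the data here are generated from a logit model, the pooled logit estimate should track the simulated truth more closely than the LPM does, but with baseline probabilities this far from 0 and 1 the gap between the two estimates is small.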


        • #5
          Thank you, Dr. Ford!
