Hi all,
I am trying to figure out what my options are for difference-in-differences designs with common timing, repeated cross-sections, and a binary outcome.
I have individual patient-level repeated cross sections from 85 clinics over T = 26 months (and counting). On average, each clinic has about 75 patients per month, but there is wide variation. The two treated clinics are much larger, averaging ~900 patients per month. I am hoping to stratify by specialty (I have 4). On average, within each specialty-clinic-month cell there are about 22 patients (but among treated clinics there are close to 240 on average). The two large clinics I mentioned were treated on the exact same date, so I'm dealing with common timing. I have 6 months of post-treatment observations and 20 months of pre-treatment observations.
I am planning to estimate LPMs with clinic and month dummies, using both a static specification (i.e., the T = 2 case) and a dynamic TWFE specification, but I was also hoping to estimate a non-linear model.
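To make the two linear specifications concrete, here is roughly what I have in mind, estimated separately within each specialty. The notation is my own: i indexes patients, c clinics, and t = 1, ..., 26 months, with D_c = 1 for the two treated clinics and treatment starting in month 21. Using the last pre-treatment month (k = -1) as the omitted reference period is an assumption on my part, not something I have settled on.

```latex
% Static TWFE linear probability model
y_{ict} = \alpha_c + \lambda_t + \tau \, (D_c \times \mathrm{Post}_t) + \varepsilon_{ict},
\qquad \mathrm{Post}_t = \mathbf{1}[t \geq 21]

% Dynamic (event-study) version, with event time k = t - 21 and k = -1 omitted
y_{ict} = \alpha_c + \lambda_t
        + \sum_{\substack{k = -20 \\ k \neq -1}}^{5} \tau_k \, \bigl(D_c \times \mathbf{1}[t - 21 = k]\bigr)
        + \varepsilon_{ict}
```

In the dynamic version, the coefficients for k < 0 would serve as a check on pre-trends and those for k >= 0 as the month-by-month effects.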
I am confused about a few things:
(1) Can I apply pooled QMLE as described by Wooldridge (2023; see the reference below) with repeated cross-sections, given that I don't have staggered timing?
(2) If not, will a GLM with a logit or probit link function yield consistent estimates? There are two things I am concerned about and don't fully understand: the incidental parameters problem I might run into with the clinic and time dummies, and how to interpret interaction terms (computing cross-partial effects) in a non-linear setting.
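To illustrate the interpretation issue in (2), here is a minimal Python sketch with made-up logit coefficients (all four values are hypothetical, not estimates from my data). It compares the interaction coefficient on the index scale with two probability-scale quantities: the difference between the treated-post probability with and without the interaction term, which is how I understand the ATT to be defined in the linked paper, and the cross-difference of the four cell probabilities.

```python
import numpy as np

def logistic(x):
    """Logistic CDF, the inverse of the logit link."""
    return 1.0 / (1.0 + np.exp(-x))

# Hypothetical index coefficients (illustration only, not estimates)
alpha = -1.0   # intercept: control clinics, pre-period
beta_d = 0.5   # treated-clinic indicator
gamma = 0.2    # post-period indicator
delta = 0.4    # treated x post interaction, on the logit index scale

# Treated-post probability with and without the interaction term
p_treated_post = logistic(alpha + beta_d + gamma + delta)
p_counterfactual = logistic(alpha + beta_d + gamma)
att = p_treated_post - p_counterfactual

# Cross-difference of the four cell probabilities
cross_diff = (p_treated_post - logistic(alpha + beta_d)) - (
    logistic(alpha + gamma) - logistic(alpha)
)

print(f"interaction coefficient (index scale): {delta:.3f}")
print(f"ATT on the probability scale:          {att:.4f}")
print(f"cross-difference of probabilities:     {cross_diff:.4f}")
```

With these numbers the three quantities all differ, which is exactly why I am unsure which one to report and how to get standard errors for it.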
Thank you!
Jeffrey M Wooldridge, Simple approaches to nonlinear difference-in-differences with panel data, The Econometrics Journal, Volume 26, Issue 3, September 2023, Pages C31–C66, https://doi.org/10.1093/ectj/utad016