Hello, I have a highly unbalanced panel with T = 5 years and N = 20k firms. I want to estimate the impact of a training program on firm performance.
I am using a DiD specification following Callaway and Sant'Anna to account for heterogeneity in the treatment effect.
The problem is that the parallel trends assumption is unlikely to hold in my case, because firms tend to select into treatment if they performed poorly the last time they were observed.
I was thinking of matching on lagged outcomes before running the DiD estimation. Would this be a reasonable approach?
Ideally, for reasons related to the data-generating process, I would like to match on the most recently observed outcome; whether it was observed 1, 2, or 3 years earlier does not really matter in my case.
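To make the idea concrete, here is a minimal sketch on toy data of how I imagine building the matching variable. The names lag_perf and lag_gap and the toy values are made up for illustration, and whether the resulting variable can validly be used for matching or as a covariate is exactly what I am unsure about.

```stata
* Toy unbalanced panel (values are illustrative only)
clear
input firm yr performance
1 2015 10
1 2016  8
1 2018 12
1 2019 13
2 2015  9
2 2017 11
2 2019 10
end

* Previous OBSERVED outcome within firm, regardless of how many
* calendar years separate the two observations.
* Missing on each firm's first observed row.
bysort firm (yr): gen double lag_perf = performance[_n-1]

* Number of years since the previous observation (1, 2, 3, ...)
bysort firm (yr): gen int lag_gap = yr - yr[_n-1]

list firm yr performance lag_perf lag_gap, sepby(firm)
```

One option I am considering (an assumption on my part, not something I have verified) is to add lag_perf to the covariate list of the csdid call, so that the doubly robust step conditions on it.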
Do you know how I could correctly implement this?
Thank you very much for your support.
Best
For reference, the csdid specification I am running is:
Code:
csdid performance i.x1 i.x2, ivar(firm) time(yr) gvar(g_yr_first_post_trai) notyet long2 method(drimp)