  • Placebo test for staggered diff-in-diff

    Dear Statalists,

    I am struggling to validate the parallel-trends assumption for my diff-in-diff analysis.

    I am trying to investigate the effect of preferential trade agreements (PTAs) on the initiation of trade disputes. The dependent variable is the number of trade disputes between countries i and j in a given year. It is zero in 84% of observations, since many country pairs have no trade disputes.

    Inspired by this thread, I am using ppmlhdfe. My command is

    Code:
    ppmlhdfe no_ad_cases pta gdp_growth3_o gdp_growth3_d lag1_lnrer lag1_lnimp, a(year id) cluster(id) irr nolog
    where pta is the variable of interest: pta = 1 from the year t in which a country pair enters into a PTA onward, and 0 otherwise. In other words, the treatment group consists of the country pairs that have a PTA, and the controls are the country pairs without any PTA during my sample period. I have included control variables such as bilateral trade, the real exchange rate, and the GDP growth rate, together with country-pair fixed effects and year fixed effects. The data is unbalanced, N=30,515 and T=40. Because countries enter into PTAs at different times (e.g., the US and Canada formed NAFTA in 1994, but Korea and Chile implemented their PTA in 2004), I am in the staggered diff-in-diff world.

    I would like to do a placebo test, i.e., use the data that came before the treatment went into effect. The procedure is as follows:
    1. Use only the data that came before the treatment went into effect.
    2. Pick a fake treatment period.
    3. Estimate the same difference-in-differences model you were planning to use (for example, Y = α_t + α_g + β_1·Treated + ε), but create the Treated variable as equal to 1 if you’re in the treated group and after the fake treatment date you picked.
    4. If you find an “effect” for that treatment date where there really shouldn’t be one, that’s evidence that there’s something wrong with your design, which may imply a violation of parallel trends.
    The issue is that I cannot define a pre/post period for my controls. I can only drop the post-shock observations for the treatment group. I am not sure whether this is a correct way to do the placebo test.
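    The four steps above could be sketched in Stata roughly as follows. This is only an illustration under stated assumptions: pta_year (the year a pair's PTA entered into force, missing for never-treated pairs) is a hypothetical variable not mentioned in the post, and the 3-year lead for the fake treatment date is an arbitrary choice. Because treatment is staggered, the fake date is set relative to each pair's own entry year.

```stata
* Assumption: pta_year = year the pair's PTA entered into force
* (missing for never-treated pairs). Hypothetical variable, not in the post.
local lead 3    // arbitrary: fake treatment starts 3 years before the real one

preserve
* Step 1: keep never-treated pairs in full, and treated pairs only
* in the years before their actual treatment
keep if missing(pta_year) | year < pta_year

* Steps 2-3: placebo indicator = 1 for eventually-treated pairs
* from the fake treatment date onward, 0 otherwise
generate byte placebo_pta = !missing(pta_year) & year >= pta_year - `lead'

* Same specification as the main model, with pta replaced by the placebo
ppmlhdfe no_ad_cases placebo_pta gdp_growth3_o gdp_growth3_d lag1_lnrer lag1_lnimp, ///
    a(year id) cluster(id) irr nolog
restore
```

    In this sketch the never-treated pairs need no pre/post split of their own: they stay in the sample for all years, and the year fixed effects pick up the common time path. Step 4 then amounts to checking whether the IRR on placebo_pta is distinguishable from 1.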

    I am aware that a dynamic diff-in-diff with leads and lags, as well as several new commands such as "eventstudyinteract" developed by Sun and Abraham (2020), can be used to check the parallel-trends assumption. But I am not sure whether eventstudyinteract can accommodate a Poisson model. I also tried including both leads and lags. The graph below shows the result, and it does not look as if the parallel-trends assumption holds. Ideally, the incidence rate ratio (IRR) would be statistically indistinguishable from 1 before the shock, and smaller than 1 and significant after the shock.

    [Figure: trend_in_AD_initiation.png]
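    For readers who want to reproduce a leads-and-lags check of this kind with ppmlhdfe, a minimal sketch follows. It is not the code behind the figure above. It again assumes a hypothetical pta_year variable (missing for never-treated pairs); the window of 5 years on either side and the names rel and rel_s are illustrative choices.

```stata
* Assumption: pta_year = year the pair's PTA entered into force
* (missing for never-treated pairs). Hypothetical variable, not in the post.
generate rel = year - pta_year              // event time; missing if never treated

* Bin the endpoints into an illustrative window of -5..+5
replace rel = -5 if rel < -5
replace rel =  5 if rel >  5 & !missing(rel)

* Factor variables need non-negative levels, so shift by 5:
* level 0 = 5+ years before, level 4 = 1 year before, level 10 = 5+ years after
generate byte rel_s = rel + 5
* Never-treated pairs are placed in the omitted base level (event time -1)
replace rel_s = 4 if missing(pta_year)

ppmlhdfe no_ad_cases ib4.rel_s gdp_growth3_o gdp_growth3_d lag1_lnrer lag1_lnimp, ///
    a(year id) cluster(id) irr nolog

* Joint test that all lead coefficients (event times -5 to -2) are zero
testparm i(0/3).rel_s
```

    With the irr option the reported lead terms are ratios relative to the year before treatment, so values close to 1 together with a non-rejection in the joint test would be consistent with parallel pre-trends.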

    I would greatly appreciate it if anyone could guide me on how to verify the parallel-trends assumption in a count-model setup, as I am really clueless.

    Regards,
    Bingzi
    Last edited by Bingzi Zheng; 12 Dec 2021, 20:50.

  • #2
    I don't think there is a well-accepted answer. Linear staggered DiD has not been fully settled in the recent literature, let alone non-linear staggered DiD. To my knowledge, Wooldridge (2021) seems to be the only paper giving clear solutions to staggered DiD in non-linear models with two-way fixed effects, including parallel-trends testing in that setting.

    Using "pta" alone to capture the treatment effect won't be correct according to a large body of recent literature, particularly when treatment effects are heterogeneous. All the different solutions are based on carefully choosing treatment and control groups and assigning proper weights to each treated group. Wooldridge (2021) doesn't have new Stata commands because the solution simply uses linear regressions and is flexible enough to be extended to non-linear models. For your case, the idea is to construct multiple treatment groups. For example, pairs of countries signing PTAs in 1994 belong to one treatment group, and those entering PTAs in 2002 belong to another. For each treatment group, define a pre-post indicator (e.g., for the treatment group entering a PTA in 1994, the indicator = 1 after 1994). Then a linear model for such a staggered DiD would be

    Code:
    regress y c.d1#c.p1 c.d2#c.p2 c.d3#c.p3 ... c.dG#c.pG d1 d2 d3 ... dG p1 p2 p3 ... pG, vce(cluster id)
    In the model, d1, ..., dG are indicators for the G treatment groups, and p1, ..., pG are their pre-post indicators. The regression gives (heterogeneous) treatment effects for each treatment group, and you may calculate a weighted average of them to obtain an average treatment effect (or just keep the original heterogeneous effects). In this setting, parallel trends would be tested separately for each treatment group, as below.

    Code:
    regress y c.d1#year c.d2#year c.d3#year ... c.dG#year d1 d2 d3 ... dG i.year, vce(cluster id)
    For example, c.d2#year produces a set of interactions for treatment group 2, with the first year as the default base. If trends are parallel, all the interactions before the treatment year of group 2 should be statistically insignificant. This can be extended to ppmlhdfe, as

    Code:
    ppmlhdfe y c.d1#year c.d2#year c.d3#year ... c.dG#year, a(id year) vce(cluster id)
    Please refer to the paper for more details.
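    As a rough sketch of how the group indicators d1, ..., dG and pre-post indicators p1, ..., pG described above might be built, the loop below creates one pair per entry year. It assumes a hypothetical variable pta_year (the year a pair's PTA entered into force, missing for never-treated pairs) and treats the entry year itself as "post"; both are assumptions rather than details given in the thread.

```stata
* Assumption: pta_year = year the pair's PTA entered into force
* (missing for never-treated pairs). Hypothetical variable, not in the thread.
levelsof pta_year, local(cohorts)           // distinct entry years; missing is skipped

local inter                                  // will collect the interaction terms
local g = 0
foreach c of local cohorts {
    local ++g
    generate byte d`g' = (pta_year == `c')   // treatment-group indicator
    generate byte p`g' = (year >= `c')       // pre-post indicator, defined for every pair
    local inter `inter' c.d`g'#c.p`g'
}

* Group-specific treatment effects in the Poisson setting; with id and year
* absorbed, the main effects of d and p are collinear with the fixed effects
ppmlhdfe no_ad_cases `inter' gdp_growth3_o gdp_growth3_d lag1_lnrer lag1_lnimp, ///
    a(id year) vce(cluster id) irr nolog
```

    Note that each p indicator depends only on the calendar year, so it takes a value for never-treated pairs as well as for the treated group it belongs to.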



    • #3
      Thank you very much for your detailed reply. But I am still curious about the placebo test for staggered DiD in general. If the placebo test requires using only the data that came before the treatment went into effect, but there is no pre/post period for the controls, how should we deal with it? Is dropping the post-treatment observations for the treated pairs sufficient, or should we not use this type of test in the staggered DiD world? Could you kindly point me to the literature discussing this issue?



      • #4
        You don't need to do anything about the control group. In the specification of #2, for a treatment group entering treatment in a specific year t, the pre-post will be automatically defined for the controls (pre t or post t). In other words, the control group (never entering treatment) will simultaneously serve as the control for every treatment group, and its pre-post timing varies with the treatment group to which it serves as the control.
