
  • Diff-in-Diff with dependent variable in variation

    Hello everyone,

    I am estimating a difference-in-differences model with a synthetic control, using annual observations. My dependent variable is "variation in the number of work permits", so when taking the difference, I end up with a double difference.
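
    For concreteness, here is a minimal sketch of the kind of specification this describes (the notation is illustrative, not taken from the post). With the annual variation in work permits written as a first difference, the model is

    \Delta y_{it} = \beta_0 + \beta_1 \mathrm{Treat}_i + \beta_2 \mathrm{Post}_t + \beta_3 (\mathrm{Treat}_i \times \mathrm{Post}_t) + \varepsilon_{it}, \quad \text{where } \Delta y_{it} = y_{it} - y_{i,t-1},

    so the coefficient of interest, \beta_3, compares changes in an outcome that is itself already a difference.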

    My question, then, is: how do I interpret the coefficient of the model?

    Thanks!

    Regards!

  • #2
    To the best of my knowledge, D-in-D is a means of finding the treatment effect using some sort of averages of the treated and untreated. That said, you would want to interpret your estimate as a comparison between the treated and the untreated (keeping in mind that your estimate is a mean value).



    • #3
      What Francis writes is exactly right. Diff-In-Diff is used to find treatment effects. For this, one first takes the difference of some statistic (usually means) before and after the treatment. Then, to find out whether this difference is actually due to the treatment, one does the same for the control group and takes the difference of these differences (hence the name). This is the coefficient Stata gives you as output. If this coeff is significant, you can say that the treatment had an effect. Note, however, that for this to work, the treatment and control groups have to be the same size and be sampled from the same population. Since you're using a synthetic control, you have to make sure that you have actually used all variables of importance to create this control, as otherwise the results become useless.
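
      In the simple two-group, two-period case, that double difference of means is the standard textbook estimator (generic notation, not specific to this thread):

      \hat{\delta}_{DID} = (\bar{y}_{\mathrm{treated,post}} - \bar{y}_{\mathrm{treated,pre}}) - (\bar{y}_{\mathrm{control,post}} - \bar{y}_{\mathrm{control,pre}})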

      Hope that helped,

      Tim Umbach



      • #4
        I agree with most, but not all, of what Tim Umbach says above.

        "If this coeff is significant, you can say that the treatment had an effect."
        I will spare you my long rant about the misuse of p-values, but this is a classic example. Even taking null-hypothesis significance testing on its own terms, this is a misinterpretation. A "significant" coefficient means that data like these would be unlikely to arise from a population in which the effect is zero. It does not rule out the absence of an effect: it just says that these data would be improbable if that were the case. Nor does it lend support to any particular alternative to the null hypothesis of zero effect.
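
        In symbols, the standard definition being appealed to here is

        p = \Pr\left( |T| \ge |t_{\mathrm{obs}}| \,\middle|\, \beta = 0 \right),

        a probability about the data given the null, not a probability that the null is true.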

        In most situations, we don't need a statistical analysis to know that an effect is non-zero. There are very few zero effects in the real world, so we generally know from the start that our effect is non-zero. We may think it is very small, and we may even think it is close enough to zero that we are unsure whether it is positive or negative. But basically, rejecting the null hypothesis tells us nothing we didn't know before we invested our energy and resources in gathering data, unless you are in that rare situation where the null hypothesis of zero effect is not just a straw man. Moreover, thinking in dichotomous terms, effect vs. no effect, can land you in all sorts of paradoxes where one thing is "significant", some other closely related thing is not, and you can't figure out what to make of that.

        A better way to work with these models is to forget about statistical significance. Think of it as trying to get a decent estimate of the size of the effect. The interaction coefficient in the DID model will be an estimator of that effect size. The confidence interval gives you a sense of the uncertainty attached to that estimate.
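
        As a concrete illustration (a minimal sketch; the variable names d_permits, treated, post, and id are hypothetical, not taken from this thread), the estimate and its confidence interval come straight out of the regression output:

        * d_permits: year-over-year variation in work permits (the differenced outcome)
        * treated: 1 for the treated unit, 0 for the (synthetic) control
        * post: 1 for years after the treatment, 0 before
        regress d_permits i.treated##i.post, vce(cluster id)
        * the coefficient on 1.treated#1.post is the DID estimate;
        * read off its point estimate and confidence interval, not just its p-value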

        "Note, however, that for this to work, the treatment and control groups have to be the same size and be sampled from the same population."
        The treatment and control groups do not have to be of the same size (although they do have to be sampled from the same population).



        • #5
          Hi Clyde,

          Sorry, you are of course right about the sample sizes.

          The p-value thing, however, is another matter. Firstly, based on the question, I thought a basic breakdown of the usual steps would be useful here, and p-values are still the standard, although it has become en vogue to hate them. A t-test, of course, encapsulates much of the same information as a confidence interval, since it is built from the same ingredients (mean, standard deviation, and a theoretical t-value based on a chosen alpha). I agree with your point that one seldom has a zero-effect null hypothesis, but that is what one-sided tests are for. The big advantage of a p-value is that it sets an objective, if somewhat arbitrary, standard for when to regard a result as "real" rather than as random noise, which is much more difficult using confidence intervals alone. After all, who is to say what a CI of [-0.2; 1.2] actually means, and who can argue that interpreting this as a zero result is more correct than saying it shows a positive effect? Obviously p-values are a bit of a crutch, and they have more severe problems that don't arise here (the problem of multiple comparisons, etc.), but having them still seems better to me than not having them.
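
          For what it's worth, the "same ingredients" point can be made exact (standard formulas, not specific to this thread). With estimate \hat{\beta} and standard error \widehat{se},

          t = \frac{\hat{\beta}}{\widehat{se}}, \qquad \mathrm{CI}_{1-\alpha} = \hat{\beta} \pm t_{1-\alpha/2,\,df} \cdot \widehat{se},

          so a two-sided level-\alpha test rejects a zero effect exactly when the 1-\alpha confidence interval excludes zero.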



          • #6
            Thank you all for your helpful comments! I think I got the idea.

            Regards

