  • Difference-in-Differences analysis

    Dear all,

    I am quite new to Stata, and maybe this is an easy question, but I'm becoming desperate: I have no solution for my problem, and so far none of the posts about DiD in this forum have helped me.

    I have data on over 130 countries: their GDP, life expectancy, and competitiveness score for the years 2010–2015. My professor told me to run a difference-in-differences analysis although I have no treatment and hence no control group. He told me to define the diff-in-diff on a time basis, meaning that the year 2010 is t0 and I then have to look at how the variables have developed. As far as I understand it, he wants the year 2010 to be the untreated group and the other years to be the treated group (but maybe I am wrong here). I cannot run a time series regression because there are not enough years.

    The aim of this DiD is to identify how a change in a country's competitiveness score is connected with various other factors, such as the country's GDP, life expectancy, etc. I have to do a DiD for each of these factors.

    I would be so grateful if someone could help me because I have no idea how to do it.

    Franz


  • #2
    Princeton has a good, although very short guide: http://www.princeton.edu/~otorres/DID101.pdf
    Advice: read page 1, but forget about creating the interaction variable yourself, and use the method in slide 4.

    Also, Wikipedia has a good graph for making sense of this method.

    In a nutshell, you compare a trend (over time) between two groups of countries/persons/other units.
    You therefore need 1) a time variable, and 2) a treatment variable, i.e., a variable that somehow groups your countries (example: adopting a certain policy measure).
    Conflating the two makes little sense. The countries present in 2010 are the same countries present in later years, so all would be untreated in one time period and treated in another, leaving no comparison group.
    Perhaps what your supervisor meant was that different countries became treated (i.e., adopted the policy measure) at different points in time. It makes sense to refer to the first year of your data as the starting point of the trend for each of your countries, but not as the factor that decides treated vs. untreated.
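
    Written out as a regression, the usual two-group, two-period setup looks roughly like this (a sketch in my own notation, not taken from the Princeton slides; $y_{it}$ is the outcome for country $i$ in year $t$):

    ```latex
    y_{it} = \beta_0 + \beta_1\,\text{Treated}_i + \beta_2\,\text{Post}_t
           + \beta_3\,(\text{Treated}_i \times \text{Post}_t) + \varepsilon_{it}
    ```

    Here $\beta_3$ is the DiD estimate. If every country has $\text{Treated}_i = 1$, then $\text{Treated}_i$ is constant and the interaction equals $\text{Post}_t$, so $\beta_1$ and $\beta_3$ cannot be separated from $\beta_0$ and $\beta_2$.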



    • #3
      Thank you very much for your advice!

      What I still do not understand is how I should run the diff-in-diff when there are no treatments. And what should the time variable look like? 0 for year 2010, 1 for year 2011, and then what?



      • #4
        The time variable in these examples should be 1 once treatment has started. This could be 1 for all countries from, e.g., 2010 onwards, or it could depend on when each country started treatment, e.g., from 2010 onward for country A and from 2013 onward for country B.
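
        A minimal sketch of how such a country-specific time variable could be built. The toy data and the variable names country, adopt_year, treated and post are made up for illustration; adopt_year is assumed to hold the first treatment year and to be missing for countries that are never treated.

        ```stata
        clear
        input str1 country int adopt_year
        "A" 2010
        "B" 2013
        "C" .
        end
        * one row per country-year, 2010 to 2015
        expand 6
        bysort country: gen int year = 2009 + _n
        * treated = ever treated; post = 1 once that country's treatment has started
        gen byte treated = !missing(adopt_year)
        gen byte post = treated & (year >= adopt_year)
        list country year treated post, sepby(country)
        ```

        Country C never adopts, so its post stays 0 in every year and it serves as the comparison group.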

        The idea behind the interaction variable (time##treated) in the slides from Princeton is that it equals 1 only when a country belongs to the treated group and treatment has started.

        If you truly have no treatment, the DiD method makes no sense.

        To illustrate why your supervisor's suggestion (if you have described it correctly) does not work, see these two examples using the grunfeld example dataset:

        Example 1: assume companies with ID > 5 are treated, and treatment starts in 1940. Result: the treated and control groups differ significantly in levels in both periods, and a DiD estimate can be computed (here it is small and not significant).
        Code:
        * install the user-written diff command (needed only once)
        ssc install diff
        webuse grunfeld, clear
        * treatment group: companies 6 to 10
        gen treated = 0
        replace treated = 1 if company > 5
        * the dataset already has a variable named time, so we overwrite its values
        replace time = 0
        replace time = 1 if year >= 1940
        diff mvalue, t(treated) p(time)

        Code:
        DIFFERENCE-IN-DIFFERENCES ESTIMATION RESULTS
        Number of observations in the DIFF-IN-DIFF: 200
                    Baseline       Follow-up
           Control: 25             75          100
           Treated: 25             75          100
                    50             150
        ------------------------------------------------------
         Outcome var.   | mvalue  | S. Err. |   t   |  P>|t|
        ----------------+---------+---------+-------+---------
        Baseline        |         |         |       | 
           Control      | 1769.868|         |       | 
           Treated      | 256.824 |         |       | 
           Diff (T-C)   | -1.5e+03| 306.547 | -4.94 | 0.000***
        Follow-up       |         |         |       | 
           Control      | 1855.824|         |       | 
           Treated      | 353.095 |         |       | 
           Diff (T-C)   | -1.5e+03| 176.985 | -8.49 | 0.000***
                        |         |         |       | 
        Diff-in-Diff    | 10.314  | 353.969 | 0.03  | 0.977
        ------------------------------------------------------
        R-square:    0.33
        * Means and Standard Errors are estimated by linear regression
        **Inference: *** p<0.01; ** p<0.05; * p<0.1
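
        The DiD row can be cross-checked with a plain regression, since diff reports that its means are estimated by linear regression. A sketch, assuming the same split as above (companies with ID > 5, period starting in 1940); the variable name post is my own:

        ```stata
        webuse grunfeld, clear
        gen byte treated = company > 5
        gen byte post = year >= 1940
        * main effects plus interaction; the 1.post#1.treated coefficient is the DiD estimate
        regress mvalue i.post##i.treated
        ```

        The interaction coefficient should match the Diff-in-Diff row, because both are the same difference of four group means.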

        Example 2 (following your supervisor's suggestion): all companies untreated in 1935, treated in all later years:
        Code:
        webuse grunfeld, clear
        * every company is in the "treated" group, so there is no control group
        gen treated = 1
        * the dataset already has a variable named time, so we overwrite its values
        replace time = 0
        replace time = 1 if year > 1935
        diff mvalue, t(treated) p(time)

        Code:
        DIFFERENCE-IN-DIFFERENCES ESTIMATION RESULTS
        Number of observations in the DIFF-IN-DIFF: 200
                    Baseline       Follow-up
           Control: 0              0           0
           Treated: 10             190         200
                    10             190
        ------------------------------------------------------
         Outcome var.   | mvalue  | S. Err. |   t   |  P>|t|
        ----------------+---------+---------+-------+---------
        Baseline        |         |         |       | 
           Control      | 707.471 |         |       | 
           Treated      | 707.471 |         |       | 
           Diff (T-C)   | 0.000   |    .    |    .  |    .
        Follow-up       |         |         |       | 
           Control      | 1101.376|         |       | 
           Treated      | 1101.376|         |       | 
           Diff (T-C)   | 0.000   |    .    |    .  |    .
                        |         |         |       | 
        Diff-in-Diff    | 0.000   |    .    |    .  |    .
        ------------------------------------------------------
        R-square:    0.00
        * Means and Standard Errors are estimated by linear regression
        **Inference: *** p<0.01; ** p<0.05; * p<0.1

        Result: it is not possible to estimate any difference between treatment and control, because all companies were subject to the same treatment.
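
        The same problem shows up if you try the regression by hand. A sketch (the names post and did are my own): with treated equal to 1 for every observation, treated is collinear with the constant and the interaction is identical to post.

        ```stata
        webuse grunfeld, clear
        gen byte treated = 1
        gen byte post = year > 1935
        gen byte did = post * treated
        * Stata should drop the collinear terms and note them as omitted
        regress mvalue post treated did
        ```

        Whatever coefficient survives is just the before/after difference in means, not a DiD estimate.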



        • #5
          Thank you again for your help!

          This is exactly what I thought as well. But he insisted on doing a diff-in-diff. He said that I have to define the diff-in-diff on a time basis and observe how the variables develop over time (using the diff-in-diff). So is it somehow possible to use 2010 as the untreated year and the other years as the treated years? Or is there any other approach?

          He argued that the diff-in-diff approach is possible here although there is no treatment and control group.



          • #6
            Again, DiD is out of the question if you have no treatment and control group. If your supervisor still believes this is possible, show him the example above, and tell him that what he is asking for is equivalent to saying 'let's look at one group, and see if it differs from a group we don't look at'.

            Any other approach? Other than DiD, then? There are many, and their suitability will depend on what you are trying to do: there are plenty of flavours of time series or panel data estimation. Contrary to what you say in #1, 130 countries over 2010–2015 sounds like a sufficient dataset for such an analysis to me. The question of which analysis to use would be better asked in a new thread, though, where you explain the purpose of your analysis rather than focusing on this specific method.



            • #7
              Ok, thank you very much. I will email my supervisor.



              • #8
                Hi Jorrit,

                If you are still there, may I ask a question about how to interpret the DiD estimation result?

                Zihao
