Making a difference-in-difference graph for common trend assumption

Tarjei W. Havneraas

Join Date: Nov 2016
Posts: 136

22 Aug 2018, 06:00

Hi everyone

I have conducted a DiD analysis and want to plot trend graphs of the dependent variable for the intervention and control groups. The main point is to visualize the trends to assess the common trend assumption (also known as the parallel paths assumption). Briefly:

The key assumption here is what is known as the “Parallel Paths” assumption, which posits that the average change in the comparison group represents the counterfactual change in the treatment group if there were no treatment

It is commonly visualized with a graph like this:

[Image: 7go66 (1).png]

The dotted line is not necessary; it is only included in the picture to illustrate the trend that the treated units would have followed had they not received treatment.

However, I am unsure how to get a correct graph in Stata and I wonder if anyone can give me any leads on this?

The dependent variable is the monthly regional suicide rate, and the treatment status variable groups the regions: the intervention group consists of four regions and the control group consists of two. Here is some information about my data set:

Code:

. xtdescribe

  region:  1, 2, ..., 6                                      n =          6
 bymonth:  0, 1, ..., 71                                     T =         72
           Delta(bymonth) = 1 unit
           Span(bymonth)  = 72 periods
           (region*bymonth uniquely identifies each observation)

Distribution of T_i:   min      5%     25%       50%       75%     95%     max
                        72      72      72        72        72      72      72

     Freq.  Percent    Cum. |  Pattern
 ---------------------------+--------------------------------------------------------------------------
        6    100.00  100.00 |  111111111111111111111111111111111111111111111111111111111111111111111111
 ---------------------------+--------------------------------------------------------------------------
        6    100.00         |  XXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXX


. list suiciderate bymonth eventdate region treated_region in 1/10

     +----------------------------------------------------------------------+
     | suicid~e   bymonth       eventdate                 region   treate~n |
     |----------------------------------------------------------------------|
  1. | .3552352         0     January2012   North central region    Control |
  2. | .4736469         1    February2012   North central region    Control |
  3. | 2.012999         2       March2012   North central region    Control |
  4. | 2.486646         3       April2012   North central region    Control |
  5. | 1.420941         4         May2012   North central region    Control |
     |----------------------------------------------------------------------|
  6. | 1.302529         5        June2012   North central region    Control |
  7. | 1.657764         6        July2012   North central region    Control |
  8. | 2.012999         7      August2012   North central region    Control |
  9. | 1.420941         8   September2012   North central region    Control |
 10. | .9472938         9     October2012   North central region    Control |
     +----------------------------------------------------------------------+

And I will also cross-refer to a related thread: https://www.statalist.org/forums/for...regional-units

Clyde Schechter

Join Date: Apr 2014

Posts: 30100
#2

22 Aug 2018, 10:30

So you would first reduce your data set to one observation per month for the control group and one for the intervention group, containing an indicator for which group (say 1 for intervention, 0 for control), the month, and the "average" suicide rate for the group in that month. You might want to make that a weighted average, weighted by population or something like that. Anyway, probably the -collapse- command will enable you to do that. Then you want to -reshape wide suicide_rate, i(month) j(group)- and then -graph twoway line suicide_rate* month, sort-.
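
As a rough sketch of those three steps, run from a do-file: the data below are simulated to stand in for the real panel (6 regions, 72 months), and the variable names group, month, suicide_rate and the population weight are assumptions for illustration, not the poster's actual variables.

```stata
* Simulated stand-in for the panel: 6 regions x 72 months (illustrative only).
clear
set seed 12345
set obs 6
generate region = _n
generate byte group = inlist(region, 2, 4, 5, 6)   // 1 = intervention, 0 = control
generate population = 100000 * region               // hypothetical weight variable
expand 72
bysort region: generate month = _n - 1
generate suicide_rate = 1 + 0.5 * runiform()

* Step 1: one observation per group per month (population-weighted mean).
* Region is deliberately left out of by().
collapse (mean) suicide_rate [aweight = population], by(group month)

* Step 2: one row per month, with suicide_rate0 (control) and suicide_rate1 (intervention).
reshape wide suicide_rate, i(month) j(group)

* Step 3: plot both group series against time.
graph twoway line suicide_rate0 suicide_rate1 month, sort ///
    legend(order(1 "Control" 2 "Intervention")) ytitle("Mean suicide rate")
```

Dropping the [aweight = population] part gives the plain unweighted mean across regions instead.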

Tarjei W. Havneraas

Join Date: Nov 2016
Posts: 136

22 Aug 2018, 13:30

Thanks for your reply. I reduced my data set and tried reshaping to wide by:

Code:

collapse (mean) suiciderate, by (intervention region bymonth)
reshape wide suiciderate, i(bymonth) j(intervention)

However, it does not seem like the reshape command is correct (and I've tried some other alternatives without getting it right). The error message says:

Code:

reshape wide suiciderate, i(bymonth) j(intervention)
(note: j = 0 1)
values of variable intervention not unique within bymonth

Here is a look at my original data:

Code:

* Example generated by -dataex-. To install: ssc install dataex
clear
input float(suiciderate attemptrate intervention) long region float(bymonth eventdate)
 .3552352  3.552352 0 1 0 624
 .4736469 2.2498226 0 1 1 625
 2.012999 3.1971166 0 1 2 626
 2.486646  3.552352 0 1 3 627
1.4209406  2.605058 0 1 4 628
 1.302529  3.315528 0 1 5 629
 1.657764 2.2498226 0 1 6 630
 2.012999 3.0787046 0 1 7 631
1.4209406  4.025998 0 1 8 632
 .9472938 2.7234695 0 1 9 633
end
format %tm bymonth
format %tmMCY eventdate
label values intervention intervention
label def intervention 0 "Control", modify
label values region region
label def region 1 "North central region", modify

And a random sample that includes all regions:

Code:

* Example generated by -dataex-. To install: ssc install dataex
clear
input float(suiciderate attemptrate intervention) long region float(bymonth eventdate)
  1.19644  2.632168 0 1 16 640
 .9571519   2.39288 0 1 21 645
1.9621284  2.697927 0 1 39 663
 .8805054 2.0125837 0 1 61 685
 1.253316  3.759948 1 2  2 626
 .8517325  2.981064 1 2 53 677
 .5102633  5.485331 0 3 45 669
1.8519597  3.571637 0 3 66 690
 .8205981  2.051495 1 4  2 626
 .6224772 2.2132525 1 4 29 653
 .7055012  3.033655 1 4 68 692
 .4779543  .4779543 1 5 59 683
.23527063 2.1174357 1 6 27 651
 .6587578  3.058518 1 6 33 657
 .3300042 3.6300466 1 6 37 661
end
format %tm bymonth
format %tmMCY eventdate
label values intervention intervention
label def intervention 0 "Control", modify
label def intervention 1 "Intervention", modify
label values region region
label def region 1 "North central region", modify
label def region 2 "North east region", modify
label def region 3 "North west region", modify
label def region 4 "South central region", modify
label def region 5 "South east region", modify
label def region 6 "South west region", modify

Clyde Schechter

Join Date: Apr 2014

Posts: 30100
#4

22 Aug 2018, 13:53

Please re-read what I wrote in #2. You need a single observation per month for the control group and for the intervention group. So your -collapse- command must not retain the region variable.

Tarjei W. Havneraas

Join Date: Nov 2016
Posts: 136

22 Aug 2018, 14:59

Thank you for clearing this up, and sorry for misreading #2. I retried without region and the results make sense now, both for the graph by year and for the graph by month ("bymonth"):

Code:

. collapse (mean) suiciderate, by (intervention year)

. reshape wide suiciderate, i(year) j(intervention)
(note: j = 0 1)

Data                               long   ->   wide
-----------------------------------------------------------------------------
Number of obs.                       12   ->       6
Number of variables                   3   ->       3
j variable (2 values)      intervention   ->   (dropped)
xij variables:
                            suiciderate   ->   suiciderate0 suiciderate1
-----------------------------------------------------------------------------

. graph twoway line suiciderate0 suiciderate1 year, sort


Nathan Estifanos

Join Date: Jun 2020

Posts: 6
#6

19 Jun 2020, 07:11

Dear,
I have a DID data set with a time dummy, a group dummy, and their interaction, yet I don't know how I can command DID graph on stat?
Attached Files

Panel101.dta (7.4 KB, 2 views)

Tarjei W. Havneraas

Join Date: Nov 2016

Posts: 136
#7

19 Jun 2020, 07:36

Hi Nathan

I am not sure I understand you correctly. What do you mean by "command DID graph on stat"?

Do you want a graph like the one in #1 of the thread, or something else? If you only have two groups (treated and control) and t > 2 with some intervention, a starting point would be a trend graph of the outcome by treatment group over time. You can then assess the parallel trend assumption by checking whether the two groups' outcome trends move together before the intervention, and see how they differ after it.
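
A minimal sketch of such a trend graph, run from a do-file. Everything here is simulated: the variables treated, t and y, and the intervention starting at period 6, are hypothetical placeholders for your own data.

```stata
* Simulated group means: 2 groups x 10 periods (illustrative only).
clear
set seed 101
set obs 2
generate byte treated = _n - 1
expand 10
bysort treated: generate t = _n
generate y = 1 + 0.1 * t + 0.5 * treated + 0.05 * rnormal()
* Hypothetical treatment effect from period 6 onwards.
replace y = y - 0.4 if treated == 1 & t >= 6

* One line per group; the dashed vertical line marks the (assumed) intervention.
twoway (connected y t if treated == 0, sort) ///
       (connected y t if treated == 1, sort), ///
       xline(5.5, lpattern(dash)) ///
       legend(order(1 "Control" 2 "Treated")) ///
       ytitle("Mean outcome") xtitle("Period")
```

This version skips the reshape by plotting each group with an if condition. It assumes the data already hold one observation per group per period.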

Nathan Estifanos

Join Date: Jun 2020

Posts: 6
#8

20 Jun 2020, 00:29

Dear, I need a graph like the one in #1. I have 2 groups and 2 time periods (dummy).

Tarjei W. Havneraas

Join Date: Nov 2016

Posts: 136
#9

20 Jun 2020, 11:03

Ok, I think following the code in #5 should do the trick then. Just exchange "suiciderate" with your outcome variable, "intervention" with your treatment variable and "year" with your time variable.

Note that -collapse- replaces the data in memory with the collapsed data and keeps only the variables named in the command, so save your data set before proceeding. Alternatively, I know there is a way to restore the original data set without reloading it after -collapse-, but I don't remember the code right now.
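
The commands being half-remembered here are most likely -preserve- and -restore-, which snapshot the data in memory and bring it back afterwards. A minimal sketch, using simulated data with the variable names from #5 (the values themselves are made up):

```stata
* Simulated stand-in: 6 regions x 6 years (illustrative only).
clear
set seed 2020
set obs 6
generate region = _n
generate byte intervention = inlist(region, 2, 4, 5, 6)
expand 6
bysort region: generate year = 2011 + _n
generate suiciderate = 1 + runiform()

preserve                     // snapshot the full data set
collapse (mean) suiciderate, by(intervention year)
reshape wide suiciderate, i(year) j(intervention)
graph twoway line suiciderate0 suiciderate1 year, sort
restore                      // the original 36 observations are back in memory
```

If the do-file stops with an error between -preserve- and -restore-, Stata restores the preserved data automatically.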

Nathan Estifanos

Join Date: Jun 2020

Posts: 6
#10

21 Jun 2020, 03:36

Dear,
Thank you very much, it is very helpful. I have some questions about the difference-in-differences impact evaluation method:
1- What other options are there for assessing impact when randomization fails, or in natural experiments?
2- Is DID descriptive or analytic statistics?
3- What are the strengths and weaknesses of DID?
4- What assumptions does DID make other than the parallel trend assumption, and how can we check them?
5- What should we do if our data set fails to satisfy the DID assumptions?
Thanks in advance!

Tarjei W. Havneraas

Join Date: Nov 2016

Posts: 136
#11

24 Jun 2020, 01:29

Hi Nathan

I would go to the literature to answer these questions. Angrist & Pischke's Mostly Harmless Econometrics and/or Mastering 'Metrics (the latter is to a large extent a lighter version of the former) and this more specific DiD intro article by Wing et al. (2018) should have the answers to most of your questions.