
  • Line chart parallel trends

    Hi,

    I have county-level US election data since 2000 (the variable "per_dem" is the percentage vote for the Democratic Party), and I am running a difference-in-differences analysis to check the impact of a treatment in the year 2020 ("treat=1"). To check for parallel trends, I want to create a line chart by year with two lines:

    - Average % vote since 2000 for democratic party for counties treated in 2020
    - Average % vote since 2000 for democratic party for counties not treated in 2020

    Is there a simple way to tell Stata to plot the yearly average of "per_dem" for counties with treat=1 and for counties with treat=0?

    Thank you very much,

    Joan

  • #2
    You need to collapse the data to one observation per year and treatment group (dropping the county dimension) and then graph the result. Below, I assume that the variable "treated2020" takes the value 1 in all years if a county is treated in the year 2020 and 0 otherwise.

    Code:
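    preserve
    * collapse defaults to the mean: one row per year and treatment group
    collapse per_dem, by(year treated2020)
    tw (line per_dem year if treated2020, sort) (line per_dem year if !treated2020, sort), legend(order(1 "Treated in 2020" 2 "Not treated in 2020"))
    restore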
    Last edited by Andrew Musau; 08 Mar 2021, 16:29.



    • #3
      Thank you very much Andrew, this works great. The problem is that my treatment variable only takes the value 1 in the year 2020, so I only get one line in the chart. How can I generate a treatment variable that also takes the value 1 for the treated counties in the years before the treatment took place?

      Thank you again,

      Joan




      • #4
        Code:
        bys county: egen treated2020= max(treat==1 & year==2020)



        • #5
          Thank you so much, this works perfectly. One last question: my treatment has 4 different intensities, so treated2020 can actually take the values 1, 2, 3 or 4. I tried to adapt your command to

          #bys county: egen treated2020= max(treat==4 & year==2020)#

          but it takes the value of 1 for all observations, how can I adapt the command so that the variable treated2020 can take the values 1, 2, 3 and 4?

          Thanks again,

          Joan




          • #6
            You can create indicators for each level separately:

            Code:
            forval i=1/4{
                bys county: egen t2020_`i'= max(treat==`i' & year==2020)
            }
            or if each county is associated with only one value in 2020,

            Code:
            bys county: egen treated2020= max(cond(year==2020, treat, .))
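            The attempt in #5 can only ever return 0 or 1, because treat==4 & year==2020 is a logical expression and max() then takes the maximum of zeros and ones. The cond() version instead carries the 2020 value of treat itself to every year of the county. A rough sketch of how the resulting variable might be used to draw one line per intensity follows. It runs on made-up toy data, and it assumes that untreated counties have treat equal to 0 in 2020; the variable level2020 and the legend labels are hypothetical.

            ```stata
            * Toy data so the sketch runs on its own; replace with your own dataset.
            clear
            set seed 2021
            set obs 50
            gen county = _n
            gen level2020 = mod(county, 5)          // hypothetical: 0 = untreated, 1-4 = intensity
            expand 6                                // six election years per county
            bysort county: gen year = 2000 + 4*(_n - 1)
            gen treat = cond(year == 2020, level2020, 0)
            gen per_dem = 40 + 2*level2020 + rnormal(0, 3)
            drop level2020

            * Spread the 2020 value of treat to all years of the same county
            bysort county: egen treated2020 = max(cond(year == 2020, treat, .))

            * One mean per year and intensity level, then one line per level
            collapse (mean) per_dem, by(year treated2020)
            twoway (line per_dem year if treated2020 == 0, sort) ///
                   (line per_dem year if treated2020 == 1, sort) ///
                   (line per_dem year if treated2020 == 2, sort) ///
                   (line per_dem year if treated2020 == 3, sort) ///
                   (line per_dem year if treated2020 == 4, sort), ///
                   legend(order(1 "Untreated" 2 "Level 1" 3 "Level 2" 4 "Level 3" 5 "Level 4"))
            ```

            The /// line continuations only work in a do-file. On real data, wrap the collapse and twoway lines in preserve and restore, as in #2, to keep the county-level data.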
            Last edited by Andrew Musau; 09 Mar 2021, 06:03.



            • #7
              Great, I managed to make it work with the above suggestion, thank you so much. Is there a way to also add the confidence intervals for each period?



              • #8
                twoway only draws graphs; it does not compute confidence intervals. If you already have variables holding the interval bounds, you can plot them too.
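                One possible way to obtain such variables is to let collapse return the standard deviation and count along with the mean, and build the bounds by hand. This is only a sketch on made-up toy data: the names mean_dem, sd_dem, n_dem, lb and ub are hypothetical, and the intervals are plain t-based 95% intervals for each group-year mean of county percentages, with no weighting by county size and no clustering.

                ```stata
                * Toy data so the sketch runs on its own; replace with your own dataset.
                clear
                set seed 12345
                set obs 40
                gen county = _n
                gen treated2020 = county <= 20          // 20 treated, 20 untreated counties
                expand 6                                // six election years per county
                bysort county: gen year = 2000 + 4*(_n - 1)
                gen per_dem = 40 + 5*treated2020 + rnormal(0, 3)

                * Mean, standard deviation and number of counties per year and group
                collapse (mean) mean_dem=per_dem (sd) sd_dem=per_dem (count) n_dem=per_dem, by(year treated2020)

                * 95% interval for each group-year mean, based on the t distribution
                gen lb = mean_dem - invttail(n_dem - 1, 0.025) * sd_dem / sqrt(n_dem)
                gen ub = mean_dem + invttail(n_dem - 1, 0.025) * sd_dem / sqrt(n_dem)

                * Capped spikes for the intervals, lines for the means
                twoway (rcap lb ub year if treated2020) ///
                       (line mean_dem year if treated2020, sort) ///
                       (rcap lb ub year if !treated2020) ///
                       (line mean_dem year if !treated2020, sort), ///
                       legend(order(2 "Treated in 2020" 4 "Not treated in 2020")) ytitle("Mean per_dem")
                ```

                The /// line continuations only work in a do-file. On real data, wrap everything from collapse onward in preserve and restore, as in #2.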

