
  • How to plot average of a variable against a relative time variable?

    Hi,

    I want to plot the averages of the variables Violent and Property against a variable that I have generated, which captures the relative time to treatment. I want to do this exercise for both control and treatment groups and then combine them in one graph. Is there a way to plot the average without collapsing the data? If I collapse the data, how can I use the relative time?

    clear
    input double fips float year double(Violent Property) float(t2tminor Minor_Treated)
    1011 1990 18 40 -5 1
    1011 1991 16 49 -4 1
    1011 1992 21 51 -3 1
    1011 1993 16 33 -2 1
    1011 1994 19 19 -1 1
    1011 1995 15 26 0 1
    1011 1996 11 17 1 1
    1011 1997 3 11 2 1
    1011 1998 13 35 3 1
    1011 1999 10 26 4 1
    1011 2000 0 0 5 1
    1011 2001 9 10 6 1
    1011 2002 12 13 7 1
    1011 2003 22 8 8 1
    1011 2004 19 10 9 1
    1011 2005 11 8 10 1
    1011 2006 12 5 11 1
    1011 2007 3 1 12 1
    1011 2008 10 5 13 1
    1011 2009 9 17 14 1
    1011 2010 2 1 15 1
    1011 2011 0 0 16 1
    1011 2012 0 0 17 1
    1011 2013 0 0 18 1
    1011 2014 3 4 19 1
    1011 2015 5 8 20 1
    1011 2016 12 12 21 1
    1011 2017 11 7 22 1
    1011 2018 15 18 23 1
    1011 2019 9 6 24 1
    1013 1990 67 131 -5 1
    1013 1991 73 169 -4 1
    1013 1992 26 98 -3 1
    1013 1993 40 136 -2 1
    1013 1994 53 156 -1 1
    1013 1995 40 141 0 1
    1013 1996 55 120 1 1
    1013 1997 32 75 2 1
    1013 1998 48 112 3 1
    1013 1999 23 61 4 1
    1013 2000 0 0 5 1
    1013 2001 18 52 6 1
    1013 2002 17 53 7 1
    1013 2003 32 67 8 1
    1013 2004 17 98 9 1
    1013 2005 19 79 10 1
    1013 2006 0 0 11 1
    1013 2007 15 86 12 1
    1013 2008 16 55 13 1
    1013 2009 16 69 14 1
    1013 2010 21 72 15 1
    1013 2011 0 0 16 1
    1013 2012 0 0 17 1
    1013 2013 0 0 18 1
    1013 2014 7 51 19 1
    1013 2015 34 153 20 1
    1013 2016 22 186 21 1
    1013 2017 30 131 22 1
    1013 2018 31 116 23 1
    1013 2019 29 55 24 1
    1015 1990 388 852 . 0
    1015 1991 324 773 . 0
    1015 1992 512 996 . 0
    1015 1993 522 979 . 0
    1015 1994 435 1136 . 0
    1015 1995 389 998 . 0
    1015 1996 337 991 . 0
    1015 1997 321 1043 . 0
    1015 1998 324 949 . 0
    1015 1999 279 874 . 0
    1015 2000 218 529 . 0
    1015 2001 288 924 . 0
    1015 2002 257 688 . 0
    1015 2003 263 947 . 0
    1015 2004 194 737 . 0
    1015 2005 218 555 . 0
    1015 2006 171 346 . 0
    1015 2007 202 709 . 0
    1015 2008 222 964 . 0
    1015 2009 228 1056 . 0
    1015 2010 174 935 . 0
    1015 2011 0 0 . 0
    1015 2012 0 0 . 0
    1015 2013 0 0 . 0
    1015 2014 31 96 . 0
    1015 2015 293 886 . 0
    1015 2016 284 687 . 0
    1015 2017 276 910 . 0
    1015 2018 250 961 . 0
    1015 2019 124 522 . 0
    1017 1990 79 217 . 0
    1017 1991 103 204 . 0
    1017 1992 98 222 . 0
    1017 1993 102 252 . 0
    1017 1994 89 203 . 0
    1017 1995 76 254 . 0
    1017 1996 95 205 . 0
    1017 1997 86 204 . 0
    1017 1998 78 195 . 0
    1017 1999 79 172 . 0
    end

  • #2
    I'm guessing that your time-to-treatment variable is t2tminor, and that the treatment vs. control distinction is the Minor_Treated variable? In the future, please explain all key variables when the meanings of the names are not immediately, unmistakably obvious to people who know nothing about your project and do not work in your field.

    Well, there is a major conceptual problem here. There is no "time to treatment" in the control group. Now, in your example data, the year in which treatment is given (i.e. t2tminor == 0) is always 1995. But as there are only two different fips in that part of the data, this might just be a coincidence. But, if it is not a coincidence and is indeed true of the data as a whole, then you have a simple solution:

    Code:
    replace t2tminor = year - 1995 if missing(t2tminor)
    Then after that you can
    Code:
    collapse (mean) Property Violent, by(t2tminor Minor_Treated)
    and then make your graphs.
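
    As a minimal sketch of that last step, assuming the collapsed data contain Violent, t2tminor, and the treatment indicator under its example-data spelling Minor_Treated, the two groups can be overlaid in one graph (untested; the legend labels and axis titles are illustrative choices):

    ```stata
    * Assumes -collapse- has already been run, leaving one row per
    * t2tminor x Minor_Treated cell.
    twoway (line Violent t2tminor if Minor_Treated == 1, sort) ///
           (line Violent t2tminor if Minor_Treated == 0, sort), ///
           xline(0) legend(order(1 "Treated" 2 "Control")) ///
           xtitle("Years relative to treatment") ytitle("Mean Violent")
    ```

    Replacing Violent with Property gives the corresponding graph for Property.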

    If, however, it is not the case that treatment of the treated was always begun in 1995, then you have a much harder situation to deal with. You need to impute a time of treatment for the control fips. This would involve pairing up each control fips with a treated fips, and then imputing the treated fips' year of treatment to the control fips. The pairing of control with treated fips should be based on matching them on something that is relevant to the outcome variables Violent and Property that you are studying. So they might be fips that are closely matched on age distribution and socioeconomic status, or something like that. No doubt there will be issues in just how to code this entire process, but before getting into that, you need to settle on just what kind of matching you will do, and you may need to acquire some additional data for that purpose.
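
    Once a pairing exists, the imputation itself might look like the sketch below. Everything about the pairing file is hypothetical: it assumes a dataset pairs.dta with one row per control fips and a variable imputed_year holding the treatment year borrowed from its matched treated fips. Neither name comes from your data.

    ```stata
    * Hypothetical file and variable: pairs.dta (one row per control fips)
    * and imputed_year (treatment year of the matched treated fips).
    merge m:1 fips using pairs, keep(master match) nogenerate
    * Fill relative time only for control observations, where it is missing.
    replace t2tminor = year - imputed_year if missing(t2tminor) & Minor_Treated == 0
    ```

    After that, the same -collapse- by t2tminor and Minor_Treated applies.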



    • #3
      Thanks for responding. I have multiple treatments with variation in treatment timing, so "year - 1995" will not quite work: some counties experience events at different points in time. Also, running the code below gives me a very spiky graph:

      use Unbalanced1.dta, clear
      sort fips year
      //
      replace t2tminor = 0 if missing(t2tminor)
      replace t2tmajor = 0 if missing(t2tmajor)

      //
      collapse (mean) Violent Property, by(fips t2tminor t2tmajor Treatment_Status Treatment_Group)
      //
      ************************************************************
      bysort fips: gen num_run = 1 if Treatment_Group ==0 | missing(Treatment_Group)
      * Process 2: Generate the Drop_County variable by taking the maximum of flag_zero_months within each county
      bysort fips: egen drop_agencyA = max(num_run)
      * Process 3: Drop the intermediate variable
      drop if drop_agencyA ==1
      ************************************************************
      twoway (line Violent t2tminor if Treatment_Group==1, sort), xline(0)

      [Attached graph: Violent.png]



      • #4
        What?

        Code:
        replace t2tminor = 0 if missing(t2tminor)
        replace t2tmajor = 0 if missing(t2tmajor)
        lumps all of the control group data to relative time 0.

        Code:
        bysort fips: gen num_run = 1 if Treatment_Group ==0 | missing(Treatment_Group)
        * Process 2: Generate the Drop_County variable by taking the maximum of flag_zero_months within each county
        bysort fips: egen drop_agencyA = max(num_run)
        * Process 3: Drop the intermediate variable
        drop if drop_agencyA ==1
        results in the removal of every fips in the control group. So your graph ultimately contains no control group data.

        The spikiness of your graph arises because you did not specify the -sort- option to the -twoway- command.




        • #5
          Could you please suggest the corrected code, then? Even with those code chunks commented out, I still get a spiky graph:

          use Unbalanced1.dta, clear
          sort fips year
          //
          *replace t2tminor = 0 if missing(t2tminor)
          *replace t2tmajor = 0 if missing(t2tmajor)

          //
          collapse (mean) Violent Property, by(t2tminor t2tmajor Treatment_Status Treatment_Group)
          //
          ************************************************************
          *gen num_run = 1 if Treatment_Group ==0 | missing(Treatment_Group)
          * Process 2: Generate the Drop_County variable by taking the maximum of flag_zero_months within each county
          *egen drop_agencyA = max(num_run)
          * Process 3: Drop the intermediate variable
          *drop if drop_agencyA ==1
          ************************************************************
          twoway (line Violent t2tminor if Treatment_Group==1, sort), xline(0)
          //Graph.png
          Last edited by Anupam Ghosh; 09 Dec 2024, 19:37.



          • #6
            The problem is that for each value of t2tminor, you have multiple observations, corresponding to the different combinations of t2tmajor, Treatment_Status, and Treatment_Group. So at any given value of t2tminor, the line jumps back and forth among the several Violent (or Property, as the case may be) means that share that value of t2tminor.

            As your example data makes no mention of the latter variables, I can only speculate about the right way to handle this. But it is probably something like this (untested, of course):
            Code:
            keep if Treatment_Group != 0
            collapse (mean) Violent Property, by(t2tminor Treatment_Group)
            reshape wide Violent Property, i(t2tminor) j(Treatment_Group)
            graph twoway line Violent* Property* t2tminor, sort xline(0)
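
            If you would rather not collapse the data (your original question), a sketch along the following lines keeps every observation and plots group means instead. It is equally untested, and the names mean_Violent and pickone are made up for illustration; Treatment_Group == 1 is taken from your own -twoway- command.

            ```stata
            * Mean of Violent within each relative-time x group cell, kept on every row.
            egen mean_Violent = mean(Violent), by(t2tminor Treatment_Group)
            * Tag one observation per cell so each mean is plotted only once.
            egen byte pickone = tag(t2tminor Treatment_Group)
            twoway line mean_Violent t2tminor if pickone & Treatment_Group == 1, sort xline(0)
            ```

            Further groups can be overlaid by adding one -line- plot per value of Treatment_Group.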




            • #7
              Thanks Clyde! This worked.

