Announcement

Collapse
No announcement yet.
X
  • Filter
  • Time
  • Show
Clear All
new posts

  • Converting incident data to weekly rates

    Hi,

    I've got a data handling challenge that I was hoping to get some assistance with. The data I'm working with contains all the crime incidents that occurred throughout the country in a given year with details such as the location (state), date, and type (property or violent crime).

    E.g.,

    Code:
    * Example generated by -dataex-. To install: ssc install dataex
    clear
    input str2 STATE long DATE byte TYPE
    "AL" 20150101 1
    "AL" 20150102 1
    "AL" 20150430 0
    "AL" 20150115 1
    "KY" 20151226 1
    "KY" 20150402 0
    "KY" 20150117 1
    "OH" 20151019 1
    "OH" 20150912 0
    "OH" 20150216 0
    "OH" 20150216 1
    "OH" 20150830 1
    "TX" 20150227 1
    "TX" 20150303 0
    "TX" 20150913 0
    end

    I'm hoping to end up with a small data set that contains weekly crime rates by state. Something like this:

    HTML Code:
    STATE, WEEK, PROPCRIMERATE, VIOLCRIMERATE, TOTALCRIMERATE
    AL, 1, 0, 1.33, 1.33
    AL, 2, X, X, X
    ...
    AL, 52, X, X, X
    KY, 1, X, X, X
    KY, 2, X, X, X
    ...
    KY, 52, X, X, X
    ...
    So, each state would have 52 rows with an indicator for each week and the property crime, violent crime, and total crime rate for each week.

    Can anyone help out with a way to do this?

    Thanks.

    Owen Gallupe

    Note:
    TYPE (in first example data) is an indicator of the type of crime committed where 0=property crime and 1=violent crime.

    Example of a weekly crime rate calculation - AL first week of 2015:
    For ease of small numbers, let's assume that the population of AL=150.

    AL had 2 separate violent crimes in the first week of the year (occurring on 2015-01-01 and 2015-01-02 - first two rows of the first data set above). Therefore, the violent crime rate for that state for that week (expressed as the number per 100 population) would be (2/150)*100 = 1.33.


  • #2
    Owen,

    Essentially you are -collapsing- (Stata's term for aggregating) your dataset by state and week #. Once that is done, you'll probably want to use -fillin- or tsfill to fill in weeks that have no crimes in that state (so that you have a complete week==1 to 52 for each state). Also, collapse deletes your data as part of the collapse, so you will want to save your data, or rename it to "State_weekly_crime.dta" or something before collapsing.

    Code:
    * Convert DATE to Stata date format
    tostring DATE, gen(date2)
    gen date3 = date(date2, "YMD")
    format date3 %td
    drop date2
    
    * Creating week number (1-52)
    gen week = week(date3)
    label var week "Week number (1-52)"
    
    gen prop_crime = (TYPE == 1)
    gen violent_crime = (TYPE == 0)
    
    * Actually collapsing down
    preserve
    collapse (sum) prop_crime (sum) violent_crime, by(STATE week)
    This gets the data looking like this:
    Code:
    gen total_crime = prop_crime + violent_crime
    . list, sepby(STATE)
    
         +-----------------------------------------------+
         | STATE   week   prop_c~e   violen~e   total_~e |
         |-----------------------------------------------|
      1. |    AL      1          2          0          2 |
      2. |    AL      3          1          0          1 |
      3. |    AL     18          0          1          1 |
         |-----------------------------------------------|
      4. |    KY      3          1          0          1 |
      5. |    KY     14          0          1          1 |
      6. |    KY     52          1          0          1 |
         |-----------------------------------------------|
      7. |    OH      7          1          1          2 |
      8. |    OH     35          1          0          1 |
      9. |    OH     37          0          1          1 |
     10. |    OH     42          1          0          1 |
         |-----------------------------------------------|
     11. |    TX      9          1          1          2 |
     12. |    TX     37          0          1          1 |
         +-----------------------------------------------+
    Where prop_crime gives you the number of property crimes committed that week (ditto for violent_crime). You can then create your crime_rate variables as you mention above.
    Last edited by David Benson; 23 Nov 2018, 02:04.

    Comment


    • #3
      Such a simple, elegant solution! It does everything I need in a far easier way than the series of embedded loops I had envisioned. Thanks, David!

      Comment


      • #4
        I would not use Stata's weeks. They don't correspond to anything used outside of Stata. However, if you tell us that Stata's rules

        1,. Week 1 begins on 1 January, always, and so on.

        2. Except that week 52 always has 8 or 9 days.

        are exactly what you want, then fine. Otherwise, please give a precise definition for weeks as far as you are concerned.

        Comment


        • #5
          Hi Nick,

          I'll need to do some digging to check what the conventions in my field are on this point, but I see two options:

          Option 1:
          Start each week on whatever day falls on Jan 1. For 2015, each "week" would be Thursday to Wednesday since Jan 1 was a Thursday. To avoid any weeks that are not exactly 7 days, I would only have to drop offences committed on Thursday Dec 31. This option could be achieved using the method David suggested (once Dec 31 is dropped).

          Option 2:
          Have each week go from Sunday through Saturday. For 2015, to avoid any weeks that are not 7 days, I would have to drop offences that occurred Thursday Jan 1 through Saturday Jan 3 at the start of the year and Sunday Dec 27 through Thursday Dec 31 at the end of the year. I'm not sure how to achieve this but I would love to learn!

          Option 1 seems more justifiable since it retains more information but I am very open to suggestions.

          Thanks.

          Owen

          Comment


          • #6
            So Nick has a couple of Stata Journal articles about various options for handling weekly data in Stata:
            • Stata tip 68: Week assumptions Link
            • Stata tip 111: More on working with weeks Link

            Comment


            • #7
              Thanks for pointing me to these! Very helpful.

              Owen

              p.s., I should have known there would be a Stata tip on this issue...oops.

              Comment

              Working...
              X