  • Counting Number of Times a Code Takes A Different Value

    Hello,

    I have been working with Stata for my dissertation and I consider myself a beginner. I am using Stata MP/17.0. I am attempting to create a new variable (tot_chg_code) that counts, for each ID, the number of times code takes a different value from the previous observation. The variables of interest are ID, code, and date; date has format %td. My dataset has 30,634 unique IDs and 222,381 observations. There are more than 99 unique values for code. Dates range over 30 years. I have searched the list and several help files, but I am not sure how to achieve what I am looking for. I know I need to sort the data by ID and date before I attempt to calculate tot_chg_code. I have created a simple example of my data, shown in the dataex output below.


    Code:
    * Example generated by -dataex-. For more info, type help dataex
    clear
    input float(ID code date)
    1 19 11868
    2 29 14020
    2 39 15063
    2 49 16245
    2 49 16265
    2 59 16289
    2 59 16845
    2 69 16888
    2 59 17691
    2 49 17701
    2 59 17735
    end
    format %td date

    Once tot_chg_code is calculated, I will collapse the data by ID. I would like to end up with:

    ID tot_chg_code
    1 0
    2 7

    Does anyone have suggestions on code I might try? Thank you.

  • #2
    Code:
    * Example generated by -dataex-. For more info, type help dataex
    clear
    input float(ID code date)
    1 19 11868
    2 29 14020
    2 39 15063
    2 49 16245
    2 49 16265
    2 59 16289
    2 59 16845
    2 69 16888
    2 59 17691
    2 49 17701
    2 59 17735
    end
    format %td date
    
    by ID (date), sort: gen wanted = sum(code != code[_n-1])
    by ID (date): replace wanted = wanted[_N] - 1

    To get to your final result, use -collapse- or -by ID: keep if _n == 1-.
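
For readers who want the whole pipeline in one place, the sketch below strings the two lines from #2 together with the final reduction to one row per ID. It is an illustration under assumptions, not a tested production script: the variable is named tot_chg_code (the name requested in #1) rather than wanted, code is assumed to be nonmissing in the first observation of each ID, and -collapse- with the (first) statistic is only one of the two options suggested in #2.

```stata
* Sketch only: rebuild the example data posted in the thread.
clear
input float(ID code date)
1 19 11868
2 29 14020
2 39 15063
2 49 16245
2 49 16265
2 59 16289
2 59 16845
2 69 16888
2 59 17691
2 49 17701
2 59 17735
end
format %td date

* Running count, within ID in date order, of rows whose code differs from
* the previous row. The first row of each ID is always counted, because
* code[_n-1] is missing there (assumes code itself is not missing).
by ID (date), sort: gen tot_chg_code = sum(code != code[_n-1])

* Copy the group total to every row and subtract the first-row count.
by ID (date): replace tot_chg_code = tot_chg_code[_N] - 1

* tot_chg_code is now constant within ID, so keep one row per ID.
* Alternative from #2: by ID: keep if _n == 1
collapse (first) tot_chg_code, by(ID)

list, noobs
```

On the example data this should reproduce the table requested in #1 (0 for ID 1 and 7 for ID 2). One caution: if an ID has two observations on the same date with different codes, the order within that date is not determined by -by ID (date), sort-, so the count could vary from run to run; adding a tie-breaking variable to the sort would be needed in that case.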

    • #3
      Thank you, Clyde Schechter. This works great!
