Announcement

Collapse
No announcement yet.
X
  • Filter
  • Time
  • Show
Clear All
new posts

  • Creating new variable from old variable for each unit in panel data setting

    Hi:

    I have a dataset that looks like this:

    Code:
    * Example generated by -dataex-. For more info, type help dataex
    clear
    input str6 DUID double PID str8 DUPERSID str4 ICD10CDX
    "10001" 101 "10001101" "J00"
    "10001" 101 "10001101" "M76"
    "10001" 101 "10001101" "H52"
    "10001" 101 "10001101" "J32"
    "10001" 101 "10001101" "-9" 
    "10001" 102 "10001102" "J00"
    "10001" 102 "10001102" "M54"
    end
    DUID is dwelling unit; PID is person ID within each dwelling unit; DUPERSID is unique person identifier created by merging the first two ids. The unit of observation is a medical condition denoted by ICD10CDX. So the the first 5 observations are for one individual for 5 different conditions.

    I want to create a new variable say X from the variable ICD10CDX such that if if ICD10CDX=="J00"|ICD10CDX=="J32" for a given individual, then X=1. I will eventually reshape the data from long to wide format so that there is one observation per person and I want the X value for that person ( so if he had either condition J00 or J32, his X value is 1). I know I could reshape the dataset first and then create X but there are upto 70 different conditions and it will take a lot of typing.

  • #2
    I'm not sure what you mean when you say "there are up to 70 different conditions." If you mean there are up to 70 different values of the ICD10CDX variable, that is not a problem. If you mean that there are 70 different conditions that you want to trigger X = 1, that is a different matter altogether. The following works for the two conditions J00 and J32 you mentioned in #1. It would scale for up to 9 conditions, but no farther:
    Code:
    by DUID PID, sort: egen X = max(inlist(ICD10CDX, "J00", "J32"))
    If your actual application involves more than 9 codes for which you want X = 1, post back and I'll describe a different approach.

    Comment


    • #3
      Hi Clyde:

      I have 20 ICD codes that will trigger X=1.

      Comment


      • #4
        OK, so first you need to create a new Stata data set that contains a single variable, ICD10CDX, and 20 observations: one for each of those 20 ICD codes. You can do that in the Data Editor, or you can do it in a spreadsheet and then -import excel- to get it into Stata. Save it as a .dta file. Let's call it trigger_codes.dta. Then you do this
        Code:
        use data_set_from_#1_in_this_thread, clear
        merge m:1 ICD10CDX using trigger_codes, keep(match master)
        by DUID PID, sort: egen X = max(_merge == 3)

        Comment


        • #5
          Thanks, Clyde! Quick and easy.

          Comment

          Working...
          X