Announcement

Collapse
No announcement yet.
X
  • Filter
  • Time
  • Show
Clear All
new posts

  • How to amalgamate multiple observations for a single patients into one observation

    Hi,
    I am sure this problem has been discussed before, but I just cannot find the solution to my problem. I have a dataset of about 580 patients. Each patient can have many comorbidities attributed to them. Currently I have one variable "Comorbiditydescription" that lists all the co-morbidities for patients. It means that 1 patient is listed multiple times in the dataset as different observations, if they have more than one comorbidity. As below:

    * Example generated by -dataex-. To install: ssc install dataex
    clear
    input str8 PatientUR str37 Comorbiditydescription
    "1026977" "Diabetes mellitus"
    "1026977" "Hypertension requiring medication"
    "1026977" "Other "
    "1026977" "Smoker, current"

    "1034403" "Smoker, current"
    "1034403" "Substance abuse"
    "1064004" "Bleeding disorder"
    "1064004" "Hypertension requiring medication"

    "1080093" ""
    "1086290" "Hypertension requiring medication"
    "1086290" "Other "
    "1086290" "Respiratory Disease (COAD, emphysema)"

    "1089652" "Cancer, Disseminated"
    "1091738" "Other "
    "1091738" "Substance abuse"

    "1101771" "Substance abuse"

    What I would like to do is to have 1 observation per patient ID, i.e. 580 patients, and multiple variables for co-morbidities, such that if a patient had 3 different co-morbidities there would be a variable comorbidity_1, comorbidity_2, comorbidity_3 for that 1 patient. If a patient only had 1 or 2 co-morbidities, then variables co-morbidities_3 onwards would have a 0. I think this will help to tabulate the different co-morbidities easily and make analysis simpler for the other variables I have.

    Is there a code I can write that would be able to take a single variable and split its result across multiple variables against my 580 individual patients? Hope to hear from someone soon, thank you!

  • #2
    Code:
    clear
    input str8 PatientUR str37 Comorbiditydescription
    "1026977" "Diabetes mellitus"
    "1026977" "Hypertension requiring medication"
    "1026977" "Other "
    "1026977" "Smoker, current"
    "1034403" "Smoker, current"
    "1034403" "Substance abuse"
    "1064004" "Bleeding disorder"
    "1064004" "Hypertension requiring medication"
    "1080093" ""
    "1086290" "Hypertension requiring medication"
    "1086290" "Other "
    "1086290" "Respiratory Disease (COAD, emphysema)"
    "1089652" "Cancer, Disseminated"
    "1091738" "Other "
    "1091738" "Substance abuse"
    "1101771" "Substance abuse"
    end
    
    // create a variable that says:
    // this is the first, second, third, etc. comorbidity
    // (in alphabetical order of Comorbiditydescription)
    bys PatientUR (Comorbiditydescription) : gen sortnr = _n  
    
    // not necessary, but shows what the data looks like now
    list , sepby(PatientUR)
    
    // make the data wide
    reshape wide Comorbiditydescription, i(PatientUR) j(sortnr)
    
    // not necessary, shows what the data looks like now
    list
    ---------------------------------
    Maarten L. Buis
    University of Konstanz
    Department of history and sociology
    box 40
    78457 Konstanz
    Germany
    http://www.maartenbuis.nl
    ---------------------------------

    Comment


    • #3
      Thank you, it's working!

      Comment

      Working...
      X