Announcement

Collapse
No announcement yet.
X
  • Filter
  • Time
  • Show
Clear All
new posts

  • How to combine variables from similar variable names?

    I have a dataset with over 700 variables. Each variable has similar names such as "date" and "date_opd" or "treatment_1" or "treatment_opd_1" and some of the variables are available for some id's only (such as "admitdate" are only for id1). The names of the variables are varied, sometimes opd is at the back and sometimes in the middle.

    I would like to combine the data in each variables like "date" and "date_opd" or "treatment_1" and "treatment_opd_1" and create a new variable that has the combined data. Any ideas how to do this properly without having to generating one by one?

    example of my dataset:
    id date group onset clinical_1 treatment_1 treatment_2 admitdate outcome date_opd onset_opd clinical_opd_1 treatment_opd_1 treatment_opd_2 outcome_opd
    1 01012024 1 01012024 1 1 0 02012024 2 . . . . . .
    2 . 2 . . . . . 02022024 02022024 0 1 1 1
    codes I've tried to use to combine the variables:
    gen date_new =.
    replace date_new = date if group == 1
    replace date_new = date_opd if group == 2

    and I tried to loop:
    foreach var of varlist date-outcome {
    replace `var' = `var'_opd if group == 1|group == 2|group == 3
    }

    but the problem is if variable name is not followed by _opd it returns not found like clinical_1_opd not found (because the actual name should be clinical_opd_1) and I don't know how to fix this


    Thank you in advanced.
    Last edited by Bow Duan; 20 Feb 2024, 04:13.

  • #2
    You are not explaining yourself very well or providing an adequate data example. Please refer to FAQ Advice #12 for details on how to use the dataex command to provide data examples. A guess at what you need is the following:

    Code:
    * Example generated by -dataex-. For more info, type help dataex
    clear
    input byte id str8 date byte group str8 onset byte(clinical_1 treatment_1 treatment_2) str8 admitdate byte outcome str8(date_opd onset_opd) byte(clinical_opd_1 treatment_opd_1 treatment_opd_2 outcome_opd)
    1 "01012024" 1 "01012024" 1 1 0 "02012024" 2 "."        "."        . . . .
    2 "."        2 "."        . . . ""         . "02022024" "02022024" 0 1 1 1
    end
    
    order id group admitdate
    rename *_opd_* *[1]_*[2]_2
    rename *_opd *_2
    rename (date - outcome) =_1
    reshape long date_ onset_ clinical_1_ treatment_1_ treatment_2_ outcome_, i(id) j(which)
    rename *_ *
    drop if missing(outcome)
    Res.:

    Code:
    . l, abbrev(12)
    
         +---------------------------------------------------------------------------------------------------------+
         | id   which   group   admitdate       date      onset   clinical_1   treatment_1   treatment_2   outcome |
         |---------------------------------------------------------------------------------------------------------|
      1. |  1       1       1    02012024   01012024   01012024            1             1             0         2 |
      2. |  2       2       2               02022024   02022024            0             1             1         1 |
         +---------------------------------------------------------------------------------------------------------+

    Comment

    Working...
    X