  • Create new file based on data in another


    I am stuck on how to create a new file based on the data in another.

    The pseudo-algorithm for how I am trying to get this to work would be roughly:
    - start a new file or frame
    - for each row in the existing data/frame: look for a value 1 in variables pneumo - nausea
    - for each variable in that row that equals 1: copy patid, when, arm, and the variable name or label to the new dataset/frame (this can happen several times per row)
    - when done, save file/frame
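
    The steps above can be followed almost literally with frames. This is only a sketch, assuming Stata 16 or later and the example data from this post already in memory; the frame name issues is a hypothetical choice, and the looping style is an illustration rather than the recommended approach.

    ```stata
    * Sketch only: assumes Stata 16+ and the example data in memory.
    * "issues" is a hypothetical frame name chosen for illustration.
    frame create issues long patid str7 when float arm str32 issue

    * Walk every observation and every indicator variable in turn.
    forvalues i = 1/`=_N' {
        foreach v of varlist pneumo-nausea {
            * Post one row to the new frame for each indicator equal to 1.
            if `v'[`i'] == 1 {
                frame post issues (patid[`i']) (when[`i']) (arm[`i']) ("`v'")
            }
        }
    }

    * Inspect the result without leaving the current frame.
    frame issues: list, sepby(patid)
    ```

    Rows are posted in the order the observations and variables are visited, so the new frame follows the sort order of the original data. An explicit loop over observations is usually slower than a reshape on large datasets.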


    Given the sample data below, the final result should look like:

    patid when arm issue
    2 scr 1 pneumo
    2 scr 1 colitis
    5 scr 1 pneumo
    1 wk1 1 diarr
    1 wk1 1 colitis
    6 wk1 2 nausea

    While this would be a fairly easy task in a standard programming language, I am stumped as to how to do it in Stata.

    I looked at trying to use frames... or replacing the 1's with the string of value label and then collapsing... but not sure this is the right approach.

    Any hints on how to do this would be appreciated.



    Code:
    * Example generated by -dataex-. For more info, type help dataex
    clear
    input long patid str7 when float arm byte(pneumo diarr colitis muco nausea)
    1 "scr" 1 0 0 0 0 0
    2 "scr" 1 1 0 1 0 0
    3 "scr" 1 0 0 0 0 0
    4 "scr" 1 0 0 0 0 0
    5 "scr" 1 1 0 0 0 0
    6 "scr" 1 0 0 0 0 0
    1 "wk1" 1 0 1 1 0 0
    2 "wk1" 1 0 0 0 0 0
    3 "wk1" 1 0 0 0 0 0
    4 "wk1" 2 0 0 0 0 0
    5 "wk1" 2 0 0 0 0 0
    6 "wk1" 2 0 0 0 0 1
    end
    label values pneumo pneumo6_
    label values diarr diarr6_
    label values colitis colitis6_
    label values muco muco6_
    label values nausea nausea6_





  • #2
    Code:
    * Example generated by -dataex-. For more info, type help dataex
    clear
    input long patid str7 when float arm byte(pneumo diarr colitis muco nausea)
    1 "scr" 1 0 0 0 0 0
    2 "scr" 1 1 0 1 0 0
    3 "scr" 1 0 0 0 0 0
    4 "scr" 1 0 0 0 0 0
    5 "scr" 1 1 0 0 0 0
    6 "scr" 1 0 0 0 0 0
    1 "wk1" 1 0 1 1 0 0
    2 "wk1" 1 0 0 0 0 0
    3 "wk1" 1 0 0 0 0 0
    4 "wk1" 2 0 0 0 0 0
    5 "wk1" 2 0 0 0 0 0
    6 "wk1" 2 0 0 0 0 1
    end
    label values pneumo pneumo6_
    label values diarr diarr6_
    label values colitis colitis6_
    label values muco muco6_
    label values nausea nausea6_
    
    egen wanted = anymatch(pneumo - nausea), values(1)
    keep if wanted
    
    rename (pneumo-nausea) whatever= 
    reshape long whatever, i(patid when arm) j(issue) string 
    keep if whatever 
    drop whatever wanted 
    
    list , sepby(patid)
    
         +------------------------------+
         | patid   when   arm     issue |
         |------------------------------|
      1. |     1    wk1     1   colitis |
      2. |     1    wk1     1     diarr |
         |------------------------------|
      3. |     2    scr     1   colitis |
      4. |     2    scr     1    pneumo |
         |------------------------------|
      5. |     5    scr     1    pneumo |
         |------------------------------|
      6. |     6    wk1     2    nausea |
         +------------------------------+
    
    .



    • #3
      That works perfectly. Thanks loads. It turned out to be easier to accomplish in Stata in the end! Amazing!



      • #4
        If you have time, could I add one small question: how can the code be changed to keep the matching data value?

        For example, if the values in pneumo - nausea were values in the range 1 - 5 ... is there a way to edit the reshape to include the value as well? So the data would have the addition of the val column:
        patid when arm issue val
        2 scr 1 pneumo 1
        2 scr 1 colitis 2
        5 scr 1 pneumo 1
        1 wk1 1 diarr 2
        1 wk1 1 colitis 1
        6 wk1 2 nausea 4


        For example, instead of:
        reshape long whatever, i(patid when arm) j(issue) string

        Something like:
        reshape long whatever, i(patid when arm `value') j(issue) string

        The trick is how to get the `value' into this i() list. Can it reference what is being reshaped, with something like `var'[i]?


        Code:
        * Example generated by -dataex-. For more info, type help dataex
        clear
        input byte patid str3 when byte(arm pneumo diarr colitis muco nausea)
        1 "scr" 1 0 0 0 0 0
        2 "scr" 1 1 0 2 0 0
        3 "scr" 1 0 0 0 0 0
        4 "scr" 1 0 0 0 0 0
        5 "scr" 1 1 0 0 0 0
        6 "scr" 1 0 0 0 0 0
        1 "wk1" 1 0 2 1 0 0
        2 "wk1" 1 0 0 0 0 0
        3 "wk1" 1 0 0 0 0 0
        4 "wk1" 2 0 0 0 0 0
        5 "wk1" 2 0 0 0 0 0
        6 "wk1" 2 0 0 0 0 4
        end
        I have already changed this line to match the new values:
        egen wanted = anymatch(pneumo - nausea), values(1,2,3,4,5)

        Thanks for any suggestions or pointers!
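
        One possible way to keep the value, sketched here under the assumption that the example data from this post are in memory: after reshape long, the stub variable itself already holds the original value for each issue, so it does not need to go into i(). The stub name val is a hypothetical choice that matches the column name in the desired output.

        ```stata
        * Sketch only: assumes the example data from this post are in memory.
        * "val" is a hypothetical stub name; any unused name would do.
        rename (pneumo-nausea) val=

        * After reshaping, val holds the original value and issue the variable name.
        reshape long val, i(patid when arm) j(issue) string

        * Keep only the rows where an issue was actually recorded.
        keep if inrange(val, 1, 5)

        list, sepby(patid)
        ```

        Because inrange() returns 0 for missing values, rows with missing data are dropped along with the zeros, and the separate egen anymatch() step is not needed for this version.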

