  • change from wide to long

    I need to convert this data to long format.


    Code:
    clear
    input float(id treatment dead opyear revised yearofRIP yearof_revised)
     1 1 1 2001 1 2005 2004
     2 0 0 2001 1    . 2007
    54 1 0 2001 0    .    .
    89 1 1 2006 0 2010    .
    45 0 0 2005 1    . 2008
    76 1 1 2009 0 2015    .
    19 0 1 2008 0 2016    .
    45 1 1 2007 0 2020    .
    end
    label values treatment q1
    label def q1 0 "control", modify
    label def q1 1 "treatment", modify
    label values dead q2
    label def q2 0 "alive", modify
    label def q2 1 "dead", modify
    label values revised q3
    label def q3 0 "success", modify
    label def q3 1 "revised", modify
    The plan is for the result to look like this:

    ID - YEAR - STATE (1 if alive if died | revised == 0 , 2 if died == 1, 3 if revised == 1

    I tried this:

    Code:
    reshape long died revised, i(id) j(state)
    Could you please guide me?

  • #2
    I see problems here on different levels.

    1. Identifier 45 occurs twice. I am assuming optimistically that is a typo.

    2. reshape long absolutely requires variable name prefixes (stubs), not complete variable names.

    3. There is no consistency over naming the year variables.

    4. You don't have a variable died in your data example, or even one whose name starts that way.
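
    Point 1 can be checked before any reshape is attempted. The following is a minimal sketch on a three-row toy dataset (made up for illustration, not the data from the question); the variable name dup is an assumption.

    ```stata
    * Toy data in which id 45 occurs twice
    clear
    input float(id opyear)
     1 2001
    45 2005
    45 2007
    end

    * Report how many ids occur more than once
    duplicates report id

    * Flag the offending observations and list them
    duplicates tag id, generate(dup)
    list id opyear if dup > 0

    * isid exits with an error if id does not uniquely identify observations;
    * capture lets the do-file continue and stores the return code in _rc
    capture isid id
    display "isid return code: " _rc
    ```

    If isid reports a problem, reshape long with i(id) will fail for the same reason, so the duplicate needs fixing first.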

    Otherwise this may help. On the other hand, your goal just seems to be the same layout with different variables, so I am confused. I will stick to the thread title.

    Code:
    clear
    input float(id treatment dead opyear revised yearofRIP yearof_revised)
     1 1 1 2001 1 2005 2004
     2 0 0 2001 1    . 2007
    54 1 0 2001 0    .    .
    89 1 1 2006 0 2010    .
    45 0 0 2005 1    . 2008
    76 1 1 2009 0 2015    .
    19 0 1 2008 0 2016    .
    46 1 1 2007 0 2020    .
    end
    label values treatment q1
    label def q1 0 "control", modify
    label def q1 1 "treatment", modify
    label values dead q2
    label def q2 0 "alive", modify
    label def q2 1 "dead", modify
    label values revised q3
    label def q3 0 "success", modify
    label def q3 1 "revised", modify
    
    rename (opyear yearofRIP yearof_revised) (yearop yeardeath yearrevised) 
    
    gen op = 1 
    
    reshape long year, i(id) j(event) string 
    
    sort id year
    
    list, sepby(id) 
    
         +--------------------------------------------------------+
         | id     event   treatment    dead   revised   year   op |
         |--------------------------------------------------------|
      1. |  1        op   treatment    dead   revised   2001    1 |
      2. |  1   revised   treatment    dead   revised   2004    1 |
      3. |  1     death   treatment    dead   revised   2005    1 |
         |--------------------------------------------------------|
      4. |  2        op     control   alive   revised   2001    1 |
      5. |  2   revised     control   alive   revised   2007    1 |
      6. |  2     death     control   alive   revised      .    1 |
         |--------------------------------------------------------|
      7. | 19        op     control    dead   success   2008    1 |
      8. | 19     death     control    dead   success   2016    1 |
      9. | 19   revised     control    dead   success      .    1 |
         |--------------------------------------------------------|
     10. | 45        op     control   alive   revised   2005    1 |
     11. | 45   revised     control   alive   revised   2008    1 |
     12. | 45     death     control   alive   revised      .    1 |
         |--------------------------------------------------------|
     13. | 46        op   treatment    dead   success   2007    1 |
     14. | 46     death   treatment    dead   success   2020    1 |
     15. | 46   revised   treatment    dead   success      .    1 |
         |--------------------------------------------------------|
     16. | 54        op   treatment   alive   success   2001    1 |
     17. | 54     death   treatment   alive   success      .    1 |
     18. | 54   revised   treatment   alive   success      .    1 |
         |--------------------------------------------------------|
     19. | 76        op   treatment    dead   success   2009    1 |
     20. | 76     death   treatment    dead   success   2015    1 |
     21. | 76   revised   treatment    dead   success      .    1 |
         |--------------------------------------------------------|
     22. | 89        op   treatment    dead   success   2006    1 |
     23. | 89     death   treatment    dead   success   2010    1 |
     24. | 89   revised   treatment    dead   success      .    1 |
         +--------------------------------------------------------+
    It may or may not be a good idea to clean up by dropping observations for which year is missing.
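
    If dropping does turn out to suit the next step, it is a single command. A minimal sketch, using only the first two identifiers from the listing above typed in by hand:

    ```stata
    * Long layout for ids 1 and 2, as in the listing above
    clear
    input float id str7 event float year
    1 "op"      2001
    1 "revised" 2004
    1 "death"   2005
    2 "op"      2001
    2 "revised" 2007
    2 "death"   .
    end

    * See how many observations would go before dropping anything
    count if missing(year)

    * Keep only events that have a recorded year
    drop if missing(year)

    list, sepby(id)
    ```

    After this, the absence of a death observation for an identifier is the only record that the person had not died, so whether this is sensible depends on the analysis that follows.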

    Following @Clyde Schechter's usage in this forum, I strongly recommend the term layout in this context rather than format. Format can mean so many things, and is so often used casually, that it is in danger of becoming vacuous: display format, file format, format meaning data structure, format meaning variable or storage types.





    • #3
      Thank you for your kind reply.

      With regard to your comment about dropping the missing observations: for ID 2, observation 6, for example, that person did not die, which is why -year- for event = death is recorded as missing.

      Is your advice
      (1) to drop all missing observations, or
      (2) to give a person who is still alive at the end of the study the last date of the study?

      I think I would choose (1), as opting for (2) would imply that the person still died...



      • #4
        I understand that missing values here mean that the person had not yet died. In other contexts missing could just mean no information.

        Whether you should drop all observations with missing values (watch the wording: an observation is never missing as such) depends on where you want to go next.

        There are many ambiguities over your classification:

        1 if alive if died | revised == 0 , 2 if died == 1, 3 if revised == 1
        Does rule 2 override rule 1? Does rule 3 override rule 1? Do you want this to apply year by year or for the last year with information?
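
        Until those questions are answered, any code can only be a guess. As one possible reading, suppose each observed event year simply carries the state of its own event, with no rule overriding another. A minimal sketch under that assumption follows; the variable name state and its value labels are made up here and are not specified anywhere in the thread.

        ```stata
        * Long layout for ids 1 and 2, typed in by hand from the listing in #2
        clear
        input float id str7 event float year
        1 "op"      2001
        1 "revised" 2004
        1 "death"   2005
        2 "op"      2001
        2 "revised" 2007
        2 "death"   .
        end

        * Assumption: events without a year did not happen, so drop them
        drop if missing(year)

        * Assumption: state is determined by the event recorded in that year
        gen byte state = 1
        replace state = 2 if event == "death"
        replace state = 3 if event == "revised"

        label define state 1 "alive, unrevised" 2 "dead" 3 "revised"
        label values state state

        sort id year
        list id year state, sepby(id)
        ```

        A year-by-year version, or one in which death overrides revision, would need different code.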
