  • Generating new variable from calculated dummy variables

    I'm very new to Stata (switching from SAS) and am trying to write code for what is probably a very simple problem for experienced Stata users. Here goes:
    • I'm using a data set with approximately 12,950 results.
    • I've already calculated 4 dummy variables, which are: VAR1, VAR2, VAR3, and VAR4
    • I want to generate a new variable, VAR5, which will be the result of five possible outcomes. See the following table as an example:
    Possible    Dummy Variables               Desired
    Outcomes    VAR1   VAR2   VAR3   VAR4     VAR5
    1             96     64     32     12       96
    2              .     62     30     13       62
    3              .      .     24     10       24
    4              .      .      .     11       11
    5              .      .      .      .        .
    • What I want in VAR5 is the value from the earliest VAR that is not missing, else "."
    • As Outcome 1 shows, VAR5 = the VAR1 result (96), since it is the earliest. Outcome 2 = the VAR2 result (62), since VAR1 = "."; Outcome 3 = the VAR3 result (24), since VAR1 and VAR2 = "."; Outcome 4 = the VAR4 result (11), since VAR1, VAR2, and VAR3 are all "."; and, finally, Outcome 5 returns "." since VAR1, VAR2, VAR3, and VAR4 are all "."
    I would appreciate any and all assistance.

    Cheers!

  • #2
    See the description of the rowfirst() function in the output of the help egen command.
    Code:
    * Example generated by -dataex-. For more info, type help dataex
    clear
    input byte(var1 var2 var3 var4 wanted)
    96 64 32 12 96
     . 62 30 13 62
     .  . 24 10 24
     .  .  . 11 11
     .  .  .  .  .
    end
    egen var5 = rowfirst(var1-var4)
    list, clean noobs
    Code:
    . list, clean noobs
    
        var1   var2   var3   var4   wanted   var5  
          96     64     32     12       96     96  
           .     62     30     13       62     62  
           .      .     24     10       24     24  
           .      .      .     11       11     11  
           .      .      .      .        .      .
    Last edited by William Lisowski; 09 Jan 2022, 15:01.
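
    For readers who want to see the logic spelled out, the same result can be sketched without egen. This is only an illustration, assuming the var1-var4 and wanted variables from the example above are in memory; the name var5_alt is made up for this sketch.

    ```stata
    * Sketch only: start from var1, then fall back to var2, var3, var4 in order.
    * var5_alt is a hypothetical name, not part of the original example.
    generate var5_alt = var1
    foreach v of varlist var2-var4 {
        * fill in only the rows that are still missing
        replace var5_alt = `v' if missing(var5_alt)
    }
    * check against the desired column from the example data
    assert var5_alt == wanted
    ```

    The loop relies on var1-var4 being stored in that order in the data set, the same assumption the var1-var4 varlist in the egen call makes.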



    • #3
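      Code:
      * Assumes an identifier variable Outcomes that is unique per row.
      * Stack VAR1-VAR4 into a single VAR; with no j() option the suffix is stored in _j.
      reshape long VAR, i(Outcomes)
      
      * Record the original order, then sort nonmissing values first within each row.
      gen obsno = _n
      gen missing = missing(VAR)
      bysort Outcomes (missing obsno) : gen VAR5 = VAR[1]
      
      drop missing obsno
      
      * Return to the original wide layout.
      reshape wide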



      • #4
        Thank you so much, William and Oyvind. All it took was the egen rowfirst() function. A sample of the results is below.
        VAR1   VAR2   VAR3   VAR4   VAR5
          76     53     26      2     76
           .      .      .     20     20
           .     58     38      7     58
           .      .      .      2      2
           .      .      .     16     16
           .      .      .      4      4
           .      .     18      3     18
           .     60     33     18     60
           .      .      .     21     21
           .      .     62     10     62
           .      .     26      3     26
           .      .      .     20     20
           .      .      .      1      1
        Again, thank you!

        -Robert
