  • New variable which combines the values of other variables together

    Hello, I have the following dataset,

    Code:
    * Example generated by -dataex-. For more info, type help dataex
    clear
    input byte tinh long macs byte capso
    1   1 1
    1   2 1
    1   3 1
    1   4 1
    1   5 1
    1   6 1
    1   7 1
    1   8 1
    1   9 1
    1  10 1
    1  11 1
    1  12 1
    1  13 1
    1  14 1
    1  16 1
    1  17 1
    1  18 1
    1  19 1
    1  20 1
    1  21 1
    1  23 1
    1  24 1
    1  25 1
    1  26 1
    1  27 1
    1  28 1
    1  29 1
    1  30 1
    1  31 1
    1  33 1
    1  34 1
    1  35 1
    1  36 1
    1  38 1
    1  39 1
    1  40 1
    1  41 1
    1  42 1
    1  43 1
    1  44 1
    1  45 1
    1  46 1
    1  47 1
    1  48 1
    1  49 1
    1  50 1
    1  51 1
    1  52 1
    1  53 1
    1  54 1
    1  56 1
    1  57 1
    1  58 1
    1  59 1
    1  60 1
    1  61 1
    1  62 1
    1  63 1
    1  64 1
    1  65 1
    1  67 1
    1  69 1
    1  70 1
    1  71 1
    1  72 1
    1  73 1
    1  74 1
    1  75 1
    1  76 1
    1  77 1
    1  78 1
    1  79 1
    1  81 1
    1  82 1
    1  83 1
    1  84 1
    1  85 1
    1  86 1
    1  87 1
    1  89 1
    1  90 1
    1  91 1
    1  92 1
    1  93 1
    1  94 1
    1  95 1
    1  96 1
    1  97 1
    1  98 1
    1  99 1
    1 100 1
    1 101 1
    1 103 1
    1 104 1
    1 105 1
    1 106 1
    1 107 1
    1 108 1
    1 109 1
    1 110 1
    end
    In this dataset the combination of the three variables above (tinh, macs, capso) uniquely identifies the observations. However, to make later coding easier, I want to create a new variable called, say, 'id', which combines those variables in a consistent order: 'tinh', then 'macs', then 'capso'. So the new variable would take the value 111 for the first entry above and 11101 for the last entry. Hopefully that's enough to understand what I'm trying to do.

    Any ideas how to implement this?

    Thanks,
    Jad

  • #2
    try this:
    Code:
    gen str5 id=(strofreal(tinh)+strofreal(macs)+strofreal(capso))



    • #3
      or

      Code:
      egen id = concat(tinh macs capso)
      is an alternative. Adding spaces would ensure that e.g. 1 110 1 and 11 10 1 don't map to the same identifier.
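The ambiguity mentioned above can be sketched on a small made-up dataset (the three rows below are illustrative, not from the original post). This assumes the punct() option of egen's concat() function and, for the padded variant, that macs never exceeds 3 digits and capso never exceeds 1 digit:

```stata
clear
input byte tinh long macs byte capso
 1   1 1
 1 110 1
11  10 1
end

* Plain concatenation: rows 2 and 3 both become "11101"
egen id_plain = concat(tinh macs capso)

* A space between the parts keeps the rows distinct
egen id_sep = concat(tinh macs capso), punct(" ")

* Alternative: zero-pad macs to a fixed width of 3 digits
* (assumes macs <= 999 and capso <= 9)
gen id_pad = strofreal(tinh) + strofreal(macs, "%03.0f") + strofreal(capso)

* Report how often each identifier value is repeated
duplicates report id_plain
duplicates report id_sep
duplicates report id_pad

list, clean
```

Either the separator or the fixed-width padding should remove the clash; which one is more convenient depends on whether the identifier will later be split back into its parts.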

      See also https://journals.sagepub.com/doi/pdf...867X0800700407



      • #4
        For some purposes, it is better to have numeric id variables. For instance, if you want id-level fixed effects and thus anticipate wanting to use Stata's automated factor variable capabilities -- something like regress y x i.id

        If so, you might instead want

        Code:
        gen id = tinh * 10000 + macs * 10 + capso
        This assumes that macs has no more than 3 digits and capso no more than 1 digit. If either can have more digits, increase the multipliers accordingly.
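A minimal sketch of the numeric approach with its assumptions checked explicitly, using two rows taken from the example data. Storing the result as long is my addition rather than part of the reply above: generate defaults to float, which holds integers exactly only up to 16,777,216, so larger identifiers could be rounded.

```stata
clear
input byte tinh long macs byte capso
1   1 1
1 110 1
end

* Stop early if the digit assumptions behind the multipliers do not hold
assert inrange(macs, 0, 999) & inrange(capso, 0, 9)

* long stores integers exactly well beyond the range of the default float
gen long id = tinh * 10000 + macs * 10 + capso
format id %12.0f

* Confirm that id uniquely identifies the observations
isid id

list, clean
```

Note that the first row gets 10011 rather than 111, because macs always occupies three digit positions. That fixed width is what keeps the identifier unambiguous.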
