  • summing variables

    Hi Statalisters,
    For the sample data below, I would like to create new variables by adding each male variable to the matching female variable that follows it. In other words, I am trying to replicate the lines of code below across my dataset. However, since the data include many variables, I wonder if there is a smarter way to handle this.

    gen T_X_tot=X_tot_male+X_tot_female, before (X_tot_male)
    gen X_wa=X_wa_male+X_wa_female, before (X_wa_male)

    Thanks,
    NM

    Code:
    * Example generated by -dataex-. For more info, type help dataex
    clear
    input int county double(T_X_tot_pop X_tot_male X_tot_female X_wa_male X_wa_female X_ba_male X_ba_female X_ia_male X_ia_female X_aa_male X_aa_female X_na_male X_na_female X_tom_male X_tom_female X_nhwa_male X_nhwa_female)
    1  16202  2932  2956  2897  2923    5    3  19  20   0   5 0 1  11   4  2887  2915
    1  24877  2635  2767  2496  2522   15    8  61 102  33 107 2 2  28  26  2411  2434
    1 247336 17419 21579 14430 17525 2396 3382  50  42 406 491 1 8 136 131 13925 16761
    1  20875  3318  3338  3259  3299   26    6  11  12   4   9 1 0  17  12  3226  3276
    1 215888 31530 40383 30822 39553  314  331 128 141 138 166 5 9 123 183 30650 39348
    1  54571  4100  5223  3549  4398  503  713  13  32  14  42 0 1  21  37  3517  4352
    1  61773  4802  6177  4077  5214  688  912  16  10   5  13 0 0  16  28  4039  5174
    1  25607  1952  2475  1930  2441    3    3   3   6   5  15 0 0  11  10  1918  2429
    1  34387  2816  3626  2799  3608    3    0   3   4   4   6 1 2   6   6  2742  3541
    1 304204 25766 34429 23102 30991 1853 2456  50  64 585 711 1 1 175 206 22795 30569
    end

  • #2
    I think you want this:
    Code:
    ds *_male
    local stubs `r(varlist)'
    local stubs: subinstr local stubs "_male" "", all
    foreach s of local stubs {
        gen `s' = `s'_male + `s'_female, before(`s'_male)
    }
    Now, in the example data, there are no missing values. With the above code, if the male or female value is missing for some category, the total will be missing. If you would prefer that the created variable hold the one non-missing value in that circumstance, replace the -gen- command with -egen `s' = rowtotal(`s'_*)-. Note that -rowtotal()- treats missing values as zero, so it returns 0 when both values are missing unless you add its -missing- option.
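
    As a rough sketch of that -egen- variant, assuming every *_male variable has a matching *_female variable as in the example data: the loop below lists the two variables explicitly instead of using the wildcard, adds the -missing- option so that a total is missing only when both parts are missing, and uses -order- to put each new variable in front of its male counterpart. Both of those additions are my choices and are not part of the suggestion above.

    ```stata
    * Collect the stubs (X_tot, X_wa, ...) from the *_male variable names
    ds *_male
    local stubs `r(varlist)'
    local stubs: subinstr local stubs "_male" "", all

    foreach s of local stubs {
        * rowtotal() counts a missing value as 0; with the missing option
        * the result is missing only if both male and female are missing
        egen `s' = rowtotal(`s'_male `s'_female), missing
        * move the new total so it sits just before the male variable
        order `s', before(`s'_male)
    }
    ```

    Listing the two variables explicitly also keeps the loop safe to adapt if other variables sharing the same stub are added later.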

    There is another question you should consider: is this wide layout of the data appropriate to your needs, or should you be -reshape-ing it to long?
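
    If a long layout turns out to suit the analysis better, something along these lines might work. It is only a sketch under assumptions: obs_id is a hypothetical identifier created here because county is not unique in the example data, and sex is a name I chose for the new variable that will hold "male" or "female".

    ```stata
    * Hypothetical row identifier; reshape needs i() to identify observations uniquely
    gen long obs_id = _n

    * Build reshape stubs with a trailing underscore: X_tot_ X_wa_ ...
    ds *_male
    local stubs `r(varlist)'
    local stubs: subinstr local stubs "_male" "_", all

    * One row per original observation and sex; sex takes the values "male" and "female"
    reshape long `stubs', i(obs_id) j(sex) string
    ```

    After this, each category is a single variable (X_tot_, X_wa_, and so on) with two rows per original observation, so totals across sex can be computed by obs_id rather than stored in separate variables.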
