Announcement

Collapse
No announcement yet.
X
  • Filter
  • Time
  • Show
Clear All
new posts

  • Loop for sum with prefix-suffix variables

    Hello,

    I work with almost 2*80 variables with a prefix-suffix format like: A_CH1211, A_CH1212, A_CH1213,... and B_CH1211, B_CH1212, B_CH1213...
    I need to create new variables which is the sum of A_var + B_var
    Example: AB_CH1211= A_CH1211 + B_CH1211

    I have tried several loops, read many docs, tried again but failed (and, at this point, see no solution).

    For example, I tried these loops:

    local varlist B_* A_*
    foreach `var' of local varlist {
    egen AB_`var'= sum(A_`var' + B_`var')
    }

    or

    foreach var of varlist B_CH1211-A_CH2424{
    gen AB_`var'= A_`var' + B_`var'
    }

    or worst:
    foreach x in B_CH1211- A_CH2424 {
    foreach y in B_CH1211-A_CH2424 {
    egen AB_`y'= sum(`x'`y' + `x'`y')
    }
    }


    Someone to help me ?
    Many thanks!
    Last edited by Anne Fromont; 03 Mar 2021, 05:06.

  • #2
    You can extract prefixes and suffixes. It is hard to suggest the most efficient solution for your case without seeing a sample of your variable names, but here is some technique.

    Code:
    * Example generated by -dataex-. To install: ssc install dataex
    clear
    input float(A_123 var2 B_123 var4 A_2020 var6 B_2020)
    1 2 3 4 5 2 7
    end
    
    qui ds
    local suffixes
    foreach v in `r(varlist)'{
        local suffix=cond(ustrregexm("`v'", "\w+\_\w+"), ustrregexra("`v'", "(\w+)\_(\w+)", "$2"), "")
        local suffixes "`suffixes' `suffix'"
    }
    local suffixes: list uniq suffixes
    foreach suffix of local suffixes{
        cap gen AB_`suffix'= A_`suffix'+B_`suffix'
    }
    Of course you could also have prefixes saved so that you do not manually define the variables to generate at the end.

    Res.:

    Code:
    . l
    
         +-------------------------------------------------------------------------+
         | A_123   var2   B_123   var4   A_2020   var6   B_2020   AB_123   AB_2020 |
         |-------------------------------------------------------------------------|
      1. |     1      2       3      4        5      2        7        4        12 |
         +-----------------------------------------------------------------------

    Comment


    • #3
      Thank you for your prompt response.
      Here my 80*2 variables

      ds A_CH*
      A_CH1211 A_CH1224 A_CH1321 A_CH1423 A_CH2221 A_CH2312 A_CH2332
      A_CH1212 A_CH1225 A_CH1322 A_CH1431 A_CH2222 A_CH2313 A_CH2411
      A_CH1213 A_CH1231 A_CH1323 A_CH1432 A_CH2223 A_CH2314 A_CH2412
      A_CH1214 A_CH1232 A_CH1331 A_CH1433 A_CH2224 A_CH2321 A_CH2413
      A_CH1215 A_CH1233 A_CH1332 A_CH1434 A_CH2231 A_CH2322 A_CH2421
      A_CH1216 A_CH1234 A_CH1333 A_CH2211 A_CH2232 A_CH2323 A_CH2422
      A_CH1217 A_CH1311 A_CH1411 A_CH2212 A_CH2233 A_CH2324 A_CH2423
      A_CH1218 A_CH1312 A_CH1412 A_CH2213 A_CH2234 A_CH2325 A_CH2424
      A_CH1221 A_CH1313 A_CH1413 A_CH2214 A_CH2235 A_CH2326 A_CH1421
      A_CH1222 A_CH1314 A_CH1414 A_CH2215 A_CH2236 A_CH2327
      A_CH1223 A_CH1315 A_CH1422 A_CH2216 A_CH2311 A_CH2331

      ds B_CH*
      B_CH1211 B_CH1224 B_CH1321 B_CH1423 B_CH2221 B_CH2312 B_CH2332
      B_CH1212 B_CH1225 B_CH1322 B_CH1431 B_CH2222 B_CH2313 B_CH2411
      B_CH1213 B_CH1231 B_CH1323 B_CH1432 B_CH2223 B_CH2314 B_CH2412
      B_CH1214 B_CH1232 B_CH1331 B_CH1433 B_CH2224 B_CH2321 B_CH2413
      B_CH1215 B_CH1233 B_CH1332 B_CH1434 B_CH2231 B_CH2322 B_CH2421
      B_CH1216 B_CH1234 B_CH1333 B_CH2211 B_CH2232 B_CH2323 B_CH2422
      B_CH1217 B_CH1311 B_CH1411 B_CH2212 B_CH2233 B_CH2324 B_CH2423
      B_CH1218 B_CH1312 B_CH1412 B_CH2213 B_CH2234 B_CH2325 B_CH2424
      B_CH1221 B_CH1313 B_CH1413 B_CH2214 B_CH2235 B_CH2326
      B_CH1222 B_CH1314 B_CH1414 B_CH2215 B_CH2236 B_CH2327
      B_CH1223 B_CH1315 B_CH1422 B_CH2216 B_CH2311 B_CH2331

      I don't understand many of your commands (e.g.:
      cond(ustrregexm("`v'", "\w+\_\w+"), ustrregexra("`v'", "(\w+)\_(\w+)", "$2"), "") and don't know how to adapt them to my situation.

      Could you explain me ?

      Thank you!

      Comment


      • #4
        Alternatively (assuming that for each A_CH there is a matching B_CH):

        Code:
        ds A_CH*
        local nums = subinstr("`r(varlist)'","A_CH","",.)
        foreach num of local nums{
            egen AB_CH`num'=rowtotal(*CH`num')
            order A_CH`num' B_CH`num' AB_CH`num' //for display purposes only
        }
        
        . list in 1/5,noobs
        
          +-------------------------------------------------+
          | A_CH2   B_CH2   AB_CH2   A_CH1   B_CH1   AB_CH1 |
          |-------------------------------------------------|
          |     1       3        4       4       3        7 |
          |     3       1        4       5       5       10 |
          |     1       5        6       2       3        5 |
          |     1       1        2       1       2        3 |
          |     2       5        7       4       4        8 |
          +-------------------------------------------------+
        Last edited by Ali Atia; 03 Mar 2021, 07:33.

        Comment


        • #5
          oh joy! this version works wonders! You saved my day! Thank you so much!

          Comment


          • #6
            I don't understand many of your commands (e.g.:
            cond(ustrregexm("`v'", "\w+\_\w+"), ustrregexra("`v'", "(\w+)\_(\w+)", "$2"), "") and don't know how to adapt them to my situation.

            Could you explain me ?
            This is just extracting the suffix given that the variable name has the pattern [(word/number)_(word/number)]. Otherwise, add nothing to the local. See

            Code:
            help cond()
            In addition to Ali Atia's nice code in #4, given your variable names, you can create your variables directly:

            Code:
            * Example generated by -dataex-. To install: ssc install dataex
            clear
            input float(A_CH1211 A_CH1224 var2 var4 B_CH1211  var6 B_CH1224)
            1 2 3 4 5 2 7
            end
            
            foreach bvar of varlist B_CH*{
                cap gen A`bvar'= `=subinstr("`bvar'", "B", "A", 1)'+`bvar'
            }

            Res.:

            Code:
            . l *_*
            
                 +-----------------------------------------------------------------+
                 | A_CH1211   A_CH1224   B_CH1211   B_CH1224   AB_~1211   AB_~1224 |
                 |-----------------------------------------------------------------|
              1. |        1          2          5          7          6          9 |
                 +-----------------------------------------------------------------+
            Last edited by Andrew Musau; 03 Mar 2021, 09:38.

            Comment


            • #7
              Wonderful!
              Thank you very much for these explanations. It not only help me to get the desired results but also to progress!
              The user stata community is full of resources.

              Comment

              Working...
              X