  • Loops within loops

    Dear Stata Community,

    I am struggling to find the right code for a problem involving nested loops. Please consider the example dataset below:


    Code:
    * Example generated by -dataex-. For more info, type help dataex
    clear
    input str25 NAME_2 float cohort int year byte sex float prop_single
    "Alfred Nzo" 4 2001 1  .4871481
    "Alfred Nzo" 5 2001 1  .3548796
    "Alfred Nzo" 7 2001 1  .2006579
    "Alfred Nzo" 1 2001 1  .9841115
    "Alfred Nzo" 2 2001 1  .9167702
    "Alfred Nzo" 8 2001 1  .1625709
    "Alfred Nzo" 6 2001 1 .25470334
    "Alfred Nzo" 3 2001 1   .733465
    "Alfred Nzo" 6 2001 2  .1979346
    "Alfred Nzo" 1 2001 2   .912183
    "Alfred Nzo" 3 2001 2  .4653061
    "Alfred Nzo" 5 2001 2   .231982
    "Alfred Nzo" 7 2001 2 .14970645
    "Alfred Nzo" 4 2001 2  .3099274
    "Alfred Nzo" 8 2001 2 .12313003
    "Alfred Nzo" 2 2001 2  .7007078
    "Alfred Nzo" 8 2007 1 .10852713
    "Alfred Nzo" 7 2007 1 .13636364
    "Alfred Nzo" 3 2007 1  .7304348
    "Alfred Nzo" 6 2007 1 .25581396
    "Alfred Nzo" 5 2007 1  .3668639
    "Alfred Nzo" 4 2007 1  .5966851
    "Alfred Nzo" 1 2007 1  .9906977
    "Alfred Nzo" 2 2007 1  .8783383
    "Alfred Nzo" 2 2007 2  .7058824
    "Alfred Nzo" 3 2007 2 .56578946
    "Alfred Nzo" 5 2007 2  .3176895
    "Alfred Nzo" 7 2007 2 .22307692
    "Alfred Nzo" 4 2007 2  .3769968
    "Alfred Nzo" 6 2007 2 .30620155
    "Alfred Nzo" 8 2007 2 .15086207
    "Alfred Nzo" 1 2007 2    .91133
    "Alfred Nzo" 3 2011 1  .7712138
    "Alfred Nzo" 7 2011 1 .27651966
    "Alfred Nzo" 5 2011 1  .4576594
    "Alfred Nzo" 6 2011 1  .3560865
    "Alfred Nzo" 2 2011 1  .9244756
    "Alfred Nzo" 1 2011 1  .9647782
    "Alfred Nzo" 4 2011 1  .6136534
    "Alfred Nzo" 8 2011 1  .1868743
    "Alfred Nzo" 6 2011 2 .26902887
    "Alfred Nzo" 1 2011 2  .9275892
    "Alfred Nzo" 4 2011 2  .4600991
    "Alfred Nzo" 5 2011 2  .3561721
    "Alfred Nzo" 8 2011 2 .19050895
    "Alfred Nzo" 7 2011 2 .21895006
    "Alfred Nzo" 2 2011 2  .7673267
    "Alfred Nzo" 3 2011 2  .6037222
    "Alfred Nzo" 8 2016 1 .25346786
    "Alfred Nzo" 5 2016 1  .6180733
    "Alfred Nzo" 1 2016 1  .9944791
    "Alfred Nzo" 6 2016 1  .4975174
    "Alfred Nzo" 7 2016 1  .3585608
    "Alfred Nzo" 4 2016 1   .750934
    "Alfred Nzo" 2 2016 1   .969993
    "Alfred Nzo" 3 2016 1  .8779677
    "Alfred Nzo" 1 2016 2   .958457
    "Alfred Nzo" 8 2016 2 .19960213
    "Alfred Nzo" 4 2016 2  .5736559
    "Alfred Nzo" 3 2016 2   .706066
    "Alfred Nzo" 2 2016 2  .8428621
    "Alfred Nzo" 6 2016 2  .3318556
    "Alfred Nzo" 5 2016 2 .46064675
    "Alfred Nzo" 7 2016 2 .25751367
    "Amajuba"    7 2001 1 .24217688
    "Amajuba"    2 2001 1  .9649758
    "Amajuba"    4 2001 1  .6699314
    "Amajuba"    3 2001 1  .8415385
    "Amajuba"    8 2001 1  .1866197
    "Amajuba"    6 2001 1  .3270677
    "Amajuba"    5 2001 1  .4948689
    "Amajuba"    1 2001 1  .9886312
    "Amajuba"    2 2001 2  .9235424
    "Amajuba"    4 2001 2  .5979544
    "Amajuba"    3 2001 2  .7669322
    "Amajuba"    7 2001 2 .29088914
    "Amajuba"    8 2001 2 .23188406
    "Amajuba"    1 2001 2  .9777778
    "Amajuba"    5 2001 2  .4419841
    "Amajuba"    6 2001 2 .33955225
    "Amajuba"    1 2011 1  .9821935
    "Amajuba"    5 2011 1  .5980952
    "Amajuba"    2 2011 1  .9570273
    "Amajuba"    7 2011 1  .3751743
    "Amajuba"    8 2011 1 .29491526
    "Amajuba"    6 2011 1 .51566577
    "Amajuba"    4 2011 1  .7557677
    "Amajuba"    3 2011 1  .8693803
    "Amajuba"    3 2011 2  .8124658
    "Amajuba"    6 2011 2   .483559
    "Amajuba"    4 2011 2  .6828551
    "Amajuba"    7 2011 2  .4224806
    "Amajuba"    2 2011 2   .907056
    "Amajuba"    8 2011 2  .3364681
    "Amajuba"    1 2011 2   .968595
    "Amajuba"    5 2011 2  .5971643
    "Amajuba"    3 2016 1  .9459636
    "Amajuba"    6 2016 1  .6269369
    "Amajuba"    2 2016 1  .9828022
    "Amajuba"    4 2016 1  .8735806
    end
    label values cohort cohort
    label def cohort 1 "15-19", modify
    label def cohort 2 "20-24", modify
    label def cohort 3 "25-29", modify
    label def cohort 4 "30-34", modify
    label def cohort 5 "35-39", modify
    label def cohort 6 "40-44", modify
    label def cohort 7 "45-49", modify
    label def cohort 8 "50-54", modify
    label values year YEAR
    label def YEAR 2001 "2001", modify
    label def YEAR 2007 "2007", modify
    label def YEAR 2011 "2011", modify
    label def YEAR 2016 "2016", modify
    label values sex SEX
    label def SEX 1 "male", modify
    label def SEX 2 "female", modify
    These data show the proportion of single persons for each municipality (NAME_2), age group (cohort), sex, and year. I need to do something that is (perhaps deceptively) simple: compute the average of the proportion single in cohort 7 and cohort 8 by municipality, year, and sex, that is, B = (Cohort_7_single + Cohort_8_single)/2.

    I have written the following loop to do this over year and sex:


    Code:
    gen B = .

    levelsof year, local(year)

    foreach var of local year {
        forvalues i = 1(1)2 {
            gen prop_single_7 = prop_single if cohort == 7
            gen prop_single_8 = prop_single if cohort == 8
            sum prop_single_7 if year == `var' & sex == `i'
            gen B1 = `r(mean)'
            sum prop_single_8 if year == `var' & sex == `i'
            gen B2 = `r(mean)'
            replace B = (B1+B2)/2 if year == `var' & sex == `i'
            drop B1 B2 prop_single_7 prop_single_8
        }
    }

    I have also tried adding another loop over NAME_2, but unfortunately without much success:

    Code:
    levelsof NAME_2, local(name)

    foreach place of local name {
        levelsof year, local(year)
        foreach var of local year {
            forvalues i = 1(1)2 {
                gen prop_single_7 = prop_single if cohort == 7
                gen prop_single_8 = prop_single if cohort == 8
                sum prop_single_7 if year == `var' & sex == `i' & NAME_2 == "`place'"
                gen B1 = `r(mean)'
                sum prop_single_8 if year == `var' & sex == `i' & NAME_2 == "`place'"
                gen B2 = `r(mean)'
                replace B = (B1+B2)/2 if year == `var' & sex == `i' & NAME_2 == "`place'"
                drop B1 B2 prop_single_7 prop_single_8
            }
        }
    }

    I get the 'invalid syntax' message after prop_single_7 and prop_single_8 have been generated, which suggests that the issue lies with summarizing prop_single_7 and then assigning its mean to the variable B1.

    Does anyone know how I can approach this issue going forward? Also, is there perhaps a more elegant solution that I could employ?

    Thank you so much for your kind support!

    Kind regards
    Christiaan

  • #2
    My guess is that the invalid syntax message arises as an indirect effect of the first space met in each name, so with your approach you need to use compound double quotes.

    I didn't test that, because your code can I think be rewritten without any loops whatsoever.
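For anyone who does want to test the loop version, here is a minimal sketch on six rows copied from the example above. It uses compound double quotes around `place', as suggested, and it also guards against a second possible cause, which is an assumption and not something confirmed in this thread: when summarize finds no matching observations, r(mean) is not set, so gen B1 = `r(mean)' would expand to gen B1 = with nothing after the equals sign. In the posted data Amajuba has no rows for 2007, so that combination would hit this case. The scalar name m7 is made up for illustration.

```stata
* Six rows copied from the posted example; Amajuba has no 2007 rows.
clear
input str25 NAME_2 float cohort int year byte sex float prop_single
"Alfred Nzo" 7 2001 1 .2006579
"Alfred Nzo" 8 2001 1 .1625709
"Alfred Nzo" 7 2007 1 .13636364
"Alfred Nzo" 8 2007 1 .10852713
"Amajuba"    7 2001 1 .24217688
"Amajuba"    8 2001 1 .1866197
end

gen B = .
levelsof NAME_2, local(names)
levelsof year, local(years)

foreach place of local names {
    foreach y of local years {
        forvalues s = 1/2 {
            * Mean for cohort 7 in this municipality, year and sex
            quietly summarize prop_single if cohort == 7 & year == `y' & sex == `s' & NAME_2 == `"`place'"'
            * Skip empty combinations: r(mean) would be undefined here
            if r(N) == 0 {
                continue
            }
            scalar m7 = r(mean)

            * Mean for cohort 8 in the same group
            quietly summarize prop_single if cohort == 8 & year == `y' & sex == `s' & NAME_2 == `"`place'"'
            if r(N) == 0 {
                continue
            }
            replace B = (m7 + r(mean)) / 2 if year == `y' & sex == `s' & NAME_2 == `"`place'"'
        }
    }
}

list, sepby(NAME_2 year) noobs
```

Holding the first mean in a scalar avoids creating and dropping variables on every pass. Groups that lack either cohort are skipped, so B stays missing for them.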

    As you've realised, if an observation is in cohort 7, it can't be in cohort 8, and vice versa. Here is a standard trick to populate observations regardless. I have used your variable names, even though they don't seem to bear any relation to the meaning of the variables.

    Code:
    egen B1 = mean(cond(cohort == 7, prop_single, .)), by(year sex NAME_2)
    egen B2 = mean(cond(cohort == 8, prop_single, .)), by(year sex NAME_2)
    gen B = (B1 + B2) / 2
    That's it (again, I think). There is no need to create any other variables. Naturally you can drop B1 B2 if they are no use.
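To see what the three lines above do, here is a small self-contained sketch on seven rows copied from the example data. Within each by-group, cond() passes prop_single through only for the chosen cohort and returns missing otherwise; egen's mean() ignores missings and writes the result to every observation in the group. The choice of rows is purely for illustration.

```stata
* Seven rows copied from the posted example. The Amajuba 2016 group
* has neither cohort 7 nor cohort 8 in the posted extract.
clear
input str25 NAME_2 float cohort int year byte sex float prop_single
"Alfred Nzo" 1 2001 1 .9841115
"Alfred Nzo" 7 2001 1 .2006579
"Alfred Nzo" 8 2001 1 .1625709
"Alfred Nzo" 7 2001 2 .14970645
"Alfred Nzo" 8 2001 2 .12313003
"Amajuba"    3 2016 1 .9459636
"Amajuba"    6 2016 1 .6269369
end

* Group mean over cohort 7 only, spread to every row of the group
egen B1 = mean(cond(cohort == 7, prop_single, .)), by(year sex NAME_2)
* Same for cohort 8
egen B2 = mean(cond(cohort == 8, prop_single, .)), by(year sex NAME_2)
gen B = (B1 + B2) / 2

sort NAME_2 year sex cohort
list NAME_2 year sex cohort prop_single B, sepby(NAME_2 year sex) noobs
```

For Alfred Nzo, 2001, males, every row (including the cohort 1 row) gets B = (.2006579 + .1625709)/2, roughly .1816. For the Amajuba 2016 rows, B1 and B2 are both missing, so B is missing as well.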

    See also https://journals.sagepub.com/doi/pdf...867X1101100210 -- especially Section 9. Some of my papers survive better than others; that one covers several tricks that are all obvious when you see the point but may be harder to re-invent.
    Last edited by Nick Cox; 10 Jan 2025, 04:18.



    • #3
      Dear Nick

      The code that you have provided does exactly what I wanted it to do. Thank you so much for your kind assistance and for referring me to the further documentation!

      Kindest regards,
      Christiaan
