Loops within loops

Christiaan de Swardt

Join Date: Oct 2024
Posts: 14

10 Jan 2025, 02:39

Dear Stata Community,

I am struggling to work out the code I need for a problem involving nested loops. Please consider the dataset below:

Code:

* Example generated by -dataex-. For more info, type help dataex
clear
input str25 NAME_2 float cohort int year byte sex float prop_single
"Alfred Nzo" 4 2001 1  .4871481
"Alfred Nzo" 5 2001 1  .3548796
"Alfred Nzo" 7 2001 1  .2006579
"Alfred Nzo" 1 2001 1  .9841115
"Alfred Nzo" 2 2001 1  .9167702
"Alfred Nzo" 8 2001 1  .1625709
"Alfred Nzo" 6 2001 1 .25470334
"Alfred Nzo" 3 2001 1   .733465
"Alfred Nzo" 6 2001 2  .1979346
"Alfred Nzo" 1 2001 2   .912183
"Alfred Nzo" 3 2001 2  .4653061
"Alfred Nzo" 5 2001 2   .231982
"Alfred Nzo" 7 2001 2 .14970645
"Alfred Nzo" 4 2001 2  .3099274
"Alfred Nzo" 8 2001 2 .12313003
"Alfred Nzo" 2 2001 2  .7007078
"Alfred Nzo" 8 2007 1 .10852713
"Alfred Nzo" 7 2007 1 .13636364
"Alfred Nzo" 3 2007 1  .7304348
"Alfred Nzo" 6 2007 1 .25581396
"Alfred Nzo" 5 2007 1  .3668639
"Alfred Nzo" 4 2007 1  .5966851
"Alfred Nzo" 1 2007 1  .9906977
"Alfred Nzo" 2 2007 1  .8783383
"Alfred Nzo" 2 2007 2  .7058824
"Alfred Nzo" 3 2007 2 .56578946
"Alfred Nzo" 5 2007 2  .3176895
"Alfred Nzo" 7 2007 2 .22307692
"Alfred Nzo" 4 2007 2  .3769968
"Alfred Nzo" 6 2007 2 .30620155
"Alfred Nzo" 8 2007 2 .15086207
"Alfred Nzo" 1 2007 2    .91133
"Alfred Nzo" 3 2011 1  .7712138
"Alfred Nzo" 7 2011 1 .27651966
"Alfred Nzo" 5 2011 1  .4576594
"Alfred Nzo" 6 2011 1  .3560865
"Alfred Nzo" 2 2011 1  .9244756
"Alfred Nzo" 1 2011 1  .9647782
"Alfred Nzo" 4 2011 1  .6136534
"Alfred Nzo" 8 2011 1  .1868743
"Alfred Nzo" 6 2011 2 .26902887
"Alfred Nzo" 1 2011 2  .9275892
"Alfred Nzo" 4 2011 2  .4600991
"Alfred Nzo" 5 2011 2  .3561721
"Alfred Nzo" 8 2011 2 .19050895
"Alfred Nzo" 7 2011 2 .21895006
"Alfred Nzo" 2 2011 2  .7673267
"Alfred Nzo" 3 2011 2  .6037222
"Alfred Nzo" 8 2016 1 .25346786
"Alfred Nzo" 5 2016 1  .6180733
"Alfred Nzo" 1 2016 1  .9944791
"Alfred Nzo" 6 2016 1  .4975174
"Alfred Nzo" 7 2016 1  .3585608
"Alfred Nzo" 4 2016 1   .750934
"Alfred Nzo" 2 2016 1   .969993
"Alfred Nzo" 3 2016 1  .8779677
"Alfred Nzo" 1 2016 2   .958457
"Alfred Nzo" 8 2016 2 .19960213
"Alfred Nzo" 4 2016 2  .5736559
"Alfred Nzo" 3 2016 2   .706066
"Alfred Nzo" 2 2016 2  .8428621
"Alfred Nzo" 6 2016 2  .3318556
"Alfred Nzo" 5 2016 2 .46064675
"Alfred Nzo" 7 2016 2 .25751367
"Amajuba"    7 2001 1 .24217688
"Amajuba"    2 2001 1  .9649758
"Amajuba"    4 2001 1  .6699314
"Amajuba"    3 2001 1  .8415385
"Amajuba"    8 2001 1  .1866197
"Amajuba"    6 2001 1  .3270677
"Amajuba"    5 2001 1  .4948689
"Amajuba"    1 2001 1  .9886312
"Amajuba"    2 2001 2  .9235424
"Amajuba"    4 2001 2  .5979544
"Amajuba"    3 2001 2  .7669322
"Amajuba"    7 2001 2 .29088914
"Amajuba"    8 2001 2 .23188406
"Amajuba"    1 2001 2  .9777778
"Amajuba"    5 2001 2  .4419841
"Amajuba"    6 2001 2 .33955225
"Amajuba"    1 2011 1  .9821935
"Amajuba"    5 2011 1  .5980952
"Amajuba"    2 2011 1  .9570273
"Amajuba"    7 2011 1  .3751743
"Amajuba"    8 2011 1 .29491526
"Amajuba"    6 2011 1 .51566577
"Amajuba"    4 2011 1  .7557677
"Amajuba"    3 2011 1  .8693803
"Amajuba"    3 2011 2  .8124658
"Amajuba"    6 2011 2   .483559
"Amajuba"    4 2011 2  .6828551
"Amajuba"    7 2011 2  .4224806
"Amajuba"    2 2011 2   .907056
"Amajuba"    8 2011 2  .3364681
"Amajuba"    1 2011 2   .968595
"Amajuba"    5 2011 2  .5971643
"Amajuba"    3 2016 1  .9459636
"Amajuba"    6 2016 1  .6269369
"Amajuba"    2 2016 1  .9828022
"Amajuba"    4 2016 1  .8735806
end
label values cohort cohort
label def cohort 1 "15-19", modify
label def cohort 2 "20-24", modify
label def cohort 3 "25-29", modify
label def cohort 4 "30-34", modify
label def cohort 5 "35-39", modify
label def cohort 6 "40-44", modify
label def cohort 7 "45-49", modify
label def cohort 8 "50-54", modify
label values year YEAR
label def YEAR 2001 "2001", modify
label def YEAR 2007 "2007", modify
label def YEAR 2011 "2011", modify
label def YEAR 2016 "2016", modify
label values sex SEX
label def SEX 1 "male", modify
label def SEX 2 "female", modify

These data show the proportion of single persons for each municipality (NAME_2), age group (cohort), sex, and year. I need to do something that is (perhaps deceptively) simple: compute the average of the proportions single in cohort 7 and cohort 8, by municipality, year, and sex, that is, B = (Cohort_7_single + Cohort_8_single)/2.

I have written the following loop to do this over year and sex:

gen B = .

levelsof year, local(year)

foreach var of local year {
    forvalues i = 1(1)2 {
        gen prop_single_7 = prop_single if cohort == 7
        gen prop_single_8 = prop_single if cohort == 8
        sum prop_single_7 if year == `var' & sex == `i'
        gen B1 = `r(mean)'
        sum prop_single_8 if year == `var' & sex == `i'
        gen B2 = `r(mean)'
        replace B = (B1+B2)/2 if year == `var' & sex == `i'
        drop B1 B2 prop_single_7 prop_single_8
    }
}

I have tried adding another loop, looping over NAME_2, but I'm not having much success, unfortunately, i.e.

levelsof NAME_2, local(name)

foreach place of local name {
    levelsof year, local(year)
    foreach var of local year {
        forvalues i = 1(1)2 {
            gen prop_single_7 = prop_single if cohort == 7
            gen prop_single_8 = prop_single if cohort == 8
            sum prop_single_7 if year == `var' & sex == `i' & NAME_2 == "`place'"
            gen B1 = `r(mean)'
            sum prop_single_8 if year == `var' & sex == `i' & NAME_2 == "`place'"
            gen B2 = `r(mean)'
            replace B = (B1+B2)/2 if year == `var' & sex == `i' & NAME_2 == "`place'"
            drop B1 B2 prop_single_7 prop_single_8
        }
    }
}

I get the 'invalid syntax' message after prop_single_7 and prop_single_8 have been generated, which suggests that the issue lies with summarizing prop_single_7 and then assigning its mean to the variable B1.

Does anyone know how I can approach this issue going forward? Also, is there perhaps a more elegant solution that I could employ?

Thank you so much for your kind support!

Kind regards
Christiaan

Nick Cox

Join Date: Mar 2014

Posts: 35060
#2

10 Jan 2025, 03:20

My guess is that the invalid syntax message arises as an indirect effect of the first space met in each name, so with your approach you need to use compound double quotes.

I didn't test that, because your code can I think be rewritten without any loops whatsoever.
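For anyone who still wants the loop version, here is a minimal, untested sketch of what the compound double quotes suggestion might look like. Everything beyond the quoting is an assumption not found in the posts above: the local names names, years, m7 and m8 are made up for illustration, and the sketch reads r(mean) as an expression rather than as the macro `r(mean)'. The reason for that second change is my understanding (worth verifying with return list) that summarize leaves r(mean) unset when no observations are selected, so that `r(mean)' would expand to nothing. In the example data that would happen for Amajuba in 2007, which has no observations.

Code:

* Sketch only; run from a do-file, because /// continues a line
gen B = .
levelsof NAME_2, local(names)
levelsof year, local(years)

foreach place of local names {
    foreach y of local years {
        forvalues s = 1/2 {
            * mean for cohort 7 in this cell; r(mean) evaluates to missing if the cell is empty
            summarize prop_single if cohort == 7 & year == `y' & sex == `s' ///
                & NAME_2 == `"`place'"', meanonly
            local m7 = r(mean)

            * mean for cohort 8 in the same cell
            summarize prop_single if cohort == 8 & year == `y' & sex == `s' ///
                & NAME_2 == `"`place'"', meanonly
            local m8 = r(mean)

            * missing in either mean carries through to B
            replace B = (`m7' + `m8') / 2 if year == `y' & sex == `s' ///
                & NAME_2 == `"`place'"'
        }
    }
}

No temporary variables are created or dropped inside the loop, which also removes the repeated gen and drop calls from the original attempt.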

As you've realised, if an observation is in cohort 7, it can't be in cohort 8, and vice versa. Here is a standard trick to populate observations regardless. I have used your variable names, even though they don't seem to bear any relation to the meaning of the variables.

Code:

egen B1 = mean(cond(cohort == 7, prop_single, .)), by(year sex NAME_2)
egen B2 = mean(cond(cohort == 8, prop_single, .)), by(year sex NAME_2)
gen B = (B1 + B2) / 2

That's it (again, I think). There is no need to create any other variables. Naturally you can drop B1 B2 if they are no use.
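As a quick check of how the cond() trick spreads each cohort's value across a group, here is a small sketch using three rows copied from the example data (Alfred Nzo, 2001, males). The choice of rows and the list call are illustrative additions, not part of the reply above. The expected result is B of about 0.1816 in all three rows, which is (.2006579 + .1625709) / 2.

Code:

* Caution: -clear- drops the data in memory, so run this in a fresh session
clear
input str25 NAME_2 float cohort int year byte sex float prop_single
"Alfred Nzo" 7 2001 1 .2006579
"Alfred Nzo" 8 2001 1 .1625709
"Alfred Nzo" 1 2001 1 .9841115
end

* cond() returns prop_single only for the cohort of interest and missing otherwise;
* egen's mean() ignores missings, so each group mean is filled into every row of the group
egen B1 = mean(cond(cohort == 7, prop_single, .)), by(year sex NAME_2)
egen B2 = mean(cond(cohort == 8, prop_single, .)), by(year sex NAME_2)
gen B = (B1 + B2) / 2

list NAME_2 cohort year sex prop_single B, sepby(year sex)

One consequence worth knowing: if a group has no cohort 7 or no cohort 8 observation, the corresponding mean is missing and so is B. In the example data that applies to Amajuba in 2016, where only cohorts 2, 3, 4 and 6 appear.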

See also https://journals.sagepub.com/doi/pdf...867X1101100210 -- especially Section 9. Some of my papers survive better than others; that one covers several tricks that are all obvious when you see the point but may be harder to re-invent.

Last edited by Nick Cox; 10 Jan 2025, 04:18.
Christiaan de Swardt

Join Date: Oct 2024

Posts: 14
#3

10 Jan 2025, 05:28

Dear Nick

The code that you have provided does exactly what I wanted it to do. Thank you so much for your kind assistance and for referring me to the further documentation!

Kindest regards,
Christiaan