Dear Stata Community,
I am struggling to write the code I need for a problem involving loops. Please consider the example dataset below:
This data shows the proportion of single persons for each municipality (NAME_2), age group (cohort), sex, and year. I need to do something that is (perhaps deceptively) simple: compute the average of the proportions single in cohort 7 and cohort 8 by municipality, year, and sex, i.e. B = (Cohort_7_single + Cohort_8_single)/2. For example, for men in Alfred Nzo in 2001, B = (.2006579 + .1625709)/2 = .1816144.
Code:
* Example generated by -dataex-. For more info, type help dataex
clear
input str25 NAME_2 float cohort int year byte sex float prop_single
"Alfred Nzo" 4 2001 1 .4871481
"Alfred Nzo" 5 2001 1 .3548796
"Alfred Nzo" 7 2001 1 .2006579
"Alfred Nzo" 1 2001 1 .9841115
"Alfred Nzo" 2 2001 1 .9167702
"Alfred Nzo" 8 2001 1 .1625709
"Alfred Nzo" 6 2001 1 .25470334
"Alfred Nzo" 3 2001 1 .733465
"Alfred Nzo" 6 2001 2 .1979346
"Alfred Nzo" 1 2001 2 .912183
"Alfred Nzo" 3 2001 2 .4653061
"Alfred Nzo" 5 2001 2 .231982
"Alfred Nzo" 7 2001 2 .14970645
"Alfred Nzo" 4 2001 2 .3099274
"Alfred Nzo" 8 2001 2 .12313003
"Alfred Nzo" 2 2001 2 .7007078
"Alfred Nzo" 8 2007 1 .10852713
"Alfred Nzo" 7 2007 1 .13636364
"Alfred Nzo" 3 2007 1 .7304348
"Alfred Nzo" 6 2007 1 .25581396
"Alfred Nzo" 5 2007 1 .3668639
"Alfred Nzo" 4 2007 1 .5966851
"Alfred Nzo" 1 2007 1 .9906977
"Alfred Nzo" 2 2007 1 .8783383
"Alfred Nzo" 2 2007 2 .7058824
"Alfred Nzo" 3 2007 2 .56578946
"Alfred Nzo" 5 2007 2 .3176895
"Alfred Nzo" 7 2007 2 .22307692
"Alfred Nzo" 4 2007 2 .3769968
"Alfred Nzo" 6 2007 2 .30620155
"Alfred Nzo" 8 2007 2 .15086207
"Alfred Nzo" 1 2007 2 .91133
"Alfred Nzo" 3 2011 1 .7712138
"Alfred Nzo" 7 2011 1 .27651966
"Alfred Nzo" 5 2011 1 .4576594
"Alfred Nzo" 6 2011 1 .3560865
"Alfred Nzo" 2 2011 1 .9244756
"Alfred Nzo" 1 2011 1 .9647782
"Alfred Nzo" 4 2011 1 .6136534
"Alfred Nzo" 8 2011 1 .1868743
"Alfred Nzo" 6 2011 2 .26902887
"Alfred Nzo" 1 2011 2 .9275892
"Alfred Nzo" 4 2011 2 .4600991
"Alfred Nzo" 5 2011 2 .3561721
"Alfred Nzo" 8 2011 2 .19050895
"Alfred Nzo" 7 2011 2 .21895006
"Alfred Nzo" 2 2011 2 .7673267
"Alfred Nzo" 3 2011 2 .6037222
"Alfred Nzo" 8 2016 1 .25346786
"Alfred Nzo" 5 2016 1 .6180733
"Alfred Nzo" 1 2016 1 .9944791
"Alfred Nzo" 6 2016 1 .4975174
"Alfred Nzo" 7 2016 1 .3585608
"Alfred Nzo" 4 2016 1 .750934
"Alfred Nzo" 2 2016 1 .969993
"Alfred Nzo" 3 2016 1 .8779677
"Alfred Nzo" 1 2016 2 .958457
"Alfred Nzo" 8 2016 2 .19960213
"Alfred Nzo" 4 2016 2 .5736559
"Alfred Nzo" 3 2016 2 .706066
"Alfred Nzo" 2 2016 2 .8428621
"Alfred Nzo" 6 2016 2 .3318556
"Alfred Nzo" 5 2016 2 .46064675
"Alfred Nzo" 7 2016 2 .25751367
"Amajuba" 7 2001 1 .24217688
"Amajuba" 2 2001 1 .9649758
"Amajuba" 4 2001 1 .6699314
"Amajuba" 3 2001 1 .8415385
"Amajuba" 8 2001 1 .1866197
"Amajuba" 6 2001 1 .3270677
"Amajuba" 5 2001 1 .4948689
"Amajuba" 1 2001 1 .9886312
"Amajuba" 2 2001 2 .9235424
"Amajuba" 4 2001 2 .5979544
"Amajuba" 3 2001 2 .7669322
"Amajuba" 7 2001 2 .29088914
"Amajuba" 8 2001 2 .23188406
"Amajuba" 1 2001 2 .9777778
"Amajuba" 5 2001 2 .4419841
"Amajuba" 6 2001 2 .33955225
"Amajuba" 1 2011 1 .9821935
"Amajuba" 5 2011 1 .5980952
"Amajuba" 2 2011 1 .9570273
"Amajuba" 7 2011 1 .3751743
"Amajuba" 8 2011 1 .29491526
"Amajuba" 6 2011 1 .51566577
"Amajuba" 4 2011 1 .7557677
"Amajuba" 3 2011 1 .8693803
"Amajuba" 3 2011 2 .8124658
"Amajuba" 6 2011 2 .483559
"Amajuba" 4 2011 2 .6828551
"Amajuba" 7 2011 2 .4224806
"Amajuba" 2 2011 2 .907056
"Amajuba" 8 2011 2 .3364681
"Amajuba" 1 2011 2 .968595
"Amajuba" 5 2011 2 .5971643
"Amajuba" 3 2016 1 .9459636
"Amajuba" 6 2016 1 .6269369
"Amajuba" 2 2016 1 .9828022
"Amajuba" 4 2016 1 .8735806
end
label values cohort cohort
label def cohort 1 "15-19", modify
label def cohort 2 "20-24", modify
label def cohort 3 "25-29", modify
label def cohort 4 "30-34", modify
label def cohort 5 "35-39", modify
label def cohort 6 "40-44", modify
label def cohort 7 "45-49", modify
label def cohort 8 "50-54", modify
label values year YEAR
label def YEAR 2001 "2001", modify
label def YEAR 2007 "2007", modify
label def YEAR 2011 "2011", modify
label def YEAR 2016 "2016", modify
label values sex SEX
label def SEX 1 "male", modify
label def SEX 2 "female", modify
I have written the following loop to do this over year and sex:
gen B = .
levelsof year, local(year)
foreach var of local year {
    forvalues i = 1(1)2 {
        gen prop_single_7 = prop_single if cohort == 7
        gen prop_single_8 = prop_single if cohort == 8
        sum prop_single_7 if year == `var' & sex == `i'
        gen B1 = `r(mean)'
        sum prop_single_8 if year == `var' & sex == `i'
        gen B2 = `r(mean)'
        replace B = (B1+B2)/2 if year == `var' & sex == `i'
        drop B1 B2 prop_single_7 prop_single_8
    }
}
I have tried adding a third, outer loop over NAME_2, but unfortunately without much success:
levelsof NAME_2, local(name)
foreach place of local name {
    levelsof year, local(year)
    foreach var of local year {
        forvalues i = 1(1)2 {
            gen prop_single_7 = prop_single if cohort == 7
            gen prop_single_8 = prop_single if cohort == 8
            sum prop_single_7 if year == `var' & sex == `i' & NAME_2 == "`place'"
            gen B1 = `r(mean)'
            sum prop_single_8 if year == `var' & sex == `i' & NAME_2 == "`place'"
            gen B2 = `r(mean)'
            replace B = (B1+B2)/2 if year == `var' & sex == `i' & NAME_2 == "`place'"
            drop B1 B2 prop_single_7 prop_single_8
        }
    }
}
I get an 'invalid syntax' error after prop_single_7 and prop_single_8 have been generated, which suggests that the problem lies in summarizing prop_single_7 and then assigning r(mean) to the variable B1.
Does anyone know how I can approach this issue going forward? Also, is there perhaps a more elegant solution that I could employ?
Thank you so much for your kind support!
Kind regards
Christiaan
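A possible reading of the error, offered as an assumption rather than a confirmed diagnosis: in the example data some municipality-year-sex combinations have no rows at all (Amajuba has none for 2007). When summarize finds no observations, r(mean) is typically not returned, so `r(mean)' expands to nothing and "gen B1 = " is incomplete. The sketch below avoids the loops entirely. It assumes at most one row per NAME_2-year-sex-cohort cell, as in the -dataex- example, and the names B_alt and n78 are hypothetical.

```stata
* Sketch only, not a confirmed fix. B_alt and n78 are hypothetical names.
* Assumes one row per NAME_2-year-sex-cohort cell, as in the -dataex- example.

* Mean of prop_single over cohorts 7 and 8 within each cell;
* rows from other cohorts evaluate to missing and are ignored by mean().
egen B_alt = mean(cond(inlist(cohort, 7, 8), prop_single, .)), by(NAME_2 year sex)

* Count how many of the two cohorts are actually present in each cell.
egen n78 = total(inlist(cohort, 7, 8) & !missing(prop_single)), by(NAME_2 year sex)

* (B1 + B2)/2 needs both cohorts, so blank out cells where one is absent.
replace B_alt = . if n78 != 2
drop n78
```

If the loops are kept instead, checking r(N) after each summarize and skipping the cell when it is 0 should avoid the empty `r(mean)'.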