Dear members of Statalist,
I am trying to derive pack-years of cigarette smoking from people's smoking histories using a forvalues loop. My main challenge is to account for different types of cigarette (1 = filtered, 2 = hand-rolled) smoked during the same years. The example data set below shows two smoking patterns for each of five people.
The data for 131-2 read as follows: the person smoked 5 filtered cigarettes per day from age 14 to 18, inclusive (5 years in total) [Pattern 1], and 10 filtered cigarettes per day from age 19 to 25, inclusive [Pattern 2]. Cigarette type is coded 1 if filtered and 2 if hand-rolled. packday1 and packday2 are derived by running the Step 1 code given below.
I am following three steps to derive pack-years, which is the intensity of smoking (packs per day) multiplied by the total duration in years over the person's whole life: 1) create the smoking intensity for each pattern (packday1, packday2); 2) create age-specific intensities; and 3) the final step, which is not a problem.
Code:
* Example generated by -dataex-. To install: ssc install dataex
clear
input str5 ID float(cig1_from cig1_to cig1_type cig1_amt cig2_from cig2_to cig2_type cig2_amt packday1 packday2)
"131-1" 18 28 1 15 888 888 88 88 .75 0
"131-2" 14 18 1 5 19 25 1 10 .25 .5
"132-1" 16 25 1 5 16 25 2 5 .25 1.25
"132-2" 15 16 1 5 17 21 1 20 .25 1
"133-1" 14 23 1 20 20 30 2 24 1 6
end
The code for Step 1 is given below; it produces packday1 and packday2 (as shown in the data above).
forvalues a = 1/2 {
    gen packday`a' = .
    replace packday`a' = cig`a'_amt/20 if cig`a'_type == 1 // 1 pack of filtered cigarettes = 20 cigarettes
    replace packday`a' = cig`a'_amt/4 if cig`a'_type == 2 // 1 pack of hand-rolled cigarettes = 4 cigarettes
    replace packday`a' = 0 if cig`a'_type == 88 // 88 or 888 represents absence of smoking
}
The problem arises when running the Step 2 code for people like 132-1 and 133-1. The code I used is given below.
* Create 30 variables, all set to 0, one for each year of life over 30 years.
forvalues i = 1/30 {
    gen a`i' = 0
}
* Replace each life-year variable with the corresponding packday (intensity) if the person smoked in that year, giving the intensity per year of life.
forvalues b = 1/30 {
    replace a`b' = packday1 if cig1_from <= `b' & (cig1_to + 1) > `b' & cig1_type != 88
}
forvalues b = 1/30 {
    replace a`b' = packday2 if cig2_from <= `b' & (cig2_to + 1) > `b' & cig2_type != 88
}
For 132-1, the data read: "s/he smoked 5 filtered cigarettes per day from age 16 to 25, and also smoked 5 hand-rolled cigarettes per day over the same period". For this person, after running the Step 2 code, the age variables a16 to a25 should hold the total intensity of 1.5 (filtered 0.25 + hand-rolled 1.25), but the above code gives only 1.25, the hand-rolled value. 133-1 has a similar issue. How should I fix this error, which arises when different types of cigarette are smoked in the same years?
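One possible direction, sketched here under the assumption that overlapping patterns should simply be summed: the second Step 2 loop replaces whatever the first loop wrote, so adding each pattern's intensity to a`b' instead of overwriting it should give 1.5 for 132-1 in a16 to a25. The variable name packyears and the use of inrange() and egen's rowtotal() are illustrative choices, not part of my original code.

```stata
* Sketch only: assumes overlapping patterns are additive and that
* cig*_from / cig*_to are whole ages (888 = no smoking pattern).
clear
input str5 ID float(cig1_from cig1_to cig1_type cig1_amt cig2_from cig2_to cig2_type cig2_amt)
"131-1" 18 28 1 15 888 888 88 88
"131-2" 14 18 1  5  19  25  1 10
"132-1" 16 25 1  5  16  25  2  5
"132-2" 15 16 1  5  17  21  1 20
"133-1" 14 23 1 20  20  30  2 24
end

* Step 1: packs per day for each pattern (same rules as above)
forvalues a = 1/2 {
    gen packday`a' = .
    replace packday`a' = cig`a'_amt/20 if cig`a'_type == 1
    replace packday`a' = cig`a'_amt/4  if cig`a'_type == 2
    replace packday`a' = 0             if cig`a'_type == 88
}

* Step 2: start each life-year variable at 0, then ADD the intensity
* of every pattern that covers that year (rather than overwriting)
forvalues b = 1/30 {
    gen a`b' = 0
    forvalues p = 1/2 {
        replace a`b' = a`b' + packday`p' ///
            if inrange(`b', cig`p'_from, cig`p'_to) & cig`p'_type != 88
    }
}

* Step 3 (illustrative): pack-years as the sum of the yearly intensities
egen packyears = rowtotal(a1-a30)

list ID a16 a25 packyears if ID == "132-1"
```

Because every year starts at 0 and each pattern only adds to it, non-overlapping patterns (such as 131-2) should come out the same as with the original loops.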
Any help is appreciated.
Thanks
Thekke purakkal