Hello, I am using the NSCH 2021 dataset and Stata 16.1.
I have been trying to create sex- and age-matched controls (No to disorder A) at a 1:4 ratio for the cases (Yes to disorder A) in the dataset. I have searched Statalist and tried many approaches but could not figure out how to do it.
For more details, I included the full code here.
My code_1 is below:
gen adhd_ts=0
replace adhd_ts=1 if adhd==1|ts==1
replace adhd_ts=2 if adhd==1&ts==1
// I would like to create matched controls using [adhd_ts==0] as the pool, with [adhd_ts==1] and [adhd_ts==2] as the case groups (one matched control group for each case group)
drop if adhd_ts==2 // trying to create a matched control for adhd_ts==1 cases first
preserve
keep if adhd_ts==0
rename * *_control
rename age_control age
rename sex_control sex
tempfile control
save `control'
restore
keep if adhd_ts==1
joinby age sex using `control'
set seed 1234
gen double shuffle = runiform()
by hhid (shuffle), sort: keep if _n==1 // hhid is a unique ID for every observation; the case file was not renamed, so the case ID is still hhid
drop shuffle
-> With this, I can create 1:1 randomly sex/age-matched groups, but I think cases and controls ended up mixed together in the result.
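For reference, a minimal sketch of how the code_1 approach might be extended to keep up to four controls per case. It runs on simulated toy data (the `set obs` block is purely hypothetical and is not NSCH), and `hhid_case` / `hhid_control` are names introduced here for illustration only. It samples with replacement across cases, so the same control can be drawn for more than one case.

```stata
* Hypothetical toy data standing in for the real file
clear
set seed 1234
set obs 1000
gen long hhid    = _n
gen byte sex     = 1 + (runiform() < 0.5)   // 1 or 2
gen byte age     = floor(18 * runiform())   // 0 to 17
gen byte adhd_ts = runiform() < 0.1         // 1 = case, 0 = control pool

* Save the control pool, keeping only the ID and the matching keys
preserve
keep if adhd_ts == 0
keep hhid age sex
rename hhid hhid_control
tempfile control
save `control'
restore

* Pair every case with every control of the same age and sex
keep if adhd_ts == 1
keep hhid age sex
rename hhid hhid_case
joinby age sex using `control'

* Keep up to 4 randomly chosen controls per case
gen double shuffle = runiform()
bysort hhid_case (shuffle): keep if _n <= 4
drop shuffle
```

Each row of the result is one case-control pair (`hhid_case` next to `hhid_control`), which may be why the 1:1 output looks like cases and controls are mixed: the pairs sit side by side in wide form rather than as separate rows.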
My code_2 is below:
gen adhd_ts=0
replace adhd_ts=1 if adhd==1|ts==1
replace adhd_ts=2 if adhd==1&ts==1
// I would like to create matched controls using [adhd_ts==0] as the pool, with [adhd_ts==1] and [adhd_ts==2] as the case groups (one matched control group for each case group)
drop if adhd_ts==2 // trying to create a matched control for adhd_ts==1 cases first
gen ok=(adhd_ts==0)
gen random=runiform()
sort ok random
gen insample=ok&(_N-_n)<13302 // 13302 is 4 times the number of cases (adhd_ts==1)
drop if insample==0&adhd_ts==0
-> With this, I get a randomly selected control group at a 1:4 ratio to the cases, but the controls are not age/sex matched.
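One way the random selection in code_2 might be made age/sex matched is to draw the controls separately within each age-by-sex cell (frequency matching, without replacement). The following is only a sketch under that assumption, again on hypothetical toy data. The names `n_case`, `u`, and `pick` are illustrative. Cells with fewer than four controls per case simply keep all the controls they have.

```stata
* Hypothetical toy data standing in for the real file
clear
set seed 1234
set obs 1000
gen long hhid    = _n
gen byte sex     = 1 + (runiform() < 0.5)   // 1 or 2
gen byte age     = floor(18 * runiform())   // 0 to 17
gen byte adhd_ts = runiform() < 0.1         // 1 = case, 0 = control pool

* Number of cases in each age-by-sex cell
bysort age sex: egen n_case = total(adhd_ts == 1)

* Random order within each cell and group
gen double u = runiform()
bysort age sex adhd_ts (u): gen long pick = _n

* Keep all cases, plus the first 4 * n_case controls in each cell
keep if adhd_ts == 1 | pick <= 4 * n_case
drop u pick
```

The data stay in long form, one row per person, so a t-test or chi-square on age and sex by `adhd_ts` can be run directly on the result.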
My code_3 is below:
gen adhd_ts=0
replace adhd_ts=1 if adhd==1|ts==1
replace adhd_ts=2 if adhd==1&ts==1
// I would like to create matched controls using [adhd_ts==0] as the pool, with [adhd_ts==1] and [adhd_ts==2] as the case groups (one matched control group for each case group)
drop if adhd_ts==2 // trying to create a matched control for adhd_ts==1 cases first
calipmatch, generate(newvar) casevar(adhd_ts) maxmatches(4) calipermatch(sex age) caliperwidth(1 1)
-> With this, I thought I had succeeded, but when I ran a t-test for age and a chi-square test for sex, there were significant differences between the case group and the control group. (Maybe this is due to the caliper width? But I don't think I can set it to 0 0.)
My 4th try used kmatch, as below:
kmatch em adhd_ts (sex age), gen
but I don't think I applied it correctly, since nothing in the dataset changed apart from the additional _KM_ variables.
Please provide any advice or resources that could help me figure this out. Thank you in advance.