Hi Listers,
I have started writing code to run a series of simulations to estimate power for a cluster randomised trial comparing 2 groups. I am interested in what happens when the size and number of clusters vary between the 2 arms.
I am very new to writing this kind of code so I am slowly building it.
As a first step, I wrote code to simulate the data for a 2-arm study with 10 clusters of size 5 and an ICC of 0.25, assuming a success rate of 20% in arm 1 vs. 50% in arm 2.
The simulated data, however, do not reproduce these rates, and the estimated ICC is higher than specified. I would welcome your suggestions on how to fix the code below:
clear
set seed 12345
set obs 10
g clusternum = _n
g obs_per_cluster = 5
expand obs_per_cluster
*Create group variable
expand 2
sum clusternum
local mid = `r(N)'/2
local mid2 = `mid'+1
di `mid'
g group = 1 in 1/`mid'
replace group = 2 in `mid2'/`r(N)'
g pid = _n
* Set icc same in both groups
local icc = 0.25
g sigma = sqrt((`icc' * _pi^2) / (3-3 *`icc'))
g double pid_u = rnormal(0, sigma)
* Create outcome variable - group1 = 20% vs. group 2 50%
g pr1 = logit(0.2) if group==1
g xbu1 = pr1+ pid_u
g byte out1 = rbinomial(1, invlogit(xbu1)) if group==1
tab out1
g pr2 = logit(0.5) if group==2
g xbu2 = pr2+ pid_u
g byte out2 = rbinomial(1, invlogit(xbu2)) if group==2
tab out2
g out = out1 if group==1
replace out = out2 if group==2
tab out group, col
* Run model
xtlogit out i.group, i (pid) re nolog
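A possible direction, offered as a sketch rather than a tested fix: in the code above, pid = _n is unique for every row and pid_u is drawn once per row, so each "cluster" passed to xtlogit contains a single observation and the random effect is not shared within clusternum. Group is also assigned by row position after the two expand commands, which appears to place every clusternum in both arms. The version below draws one random intercept per cluster before expanding and uses clusternum as the panel variable. It assumes 10 clusters per arm (20 in total), which is a guess at the intended design; the variable names u and xb are new.

```stata
clear
set seed 12345

* Assumption: 10 clusters per arm, so 20 clusters in total
set obs 20
generate clusternum = _n
generate byte group = cond(clusternum <= 10, 1, 2)

* Latent-scale random-intercept SD implied by the target ICC
local icc = 0.25
local sigma = sqrt((`icc' * _pi^2) / (3 * (1 - `icc')))

* One draw per cluster, made BEFORE expanding so members share it
generate double u = rnormal(0, `sigma')

* 5 observations per cluster
expand 5
sort clusternum
generate pid = _n

* Cluster-specific log-odds: 20% in arm 1, 50% in arm 2
generate double xb = cond(group == 1, logit(0.2), logit(0.5)) + u
generate byte out = rbinomial(1, invlogit(xb))

tabulate out group, column

* Random intercept on the cluster, not the individual
xtlogit out i.group, i(clusternum) re nolog
```

Two caveats apply even with this structure. The 20% and 50% are cluster-specific (conditional) probabilities, so the observed proportion in arm 1 should be expected to sit somewhat above 20%, while 50% is unaffected by symmetry. With only 20 clusters of 5, the rho reported by xtlogit will also vary a lot from one seed to the next.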