Hi Clyde Schechter, Carlo Lazzaro, and Stata users,
I am currently calculating the statistical power of an RCT on migrants' entrepreneurship in Peru. The setup is as follows: N is fixed at 1200 participants in 5 regions (clusters), with 600 in treatment (T) and 600 in control (C). Within each cluster, participants are split equally between T and C, but the cluster sizes differ (i.e., cluster 1 has 100 T and 100 C, cluster 2 has 200 T and 200 C, and so on). All of these participants took part in the baseline stage of the RCT, but at the midline stage we reached only 70% of the sample (840 participants: 420 T and 420 C). Given this, and taking into account that there is still an endline stage, I need to recalculate the statistical power. In particular, I want to calculate the MDE (minimum detectable effect) and the minimum sample size required to detect a statistically significant treatment effect in this population.
To do this, I used the code below, which is based on J-PAL's code (https://github.com/J-PAL/Sample_Size_and_Power).
However, this code does not account for attrition. I looked for a paper on the topic and found "Accounting for Student Attrition in Power Calculations: Benchmarks and Guidance" by Rickles et al. (2019), which proposes the attached MDE formula. Do you know how to implement it in Stata? Or, how could I change my code to account for attrition?
Code:
global outcome "business_profit"   //SPECIFY the outcome and treatment variable
global treatment "capitalsemilla"
global cluster_var "strata"        //SPECIFY - the cluster variable

local power = 0.8                  //SPECIFY - desired power
local nratio = 1                   //SPECIFY - the ratio of experimental group to control group
local alpha = 0.05                 //SPECIFY - the significance level

//sum the outcome at baseline and record the mean and the standard deviation
quietly sum $outcome if !missing($outcome)
local sd = `r(sd)'
local baseline = `r(mean)'

bysort $cluster_var: gen control_cluster = _n==1
count if control_cluster & $treatment==0      //count the number of control clusters
local num_clusters_control = `r(N)'           //SPECIFY number of clusters in the control group

local kratio = 1                   //SPECIFY - the ratio of the number of treatment clusters to the number of control clusters
local cluster_size_control = 50    //SPECIFY - number of people in each cluster
local mratio = 1                   //SPECIFY - the ratio of the cluster size in the treatment and the control

//The loneway command calculates the one-way ANOVA by a group variable.
//It gives the within-group variation and the between-group variation of a variable.
//It also produces the intra-cluster correlation coefficient (ICC).
loneway $outcome $cluster_var
local rho = `r(rho)'

power twomeans `baseline', cluster k1(`num_clusters_control') kratio(`kratio') ///
    mratio(`mratio') m1(`cluster_size_control') power(`power') sd(`sd') ///
    rho(`rho') alpha(`alpha') table
local mde_cluster = round(`r(delta)',0.0001)

di as error "The MDE is `mde_cluster' given `num_clusters_control' clusters in the control, ratio of the number of treatment and control clusters as `kratio', `cluster_size_control' units in the control, the ratio of units in each treatment and control cluster of `mratio', and power `power'."

drop control_cluster
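On the attrition question, the only rough workaround I can think of so far is to treat attrition as a pure loss of sample size and recompute the MDE with the retained sample. This is not the Rickles et al. formula, just a minimal sketch under my own assumptions: attrition is completely at random and equal across arms, randomization is treated as individual-level (no clustering), and the outcome SD is a placeholder of 1, so the MDE is expressed in standard deviation units.

```stata
* Sketch only: treats attrition as a pure reduction in sample size.
* Assumes random attrition, equal across arms, and no clustering.
local n_baseline = 1200                             // participants at baseline
local retain     = 0.70                             // share reached at midline
local sd         = 1                                // PLACEHOLDER outcome SD
local n_followup = round(`n_baseline' * `retain')   // retained sample (840)

* MDE with the full baseline sample
power twomeans 0, n(`n_baseline') power(0.8) sd(`sd') alpha(0.05)
local mde_full = r(delta)

* MDE with the sample that remains after attrition
power twomeans 0, n(`n_followup') power(0.8) sd(`sd') alpha(0.05)
local mde_attr = r(delta)

* Under these assumptions the MDE grows by roughly 1/sqrt(retention rate)
display "MDE, full sample:     " %6.4f `mde_full'
display "MDE, after attrition: " %6.4f `mde_attr'
display "Inflation factor:     " %6.4f `mde_attr'/`mde_full'
display "1/sqrt(retain):       " %6.4f 1/sqrt(`retain')
```

For the clustered version, I suppose the analogous step would be to multiply the cluster size passed to m1() by the retention rate, but I am not sure that is appropriate if attrition differs between treatment and control or is related to the outcome.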
Thanks in advance!