
  • Power calculation with attrition

    Hi Clyde Schechter, Carlo Lazzaro and Stata users,

    I am currently calculating the statistical power of an RCT on migrants' entrepreneurship in Peru. The setup is as follows: N is fixed at 1,200 participants in 5 regions (clusters), so we have 600 T and 600 C. Within each cluster, participants are split equally between treatment and control, but cluster sizes differ (i.e., cluster 1 has 100 T and 100 C, cluster 2 has 200 T and 200 C, and so on). All of these participants took part in the baseline stage, but at midline we reached only 70% of the sample (840 participants: 420 T and 420 C). Based on this, and given that there is still an endline stage, I need to recalculate the statistical power. In particular, I want to calculate the MDE (minimum detectable effect) and the minimum sample size required to detect a treatment effect in this population.

    To do this, I used the following code, adapted from J-PAL's code (https://github.com/J-PAL/Sample_Size_and_Power):

    Code:
    global outcome "business_profit"                                                    //SPECIFY the outcome and treatment variable
    global treatment "capitalsemilla"
    global cluster_var "strata"                                                        //SPECIFY - the cluster variable
    
    local power = 0.8                                                                //SPECIFY - desired power
    local nratio = 1                                                                //SPECIFY - the ratio of experimental group to control group
    local alpha =0.05                                                                //SPECIFY - the significance level
        
    quietly sum $outcome if !missing($outcome)                                        //sum the outcome at baseline and record the mean and the standard deviation
    local sd = `r(sd)'
    local baseline = `r(mean)'
    
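    bysort $cluster_var ($treatment): gen byte control_cluster = _n==1                //tag one observation per cluster; sorting by treatment puts a control observation first whenever the cluster has one
    count if control_cluster & $treatment==0                                         //count the number of clusters that contain control observations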
    
    local num_clusters_control=`r(N)'                                                //SPECIFY number of clusters in the control group 
        
    local kratio = 1                                                                //SPECIFY - The ratio of the number of treatment clusters to the number of control clusters
    
    local cluster_size_control = 50                                                    //SPECIFY - number of people in each cluster. 
    local mratio=1                                                                    //SPECIFY - the ratio of the cluster size in the treatment and the control
    
    loneway $outcome $cluster_var                                                     //The loneway command calculates the one-way ANOVA by a group variable. 
                                                                                    //It gives the within-group variation and the between group variation of a variable. 
                                                                                    //It also produces the intra-cluster correlation coefficient (ICC)
        
    local rho = `r(rho)'
        
    
    power twomeans `baseline', cluster k1(`num_clusters_control') kratio(`kratio') mratio(`mratio') m1(`cluster_size_control') power(`power') sd(`sd') rho(`rho')  alpha(`alpha') table
    
        
    local mde_cluster = round(`r(delta)',0.0001)
        
    di as error "The MDE is `mde_cluster' given `num_clusters_control' clusters in the control, ratio of the number of treatment and control clusters as `kratio', `cluster_size_control' units in the control, the ratio of units in each treatment and control cluster of `mratio', and power `power'."
    
    drop control_cluster
    However, the above code does not account for attrition. I looked for a paper on this and found "Accounting for Student Attrition in Power Calculations: Benchmarks and Guidance" by Rickles et al. (2019), which proposes the following MDE formula (attached). Do you know how to implement it in Stata? Or how can I change the code above to account for attrition?

    Thanks in advance!



  • #2
    Juan:
    Sorry, but I'm not familiar with the paper/statistic you mentioned.
    That said, my guess is that it needs ad hoc coding.
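    A minimal sketch of what such ad hoc coding might look like, assuming attrition is uniform across arms, unrelated to treatment, and ignoring clustering for simplicity. It is not the formula from the paper; it only recomputes the MDE at the retained sample size with -power twomeans-. The local names and the placeholder values (control mean 0, sd 1, a target of 600 per arm) are illustrative assumptions, not taken from your do-file:

```stata
* Minimal sketch, NOT the Rickles et al. formula:
* treat attrition as a uniform, non-differential loss of sample and
* recompute the MDE at the retained sample size.
local attrition = 0.30                                  // assumed: 30% lost, as at midline
local n_arm     = 600                                   // baseline participants per arm
local n_ret     = round(`n_arm' * (1 - `attrition'))    // participants retained per arm

* MDE in standard-deviation units (control mean 0 and sd 1 are placeholders)
power twomeans 0, n1(`n_arm') n2(`n_arm') power(0.8) alpha(0.05) sd(1)
local mde_full = r(delta)
power twomeans 0, n1(`n_ret') n2(`n_ret') power(0.8) alpha(0.05) sd(1)
local mde_ret  = r(delta)

display "MDE without attrition: " %6.4f `mde_full' " SD"
display "MDE with attrition:    " %6.4f `mde_ret'  " SD"

* Baseline sample per arm needed to end up with n_target analysable units
local n_target  = 600                                   // hypothetical target per arm
local n_recruit = ceil(`n_target' / (1 - `attrition'))
display "Recruit `n_recruit' per arm to retain `n_target'"
```

    To keep the clustered design from your code, the analogous change would probably be to shrink the value passed to m1() by the same retention factor, but I have not checked that against the paper.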
    Kind regards,
    Carlo
    (Stata 19.0)
