I would like to test whether associate clinicians (md==0) are non-inferior to physicians (md==1) in the occurrence of a surgical complication (iat). To complete two one-sided tests I propose to use the Stata tostt package (mean equivalence t tests). I am writing with a question about the power calculation and the interpretation.
First, I want to calculate the smallest difference that my sample size is powered to detect.
Here are the summary statistics for the outcome in each group:
. summarize iat if md==0

    Variable |        Obs        Mean    Std. dev.       Min        Max
-------------+---------------------------------------------------------
  iatrogenic |      1,119    .2457551    .4307265          0          1

. summarize iat if md==1

    Variable |        Obs        Mean    Std. dev.       Min        Max
-------------+---------------------------------------------------------
  iatrogenic |        171    .2923977    .4561999          0          1
By trial and error, I find that I would have sufficient power (80%) to detect a difference of 13.5 percentage points (0.135) between the means: the following sampsi call returns a minimum sample size of 171 per group, which is the number of observations in my smaller group. Is this correct?
Code:
sampsi 0 0.135, sd1(.43) sd2(.46) power(.8)
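To make the calculation explicit, here is a minimal sketch in Python of the normal-approximation formula that I believe underlies this kind of two-means sample size calculation. The function name `n_per_group` is my own, and the equal group sizes and two-sided α = 0.05 are my assumptions about the defaults, not something shown in the Stata output.

```python
from math import ceil
from statistics import NormalDist


def n_per_group(delta, sd1, sd2, alpha=0.05, power=0.80):
    """Per-group n for a two-sided comparison of two means.

    Normal approximation with equal group sizes (an assumption):
    n = (z_{1-alpha/2} + z_{power})^2 * (sd1^2 + sd2^2) / delta^2
    """
    z_alpha = NormalDist().inv_cdf(1 - alpha / 2)  # critical value for the test
    z_beta = NormalDist().inv_cdf(power)           # quantile for the target power
    return ceil((z_alpha + z_beta) ** 2 * (sd1 ** 2 + sd2 ** 2) / delta ** 2)


# Same inputs as the sampsi call: difference 0.135, SDs .43 and .46, power .8
print(n_per_group(0.135, 0.43, 0.46))
```

With these inputs the sketch gives 171 per group. Note that it describes a test for a difference with two groups of equal size, whereas my groups have 1,119 and 171 observations.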
I then plug this value into tostt as the equivalence margin:
Code:
tostt iat, by(md) eqvtype(delta) eqvlevel(0.135) relevance
The output includes both a two-sample t test with equal variances and a two-sample unpaired t test for mean equivalence with equal variances. Is this the same as the two one-sided tests I originally planned? If not, what command would you recommend instead?
Ho: |θ| >= Δ:

        t1 = 5.095                  t2 = 2.479

   Ho1: Δ-θ <= 0               Ho2: θ+Δ <= 0
   Ha1: Δ-θ > 0                Ha2: θ+Δ > 0
   Pr(T > t1) = 0.0000         Pr(T > t2) = 0.0067
Relevance test conclusion for α = 0.05, and Δ = 0.135:
Ho test for difference: Fail to reject
Ho test for equivalence: Reject
Conclusion from combined tests: Equivalence
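As a check on my reading of this output, the two test statistics can be reproduced by hand from the summary statistics above. This is a minimal Python sketch, not tostt's own code. It assumes a pooled-variance (equal variances) standard error and that θ is the md==0 mean minus the md==1 mean. The variable names are mine.

```python
from math import sqrt

# Summary statistics copied from the summarize output
n0, m0, s0 = 1119, 0.2457551, 0.4307265  # associate clinicians (md==0)
n1, m1, s1 = 171, 0.2923977, 0.4561999   # physicians (md==1)
delta = 0.135                            # equivalence margin

theta = m0 - m1  # assumed sign convention: md==0 minus md==1

# Pooled standard deviation and standard error of the difference
sp = sqrt(((n0 - 1) * s0 ** 2 + (n1 - 1) * s1 ** 2) / (n0 + n1 - 2))
se = sp * sqrt(1 / n0 + 1 / n1)

# Two one-sided test statistics
t1 = (delta - theta) / se  # tests Ho1: delta - theta <= 0
t2 = (theta + delta) / se  # tests Ho2: theta + delta <= 0

print(round(t1, 3), round(t2, 3))
```

Under these assumptions the values agree with the reported t1 = 5.095 and t2 = 2.479 to three decimals, so the "mean equivalence" output does appear to consist of two one-sided t statistics on the difference in proportions.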
Would it be fair to report that I assessed an equivalence margin of 13.5 percentage points, given that this was the difference in means that my sample was powered to detect? I would appreciate your confirmation of this approach and interpretation, or else your recommendations for alternatives. Thank you.