Greetings,
I'm using Stata to run a difference-of-proportions test on survey data with a complex sample design. For this kind of task, some of my colleagues calculate the group estimates and SEs in their preferred software package (Stata, SAS, R, etc.) and then conduct the statistical test in Excel, plugging the estimates and SEs into a formula that uses the typical equation for a t-test. Wanting to conform to their process, I tried it out, but noticed that Stata and Excel produce different pooled standard errors and therefore different test statistics. Can someone explain the reason for the difference: what is Stata doing to calculate the test statistic? And how would one adjust the t-test equation to replicate Stata's result "by hand" (i.e., using an Excel formula)?
Code:
/* example dataset (borrowed from another example on Statalist: https://www.statalist.org/forums/forum/general-stata-discussion/general/432450-survey-adjustment-for-t-tests-and-difference-in-proportions) */
sysuse auto, clear
gen mkr = substr(make,1,2) /* artificial PSU */
svyset mkr [pw = turn]
gen hiprice = price>8000 /* outcome */

/* test difference of proportions low-price domestic vs. high-price domestic */
svy: tab hiprice foreign, row se
matrix list e(b)
lincom _b[p11]-_b[p21]

/* t-test formula to mimic Excel style formula, using estimates and SEs from above tabulation */
di (.7467 - .6855)/sqrt( (.1007)^2 + (.154)^2 )

/* pooled standard error using above equation */
di sqrt( (.1007)^2 + (.154)^2 )
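For reference, here is the Excel-style calculation from the last two `di` lines written out as a formula. The symbols are my own shorthand, not Stata's: \hat{p}_{11} and \hat{p}_{21} are the two row proportions from the tabulation, and SE(.) their reported standard errors. The second expression is the general variance of a difference, which also carries a covariance term. The Excel formula sets that term to zero, i.e. it treats the two estimates as independent. Whether that term accounts for the whole gap with the `lincom` result is an assumption on my part and not something I have verified from the output above.

```latex
% Excel-style statistic: assumes the two estimates are independent
t_{\text{Excel}}
  = \frac{\hat{p}_{11} - \hat{p}_{21}}
         {\sqrt{SE(\hat{p}_{11})^2 + SE(\hat{p}_{21})^2}}
  = \frac{0.7467 - 0.6855}{\sqrt{0.1007^2 + 0.154^2}}
  \approx \frac{0.0612}{0.184}
  \approx 0.333

% General variance of a difference: includes the covariance of the two estimates
\operatorname{Var}(\hat{p}_{11} - \hat{p}_{21})
  = \operatorname{Var}(\hat{p}_{11}) + \operatorname{Var}(\hat{p}_{21})
    - 2\,\operatorname{Cov}(\hat{p}_{11}, \hat{p}_{21})
```

If the covariance is the missing piece, the Excel formula would need a third input (the covariance of the two estimates) in addition to the two SEs.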