Dear Statalists,
I'm trying to analyse data from a very simple survey in which I asked 65 individuals a multiple-choice question to test their knowledge of a certain topic. The question offered 4 answers, only one of which was correct. The correct answer was chosen by 10 individuals. Although my sample was a convenience sample rather than a truly representative random sample of the population, I would like to report a 95% confidence interval around the point estimate.
Having looked into how to address this seemingly simple problem in Stata (I work with version 13.1), I realized that Stata offers several approaches. I think the svy: commands are not required here because I have neither sampling weights nor clustered samples, nor anything else that would make things more complicated than necessary (am I right in this assumption?). Basically, I have a proportion (10/65), and I need to construct a confidence interval around that point estimate.
Stata offers the commands "proportion" and "ci". I am aware that "proportion" does not require a binary (0/1) variable, whereas "ci" does for binomial confidence intervals. However, when I use a binary dummy variable that codes whether the answer was correct or not, I should be able to use either command and get a valid result. Remarkably, however, the methods used to calculate the confidence interval differ substantially. "ci" lets me choose among 5 well-known methods for binomial confidence intervals, whereas "proportion" uses a logit transform by default and also allows bootstrap or jackknife techniques. While I have read literature about the pros and cons of the different binomial approaches, I was wondering why the Stata command "proportion" uses a different approach, such that the results from "proportion" are not reproducible with "ci" and vice versa, and whether "proportion" or "ci" is more appropriate for my data.
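To make the discrepancy concrete, here is a minimal Python sketch (so not Stata itself) of three textbook 95% intervals for 10/65: the Wald interval, the Wilson interval, and an interval built on the logit scale and transformed back. The function names are my own. The logit version uses a normal quantile and the standard error sqrt(1/x + 1/(n-x)); I assume this is only close to what "proportion" does, since Stata may use a t quantile and a slightly different variance estimate, so the numbers need not match its output exactly.

```python
from math import exp, log, sqrt
from statistics import NormalDist


def z_value(level=0.95):
    """Two-sided standard normal critical value for the given confidence level."""
    return NormalDist().inv_cdf(1 - (1 - level) / 2)


def wald_ci(x, n, level=0.95):
    """Normal-approximation (Wald) interval: p +/- z * sqrt(p(1-p)/n)."""
    p = x / n
    half = z_value(level) * sqrt(p * (1 - p) / n)
    return p - half, p + half


def wilson_ci(x, n, level=0.95):
    """Wilson score interval."""
    p = x / n
    z = z_value(level)
    denom = 1 + z**2 / n
    centre = (p + z**2 / (2 * n)) / denom
    half = z * sqrt(p * (1 - p) / n + z**2 / (4 * n**2)) / denom
    return centre - half, centre + half


def logit_ci(x, n, level=0.95):
    """Interval computed on the logit scale, then back-transformed.

    Assumption: normal quantile and se = sqrt(1/x + 1/(n-x)); requires 0 < x < n.
    """
    p = x / n
    z = z_value(level)
    centre = log(p / (1 - p))
    se = sqrt(1 / x + 1 / (n - x))
    lower, upper = centre - z * se, centre + z * se
    # Inverse logit keeps both limits strictly inside (0, 1)
    return 1 / (1 + exp(-lower)), 1 / (1 + exp(-upper))


if __name__ == "__main__":
    for name, fn in [("wald", wald_ci), ("wilson", wilson_ci), ("logit", logit_ci)]:
        lo, hi = fn(10, 65)
        print(f"{name:7s} {lo:.4f} {hi:.4f}")
```

With these inputs the Wald interval is symmetric around 10/65, while the Wilson and logit intervals are both shifted upwards, which is the kind of difference I see between the two commands.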
Cheers,
Patrick