  • Differences in standard errors using 'mean' or 'proportion' for indicator variables?

    Dear Statalist,

    My question is about the difference between the 'mean' and 'proportion' commands in Stata, and whether the two commands calculate standard errors differently.

    For example, with a survey data set, I would like to calculate the proportion of female individuals in the sample, and the gender of individuals is coded in a dummy variable (say, 'gender' with 0==male, 1==female). I could then either run:

    svy: mean gender

    or

    svy: proportion gender

    Both would give me the same point estimate of the proportion of female individuals in the sample. The confidence intervals differ, though, while the standard errors appear to be identical.

    From the Stata documentation (r.pdf, page 1684) I understand that with the 'proportion' command, Stata applies a logit transformation to the estimated proportion so that the endpoints of the confidence interval lie between 0 and 1. My statistics knowledge is limited, but I think that this does not affect the standard errors. Is that correct?

    In addition, if standard errors do not differ, this means that hypothesis testing after estimating proportions is also not affected, right?

    Thank you!

    Paul

    PS: I am using Stata 13.0.


  • #2
    Paul is correct: the logit transformation used in proportion affects only
    the confidence interval, not the reported standard error.

    Paul then asks:

        "In addition, if standard errors do not differ, this means that hypothesis
        testing after estimating proportions is also not affected, right?"

    This is also correct, and easy to verify.

    Consider the following minimal example using the auto data.

    Code:
    . sysuse auto
    (1978 Automobile Data)
            
    . svyset _n
    
          pweight: <none>
              VCE: linearized
      Single unit: missing
         Strata 1: <one>
             SU 1: <observations>
            FPC 1: <zero>
    
    . svy: proportion foreign 
    (running proportion on estimation sample)
    
    Survey: Proportion estimation
    Number of strata =       1          Number of obs    =      74
    Number of PSUs   =      74          Population size  =      74
                                        Design df        =      73
    
    --------------------------------------------------------------
                 |             Linearized
                 | Proportion   Std. Err.     [95% Conf. Interval]
    -------------+------------------------------------------------
    foreign      | 
        Domestic |   .7027027   .0534958      .5865827    .7974684
         Foreign |   .2972973   .0534958      .2025316    .4134173
    --------------------------------------------------------------
    
    . test _b[Foreign] = 0.25
    
    Adjusted Wald test
    
     ( 1)  [foreign]Foreign = .25
    
           F(  1,    73) =    0.78
                Prob > F =    0.3795
    
    . svy: mean foreign
    (running mean on estimation sample)
    
    Survey: Mean estimation
    
    Number of strata =       1          Number of obs    =      74
    Number of PSUs   =      74          Population size  =      74
                                        Design df        =      73
    
    --------------------------------------------------------------
                 |             Linearized
                 |       Mean   Std. Err.     [95% Conf. Interval]
    -------------+------------------------------------------------
         foreign |   .2972973   .0534958      .1906803    .4039143
    --------------------------------------------------------------
    
    . test _b[foreign] = 0.25
    
    Adjusted Wald test
    
     ( 1)  foreign = .25
    
           F(  1,    73) =    0.78
                Prob > F =    0.3795
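
    To see where the two intervals come from, both can be rebuilt by hand from the same point estimate and standard error. The following is only a sketch: it assumes the values reported above are typed in directly (so the results agree with the output only up to rounding), and the scalar names are illustrative rather than anything stored by svy.

    Code:
    * values copied from the output above (scalar names are illustrative)
    scalar p  = .2972973
    scalar se = .0534958
    * two-sided 95% critical value of the t distribution, design df = 73
    scalar tc = invttail(73, 0.025)

    * svy: mean builds a symmetric interval on the probability scale
    display p - tc*se, p + tc*se

    * svy: proportion builds the interval on the logit scale and back-transforms it
    * se_logit is the delta-method standard error of logit(p)
    scalar se_logit = se/(p*(1 - p))
    display invlogit(logit(p) - tc*se_logit), invlogit(logit(p) + tc*se_logit)

    * the Wald test of proportion = 0.25 uses only p and se
    * with a single constraint, the F statistic is the squared t ratio
    scalar F = ((p - 0.25)/se)^2
    display F, Ftail(1, 73, F)

    The first two display lines should match the intervals from svy: mean and svy: proportion respectively, and the last should match the F statistic and p-value from test. Only the interval construction involves the logit transformation; the standard error enters all three calculations unchanged.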



    • #3
      Hi Jeff,

      many thanks for your swift and clear reply!

      Best

      Paul
