  • Difference between two means with highly overlapping 95% CIs is significant at p < 0.001 (?)

    Greetings,

    I'm running Stata 15.1 on macOS and working with survey data that has been merged with census data (matched on respondents' self-reported county and zip code of residence). I'm trying to determine whether the mean of a continuous outcome differs significantly across political groups ('party3', a categorical variable):

    Code:
    . reg segindex i.party3  if  white==1 [pweight=weight_pre], cluster(inputstate)
    (sum of wgt is 39,638.3774779992)
    
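    Linear regression                               Number of obs     =     41,878
                                                    F(2, 50)          =       9.09
                                                    Prob > F          =     0.0004
                                                    R-squared         =     0.0066
                                                    Root MSE          =     .98795

                                (Std. Err. adjusted for 51 clusters in inputstate)
    ------------------------------------------------------------------------------
                 |               Robust
        segindex |      Coef.   Std. Err.      t    P>|t|     [95% Conf. Interval]
    -------------+----------------------------------------------------------------
          party3 |
              2  |  -.1022216   .0369211    -2.77   0.008    -.1763797   -.0280635
              3  |  -.1774221   .0444548    -3.99   0.000    -.2667122    -.088132
                 |
           _cons |   .0772096   .1093238     0.71   0.483    -.1423737    .2967929
    ------------------------------------------------------------------------------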
    
    
    . margins i.party3, post
    
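    Adjusted predictions                            Number of obs     =     41,878
    Model VCE    : Robust

    Expression   : Linear prediction, predict()

    ------------------------------------------------------------------------------
                 |            Delta-method
                 |     Margin   Std. Err.      t    P>|t|     [95% Conf. Interval]
    -------------+----------------------------------------------------------------
          party3 |
              1  |   .0772096   .1093238     0.71   0.483    -.1423737    .2967929
              2  |   -.025012   .1101313    -0.23   0.821    -.2462172    .1961932
              3  |  -.1002126   .1150351    -0.87   0.388    -.3312674    .1308423
    ------------------------------------------------------------------------------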
    
    . margins, coeflegend
    
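    Adjusted predictions                            Number of obs     =     41,878
    Model VCE    : Robust

    Expression   : Linear prediction, predict()

    ------------------------------------------------------------------------------
                 |     Margin  Legend
    -------------+----------------------------------------------------------------
          party3 |
              1  |   .0772096  _b[1bn.party3]
              2  |   -.025012  _b[2.party3]
              3  |  -.1002126  _b[3.party3]
    ------------------------------------------------------------------------------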
    
    . test _b[3.party3]=_b[1bn.party3]
    
    ( 1)  - 1bn.party3 + 3.party3 = 0
    
    F(  1,    50) =   15.93
    Prob > F =    0.0002

    What's confusing me is that the test reports p < 0.001 (Prob > F = 0.0002), yet the 95% confidence intervals for the individual means overlap considerably. I understand that two means with overlapping confidence intervals can still differ significantly at the p < 0.05 level. But at the p < 0.001 level? That doesn't make sense to me. Can anyone tell me what's going on here (or what I'm missing)? Thanks in advance for your time.
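
    One possible reading, offered as a sketch rather than a definitive diagnosis: each confidence interval is built from that mean's own standard error, whereas the test uses the standard error of the difference, which also depends on the covariance between the two estimates. The arithmetic below uses only the standard errors printed in the clustered output to back out the covariance those results imply. The scalar names (se1, se3, sediff, cov13, rho13) are illustrative and are not part of the original commands.

    Code:
    * Standard errors copied from the clustered output
    * se1, se3: margins for party3 == 1 and party3 == 3
    * sediff:   coefficient on 3.party3, i.e. the 3-vs-1 difference
    scalar se1    = .1093238
    scalar se3    = .1150351
    scalar sediff = .0444548

    * Var(m3 - m1) = Var(m1) + Var(m3) - 2*Cov(m1, m3), solved for the covariance
    scalar cov13 = (se1^2 + se3^2 - sediff^2) / 2
    scalar rho13 = cov13 / (se1 * se3)
    display "implied covariance  = " cov13
    display "implied correlation = " rho13

    * Standard error the difference would have if the two margins were uncorrelated
    display "SE if uncorrelated  = " sqrt(se1^2 + se3^2)

    The implied correlation comes out at roughly 0.92, so under clustering the two means appear to move together, and their difference is estimated far more precisely (about .044) than it would be if they were uncorrelated (about .159). Overlap of the individual intervals therefore says little about the test of the difference.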
    Last edited by Zach Goldberg; 08 Apr 2022, 22:38.

  • #2
    ...Interestingly, when I don't cluster the standard errors by state of residence, the CIs no longer overlap:

    Code:
    . reg segindex i.party3  if  white==1 [pweight=weight_pre]
    (sum of wgt is 39,638.3774779992)
    
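    Linear regression                               Number of obs     =     41,878
                                                    F(2, 41875)       =      82.07
                                                    Prob > F          =     0.0000
                                                    R-squared         =     0.0066
                                                    Root MSE          =     .98795

    ------------------------------------------------------------------------------
                 |               Robust
        segindex |      Coef.   Std. Err.      t    P>|t|     [95% Conf. Interval]
    -------------+----------------------------------------------------------------
          party3 |
              2  |  -.1022216   .0197792    -5.17   0.000    -.1409892    -.063454
              3  |  -.1774221   .0139319   -12.73   0.000    -.2047289   -.1501154
                 |
           _cons |   .0772096   .0091031     8.48   0.000     .0593674    .0950518
    ------------------------------------------------------------------------------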
    
    . margins i.party3, post 
    
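    Adjusted predictions                            Number of obs     =     41,878
    Model VCE    : Robust

    Expression   : Linear prediction, predict()

    ------------------------------------------------------------------------------
                 |            Delta-method
                 |     Margin   Std. Err.      t    P>|t|     [95% Conf. Interval]
    -------------+----------------------------------------------------------------
          party3 |
              1  |   .0772096   .0091031     8.48   0.000     .0593674    .0950518
              2  |   -.025012   .0175599    -1.42   0.154    -.0594298    .0094058
              3  |  -.1002126   .0105467    -9.50   0.000    -.1208842   -.0795409
    ------------------------------------------------------------------------------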

    Can anyone explain what's going on here? Thanks!
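
    As a rough check on the unclustered results, the same identity, Var(m3 - m1) = Var(m1) + Var(m3) - 2*Cov(m1, m3), can be applied to the standard errors printed in this second output. This is only a sketch using the reported numbers; the scalar names are illustrative and not part of the original commands.

    Code:
    * Standard errors copied from the unclustered output
    * se1, se3: margins for party3 == 1 and party3 == 3
    * sediff:   coefficient on 3.party3, i.e. the 3-vs-1 difference
    scalar se1    = .0091031
    scalar se3    = .0105467
    scalar sediff = .0139319

    * Standard error of the difference if the two margins were uncorrelated
    display "SE if uncorrelated  = " sqrt(se1^2 + se3^2)

    * Covariance implied by the reported standard errors
    display "implied covariance  = " (se1^2 + se3^2 - sediff^2) / 2

    Here the uncorrelated value (about .0139) essentially reproduces the reported standard error of the difference (.0139319), so the implied covariance is close to zero. Without clustering, the group means appear to be treated as nearly uncorrelated, and each mean's standard error is much smaller, which would explain why the intervals separate.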
