  • Weighted tab and table results differ

    I'm getting different point estimates when using tab and aweight instead of table and pweight. Everything I've seen (such as this discussion http://www.stata.com/statalist/archi.../msg00423.html) indicates that aweight and pweight produce the same point estimate but different variances.

    I'm using the new 1972-2016 GSS data and a recoded White Baptist variable, but the difference is also apparent in the standard race variable. The weight variable is wtssall. If you run
    Code:
    tab race [aweight=wtssall]
    the white point estimate is 50,320.945. On the other hand, if you run
    Code:
    table race [pweight=wtssall]
    the white point estimate is 50,321.7. Why is this occurring? And when is each appropriate?
    Last edited by Simon Brauer; 19 May 2017, 16:04.

  • #2
    Welcome to Statalist, Simon.

    The description of your problem is a little sparse, which may be why you have yet to receive any suggestions. You should review the Statalist FAQ linked to from the top of the page, as well as from the Advice on Posting link on the page you used to create your post. Note especially sections 9-12 on how to best pose your question. The more you help others understand your problem, the more likely others are to be able to help you solve your problem.

    In this case, both tab and table produce more than a single number as output, and it would be useful to see the entire output from each. Is every estimate lower in tab than in table? If so, then the next question is, have you reviewed the output of help weight? In particular it says

    For most Stata commands, the recorded scale of aweights is irrelevant; Stata internally rescales them to sum to N, the number of observations in your data, when it uses them.
    which suggests to me that your tab results reflect the weights, rescaled to a smaller total. By the way, that same output describes the general use of each of the weights.

    I'm also led to ask if you have any missing values for race? That perhaps could affect your results.

    And pushing back yet farther, have you compared the unweighted results from tab and table: are they identical?
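The checks suggested in this post could be run along the following lines. This is only a sketch: it assumes the GSS extract is already in memory and that the variables are named race and wtssall, as in the original post.

```stata
* Sketch only: assumes the GSS data are in memory with variables
* race and wtssall, as named in the original post.

* 1. Any missing values on race?
count if missing(race)

* 2. Do the unweighted tables agree?
tabulate race
table race

* 3. How far is the sum of the weights from N?
quietly summarize wtssall if !missing(race)
display "N = " r(N) "   sum of weights = " %12.4f r(sum)
```

If the unweighted tables agree and the sum of the weights differs from N, the rescaling of aweights described in help weight is the natural suspect.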



    • #3
      William gives good advice. In addition:

      I'm getting different point estimates when using tab and aweight instead of table and pweight. Everything I've seen (such as this discussion http://www.stata.com/statalist/archi.../msg00423.html) indicates that aweight and pweight produce the same point estimate but different variances.
      Point estimates are usually discussed in the context of estimation, such as a regression. For example, running

      Code:
      sysuse auto
      reg price mpg [pw= weight]
      reg price mpg [aw= weight]
      will yield the same coefficients for _b[mpg] and _b[_cons] but different standard errors. Adding the robust option to the aweight regression should reproduce the pweight standard errors.

      Code:
      reg price mpg [aw= weight], robust
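One way to check the claim about the standard errors is to compare the estimated variance matrices directly. The sketch below uses the same auto data; the matrix names V_pw and V_aw are made up for illustration.

```stata
sysuse auto, clear

* pweights: robust variance is used by default
quietly regress price mpg [pweight = weight]
matrix V_pw = e(V)

* aweights with the robust variance requested explicitly
quietly regress price mpg [aweight = weight], vce(robust)
matrix V_aw = e(V)

* Relative difference between the two variance matrices
display mreldif(V_pw, V_aw)
```

A displayed value at or very near zero would indicate that the two sets of standard errors agree.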
      Running tab or table, on the other hand, just gives a summary of the data. The difference between

      the white point estimate is 50,320.945.
      and

      the white point estimate is 50,321.7.
      is most likely due to rounding. You do not show exactly what you typed and what Stata returned, so we do not know where these numbers come from. The following example, however, illustrates how rounding can affect comparisons between the two:


      Code:
      . webuse lbw
      (Hosmer & Lemeshow data)
      
      . tab race [aweight=age]
      
             race |      Freq.     Percent        Cum.
      ------------+-----------------------------------
            white | 100.352459       53.10       53.10
            black | 24.0983607       12.75       65.85
            other | 64.5491803       34.15      100.00
      ------------+-----------------------------------
            Total |        189      100.00
      
      . table race [pweight=age]
      
      ----------------------
           race |      Freq.
      ----------+-----------
          white |      2,332
          black |        560
          other |      1,500
      ----------------------
      
      
      *// Compare the percentage freq.(aweights) to Freq. (pweights): White
      . di 0.531*(2332+ 560+1500 )
      2332.152
      
      
      *// Compare  Freq. (pweights) to percentage freq.(aweights): White
      . di 2332/ (2332+ 560+1500)
      .53096539
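The link between the two frequency columns can also be checked by hand. The sketch below assumes, as help weight describes for most commands, that aweights are rescaled to sum to N, and that race == 1 is the "white" category in the lbw data. The scalar names k and white_rescaled are made up for illustration.

```stata
webuse lbw, clear

* Scale factor that makes the weights sum to N
quietly summarize age
scalar k = r(N) / r(sum)

* Weighted sum for white (assumes race == 1 is "white" in lbw),
* before and after rescaling
quietly summarize age if race == 1
scalar white_rescaled = r(sum) * k
display "raw sum      = " r(sum)
display "rescaled sum = " %10.6f white_rescaled
```

If the rescaling explanation holds, the raw sum should match the table frequency for white (2,332) and the rescaled sum should match the tab frequency (100.352459).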
      Last edited by Andrew Musau; 21 May 2017, 07:50.



      • #4
        Thank you William and Andrew, both for the specific details addressing the questions and broader suggestions on how to post thorough questions on Statalist.

        Following Andrew's suggestion, it does seem to be the result of rounding differences and possibly the rescaling of weights when using aweight (code below). The weights sum to about 0.99 more than N.

        Code:
        tab race [aweight=wtssall]
        
            race of |
         respondent |      Freq.     Percent        Cum.
        ------------+-----------------------------------
              white | 50,320.945       80.56       80.56
              black | 8,431.5265       13.50       94.06
              other | 3,713.5289        5.94      100.00
        ------------+-----------------------------------
              Total |     62,466      100.00
        
        
        table race [pweight=wtssall]
        
        ----------------------
        race of   |
        responden |
        t         |      Freq.
        ----------+-----------
            white |   50,321.7
            black |   8,431.66
            other |   3,713.59
        ----------------------
        
        di 0.8056 * (50321.7 + 8431.66 + 3713.59)
        50323.375
        
        di 50321.7 / (50321.7 + 8431.66 + 3713.59)
        .80557319
        
        
        
        sum wtssall
        
            Variable |        Obs        Mean    Std. Dev.       Min        Max
        -------------+---------------------------------------------------------
             wtssall |     62,466    1.000016    .4619267   .3918251   8.739876
        
        display r(sum)
        62466.99
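As a rough check of the rescaling explanation, the arithmetic below uses only the figures shown in the output above. It assumes that tab rescales aweights to sum to N, as the help weight passage quoted in #2 describes. Here f^aw and f^pw are shorthand for the tab and table frequencies, and the displayed r(sum) is itself rounded.

```latex
\[
f^{\mathrm{pw}}_g \approx f^{\mathrm{aw}}_g \times \frac{\sum_i w_i}{N},
\qquad
\frac{\sum_i w_i}{N} = \frac{62466.99}{62466} \approx 1.0000158
\]
\[
\begin{aligned}
\text{white:}\quad & 50320.945 \times 1.0000158 \approx 50321.74 && (\text{table shows } 50{,}321.7)\\
\text{black:}\quad & 8431.5265 \times 1.0000158 \approx 8431.66 && (\text{table shows } 8{,}431.66)\\
\text{other:}\quad & 3713.5289 \times 1.0000158 \approx 3713.59 && (\text{table shows } 3{,}713.59)
\end{aligned}
\]
```

On these numbers, the rescaling alone appears to account for the gap between the two commands.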

