
  • How can I calculate a weighted standard deviation?

    Dear All,
    I have survey data collected through stratified sampling. I defined a weight equal to the inverse of each observation's probability of inclusion. Therefore, when I calculate the mean or run a regression, I should use "pweight". But pweight can't be used to calculate the sd, so what should I do to calculate the standard deviation? (I use "collapse" to calculate the mean, median, and sd.)
    Thank you!
    Last edited by Stephen Wee; 22 Feb 2020, 00:33.

  • #2
    You didn't get a quick answer. You will increase your chances of a useful answer by following the FAQ on asking questions – provide Stata code in code delimiters, readable Stata output, and sample data using dataex.

    The conventional way to calculate summary statistics is the summarize command. It does allow weights. I don't do survey data so I can't comment on whether this is appropriate for survey data.
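
    As a minimal sketch of what that looks like (using the nhanes2f example dataset and its finalwgt variable, which are assumptions here and not the original poster's data): -summarize- accepts aweights, fweights, and iweights, but not pweights, so the sampling weight has to be passed as an aweight. Whether the resulting standard deviation is the right estimate for a given survey design should be checked separately.
    Code:
    webuse nhanes2f, clear
    * summarize rejects [pweight = ...], so the sampling weight is supplied as an aweight
    summarize weight height [aweight = finalwgt]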



    • #3
      my guess is that what you are calling the "standard deviation" is what Stata calls the standard error for survey data; first, of course, you need to -svyset- your data; then see
      Code:
      help mean
      within the help for -mean-, click on "Methods and Formulas" under "links to pdf documentation"; there you will see a formula for the variance followed by the following statement: "The standard error of the mean is the square root of the variance."
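
      a rough sketch of those steps, assuming the nhanes2f example dataset (the variable names psuid, stratid, and finalwgt belong to that dataset, not to the original poster's survey, and would need to be replaced with the actual design variables):
      Code:
      webuse nhanes2f, clear
      * declare the design: PSU variable, sampling weight, and strata
      svyset psuid [pweight = finalwgt], strata(stratid)
      * design-based mean; the Std. Err. column is the standard error of the mean
      svy: mean weight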

      since you provided no real information on the survey, I have no idea whether -collapse- is giving you anything meaningful

      if this is not what you mean, please clarify



      • #4
        Unfortunately, -summarize- does not accept pweights. Instead, svyset your data and check out -help svy_estat-; after -svy: mean-, -estat sd- reports the standard deviations. Here is an example:
        Code:
        . webuse nhanes2f, clear
        
        . svyset
        
              pweight: finalwgt
                  VCE: linearized
          Single unit: missing
             Strata 1: stratid
                 SU 1: psuid
                FPC 1: <zero>
        
        . svy: mean health weight height
        (running mean on estimation sample)
        
        Survey: Mean estimation
        
        Number of strata =      31      Number of obs   =       10,335
        Number of PSUs   =      62      Population size =  116,997,257
                                        Design df       =           31
        
        --------------------------------------------------------------
                     |             Linearized
                     |       Mean   Std. Err.     [95% Conf. Interval]
        -------------+------------------------------------------------
              health |   3.607662   .0228896      3.560979    3.654346
              weight |   71.91131   .1670327      71.57065    72.25198
              height |   168.4647   .1471856      168.1645    168.7649
        --------------------------------------------------------------
        
        . estat sd
        
        -------------------------------------
                     |       Mean   Std. Dev.
        -------------+-----------------------
              health |   3.607662    1.148297
              weight |   71.91131    15.43409
              height |   168.4647    9.702569
        -------------------------------------
        
        .
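
        Since the original question used -collapse-: -collapse- does not allow (sd) with pweights, but it does allow it with aweights. A hedged sketch of a by-group version, again with the nhanes2f variables (sex is just an illustrative grouping variable, and the new variable names are made up), followed by the survey-based figures to compare against. It would be worth checking that the two sets of standard deviations agree for your own design before relying on the -collapse- numbers.
        Code:
        webuse nhanes2f, clear
        * survey-based means and standard deviations by group, for comparison
        svy: mean weight, over(sex)
        estat sd
        * collapse replaces the data in memory, so preserve it first
        preserve
        collapse (mean) mean_wt = weight (sd) sd_wt = weight [aweight = finalwgt], by(sex)
        list
        restore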
        -------------------------------------------
        Richard Williams, Notre Dame Dept of Sociology
        Stata Version: 17.0 MP (2 processor)

        WWW: https://www3.nd.edu/~rwilliam



        • #5
          Thank you. I will try this.
