
  • How to calculate weights?

    Hi experts!

    Suppose that in the subsample (total number = n), 50% of participants are black and 50% are white, while in the full sample (total number = N), 75% are black and 25% are white. So in the full sample, the probability (p) of being black is 0.75 and the probability of being white is 0.25. How can I calculate the weights (wt) that I want to include as [pweight=wt] when I run the regression model on the subsample?

    Let's say Pi is the probability for each participant in the subsample (0.75 or 0.25 in this case). Is the weight = (Pi / sum of Pi) * N, where N is the total number in the full sample? Thanks everyone!
    Last edited by Dan Su; 08 Mar 2017, 13:53. Reason: weighting

  • #2
    I know if

    NB = # blacks in population
    nb = # blacks in subsample
    NW = # whites in population
    nw = # whites in subsample

    then
    weight for blacks = NB/nb
    weight for whites = NW/nw

    In this simple example, it should work out to relative weights of 1.5 for blacks (0.75/0.50) and 0.5 for whites (0.25/0.50), each multiplied by the common factor N/n.
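
    As a quick check of the arithmetic, here is a minimal sketch in Python. The sizes N = 1000 and n = 200 are made up for illustration (the thread gives only percentages), and the function name stratum_weights is hypothetical. It computes NB/nb and NW/nw and then divides by N/n to get the relative weights.

```python
def stratum_weights(pop_counts, sample_counts):
    """Return the weight N_g / n_g for each group g."""
    return {g: pop_counts[g] / sample_counts[g] for g in pop_counts}

# Assumed sizes (not from the thread): full sample N = 1000 with 75% black
# and 25% white; subsample n = 200 with 50% black and 50% white.
pop_counts = {"black": 750, "white": 250}
sample_counts = {"black": 100, "white": 100}

weights = stratum_weights(pop_counts, sample_counts)

# Dividing by N/n rescales the weights so they average 1 in the subsample.
scale = sum(pop_counts.values()) / sum(sample_counts.values())
relative = {g: w / scale for g, w in weights.items()}

print(weights)   # {'black': 7.5, 'white': 2.5}
print(relative)  # {'black': 1.5, 'white': 0.5}
```

    With these assumed sizes the raw weights are 7.5 and 2.5, and the rescaled ones are 1.5 and 0.5. The two sets differ only by the constant N/n = 5.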

    However, since I still need to use other covariates such as age, gender, etc. besides race to predict the probability of being selected for the subsample, I can't simply calculate the weights like that. If I have already used "logistic" to predict that probability for each person in the subsample, what's the formula to calculate the weights?


    • #3
      Steve Samuels Hi Steve, can you help me with this? Thanks!


      • #4

        This "formula" is correct.

        That said, this text may be helpful to you.

        Sampling weights: There are several types of weights that can be associated with a survey. Perhaps the most common is the sampling weight. A sampling weight is a probability weight that has had one or more adjustments made to it. Both a sampling weight and a probability weight are used to weight the sample back to the population from which the sample was drawn. By definition, a probability weight is the inverse of the probability of being included in the sample due to the sampling design (except for a certainty PSU, see below). The probability weight, called a pweight in Stata, is calculated as N/n, where N = the number of elements in the population and n = the number of elements in the sample. For example, if a population has 10 elements and 3 are sampled at random with replacement, then the probability weight would be 10/3 = 3.33.
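
        The definition quoted above (weight = inverse of the inclusion probability) can be sketched in a few lines of Python. This is only an illustration: the function name inverse_probability_weights and the example probabilities are hypothetical, and the probabilities are taken as given. If they were estimated, for instance predicted from a logistic model as asked earlier in the thread, the same arithmetic would presumably be applied to the predicted values.

```python
def inverse_probability_weights(probs):
    """Weight each sampled unit by 1 / (its probability of inclusion)."""
    if any(p <= 0 or p > 1 for p in probs):
        raise ValueError("each probability must lie in (0, 1]")
    return [1.0 / p for p in probs]

# Hypothetical inclusion probabilities for three sampled units.
probs = [0.5, 0.25, 0.125]
weights = inverse_probability_weights(probs)
print(weights)  # [2.0, 4.0, 8.0]

# Equal-probability case from the quoted text: 3 of 10 elements sampled,
# so p = n/N = 3/10 and the weight is N/n = 10/3.
srs_weight = inverse_probability_weights([3 / 10])[0]
print(round(srs_weight, 2))  # 3.33
```

        In the equal-probability case the weight reduces to N/n, matching the 10/3 = 3.33 example. Units with a smaller inclusion probability get a larger weight.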
        Best regards,

        Marcos


        • #5
          Originally posted by Marcos Almeida
          This "formula" is correct.

          That said, this text may be helpful to you.

          Thanks Marcos! Just want to make sure: is the weight for each participant, weight = (Pi / sum of Pi) * N, correct? N is the total number in the full sample, and Pi is the probability of each participant being selected for the subsample. Thanks!


          • #6
            Hello Dan,

            The core principle is that the weight relates the n (or percentage) for a given level of a variable in the population to the n (or percentage) for the very same level of that variable in the sample.

            There are examples of weights in this text, relying on one or more variables.

            I believe they apply exactly to your needs.
            Best regards,

            Marcos
