How to calculate weights?

Dan Su

Join Date: Mar 2017

Posts: 29
#1

08 Mar 2017, 13:52

Hi experts!

Suppose that in the subsample (size n), 50% of participants are black and 50% are white, while in the full sample (size N), 75% are black and 25% are white. So in the full sample, the probability (p) is 0.75 for black and 0.25 for white. How can I calculate the weights (wt) to supply as [pweight=wt] when I run the regression model on the subsample?

Let's say Pi is the probability for each participant in the subsample (0.75 or 0.25 in this case). Is the weight = (Pi / sum of Pi) * N, where N is the total size of the full sample? Thanks everyone!

Last edited by Dan Su; 08 Mar 2017, 13:53. Reason: weighting
Tags: pweight, sample, statistics, weighting, weights
Dan Su

Join Date: Mar 2017

Posts: 29
#2

08 Mar 2017, 14:10

I know if

NB = # blacks in population
nb = # blacks in subsample
NW = # whites in population
nw = # whites in subsample

then
weight for blacks = NB/nb
weight for whites = NW/nw

In this simple example, it should work out to a weight of 1.5 for blacks and 0.5 for whites (after rescaling NB/nb and NW/nw by n/N, so that the weights average 1 in the subsample).
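As a quick check of that arithmetic, here is a minimal Python sketch (not Stata code). The totals N = 1000 and n = 200 are made-up numbers assumed only for illustration; any other totals give the same rescaled weights.

```python
# Hypothetical totals, assumed only for illustration.
N = 1000  # size of the full sample
n = 200   # size of the subsample

pop_share = {"black": 0.75, "white": 0.25}  # shares in the full sample
sub_share = {"black": 0.50, "white": 0.50}  # shares in the subsample

# Raw weights: group count in the full sample / group count in the subsample,
# i.e. NB/nb and NW/nw.
raw = {g: (pop_share[g] * N) / (sub_share[g] * n) for g in pop_share}

# Rescale by n/N so the weights average 1 over the subsample.
scaled = {g: raw[g] * n / N for g in raw}

print(raw)
print(scaled)
```

The rescaled weight is just the full-sample share divided by the subsample share, so it does not depend on the assumed N and n.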

However, since I still need to use other covariates such as age, gender, etc., besides race, to predict the probability of being selected for the subsample, I can't simply calculate the weights like that. If I have already used "logistic" to predict the selection probability for each person in the subsample, what is the formula for calculating the weights?
Comment
Dan Su

Join Date: Mar 2017

Posts: 29
#3

10 Mar 2017, 11:37

Steve Samuels Hi Steve, can you help me with this? Thanks!
Comment
Marcos Almeida

Join Date: Apr 2014

Posts: 4047
#4

10 Mar 2017, 13:50

This "formula" is correct.

That said, this text may be helpful to you.

Sampling weights: There are several types of weights that can be associated with a survey. Perhaps the most common is the sampling weight. A sampling weight is a probability weight that has had one or more adjustments made to it. Both a sampling weight and a probability weight are used to weight the sample back to the population from which the sample was drawn. By definition, a probability weight is the inverse of the probability of being included in the sample due to the sampling design (except for a certainty PSU, see below). The probability weight, called a pweight in Stata, is calculated as N/n, where N = the number of elements in the population and n = the number of elements in the sample. For example, if a population has 10 elements and 3 are sampled at random with replacement, then the probability weight would be 10/3 = 3.33.
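The definition in the quoted text (weight = inverse of the inclusion probability) can be sketched in a few lines of Python. This is only an illustration: the function name `probability_weight` and the `phat` values are hypothetical, with `phat` standing in for selection probabilities predicted by some model (for example, a logistic model). Whether model-based probabilities suit a given design is a separate question.

```python
def probability_weight(p_selected):
    """Return the inverse of the selection probability (hypothetical helper)."""
    if not 0 < p_selected <= 1:
        raise ValueError("selection probability must be in (0, 1]")
    return 1.0 / p_selected

# The example from the quoted text: 3 of 10 elements sampled, so p = 3/10.
example = probability_weight(3 / 10)
print(round(example, 2))

# Hypothetical predicted selection probabilities for three subsample members.
phat = [0.10, 0.25, 0.50]
weights = [probability_weight(p) for p in phat]
print(weights)
```

People who were less likely to be selected get larger weights, because each of them stands in for more members of the full sample.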

Best regards,

Marcos
Comment
Dan Su

Join Date: Mar 2017

Posts: 29
#5

13 Mar 2017, 08:08

Originally posted by Marcos Almeida

This "formula" is correct.

That said, this text may be helpful to you.

Thanks Marcos! Just to make sure: is the weight for each participant, weight = (Pi / sum of Pi) * N, correct? Here N is the total size of the full sample, and Pi is each participant's probability of being selected for the subsample. Thanks!
Comment
Marcos Almeida

Join Date: Apr 2014

Posts: 4047
#6

13 Mar 2017, 09:10

Hello Dan,

The core principle is to compare the n (or percentage) for a given level of a variable in the population with the n (or percentage) for the very same level of that variable in the sample.

There are examples of weights in this text, relying on one or more variables.

I believe they apply exactly to your needs.
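To illustrate that principle with more than one variable, here is a minimal Python sketch that forms cells from race and gender and takes the ratio of the full-sample share to the subsample share in each cell. All counts are hypothetical, chosen so that the full sample is 75% black and the subsample is 50% black, as in the original question; the function name `cell_weights` is also just an illustration.

```python
from collections import Counter

# Hypothetical (race, gender) records, assumed only for illustration.
full = ([("black", "F")] * 40 + [("black", "M")] * 35
        + [("white", "F")] * 15 + [("white", "M")] * 10)
sub = ([("black", "F")] * 10 + [("black", "M")] * 10
       + [("white", "F")] * 10 + [("white", "M")] * 10)

def cell_weights(full_records, sub_records):
    """Weight per cell = share in the full sample / share in the subsample."""
    full_counts = Counter(full_records)
    sub_counts = Counter(sub_records)
    big_n, small_n = len(full_records), len(sub_records)
    return {
        cell: (full_counts[cell] / big_n) / (sub_counts[cell] / small_n)
        for cell in sub_counts
    }

weights = cell_weights(full, sub)
for cell, w in sorted(weights.items()):
    print(cell, round(w, 2))
```

With many covariates or continuous ones such as age, the cells become sparse, which is one reason a model for the selection probability may be used instead.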

Best regards,

Marcos
Comment
