Sampling weights or post-stratification

Marie Lyfe



13 Jan 2022, 10:10

Hello,

Background:
We administered a questionnaire to the entire student body of a school (461 students), but only 298 responded (163 non-respondents). We want our sample to be representative of the reference population jointly by gender (male and female) and by field of study (there are 4 fields of study in our school), which gives 8 distinct categories. After comparing the distribution of the reference population (461) with that of our sample (298) across the 8 gender-by-field categories, we found that some categories in our sample were under- or over-represented relative to the reference population.
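The comparison described above can be reproduced with a minimal sketch in Python (used here only to check the arithmetic, not as a substitute for Stata). The population counts are the n_type values and the respondent counts are the denominators of the wgt column in the table further down in this post; the names `pop`, `resp` and `share_gaps` are illustrative assumptions.

```python
# Population and respondent counts per gender-by-field category (sexfil 1..8),
# taken from the table in this post (n_type and the denominators of wgt).
pop = {1: 100, 2: 134, 3: 51, 4: 66, 5: 29, 6: 16, 7: 11, 8: 54}
resp = {1: 62, 2: 60, 3: 36, 4: 47, 5: 29, 6: 13, 7: 9, 8: 42}

N = sum(pop.values())   # total students in the school
n = sum(resp.values())  # total respondents


def share_gaps(pop, resp):
    """Return {category: respondent share minus population share}."""
    pop_total = sum(pop.values())
    resp_total = sum(resp.values())
    return {h: resp[h] / resp_total - pop[h] / pop_total for h in pop}


gaps = share_gaps(pop, resp)
under = sorted(h for h, g in gaps.items() if g < 0)  # too few respondents
over = sorted(h for h, g in gaps.items() if g > 0)   # too many respondents

for h in sorted(gaps):
    print(f"sexfil {h}: population {pop[h] / N:.1%}, respondents {resp[h] / n:.1%}")
print("under-represented:", under)
print("over-represented:", over)
```

With these counts, categories 1 and 2 have a smaller share among respondents than in the population, and the remaining six categories have a larger one.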

The gender and field-of-study information is available for all 461 students (i.e., for both respondents and non-respondents). It is also important to note that some respondents did not answer every question: some questions have, for example, 235 answers rather than 298.

Questions:
I am a little confused. I don't know whether I should use sampling weights (pweights) or poststratification, or whether the syntax of my command is correct.

Here is my reasoning: since the questionnaire was sent to the entire reference population and not to a sample:
1. I don't need to adjust for the sampling design by calculating the inverse of the probability of inclusion in the sample (pweight).
2. What I want to do is purely adjust for non-response, which varies from question to question.

So I think poststratification is the best solution. This method also allows the weighting to be adjusted for each question, scaling the number of answers to that question up to the total number of students in the school (i.e., 461).
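A minimal sketch of what a per-question adjustment amounts to, again in Python and only to illustrate the arithmetic. Within each category the weight is the population count divided by the number of students who answered that particular question. The function name `item_weights` is an assumption, and the per-category answer counts in `answered_q` are HYPOTHETICAL: the post only says that some question has 235 answers, not how they are distributed, so the split below is made up to sum to 235.

```python
from fractions import Fraction

# Population counts per category (n_type in this post).
pop = {1: 100, 2: 134, 3: 51, 4: 66, 5: 29, 6: 16, 7: 11, 8: 54}


def item_weights(pop, answered):
    """Weight N_h / r_h per category for one question.

    pop      -- population count N_h per category
    answered -- number of students r_h who answered this question, per category
    """
    weights = {}
    for h, N_h in pop.items():
        r_h = answered.get(h, 0)
        if r_h == 0:
            # An empty category cannot be weighted up; categories would
            # have to be merged before weighting.
            raise ValueError(f"no answers in category {h}")
        weights[h] = Fraction(N_h, r_h)
    return weights


# HYPOTHETICAL split of 235 answers across the 8 categories (not from the data).
answered_q = {1: 50, 2: 48, 3: 28, 4: 37, 5: 22, 6: 10, 7: 7, 8: 33}

w_q = item_weights(pop, answered_q)
total = sum(w_q[h] * answered_q[h] for h in pop)
print(total)  # weighted answers add up to the population size
```

Whatever the number of answers to a question, the weighted answers add back up to the 461 students, as long as every category has at least one answer.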

I use the following command:
svyset _n, poststrata(sexfil) postweight(n_type) fpc(n_pop)

sexfil	n_type	n_pop	wgt
1	100	461	100/62
2	134	461	134/60
3	51	461	51/36
4	66	461	66/47
5	29	461	29/29
6	16	461	16/13
7	11	461	11/9
8	54	461	54/42

where sexfil indicates the category to which the student belongs, n_type indicates the total number of students in the reference population in each category, and n_pop indicates the total number of students in the reference population. [The last column (wgt) represents the inverse of the probability of inclusion in the sample for each category (assuming 298 respondents), which I would have needed had I used the pweight option instead: svyset [pw=wgt], strata(sexfil) fpc(n_pop).]
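As a check on the wgt column, the sketch below (Python, for the arithmetic only) rebuilds each weight as n_type divided by the number of respondents in the category and confirms that the weighted respondents add up to n_pop. The names `pop`, `resp` and `wgt` are illustrative; the counts are those in the table above.

```python
from fractions import Fraction

# n_type (population) and respondent counts per sexfil category, from the table.
pop = {1: 100, 2: 134, 3: 51, 4: 66, 5: 29, 6: 16, 7: 11, 8: 54}
resp = {1: 62, 2: 60, 3: 36, 4: 47, 5: 29, 6: 13, 7: 9, 8: 42}

# wgt = n_type / respondents in the category (exact fractions, no rounding).
wgt = {h: Fraction(pop[h], resp[h]) for h in pop}

# Each respondent in category h stands for wgt[h] students.
weighted_total = sum(wgt[h] * resp[h] for h in pop)

for h in sorted(wgt):
    print(f"sexfil {h}: wgt = {pop[h]}/{resp[h]} = {float(wgt[h]):.3f}")
print("weighted total:", weighted_total)
```

The weights range from 1 (category 5, where all 29 students responded) to about 2.233 (category 2), and the weighted total equals the 461 students in the school.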

Does this seem correct to you?

Hopefully my question is clear, thanks for your help!

