Without a reference to the specific LQAS manual, it's difficult to know what the local consultant did. However, it is apparent that the consultant ignored the multistage character of the design.
What is not clear is whether your target population is 1) all people in the 21 districts served by the provider organizations or 2) all people in the 21 districts who might meet the criteria for service by a PO.
To address your questions:
1. svyset statement
As only one sub-county is selected per district, you are right to omit a sub-county stage. You have also omitted a respondent stage, which would ordinarily mean that all respondents are analyzed (fpc = 1). I would modify the statement to:
Code:
svyset District [pweight=sampweights], fpc(fpc1) || PO, fpc(fpc2) || _n
Technically, this is valid only for a design in which respondents were randomly selected with replacement. This, of course, is not true, but it is a way to avoid the assumption that fpc = 1 at the respondent stage.
2. Sampling weight
Add a factor for number of respondents in each PO to your formula. This treats the response rate in the PO as a sampling rate, so can be considered a non-response adjustment.
sampweights = (21/6) * (# of sub-counties in district/1) * (# of POs in sub-county/3) * (# members in PO/# responding in PO)
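A minimal sketch of how this weight might be built in Stata. It assumes the data hold one row per responding member, and the variable names n_subcounty, n_po, and n_members are hypothetical placeholders for counts you would have to supply yourself:

```stata
* Hypothetical variables (not from the original post):
*   n_subcounty : number of sub-counties in the respondent's district
*   n_po        : number of POs in the selected sub-county
*   n_members   : number of members in the respondent's PO
* Assumes the dataset contains responders only, one row each.

* Number of responders in each PO
bysort District PO: generate n_resp = _N

* Product of the inverse selection probabilities at each stage,
* times the within-PO non-response adjustment
generate double sampweights = (21/6) * (n_subcounty/1) * (n_po/3) * (n_members/n_resp)
```

Summing sampweights over all rows should give a rough estimate of the total PO membership in the 21 districts, which is a useful sanity check.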
3. Finite population corrections
fpc1 = 6/21
fpc2 for a PO = 3/(number of POs in the selected sub-county)
Equivalently, you can supply the population counts instead of the sampling rates:
fpc1 = 21
fpc2 for the selected sub-county = number of POs in the selected sub-county
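As a sketch, the population-count version could be set up as follows. It assumes a hypothetical variable n_po holding the number of POs in the selected sub-county, and a weight variable sampweights that already exists:

```stata
* n_po is a hypothetical variable: number of POs in the selected sub-county
generate double fpc1 = 21      // districts in the population
generate double fpc2 = n_po    // POs in the selected sub-county

svyset District [pweight=sampweights], fpc(fpc1) || PO, fpc(fpc2) || _n

* Describe the declared design to check the stage counts
svydescribe
```

svydescribe reports the number of sampled units at each stage, so you can confirm that Stata sees 6 districts and 3 POs per district.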
4. Post-stratification
I'm not sure what you meant by postweights to "account for non-response across districts". There is potentially a different response rate in each PO. The last term in the sampling weight definition automatically corrects for differential response. (It does not remove bias caused by differences between responders and non-responders.)
I think what you are seeking is how to use external information so that the sample better represents the population. When this is done to, say, match the age distribution of the sample to that of the population, you use the poststrata() and postweight() options of svyset. However, you may have external information on several factors for all people served by a PO in all 21 districts, for example:
• male-female percentages
• whether the PO serves a rural area or a more urban area
• the number of people served by each PO (this can be a rough count)
• whether a district is "large" or "small"
These last two are particularly important. If there are a few "large" districts or "large" POs in a subcounty, a simple random sample is apt to miss them. The preferred method for sampling units of different sizes is sampling with probability proportional to size (PPS).
If the weighted sample distributions for these factors differ much from the external information, you can try to apply post-stratification techniques. For a single classification, you can use the poststrata() option of svyset, as mentioned. To post-stratify simultaneously on several factors, use ipfweight by Michael Bergmann or survwgt rake by Nick Winter, both at SSC. John D'Souza's calibrate (followed by calibest) (SSC) can control for differences between sample and population means of quantitative characteristics.
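For the single-classification case, a sketch of the svyset call might look like this. The variables agegrp (post-stratum identifier), agepop (population count of each post-stratum, repeated on every row of that post-stratum), and outcome are hypothetical; fpc1, fpc2, and sampweights are assumed to exist already:

```stata
* agegrp  : hypothetical post-stratum identifier (e.g. age group)
* agepop  : hypothetical population count for the respondent's post-stratum
* outcome : hypothetical analysis variable
svyset District [pweight=sampweights], fpc(fpc1) ///
    poststrata(agegrp) postweight(agepop)        ///
    || PO, fpc(fpc2) || _n

* svy estimators now adjust the weights to the post-stratum totals
svy: mean outcome
```

The population counts in agepop must come from an external source; they cannot be estimated from the sample itself.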
Reference:
Battaglia, M. P., Hoaglin, D. C., & Frankel, M. R. (2013). Practical considerations in raking survey data. Survey Practice, 2(5).
(This illustrates the method used by ipfweight and survwgt rake.)
Correction: The Stata Manual references in Post #5 are not correct. A corrected and perhaps clearer version follows.
The reduced standard error after svyset with the post-stratification option is not a bug. svyset not only computes post-stratified weights (as the other commands do), but Stata's svy commands also modify the variance calculations by considering each observation's difference from its post-stratum mean. See Equation (1) in the Methods and formulas section of the poststratification manual entry (p. 56, Stata 14 SVY manual).
This implies that if you, Lars, do the simultaneous post-stratification by organization, age, and gender, then that would be preferred.