  • Does subpopulation size matter?

    Hi,
    I've been analyzing the General Social Survey (n=23,093) in Canada. I'm interested in a small subpopulation (n=214), and I'm wondering whether I can use it for logistic regressions. Since I can apply survey weights, which make the subpopulation represent a population of about 260,000, I thought it would be okay to use it. However, I now want to make sure.
    Thank you for your help in advance.

    Raphael from Edmonton, AB

  • #2
    Hello Raphael,

    Welcome to the Stata Forum!

    According to Heeringa, West and Berglund (Applied Survey Data Analysis, CRC Press, chapters 4 and 7), yes, it matters.

    The authors recommend the unconditional subclass analysis because it preserves the full survey design and provides larger standard errors. It may be considered less biased, so to speak, than the conditional analysis.

    Under the conditional analysis, though, subpopulations may entail fewer degrees of freedom, and that can increase the number of significant p-values (biased results). So far, so good.

    Overall, the main pitfalls to note are a small sample size combined with high design effects, and an uneven distribution of cases across strata and clusters.
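
    To see why a weighted total of about 260,000 does not make 214 cases behave like a large sample, here is a minimal Python sketch of Kish's approximate effective sample size, (sum of weights)^2 / (sum of squared weights). It covers only the loss of precision from unequal weights, not clustering or stratification, and the function names and the weight values are hypothetical, chosen purely for illustration.

```python
def kish_effective_n(weights):
    """Kish's approximate effective sample size: (sum w)^2 / sum(w^2)."""
    total = sum(weights)
    total_sq = sum(w * w for w in weights)
    return total * total / total_sq


def kish_weight_deff(weights):
    """Approximate design effect due to unequal weighting: n / n_eff."""
    return len(weights) / kish_effective_n(weights)


# Hypothetical weights: 214 respondents, each standing for 1,215 people,
# so the weighted total is about 260,000.
equal_weights = [1215.0] * 214

# Hypothetical unequal weights for the same 214 respondents.
unequal_weights = [500.0] * 107 + [1930.0] * 107

n_eff_equal = kish_effective_n(equal_weights)
n_eff_unequal = kish_effective_n(unequal_weights)
deff_unequal = kish_weight_deff(unequal_weights)
```

    With equal weights the effective sample size stays at 214, and with unequal weights it falls below 214. It never approaches the weighted population total, because weights rescale the estimates without adding information.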

    Still according to the authors, it is important to know how the subpopulation is distributed across PSUs and strata. They classify the distribution into three patterns: strong concentration in some PSUs and strata; uneven and sparse distribution; even and sparse distribution.
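
    A quick way to inspect that distribution is to count subpopulation cases per stratum and per PSU. The sketch below assumes each record is a hypothetical (stratum, PSU, in_subpop) tuple. The toy data and names are made up for illustration, and it only tabulates counts; it does not apply the authors' classification.

```python
from collections import Counter


def subpop_distribution(records):
    """Count subpopulation cases per (stratum, PSU) cell and per stratum."""
    per_cell = Counter()
    per_stratum = Counter()
    for stratum, psu, in_subpop in records:
        if in_subpop:
            per_cell[(stratum, psu)] += 1
            per_stratum[stratum] += 1
    return per_cell, per_stratum


# Hypothetical records: (stratum, PSU, in_subpop)
records = [
    (1, 101, True), (1, 101, True), (1, 102, False),
    (2, 201, True), (2, 202, False), (2, 202, False),
]

per_cell, per_stratum = subpop_distribution(records)

# PSUs that appear in the full sample but contain no subpopulation cases.
empty_psus = {(s, p) for s, p, _ in records} - set(per_cell)
```

    Many empty PSUs, or counts piled up in a handful of cells, would point toward the concentrated or uneven patterns rather than the even and sparse one.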

    Again according to the authors, "analysts often push the limits" in the first two classes.

    In short, the best scenario (you may read it as the "least biased" alternative) occurs when the distribution is even and sparse.

    Hopefully that helped!

    Best,

    Marcos