  • Logistic regression with complex survey design and bootstrap weights

    Hi everyone, total Stata noob here. I've just started learning the package to make use of some Statistics Canada survey data from the GSS, which comes with bootstrap weights for properly estimating the variability of the estimates (I think!).

    I have the survey setup syntax:

    Code:
    . svyset [pweight=wght_per], bsrweight(wtbs_001- wtbs_500) bsn(25) vce(bootstrap) dof(500) mse
    with 500 bootstrap weights per observation.

    Then, running:

    Code:
    . svy: logistic binge_yesno ib(0).sex_01 ib(0).vismin ib(0).imprel
    I get odds ratios that agree with what I got before in SPSS (which can't handle bootstrap weights), but the bootstrap weights make the standard errors HUGE and nothing is even close to significant in the subpopulation I am using. That is totally fine, as long as I haven't done something horribly wrong, which I think I may have.

    Major question: is there something I must do to adjust these bootstrap weights for a subpopulation? The overall survey has ~33,000 observations, with my population of interest being ~1,250.

    When using the survey design and running logistic regression, I don't seem to get any pseudo R-squared values. Is there some way to get them, or is this prohibited by design? Similarly, I am not sure whether I can then run goodness-of-fit tests that give any useful results.

    Another important note: in my subpopulation, the survey is estimating values for 1.2 million people.

    Thanks so much for reading this. I'm very clueless and would appreciate any advice on anything you pick up.

  • #2
    Some insight into one element of your question can be found here:

    https://www.statalist.org/forums/for...n-svy-logistic

    Question:
    Can I get a pseudo r-squared in SVY logistic?

    In short, the answer is no. However, there are suggestions in that thread that other (non-SVY) models may give you something close to what you want.
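
    On the goodness-of-fit part of the question, one possible route is the estat gof postestimation command. This is a sketch, not something confirmed in the linked thread. As far as I know, it can be run after svy: logistic and reports a design-adjusted F test of fit, not a pseudo R-squared. The variable names are the ones from the original post. Whether the test behaves sensibly with bootstrap replicate weights and a subpopulation of ~1,250 is an assumption to check against the [SVY] estat documentation.

    ```stata
    * Fit the survey-weighted logistic model from the original post
    svy: logistic binge_yesno ib(0).sex_01 ib(0).vismin ib(0).imprel

    * Design-based goodness-of-fit test (F-adjusted mean residual test)
    * Assumption: available for this svyset design in your Stata version
    estat gof
    ```

    A small p-value from this test suggests lack of fit. It does not give a "variance explained" style summary.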



    • #3

      Use the subpop() option to get correct standard errors. For example, if you have a 0-1 indicator Z for your subpopulation, use one of:
      Code:
      svy, subpop(if Z): logistic binge_yesno ib(0).sex_01 ib(0).vismin ib(0).imprel
      svy, subpop(if Z==1): logistic binge_yesno ib(0).sex_01 ib(0).vismin ib(0).imprel
      Unfortunately, this (correct) analysis will increase standard errors even further. For more information see: https://www.stata.com/manuals/svysub...estimation.pdf
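
      A minimal sketch of how the indicator might be built and used, assuming the subpopulation is defined by a single variable. Here agegrp and the value 1 are hypothetical placeholders, not names from the GSS file, so substitute whatever actually defines your ~1,250 respondents. The point is that the full ~33,000-observation file stays in memory and the restriction goes inside subpop(), instead of dropping observations or using an ordinary if qualifier.

      ```stata
      * Hypothetical 0/1 indicator for the subpopulation of interest.
      * agegrp == 1 is a placeholder condition; replace with your own.
      generate byte Z = (agegrp == 1)

      * Check the unweighted count in the subpopulation
      tabulate Z

      * Keep all observations; restrict the analysis via subpop()
      svy, subpop(Z): logistic binge_yesno ib(0).sex_01 ib(0).vismin ib(0).imprel
      ```

      Note that generate as written assigns Z = 0 when agegrp is missing, which may or may not be what you want for your definition.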
      Steve Samuels
      Statistical Consulting

      Stata 14.2
