  • NIU (Not in Universe) Cases and Svyset

    Hello all,

    I am working on an NHIS dataset that has a lot of missing cases, leading partly to a problem I am trying to resolve (details here: http://www.statalist.org/forums/foru...lation-members).

    I am posting a separate question because of its generality.

    The dataset has many respondents who were deemed ineligible for certain questions. For example, I am interested in HIV testing, a question that was asked only of sample adults aged 18+ years. Respondents under 18 years are tagged with an NIU code for the HIV and related variables.

    Before I svyset the data, should I delete these NIU responses, since my key dependent variable is restricted to those aged 18+ years? Or is it better to use the svy subpop() option to restrict the analysis to eligible respondents?

    Thanks - Yy



  • #2
    Hello Yawo,

    To help answer your question, I ran two estimations on an example dataset: first using subpop() with all observations kept, then again after deleting the observations outside the subpopulation. Compare the results:

    Code:
    . use http://www.stata-press.com/data/r14/nhanes2.dta
    
    . svyset psu [pweight=finalwgt], strata(strata)
    
          pweight: finalwgt
              VCE: linearized
      Single unit: missing
         Strata 1: strata
             SU 1: psu
            FPC 1: <zero>
    
    . gen highbp2 = highbp
    
    . replace highbp2 = . if race ==1
    (9,065 real changes made, 9,065 to missing)
    
    . svy, subpop(if race ==2): logit highbp2 sex age
    (running logit on estimation sample)
    
    Survey: Logistic regression
    
    Number of strata   =        30                Number of obs     =       10,013
    Number of PSUs     =        60                Population size   =  113,415,086
                                                  Subpop. no. obs   =        1,086
                                                  Subpop. size      =   11,189,236
                                                  Design df         =           30
                                                  F(   2,     29)   =        83.52
                                                  Prob > F          =       0.0000
    
    ------------------------------------------------------------------------------
                 |             Linearized
         highbp2 |      Coef.   Std. Err.      t    P>|t|     [95% Conf. Interval]
    -------------+----------------------------------------------------------------
             sex |  -.1886847   .1284104    -1.47   0.152    -.4509337    .0735642
             age |   .0584824   .0044914    13.02   0.000     .0493098    .0676551
           _cons |  -2.347611    .312217    -7.52   0.000    -2.985243   -1.709979
    ------------------------------------------------------------------------------
    Note: 1 stratum omitted because it contains no subpopulation members.
    
    . drop if race ==1
    (9,065 observations deleted)
    
    . svy, subpop(if race ==2): logit highbp2 sex age
    (running logit on estimation sample)
    
    Survey: Logistic regression
    
    Number of strata   =        30                 Number of obs     =       1,281
    Number of PSUs     =        58                 Population size   =  14,100,211
                                                   Subpop. no. obs   =       1,086
                                                   Subpop. size      =  11,189,236
                                                   Design df         =          28
                                                   F(   0,     28)   =           .
                                                   Prob > F          =           .
    
    ------------------------------------------------------------------------------
                 |             Linearized
         highbp2 |      Coef.   Std. Err.      t    P>|t|     [95% Conf. Interval]
    -------------+----------------------------------------------------------------
             sex |  -.1886847          .        .       .            .           .
             age |   .0584824          .        .       .            .           .
           _cons |  -2.347611          .        .       .            .           .
    ------------------------------------------------------------------------------
    Note: 1 stratum omitted because it contains no subpopulation members.
    Note: Missing standard errors because of stratum with single sampling unit.
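
    The missing standard errors in the second run come from strata that are left with a single PSU once the race == 1 observations are deleted. A minimal sketch of how one might check this on the same example data follows; it assumes the dataset can be loaded with webuse nhanes2 and uses preserve/restore only so the deletion is temporary:

    ```stata
    * Load the example data and declare the design as in the log above
    webuse nhanes2, clear
    svyset psu [pweight=finalwgt], strata(strata)

    * Describe the full design: strata and number of PSUs per stratum
    svydescribe

    * Temporarily delete the out-of-subpopulation cases and look again
    preserve
    drop if race == 1
    svydescribe, single    // list only strata with a single sampling unit
    restore
    ```

    If any strata are listed after the drop, that is the design information the subpop() approach keeps and the deletion approach throws away.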

    Hopefully that helps!

    Best,

    Marcos
    Last edited by Marcos Almeida; 29 Dec 2016, 07:22.


    • #3
      Thanks, Marcos. Your illustration makes sense, so it is better to restrict the analysis with the subpop() option than to delete the ineligible cases.
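
      Applied to the NHIS question, the same pattern might look like the sketch below. All variable names (psu_v, strata_v, sampweight, hivtest, age, sex) are hypothetical placeholders, not actual NHIS names, and the outcome is assumed to be coded 0/1 among eligible respondents:

      ```stata
      * Hypothetical design variables; substitute the ones in your NHIS file
      svyset psu_v [pweight=sampweight], strata(strata_v)

      * Flag eligible respondents (sample adults aged 18+).
      * NIU cases stay in the data; they are simply outside the subpopulation.
      generate byte eligible = (age >= 18) & !missing(age)

      * Estimate within the eligible subpopulation only
      svy, subpop(eligible): logit hivtest age i.sex
      ```

      The point of the indicator variable is that no observations are deleted, so the full set of strata and PSUs remains available for variance estimation.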

      Now, as a final question, do you have any guidance on merging PSUs with single observations into adjoining ones?

      thanks - Yy



      • #4
        Hello Yawo,

        I suggest you look at the svyset options, in particular singleunit(). The default, as you saw in the output, is singleunit(missing).
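
        As a rough sketch of what that looks like, the code below recreates the single-PSU situation from post #2 and declares the design with a non-default singleunit() setting. The choice of scaled here is only for illustration; certainty and centered are the other documented alternatives to missing, and which one is appropriate depends on the design:

        ```stata
        * Recreate the problem case: deleting race == 1 leaves single-PSU strata
        webuse nhanes2, clear
        drop if race == 1

        * Declare the design with an explicit rule for single-PSU strata
        svyset psu [pweight=finalwgt], strata(strata) singleunit(scaled)

        * Same model as before; compare the standard errors with the earlier runs
        svy, subpop(if race == 2): logit highbp sex age
        ```

        This only changes how strata with one sampling unit enter the variance calculation. It does not restore the PSUs that were deleted, so keeping all observations and using subpop() remains the cleaner route.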

        Best,

        Marcos