  • svy: subpop question

    I have a question regarding subpop.

    I am working with a large national healthcare dataset (NHAMCS).

    I created a new variable for a particular diagnosis using the following command:
    gen COPDD=0
    replace COPDD=1 if (DIAG1R==149121 | DIAG1R==149122 | DIAG1R==149190 | DIAG1R==149280 | DIAG1R==149600) & DIAG2R!=148100 & DIAG2R!=148210 & DIAG2R!=148242 & DIAG2R!=148290 & DIAG2R!=148500 & DIAG2R!=148600 & DIAG3R!=148100 & DIAG3R!=148210 & DIAG3R!=148242 & DIAG3R!=148290 & DIAG3R!=148500 & DIAG3R!=148600


    I now want to confirm successful creation of the new variable by tabulating DIAG1R.

    If I use "svy: tabulate DIAG1R if COPDD==1", the output looks correct in terms of my having successfully isolated the "DIAG1R" variable values I was concerned with:
    svy: tabulate DIAG1R if COPDD==1
    (running tabulate on estimation sample)

    Number of strata =  32          Number of obs   =       693
    Number of PSUs   = 267          Population size = 2,905,236
                                    Design df       =       235

    ----------------------
    Diagnosis |
    #1 -      |
    numeric   |
    recode    | proportion
    ----------+-----------
     CB_ACUTE |       .516
      CB_W_AB |        .04
       CB_NOS |      .0124
     EMPHYSEM |      .0247
         COPD |      .4069
              |
        Total |          1
    ----------------------
    Key: proportion = cell proportion


    BUT, if I use "svy, subpop(if COPDD==1): tabulate DIAG1R", Stata tells me there are "too many values".

    What am I doing wrong?

    Thanks for any help

  • #2
    First, your original code is much lengthier and more difficult to follow than it needs to be. You can accomplish the same thing with:

    Code:
    gen COPDD = inlist(DIAG1R, 149121, 149122, 149190, 149280, 149600) ///
        & !inlist(DIAG2R, 148100, 148210, 148242, 148290, 148500, 148600) ///
        & !inlist(DIAG3R, 148100, 148210, 148242, 148290, 148500, 148600)
    making the code much easier to understand, and saving yourself a lot of keystrokes as well.

    Next, I don't understand how either of the svy: tabulate commands would help you verify that you have correctly generated the variable COPDD.

    Be that as it may, the difference between -svy: tabulate if...- and -svy, subpop(if...): tabulate- is that the former completely ignores all observations in which the -if- condition is not met. By contrast, the latter has to take those observations into account in order to correctly calculate the estimates. Consequently, -svy, subpop(if...)- uses a larger (often much larger) estimation sample that would encompass more values of DIAG1R. For details and a deeper explanation and references, see the subpopulation estimation chapter of the [SVY] manual that is part of your Stata installation.
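
    A rough sketch of the point above, assuming the data are already svyset as in the original post. The names first_all, first_sub, and DIAG1R_sub are hypothetical, introduced here only for illustration. The first part counts how many distinct DIAG1R values each approach has to tabulate. The second part is one possible workaround, not necessarily the only one: it collapses every diagnosis outside the subpopulation into a single placeholder code (0) without dropping any observations, so -svy, subpop()- should no longer see an unmanageable number of categories.

    ```stata
    * Count distinct DIAG1R values in the full data and within COPDD==1.
    * -egen, tag()- flags exactly one observation per distinct value.
    egen byte first_all = tag(DIAG1R)
    egen byte first_sub = tag(DIAG1R) if COPDD == 1
    count if first_all == 1     // distinct values in the full data
    count if first_sub == 1     // distinct values within COPDD==1

    * Possible workaround (an assumption, not from the thread):
    * keep DIAG1R inside the subpopulation, recode everything else to 0.
    * No observations are set to missing, so the estimation sample is unchanged.
    generate long DIAG1R_sub = cond(COPDD == 1, DIAG1R, 0)
    svy, subpop(COPDD): tabulate DIAG1R_sub
    ```

    Recoding to 0 rather than to missing matters here: observations with a missing value would be dropped from the estimation sample, which is exactly what -subpop()- is meant to avoid.
    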



    • #3
      Thank you for your prompt response, and advice regarding coding. I understand the concept of "subpop" and that it includes all observations for the purpose of estimation procedures.

      With regard to verifying the correct variable, though, since I am replacing "COPDD" with "1" based on specific criteria for "DIAG1R", I would expect that if I tabulate "DIAG1R" for the subpop of "COPDD==1", it would return only the values for "DIAG1R" which I originally defined (while using the observations from the entire dataset to calculate standard errors).

      Am I incorrect in my understanding?



      • #4
        If you are just trying to check your coding, is it necessary for the standard errors to be right (or to even use the svy: prefix at all)?
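
        A minimal sketch of such an unweighted check, using only the variable names and diagnosis codes from the first post. No survey settings are needed for this: -assert- stops with an error if any observation with COPDD==1 violates a condition.

        ```stata
        * Unweighted look at which primary diagnoses ended up in COPDD==1
        tabulate DIAG1R if COPDD == 1

        * Every flagged observation must have one of the five intended codes
        assert inlist(DIAG1R, 149121, 149122, 149190, 149280, 149600) if COPDD == 1

        * None of the excluded codes may appear as a second or third diagnosis
        assert !inlist(DIAG2R, 148100, 148210, 148242, 148290, 148500, 148600) if COPDD == 1
        assert !inlist(DIAG3R, 148100, 148210, 148242, 148290, 148500, 148600) if COPDD == 1
        ```
        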
        -------------------------------------------
        Richard Williams, Notre Dame Dept of Sociology
        StataNow Version: 19.5 MP (2 processor)

        WWW: https://www3.nd.edu/~rwilliam



        • #5
          Not necessary for the standard errors to be right, but I want to make sure that when I use the svy, subpop(if COPDD==1) command in my subsequent analyses, that it is in fact limiting the calculations to the subpopulation of interest. So the fact that it doesn't seem to be returning the expected values for DIAG1R makes me concerned.
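
          One way this could be checked without tabulating DIAG1R at all, sketched under assumptions: the data are already svyset, and AGE is a hypothetical stand-in for whatever variable the later analyses use. After a -svy, subpop()- estimation, Stata stores the number of subpopulation observations it used in e(N_sub), which can be compared with a direct count.

          ```stata
          * AGE is a placeholder; substitute any analysis variable.
          svy, subpop(COPDD): mean AGE

          * Number of subpopulation observations used in the estimate
          display "subpopulation obs used: " e(N_sub)

          * Direct count of COPDD==1 within the estimation sample
          count if COPDD == 1 & e(sample)
          assert r(N) == e(N_sub)
          ```

          If the assertion passes, the estimate was restricted to the COPDD==1 observations, while the remaining observations still contributed to the variance calculation.
          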



          • #6
            Clyde's argument as to why you get the error when using subpop is interesting and would not have occurred to me. It sounds to me like the cases you want are getting selected correctly, and so long as you don't run something like tab that generates an error, maybe you are OK.
            -------------------------------------------
            Richard Williams, Notre Dame Dept of Sociology
            StataNow Version: 19.5 MP (2 processor)

            WWW: https://www3.nd.edu/~rwilliam
