  • Missing rho test in svy, subpop(): heckprobit

    Dear All,

    I am running a probit regression with sample selection, in which the parameter rho indicates whether sample selection is present. Because my data have a complex survey design, I am using the svy prefix with the subpop() option.

    However, when I run the model, I do not get the Wald test results for rho. I understand that lrtest can't be performed in the presence of weights, but I am not sure why a Wald test result is not available either.

    For instance, when I run the following (hypothetical example):
    Code:
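    * load the example data and declare the survey design
    webuse nhanes2f, clear
    
    svyset psuid [pweight=finalwgt], strata(stratid)
    
    * unweighted fit: reports an LR test of rho = 0
    heckprob heartatk black zinc age age2 weight, select(rural = female orace)
    
    * robust standard errors: reports a Wald test of rho = 0
    heckprob heartatk black zinc age age2 weight, select(rural = female orace) vce(robust)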
    In both instances, I get either the likelihood-ratio test or the Wald test result at the end of the output.

    But when I run:
    Code:
    svy: heckprob heartatk black zinc age age2 weight, select(rural = female orace)
    
    svy, subpop(black): heckprob heartatk zinc age age2 weight, select(rural = female orace)
    In neither case do I get the Wald test results. I also tried testparm _b[rho], but it turns out that rho is stored as a scalar.

    Please kindly assist. Thank you in advance.

  • #2
    See Jeff Pitblado (StataCorp)'s reply #17 from the following thread on statistics based on the fitted log-likelihood in -svy- estimations: https://www.statalist.org/forums/for...-working/page2. His recommendation is that in the absence of stratification, you can use -pweights- instead of -svy- estimation, clustering on the PSU variable. So in your example, ignoring that we have stratification and given that "psuid" is your PSU variable (should have more than 30 levels), something like:

    Code:
    webuse nhanes2f, clear
    set seed 03152022
    replace psuid= runiformint(1, 200)
    svyset psuid [pweight=finalwgt]
    svy: heckprob heartatk black zinc age age2 weight, select(rural = female orace)
    heckprob heartatk black zinc age age2 weight [pweight=finalwgt], select(rural = female orace) vce(cluster psuid) nolog
    Res.:

    Code:
     svy: heckprob heartatk black zinc age age2 weight, select(rural = female orace)
    (running heckprob on estimation sample)
    
    Survey: Probit model with sample selection
    
    Number of strata   =         1                Number of obs     =        9,957
    Number of PSUs     =       200                Population size   =  113,438,880
                                                  Design df         =          199
                                                  F(   5,    195)   =         1.53
                                                  Prob > F          =       0.1820
    
    ------------------------------------------------------------------------------
                 |             Linearized
                 |      Coef.   Std. Err.      t    P>|t|     [95% Conf. Interval]
    -------------+----------------------------------------------------------------
    heartatk     |
           black |  -.0627992    .049964    -1.26   0.210    -.1613261    .0357276
            zinc |   .0003236   .0007469     0.43   0.665    -.0011492    .0017964
             age |   .0298621   .0173487     1.72   0.087    -.0043488     .064073
            age2 |   -.000204   .0001283    -1.59   0.113    -.0004571    .0000491
          weight |   .0003338   .0008906     0.37   0.708    -.0014223      .00209
           _cons |  -.5114438   .6924987    -0.74   0.461    -1.877021    .8541336
    -------------+----------------------------------------------------------------
    rural        |
          female |   -.097242   .0297931    -3.26   0.001    -.1559927   -.0384912
           orace |  -.5042324   .1298253    -3.88   0.000    -.7602422   -.2482226
           _cons |   -.475335   .0207137   -22.95   0.000    -.5161816   -.4344884
    -------------+----------------------------------------------------------------
         /athrho |  -3.105201   .7721744    -4.02   0.000    -4.627895   -1.582506
    -------------+----------------------------------------------------------------
             rho |  -.9959912   .0061786                     -.9998089   -.9189924
    ------------------------------------------------------------------------------
    
    . 
    . heckprob heartatk black zinc age age2 weight [pweight=finalwgt], select(rural = female orace) vce(cluster psuid) nolog
    
    Probit model with sample selection              Number of obs     =      9,957
                                                          Selected    =      3,417
                                                          Nonselected =      6,540
    
                                                    Wald chi2(5)      =       7.81
    Log pseudolikelihood = -7.28e+07                Prob > chi2       =     0.1671
    
                                    (Std. Err. adjusted for 200 clusters in psuid)
    ------------------------------------------------------------------------------
                 |               Robust
                 |      Coef.   Std. Err.      z    P>|z|     [95% Conf. Interval]
    -------------+----------------------------------------------------------------
    heartatk     |
           black |  -.0627992    .049964    -1.26   0.209    -.1607269    .0351284
            zinc |   .0003236   .0007469     0.43   0.665    -.0011403    .0017875
             age |   .0298621   .0173487     1.72   0.085    -.0041407     .063865
            age2 |   -.000204   .0001283    -1.59   0.112    -.0004556    .0000475
          weight |   .0003338   .0008906     0.37   0.708    -.0014116    .0020793
           _cons |  -.5114438   .6924987    -0.74   0.460    -1.868716    .8458288
    -------------+----------------------------------------------------------------
    rural        |
          female |   -.097242   .0297931    -3.26   0.001    -.1556354   -.0388485
           orace |  -.5042324   .1298253    -3.88   0.000    -.7586853   -.2497795
           _cons |   -.475335   .0207137   -22.95   0.000    -.5159331   -.4347368
    -------------+----------------------------------------------------------------
         /athrho |  -3.105201   .7721744    -4.02   0.000    -4.618635   -1.591766
    -------------+----------------------------------------------------------------
             rho |  -.9959912   .0061786                     -.9998053   -.9204197
    ------------------------------------------------------------------------------
    Wald test of indep. eqns. (rho = 0): chi2(1) =    16.17   Prob > chi2 = 0.0001
    
    .
    Last edited by Andrew Musau; 15 Mar 2022, 05:24.



    • #3
      Thank you so much. So in the presence of stratification, I will just have to accept the output as it is? Is there any way to test rho manually, given that I have the coefficient, standard error, and confidence interval?



      • #4
        My understanding is that statistics that depend on the log-likelihood are invalid with -svy- estimation.



        • #5
          Thanks a lot for the help!



          • #6
            I found the following Stata FAQ that gives more details: https://www.stata.com/support/faqs/s...od-ratio-test/. Likelihood-ratio tests are invalid both with p-weighted data and with -svy- estimation. Wald tests are fine with both. With -svy- estimation, you get an adjusted Wald test, where the adjustment is needed if the total number of clusters is small \((\lessapprox100)\). So just test whether the coefficient on /athrho is equal to zero. Note that /athrho is just a transformation of rho.
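
            To spell out the transformation: the ancillary parameter is the inverse hyperbolic tangent of rho, so the two hypotheses rho = 0 and athrho = 0 are equivalent. The numeric check below uses the estimates reported in this thread for the 95-cluster run; the standard error of rho is the usual delta-method approximation, which is assumed here to be what the output table reports.

            \[
            \text{athrho} = \operatorname{atanh}(\rho) = \tfrac{1}{2}\ln\frac{1+\rho}{1-\rho},
            \qquad
            \rho = \tanh(\text{athrho}),
            \qquad
            \rho = 0 \iff \text{athrho} = 0
            \]

            \[
            \tanh(-3.105201) \approx -0.99599,
            \qquad
            \widehat{\mathrm{se}}(\rho) \approx (1-\rho^{2})\,\widehat{\mathrm{se}}(\text{athrho})
            = (1 - 0.9959912^{2})(0.7593972) \approx 0.00608
            \]
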

            Code:
            webuse nhanes2f, clear
            set seed 03152022
            replace psuid= runiformint(1, 95)
            svyset psuid [pweight=finalwgt]
            heckprob heartatk black zinc age age2 weight [pweight=finalwgt], select(rural = female orace) vce(cluster psuid) nolog
            test  _b[/athrho]=0
            svy: heckprob heartatk black zinc age age2 weight, select(rural = female orace)
            test  _b[/athrho]=0
            Res.:

            Code:
            .
            . heckprob heartatk black zinc age age2 weight [pweight=finalwgt], select(rural = female orace) vce(cluster psuid) nolog
            
            Probit model with sample selection              Number of obs     =      9,957
                                                                  Selected    =      3,417
                                                                  Nonselected =      6,540
            
                                                            Wald chi2(5)      =       6.64
            Log pseudolikelihood = -7.28e+07                Prob > chi2       =     0.2491
            
                                             (Std. Err. adjusted for 95 clusters in psuid)
            ------------------------------------------------------------------------------
                         |               Robust
                         |      Coef.   Std. Err.      z    P>|z|     [95% Conf. Interval]
            -------------+----------------------------------------------------------------
            heartatk     |
                   black |  -.0627992   .0431665    -1.45   0.146    -.1474041    .0218056
                    zinc |   .0003236   .0007356     0.44   0.660    -.0011182    .0017654
                     age |   .0298621   .0168563     1.77   0.076    -.0031757    .0628999
                    age2 |   -.000204   .0001258    -1.62   0.105    -.0004506    .0000426
                  weight |   .0003338   .0008814     0.38   0.705    -.0013936    .0020613
                   _cons |  -.5114438   .6634953    -0.77   0.441    -1.811871    .7889831
            -------------+----------------------------------------------------------------
            rural        |
                  female |   -.097242   .0311652    -3.12   0.002    -.1583246   -.0361593
                   orace |  -.5042324   .1375608    -3.67   0.000    -.7738466   -.2346181
                   _cons |   -.475335    .021763   -21.84   0.000    -.5179897   -.4326803
            -------------+----------------------------------------------------------------
                 /athrho |  -3.105201   .7593972    -4.09   0.000    -4.593592   -1.616809
            -------------+----------------------------------------------------------------
                     rho |  -.9959912   .0060764                     -.9997953     -.92416
            ------------------------------------------------------------------------------
            Wald test of indep. eqns. (rho = 0): chi2(1) =    16.72   Prob > chi2 = 0.0000
            
            . 
            . test  _b[/athrho]=0
            
             ( 1)  [/]athrho = 0
            
                       chi2(  1) =   16.72
                     Prob > chi2 =    0.0000
            
            . svy: heckprob heartatk black zinc age age2 weight, select(rural = female orace)
            (running heckprob on estimation sample)
            
            Survey: Probit model with sample selection
            
            Number of strata   =         1                Number of obs     =        9,957
            Number of PSUs     =        95                Population size   =  113,438,880
                                                          Design df         =           94
                                                          F(   5,     90)   =         1.27
                                                          Prob > F          =       0.2835
            
            ------------------------------------------------------------------------------
                         |             Linearized
                         |      Coef.   Std. Err.      t    P>|t|     [95% Conf. Interval]
            -------------+----------------------------------------------------------------
            heartatk     |
                   black |  -.0627992   .0431665    -1.45   0.149    -.1485074    .0229089
                    zinc |   .0003236   .0007356     0.44   0.661     -.001137    .0017842
                     age |   .0298621   .0168563     1.77   0.080    -.0036065    .0633308
                    age2 |   -.000204   .0001258    -1.62   0.108    -.0004538    .0000458
                  weight |   .0003338   .0008814     0.38   0.706    -.0014162    .0020838
                   _cons |  -.5114438   .6634953    -0.77   0.443    -1.828829    .8059417
            -------------+----------------------------------------------------------------
            rural        |
                  female |   -.097242   .0311652    -3.12   0.002    -.1591212   -.0353628
                   orace |  -.5042324   .1375608    -3.67   0.000    -.7773626   -.2311022
                   _cons |   -.475335    .021763   -21.84   0.000     -.518546    -.432124
            -------------+----------------------------------------------------------------
                 /athrho |  -3.105201   .7593972    -4.09   0.000    -4.613001     -1.5974
            -------------+----------------------------------------------------------------
                     rho |  -.9959912   .0060764                     -.9998031   -.9212762
            ------------------------------------------------------------------------------
            
            
            
            .
            . test  _b[/athrho]=0
            
            Adjusted Wald test
            
             ( 1)  [/]athrho = 0
            
                   F(  1,    94) =   16.72
                        Prob > F =    0.0001
            
            .
            Last edited by Andrew Musau; 20 Mar 2022, 07:54.
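
            A minimal sketch of the same test under the stratified design from post #1, followed by the test computed by hand from the coefficient and its standard error. It assumes that the model converges under this svyset (not verified here) and that Stata 15 or later is used, so that _b[/athrho] and _se[/athrho] are valid references. The scalar names are made up for illustration. For a single restriction, the adjusted Wald F is simply the squared t statistic, which agrees with the output above: (-3.105201/.7593972)^2 is about 16.72.

            ```stata
            * Sketch only: stratified design as declared in post #1
            webuse nhanes2f, clear
            svyset psuid [pweight=finalwgt], strata(stratid)
            svy: heckprob heartatk black zinc age age2 weight, select(rural = female orace)

            * Built-in adjusted Wald test of H0: athrho = 0 (equivalently rho = 0)
            test _b[/athrho] = 0

            * The same test by hand from the coefficient and its standard error
            scalar b_ath  = _b[/athrho]
            scalar se_ath = _se[/athrho]
            scalar Fstat  = (b_ath/se_ath)^2          // one restriction: F = t^2
            scalar pval   = Ftail(1, e(df_r), Fstat)  // e(df_r) holds the design df after svy
            display "F(1, " e(df_r) ") = " %6.2f Fstat "   Prob > F = " %6.4f pval
            ```

            The same test line can follow the subpop() variant of the command, since it only needs the stored coefficient vector and variance matrix.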



            • #7
              This was incredibly helpful. Thank you so very much!

