Announcement

Collapse
No announcement yet.
X
  • Filter
  • Time
  • Show
Clear All
new posts

  • #16
    Thanks Mike, I will look at this closely over the weekend. As you note, there should not be very large differences in the estimated SEs.

    Comment


    • #17
      Mike Lacy, it appears that the default and bootstrap SEs will differ quite a bit. To obtain similar results, you will need to specify that the regression reports bootstrap SEs, so that the marginal effects are computed with Model VCE: Bootstrap. The simplest way to illustrate is to use a linear model (the equivalent here is heckman) with no higher-order terms so that the marginal effect is equal to the estimated coefficient. With your bootstrapping approach, you get a similar SE as specifying -vce(bootstrap)- in the regression and this differs from the default SE.


      Code:
      use http://www.stata-press.com/data/r15/school, clear
      set seed 46534
      expand 5
      replace years = years + ceil(runiform() * 5) if (years < 6)
      set maxiter 100
      
      cap prog drop Mdydxyears
      prog Mdydxyears, rclass
      syntax, atspec(string) atvalrt(real) [wantse]
      local nose = cond("`wantse'" != "", "", "nose")
      heckman private c.years_sqrt, select(vote=c.years_sqrt loginc logptax) nolog
      if (_rc == 0 ) {  
          margins, dydx(years_sqrt) atmeans at(years_sqrt = `atvalrt' )
          return scalar dydx = el(r(b),1,1)
      }
      else {
         return scalar dydx = .
      }
      end
      local atval = 9
      local atvalrt = sqrt(`atval')
      cap gen years_sqrt = sqrt(years)
      
      
      *DEFAULT VCE+ MARGINS
      heckman private c.years_sqrt, select(vote=c.years_sqrt loginc logptax) nolog
      margins, dydx(years_sqrt) atmeans at(years_sqrt = `atvalrt')
      
      *BOOTSTRAPPING PROGRAM
      bootstrap dydx = r(dydx), reps(250) seed(83956): Mdydxyears, atspec(atmeans at(years_sqrt = `atvalrt')) atvalrt(`atvalrt')
      
      *VCE BOOTSTRAP + MARGINS
      heckman private c.years_sqrt, select(vote=c.years_sqrt loginc logptax) nolog vce(bootstrap, reps(250))
      margins, dydx(years_sqrt) atmeans at(years_sqrt = `atvalrt')
      Res.:

      Code:
      . heckman private c.years_sqrt, select(vote=c.years_sqrt loginc logptax) nolog
      
      Heckman selection model                         Number of obs     =        475
      (regression model with sample selection)              Selected    =        295
                                                            Nonselected =        180
      
                                                      Wald chi2(1)      =       1.61
      Log likelihood = -331.5698                      Prob > chi2       =     0.2047
      
      ------------------------------------------------------------------------------
                   |      Coef.   Std. Err.      z    P>|z|     [95% Conf. Interval]
      -------------+----------------------------------------------------------------
      private      |
        years_sqrt |  -.0195032   .0153787    -1.27   0.205    -.0496449    .0106386
             _cons |   .1557381   .0463517     3.36   0.001     .0648904    .2465857
      -------------+----------------------------------------------------------------
      vote         |
        years_sqrt |  -.1679662   .0580665    -2.89   0.004    -.2817745   -.0541579
            loginc |   .9901304   .1960525     5.05   0.000     .6058745    1.374386
           logptax |  -1.273358   .2539647    -5.01   0.000    -1.771119   -.7755959
             _cons |   -.196412   1.835648    -0.11   0.915    -3.794216    3.401392
      -------------+----------------------------------------------------------------
           /athrho |   -.092222   .2044884    -0.45   0.652    -.4930119    .3085678
          /lnsigma |  -1.280428   .0423607   -30.23   0.000    -1.363453   -1.197402
      -------------+----------------------------------------------------------------
               rho |  -.0919615   .2027591                     -.4566037    .2991337
             sigma |   .2779185   .0117728                      .2557761    .3019777
            lambda |  -.0255578    .056616                      -.136523    .0854074
      ------------------------------------------------------------------------------
      LR test of indep. eqns. (rho = 0):   chi2(1) =     0.15   Prob > chi2 = 0.7010
      
      .
      . margins, dydx(years_sqrt) atmeans at(years_sqrt = `atvalrt')
      
      Conditional marginal effects                    Number of obs     =        475
      Model VCE    : OIM
      
      Expression   : Linear prediction, predict()
      dy/dx w.r.t. : years_sqrt
      at           : years_sqrt      =           3
                     loginc          =    9.971017 (mean)
                     logptax         =    6.939496 (mean)
      
      ------------------------------------------------------------------------------
                   |            Delta-method
                   |      dy/dx   Std. Err.      z    P>|z|     [95% Conf. Interval]
      -------------+----------------------------------------------------------------
       years_sqrt |  -.0195032   .0153787    -1.27   0.205    -.0496449    .0106386
      ------------------------------------------------------------------------------
      
      .
      .
      .
      . *BOOTSTRAPPING
      
      .
      . bootstrap dydx = r(dydx), reps(250) seed(83956): Mdydxyears, atspec(atmeans at(years_sqrt = `atvalrt')) atvalrt(`atvalrt')
      (running Mdydxyears on estimation sample)
      
      Bootstrap replications (250)
      ----+--- 1 ---+--- 2 ---+--- 3 ---+--- 4 ---+--- 5
      .x..xx.x..x.........x.x...x.x...x...x.x....x...xx.    50
      ...xx.x...x....x..xx.x........xx.x......xxx.x.....   100
      .x...xx.........x.x.x....x...xxx..x.x...x.xx..x...   150
      ....xx...x...xx....xx.xx.xx.x..xx......x.x.xx.....   200
      .x..x.x...x..x..xxxx.....x...x...x.xx....x.x......   250
      
      Bootstrap results                               Number of obs     =        475
                                                      Replications      =        170
      
            command:  Mdydxyears, atspec(atmeans at(years_sqrt = 3)) atvalrt(3)
               dydx:  r(dydx)
      
      ------------------------------------------------------------------------------
                   |   Observed   Bootstrap                         Normal-based
                   |      Coef.   Std. Err.      z    P>|z|     [95% Conf. Interval]
      -------------+----------------------------------------------------------------
              dydx |  -.0195032   .0063436    -3.07   0.002    -.0319363     -.00707
      ------------------------------------------------------------------------------
      Note: One or more parameters could not be estimated in 80 bootstrap replicates;
            standard-error estimates include only complete replications.
      
      .
      
      .
      . *VCE BOOTSTRAP
      
      . 
      . heckman private c.years_sqrt, select(vote=c.years_sqrt loginc logptax) nolog vce(bootstrap, reps(250))
      (running heckman on estimation sample)
      
      Bootstrap replications (250)
      ----+--- 1 ---+--- 2 ---+--- 3 ---+--- 4 ---+--- 5 
      x.xx.x....x.x......xx.x.x.....x.x......x..........    50
      x.x..xx..xx.x...xx.x.x.x..x....x.x......x...x.xxx.   100
      .xxx..x.....xx...x.....xx...x........x.xx.......x.   150
      ...x.x.........x...x...x..x.x............x..xx..xx   200
      x.....xx.x.x......xx.....x...x......x..x.xx..x...x   250
      
      Heckman selection model                         Number of obs     =        475
      (regression model with sample selection)              Selected    =        295
                                                            Nonselected =        180
      
                                                      Wald chi2(1)      =       9.06
      Log likelihood = -331.5698                      Prob > chi2       =     0.0026
      
      ------------------------------------------------------------------------------
                   |   Observed   Bootstrap                         Normal-based
                   |      Coef.   Std. Err.      z    P>|z|     [95% Conf. Interval]
      -------------+----------------------------------------------------------------
      private      |
        years_sqrt |  -.0195032   .0064781    -3.01   0.003       -.0322   -.0068064
             _cons |   .1557381   .0326727     4.77   0.000     .0917008    .2197754
      -------------+----------------------------------------------------------------
      vote         |
        years_sqrt |  -.1679662    .072871    -2.30   0.021    -.3107908   -.0251417
            loginc |   .9901304   .1806636     5.48   0.000     .6360363    1.344224
           logptax |  -1.273358   .2570003    -4.95   0.000    -1.777069   -.7696463
             _cons |   -.196412   1.930115    -0.10   0.919    -3.979367    3.586543
      -------------+----------------------------------------------------------------
           /athrho |   -.092222   .0654878    -1.41   0.159    -.2205757    .0361317
          /lnsigma |  -1.280428     .08698   -14.72   0.000    -1.450905    -1.10995
      -------------+----------------------------------------------------------------
               rho |  -.0919615    .064934                     -.2170667     .036116
             sigma |   .2779185   .0241733                      .2343581    .3295755
            lambda |  -.0255578   .0186531                     -.0621172    .0110016
      ------------------------------------------------------------------------------
      LR test of indep. eqns. (rho = 0):   chi2(1) =     0.15   Prob > chi2 = 0.7010
      Note: One or more parameters could not be estimated in 74 bootstrap replicates;
            standard-error estimates include only complete replications.
      
      . 
      . margins, dydx(years_sqrt) atmeans at(years_sqrt = `atvalrt')
      
      Conditional marginal effects                    Number of obs     =        475
      Model VCE    : Bootstrap
      
      Expression   : Linear prediction, predict()
      dy/dx w.r.t. : years_sqrt
      at           : years_sqrt      =           3
                     loginc          =    9.971017 (mean)
                     logptax         =    6.939496 (mean)
      
      ------------------------------------------------------------------------------
                   |            Delta-method
                   |      dy/dx   Std. Err.      z    P>|z|     [95% Conf. Interval]
      -------------+----------------------------------------------------------------
        years_sqrt |  -.0195032   .0064781    -3.01   0.003       -.0322   -.0068064
      ------------------------------------------------------------------------------
      
      .
      Last edited by Andrew Musau; 21 Mar 2022, 08:25.

      Comment


      • #18
        Hi Andrew, Sorry to be slow in responding. I would not have thought of the idea that getting bootstrap SEs for the regression coefficients themselves would be so helpful--good idea and interesting. So, there are at least two sources of bias in estimating the SE of the derivatives, first the bias that comes from estimating the SE coefficients of the regression model itself, and second, the bias from the Delta method approximations of the SE of the the probability derivatives based on the SEs of those coefficients. What you suggest makes me think the first part the critical issue. I'll see if I can check that with some bootstrap experiments.

        Now, of course, there's the question of how well the sampling distribution and hence the CIs behave, but that's another problem

        Comment


        • #19
          I did a couple of simple bootstrap experiments to see how well using bootstrap standard errors for the underlying regression model would improve the estimates of the standard error given by -margins- for derivative estimates of nonlinear functions based on the coefficients from the regression. The short story is that it didn't work as consistently as one might hope. I see three ways to get such estimates of SE(dydx):

          1) Use the purely asymptotic estimates given by -margins- by default, which I'd call "pure asymptotic."
          2) Run the regression command with the vce(bootstrap) command to get the SE of the _b matrix, and let -margins- calculate its standard errors of dydx using those SE values. I'd call this "partial bootstrap," since the bootstrapping only affects the SE(_b) and not any other approximations involved in getting the SE(dydx).
          3. Run the regression command and the margins command within a wrapper program fed to -bootstrap-, which I'd call the "pure bootstrap" method.

          In the current case, the regression model of interest is -heckprobit-, and the derivative of interest is that for the probability of the outcome with respect to a variable X, where that X enters into the model as both a linear term and a sqrt() term. Andrew Musau suggested using what I call the "partial bootstrap,"
          which should improve the SE(dydx) by improving the SE(_b) on which it is based. My experiments -- two simple cases considered below-- show one situation
          in which this partial bootstrap approach works great, and another case in which it does almost no good at all. While these are not comprehensive experiments in the least, I'd say that until proven otherwise, I wouldn't trust the "partial bootstrap" approach to getting SE(dydx). I show the code for my experiments and a brief results summary below:

          Code:
          cap prog drop Mdydxyears
          prog Mdydxyears, rclass
          // wrapper for Musau approach to getting SE(dydx(years)) with years and sqrt(years)
          syntax, atspec(string) atvalrt(real) [wantse] [vce(string)]  // vce option allows vce(bootstrap)
          local nose = cond("`wantse'" != "", "", "nose") 
          heckprobit private c.years_sqrt##c.years_sqrt, ///
             select(vote=c.years_sqrt##c.years_sqrt loginc logptax) nolog vce(`vce')
          if (_rc == 0 ) { worked ok   
              margins, dydx(years_sqrt) atmeans at(years_sqrt = `atvalrt' ) expression(predict()*1/(2*`atvalrt'))
              return scalar dydx = el(r(b),1,1)
          }
          else {
             return scalar dydx = .
          }
          end
          // 
          //
          use http://www.stata-press.com/data/r15/school, clear
          set seed 46534
          // Expand the data set, tweak the years variable to reduce rho and make
          // the -heckprobit- estimation less brittle.  Keep iterations reasonable.
          expand 5
          replace years = years + ceil(runiform() * 5) if (years < 6)
          set maxiter 100
          //
          local atval =  9 //   I also tried 16, see results below
          local atvalrt = sqrt(`atval')
          gen years_sqrt = sqrt(years)
          //
          di "Asymptotic SE for regression and margins"
          Mdydxyears, atspec(atmeans at(years = `atval'))  atvalrt(`atvalrt') wantse
          //
          di "Bootstrap SE for regression, asymptotic for margins"
          Mdydxyears, atspec(atmeans at(years = `atval'))  atvalrt(`atvalrt') ///
             wantse  vce("bootstrap, rep(1000)")  
          //    
          di "Bootstrap around heckprobit and margins;  -margins- SE only depends on the coeff. estimates"
          bootstrap dydx = r(dydx), reps(1000) seed(83956): ///
             Mdydxyears, atspec(atmeans at(years = `atval'))  atvalrt(`atvalrt')
          Summary of results:

          Derivative of sqrt term (SE) at years = 9. vce(bootstrap) did the trick
          pure AS -0.0353 (.0246) bad
          boot SE -0.0353 (.0167) good
          pure boot -0.0353 (.0175)

          Derivative of sqrt term at years = 16. vce(bootstrap) didn't help.
          pure AS -.00141 (.00656) bad
          boot SE -.00141 (.00559) bad
          pure boot -.0041 (.0133)

          I don't think we can solve this whole thing here, but this long thread shows some useful ways to get the dydx() for nonlinear functions of predictors, but does not reveal a way to get a good SE estimate for this dydx without resorting to the "pure" bootstrap. Thanks again to Andrew for joining in here.






          Comment

          Working...
          X