  • Bootstrap failures

    Dear Statalist,

    I'm attempting to bootstrap the standard errors for a specific coefficient in a regression and am seeing a surprising number of failed runs. For instance, in trying to get the "BWTT" coefficient:

    . bootstrap _b[BWTT], reps(50): areg fsizeGini BWTT BW YFE*, absorb(CID)
    (running areg on estimation sample)

    Bootstrap replications (50)
    ----+--- 1 ---+--- 2 ---+--- 3 ---+--- 4 ---+--- 5
    x...xx.x.x.....xxxx...xxx..xxx.....xxx..xxxxxxx..x 50


    On the other hand, running bootstrap by hand produces no failed runs:

    forvalues i = 1/50 {
        preserve
        bsample
        quietly: areg fsizeGini BWTT BW YFE*, absorb(CID)
        disp(_b[BWTT])
        restore
    }

    -.00125099
    -.00105616
    -.00158265
    -.00156752
    -.00127805
    -.00112946
    -.00168347
    -.00132481
    ... etc. [no failed runs]

    For reference, areg fsizeGini BWTT BW YFE*, absorb(CID) vce(bootstrap) runs fine for this example but fails for some other ones I'm considering.

    What am I misunderstanding about bootstrap or the Stata code? Thanks!

    Cory

    Stata version is 11.2

  • #2
    Hi, Cory!
    Here are my first impressions as I play around with this syntax using the auto data set. I've been advised to bootstrap my SEs before, and then advised against it, for a convenience sample. So I have a couple of thoughts on both syntax and experience:

    I've linked Scott Long's syntax for reference and as a great template.

    * Set your seed to keep results consistent:
    You should set the random seed so the results are consistent, if you haven't done this already; "set seed #" will ensure replicable results. This is likely not relevant to your problem. I ran this on the auto dataset, just to see for myself what happens.

    * Reps are too low:
    I'm going to quote Scott Long from his great book, The Workflow of Data Analysis Using Stata: "For real world applications, 1000 reps is needed!" (This is based on one of his syntax files in Chapter 7, using version 8.) I've heard of (and likely used) 500. I never had convergence issues at that count (but I also never used areg).
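
    For instance, a minimal sketch combining both points with your command (the seed value is arbitrary):

    set seed 20150708
    bootstrap _b[BWTT], reps(1000): areg fsizeGini BWTT BW YFE*, absorb(CID)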

    What I'd do if I were to bootstrap:
    * Compare results at different seeds and at more reps (higher is better). My guess is that this will result in fewer errors, but that there is something else going on.
    * Compare results using version control (version #) and see if earlier versions produce different results. I'd be interested myself.
    * Compare using areg; reg; xtreg, re; xtreg, fe; and reg, vce(cluster var). Perhaps this is an issue with the model. I'd prefer the vce() and xtreg, re options (see below). You can also compare xtreg, fe with areg, which should be identical; see the sketch after this list.
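
    A rough sketch of that comparison, assuming CID (taken from your absorb() option) is the grouping variable:

    xtset CID
    areg fsizeGini BWTT BW YFE*, absorb(CID)
    xtreg fsizeGini BWTT BW YFE*, fe                  // coefficients should match areg
    xtreg fsizeGini BWTT BW YFE*, re
    regress fsizeGini BWTT BW YFE*, vce(cluster CID)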

    Consider dropping the bootstrap:
    * I understand bootstrapping is more acceptable in economics, but I've heard that it is actually quite limited (I can elaborate with more time). The basic point is that great models cannot overcome data limitations.

    * Related to this, there is a recent argument by Gary King (see the second link and citation below) that I find convincing: robust SEs, as we learn them in economics, are something that can fix some problems with heteroskedasticity. King and his coauthor, to put it colloquially, are saying that where there's smoke, there's fire. Adding ", robust" helps us with the smoke, but the problem is the fire. In models, this often shows up as poor model fit, which can be ameliorated using old-fashioned regression diagnostics.

    * Finally, I don't have a taste for fixed effects. While I've published using areg (sibling fixed effects), it's destructive to data; also, from a statistical view, it doesn't model the data. "Chainsaw surgery." I've since used multilevel modeling or "random effects" instead, both to "do away" with clustering and to examine clustering as a feature sui generis. Since you have Stata 11, I'd use xtreg with the "re" option. That, combined with old-fashioned regression diagnostics, may well solve your problems.
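
    In Stata 11, the multilevel version would be something like this (a sketch with a random intercept by country only):

    xtmixed fsizeGini BWTT BW YFE* || CID:            // Stata 11's multilevel command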

    Perhaps so others can help as well: it's a bit difficult without knowing more about the data, but here are some more thoughts:
    * These data are cross-national, so what is the superpopulation to which we are drawing inferences? This has actually always confused me. If I sample 1,000 kids in the USA about attitudes, then we draw inferences to the population from which the sample was drawn.
    * How large are the clusters for the fixed-effects options? Perhaps you're overloading the model with parameters. Using a random-effects model helps overcome this. It also keeps the data intact.
    * How large is the sample? Bootstrapping needs a sufficiently large sample, and this problem is compounded by the chainsaw massacre of fixed effects.

    In short, I'd follow Scott Long's steps if you really intend to use the bootstrap, but in reality I'd run this as a multilevel model and run regression diagnostics to ensure you're modeling the data as intended.

    I hope you share more aspects of your data and output, along with additional results, if you choose to follow some of the above steps. I'd be curious to see what you find!

    Hope this helps, and good luck!

    - Nate
    P.S. Why I don't worry about SEs (and more about regression diagnostics):
    King, Gary, and Margaret E. Roberts. 2015. "How Robust Standard Errors Expose Methodological Problems They Do Not Fix, and What to Do About It." Political Analysis 23 (2): 159–179. Copy at http://j.mp/1BQDeQT
    http://gking.harvard.edu/publications/how-Robust-Standard-Errors-Expose-Methodological-Problems-They-Do-Not-Fix
    Last edited by Nathan E. Fosse; 08 Jul 2015, 15:38.
    Nathan E. Fosse, PhD
    [email protected]

    • #3
      Thanks for your thoughts and the link. I agree that it may or may not make sense to bootstrap or use FEs depending on your setting, but for now let's just take it as a given that I'd like to use this regression model and I'd like to bootstrap my standard errors, or at least see the results of bootstrapped errors.

      1. I have set the seed (not shown in my original post, sorry) so this shouldn't be an issue.
      2. I agree with your point that I should look at regression diagnostics. This is why I included the "bsample" section of my code. My understanding is that bootstrap [some output]: [some program] should run the command after drawing a bootstrap sample (as bsample does) and look at the distribution of results. So the two programs should give (probabilistically) identical results.
      3. In my case, the bootstrap [...]: areg [...] command produces ~50% failed runs. The "bsample" procedure produces 0%. So there must be something wrong with the bootstrap [...]: areg [...] command or my understanding.
      4. I'd prefer to use the bootstrap [...]: areg [...] version as opposed to hand-coding my own bootstrap using bsample. But the ~50% error rate is very worrying when it seems there should be 0%.

      So, my question is, what could be going on with bootstrap [...]: areg [...] to make it give so many errors?

      • #4
        Cory:
        I would -set trace on- and scrutinize Stata behaviour.
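        For instance (a sketch; -set tracedepth- just limits how deep into nested programs the trace goes):
        set trace on
        set tracedepth 1
        bootstrap _b[BWTT], reps(50): areg fsizeGini BWTT BW YFE*, absorb(CID)
        set trace off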
        Kind regards,
        Carlo
        (StataNow 18.5)

        • #5
          Originally posted by Carlo Lazzaro View Post
          I would -set trace on- and scrutinize Stata behaviour.
          The idea is fine, but that tends to produce so much output that it would be hard to make sense of. There are ways to reduce the amount of output, but in this case you can get even more focussed output by using the noisily option in bootstrap.
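          For instance, a sketch using the command from #1:
          bootstrap _b[BWTT], reps(50) noisily: areg fsizeGini BWTT BW YFE*, absorb(CID)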
          ---------------------------------
          Maarten L. Buis
          University of Konstanz
          Department of history and sociology
          box 40
          78457 Konstanz
          Germany
          http://www.maartenbuis.nl
          ---------------------------------

          • #6
            Maarten:
            smarter than me as always!
            Kind regards,
            Carlo
            (StataNow 18.5)

            • #7
              Maarten and Carlo provide the best advice, in far fewer words. Anyway, I looked at the manuals a bit more, since this is a bit disconcerting. I hope the links to stata.com serve as guideposts.

              First link: the areg technical note recommends against using areg when the number of clusters grows with the sample size. In those instances, they recommend xtreg with the fe option.
              Second link: in using xtreg, they recommend the vce(boot) option "whenever possible."

              In short, sounds like Stata wants something like
              xtreg y x1 x2, fe vce(boot)
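
              Concretely, for Cory's model that might be (a sketch; I'm taking CID from the absorb() option as the panel id):

              xtset CID
              xtreg fsizeGini BWTT BW YFE*, fe vce(bootstrap, reps(50))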

              Hope this helps,

              - Nate

              Nathan E. Fosse, PhD
              [email protected]

              • #8
                Thanks for this advice, guys! Very good suggestions. I'll look into this more today and report back.

                • #9
                  In similar situations, I've sometimes found it useful to capture the actual bootstrap data set on which the error occurred, and inspect it later. To do this, I enclose the estimation command being bootstrapped in a small wrapper program, trap the error, and save the data set to a file name generated on the fly. Here's a sketch:
                  Code:
                  program wrapper, rclass  // rclass is easier for me
                      capture areg y x1 x2 ...
                      if (_rc > 0) {   // error
                          local fname = "error" + string(trunc(runiform() * 1e5))  // kludgy unique filename
                          save "`fname'"
                      }
                      else {
                          return scalar b1 = _b[x1]
                          return scalar b2 = _b[x2]
                          ......
                      }
                  end
                  bootstrap ..... : wrapper
                  Now, you can open any of the "error*" data sets, and analyze them for potential oddities. I have found this approach to be less messy and more informative than -trace- or the -noisily- option.
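
                  For concreteness, the elided bootstrap call might look something like this (names follow the wrapper's returned scalars; reps and other options are whatever you need):

                  bootstrap b1=r(b1) b2=r(b2), reps(50): wrapper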

                  Regards, Mike

                  • #10
                    After using the suggested diagnostics, I am getting the error "collinearity in replicate sample is not the same as the full sample, posting missing values". It seems other users have faced this issue before, although if they found a solution, none was posted:
                    http://www.stata.com/statalist/archi.../msg00052.html
                    http://www.stata.com/statalist/archi.../msg00432.html

                    The problem is the year controls (YFE* is a set of year fixed effects). If the bootstrapped sample drops all observations from a particular year, Stata throws the above error and does not record the result. Both areg and xtreg have this issue in the bootstrap [...]: areg/xtreg [...] formulation. I don't see an option to suppress this check. I could stratify on year, although that is a different resampling scheme (see the sketch below).
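
                    A sketch of the stratified version (it assumes a single year variable, which I'd have to construct from the YFE* dummies):

                    bootstrap _b[BWTT], reps(50) strata(year): areg fsizeGini BWTT BW YFE*, absorb(CID)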

                    Now, I'm not an expert on the bootstrap, but this seems like an implementation issue. The bootstrap should be fine for the main coefficient even if not every fixed effect is estimable in a given replicate. I'm guessing this must be what vce(bootstrap, [...]) does, which in this example works fine:

                    areg fsizeGini BWTT BW YFE*, absorb(CID) vce(bootstrap, reps(50))

                    Bootstrap replications (50)
                    ----+--- 1 ---+--- 2 ---+--- 3 ---+--- 4 ---+--- 5
                    .................................................. 50

                    • #11
                      Unfortunately, even this formulation has problems for the main results I want. I'm hoping to use clustered standard errors and to bootstrap on clusters. This gives, for both areg and xtreg:

                      areg fsizeGini BWTT BW, absorb(CID) vce(bootstrap, reps(50) cluster(BWy))
                      (running areg on estimation sample)

                      Bootstrap replications (50)
                      ----+--- 1 ---+--- 2 ---+--- 3 ---+--- 4 ---+--- 5
                      xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx 50
                      insufficient observations to compute bootstrap standard errors
                      no results will be saved

                      When I use the (very helpful) noisily option to diagnose, each regression gives the error:
                      "insufficient observations
                      an error occurred when bootstrap executed areg, posting missing values"



                      However, doing it by hand works out fine:

                      forvalues i = 1/50 {
                          preserve
                          bsample, cluster(BWy)
                          quietly: areg fsizeGini BWTT BW, absorb(CID)
                          disp(_b[BWTT])
                          restore
                      }

                      .00154858
                      .00105005
                      .001479
                      .00074453
                      .00073884
                      .00052895
                      .00132382
                      .00101373
                      ... etc.


                      So, what gives? Thanks again for all the help!

                      Cory

                      • #12
                        To anyone else with this issue: my current workaround has been to create modified copies of bootstrap.ado and _loop_bs (used by the former) that disable the checks that cause the errors. I have not been able to discover another way to deal with the issue, other than to program your own bootstrap with bsample.

                        I'm open to suggestions, or to hearing that there is a good reason why Stata enforces no dropped covariates in the bootstrap, but I sure as heck can't think of one.
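
                        For anyone going the bsample route, here is a sketch of a hand-rolled bootstrap that records each replicate and summarizes the distribution (the file name bsresults and reps count are arbitrary):

                        tempname sim
                        postfile `sim' b using bsresults, replace
                        forvalues i = 1/50 {
                            preserve
                            bsample, cluster(BWy)
                            quietly: areg fsizeGini BWTT BW, absorb(CID)
                            post `sim' (_b[BWTT])
                            restore
                        }
                        postclose `sim'
                        use bsresults, clear        // replaces the data in memory
                        summarize b                 // the SD of b is the bootstrap SE of _b[BWTT]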

                        • #13
                          I wish you'd explain your design. I've never encountered one in which "year" would be a natural cluster.
                          Steve Samuels
                          Statistical Consulting
                          [email protected]

                          Stata 14.2

                          • #14
                            Steve, I am not clustering on year, so I'm not sure how I gave that impression. I have included year dummies/fixed effects in the regression. These seem to be causing an issue with bootstrap: if the resample does not include a particular year, the whole bootstrap iteration is dropped. This seems an issue to me because, at a minimum, estimates would not be reported for a representative set of bootstrap samples.

                            While I am sure my design has issues, they are beside the point for the narrow technical question I am trying to answer (i.e., how to get bootstrapped standard errors). I'm happy to provide more details about the model if that would help answer the technical question.

                            • #15
                              Thanks. I misunderstood your comment about bootstrap dropping all the observations in a year.
                              Steve Samuels
                              Statistical Consulting
                              [email protected]

                              Stata 14.2
