  • Bootstrapping

    How would I bootstrap data that I have in Excel? I found an online tutorial at https://www.youtube.com/watch?v=_8-2QBL-9UM; however, it generates random data. I want to use my own data from Excel. Many thanks!

  • #2
    The first step is to import your data into Stata. For that you can use import excel.

    After that, forget the tutorial you found on YouTube. Once you know enough about the bootstrap and programming in Stata, what he does is kinda cute, but even then not very useful.

    Instead, it is often as simple as your_estimation_command y x1 x2 x3, some_options vce(bootstrap, reps(2000)). For example:

    Code:
    sysuse nlsw88, clear
    reg wage ttl_exp grade i.race, vce(bootstrap, reps(2000))
    glm wage ttl_exp grade i.race, link(log) family(poisson) eform vce(bootstrap, reps(500))
    You just type in Stata help your_estimation_command (where you replace your_estimation_command with the command you want to use), look in the help-file for the vce() option, and in the description of that option you will see if the bootstrap is supported or not.

    If the bootstrap is not implemented for your estimation command, you should try to figure out why. For example, bootstrapping time series is complicated because taking random samples of observations would destroy the association across time, while that is the very thing you are trying to model. If there is no obvious reason why you should not use the bootstrap, then you can implement it with the bootstrap command. See help bootstrap, and don't forget the PDF manual (link at the top of the help file), which contains more information.
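
    As a minimal sketch of that last route, assume the statistic of interest is something no estimation command reports directly. Here it is the gap between the mean and the median of mpg in the auto data, chosen purely for illustration. The name diff, the seed, and the number of replications are arbitrary choices.

    ```stata
    sysuse auto, clear
    * summarize, detail leaves the mean in r(mean) and the median in r(p50)
    bootstrap diff=(r(mean)-r(p50)), reps(500) seed(12345): summarize mpg, detail
    * percentile-based confidence interval from the same replications
    estat bootstrap, percentile
    ```

    The same pattern works with any command that leaves its results in r() or e().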
    ---------------------------------
    Maarten L. Buis
    University of Konstanz
    Department of history and sociology
    box 40
    78457 Konstanz
    Germany
    http://www.maartenbuis.nl
    ---------------------------------


    • #3
      Many thanks for your answer. I imported the Excel file successfully and used this command:

      reg pi_obs in_obs c_obs r_obs m_obs, vce(bootstrap, reps(1000))

      However, I then get the error message "variable pi_obs not found". Please see the attached Excel file; pi_obs is one of the columns.


      • #4
        If you cannot find a variable that should be in the data, then you did not successfully import the data. You did not tell us how you attempted to do so, so that is all I can tell you.
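
        A quick way to check what actually ended up in memory is sketched below. It assumes the data were just imported and that pi_obs is the variable you expect to find; nothing here is specific to your file.

        ```stata
        * names, storage types and labels of all variables in memory
        describe
        * first few rows: are the column headers sitting in the data instead?
        list in 1/3
        * confirm sets _rc to a nonzero value if pi_obs does not exist
        capture confirm variable pi_obs
        if _rc {
            display as error "pi_obs is not in memory, so the import did not do what you expected"
        }
        ```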

        The other thing that immediately jumps out when looking at this data, is that there is a time-series element to this data, and I literally used time-series as an example of when you probably should not use the bootstrap....
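
        To illustrate why resampling rows is a problem with serially dependent data, here is a small simulation. Everything in it is made up for the demonstration: an AR(1) series with coefficient 0.8, 500 observations, and the variable names y, pick and ystar. Rows are drawn with replacement, which is what a naive bootstrap does, and the lag-1 correlation is computed before and after.

        ```stata
        clear
        set seed 12345
        set obs 500
        gen t = _n
        tsset t
        * simulate an AR(1) series; replace works through the rows in order
        gen y = rnormal() in 1
        replace y = 0.8*y[_n-1] + rnormal() in 2/l
        * lag-1 correlation in the original time order
        corr y L.y
        * draw row numbers with replacement and line the draws up as a new "series"
        gen pick = ceil(_N*runiform())
        gen ystar = y[pick]
        * lag-1 correlation after resampling
        corr ystar L.ystar
        ```

        The first correlation should come out close to 0.8 and the second close to 0: the resampled series has lost the dependence you are trying to model.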
        ---------------------------------
        Maarten L. Buis
        University of Konstanz
        Department of history and sociology
        box 40
        78457 Konstanz
        Germany
        http://www.maartenbuis.nl
        ---------------------------------


        • #5
          Regarding the import, I clicked on the File drop-down menu, then on Import, and then on Excel spreadsheet (.xls). Then I browsed and chose the data file Data.xlsx. After this, a line appeared in the command window:

          import excel "C:\Users\...\Data.xlsx", sheet("Sheet1")
          (7 vars, 352 obs)

          Perhaps the reg command was wrong? I have 5 variables in my DSGE model, and I don't have a classical regression with dependent and independent variables y and x.

          Furthermore, would panel data be a better idea than time series? I have three countries in my sample, and I run a time series for each of them. Thank you.


          • #6
            There is a little tick box on that menu saying "import first row as variable names", which you forgot to tick.
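
            The same thing can be done from a do-file with the firstrow option of import excel. This is only a sketch: it assumes Data.xlsx sits in the current working directory (your actual path will differ) and that the sheet is called Sheet1, as in the command you posted.

            ```stata
            * firstrow: treat the first row of the sheet as variable names
            import excel using "Data.xlsx", sheet("Sheet1") firstrow clear
            * pi_obs and the other column headers should now show up as variable names
            describe
            ```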

            The reg command is exactly right when that is the model you want to estimate, and exactly wrong when that is not the case. If you want to estimate a dynamic stochastic general equilibrium model, then your first step is to type help dynamic stochastic general equilibrium into Stata, and you will see that there is an entire manual dedicated to estimating these models. A quick glance at it tells me that the bootstrap is not an option, and that does not surprise me. Why do you think you need the bootstrap in this case?
            ---------------------------------
            Maarten L. Buis
            University of Konstanz
            Department of history and sociology
            box 40
            78457 Konstanz
            Germany
            http://www.maartenbuis.nl
            ---------------------------------


            • #7
              Thank you. After ticking the box, the bootstrapping works fine. But, just to check: given that this is a DSGE model, are the results wrong, so that I can't use them?

              For my master's thesis, I'm trying to replicate a paper by Jerger and Röhe (in the attachment, please see page 12/27). On the Dynare forum I was already told that bootstrapping doesn't make much sense in this case, but according to that paper, this is exactly what the authors have done.
              Last edited by Svit Valencic; 12 Mar 2024, 14:25. Reason: grammar


              • #8
                The fact that an article gets published does not guarantee it is free from mistakes, or that you are interpreting it correctly. In this case they are using a parametric bootstrap, which is not the same as the bootstrap. Given the questions you have asked, I will guess that doing this right is beyond your capability (for now). Instead I would suggest just using the dsge command.

                That sounds harsh, but remember we all specialize in different things, which also means that we don't specialize in other things. To do this right you need to specialize in the bootstrap, and for very good reasons, not everybody does that, and you seem to be one of those people.
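
                To make the distinction concrete, here is a toy contrast on the auto data, not a recipe for a DSGE model. The nonparametric bootstrap resamples the observed rows; the parametric bootstrap draws new data from a fitted model, which here is simply assumed to be a normal distribution with the sample mean and standard deviation of mpg. The program name simmean and its arguments are made up for this sketch.

                ```stata
                sysuse auto, clear
                * nonparametric: resample the 74 observed rows with replacement
                bootstrap m=r(mean), reps(200) seed(1): summarize mpg

                * parametric: store the fitted parameters, then simulate new samples from them
                summarize mpg
                local n = r(N)
                local mu = r(mean)
                local sigma = r(sd)

                capture program drop simmean
                program define simmean, rclass
                    args n mu sigma
                    drop _all
                    set obs `n'
                    gen x = rnormal(`mu', `sigma')
                    summarize x, meanonly
                    return scalar mean = r(mean)
                end

                * simulate replaces the data in memory with the 200 simulated means
                simulate m=r(mean), reps(200) seed(1): simmean `n' `mu' `sigma'
                summarize m
                ```

                All the difficulty in your case sits in the step that is a single rnormal() call here: simulating data correctly from the fitted DSGE model.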
                Last edited by Maarten Buis; 12 Mar 2024, 14:45.
                ---------------------------------
                Maarten L. Buis
                University of Konstanz
                Department of history and sociology
                box 40
                78457 Konstanz
                Germany
                http://www.maartenbuis.nl
                ---------------------------------


                • #9
                  I respect your opinion, but I believe that you underestimate my willingness to learn. I've found an example of parametric bootstrapping in Stata here: https://friosavila.github.io/playing...bootstrap.html

                  Could you at least tell me whether that example is correct and, if not, where I can learn more? I can't find any mention of parametric bootstrapping in the manual. Thank you.


                  • #10
                    I don't doubt your willingness to learn; I just doubt whether you can free up enough time to study this one subject. As a rule of thumb, such a project always takes at least twice as long as you originally thought. I would budget about 3 months for this one step of the analysis (so make that 6 months), and I already have a fair basis in both Stata and the bootstrap.

                    The kind of problems you had importing the data, and especially the inability to diagnose what was wrong, suggest to me that you have a lot to learn about basic data analysis and how to use Stata. To rectify that you should budget about 3 months. A good place to start is the "Getting Started" and "User's Guide" manuals. After that you can look at "Data Analysis using Stata" https://www.stata.com/bookstore/data...s-using-stata/ and/or "The Workflow of Data Analysis using Stata" https://www.stata.com/bookstore/work...nalysis-stata/ and/or "An Introduction to Stata Programming, Second Edition" https://www.stata.com/bookstore/intr...a-programming/

                    The idea you had that the bootstrap would be the right method for this type of data suggests to me a lack of basic understanding of the bootstrap. To rectify that I would budget another 3 months. Here I would study:
                    A.C. Davison & D.V. Hinkley (1997) Bootstrap Methods and their Application. Cambridge: Cambridge University Press.

                    B. Efron & R.J. Tibshirani (1993) An Introduction to the Bootstrap. Boca Raton: Chapman & Hall/CRC.

                    Now you need to apply the basics you have learned to a dsge model. Here you would need to do your own research. Budgeting 3 months for that is probably a bit optimistic.

                    So adding that up gives 9 months of study, and using the rule of thumb you should expect this to take about one and a half years. Not many people can free up that amount of time.
                    ---------------------------------
                    Maarten L. Buis
                    University of Konstanz
                    Department of history and sociology
                    box 40
                    78457 Konstanz
                    Germany
                    http://www.maartenbuis.nl
                    ---------------------------------


                    • #11
                      Okay, you made your point. However, my master's thesis advisor helps me a lot at points like this, when I have to do complicated tasks. But before asking for that help, I want to make an effort and write some code myself, even if it's wrong. After all, my DSGE model is very simple and includes just 5 observed variables. So, could you please tell me whether your code, which I found on the forum, is a good starting point? Many thanks!

                      Link:
                      https://www.stata.com/statalist/arch.../msg00926.html

                      Code:
                      *---------- begin example -----------
                      tempname mean sd memhold
                      tempfile results

                      sysuse auto, clear
                      sum mpg
                      gen lnmpg = ln(mpg)
                      sum lnmpg
                      scalar `mean' = r(mean)
                      scalar `sd' = r(sd)

                      postfile `memhold' mean using `results'
                      forvalues i = 1/1000 {
                          capture drop sample
                          gen sample = exp(`sd'*invnorm(uniform())+`mean')
                          sum sample, meanonly
                          post `memhold' (r(mean))
                      }
                      postclose `memhold'

                      use `results', clear
                      sum mean, detail
                      *-------------- end example --------------


                      • #12
                        Not for your problem. I am sorry, but this is the type of problem where it is far too easy to make horrible mistakes without realizing it. So there are no shortcuts: either invest the one and a half years or don't do it.

                        I realize that that is not a fun answer, but sometimes questions don't have an answer or at least not an answer with a reasonable cost. I have given you all the help I can, even if that was not exactly the help you were looking for.
                        ---------------------------------
                        Maarten L. Buis
                        University of Konstanz
                        Department of history and sociology
                        box 40
                        78457 Konstanz
                        Germany
                        http://www.maartenbuis.nl
                        ---------------------------------
