
  • #16
    Since the upgrade to 13.1 is free, why do you need something for 13.0? See -help update-.



    • #17
      Thanks again!


      If I add [`idx'] after `xb' when generating the new dependent variable, will it then bootstrap the linear prediction as well, as Fama & French do?


      Code:
      * the new dependent variable using resample residuals
              gen double `y' = `xb'[`idx'] + `residual'[`idx']


      I get some funds with 100% of the simulated alphas/t-stats above the actual. Isn't this extreme? Or just evidence of very bad skill?


      Finally, is there a way to loop this? I currently do it individually for every fund (about 70 funds) and for 2 time periods, which is, needless to say, a very tedious task :P



      Thanks a lot, Jeff! Really helpful!



      //alex




      • #18
        Changing the line
        Code:
        gen double `y' = `xb' + `residual'[`idx']
        to
        Code:
        gen double `y' = `xb'[`idx'] + `residual'[`idx']
        does in fact resample the linear predictions in the same way as
        the residuals.

        It seems odd to me that 100% of the resulting intercept estimates are more
        extreme than the observed intercept; they should be estimating zero by
        construction.

        The short answer to the looping question is: yes. It is just a matter of how
        your data is structured. I would compose a program that performed the task
        for a single fund and time period, then just call this program in a loop that changes
        the fund and time period. simulate can save its results to a dataset, so
        I would recommend composing the new data file names from the fund names
        and time period.
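
        A minimal sketch of such a loop, under stated assumptions: the Grunfeld
        panel from -webuse- stands in for the fund data (company plays the role
        of the fund, two year ranges play the role of the time periods), and the
        output file names bs_fund#_p# are hypothetical. The program is the
        bs_resid program from this thread with the intercept zeroed out
        beforehand. Adapt the variable names to your own data structure.

        ```stata
        clear all

        program define bs_resid
            version 13.1
            syntax, RESidual(varname numeric) MATrix(name)
            local xvars : colnames `matrix'
            local CONS _cons
            local xvars : list xvars - CONS
            tempvar xb idx y
            matrix score double `xb' = `matrix'
            gen long `idx' = ceil(_N*runiform())
            gen double `y' = `xb' + `residual'[`idx']
            regress `y' `xvars', vce(robust)
        end

        * stand-in panel: company = "fund", two year ranges = "periods"
        webuse grunfeld, clear
        gen byte period = cond(year <= 1944, 1, 2)
        tempfile panel
        save "`panel'"

        set seed 12345
        levelsof company, local(funds)
        foreach f of local funds {
            forvalues p = 1/2 {
                * load one fund and one period at a time
                use if company == `f' & period == `p' using "`panel'", clear
                regress invest mvalue kstock, vce(robust)
                matrix b = e(b)
                * impose the null of a zero intercept
                local icons = colnumb(b, "_cons")
                matrix b[1, `icons'] = 0
                predict double resid, residuals
                * one results file per fund and period
                simulate _b _se, reps(1000) nodots        ///
                    saving(bs_fund`f'_p`p', replace) :    ///
                    bs_resid, residual(resid) matrix(b)
            }
        }
        ```

        Reloading the data inside the loop is a deliberate choice: simulate
        replaces the data in memory with its results, so each iteration starts
        again from the saved panel.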



        • #19
          Here are the results from the bootstrap. The leftmost column reports the actual and average simulated alphas, as well as the percentage of simulated alphas above the actual alpha. The rightmost column reports the same for the t-statistics of alpha.


          Any ideas on the interpretation? It seems odd to me that the worst funds, ranked on their actual alpha, have the highest simulated alpha values?!



          • #20
            Hi!

            I have a question about the code. I added [`idx'] to `xb' as well, but what I really want is to resample only the factor returns and the residuals while keeping the coefficients constant, as Fama and French do. Right now it seems to me that it resamples the residuals and the product (coefficients*factor returns). Is this possible?

            Further, any ideas as to why the simulated alphas/t-stats are so high? They are higher for the worst funds than for the best.



            Thanks a lot!



            • #21
              Excerpted from my previous post, the following code computes the linear
              prediction from the originally fitted model, generates a simple random sample
              with replacement of the current set of observations, then generates a new
              outcome from the resampled linear prediction and residuals.

              Code:
              tempvar xb idx y
              matrix score double `xb' = `matrix'
              gen long `idx' = ceil(_N*runiform())
              gen double `y' = `xb'[`idx'] + `residual'[`idx']
              This is the same as sampling from the x variables and residuals, then
              using the original model coefficients to generate a new outcome. The
              coefficients are not changing in the code I provided.
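
              A small check of that equivalence, as a sketch only: it uses the
              auto data from the earlier example as a stand-in, and the variable
              names xb_a and xb_b are made up for illustration. Version (a)
              resamples the stored linear prediction; version (b) resamples the
              x variables and applies the same fixed coefficients.

              ```stata
              sysuse auto, clear
              set seed 12345

              regress mpg turn trunk displacement
              matrix b = e(b)

              * linear prediction from the fitted coefficients
              matrix score double xb = b

              * draw observation numbers with replacement
              gen long idx = ceil(_N*runiform())

              * (a) resample the stored linear prediction
              gen double xb_a = xb[idx]

              * (b) resample the x variables, keep the coefficients fixed
              gen double xb_b = _b[_cons] + _b[turn]*turn[idx]  ///
                  + _b[trunk]*trunk[idx]                        ///
                  + _b[displacement]*displacement[idx]

              * the two versions agree up to rounding error
              assert reldif(xb_a, xb_b) < 1e-12
              ```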

              "Any ideas as to why the simulated alphas/t-stats are so high?"
              I have no ideas. It seems odd to me that 100% of the resulting intercept
              estimates are more extreme than the observed intercept; they should be
              estimating zero by construction.



              • #22
                Hello. I have been reading through this thread the last couple of days. I think the program you wrote is doing what I am trying to do. I will try to explain it here.

                I analyze y=cons+bX+e

                Then, I predict y using the results, to get yhat.

                yhat=cons+bX

                I want to add randomly drawn residuals (e*) back to my yhat terms, to get yhat*

                yhat*=yhat+e*

                I then regress the bootstrapped yhat* values on the original set of regressors.

                I believe this is the goal of your program above, but I am not 100% sure.


                Any assistance here would be great.

                Thank you.

                -Steve

                PS I am sorry my user name is not my full name. I am in the process of changing that.





                • #23
                  Dear all,

                  I hope you are well. I am studying your Stata program for the Fama and French (2010) bootstrapping. Many thanks for your kind contribution on the programming, which helped me a lot.

                  Dear Jeff,


                  I am writing to ask whether you could kindly help me add a loop to your original Stata program.

                  According to the literature, we need to obtain OLS-estimated alphas, factor loadings and residuals for each fund. We then construct a sample of pseudo excess returns by randomly resampling the independent variables and residuals with replacement, over the full cross section of fund returns simultaneously, and impose the null of a zero intercept, thereby producing a common time ordering across all funds in each bootstrap. A more detailed explanation is given on page 1 of this thread.

                  For your information, the methodology of Fama and French (2010) is also available from pages 6-8 in the following working paper
                  http://www.pensions-institute.org/workingpapers/wp1404.pdf.

                  I am ashamed to say that I have very little knowledge of Stata programming.

                  Many thanks. All your help is highly appreciated.

                  Catherine




                  • #24
                    Hi Jeff,

                    Is there any possibility of getting the code for version 11.2? I would greatly appreciate any help.



                    • #25
                      Originally posted by Jeff Pitblado (StataCorp)
                      Our code is bootstrapping the residuals, while leaving the other variables as untouched.

                      After rereading through the thread I realized that our code is not zeroing out the intercept
                      before generating the Y variable from the bootstrapped residuals. Here is a modified version
                      of my code that does this:

                      Code:
                      program bs_resid
                      version 13.1
                      syntax, RESidual(varname numeric) MATrix(name)
                      
                      * get the varlist for -regress-
                      local xvars : colna `matrix'
                      local CONS _cons
                      local xvars : list xvars - CONS
                      
                      * compute the linear prediction
                      tempvar xb idx y
                      matrix score double `xb' = `matrix'
                      
                      * idx randomly selects the observations with replacement
                      gen long `idx' = ceil(_N*runiform())
                      
                      * the new dependent variable using resample residuals
                      gen double `y' = `xb' + `residual'[`idx']
                      
                      regress `y' `xvars', vce(robust)
                      end
                      
                      set seed 12345
                      sysuse auto
                      
                      regress mpg turn trunk displ, vce(robust)
                      matrix b = e(b)
                      
                      * zero intercept
                      local icons = colnumb(b, "_cons")
                      matrix b[1,`icons'] = 0
                      
                      predict double resid, residuals
                      histogram resid
                      
                      simulate _b _se, reps(1000) : bs_resid, res(resid) mat(b)
                      sum
                      Hi Jeff,

                      I tried your example code in Stata 11 and get the following error message:
                      Code:
                       program mysim_r
                        1. version 11
                        2. syntax name(name=bvector), res(varname)
                        3. tempvar y rid
                        4. local xvars : colnames 'bvector'
                        5. local cons _cons
                        6. local xvars: list xvars - cons
                        7. matrix score double 'y' = 'bvector'
                        8. gen long 'rid' = int(_N*runiform())+1
                        9. replace 'y' = 'y'+'res'['rid']
                       10. regress 'y' 'xvars'
                       11. end
                      
                      . set seed 54321
                      
                      . mysim_r b, res(res)
                      varlist not allowed
                      r(101);
                      I am not sure how to solve the error. Could you please help me solve it?

                      Best Regards
                      Fan



                      • #26
                        Dear Jeff,

                        thank you for posting the above command. I do have a similar problem for which I wanted to use your code:

                        Based on observable characteristics, I am trying to simulate hypothetical program starts for non-participants in a program evaluation (i.e. I have a data set with approximately 20.000 treated and 20.000 non-treated observations. Those treated receive treatment in 6 different time periods. In order to evaluate ATE, I want to simulate hypothetical program starts for the non-treated before I do a propensity score matching.)
                        I follow a strategy proposed in a paper by Lechner et al. (2011, EER): "We regress the log time to participation within the unemployment spell of participants on a set of personal and regional characteristics that seem important for the timing of the program; then we use the estimated coefficients together with a draw from the residual distribution to predict a corresponding value for nonparticipants."

                        I ran your code on my data:
                        Code:
                        program bs_resid
                        version 13.1
                        syntax, RESidual(varname numeric) MATrix(name)

                        * get the varlist for -regress-
                        local xvars : colna `matrix'
                        local CONS: _cons
                        local xvars : list xvars - CONS

                        * compute the linear prediction
                        tempvar xb idx y
                        matrix score double `xb' = `matrix'

                        * idx randomly selects the observations with replacement
                        gen long `idx' = ceil(_N*runiform())

                        * the new dependent variable using resample residuals
                        gen double `y' = `xb' + `residual'[`idx']

                        regress `y' `xvars', vce(robust)
                        end

                        set seed 12345

                        reg treat $bs_cntr_ind $empl_hist , vce(robust) , if treat>0
                        matrix b = e(b)

                        * zero intercept
                        local icons = colnumb(b, "_cons")
                        matrix b[1,`icons'] = 0

                        predict double resid, residuals
                        histogram resid

                        simulate _b _se , reps(1000) : bs_resid, res(resid) mat(b)
                        sum



                        However, I get the following error:
                        . simulate _b _se , reps(1000) : bs_resid, res(resid) mat(b)
                        _cons not allowed
                        an error occurred when simulate executed bs_resid


                        Any chance you could tell me what the mistake is here?

                        Thank you in advance and happy holidays to everyone.

                        Kerstin



                        • #27
                          Kerstin, when this thread was started it concerned the asset pricing literature, where the asset pricing model says that the constant has to be 0.

                          In your application there is no such requirement, and in fact it seems wrong to me to impose the zero-constant constraint.

                          Look up the first program that Jeff provided. In that first program he did not impose the constraint that the constant is 0. As your error message is somehow related to the constant, reverting to his initial program, which is actually the correct one for your case, might resolve the problem.



                          • #28
                            Dear Joro
                            Thanks for your reply. I was aware of that and used the first code without the constrained constant; however, it still does not work. The error I get again and again is "an error occurred when simulate executed bs_resid".
                            Would it make sense to start a new thread for the problem I have?
                            Thank you very much for your help.

                            Kerstin



                            • #29
                              Kerstin, I do not know what will maximise your chances of a useful response. On the one hand it is a different issue, so another thread might be a good idea; on the other hand you are working with a program that was first posted in this thread, so it makes perfect sense to me to post your question here.

                              Speaking to the issue, the error message that -simulate- returns is not very useful: "an error occurred when simulate executed PROGRAM_NAME" simply tells you that something went wrong. What went wrong, on the basis of this message, only god knows.

                              The way I troubleshoot simulations is as follows:

                              1. I first run the program by itself. That is, if Program_Name is to be run multiple times by -simulate-, then after writing the program I run it once on its own and see whether it goes through. When you do that, you will see exactly where in the program the error occurs.

                              2. Another very useful tool for troubleshooting programs and simulations is

                              set trace on

                              This reports, step by step, everything that Stata does and interprets, which makes it easier to see where the problem occurred.

                              In short, somewhere in your do-file after you have defined bs_resid, but before you execute it multiple times through -simulate-, add the following lines of code:

                              set trace on

                              bs_resid, res(resid) mat(b)

                              Then see what happens and whether it clarifies what goes wrong.
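
                              A minimal sketch of that debugging session, assuming the auto data as a
                              stand-in because the data from #26 are not available. One thing worth
                              checking against the trace output: the code in #26 has
                              `local CONS: _cons` where Jeff's program has `local CONS _cons`. With
                              the colon, Stata appears to parse _cons as an extended macro function,
                              which would be consistent with the "_cons not allowed" message. The
                              regress call in #26 also puts the if qualifier after the options; the
                              usual form is regress y x if condition, vce(robust).

                              ```stata
                              clear all
                              sysuse auto, clear

                              program define bs_resid
                                  version 13.1
                                  syntax, RESidual(varname numeric) MATrix(name)
                                  local xvars : colnames `matrix'
                                  local CONS _cons        // no colon after the macro name
                                  local xvars : list xvars - CONS
                                  tempvar xb idx y
                                  matrix score double `xb' = `matrix'
                                  gen long `idx' = ceil(_N*runiform())
                                  gen double `y' = `xb' + `residual'[`idx']
                                  regress `y' `xvars', vce(robust)
                              end

                              regress mpg turn trunk displacement, vce(robust)
                              matrix b = e(b)
                              predict double resid, residuals

                              * run the program once, outside -simulate-, with tracing on
                              set seed 12345
                              set trace on
                              set tracedepth 1
                              bs_resid, residual(resid) matrix(b)
                              set trace off
                              ```

                              Limiting the trace depth to 1 keeps the output to the lines of bs_resid
                              itself rather than everything that regress calls internally.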



                              • #30
                                Dear Joro

                                thank you very much. This helped me a lot.

                                Best

                                Kerstin
