  • How to use a macro inside a program and a macro for bootstrap

    I have 224 variables (e1, e2, ..., e224) and I want to bootstrap the critical values for each of them. I have been trying for some time, and here is the code:

    program define Boot, rclass
    bsample
    local j 1
    while `j' <= 224 {
    centile e`j', centile(.5 2.5 5 95 97.5 99.5)
    return scalar CV1`j'=r(c_1)
    return scalar CV2`j'=r(c_2)
    return scalar CV3`j'=r(c_3)
    return scalar CV4`j'=r(c_4)
    return scalar CV5`j'=r(c_5)
    return scalar CV6`j'=r(c_6)
    local j = `j' + 1
    end

    local i = `i' + 1
    while `i' <= 224 {
    bootstrap CV1`i'=r(CV1`i') CV2`i'=r(CV2`i') CV3`i'=r(CV3`i') CV4`i'=r(CV4`i') CV5`i'=r(CV5`i') CV6`i'=r(CV6`i'), reps(50) nodots: Boot
    local i = `i' + 1
    }

    While the code works well when I specify one variable, for example e1, it does not work with the macro. I am not sure what is wrong with this code.

    Thanks
    Lilly



  • #2
    Don't you need to close your while loop after "end"? Did you mean local i 1 instead of local i = `i' + 1?



    • #3
      Yes Dave, thank you. It works now. Cheers
      Last edited by Lilly; 02 May 2014, 06:36.



      • #4
        Sorry I should have said put the closing brace before the end!
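
        Putting the two fixes from #2 and #4 together, the corrected version of the original code might look like the sketch below. It assumes the variables e1 through e224 are in memory. The only changes to the posted code are the closing brace before end and the initialization local i 1; the capture program drop line is an addition so that the program can be redefined when the do-file is rerun.

        ```stata
        capture program drop Boot
        program define Boot, rclass
            bsample
            local j 1
            while `j' <= 224 {
                centile e`j', centile(.5 2.5 5 95 97.5 99.5)
                return scalar CV1`j' = r(c_1)
                return scalar CV2`j' = r(c_2)
                return scalar CV3`j' = r(c_3)
                return scalar CV4`j' = r(c_4)
                return scalar CV5`j' = r(c_5)
                return scalar CV6`j' = r(c_6)
                local j = `j' + 1
            }
            // the while loop is now closed before "end"
        end

        // initialize the counter instead of incrementing an undefined macro
        local i 1
        while `i' <= 224 {
            bootstrap CV1`i'=r(CV1`i') CV2`i'=r(CV2`i') CV3`i'=r(CV3`i') ///
                CV4`i'=r(CV4`i') CV5`i'=r(CV5`i') CV6`i'=r(CV6`i'), ///
                reps(50) nodots: Boot
            local i = `i' + 1
        }
        ```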



        • #5
          Lilly,

          as a suggestion, instead of using a while loop like you're doing, consider using a forvalues loop. I believe it is more efficient. For your particular case
          Code:
          forvalues j = 1/224 {
          ...
          }
          
          forvalues i = 1/224 {
          ...
          }
          Notice that you no longer have to define the local macros (j and i) since you already do in the declaration of the forvalues loop, and that you don't need to keep adding one to the macros either, since the loop does that automatically for you.
          Alfonso Sanchez-Penalver



          • #6
            Thank you Dave and Alfonso. I have noticed something; please correct me if any of the following is wrong. "bsample" replaces the original sample each time it runs. As a result, observations that were not selected the first time will never be selected by any later "bsample". Is there a way to ask Stata to draw the bsample from the original data set each time?

            Thank you guys for your helpful comments.
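
            On the point above: bsample does replace the data in memory with the resampled data, so a second bsample would draw from the first resample. One way to draw every sample from the original data is to wrap each draw in preserve and restore. A minimal sketch, assuming e1 is in memory; the 50 repetitions and the two centiles are arbitrary choices for illustration. The bootstrap prefix handles this bookkeeping itself, so an explicit loop like this is only needed when calling bsample directly.

            ```stata
            forvalues r = 1/50 {
                // keep a copy of the original data
                preserve
                // replace the data in memory with a resample
                bsample
                quietly centile e1, centile(2.5 97.5)
                display "rep `r': " r(c_1) "  " r(c_2)
                // bring the original data back before the next draw
                restore
            }
            ```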



            • #7
              What does clustering make bootstrap/bsample do? Can you explain the difference between these two codes?
              1)
              program define Boot, rclass
              bsample, cluster(Number)
              forvalues j = 1/5 {
              centile e`j', centile(.5 2.5 5 95 97.5 99.5)
              return scalar CV1`j'=r(c_1)
              return scalar CV2`j'=r(c_2)
              return scalar CV3`j'=r(c_3)
              return scalar CV4`j'=r(c_4)
              return scalar CV5`j'=r(c_5)
              return scalar CV6`j'=r(c_6)
              }
              end
              forvalues i = 1/5 {
              bootstrap CV1`i'=r(CV1`i') CV2`i'=r(CV2`i') CV3`i'=r(CV3`i') CV4`i'=r(CV4`i') CV5`i'=r(CV5`i') CV6`i'=r(CV6`i'), reps(50) nodots: Boot
              }

              2)
              program define Boot, rclass
              bsample
              forvalues j = 1/5 {
              centile e`j', centile(.5 2.5 5 95 97.5 99.5)
              return scalar CV1`j'=r(c_1)
              return scalar CV2`j'=r(c_2)
              return scalar CV3`j'=r(c_3)
              return scalar CV4`j'=r(c_4)
              return scalar CV5`j'=r(c_5)
              return scalar CV6`j'=r(c_6)
              }
              end
              forvalues i = 1/5 {
              bootstrap CV1`i'=r(CV1`i') CV2`i'=r(CV2`i') CV3`i'=r(CV3`i') CV4`i'=r(CV4`i') CV5`i'=r(CV5`i') CV6`i'=r(CV6`i'), reps(50) cluster(Number) nodots: Boot
              }
              Last edited by Lilly; 03 May 2014, 00:23.



              • #8
                Hi,
                I am trying to refine my program further. In particular, I want bsample to draw from the non-missing values of each variable e, and I want the program to start from the original sample each time. I am not sure why the program does not work. I receive the following error: "variable e2 not found; an error occurred when bootstrap executed Boot", r(111).

                program define Boot, rclass
                    preserve
                    forvalues j = 1/224 {
                        keep Number e`j'
                        drop if mi(e`j')
                        bsample
                        centile e`j', centile(.5 2.5 5 95 97.5 99.5)
                        return scalar CV1`j'=r(c_1)
                        return scalar CV2`j'=r(c_2)
                        return scalar CV3`j'=r(c_3)
                        return scalar CV4`j'=r(c_4)
                        return scalar CV5`j'=r(c_5)
                        return scalar CV6`j'=r(c_6)
                    }
                    restore
                end

                forvalues i = 1/224 {
                    bootstrap CV1`i'=r(CV1`i') CV2`i'=r(CV2`i') CV3`i'=r(CV3`i') CV4`i'=r(CV4`i') CV5`i'=r(CV5`i') CV6`i'=r(CV6`i'), reps(50) cluster(Number) seed(12121212) nodots: Boot
                    estat bootstrap, all
                }


                Thank You



                • #9
                  I'm not sure what you're hoping to accomplish by bootstrapping centiles (or critical values). But, from a purely mechanical standpoint, why don't you try something like the following?
                  Code:
                  version 13.1
                  
                  set more off
                  set seed 12121212
                  
                  forvalues i = 1/224 {
                  
                      preserve
                      quietly drop if mi(e`i')
                  
                  
                      display in smcl as text _newline(2) "Variable = e`i'"
                  
                      bootstrap ///
                          CV1 = r(c_1) ///
                          CV2 = r(c_2) ///
                          CV3 = r(c_3) ///
                          CV4 = r(c_4) ///
                          CV5 = r(c_5) ///
                          CV6 = r(c_6), ///
                          reps(50) cluster(Number) nodots: centile e`i', centile(.5 2.5 5 95 97.5 99.5)
                  
                      estat bootstrap, all
                  
                  
                      restore
                  }
                  
                  exit



                  • #10
                    Thanks Joseph. The problem is that the errors of my regression models are not normally distributed and hence I will not be able to make inferences (using the usual 1% and 5% P values). I need to determine the critical values for each model independently. Does that make sense to you?



                    • #11
                      Hi Lilly,

                      I believe what Joseph is trying to tell you is that you don't need to use bsample within a program that you're going to bootstrap, since bootstrap takes care of setting the sample in each repetition. Also, the problem in your last post seems to be with the command keep Number e`j'. That command keeps only the two variables you name. In the first iteration of the loop you therefore drop every e variable other than e1, so the second iteration cannot find e2. What is the purpose of that line?
                      Last edited by Alfonso Sánchez-Peñalver; 04 May 2014, 08:23.
                      Alfonso Sanchez-Penalver



                      • #12
                        Thanks, I see your point Alfonso. The problem is that each error has a different set of missing observations (N for e1 is 104 while N for e2 is 116), so in each loop I want to go back to the original data set and delete the observations that are missing for that particular e`j'. I am not sure how to instruct Stata to get back to the original sample.
                        My intention with bsample is to draw a random sample from the errors, get the critical values, repeat this, say, 100 times, and then take the average critical values. I am not sure if my code does this correctly. Thank you again for your helpful comments.



                        • #13
                          Joseph gave you the idea with the preserve ... restore command. See help preserve.
                          Alfonso Sanchez-Penalver



                          • #14
                            I tried Joseph's code but it does not seem to produce what I want. Here is what I want: 1) select a random sample of each error; 2) determine the critical values for each error; 3) repeat steps 1 and 2 50 times for each error and take the average critical values over the 50 reps. How do you think I should proceed? Thanks for your help guys.
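
                            The three steps listed above could be sketched as follows, building on Joseph's code in #9. This is an assumption about what is wanted, not a confirmed solution: the saving() option stores the 50 replicate centiles in a temporary file (the macro name reps is arbitrary), and summarize then reports their means. cluster(Number) and the seed are carried over from the earlier posts.

                            ```stata
                            set seed 12121212

                            forvalues i = 1/224 {
                                // work on a copy so each variable starts from the original data
                                preserve
                                quietly drop if mi(e`i')

                                // steps 1-3: resample, compute centiles, repeat 50 times,
                                // saving one row of centiles per replication
                                tempfile reps
                                quietly bootstrap ///
                                    CV1 = r(c_1) CV2 = r(c_2) CV3 = r(c_3) ///
                                    CV4 = r(c_4) CV5 = r(c_5) CV6 = r(c_6), ///
                                    reps(50) cluster(Number) nodots saving(`reps'): ///
                                    centile e`i', centile(.5 2.5 5 95 97.5 99.5)

                                // the Mean column is the average critical value over the 50 reps
                                use `reps', clear
                                display as text _newline "Variable = e`i'"
                                summarize CV1 CV2 CV3 CV4 CV5 CV6

                                restore
                            }
                            ```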



                            • #15
                              I'm still not sure what you're doing with bootstrapping centiles (or critical values or whatever), but if the residuals are not normally distributed, then there are alternatives to resampling them. For example, you can try data transformations, generalized linear models, and permutation (randomization) tests.
                              Last edited by Joseph Coveney; 04 May 2014, 21:16.
