  • 'post' command and matrix values

    I am having trouble pulling parts of a matrix after running svy: mean on a stratified simple random sample I have drawn. I am using postfile and post commands and it seems to work inconsistently. There is clearly something I don't understand about the naming and calling of the values I am interested in. Hope someone can help.

    The following works:
    postfile buffer mhat sehat using mcs_SSRS

    ..... draw sample.....

    svyset [pweight=wt], strata(strata) fpc(fpc)
    svy: mean y
    post buffer (_b[y]) (_se[y])
    postclose buffer


    However, the following does not work --- it fails on the V_srs term:
    postfile buffer mhat sehat V_srshat using mcs_SSRS

    ..... draw sample.....

    svyset [pweight=wt], strata(strata) fpc(fpc)
    svy: mean y
    post buffer (_b[y]) (_se[y]) (_V_srs[y])
    postclose buffer
    it results in
    _V_srs not found
    post: above message corresponds to expression 3, variable V_srshat

    r(111);

    Interestingly, no "se" matrix is listed when I type ereturn list, but there are e(b) and e(V_srs) matrices. Both can be displayed as shown below, while e(se) cannot.

    ereturn list

    scalars:
    e(df_r) = 109
    e(N_strata_omit) = 0
    e(singleton) = 0
    e(census) = 0
    e(N_pop) = 5616.000091552734
    e(N_psu) = 111
    e(N_strata) = 2
    e(N_over) = 1
    e(N) = 111
    e(stages) = 1
    e(k_eq) = 1
    e(rank) = 1

    macros:
    e(cmdline) : "svy : mean y"
    e(cmd) : "mean"
    e(prefix) : "svy"
    e(cmdname) : "mean"
    e(command) : "mean y,"
    e(title) : "Survey: Mean estimation"
    e(vcetype) : "Linearized"
    e(vce) : "linearized"
    e(estat_cmd) : "svy_estat"
    e(varlist) : "y"
    e(marginsnotok) : "_ALL"
    e(wtype) : "pweight"
    e(wvar) : "wt"
    e(wexp) : "= wt"
    e(singleunit) : "missing"
    e(strata1) : "strata"
    e(fpc1) : "fpc"
    e(properties) : "b V"

    matrices:
    e(b) : 1 x 1
    e(V) : 1 x 1
    e(_N_subp) : 1 x 1
    e(V_srswr) : 1 x 1
    e(V_srs) : 1 x 1
    e(_N) : 1 x 1
    e(error) : 1 x 1
    e(_N_strata_certain) : 1 x 1
    e(_N_strata_single) : 1 x 1
    e(_N_strata) : 1 x 1

    functions:
    e(sample)

    matrix list e(b)

    symmetric e(b)[1,1]
    y
    y1 .31518578


    matrix list e(V_srs)

    symmetric e(V_srs)[1,1]
    y
    y .01460002


    matrix list e(se)
    matrix e(se) not found
    r(111);
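
One possible reading of the error: _b[] and _se[] are built-in accessors, but there is no matching _V_srs[] accessor, so the third expression cannot be evaluated. A minimal sketch of a workaround, assuming it is acceptable to copy e(V_srs) into a named matrix first and post its single element. The auto dataset, the constant weight wt, and the file name mcs_demo are stand-ins for illustration, not part of the original simulation.

```stata
* Stand-in data: auto.dta ships with Stata (assumption, not the poster's data)
sysuse auto, clear
generate wt = 1                          // illustrative constant weight
quietly svyset [pweight=wt], strata(foreign)

postfile buffer mhat sehat V_srshat using mcs_demo, replace

quietly svy: mean price
matrix Vsrs = e(V_srs)                   // copy the stored 1 x 1 matrix
* _b[] and _se[] are built in; the SRS variance is read from the copied matrix
post buffer (_b[price]) (_se[price]) (Vsrs[1,1])

postclose buffer
```

In a simulation, the svy, matrix, and post lines would sit inside the loop, with postfile before it and postclose after it.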


  • #2
    Hi Josh,
    Setting aside the use of "svy": after most estimation commands, including regress, you can always request coefficients and standard errors with _b[varname] and _se[varname]. Those accessors exist exactly for what you are trying to do, which is to get quick access to the estimates.
    If you need other information from the output, you will need to refer to the individual values of a stored matrix or to stored scalars. Otherwise, I don't think Stata will understand what it is that you are trying to "save".
    HTH
    Fernando



    • #3
      Thanks Fernando but I am still unable to get this to work by calling individual matrix values.

      Changing my post command to:
      post buffer (r(b)) (r(V)) (r(V_srs))
      Runs without error. However, it just gives me a dataset full of missing values despite checking that matrix e(b) e(V) and e(V_srs) all exist with non-missing values as shown below

      . matrix list e(b)

      symmetric e(b)[1,1]
      y
      y1 1.9267049

      . matrix list e(V_srs)

      symmetric e(V_srs)[1,1]
      y
      y 1.4295201

      . matrix list e(V)

      symmetric e(V)[1,1]
      y
      y 1.4018838
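
A likely reason for the missing values: estimation commands such as svy: mean leave their results in e(), not r(), and post expects scalar expressions rather than whole matrices. The sketch below, using the auto dataset and a constant weight as stand-ins, copies e(b) and e(V) into named matrices and checks that the square root of the variance element reproduces _se[].

```stata
* Stand-in data and design (assumptions for illustration only)
sysuse auto, clear
generate wt = 1
quietly svyset [pweight=wt], strata(foreign)
quietly svy: mean price

* Estimation results are stored in e(); copy them to named matrices
matrix b = e(b)
matrix V = e(V)

* Single elements of a named matrix can be used in any scalar expression
display "mean     = " b[1,1]
display "variance = " V[1,1]
display "se       = " sqrt(V[1,1]) "  (compare _se[price] = " _se[price] ")"
```

The same subscripted elements, for example (b[1,1]) (V[1,1]), can then be passed to post.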



      • #4
        Question: what do you want to do with post? Are you running a bootstrap procedure or a simulation?
        Perhaps the following code can help you sort things out:

        Code:
         webuse nhanes2f
            
            program mystats, eclass
            qui:svy: mean zinc
            matrix b=e(b),e(V),e(V_srs)
            matrix coleq b=b V V_srs
            ereturn post b
            end
            
            mystats
            ereturn display
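
Once the row vector has been posted this way, each entry should be reachable as a coefficient through its equation name and column name. A small sketch of the same pattern, with auto.dta, a constant weight, and the program name mystats_demo as illustrative assumptions:

```stata
capture program drop mystats_demo
program define mystats_demo, eclass
    quietly svy: mean price
    * One row: estimate, variance, SRS variance (all columns are named "price")
    matrix b = e(b), e(V), e(V_srs)
    * Distinguish the three columns by equation name
    matrix coleq b = b V V_srs
    ereturn post b
end

* Stand-in data and design
sysuse auto, clear
generate wt = 1
quietly svyset [pweight=wt], strata(foreign)

mystats_demo
display "mean  = " _b[b:price]
display "V     = " _b[V:price]
display "V_srs = " _b[V_srs:price]
```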



        • #5
          Originally posted by FernandoRios
          Question: what do you want to do with post? Are you running a bootstrap procedure or a simulation?
          Perhaps the following code can help you sort things out:
          I am running a simulation. I want to post these matrix values from each of 10,000 draws: b = estimated population, V = variance, V_srs = SRS variance. I realize I should probably be using svy: total rather than svy: mean, but that doesn't fix my problem of grabbing the output I need to save with post.



          • #6
            My code should help with that, because you now just need to run
            simulate ,reps(1000):mystats
            Storing everything in a matrix and returning it as the coefficient vector is the easiest way to get this to work, or at least that is how I would do it.
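
An alternative to posting a coefficient vector, sketched here rather than taken from the code above, is an rclass program that returns scalars, with simulate given an explicit list of what to collect. The program name mystats_r, the auto dataset, the bootstrap resample standing in for the random draw, and the constant weight are all assumptions for illustration.

```stata
capture program drop mystats_r
program define mystats_r, rclass
    * Stand-in for the random draw: resample auto.dta within strata
    sysuse auto, clear
    bsample, strata(foreign)
    generate wt = 1
    quietly svyset [pweight=wt], strata(foreign)
    quietly svy: mean price

    * Copy the stored matrices so single elements can be returned as scalars
    tempname V Vsrs
    matrix `V' = e(V)
    matrix `Vsrs' = e(V_srs)
    return scalar mhat     = _b[price]
    return scalar vhat     = `V'[1,1]
    return scalar v_srshat = `Vsrs'[1,1]
end

* Each replication becomes one observation with three named variables
simulate mhat=r(mhat) vhat=r(vhat) v_srshat=r(v_srshat), reps(20) seed(12345): mystats_r
summarize
```

Naming the results in the simulate call keeps the variable names in the output dataset under your control.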



            • #7
              Thanks, I will give it a shot.



              • #8
                I am not really sure where in my do-file I can use the program definition and simulate command as I build the random draw from scratch as shown here:

                Code:
                   clear all
                  
                global startdir ="C:\Users\wxxxxxxxx"
                global simul 1000
                global strat0 3189
                global strat1 2427
                global cells 5616
                
                local pcts "2 3 5 10"
                
                foreach j in `pcts' {
                
                local sample0=trunc(0.01*`j'*$strat0)
                local sample1=trunc(0.01*`j'*$strat1)
                local sample=`sample0'+`sample1'
                
                
                forvalues i=1/$simul {
                /*** Generate random draw of `j' percent from strata=0 and strata=1 ***/
                         quietly drop _all
                         quietly set obs $cells
                         gen id = _n
                         gen strata=0
                         replace strata=1 if id>$strat0
                         replace id=_n-$strat0 if id>$strat0
                         gen double rnd = runiform()
                         sort strata rnd
                         by strata: gen NN=_n
                         drop if NN>`sample0' & strata==0
                         quietly drop if NN>`sample1' & strata==1
                         quietly drop rnd
                         quietly drop NN
                        
                /*** Merge random draw with real population dataset ***/
                          quietly merge 1:1 id strata using "$startdir\dataset.dta", keep(3)
                          quietly rename firms_total y
                
                /*** Define weights **/
                          quietly gen wt=($strat0/`sample0') if strata==0
                          quietly replace wt=($strat1/`sample1') if strata==1
                
                /*** Define FPC ***/
                          gen fpc =$strat0 if strata==0
                           replace fpc =$strat1 if strata==1
                          
                /*** Set SVY design ***/
                          quietly svyset [pweight=wt], strata(strata) fpc(fpc)
                          
                          quietly svy: total y
                
                
                }
                }
                Last edited by Josh Wimpey; 18 Mar 2020, 15:38.
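
As a quick check of the arithmetic in the loop above, the following sketch prints the per-stratum sample sizes and the implied weights N_h/n_h for each percentage. It uses locals instead of the globals purely for illustration; for the 2 percent draw it should give 63 and 48 sampled units.

```stata
* Stratum population sizes taken from the globals in the do-file
local strat0 3189
local strat1 2427

foreach j in 2 3 5 10 {
    * Same truncation rule as in the simulation loop
    local sample0 = trunc(0.01*`j'*`strat0')
    local sample1 = trunc(0.01*`j'*`strat1')
    * Sampling weight in each stratum is N_h / n_h
    display "`j' percent, stratum 0: n = `sample0', wt = " %7.3f `strat0'/`sample0'
    display "`j' percent, stratum 1: n = `sample1', wt = " %7.3f `strat1'/`sample1'
}
```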



                • #9
                  This would be my version:
                  Code:
                  program my_sim, eclass
                    
                   local sample0=trunc(0.01*`1'*$strat0)
                  local sample1=trunc(0.01*`1'*$strat1)
                  local sample=`sample0'+`sample1'  
                  /*** Generate random draw of `j' percent from strata=0 and strata=1 ***/      
                   quietly drop _all      
                     quietly set obs $cells  
                  gen id = _n      
                  gen strata=0    
                     replace strata=1 if id>$strat0      
                     replace id=_n-$strat0 if id>$strat0  
                     gen double rnd = runiform()  
                     sort strata rnd
                     by strata: gen NN=_n    
                     drop if NN>`sample0' & strata==0    
                     quietly drop if NN>`sample1' & strata==1      
                     quietly drop rnd        
                     quietly drop NN      
                     /*** Merge random draw with real population dataset ***/      
                     quietly merge 1:1 id strata using "$startdir\dataset.dta", keep(3)  
                     quietly rename firms_total y
                     /*** Define weights ***/
                     quietly gen wt=($strat0/`sample0') if strata==0
                     quietly replace wt=($strat1/`sample1') if strata==1
                     /*** Define FPC ***/
                     gen fpc =$strat0 if strata==0      
                     replace fpc =$strat1 if strata==1    
                         /*** Set SVY design ***/        
                    quietly svyset [pweight=wt], strata(strata) fpc(fpc)          
                    quietly svy: total y  
                  ** Here I collect all the info I want in a matrix "b" (same pattern as in #4)
                    matrix b = e(b), e(V), e(V_srs)
                    matrix coleq b = b V V_srs
                    ereturn post b
                  end
                  Then run the program, and see if you do collect ALL the info you need, typing

                  Code:
                   my_sim 2
                  my_sim 3
                  my_sim 5
                  * etc
                  If that works, then simply run your simulation
                  Code:
                  simulate, reps(10):my_sim 2
                  HTH
                  Last edited by FernandoRios; 18 Mar 2020, 20:07.



                  • #10
                    Dear Fernando,
                    Thank you for all the wonderful help. This has taught me a lot and I have a working program and do-file now.

                    One last question that I am trying to solve --- How can I loop over a bunch of different sample sizes using this program?

                    If I run the following code, I keep getting an error because Stata cannot resolve the value of `j' inside the program. Here, I want to loop through sample sizes of 2% and 5%.

                    Code:
                     . . .  
                    
                    program my_sim, eclass      
                    
                    local sample0=trunc(0.01*`j'*$strat0)  
                    local sample1=trunc(0.01*`j'*$strat1)
                    
                    . . .  
                    
                    matrix b=e(b),e(V),e(V_srs)    
                    matrix coleq b=b V V_srs    
                    ereturn post b
                    end  
                    
                    local pcts "2 5"  
                    foreach j in `pcts' {
                    simulate ,reps(10):mystats  
                    save "$startdir\SSRS_`j'_test.dta", replace
                    
                     }



                    • #11
                      Perhaps a solution could be:
                      Code:
                      use yourdata.dta
                      forvalues i=2/5 {
                        preserve
                        simulate, reps(10):my_sim `i'
                        save sim_`i', replace
                        restore
                      }
                      That way, you run all your simulations, you save the results, and restore the original data so you can start again.
                      This assumes that within my_sim, you are using `1' instead of `j' so that the first number sent (i=2,3,4,5) will be used internally to define your sample.



                      • #12
                        So, using `1' instead of `j' will work with forvalues i=2/5 ? I did not know you could use `1' as a universal local of some sort.

                        Anyway, I get the same error whether I use `1' or `j' and either of the local definition statements.

                        Part of the problem may be that my_sim program needs to generate the dataset that I need from scratch each time as it is a randomized draw. Further, the size of my dataset (sample) will change with the values of the percentages called in the loop I want to write. This means that a "use yourdata.dta" statement doesn't help.





                        • #13
                          I see. Ok then the "use data" is not needed.
                          Can you show what exactly is the error that you get?

                          Also, I do not know the exact jargon for this syntax, but this program may help show you how it works within Stata:

                          Code:
                          program myprog
                          display "`0'"
                          display "`1'"
                          display "`2'"
                          display "`3'"
                          end
                          
                          myprog this is a test
                          myprog hello world!
                          You can see that everything after the command will go to `0', and each word, in the order introduced, will go to locals 1 2 3, etc.
                          HTH
                          Fernando
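
Positional arguments can also be given readable names with the args command, which copies `1', `2', and so on into named locals. A minimal sketch, where the program name pctdemo and the returned scalar n0 are hypothetical:

```stata
capture program drop pctdemo
program define pctdemo, rclass
    * Give the first positional argument (`1') a readable name
    args pct
    * Same sample-size rule as in the simulation, for the stratum of 3189 units
    return scalar n0 = trunc(0.01*`pct'*3189)
end

pctdemo 2
display r(n0)    // 63, since trunc(63.78) = 63
```

Inside my_sim, the same idea would let the sample-size lines refer to `pct' rather than `1'.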



                          • #14
                            Sure, if I define the program as follows:

                            Code:
                               program mystats, eclass
                               
                               local sample0=trunc(0.01*`j'*$strat0)
                               local sample1=trunc(0.01*`j'*$strat1)
                               local sample=`sample0'+`sample1'  
                            
                            /*** Generate random draw of `j' percent from strata=0 and strata=1 ***/      
                               quietly drop _all      
                               quietly set obs $cells  
                               gen id = _n      
                               gen strata=0    
                               replace strata=1 if id>$strat0      
                               replace id=_n-$strat0 if id>$strat0  
                               gen double rnd = runiform()  
                               sort strata rnd
                               by strata: gen NN=_n    
                               drop if NN>`sample0' & strata==0    
                               quietly drop if NN>`sample1' & strata==1      
                               quietly drop rnd        
                               quietly drop NN    
                               
                            /*** Merge random draw with real population dataset ***/      
                               quietly merge 1:1 id strata using "$startdir\mydata.dta", keep(3) 
                               quietly rename firms_total y
                            /*** Define weights ***/
                               quietly gen wt=($strat0/`sample0') if strata==0
                               quietly replace wt=($strat1/`sample1') if strata==1
                            /*** Define FPC ***/
                               gen fpc =$strat0 if strata==0      
                               replace fpc =$strat1 if strata==1    
                                   
                            /*** Set SVY design ***/        
                              quietly svyset [pweight=wt], strata(strata) fpc(fpc)          
                              quietly svy: total y 
                              
                            ** Here I collect all the info I want in a matrix "b"    
                              matrix b=e(b),e(V),e(V_srs)  
                              matrix coleq b=b V V_srs  
                              ereturn post b
                            end

                            Then after the program is defined, I use this loop (or your suggested loop with `1' instead of `j' and forvalues 2/5....) to call the program:

                            Code:
                            local pcts "2 5"
                            
                            foreach j in `pcts' {
                            simulate ,reps(10):mystats
                            
                            save "$startdir\Simulation Output files\MCS_00125_URBAN_Stratified\SSRS_`j'_test.dta", replace
                            
                            }

                            I get the following error message:

                            Code:
                            .
                            .
                            .
                            27.   ereturn post b
                             28. end
                            
                            . 
                            . local pcts "2 5"
                            
                            . 
                            . foreach j in `pcts' {
                              2. simulate ,reps(10):mystats
                              3. 
                            . save "$startdir\Simulation Output files\MCS_00125_URBAN_Stratified\SSRS_`j'_test.dta", replace
                              4. 
                            . }
                            invalid varname
                            an error occurred when simulate executed mystats
                            r(198);
                            
                            end of do-file
                            
                            r(198);
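
One plausible cause, judging from the code rather than a confirmed diagnosis: local macros are private to the program or do-file that defines them, so the loop's `j' is empty inside mystats and the sample-size expressions are no longer valid. The sketch below illustrates the scoping with a hypothetical program showscope; only the value passed on the command line arrives, as `1'.

```stata
capture program drop showscope
program define showscope, rclass
    * `j' here is the program's own local, which was never defined,
    * so it expands to nothing; `1' holds the first argument passed in
    return local seen_j "`j'"
    return local seen_1 "`1'"
end

local j 5
showscope `j'
* Prints: inside the program: j = [], 1 = [5]
display "inside the program: j = [`r(seen_j)'], 1 = [`r(seen_1)']"
```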



                            • #15
                              Try this

                              Code:
                              program mystats, eclass
                              
                              local sample0=trunc(0.01*`1'*$strat0)
                              local sample1=trunc(0.01*`1'*$strat1)
                              local sample=`sample0'+`sample1'
                              
                              /*** Generate random draw of `1' percent from strata=0 and strata=1 ***/
                              quietly drop _all
                              quietly set obs $cells
                              gen id = _n
                              gen strata=0
                              replace strata=1 if id>$strat0
                              replace id=_n-$strat0 if id>$strat0
                              gen double rnd = runiform()
                              sort strata rnd
                              by strata: gen NN=_n
                              drop if NN>`sample0' & strata==0
                              quietly drop if NN>`sample1' & strata==1
                              quietly drop rnd
                              quietly drop NN
                              
                              /*** Merge random draw with real population dataset ***/
                              quietly merge 1:1 id strata using "$startdir\mydata.dta", keep(3)
                              quietly rename firms_total y
                              /*** Define weights ***/
                              quietly gen wt=($strat0/`sample0') if strata==0
                              quietly replace wt=($strat1/`sample1') if strata==1
                              /*** Define FPC ***/
                              gen fpc =$strat0 if strata==0
                              replace fpc =$strat1 if strata==1
                              
                              /*** Set SVY design ***/
                              quietly svyset [pweight=wt], strata(strata) fpc(fpc)
                              quietly svy: total y
                              
                              ** Here I collect all the info I want in a matrix "b"
                              matrix b=e(b),e(V),e(V_srs)
                              matrix coleq b=b V V_srs
                              ereturn post b
                              end
                              
                              local pcts "2 5"
                              foreach j in `pcts' {
                               simulate ,reps(10):mystats `j'
                              save "$startdir\Simulation Output files\MCS_00125_URBAN_Stratified\SSRS_`j'_test.dta", replace
                              }



