  • Saving datasets generated by every replication in MC simulation

    Hi,

    I have an MC simulation that randomly generates a set of variables at each replication for use in estimation. After estimation, I use the output to generate more variables. I know how to save scalars generated by the simulation, but I'd also like to save a dataset of all the values of the variables generated at each replication. So a simulation with 1000 observations and 1000 replications should ideally give me a dataset with 1,000,000 observations (1000 observations per replication). I looked for a counter that represents the replication number, but as far as I can tell it doesn't exist (at least not in the ado-file for simulate). Any help on this would be much appreciated!

    Thanks!

    -Jessica

  • #2
    I don't know if you've read this blog...

    http://blog.stata.com/2015/10/06/mon...s-using-stata/

    and comments?

    So maybe you need to add the replication number to each random dataset, save the dataset from each MC step (maybe as f_1 to f_n), and then append them in a loop to get a final dataset that stacks every MC dataset?
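
    A minimal sketch of that idea, using a plain forvalues loop rather than -simulate-. The variable names (x1, x2, rep), the f_ file stub, the seed and the sizes are all made up for illustration:

    Code:
    clear all
    set seed 12345                // arbitrary seed, for reproducibility only

    local reps 10                 // illustrative number of replications
    local nobs 1000               // illustrative observations per replication

    * generate one dataset per replication, stamp it with its number, save it
    forvalues r = 1/`reps' {
        clear
        set obs `nobs'
        gen x1 = rnormal()
        gen x2 = runiform()
        gen rep = `r'
        save f_`r', replace
    }

    * stack the saved files into one long dataset
    use f_1, clear
    forvalues r = 2/`reps' {
        append using f_`r'
    }
    count                         // should report reps * nobs observations

    With many replications it may be tidier to use tempfiles instead of f_1 to f_n, so nothing is left on disk afterwards.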
    Last edited by Dave Airey; 05 Apr 2017, 14:27.

    • #3
      I looked it over and it seems -post- builds a dataset by adding one row of observations at a time. I'd rather save a matrix or column vectors from each replication. I figured it out and am posting the code below for others to use in the future.

      Code:
      clear all
      set more off
      discard
      
      capture program drop temp
      program define temp, eclass
      
      syntax [, nobs(integer 1000)]
      
      preserve
      tempvar x1 x2 x3
      set obs `nobs'
      
      gen `x1' = rnormal()
      gen `x2' = runiform()
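      *rchi2(2) draws chi-squared(2) variates; chi2(2,4) is the CDF and returns the same constant in every row
      gen `x3' = rchi2(2)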
      
      set matsize 3100
      mkmat `x1', matrix(A) rowprefix(x1_)
      matrix B = A'
      mkmat `x2', matrix(C) rowprefix(x2_)
      matrix D = C'
      mkmat `x3', matrix(F) rowprefix(x3_)
      matrix G = F'
      
      matrix E = B,D
      matrix H = E,G
      ereturn post H
      restore
      end
      
      
      
      simulate _b, reps(10) saving(temp.dta, replace): temp, nobs(1000)
      *the data in memory now have 10 observations (one per replication) and 3000 variables
      *variables 1-1000: each row holds the x1 values generated in one of the 10 replications
      *variables 1001-2000: each row holds the x2 values generated in one of the 10 replications
      *variables 2001-3000: each row holds the x3 values generated in one of the 10 replications
      
      xpose, clear
      *the data are now 3000 observations by 10 variables (one column per replication)
      
      *create an indicator for each variable
      gen str variable = ""
      replace variable = "x1" in 1/1000
      replace variable = "x2" in 1001/2000
      replace variable = "x3" in 2001/3000
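      
      *possible next step (a sketch, not part of the original post):
      *stack the data so that each row is one observation from one replication.
      *assumes xpose named the 10 replication columns v1-v10 and that nobs is 1000;
      *obs is a helper variable introduced here for illustration.
      gen obs = mod(_n - 1, 1000) + 1
      reshape long v, i(variable obs) j(rep)
      reshape wide v, i(rep obs) j(variable) string
      rename (vx1 vx2 vx3) (x1 x2 x3)
      *result: 10,000 rows (10 replications x 1000 observations) with rep, obs, x1, x2, x3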
      Last edited by Jessica Lum; 06 Apr 2017, 15:59.

      • #4
        I came across this thread when I was looking for a way to save the datasets generated by particular replications in a Monte Carlo analysis (run using the simulate command). I thought I'd write up what I did, in case it is useful to others, though in the end it just boils down to using a global counter. Before running the simulate command, define:
        Code:
        global datasetnum = 1
        Inside the command run by simulate, write
        Code:
        save dataset$datasetnum, replace
        global datasetnum = $datasetnum + 1
        The saved datasets could then be appended to suit the aim of the original poster (or the code could easily be tweaked to do the appending straight away as each replication is run).

        In my case, I'm only interested in a small number of replications where the command I've written seems to encounter a problem. To avoid saving all 1000 or 10000 or so datasets, if I only want to see what happens, say, in the sixth replication, I can instead write:
        Code:
        if $datasetnum == 6 {
             save dataset$datasetnum, replace
        }
        global datasetnum = $datasetnum + 1
        This seems to be an easy way to save the datasets associated with specific replications that you want to take a closer look at.
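
        To show where the two snippets sit, here is a minimal self-contained sketch. The program name (mcrep), the variables (x, rep) and the sizes are invented for illustration. It also assumes the counter lines up with the replication number, which depends on how many times simulate calls the program, so it is worth checking on your own setup.

        Code:
        clear all
        set seed 12345                       // arbitrary seed

        capture program drop mcrep
        program define mcrep, rclass
            drop _all
            set obs 1000
            gen x = rnormal()
            gen rep = $datasetnum            // stamp each row with the counter
            save dataset$datasetnum, replace
            global datasetnum = $datasetnum + 1
            summarize x, meanonly
            return scalar mean = r(mean)
        end

        global datasetnum = 1                // initialise before calling simulate
        simulate mean = r(mean), reps(10): mcrep

        Because each saved file carries its own rep variable, the files can later be appended without losing track of which replication a row came from.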

        Edit: I realise saving the dataset for each replication is easy to do if you run MCs by setting up the loop over the replications yourself, in combination with the postfile command. My aim was to show how the same can easily be done with simulate (in case you prefer using simulate over postfile).
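
        For comparison, a rough sketch of the do-it-yourself loop with postfile. The file names (mc_results, dataset1 and so on), the variable x and the sizes are illustrative only:

        Code:
        clear all
        set seed 12345                       // arbitrary seed

        tempname results
        postfile `results' rep mean using mc_results, replace

        forvalues r = 1/10 {
            clear
            set obs 1000
            gen x = rnormal()
            save dataset`r', replace         // the loop index is the replication number
            summarize x, meanonly
            post `results' (`r') (r(mean))   // one row of scalar results per replication
        }

        postclose `results'
        use mc_results, clear                // 10 rows: rep and mean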
        Last edited by Nicolas Van de Sijpe; 21 Jan 2019, 12:51.
