Adding Noise to Simulations

Jared Greathouse

Join Date: Sep 2021

Posts: 2170
#1


17 Apr 2022, 07:32

One thing my instructors have told me is that I should use Monte-Carlo simulations to validate new statistical estimators.

One such synthetic dataset in a paper I found (for free on arXiv, see page 23 if you'd like) was generated according to the following:

Code:

clear
set obs 100
egen id = seq(), f(1) t(100) // 100 units
expand 250 // 250 time periods
bys id : g time = _n // time 1-250
set seed 1000

// The synthetic data
**!!
qbys id: g y = ln(time)+4*sin(time/_pi)+4*cos(time/_pi)+runiform()
//**!! above, runiform() should be an additive noise term epsilon_t
xtset id time, g

The issue is that after the second term [4*cos(time/_pi)], there's an error term epsilon indexed to time. I currently used runiform(), but methinks this isn't the same thing as adding in noise. Specifically, the paper says that epsilon_t

is an i.i.d Gaussian noise with a mean of zero and variances of 1, 4, 9, 16, and 25

Well........ how would I generate Gaussian noise? Or any other kind of noise? How would I specify its variance?

Presumably there's a simple solution, I've just never made a simulation before. Any ideas how I'd do this?
Andrew Musau

Join Date: Oct 2014

Posts: 10190
#2

17 Apr 2022, 07:58

Code:

help rnormal()
Leonardo Guizzetti

Join Date: Jul 2016

Posts: 2402
#3

17 Apr 2022, 08:09

I’m not sure I can give you specific advice for your problem. It does appear that the description of epsilon doesn’t match the code. It seems like you want to add

Code:

rnormal(0, s) // is parametrized by s as the standard deviation

In general, Monte-Carlo simulations are based on using a specified data generating model to create fake data, and the specific model is usually informed by the implied models used for analysis. If we consider a simple linear regression, then this model specifically calls for additive effects with an explicit error term (epsilon) following a normal distribution with mean zero and user-selected variance. In real terms you add -rnormal()- to your linear predictor term. On the other extreme is when there is no explicit error term, such as with logistic regression, where the “noise” is probabilistic realizations of the outcome. In hierarchical models, there will be distributional (noise) assumptions at each level.
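As a rough sketch of what this might look like for the data generating process in #1: the paper's description gives variances of 1, 4, 9, 16, and 25, and rnormal() takes a standard deviation, so each variance is passed through sqrt(). The variable names y_var1 to y_var25 are made up for illustration, and gen id = _n stands in for the egen call in #1.

```stata
clear
set seed 1000
set obs 100
gen id = _n                    // 100 units
expand 250                     // 250 time periods per unit
bysort id: gen time = _n       // time 1-250

// One outcome per noise variance; rnormal(m, s) takes the SD, so use sqrt(variance)
foreach v of numlist 1 4 9 16 25 {
    gen double y_var`v' = ln(time) + 4*sin(time/_pi) + 4*cos(time/_pi) ///
        + rnormal(0, sqrt(`v'))
}

xtset id time
summarize y_var*
```

In a full Monte-Carlo study you would presumably wrap this in a program and repeat it many times (for example with -simulate-) rather than draw one dataset.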

Edit: crossed with #2 as I was typing this out. Edited typos.

Last edited by Leonardo Guizzetti; 17 Apr 2022, 08:35.
Jared Greathouse

Join Date: Sep 2021

Posts: 2170
#4

17 Apr 2022, 08:16

My estimator is similar to the one in the paper I linked, so the assumptions are pretty much the same. Thank you both so much!
Weiwen Ng

Join Date: Jun 2015

Posts: 1241
#5

17 Apr 2022, 12:17

And just for your info, runiform() would have generated random numbers from a uniform distribution on the open interval (0, 1). That has a mean of 0.5 and an SD of sqrt(1/12), about 0.29, which is less than the SD of a standard normal distribution. So it was actually adding noise, but it was adding less noise than you probably wanted to model, and because its mean is not zero it was also adding in bias. If you were simulating binary data, you could do something like

Code:

gen y = runiform() > cutoff

Where cutoff is whatever the cutoff probability is.
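A minimal sketch of the binary case, assuming you want a success probability of 0.3 (the value and the local macro name p are made up for illustration). Note that with the "greater than" form above, the probability that y equals 1 is 1 minus the cutoff; writing the comparison as "less than" makes the probability of a 1 equal to p directly.

```stata
clear
set seed 1000
set obs 1000

local p 0.3                        // hypothetical probability that y == 1
gen byte y = runiform() < `p'      // 1 with probability p, 0 otherwise

summarize y                        // the mean should be close to p
```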

Be aware that it can be very hard to answer a question without sample data. You can use the dataex command for this. Type help dataex at the command line.

When presenting code or results, please use the code delimiters to format them. Use the # button on the formatting toolbar, between the " (double quote) and <> buttons.
Jared Greathouse

Join Date: Sep 2021

Posts: 2170
#6

17 Apr 2022, 16:14

Thank you. I imagine I could do all kinds of data generating processes with different error terms..... such as, an AR(1) error term?

And, presuming that an estimator predicts the treated unit's outcomes well, even in cases of reasonable levels of noise or different kinds of error terms, that's a better argument for the validity of the estimator, right? Weiwen Ng

Last edited by Jared Greathouse; 17 Apr 2022, 16:17.
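On the AR(1) question in #6, one possible sketch is below. It assumes the same 100-unit, 250-period panel as #1, an autocorrelation of 0.5, and an innovation SD of 1; those two values and the names rho, sigma, and e are illustrative assumptions, not taken from the paper. The first period is drawn from the stationary distribution, and later periods are built recursively within each unit.

```stata
clear
set seed 1000
set obs 100
gen id = _n
expand 250
bysort id: gen time = _n
xtset id time

local rho   0.5                    // assumed AR(1) coefficient
local sigma 1                      // assumed SD of the innovations

// Start each unit from the stationary distribution: SD = sigma/sqrt(1 - rho^2)
gen double e = rnormal(0, `sigma'/sqrt(1 - `rho'^2)) if time == 1

// replace works through observations in order, so e[_n-1] is already filled in
bysort id (time): replace e = `rho'*e[_n-1] + rnormal(0, `sigma') if time > 1

gen double y = ln(time) + 4*sin(time/_pi) + 4*cos(time/_pi) + e
```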