Creating 1000 replications of regression with shuffled variable, and saving coefficients.

Yaya Mensah

Join Date: Jun 2022

Posts: 3
#1


23 Jun 2022, 10:47

Dear all,

Please forgive the rookie question.

I want to perform a placebo test to show that my IV results are not the result of a spurious correlation. To that end, I want to perform 1000 replications of my IV specification, shuffling the endogenous variable (within each year) each time. I want to save the coefficients and p-values on the endogenous variable so that I can depict them on a density plot.

I know that by using the -shufflevar- command with cluster(year) I can shuffle the endogenous variable within year and then run the regression. Doing it once would then be:

shufflevar endo, cluster(year)
xtivreg2 y (endo_shuffled = instrument) controls, fe

What I don't know is how to run this 1000 times, saving the coefficient and p-value on endo_shuffled each time, so that I can produce the density plot. My knowledge of Stata is not yet at the level where I can program loops.

Therefore, I would appreciate any help.

Thank you.

Andrew Musau

Join Date: Oct 2014
Posts: 9838

23 Jun 2022, 11:29

shufflevar and xtivreg2 are from SSC (FAQ Advice #12). The following uses the official xtivreg and is not tested.

Code:

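frame create results
frame results{
    * One row per replication.
    set obs 1000
    gen coef=.
    gen pval=.
}
forval i=1/1000{
    * Drop any earlier shuffled copy so it can be recreated.
    capture drop endo_shuffled
    shufflevar endo, cluster(year)
    xtivreg y (endo_shuffled = instrument) controls, fe
    * Store the coefficient and p-value on the instrumented variable.
    frame results{
        replace coef= `=r(table)["b", "`e(instd)'"]' in `i'
        replace pval= `=r(table)["pvalue", "`e(instd)'"]' in `i'
    }
}
frame change results
browse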

Last edited by Andrew Musau; 23 Jun 2022, 11:32.
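
Editorial sketch (not part of the original post, and not the poster's code): the thread does not establish why the two -replace- lines fail for some readers. One way to avoid string subscripts on r(table) altogether is to copy it to a matrix and look up positions with rownumb() and colnumb(). The example below is an assumption-laden stand-in that runs on the auto data with official commands only: price plays the role of y, mpg the endogenous variable, foreign the shuffling group (in place of year), and the within-group shuffle is done by hand instead of with shufflevar. It needs Stata 16 or later for frames.

```stata
* Stand-ins: price = outcome, mpg = endogenous variable,
* foreign = shuffling group (plays the role of year).
sysuse auto, clear
set seed 12345
local reps 200                          // use 1000 for the real exercise

gen long id = _n                        // fixed order within each group

capture frame drop results
frame create results
frame results {
    set obs `reps'
    gen double coef = .
    gen double pval = .
}

forvalues i = 1/`reps' {
    * Shuffle mpg within foreign: draw a random within-group rank r,
    * restore the fixed order, then take the r-th value in the group.
    capture drop u r mpg_shuffled
    gen double u = runiform()
    bysort foreign (u): gen long r = _n
    sort foreign id
    by foreign: gen mpg_shuffled = mpg[r]

    quietly ivregress 2sls price (mpg_shuffled = headroom turn weight)

    * Copy r(table) and look up row and column positions by name.
    matrix T = r(table)
    scalar b_i = T[rownumb(T, "b"), colnumb(T, "mpg_shuffled")]
    scalar p_i = T[rownumb(T, "pvalue"), colnumb(T, "mpg_shuffled")]

    frame results {
        quietly replace coef = b_i in `i'
        quietly replace pval = p_i in `i'
    }
}

frame change results
kdensity coef
```

For the panel case, the same pattern should carry over by swapping the hand-written shuffle for -shufflevar endo, cluster(year)- and the -ivregress- line for the -xtivreg- call above; that substitution is untested here.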


Yaya Mensah

Join Date: Jun 2022

Posts: 3
#3

24 Jun 2022, 03:44

Thank you, Andrew. Unfortunately, it doesn't run for me; the problem seems to be in these two lines:

replace coef= `=r(table)["b", "`e(instd)'"]' in `i'
replace pval= `=r(table)["pvalue", "`e(instd)'"]' in `i'
Andrew Musau

Join Date: Oct 2014

Posts: 9838
#4

24 Jun 2022, 06:24

What version of Stata do you have?
Yaya Mensah

Join Date: Jun 2022

Posts: 3
#5

24 Jun 2022, 09:57

Version 16.1.

But don't worry, Andrew: I managed to write a solution myself using a loop and then the -simulate- command.

Let me know if you would like me to post it.
Andrew Musau

Join Date: Oct 2014

Posts: 9838
#6

25 Jun 2022, 12:41

Originally posted by Yaya Mensah:

Let me know if you would like me to post it?

Yes, it should be useful for anyone following this thread.

Joro Kolev

Join Date: Aug 2018
Posts: 3047

26 Jun 2022, 12:19

-permute- should do the trick here as well. E.g.,

Code:

. sysuse auto, clear
(1978 automobile data)

. permute mpg _b, saving(bTEMP, replace): ivregress 2sls price (mpg = headroom turn weight)
(running ivregress on estimation sample)
(file bTEMP.dta not found)

Permutations (100): ..........10..........20..........30..........40..........50..........60......
> ....70..........80..........90..........100 done

Monte Carlo permutation results                    Number of observations =  74
Permutation variable: mpg                          Number of permutations = 100

      Command: ivregress 2sls price (mpg = headroom turn weight)

-------------------------------------------------------------------------------
             |                                               Monte Carlo error
             |                                              -------------------
           T |    T(obs)       Test       c       n      p  SE(p)   [95% CI(p)]
-------------+-----------------------------------------------------------------
         mpg | -323.0714      lower      45     100  .4500  .0497  .3503  .5527
             |                upper      55     100  .5500  .0497  .4473  .6497
             |            two-sided                  .9000  .0300  .8412  .9588
             |
       _cons |   13045.8      lower      55     100  .5500  .0497  .4473  .6497
             |                upper      45     100  .4500  .0497  .3503  .5527
             |            two-sided                  .9000  .0300  .8412  .9588
-------------------------------------------------------------------------------
Notes: For lower one-sided test, c = #{T <= T(obs)} and p = p_lower = c/n.
       For upper one-sided test, c = #{T >= T(obs)} and p = p_upper = c/n.
       For two-sided test, p = 2*min(p_lower, p_upper); SE and CI approximate.

. use bTEMP
(permute mpg : ivregress)

. hist _b_mpg
(bin=10, start=-5132.626, width=890.80593)

. summ _b_mpg

    Variable |        Obs        Mean    Std. dev.       Min        Max
-------------+---------------------------------------------------------
      _b_mpg |        100   -90.67279    1254.967  -5132.626   3775.433

.
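
Editorial sketch (not part of the original post): to get closer to the original request, the same -permute- call can be asked for 1000 replications, restricted to shuffling within groups via strata(), and told to save a p-value alongside the coefficient. The names b_mpg and p_mpg are chosen here for illustration, foreign stands in for year, and the p-value formula assumes the normal-based test that -ivregress 2sls- reports by default.

```stata
sysuse auto, clear

* b_mpg and p_mpg are names chosen here for the saved statistics.
permute mpg b_mpg=(_b[mpg]) ///
            p_mpg=(2*normal(-abs(_b[mpg]/_se[mpg]))), ///
    reps(1000) strata(foreign) seed(12345) nodots ///
    saving(bTEMP, replace): ///
    ivregress 2sls price (mpg = headroom turn weight)

* One row per permutation; plot the placebo distributions.
use bTEMP, clear
kdensity b_mpg
kdensity p_mpg
```

For the panel specification in #1, the analogous call would presumably be -permute endo ..., reps(1000) strata(year): xtivreg y (endo = instrument) controls, fe-, with the p-value expression adjusted to whatever test that command reports. The table that -permute- prints also compares the observed coefficient with the permutation distribution, which is itself a form of the placebo test.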
