  • Randomly shuffling only one variable in Stata

    Hello,

    We are trying to run a placebo test for our model and to do that we wish to randomly shuffle our Y variable, keeping the remaining dataset unchanged (such that the association between Y and the X's becomes random). We found the "shufflevar" command particularly useful to do this.

    However, we wish to repeat this analysis 100 times, and we observe that in every iteration "shufflevar" shuffles the variable in exactly the same way. As a result, the 100 iterations are exact replicas of one another. We would prefer a fresh random shuffle each time, so that the shuffled Y's differ from one another. Is there another way to randomly mix up the values of the Y variable?

    We would be extremely grateful if anyone could guide us on this! Thanks.

  • #2
    Perhaps you have made an error in your coding, but without seeing the code you ran, it's difficult to guess what the error might be.

    Perhaps the following example, which shows that shufflevar does what it says it does, will lead you to discover your problem.
    Code:
    . clear all
    
    . set obs 10
    Number of observations (_N) was 0, now 10.
    
    . generate x = _n
    
    . forvalues i = 1/3 {
      2.         shufflevar x
      3.         rename x_shuffled x_`i'
      4. }
    
    . list, clean
    
            x   x_1   x_2   x_3  
      1.    1     2     2     6  
      2.    2     4     9     3  
      3.    3     8     7     8  
      4.    4     5     1     7  
      5.    5    10    10    10  
      6.    6     7     5     2  
      7.    7     3     8     9  
      8.    8     1     6     4  
      9.    9     6     3     5  
     10.   10     9     4     1  
    
    .
    Last edited by William Lisowski; 22 Aug 2021, 12:40.



    • #3
      You can adapt this to your particular data; here I demonstrate the approach with a toy data set.

      Code:
      clear*
      * Example generated by -dataex-. For more info, type help dataex
      clear
      input float Y
      -.4868504
      -1.4760038
      -.010911492
      -.013760252
      .5155788
      -.13912173
      1.2873173
      -.58777297
      -.030600455
      -2.1968436
      -.9155009
      -.2230017
      .4945496
      -.26231077
      .3661239
      -.962465
      .25329232
      2.2842488
      .3223809
      -.40590465
      -.6211073
      .13218145
      2.0378487
      1.1648929
      1.0495144
      .3293923
      .7287014
      .8152171
      .8828248
      -.021391753
      .9599126
      .6054105
      -.6521159
      .7883045
      -2.079254
      .4353854
      .8039326
      1.9533952
      -1.2994204
      -.12833437
      .4324252
      -2.3337057
      -.7645378
      1.0160784
      .18720382
      -.3145866
      .4677534
      .8098378
      -.5765787
      -1.3777484
      .04223201
      1.1632676
      .014696945
      .000711149
      -.6886311
      .6953909
      -.0475542
      .46003765
      .6942165
      -.6036441
      1.4292794
      -.77772
      .3661323
      -.1939826
      1.6394366
      -.3533281
      -.149388
      -.7269634
      -.4719959
      .58884954
      .23385265
      -.6359429
      .35979015
      1.0419201
      -1.0310671
      .22454466
      -3.217909
      -.13631526
      .3730735
      -.4330619
      1.418919
      .05279221
      -.14697625
      .8652378
      -.7876329
      -1.6520803
      -.029363904
      -2.2438629
      -.2996832
      -.4705126
      .0023159687
      .6715717
      -1.609707
      -.6880803
      -.54471886
      -.7130303
      1.506732
      .9514928
      .190155
      1.2029504
      end

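      capture program drop shuffle_them
      program define shuffle_them
          // Re-number the rows of shuffle_frame in a fresh random order
          frame shuffle_frame {
              gen double shuffle = runiform()
              sort shuffle
              replace seq = _n
              drop shuffle
          }
          // Re-match on seq, then copy the shuffled Y back into the current frame
          frlink rebuild shuffle_frame, frame(shuffle_frame)
          replace Y = frval(shuffle_frame, Y)
          exit
      end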

      // SET UP FOR SHUFFLING
      gen long seq = _n
      frame put seq Y, into(shuffle_frame)
      frlink 1:1 seq, frame(shuffle_frame)

      set seed 1234 // DO NOT DO THIS INSIDE A LOOP; DO IT ONLY ONCE AT THE START
      forvalues i = 1/10 { // SHUFFLE THE DATA 10 TIMES AND SHOW FIRST 5 OBS EACH TIME
          list Y in 1/5
          shuffle_them
      }

      Note: As this uses frames, it requires version 16 or later.
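
      To illustrate the comment about -set seed- in the code above: one possible cause of identical shuffles (only a guess, since the code from #1 was not shown) is resetting the seed inside the loop. The sketch below uses hypothetical variable names (u_same1, u_diff1, and so on) and compares random sort keys drawn with the seed reset on every pass against keys drawn after setting the seed once.

      ```stata
      clear
      set obs 10
      gen long id = _n

      * Seed reset inside the loop: every pass restarts the same random-number stream
      forvalues i = 1/2 {
          set seed 1234
          gen double u_same`i' = runiform()
      }

      * Seed set once before the loop: each pass continues the stream
      set seed 1234
      forvalues i = 1/2 {
          gen double u_diff`i' = runiform()
      }

      list id u_same1 u_same2 u_diff1 u_diff2, clean
      ```

      Sorting on u_same1 or on u_same2 gives the same ordering, so any shuffle built on those keys repeats itself, whereas u_diff1 and u_diff2 give different orderings.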

      In the future, when asking for help with code, show example data, using the -dataex- command (as I have done here). If you are running version 17, 16 or a fully updated version 15.1 or 14.2, -dataex- is already part of your official Stata installation. If not, run -ssc install dataex- to get it. Either way, run -help dataex- to read the simple instructions for using it. -dataex- will save you time; it is easier and quicker than typing out tables. It includes complete information about aspects of the data that are often critical to answering your question but cannot be seen from tabular displays or screenshots. It also makes it possible for those who want to help you to create a faithful representation of your example to try out their code, which in turn makes it more likely that their answer will actually work in your data.

      When asking for help with code, always show example data. When showing example data, always use -dataex-.

      Added: Crossed with #2.



      • #4
        Thanks to both of you for the prompt response. I will keep it in mind to post the code/data henceforth.



        • #5
          This Stata Tip also explains how you can shuffle (also called permute) a variable manually:
          Ängquist, Lars. "Stata tip 92: Manual implementation of permutations and bootstraps." The Stata Journal 10, no. 4 (2010): 686-688.
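
          A minimal sketch of one manual approach (not necessarily the one used in the tip) follows. It works without frames, so it should also run on versions before 16. The toy data and the names orig, u, pick, and Y_1 to Y_3 are assumptions made for illustration.

          ```stata
          clear
          set seed 2021                  // set once, before any shuffling
          set obs 10
          gen double Y = rnormal()       // toy outcome (hypothetical data)
          gen long orig = _n             // remember the original row order

          forvalues i = 1/3 {
              gen double u = runiform()  // random sort key
              sort u
              gen long pick = _n         // each row gets a random position in 1.._N
              sort orig                  // restore the original row order
              gen double Y_`i' = Y[pick] // take Y from the randomly chosen row
              drop u pick
          }

          list, clean
          ```

          Because pick is a permutation of 1 to _N, each Y_`i' holds exactly the same values as Y in a different order, while all other variables stay where they were.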
