  • Replacing true variable values with simulated data across variables of different types

    Hello,

    I have the following data:

    Code:
    * Example generated by -dataex-. For more info, type help dataex
    clear
    input int var2 str10 var3 int var4 byte(var5 var6 var7 var8) float(var9 var10)
    435 "Subtracted" 122 78 44  85 82 5.1 .3
    435 "Subtracted" 124 80 44  78 85 5.6 .3
    401 "Subtracted" 120 78 42  73 76 5.5 .1
    401 "Subtracted" 122 77 45  71 82 5.6 .2
    500 "Subtracted" 106 54 52  92 57 5.4 .3
    500 "Subtracted" 102 57 45  97 56 5.1 .4
    522 "Subtracted" 126 79 47  85 54 6.3 .9
    522 "Subtracted" 124 74 50  79 51 6.5 .2
    486 "Subtracted" 109 67 42 100 56 4.9 .4
    486 "Subtracted" 108 62 46  97 51   5 .1
    423 "Subtracted" 122 67 55  86 52 4.9 .2
    423 "Subtracted" 109 64 45  88 55 4.8 .2
    381 "Subtracted" 114 63 51  85 66 4.4 .2
    381 "Subtracted" 103 62 41  86 64 4.4 .4
    411 "Subtracted" 110 68 42  74 65 5.5 .1
    411 "Subtracted" 110 67 43  76 67 5.4 .2
    437 "Subtracted" 113 72 41  86 49 5.1 .1
    437 "Subtracted" 112 64 48  85 49 5.1 .1
    463 "Subtracted" 103 67 36  87 64 5.4 .4
    463 "Subtracted" 103 67 36  85 64 5.4 .2
    end
    I would like to replace these variable values with simulated data that remains consistent with the range of values of the true data. I'd also like to do this in an automated fashion.


    I tried the following:

    Code:
    ds, has(type float int byte) // Select numeric variables
    foreach var of varlist `r(varlist)' {
        // Get the minimum and maximum of the variable
        quietly summarize `var'
        local min = r(min)
        local max = r(max)
    
        local type = `: type `var''
    
        if "`type'" == "float" {
            gen scrambled_`var' = runiform(`min', `max')
        }
        else if "`type'" == "int" | "`type'" == "byte" {
            gen scrambled_`var' = round(runiform(`min', `max'))
        }
    }
    but am getting the error "byte not found".


    Can anyone help me troubleshoot?

    Many thanks,
    Meghan

  • #2
    Remove the = from the -local type = `: type `var''- command and it will run.
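
    For reference, here is a sketch of the loop from #1 with only that line changed (plus -inlist()- as a shorthand for the two string comparisons, which is my own substitution). It assumes Stata 14 or later, since the two-argument form of -runiform()- is used, and that no scrambled_* variables exist yet.

    ```stata
    ds, has(type float int byte)          // select float, int, and byte variables
    foreach var of varlist `r(varlist)' {
        // Range of the observed values
        quietly summarize `var'
        local min = r(min)
        local max = r(max)

        // No "=" here: copy the storage type into the macro as text
        local type : type `var'

        if "`type'" == "float" {
            generate scrambled_`var' = runiform(`min', `max')
        }
        else if inlist("`type'", "int", "byte") {
            // Round so integer variables stay whole numbers
            generate scrambled_`var' = round(runiform(`min', `max'))
        }
    }
    ```

    If the simulated values need to be reproducible, -set seed- can be issued before the loop.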

    -`: type `var''- evaluates to int on the first iteration of the loop because var2 is an int variable, so your command expands to -local type = int-. Remember that in a local macro definition, if you use the = operator, the expression on the right-hand side gets evaluated. So Stata tries to evaluate int, which means it looks for a variable named int and, of course, it can't find one. What you want your command to expand to is -local type int-, which tells Stata to store the three-character string int (not the value of an expression based on a variable named int). Removing the = sign accomplishes that.
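
    The difference between the two forms of -local- can be seen in a small, self-contained sketch. The one-observation dataset and the names x, t1, t2, s, and c are made up for illustration. Note that -clear- drops whatever data are in memory, so try this in a fresh session.

    ```stata
    clear
    set obs 1
    generate int x = 1              // toy variable stored as int

    local t1 : type x               // no "=": stores the text int
    display "`t1'"

    local s = 2 + 3                 // with "=": the expression is evaluated first
    local c 2 + 3                   // without "=": the text 2 + 3 is copied as is
    display "`s'"
    display "`c'"

    // With "=", Stata tries to evaluate int as an expression and stops with an error
    capture noisily local t2 = `: type x'
    ```

    The last line is wrapped in -capture noisily- only so that the error message is shown without halting a do-file.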



    • #3
      Look at shuffle_var.



      • #4
        Oh perfect, thank you Clyde!
