  • Random assignment of values by group

    Dear Stata users,

    I have a cross-sectional dataset in which I test the effect of a county-level predictor "xvar" on a city-level outcome "yvar". Each county has several cities. The predictor takes the same value for all cities in a given county, whereas the outcome varies by city (please see the example below). I would like to generate another variable, say xvar2, in which the existing xvar values are randomly reassigned across counties, so that, for example, all cities in Appling county end up with xvar2=.32336795, those in Baldwin with xvar2=.3913819, etc., with the assignment made at random. The goal is a placebo test: I hope the effect will not hold under random assignment. The fact that cities are nested within counties complicates the assignment for me. Ideally, I would like to run several regressions with different random assignments and tabulate the results with a neat line of code.

    Thank you in advance for any help.

    Code:
    * Example generated by -dataex-. To install: ssc install dataex
    clear
    input str8 county long id str22 city float(xvar yvar)
    "Appling"  1 "Baxley"         .3488717 .77
    "Appling"  1 "Graham"         .3488717 .59
    "Appling"  1 "Pine Grove"     .3488717 .32
    "Appling"  1 "Surrency"       .3488717 .17
    "Atkinson" 2 "Axson"          .8689333 .81
    "Atkinson" 2 "Pearson"        .8689333 .31
    "Atkinson" 2 "Willacoochee"   .8689333 .22
    "Bacon"    3 "Alma"          .32336795 .72
    "Bacon"    3 "Guysie"        .32336795  .7
    "Bacon"    3 "Rockingham"    .32336795 .91
    "Baker"    4 "Elmodel"       .32336795 .68
    "Baker"    4 "Newton"        .32336795 .35
    "Baldwin"  5 "Hardwick"       .5844658 .74
    "Baldwin"  5 "Milledgeville"  .5844658 .19
    "Banks"    6 "Baldwin"        .3913819 .31
    "Banks"    6 "Homer"          .3913819 .14
    "Banks"    6 "Maysville"      .3913819 .65
    end
    label values id id
    label def id 1 "Appling", modify
    label def id 2 "Atkinson", modify
    label def id 3 "Bacon", modify
    label def id 4 "Baker", modify
    label def id 5 "Baldwin", modify
    label def id 6 "Banks", modify


  • #2
    Here's a way to do this that takes advantage of the user-written -shufflevar- (available from SSC) to do the random assignment of values, which saves writing DIY code for that chore:

    Code:
    preserve
    // Make a county-level data set and shuffle xvar
    bysort county: keep if _n ==1
    keep county xvar
    shufflevar xvar // creates xvar_shuffled by default
    list // check it out
    drop xvar
    tempfile temp
    save `temp'
    restore
    //
    // I'm presuming county names don't duplicate across states.
    merge m:1 county using `temp'
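
    The original question also asked about running several regressions with different random assignments. One possible way to do that is sketched below. It is only a sketch under assumptions: it uses base Stata rather than -shufflevar-, the seed, the number of draws (100), the variable names u, draw, and xvar2, and the bare -regress yvar xvar2- model are all placeholders of mine, and it expects data like the example (county, xvar, yvar, with no existing xvar2) to be in memory.

    ```stata
    * Sketch: repeat the county-level shuffle and collect placebo coefficients.
    * Assumes variables county, xvar, and yvar are in memory, as in the example.
    set seed 12345                         // arbitrary seed, for reproducibility
    local reps 100                         // number of placebo draws (placeholder)

    tempname sim
    tempfile results shuffled
    postfile `sim' rep b se using `results'

    forvalues r = 1/`reps' {
        preserve
        // Reduce to one row per county
        bysort county: keep if _n == 1
        keep county xvar
        // Draw a random permutation of the row positions 1.._N
        generate double u = runiform()
        sort u
        generate long draw = _n
        // Back in county order, give each county the xvar found at row "draw"
        sort county
        generate double xvar2 = xvar[draw]
        keep county xvar2
        quietly save `shuffled', replace
        restore
        // Spread the shuffled county values back to the cities
        quietly merge m:1 county using `shuffled', nogenerate
        // Placeholder model: substitute your own specification here
        quietly regress yvar xvar2
        post `sim' (`r') (_b[xvar2]) (_se[xvar2])
        drop xvar2
    }
    postclose `sim'

    // Inspect the placebo coefficients without disturbing the data in memory
    preserve
    use `results', clear
    summarize b se
    restore
    ```

    Each pass leaves the city-level data unchanged, so the collected coefficients in b can be compared with the estimate from the real xvar.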



    • #3
      Thank you so much for your advice, Mike! The code did what I needed!
