
  • Sampling without replacement

    First question: is there a way to sample without replacement in Stata? I would like a random identifier (identifier2) to be assigned within another identifier (identifier1) without identifier2 repeating within identifier1. Second question: can I draw from a distribution, such as a normal with mean 10 and standard deviation 3, but bound the draws between 4 and 18?

    I am currently handling the first question by hand, redrawing duplicates repeatedly.

    Code:
    clear all
    set obs 100
    generate identifier1 =  runiformint(1,10)
    bys identifier1: generate identifier2 =  runiformint(1,20)
    bys identifier1 (identifier2): replace identifier2 =  runiformint(1,20) if identifier2 == identifier2[_n-1]
    bys identifier1 (identifier2): replace identifier2 =  runiformint(1,20) if identifier2 == identifier2[_n-1]
    bys identifier1 (identifier2): replace identifier2 =  runiformint(1,20) if identifier2 == identifier2[_n-1]
    bys identifier1 (identifier2): replace identifier2 =  runiformint(1,20) if identifier2 == identifier2[_n-1]
    bys identifier1 (identifier2): replace identifier2 =  runiformint(1,20) if identifier2 == identifier2[_n-1]
    bys identifier1 (identifier2): replace identifier2 =  runiformint(1,20) if identifier2 == identifier2[_n-1]
    bys identifier1 (identifier2): replace identifier2 =  runiformint(1,20) if identifier2 == identifier2[_n-1]
    bys identifier1 (identifier2): replace identifier2 =  runiformint(1,20) if identifier2 == identifier2[_n-1]
    bys identifier1 (identifier2): replace identifier2 =  runiformint(1,20) if identifier2 == identifier2[_n-1]
    bys identifier1 (identifier2): replace identifier2 =  runiformint(1,20) if identifier2 == identifier2[_n-1]

  • #2
    For drawing random samples without replacement, see -[D] sample -- Draw random sample-.

    And no, you cannot have a normally distributed random variable bounded between 4 and 18. The range of a normal random variable is minus infinity to plus infinity.
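
    If an approximation is acceptable, the usual workaround is a truncated normal, which is no longer a normal distribution in the strict sense. The following is a minimal sketch using inverse-CDF sampling. The variable name x, the seed, and the locals are illustrative assumptions, and because the truncation points are not symmetric around 10, the resulting mean and standard deviation will not be exactly 10 and 3.

    ```stata
    clear all
    set seed 12345              // arbitrary seed, only for reproducibility
    set obs 1000

    * Parameters of the untruncated normal and the truncation bounds
    local mu = 10
    local sd = 3
    local lo = 4
    local hi = 18

    * CDF values of the standardized bounds
    local Flo = normal((`lo' - `mu') / `sd')
    local Fhi = normal((`hi' - `mu') / `sd')

    * Draw a uniform on (Flo, Fhi) and map it back through the inverse normal CDF
    generate double x = `mu' + `sd' * invnormal(`Flo' + runiform() * (`Fhi' - `Flo'))

    summarize x
    ```

    Every draw lands between 4 and 18 by construction, so no redrawing is needed.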



    • #3
      I had a slightly different interpretation of what you wanted than Joro. I thought that you didn't want to select particular observations, as -sample- will do, but instead that you wanted to assign nonrepeating id2 values at random within groups sharing the same value of id1. If I'm on target (?), you could do this:
      Code:
      gen rnum = runiform()
      bysort id1 (rnum): gen id2 = _n
      drop rnum
      This will, of course, create sequential identifiers, but with 1, ..., _N (the number of observations within each id1 group) assigned to observations at random.

      If you don't want sequentially numbered id2 values, and you don't want the same id2 value to appear for different id1 values, and you don't care about the range of id2, you could do this:
      Code:
      gen rnum = runiform()
      sort rnum
      gen id2 = _n  // every observation will get a distinct id2
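
      If the range of id2 does matter, as in the original post where identifier2 is drawn from 1 to 20, one possible sketch is to build all 20 candidate values per id1, shuffle them, and keep as many as each group needs. The names seq, n_needed, and main, and the seed, are assumptions made for illustration. The merge will stop with an error if any id1 group has more than 20 observations, since no non-repeating assignment exists in that case.

      ```stata
      clear all
      set seed 12345                        // arbitrary seed, only for reproducibility
      set obs 100
      generate id1 = runiformint(1,10)

      * Number the observations within each id1 so candidates can be matched to them
      bysort id1: generate seq = _n
      tempfile main
      save `main'

      * One row per id1 with its group size, then 20 candidate id2 values per group
      contract id1, freq(n_needed)
      expand 20
      bysort id1: generate id2 = _n

      * Shuffle the candidates within id1 and keep the first n_needed of them
      generate rnum = runiform()
      bysort id1 (rnum): generate seq = _n
      keep if seq <= n_needed
      keep id1 seq id2

      * Attach the sampled id2 values to the original observations
      merge 1:1 id1 seq using `main', assert(match) nogenerate
      drop seq

      isid id1 id2                          // confirms id2 never repeats within id1
      ```

      Because the candidates are shuffled before being matched, each observation receives a value from 1 to 20 that no other observation in its id1 group shares.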

