  • random numbers to groups of random size

    Hi Statalisters! I have the following problem.

    I have N individuals and would like to assign n random numbers to them. Since n<N, what I need to do (I guess) is generate n random groups of random size. I did the following:

    generate rannum = uniform()
    egen randomnumber = cut(rannum), group(n)

    This works, but it creates groups of equal size, which is not what I need.

    I also tried this:

    gen randomnumber = runiformint(1,n)

    This also works, but it does not assign every number between 1 and n (it assigns almost all of them, for some reason I don't understand), so again it is not what I need.

    Just to be clear, suppose N=10 and n=4. I'd like to have something like:

    individuals randomnumber
    1 2
    2 3
    3 3
    4 1
    5 4
    6 1
    7 3
    8 4
    9 2
    10 1

    Where randomnumber is randomly assigned to individuals.

    Thanks for any help!
    Last edited by mary nick; 05 Dec 2021, 06:10.

  • #2
    I can't follow your prescription as a sharp rule (meaning, one that leads directly to code). How big is n?



    • #3
      Perhaps this example starts you in a useful direction. Each observation will have a uniform probability distribution across {1, 2, ... n}. The probability distribution for the number of observations in each group is a little more complicated.
      Code:
      local N 10  // observations
      local n 4   // groups
      
      // generate example data
      set obs `N'
      generate id = 100+_n // 101, 102, ...
      
      // sort randomly
      set seed 6666
      generate double tosort = runiform()
      sort tosort
      drop tosort
      
      // initialize assignment
      generate int group = .
      // first n observations (now in random order) get groups 1 to n, so no group is empty
      replace group = _n in 1/`n'
      // remaining values randomly chosen
      replace group = runiformint(1,`n') if missing(group)
      
      // put the data back in order and see what we have
      sort id
      list, clean noobs
      tab group
      Code:
      . list, clean noobs
      
           id   group  
          101       3  
          102       3  
          103       2  
          104       1  
          105       2  
          106       4  
          107       2  
          108       3  
          109       2  
          110       1  
      
      . tab group
      
            group |      Freq.     Percent        Cum.
      ------------+-----------------------------------
                1 |          2       20.00       20.00
                2 |          4       40.00       60.00
                3 |          3       30.00       90.00
                4 |          1       10.00      100.00
      ------------+-----------------------------------
            Total |         10      100.00
      Last edited by William Lisowski; 05 Dec 2021, 09:50.
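
      The group-size distribution mentioned above can be made concrete under one reading of this code: every group receives exactly one of the first n randomly sorted observations, and each of the remaining N-n observations lands in a given group with probability 1/n. Under that assumption, the size of any one group is 1 plus a Binomial(N-n, 1/n) count, with mean N/n. The sizes are not independent across groups, since they must sum to N. A minimal sketch that lists these probabilities for the N=10, n=4 example, using Stata's binomialp() function (the local names extra and size are illustrative only):
      Code:
      local N 10  // observations
      local n 4   // groups
      local extra = `N' - `n'  // observations assigned by runiformint()
      
      // P(group size = 1 + k), where k of the extra observations join the group
      forvalues k = 0/`extra' {
          local size = `k' + 1
          display %4.0f `size' "   " %6.4f binomialp(`extra', `k', 1/`n')
      }
      
      // expected group size, N/n
      display `N'/`n'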



      • #4
        Originally posted by Nick Cox
        I can't follow your prescription as a sharp rule (meaning, one that leads directly to code). How big is n?
        n is about 500, N about 2000



        • #5
          Here is my example with these values of n and N.
          Code:
          local N 2000  // observations
          local n 500   // groups
          
          // generate example data
          set obs `N'
          generate id = 100+_n // 101, 102, ...
          
          // sort randomly
          set seed 6666
          generate double tosort = runiform()
          sort tosort
          drop tosort
          
          // initialize assignment
          generate int group = .
          // first n observations (now in random order) get groups 1 to n, so no group is empty
          replace group = _n in 1/`n'
          // remaining values randomly chosen
          replace group = runiformint(1,`n') if missing(group)
          
          // put the data back in order and see what we have
          sort id
          egen groupsize = count(id), by(group)
          egen totab = tag(group)
          tab groupsize if totab
          Code:
          . tab groupsize if totab
          
            groupsize |      Freq.     Percent        Cum.
          ------------+-----------------------------------
                    1 |         31        6.20        6.20
                    2 |         82       16.40       22.60
                    3 |        102       20.40       43.00
                    4 |         96       19.20       62.20
                    5 |         91       18.20       80.40
                    6 |         57       11.40       91.80
                    7 |         19        3.80       95.60
                    8 |         15        3.00       98.60
                    9 |          5        1.00       99.60
                   10 |          2        0.40      100.00
          ------------+-----------------------------------
                Total |        500      100.00
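
          For comparison, the gap reported in the first post, where runiformint(1,n) alone did not use every number, is what independent draws should produce: a given value is missed by all N draws with probability (1 - 1/n)^N, so with n = 500 and N = 2000 roughly 9 values are expected to go unused. The sketch below is one way to check this, not part of the approach above. It assumes the same seed only for reproducibility, and the variable names plain and firstobs are illustrative.
          Code:
          clear
          local N 2000  // observations
          local n 500   // groups
          set obs `N'
          set seed 6666
          
          // plain independent draws, as tried in the first post
          generate int plain = runiformint(1, `n')
          
          // count how many of the n values were actually drawn
          egen byte firstobs = tag(plain)
          quietly count if firstobs
          display "values used: " r(N) " of `n'"
          
          // expected number of unused values under independent uniform draws
          display "expected unused: " %5.2f `n' * (1 - 1/`n')^`N'
          Assigning 1 to n to the first n randomly sorted observations, as in the code above, avoids this by construction, at the cost of a minimum group size of 1.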
