  • unique random integer

    Dear experts,

    I have a numeric identifier that is missing in observations with no positive laboratory results. I encoded the variable; its largest value is 3999, and the total sample size is just over 10000.

    I need to use that variable to reshape my data from long to wide, so I need to assign random values to the missing entries.

    The problem is that when I use the following line, I get many duplicates:

    Code:
    replace identifier= floor((14000-4000+1)*runiform() + 4000) if identifier==.
    I am struggling to create unique random integers with this command.

    Any help is greatly appreciated

    Omar
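
    A note on why the duplicates appear: each runiform() call is drawn independently, so nothing prevents two observations from landing on the same integer. With several thousand draws from only 10001 possible values (4000 to 14000), collisions are all but certain. A minimal sketch on simulated data, where the 6000 observations and the seed are assumptions chosen for illustration:

    ```stata
    clear
    set seed 20141215                 // arbitrary seed, for reproducibility only
    set obs 6000                      // assumed number of missing identifiers
    gen long id = floor((14000-4000+1)*runiform() + 4000)
    duplicates report id              // tabulates how many values occur more than once
    ```

    The report should show a large share of observations sharing a value with at least one other observation.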

  • #2
    Why does it have to be random? An identifier could be something like

    Code:
     
    gsort -identifier 
    clonevar identifier2 = identifier 
    replace identifier2 = 10000 + _n  if missing(identifier2)
    Now your identifiers go up to 3999 (original) or up from 10001 (new) and are distinct.





    • #3
      It does not have to be random at all, you are right. Thank you, Nick.

      But just for the sake of learning, how is it possible to generate random integers using runiform()? I looked it up and I can't figure it out.



      • #4
        If you want random integers, then something like

        Code:
        gen ifoo = ceil(100 * runiform())
        might do, for any suitable value of 100.

        For distinct random integers, I would do something like

        Code:
         
        gen cfoo = runiform()
        sort cfoo 
        gen ifoo = _n
        For reproducibility, set seed beforehand. Apply constants as required. Watch out for storage types in larger datasets.
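
        Applied to the original question, the shuffle-then-number idea might look like the sketch below. The toy dataset, the seed, the starting value 4000, and the helper variables ismiss and newid are assumptions made for illustration; only the observations with a missing identifier are numbered, in random order.

        ```stata
        clear
        set seed 20141215                      // arbitrary seed, for reproducibility
        set obs 10                             // toy data standing in for the real file
        gen long identifier = _n if _n <= 4    // observations 5-10 have missing identifiers

        gen double cfoo = runiform()
        gen byte ismiss = missing(identifier)
        sort ismiss cfoo                       // nonmissing first, then missing in random order
        by ismiss: gen long newid = 3999 + _n if ismiss   // 4000, 4001, ... within the missing block
        replace identifier = newid if ismiss
        drop cfoo ismiss newid
        isid identifier                        // errors out if any identifier is duplicated
        ```

        The isid check suits the toy data only; in a long-format file the original identifiers repeat across rows, so that line would need to be dropped or restricted to the new values.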



        • #5
          Originally posted by Omar Okasha
          It does not have to be random at all, you are right. Thank you, Nick.

          But just for the sake of learning, how is it possible to generate random integers using runiform()? I looked it up and I can't figure it out.
          It is possible: use "int" beforehand instead of "floor" afterward:

          Code:
          gen int x=(14000-4000+1)*runiform()+4000
          However, I find it weird that if your code is:

          Code:
          gen x=int(14000-4000+1)*runiform()+4000
          You don't get integers !!!
          Last edited by Roman Mostazir; 15 Dec 2014, 16:33.
          Roman



          • #6
            never mind, nothing to see here, move on...



            • #7
              Roman's post #5:

              However, I find it weird that if your code is:
              Code:
              gen x=int(14000-4000+1)*runiform()+4000
              You don't get integers !!!
              Not weird! Here int() is applied only to (14000-4000+1), so you multiply an integer by a non-integer (runiform()) and add an integer. Result: non-integer. To get an integer, wrap the whole product in int() by adding parentheses:
              Code:
              gen x=int((14000-4000+1)*runiform())+4000
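
              A quick way to convince yourself is to generate the variable and assert the two properties you expect. This is only a sketch; the 1000 observations and the seed are arbitrary choices:

              ```stata
              clear
              set seed 12345                         // arbitrary seed
              set obs 1000
              gen x = int((14000-4000+1)*runiform()) + 4000
              assert x == int(x)                     // every value is a whole number
              assert inrange(x, 4000, 14000)         // and lies in the intended range
              ```

              For non-negative arguments int() and floor() give the same result, so either function works here.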





              • #8
                Originally posted by Nick Cox
                If you want random integers, then something like

                Code:
                gen ifoo = ceil(100 * runiform())
                might do, for any suitable value of 100.

                For distinct random integers, I would do something like

                Code:
                gen cfoo = runiform()
                sort cfoo
                gen ifoo = _n
                For reproducibility, set seed beforehand. Apply constants as required. Watch out for storage types in larger datasets.
                Extremely interesting subject. With respect to the data, I would like to get identifiers that are:
                1. distinct
                2. random
                3. non-sequential
                4. Ideally, with some flexibility to mix letters and numbers
                Code:
                clear
                set obs 100000
                gen cfoo = runiform()
                sort cfoo
                gen ifoo = _n
                * distinct is a user-written command (Stata Journal) and must be installed first
                distinct cfoo
                With respect to practical applicability, I may be interested in obtaining numerous massive teaching data sets with fake postcodes or millions of random National Insurance numbers.
                Kind regards,
                Konrad
                Version: Stata/IC 13.1
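
                One possible way to meet all four wishes, offered only as a sketch: shuffle a pool of candidate numbers that is larger than the number of IDs needed, keep the first block, and prefix random letters. The six-digit pool, the two-letter prefix, and the variable names id, u, and code are assumptions for illustration, not a real postcode or National Insurance number format.

                ```stata
                clear
                set seed 20141215                  // arbitrary seed
                set obs 900000
                gen long id = 99999 + _n           // candidate pool: 100000 to 999999
                gen double u = runiform()
                sort u                             // random order of the pool
                keep in 1/100000                   // 100000 distinct, non-sequential numbers
                * prefix two random capital letters (ASCII 65-90) to the six digits
                gen str8 code = char(65 + floor(26*runiform())) + ///
                    char(65 + floor(26*runiform())) + string(id, "%6.0f")
                isid code                          // errors out if any code is duplicated
                ```

                Because the numeric part is already distinct, the letter prefix cannot introduce duplicates, and ties in u would affect only the ordering, not distinctness.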



                • #9
                  Thanks for the clarification Svend (#7), I was weird !!
                  Roman



                  • #10
                    Originally posted by Roman Mostazir

                    It is possible: use "int" beforehand instead of "floor" afterward:

                    Code:
                    gen int x=(14000-4000+1)*runiform()+4000
                    Roman, this is dangerous and not advisable. As your data grows, you will eventually hit a situation similar to:
                    Code:
                    . set obs 1
                    obs was 0, now 1
                    
                    . gen int x=40000.3
                    
                    . l
                    
                         +---+
                         | x |
                         |---|
                      1. | . |
                         +---+
                    Such errors are hard to find. This happens because int here is a storage type, which holds values only up to 32,740, so anything larger is stored as missing. It is not a function like in the next example:
                    Code:
                    . di int(40000.3)
                    40000

                    Nick has given very good recommendations in post #4 above. Stick to them.
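
                    To make the storage-type point concrete, here is a small sketch contrasting int with long (the variable names a and b are arbitrary). Stata's long type holds integers up to 2,147,483,620, so it is the safer choice when identifiers may grow:

                    ```stata
                    clear
                    set obs 1
                    gen int  a = 40000       // beyond int's upper limit of 32,740: stored as missing
                    gen long b = 40000       // long has room for this value
                    assert missing(a)
                    assert b == 40000
                    ```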

                    Best, Sergiy Radyakin



                    • #11
                      Many thanks Sergiy, I learn every day.
                      Roman



                      • #12
                        Apologies for not replying earlier; somehow notifications did not work and I thought I had got no replies.

                        Many thanks to all.

                        Nick's recommendation worked like a charm, but I also learned a great deal from the comments by Roman, Konrad and Sergiy.
                        Last edited by Omar Okasha; 20 Dec 2014, 14:56.
