  • random number for observations

    Hello,

    From my panel dataset I want to draw a random sample of ids.
    To do that, I wanted to generate one random number per id, but some ids are observed in 2 waves and others in 3, 4, 5, 6, 7 or 8 waves.
    The runiform() function, used as
    gen random = runiform(0, 1)
    gives every observation of an id a different random number, but I want ids that appear several times in the panel to have the same random number each time.

    Here is an example of my data:

    Code:
    * Example generated by -dataex-. To install: ssc install dataex
    clear
    input long cid float random int wave

    111203 .3488717 2
    111203 .2668857 3
    111203 .1366463 4
    111203 .028556867 6
    111203 .8689333 8
    111203 .3508549 9
    907201 .07110509 8
    907201 .32336795 9
    907201 .5551032 10
    2767201 .875991 3
    2767201 .20470947 4
    3491201 .8927587 8
    3491201 .5844658 9
    3491201 .3697791 11
    4835201 .8506309 2
    4835201 .3913819 4
    4835201 .11966132 5
    4858201 .7542434 4
    4858201 .6950234 5
    4858201 .6866152 6
    4858201 .9319346 7
    4858201 .4548882 8
    4858201 .0674011 9
    4858201 .3379889 11
    6151201 .9748848 6
    6151201 .7264384 7
    6151201 .04541512 8
    6151201 .7459667 9
    6151201 .4961259 10
    6519201 .7167162 4
    6519201 .859742 5
    6519201 .13407555 6
    6519201 .48844185 7
    6519201 .8712187 8
    7631201 .7664683 8
    7631201 .25125554 9
    7631201 .16636477 11
    8948201 .7437958 2
    8948201 .9805113 3
    8948201 .7295772 5
    8948201 .9011049 9
    9657201 .26436493 5
    9657201 .8856509 8
    9657202 .882112 10
    9657203 .748933 10
    10250201 .9196262 8
    10250201 .6934533 10
    10250201 .2154026 11
    10957202 .8285888 2
    10957202 .04421536 4
    10957202 .8630378 5
    11295201 .3526046 4
    11295201 .7720399 5
    11295201 .5861199 6
    11295201 .3227766 7
    11295201 .17293066 9
    11295201 .8053644 10
    11295201 .3060019 11
    11295202 .21909967 9
    11295202 .724731 10
    11295202 .6964867 11
    12490201 .9119344 3
    12490201 .6795634 4
    12490201 .3549416 5
    12490201 .73897 6
    12490201 .18740167 7
    12490201 .3146128 8
    12490201 .1375693 9
    12490202 .6537739 6
    12490202 .27013195 7
    12490202 .8998394 8
    12490202 .5734232 9
    12490202 .11147037 10
    12490203 .4145227 9
    13345202 .003052204 4
    14898201 .6659978 4
    14898201 .3462876 6
    14902201 .0780235 5
    14902201 .12758136 6
    14902201 .2297006 7
    14902201 .3295547 8
    14902201 .4144089 9
    14902201 .036084738 10
    14902201 .08438109 11
    16671201 .009876247 2
    16671201 .3200437 3
    16671201 .005196966 4
    16829201 .22754347 2
    16829201 .851468 3
    16829201 .9820066 4
    16829201 .032479186 6
    16829202 .9874847 6
    16829202 .894106 7
    16829202 .9684734 9
    16829202 .23922028 10
    17018203 .6927336 2
    17018203 .4884359 3
    17018203 .4376452 4
    17018204 .5858005 6
    17018204 .3787092 7
    end

  • #2
    Code:
    bysort cid (wave) : gen newrandom = runiform() if _n == 1 
    by cid: replace newrandom = newrandom[1]
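
The two lines above draw one number for the first observation of each id and then copy it to the id's other waves. A minimal sketch of how that could be made reproducible and then used to flag a random subset of ids is given here. The seed value, the variable names u and sampled, and the 0.5 cutoff are illustrative assumptions, not part of the original question.

```stata
* Small excerpt of the example data (cid and wave only)
clear
input long cid int wave
111203 2
111203 3
111203 4
907201 8
907201 9
2767201 3
2767201 4
end

* ASSUMPTION: any fixed seed will do; it only makes the draw repeatable
set seed 12345

* Draw once per id (first wave), then spread the value to all waves of that id
bysort cid (wave): gen double u = runiform() if _n == 1
by cid: replace u = u[1]

* ASSUMPTION: flag roughly half of the ids; every wave of an id gets the same flag
gen byte sampled = u < 0.5
```

Because u is constant within cid, any rule based on u selects or drops all waves of an id together.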


    • #3
      Thank you Nick, that worked perfectly.
      Now I'm confronted with a new challenge.
      I would like to select the ids that have the most observations.
      The variable n_children holds the number of children. I would like to include households with one child as well as households with more than one child, but from those with more than one child I want to pick the child who has the most observations.

      Code:
      * Example generated by -dataex-. To install: ssc install dataex
      clear
      input long cid float(random n_children) int wave
        111203   .9780947 1  2
        111203   .9780947 1  3
        111203   .9780947 1  4
        111203   .9780947 1  6
        111203   .9780947 1  8
        111203   .9780947 1  9
        907201   .6717034 2  8
        907201   .6717034 2  9
        907201   .6717034 2 10
       2767201   .7768312 1  3
       2767201   .7768312 1  4
       3491201   .5281965 2  8
       3491201   .5281965 2  9
       3491201   .5281965 2 11
       4835201   .9876741 1  2
       4835201   .9876741 1  4
       4835201   .9876741 1  5
       4858201   .9498287 1  4
       4858201   .9498287 1  5
       4858201   .9498287 1  6
       4858201   .9498287 1  7
       4858201   .9498287 1  8
       4858201   .9498287 1  9
       4858201   .9498287 1 11
       6151201  .29660156 1  6
       6151201  .29660156 1  7
       6151201  .29660156 1  8
       6151201  .29660156 1  9
       6151201  .29660156 1 10
       6519201   .8077623 1  4
       6519201   .8077623 1  5
       6519201   .8077623 1  6
       6519201   .8077623 1  7
       6519201   .8077623 1  8
       7631201   .6967398 1  8
       7631201   .6967398 1  9
       7631201   .6967398 1 11
       8948201    .124575 1  2
       8948201    .124575 1  3
       8948201    .124575 1  5
       8948201    .124575 1  9
       9657201 .009352685 3  5
       9657201 .009352685 3  8
       9657202   .7423609 3 10
       9657203   .6246868 3 10
      10250201   .3707404 2  8
      10250201   .3707404 2 10
      10250201   .3707404 2 11
      10957202   .9758291 1  2
      10957202   .9758291 1  4
      10957202   .9758291 1  5
      11295201   .6274327 2  4
      11295201   .6274327 2  5
      11295201   .6274327 2  6
      11295201   .6274327 2  7
      11295201   .6274327 2  9
      11295201   .6274327 2 10
      11295201   .6274327 2 11
      11295202     .68028 2  9
      11295202     .68028 2 10
      11295202     .68028 2 11
      12490201   .9618542 3  3
      12490201   .9618542 3  4
      12490201   .9618542 3  5
      12490201   .9618542 3  6
      12490201   .9618542 3  7
      12490201   .9618542 3  8
      12490201   .9618542 3  9
      12490202   .3463633 3  6
      12490202   .3463633 3  7
      12490202   .3463633 3  8
      12490202   .3463633 3  9
      12490202   .3463633 3 10
      12490203  .26410145 3  9
      13345202   .7256252 1  4
      14898201   .3859604 3  4
      14898201   .3859604 3  6
      14902201   .8343201 1  5
      14902201   .8343201 1  6
      14902201   .8343201 1  7
      14902201   .8343201 1  8
      14902201   .8343201 1  9
      14902201   .8343201 1 10
      14902201   .8343201 1 11
      16671201   .9593934 1  2
      16671201   .9593934 1  3
      16671201   .9593934 1  4
      16829201   .7894724 2  2
      16829201   .7894724 2  3
      16829201   .7894724 2  4
      16829201   .7894724 2  6
      16829202   .9936081 2  6
      16829202   .9936081 2  7
      16829202   .9936081 2  9
      16829202   .9936081 2 10
      17018203  .11311834 3  2
      17018203  .11311834 3  3
      17018203  .11311834 3  4
      17018204   .6573325 3  6
      17018204   .6573325 3  7
      end
      label values wave WAVE_prt2
      label def WAVE_prt2 2 "2 2009/10", modify
      label def WAVE_prt2 3 "3 2010/11", modify
      label def WAVE_prt2 4 "4 2011/12", modify
      label def WAVE_prt2 5 "5 2012/13", modify
      label def WAVE_prt2 6 "6 2013/14", modify
      label def WAVE_prt2 7 "7 2014/15", modify
      label def WAVE_prt2 8 "8 2015/16", modify
      label def WAVE_prt2 9 "9 2016/17", modify
      label def WAVE_prt2 10 "10 2017/18", modify
      label def WAVE_prt2 11 "11 2018/19", modify


      • #4
        Ignoring households without children would seem straightforward, but you've lost me on your other criterion or criteria. What if children tie on number of observations, such as there being one each?


        • #5
          There are no households without children in my sample. For my model I need every child who has no siblings (n_children = 1), but also children with siblings. In households with several children, however, I want to select only one of them, ideally the child with the most observations.
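
One possible way to do this is sketched here, not a confirmed solution from the thread. It assumes that siblings share a household identifier obtained by dropping the last digit of cid; the thread does not document this, so hid is hypothetical and should be replaced by the real household variable if one exists. Ties on the number of observations are broken by the per-child random number from post #3, and then by cid.

```stata
* Small excerpt of the example data from post #3
clear
input long cid float random byte n_children int wave
 9657201 .009352685 3  5
 9657201 .009352685 3  8
 9657202   .7423609 3 10
 9657203   .6246868 3 10
11295201   .6274327 2  4
11295201   .6274327 2  5
11295201   .6274327 2  6
11295202     .68028 2  9
11295202     .68028 2 10
13345202   .7256252 1  4
end

* ASSUMPTION (hypothetical): household id = cid without its last digit
gen long hid = floor(cid/10)

* Number of waves observed for each child
bysort cid: gen nobs = _N

* Within household, sort so the child with the most observations comes last;
* random (constant within cid) and then cid break ties
bysort hid (nobs random cid): gen byte chosen = cid == cid[_N]

* Keep all waves of the chosen child; only children keep themselves
keep if chosen
```

In this excerpt the children kept are 9657201, 11295201 and 13345202, six observations in total. Children who tie on the number of observations are separated by random, so which of them is kept depends on the earlier draw.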
