  • generate an identifier variable that changes with every 3 observations?

    Hi folks. I have a dataset of this form:

    Code:
    * Example generated by -dataex-. To install: ssc install dataex
    clear
    input float(id dist wc)
    1 60 3
    1 90 2
    1 10 3
    1 30 1
    1 90 3
    1 10 2
    1 10 4
    1 60 3
    1 90 1
    2 90 2
    2 50 2
    2 10 3
    2 60 1
    2 30 4
    2 30 1
    2 90 3
    2 10 2
    2 40 3
    end
    and I would like to make a variable "id2" that takes on the value 1 for observations 1-3, 2 for observations 4-6, etc. This is the format I would like:

    Code:
    * Example generated by -dataex-. To install: ssc install dataex
    clear
    input float(id id2 dist wc)
    1 1 60 3
    1 1 90 2
    1 1 10 3
    1 2 30 1
    1 2 90 3
    1 2 10 2
    1 3 10 4
    1 3 60 3
    1 3 90 1
    2 4 90 2
    2 4 50 2
    2 4 10 3
    2 5 60 1
    2 5 30 4
    2 5 30 1
    2 6 90 3
    2 6 10 2
    2 6 40 3
    end
    Any idea how to do this? It's probably an easy fix, but I can't figure it out. Thank you so much!

  • #2
    Code:
    generate id2 = 1 if _n==1
    replace  id2 = id2[_n-1] + (mod(_n,3)==1) if _n > 1
    order id id2
    list, sep(3)
    Output:
    Code:
    . list, sep(3)
    
         +----------------------+
         | id   id2   dist   wc |
         |----------------------|
      1. |  1     1     60    3 |
      2. |  1     1     90    2 |
      3. |  1     1     10    3 |
         |----------------------|
      4. |  1     2     30    1 |
      5. |  1     2     90    3 |
      6. |  1     2     10    2 |
         |----------------------|
      7. |  1     3     10    4 |
      8. |  1     3     60    3 |
      9. |  1     3     90    1 |
         |----------------------|
     10. |  2     4     90    2 |
     11. |  2     4     50    2 |
     12. |  2     4     10    3 |
         |----------------------|
     13. |  2     5     60    1 |
     14. |  2     5     30    4 |
     15. |  2     5     30    1 |
         |----------------------|
     16. |  2     6     90    3 |
     17. |  2     6     10    2 |
     18. |  2     6     40    3 |
         +----------------------+
    --
    Bruce Weaver
    Version: Stata/MP 18.5 (Windows)



    • #3
      Here's another trick with mod(): gen id2 = 1 + (_n - 1 - mod(_n-1,3))/3
      I bet there are others that Bruce and I haven't thought of.
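
      A small sketch of how mod() lines up with blocks of three, assuming the 18-observation layout from the question: mod(_n-1,3) is the position within a block, and removing it from _n-1 leaves a multiple of 3 that can be turned into a block number. The variable names pos and blk are only illustrative.

      ```stata
      * Sketch only: pos and blk are illustrative names, not from the thread
      clear
      set obs 18
      gen pos = mod(_n - 1, 3)          // position within block: 0, 1, 2, 0, 1, 2, ...
      gen blk = 1 + (_n - 1 - pos)/3    // block number: 1, 1, 1, 2, 2, 2, ...
      assert blk == ceil(_n/3)          // agrees with the ceil() form
      list, sep(3)
      ```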



      • #4
        Thanks so much, Mike and Bruce!



        • #5
          I believe this command will do the trick as well:

          Code:
          sort id, stable
          egen id2 = seq(), b(3)
          Last edited by Marcos Almeida; 13 Mar 2020, 09:02.
          Best regards,

          Marcos



          • #6
            Like Mike, I was pretty sure there was a better (more elegant) way. Marcos, I had to tweak your suggestion a bit to get it to work:

            Code:
            egen id2 = seq(), b(3)

            --
            Bruce Weaver
            Version: Stata/MP 18.5 (Windows)



            • #7
              I've nothing against seq() -- but note also

              Code:
              gen wanted = ceil(_n/3) 
              When seq() was introduced as an egen function in about 1999, I think ceil() was not yet in Stata, and that history is echoed in the help for egen, which (still) doesn't explain that ceil() can be useful too.
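
              For anyone wanting to compare the suggestions, here is a minimal sketch that checks they agree on 18 observations. It assumes the data are already in the desired order; the names id2_ceil, id2_seq and id2_run are made up for the comparison.

              ```stata
              * Sketch only: variable names are illustrative
              clear
              set obs 18
              gen id2_ceil = ceil(_n/3)                 // #7
              egen id2_seq = seq(), block(3)            // #5 and #6
              gen id2_run = 1 if _n == 1                // #2: running count
              replace id2_run = id2_run[_n-1] + (mod(_n,3) == 1) if _n > 1
              assert id2_ceil == id2_seq
              assert id2_ceil == id2_run
              ```

              All three count blocks over the whole dataset, so they match the requested id2 only because each id has a multiple of 3 observations.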
