  • Grouping every 21 observations

    Hi,
    I have a dataset with 3000 observations and would like to group the data as follows:

    1- generate a new variable (e.g., X);
    2- replace X with 1 for the first 21 observations (row1 to row21)
    3- replace X with 2 for the second group of observations which includes row22 to row42
    4- replace X with 3 for the third group of observations which includes row43 to row63
    5- repeat step 2 (X = 1) for rows 64 to 84
    6- repeat step 3 (X = 2) for rows 85 to 105
    .
    .
    .
    Thanks,
    NM

  • #2
    Nader,

    What is the issue: do you not know the code for this, or are you getting an error when you run it? I suggest you share an excerpt of your data (as per the FAQ) so that we can give more detail.

    The basic code to implement this would be along the lines of:
    Code:
    gen X = .
    replace X = 1 if _n <= 21
    replace X = 2 if inrange(_n, 22, 42)
    replace X = 3 if inrange(_n, 43, 63)
    A word of warning: it isn't good practice to generate a variable based on row position, as the row order can easily change (e.g. in your raw data or through a misplaced sort).
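    A minimal sketch of one way to guard against that, assuming the data are currently in the intended order. The names obsno and u are hypothetical, and the toy dataset merely stands in for the real one:

    ```stata
    * Toy data standing in for the real dataset (assumption: 3000 rows)
    clear
    set obs 3000
    gen double u = runiform()

    * Record the current row order before building anything that depends on _n
    gen long obsno = _n

    * ... later, something reorders the data ...
    sort u

    * Restore the recorded order before relying on _n again
    sort obsno
    assert obsno == _n
    ```

    With obsno stored, any row-based grouping can be rebuilt or checked after an accidental sort.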

    Best,
    Rhys


    • #3
      Code:
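      * Position within each block of 21 rows: 0 marks the first row of a block
      gen mod = mod(_n-1,21)
      gen sort = _n
      * Number the block-starting rows 1, 2, 3, ... in row order
      egen wanted = group(sort mod) if mod == 0
      * Carry each block number down to the other 20 rows of its block
      replace wanted = wanted[_n-1] if wanted==.
      drop mod sort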
      Last edited by Ali Atia; 29 Apr 2021, 08:01.


      • #4
        egen, seq() is tailor-made for this kind of pattern.
        Code:
        egen wanted = seq(), to(3) block(21)
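
        To see what this produces, here is a small check on a toy dataset. The 3000 empty rows are an assumption standing in for the real data:

        ```stata
        * Toy data: 3000 rows, no other variables
        clear
        set obs 3000

        * Blocks of 21 rows numbered 1, 2, 3, then starting again at 1
        egen wanted = seq(), to(3) block(21)

        * Inspect the rows on either side of the first and third block boundaries
        list wanted in 20/23
        list wanted in 62/65

        * 3000 is not a multiple of 21, so the last block is incomplete
        tabulate wanted
        ```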


        • #5
          Dear Rhys and Ali,
          Unfortunately, this is a strange dataset with no ids. Thanks for your helpful solution.
          Ali,
          how can I replace wanted with 1, 2 and 3 where it is 4, 5 and 6, and so on down the entire column? Please see the code below, which performs similarly to your suggested code.

          Code:
          egen age = seq(), f(1) t(21)
          order age

          egen group = seq(), by(age)
          order group
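
          One possible way to fold a running block number such as group back into 1, 2, 3 is modular arithmetic. This is only a sketch: the toy group variable below is a stand-in built directly from _n, not the output of the code above.

          ```stata
          * Toy data: 3000 rows
          clear
          set obs 3000

          * Stand-in for a running block number 1, 2, 3, 4, ... (one value per 21 rows)
          gen int group = ceil(_n/21)

          * Fold it into the repeating cycle 1, 2, 3, 1, 2, 3, ...
          gen byte wanted = mod(group - 1, 3) + 1
          ```

          Subtracting 1 before mod() and adding 1 afterwards keeps the result in 1 to 3 rather than 0 to 2.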


          • #6
            Thanks Nick!


            • #7
              The solution in #4 does what you want but for completeness' sake here's an adjustment of the code from #3 which resets to 1 after three iterations:

              Code:
              gen mod63 = mod(_n-1,63)
              gen mod21 = mod(_n-1,21)
              * Block-starting rows have mod63 equal to 0, 21 or 42, so group() numbers them 1, 2, 3
              egen wanted = group(mod63 mod21) if mod21 == 0
              replace wanted = wanted[_n-1] if wanted==.
              drop mod63 mod21


              • #8
                See also dm44 in https://www.stata.com/products/stb/journals/stb37.pdf from 1997 for some finger exercises in Stata functions of the kind that underlie seq() in egen.
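
                As a rough illustration of that idea (a sketch only, not the code that egen itself runs), the same 1, 2, 3 pattern can be written with functions alone and checked against seq(). The variable names viaegen and direct are hypothetical:

                ```stata
                * Toy data: 3000 rows
                clear
                set obs 3000

                * egen route, as in #4
                egen viaegen = seq(), to(3) block(21)

                * Same pattern from functions alone: block index counted from 0, wrapped every 3 blocks
                gen byte direct = 1 + mod(floor((_n - 1)/21), 3)

                * Stops with an error if the two ever disagree
                assert direct == viaegen
                ```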
