
  • creating variable equal to count of number of unique values in a different variable, long panel data

    Hello,

    I am trying to create a variable equal to the number of years of data available for each observation. I have a monthly dataset of agencies spanning 6 years. I used the following code, but it's not working. There may also be a much easier way of doing this. Can anyone help me?

    Code:
    gen identifier = _n
    levelsof identifier, local(identifierlvls)
    gen numberofyears=.
    foreach l of local identifierlvls {
        tab year if identifier==`l'
        matrix yearsmat`l' = r(r)
        replace numberofyears=yearsmat`l' if identifier==`l'
    }
    Thank you very much.

    Tom

  • #2
    You construct an identifier that just runs from 1 to the number of observations (subject to storage type; with a large data set some big integers can't be held exactly in a float). So by design each identifier occurs just once, and your code, if it worked, would reflect exactly what you created.

    A bug is that

    Code:
     replace numberofyears=yearsmat`l' if identifier==`l'
    needs to be
    Code:
     replace numberofyears=yearsmat`l'[1,1] if identifier==`l'
    But the creation of a matrix -- indeed matrices, one for each distinct identifier -- can be avoided, as code could be
    Code:
     replace numberofyears = r(r) if identifier==`l'
    As said, however, your code when fixed would record each identifier occurring once. Assuming an identifier id and a year variable year then
    Code:
    * tag exactly one observation in each distinct (id, year) pair
    egen tag = tag(id year)
    * total the tags within each id
    egen wanted = total(tag), by(id)
    counts distinct values of year within each id. If you don't have a year variable, then sing out and say exactly what your dates look like with a data example to get good advice. https://www.statalist.org/forums/for...s-in-a-dataset is another thread started today that seems to have the same answer.
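
    To see the two approaches side by side, here is a minimal sketch on a made-up toy dataset. The values, and the variable names month and idlvls, are invented for illustration; id and year are assumed as above. It runs the egen route and then, as a cross-check, the loop from the first post pointed at a real identifier and using r(r) directly.

    ```stata
    clear
    * toy monthly panel: id 1 has data in 2015 and 2016, id 2 only in 2015
    input id year month
    1 2015 1
    1 2015 2
    1 2016 1
    2 2015 1
    2 2015 2
    2 2015 3
    end

    * egen route: tag one observation per distinct (id, year), then total within id
    egen tag = tag(id year)
    egen wanted = total(tag), by(id)

    * loop route: count the rows of a one-way tabulation of year for each id
    gen numberofyears = .
    levelsof id, local(idlvls)
    foreach l of local idlvls {
        quietly tabulate year if id == `l'
        replace numberofyears = r(r) if id == `l'
    }

    * the two results should agree in every observation
    assert wanted == numberofyears
    list id year month wanted numberofyears, sepby(id)
    ```

    In this toy example both variables should be 2 for id 1 and 1 for id 2. The egen route avoids the loop altogether, which matters when there are many identifiers.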
    Last edited by Nick Cox; 04 Apr 2022, 11:42.



    • #3
      Nick Cox I had a year variable so your code
      Code:
      egen tag = tag(id year)
      worked great and was obviously way easier. Thank you very much.

