  • Creating a variable that counts number of unique observations

    I previously asked this question on the Stata forum, but now wish to implement it in Mata.

    I have data that looks like the following:


    Code:
    * Example generated by -dataex-. To install: ssc install dataex
    clear
    input float(Individual VacationDestination)
    1 12
    1 12
    1 14
    1 14
    2 12
    2 15
    2 16
    3 12
    3 12
    3 12
    4  1
    4  2
    4  3
    end

    In the above, Individual is the individual's ID and VacationDestination is the destination that the individual visited in a particular year. I wish to create a variable that counts the number of unique destinations each individual visited. This variable, call it C, will take the value 2 for individual 1, 3 for individual 2, and so on.

    So far, this is what I have done in Mata:

    Code:
    
    
    C=(sort(Individual,1),VacationDestination)
    
    
    z=uniqrows(C)
    This gives me the unique data points, but how can I count how many of them there are?

  • #2
    The function rows() counts rows.
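
    For example, applied to the result of uniqrows(), it returns the number of distinct rows. A minimal sketch, using a small matrix typed in by hand for illustration rather than read from the dataset:

    ```stata
    mata:
    // hypothetical (Individual, VacationDestination) pairs, entered by hand
    C = (1,12 \ 1,12 \ 1,14 \ 2,12)
    z = uniqrows(C)     // distinct rows: (1,12), (1,14), (2,12)
    n = rows(z)         // number of distinct rows
    n
    end
    ```

    Here n is 3. Note that this counts distinct pairs over the whole matrix, not separately for each individual.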



    • #3
      Hi Nick,
      Thanks a lot for your response. rows() does count the number of rows. However, I wish to create a variable that counts the number of distinct observations within each value of another variable (here, distinct destinations per individual). I am not quite sure how to do that in Mata.



      • #4
        You can use panelsetup(). Here is an example:

        Code:
        * Example generated by -dataex-. To install: ssc install dataex
        clear
        input float(Individual VacationDestination)
        1 12
        1 12
        1 14
        1 14
        2 12
        2 15
        2 16
        3 12
        3 12
        3 12
        4  1
        4  2
        4  3
        end
        
        // br
        
        clear mata
        mata:
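        // Number of distinct rows of X within each panel described by info
        // (info as returned by panelsetup(); rows must be grouped by panel ID)
        real colvector uniqVal(real matrix X, real matrix info)
        {
                real scalar i, N
                real colvector res

                N = rows(info)
                res = J(N,1,.)

                for (i=1; i<=N; i++)
                {
                    // rows info[i,1]..info[i,2] belong to panel i
                    res[i,1] = rows(uniqrows(X[info[i,1]..info[i,2],.]))
                }

                return(res)
        }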
        
        // version 13 or above
        // Number of nonmissing rows of X within each panel described by info
        real colvector countNobs(real matrix X, real matrix info)
        {
             return(panelsum(X:!=.,info)) // panelsum() is undocumented and available only since version 13
        }
        
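        X = st_data(.,"VacationDestination")
        id = st_data(.,"Individual")

        // Approach 1: loop over panels (data must already be sorted by Individual)
        info = panelsetup(id,1)

        uniqrows(id),uniqVal(X,info)

        // Approach 2: keep distinct (id, destination) pairs, then count rows per id
        Y = uniqrows((id,X))
        info = panelsetup(Y[,1],1)
        countNobs(Y[,2],info)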
        
        end
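
        Both calls above return one count per individual. If the goal is a Stata variable C holding that count on every observation, something like the following might work. It is only a sketch: it assumes the data are sorted by Individual and that no variable named C exists yet, and it uses panelsubmatrix() to pull out each individual's rows.

        ```stata
        clear
        input float(Individual VacationDestination)
        1 12
        1 12
        1 14
        1 14
        2 12
        2 15
        2 16
        3 12
        3 12
        3 12
        4  1
        4  2
        4  3
        end

        mata:
        id   = st_data(., "Individual")
        X    = st_data(., "VacationDestination")
        info = panelsetup(id, 1)        // assumes data sorted by Individual
        C    = J(rows(id), 1, .)

        for (i=1; i<=rows(info); i++) {
            // distinct destinations among the rows of individual i
            k = rows(uniqrows(panelsubmatrix(X, i, info)))
            // copy that count onto every row of individual i
            C[|info[i,1],1 \ info[i,2],1|] = J(info[i,2]-info[i,1]+1, 1, k)
        }

        // assumption: the new variable is called C and does not exist yet
        st_store(., st_addvar("float", "C"), C)
        end

        list, sepby(Individual)
        ```

        With the example data, C should be 2 for individual 1, 3 for individual 2, 1 for individual 3 and 3 for individual 4.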



        • #5
          Thanks a lot for the detailed exposition!
