In the following simulated data (full code provided), I want to count the number of distinct words occurring within one string variable, -allcolors-. I have read the FAQ at:
http://www.stata.com/support/faqs/da...stinct-values/
however, that FAQ deals with one-word string variables, whereas I want to count the number of distinct words within a string variable that holds multiple words.
In this simulation I have manually counted, in the variable -total-, the number that I am seeking advice on how to compute with code.
Code:
clear
set obs 8
input group id visitno str25 allcolors total
1 11 1 "Red" 1
1 24 1 "Red" 1
1 24 2 "Red Blue" 2
2 18 1 "Red" 1
2 18 2 "Red Blue" 2
2 18 3 "Red Blue Green Yellow" 4
2 44 1 "Red" 1
2 44 2 "Red Blue" 2
end
l, noo sepby(group)