In the following simulated data (full code provided) I am seeking to count the number of unique and distinct words occurring within one string variable, -allcolors-. I have read the FAQ at:
http://www.stata.com/support/faqs/da...stinct-values/
however, that deals with one-word string variables, whereas I wish to count the number of distinct words within a string variable that holds multiple words.
In this simulation I have entered by hand, in the variable -total-, the count that I am seeking advice on how to compute with code.
Code:
clear
set obs 8
input group id visitno str25 allcolors total
1 11 1 "Red" 1
1 24 1 "Red" 1
1 24 2 "Red Blue" 2
2 18 1 "Red" 1
2 18 2 "Red Blue" 2
2 18 3 "Red Blue Green Yellow" 4
2 44 1 "Red" 1
2 44 2 "Red Blue" 2
end
list, noobs sepby(group)
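
For reference, a minimal sketch of one way such a count might be computed, run after the example data above. It assumes that words are separated by spaces and that matching is case-sensitive (so "Red" and "red" would count as two words). The variable names nwords, seen and ndistinct are hypothetical, introduced only for illustration.

```stata
* Sketch only: count distinct space-separated words in -allcolors-.
* Assumption: words are separated by spaces; matching is case-sensitive.
* -nwords-, -seen- and -ndistinct- are hypothetical helper names.

* The largest number of words in any observation sets the loop length.
generate nwords = wordcount(allcolors)
summarize nwords, meanonly

generate ndistinct = 0
generate seen = " "

quietly forvalues j = 1/`r(max)' {
    * j-th word of each observation; empty if there are fewer than j words
    local w word(allcolors, `j')
    * a word is new if it is nonempty and not yet recorded in -seen-
    local isnew `w' != "" & strpos(seen, " " + `w' + " ") == 0
    replace ndistinct = ndistinct + 1 if `isnew'
    replace seen = seen + `w' + " " if `isnew'
}

* Compare with the manual count in -total-.
assert ndistinct == total
drop nwords seen
```

The padding spaces around each word in -seen- are there so that a word is only matched whole, not as part of a longer word.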
