Hey all, back again to see if there's a clever solution to my goal. Given the length of the strings and the potentially sensitive information, I'm omitting -dataex- and using a brief made-up example instead.
I have 3 string variables I'd like to encode. The strings are all drawn from the same list of values; however, not every variable contains every possible value, leading to a dataset like the example in the code below.
Now, I'd like to use -encode- so that a = 1, b = 2, c = 3, and so on for all variables. However, because each variable contains a different set of values, running -encode- separately on each variable will not give a consistent encoding across variables. Is there a way to ensure consistency? I'm going to experiment with -reshape- to solve this, but I'm curious whether anyone else has had this issue.
Code:
clear
input str1 (var1 var2 var3)
"a" "a" "a"
"b" "a" "b"
"c" "c" "b"
"a" "d" "b"
end
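For reference, here is a minimal sketch of one approach that may work besides -reshape-: define a single value label up front and pass it to every -encode- call through the label() option, so all three variables share the same mapping. The label name letters and the _num suffix are hypothetical names chosen for illustration, and the sketch assumes the full list of possible strings (a through d here) is known in advance.

```stata
* Rebuild the toy dataset so the sketch is self-contained
clear
input str1 (var1 var2 var3)
"a" "a" "a"
"b" "a" "b"
"c" "c" "b"
"a" "d" "b"
end

* One shared value label covering every possible string
* ("letters" is a hypothetical name)
label define letters 1 "a" 2 "b" 3 "c" 4 "d"

* Encode each variable against the same label; noextend makes -encode-
* stop with an error rather than add new codes if a string is not
* already in the label
foreach v of varlist var1 var2 var3 {
    encode `v', generate(`v'_num) label(letters) noextend
}

* Show the numeric codes alongside the original strings
list var1 var1_num var2 var2_num var3 var3_num, nolabel
```

With noextend, any string that is missing from the label produces an error instead of silently receiving a new code, which seems like the safer behavior when consistency is the goal. Dropping noextend would instead let -encode- append unseen strings to the shared label.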
Thanks!