I want my string values to become numeric values

Yao Zhao

Join Date: Feb 2017
Posts: 226

05 Mar 2020, 20:33

Code:

* Example generated by -dataex-. To install: ssc install dataex
clear
input str26(q11r1_facebook q11r2_instagram q11r3_twitter q11r4_snapchat q11r5_pinterest q11r6_tiktok q11r7_linkedin q11r8_strava)
"I use frequently"           "I use frequently"       "I would consider using"     "I use frequently"    "I use frequently"           "I use frequently"           "I would consider using"     "I've never heard of"       
"I would NOT consider using" "I use frequently"       "I would NOT consider using" "I use occassionally" "I would NOT consider using" "I would NOT consider using" "I would consider using"     "I've never heard of"       
"I use frequently"           "I would consider using" "I would consider using"     "I use frequently"    "I would consider using"     "I would consider using"     "I use occassionally"        "I've never heard of"       
"I use frequently"           "I use frequently"       "I would NOT consider using" "I use frequently"    "I use occassionally"        "I would NOT consider using" "I would NOT consider using" "I've never heard of"       
"I use frequently"           "I use occassionally"    "I would NOT consider using" "I use occassionally" "I use occassionally"        "I would NOT consider using" "I would NOT consider using" "I would NOT consider using"
end

I have these variables, and they all take the same set of string values:
I use frequently
I use occassionally
I would NOT consider using
I would consider using
I've never heard of

I want to recode them as, say, 5, 4, 3, 2, 1 so that I can use them in later regressions.

What's the easiest way to code this?

Clyde Schechter

Join Date: Apr 2014
Posts: 30100

05 Mar 2020, 20:44

The -encode- command is one of the most useful data management tools in Stata. You should definitely learn how to use it: read -help encode-

Code:

* Example generated by -dataex-. To install: ssc install dataex
clear
input str26(q11r1_facebook q11r2_instagram q11r3_twitter q11r4_snapchat q11r5_pinterest q11r6_tiktok q11r7_linkedin q11r8_strava)
"I use frequently"           "I use frequently"       "I would consider using"     "I use frequently"    "I use frequently"           "I use frequently"           "I would consider using"     "I've never heard of"      
"I would NOT consider using" "I use frequently"       "I would NOT consider using" "I use occassionally" "I would NOT consider using" "I would NOT consider using" "I would consider using"     "I've never heard of"      
"I use frequently"           "I would consider using" "I would consider using"     "I use frequently"    "I would consider using"     "I would consider using"     "I use occassionally"        "I've never heard of"      
"I use frequently"           "I use frequently"       "I would NOT consider using" "I use frequently"    "I use occassionally"        "I would NOT consider using" "I would NOT consider using" "I've never heard of"      
"I use frequently"           "I use occassionally"    "I would NOT consider using" "I use occassionally" "I use occassionally"        "I would NOT consider using" "I would NOT consider using" "I would NOT consider using"
end

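* Map each response string to the numeric code it should receive
label define usage  1   "I've never heard of" ///
                    2   "I would consider using"    ///
                    3   "I would NOT consider using"    ///
                    4   "I use occasionally"    ///
                    5   "I use frequently"

foreach v of varlist q11r1_facebook-q11r8_strava {
    * The data spell this response "occassionally"; correct it so it matches the label text
    replace `v' = subinstr(`v', "occassionally", "occasionally", .)
    * Encode using the codes in -usage-, then put the numeric variable in place of the string one
    encode `v', gen(_`v') label(usage)
    order _`v', before(`v')
    drop `v'
    rename _`v' `v'
}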
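
A quick way to check that every string really matched the label (a sketch, not part of the code above): by default, -encode- with the label() option adds any unmatched string to the label under a new code instead of stopping, so a spelling difference between the data and the label can pass unnoticed. The toy variable -platform- and the label -usage_demo- below are made-up names used only for illustration.

```stata
* Sketch only: toy data with one response spelled differently from the label
clear
input str26 platform
"I use frequently"
"I use occassionally"
"I've never heard of"
end

label define usage_demo 1   "I've never heard of" ///
                        2   "I would consider using"    ///
                        3   "I would NOT consider using"    ///
                        4   "I use occasionally"    ///
                        5   "I use frequently"

* Default behavior: the unmatched string is added to the label with a new code
encode platform, gen(platform_num) label(usage_demo)
label list usage_demo
tabulate platform_num, nolabel

* With noextend, -encode- refuses to run instead of extending the label
capture noisily encode platform, gen(platform_strict) label(usage_demo) noextend
```

If -label list- shows more entries than you defined, or -tabulate, nolabel- shows a code outside 1 to 5, some string in the data did not match the label exactly.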

Note: Those labels are pretty long, and many kinds of Stata output will truncate them. Consider a shorter set of labels, such as "frequently", "occasionally", "would not consider", "would consider", and "never heard of".
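
One way to do that, as a sketch: the long label text has to match the strings for -encode- to work, so the shorter labels can be attached afterwards, once the variables are numeric. This assumes the variables hold only the codes 1 to 5; the names -q_demo- and -usage_short- are made up for illustration.

```stata
* Sketch only: a toy numeric variable standing in for an already-encoded one
clear
input byte q_demo
5
4
1
end

* Same codes as before, shorter text
label define usage_short 1   "never heard of" ///
                         2   "would consider"    ///
                         3   "would not consider"    ///
                         4   "occasionally"    ///
                         5   "frequently"

* Attach the short label; the underlying numeric values are unchanged
label values q_demo usage_short
tabulate q_demo

* On the real data, after encoding, the analogous command would be:
* label values q11r1_facebook-q11r8_strava usage_short
```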

I should also add that this is a very odd response set, and I feel uneasy about these data. The first two responses are frequencies of use, the third and fourth are attitudes towards use, and the fifth is about knowledge that the platform in question exists. I cannot imagine what question stem would make all of these appropriate. If you ask people one thing and then give them response choices that don't fit the question, you tend to get random, uninformative responses. And I wonder how it will even be possible to interpret the results, no matter how they are analyzed.

Yao Zhao

Join Date: Feb 2017

Posts: 226
#3

06 Mar 2020, 07:10

Your code is excellent! Thank you!