
  • Transforming survey results to numerical codes appropriately (Encoding help)

    Stata and Stata Forum Beginner here.

    Situation: I'm using LimeSurvey data for a health-related QOL study. The survey has questions whose responses range from 'none of the time' to 'all of the time'. LimeSurvey codes these A1 to A5 respectively.
    Exporting the responses often produces strings (with non-numerical characters) where I would want numerical values as codes, so I looked up the encode command. (The alternative would be a global find and replace in Excel, but I want to do this in Stata so that a do-file can transform any new responses the same way.)

    Problem: encode maps A1 to A5 to 1 to 5, but I'd rather the codes start from 0.

    Question: Is there a way to tell Stata exactly how to encode such responses? Or is there a better way than encode to assign numerical codes to the string answers from the survey? I'm aware that my only option may be to generate a new variable and fill it in with replace ... if, but that seems inelegant and I'd have to repeat it many times, once for each question.

    Any help would be greatly appreciated.
    Last edited by Nobin Zaman; 30 Jul 2020, 07:52.

  • #2
    Manually creating the value label you want the encode command to use, and then telling it to use that value label to determine the value to assign to each string, leads to the results you seek.
    Code:
    * Example generated by -dataex-. To install: ssc install dataex
    clear
    input str17 q1
    "none of the time" 
    "a bit of the time"
    "some of the time" 
    "a lot of the time"
    "most of the time" 
    "all of the time"  
    end
    
    label define freq         ///
        0 "none of the time"  ///
        1 "a bit of the time" ///
        2 "some of the time"  ///
        3 "a lot of the time" ///
        4 "most of the time"  ///
        5 "all of the time"  
    
    encode q1, generate(q1n) label(freq)
    list, clean
    list, clean nolabel
    Code:
    . list, clean
    
                          q1                 q1n  
      1.    none of the time    none of the time  
      2.   a bit of the time   a bit of the time  
      3.    some of the time    some of the time  
      4.   a lot of the time   a lot of the time  
      5.    most of the time    most of the time  
      6.     all of the time     all of the time  
    
    . list, clean nolabel
    
                          q1   q1n  
      1.    none of the time     0  
      2.   a bit of the time     1  
      3.    some of the time     2  
      4.   a lot of the time     3  
      5.    most of the time     4  
      6.     all of the time     5
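
    If the exported strings are the LimeSurvey codes A1 to A5 rather than the answer text, the same approach should carry over, and a loop avoids repeating it for every question. The following is only a sketch under assumptions: the variable names q1 to q3, the label name freq5, and the _n suffix are made up for illustration, and the data are invented. The noextend option makes encode stop with an error if a variable contains a string that is not in the label, instead of silently adding new codes.

    ```stata
    * Hypothetical example data: three questions exported as A1-A5 codes
    clear
    input str2 q1 str2 q2 str2 q3
    "A1" "A3" "A5"
    "A2" "A2" "A4"
    "A5" "A1" "A1"
    end

    * One value label shared by all questions, starting at 0
    label define freq5 0 "A1" 1 "A2" 2 "A3" 3 "A4" 4 "A5"

    * Encode every question against the same label
    foreach v of varlist q1-q3 {
        encode `v', generate(`v'_n) label(freq5) noextend
    }

    list, clean nolabel
    ```

    Because this lives in a do-file, rerunning it on a fresh export applies the same mapping to any new responses.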



    • #3
      Originally posted by William Lisowski
      Manually creating the value label you want the encode command to use, and then telling it to use that value label to determine the value to assign to each string, leads to the results you seek.

      Hi William,

      Thanks for answering.
      When I copy the code into a fresh Stata session it doesn't work: label define freq fails with 'invalid syntax', r(198), and the following lines then fail with '0 is not a valid command name', r(199), and so on.
      Do I need to modify the code you've given me somehow, is that what the /// is for?



      • #4
        You should run William's code from a do-file rather than pasting it into the Command window. Because the command is long, he uses "///" to continue it across several lines for readability, and that continuation marker is only recognized in do-files. See

        Code:
        help delimit
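
        As a rough illustration of the alternatives (a sketch, not part of William's code): the first command below is his label definition written on one line, which needs no continuation marker; the second uses #delimit ; to achieve the same multi-line layout as "///". The label drop in between is only there so the label can be defined twice in the same run. Since #delimit is itself a do-file feature, the whole block is meant to be run from a do-file.

        ```stata
        * Single-line form: no continuation marker needed
        label define freq 0 "none of the time" 1 "a bit of the time" 2 "some of the time" 3 "a lot of the time" 4 "most of the time" 5 "all of the time"

        * Remove the label so the second form can define it again
        label drop freq

        * Alternative layout: commands end at a semicolon instead of a line break
        #delimit ;
        label define freq
            0 "none of the time"
            1 "a bit of the time"
            2 "some of the time"
            3 "a lot of the time"
            4 "most of the time"
            5 "all of the time" ;
        #delimit cr

        label list freq
        ```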



        • #5
          Originally posted by Andrew Musau
          You should run William's code in a do file. Because the command is too long, he is using "///" to break it into several lines, to enhance readability.

          Ah I see, it works now, thank you.
