  • Categorical variables: automatic recoding to start at "1" and continue sequentially

    Hi everyone, I need to recode categorical variables automatically so that their values run in sequential order starting from "1".
    For example, I have a variable school_lev with the values 10 "None", 33 "elementary school", and 40 "high school". I want to generate a variable in which 10 becomes 1, 33 becomes 2, and 40 becomes 3. Of course I know I can do it manually, but with the number of variables I have it would take a very long time. Also, if there is a way to preserve the original labels, please let me know.
    Many thanks in advance.

  • #2
    The main part of what you want can be done via
    Code:
    ta oldvar, gen(newvar)
    Replace "oldvar" and "newvar" with your actual variable name and the name you want for the new variable.


    • #3
      Not so; tabulate just generates dummy variables with that syntax.

      Code:
      egen newvar = group(oldvar), label
      is my suggestion.
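
      A minimal sketch of the difference, on toy data modelled on the school_lev example from the first post. The label name school_lbl, the stub d, and the new variable school_new are made up for illustration. group() numbers the distinct values in sorted order, and the label option should carry the existing value labels over to the new codes.

      ```stata
      * Toy data: three categories coded 10, 33, 40 (illustrative only)
      clear
      input school_lev
      10
      33
      40
      33
      10
      end
      label define school_lbl 10 "None" 33 "elementary school" 40 "high school"
      label values school_lev school_lbl

      * tabulate with generate() creates one indicator per category: d1, d2, d3
      tabulate school_lev, generate(d)

      * egen group() creates a single variable coded 1, 2, 3 in sorted order;
      * the label option attaches the original label text to the new codes
      egen school_new = group(school_lev), label

      tabulate school_lev school_new
      label list
      ```

      With many variables, the egen line could be wrapped in a foreach loop over the variable list, generating one new variable per original.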


      • #4
        Originally posted by Rich Goldstein:
        The main part of what you want can be done via
        Code:
        ta oldvar, gen(newvar)
        Replace "oldvar" and "newvar" with your actual variable name and the name you want for the new variable.

        Thank you so much for your reply, but as far as I understand, this creates as many dummy variables as there are values, whereas I need a single variable holding the new sequential codes.


        • #5
          Originally posted by Nick Cox:
          Not so; tabulate just generates dummy variables with that syntax.

          Code:
          egen newvar = group(oldvar), label
          is my suggestion.

          Thank you so much, this actually worked. However, I need to remove the numeric value that precedes each value label, because it is carried over in the process. For example, the first category now reads 1 [10]:none.


          • #6
            That means applying numlabel first.
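
            One way to read this, as a sketch only: strip the numeric prefixes from the value labels with numlabel's remove option, then run egen. The example assumes the prefixes were added by numlabel with its default mask; if a different mask was used (the "[10]:" form above suggests one), the same mask() would have to be passed along with remove. The names school_lbl and school_new are made up for illustration.

            ```stata
            * Toy data with value labels (illustrative only)
            clear
            input school_lev
            10
            33
            40
            end
            label define school_lbl 10 "None" 33 "elementary school" 40 "high school"
            label values school_lev school_lbl

            * Reproduce the situation: prefix each label with its numeric value
            numlabel school_lbl, add

            * Strip the prefixes again before recoding
            numlabel school_lbl, remove

            * The new labels are now copied without the old numeric values
            egen school_new = group(school_lev), label
            label list
            ```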


            • #7
              Thank you, Riham, for the question, and Nick, for your helpful response. This post is similar to something I am trying to do, but I could use some additional guidance. I was able to recode automatically and sequentially using the code provided (similar to Riham's issue, but my oldvar and newvar are numeric). However, my dataset is a merged dataset of 100 surveys, identified as 1 through 100, and I would like to apply the egen within each survey. I wrote a forvalues loop over the 100 surveys (the identifier is the variable survey in the example below). It codes the first survey correctly but then will not move on, because the variable has already been generated. Any suggestions on how to modify the current code, or on another way to accomplish the same thing?

              Code:
              forvalues i = 1/100 {
                  egen newvar = group(oldvar) if survey == `i'
              }

              This stops with the error:
              variable newvar already defined
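
              The error arises because egen always creates a new variable, so the second pass of the loop tries to create newvar again. A possible alternative, sketched here on made-up toy data and assuming the survey identifier is the variable survey as described, avoids the loop: sort within survey, flag the first observation of each distinct oldvar value, and take a running sum of the flags. The helper variable is_first is hypothetical.

              ```stata
              * Toy data: two surveys with different sets of codes (illustrative only)
              clear
              input survey oldvar
              1 10
              1 33
              1 40
              2 33
              2 50
              2 33
              end

              * Flag the first observation of each distinct oldvar value within a survey
              bysort survey oldvar: generate byte is_first = (_n == 1)

              * The running sum of the flags restarts at 1 within every survey
              by survey: generate newvar = sum(is_first)

              * Missing values sort last and would otherwise get a code of their own
              replace newvar = . if missing(oldvar)
              drop is_first

              list survey oldvar newvar, sepby(survey)
              ```

              The loop could also be kept by creating newvar once before the loop, writing each survey's codes to a temporary variable inside it, and copying them across with replace before dropping the temporary variable.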
