  • Converting text values to numeric values in Stata dataset

    Good afternoon! I'm very new to Stata and am having difficulty converting a text value to a numeric one. I have a dataset with hospital account types (22 account types in a variable called "accnt_type"), each 4 or 5 characters long. For example, "O ER" and "I INP" are two of the account types. I used the command encode accnt_type, gen(visit_type) to convert the string variable into a new numeric variable called visit_type.

    Now, I'd like to replace the "O ER" and "O OB" values with a 2, the "O OUT" value with a 1, and all of the other visit types with a 0, so that 0, 1, and 2 are the values stored in the dataset. However, I can't determine how to do this. Can you provide some guidance on this, please? Thank you.

  • #2
    You can still refer to the original string variable:
    Code:
    generate accnt_cat = 0
    replace accnt_cat=1 if accnt_type=="O OUT"
    replace accnt_cat=2 if accnt_type=="O ER" | accnt_type=="O OB"
    Otherwise you need to know which codes were assigned by the -encode- command to the labels you mention.
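One way to see which codes -encode- assigned is to list the value label it created. The following is a minimal sketch on a made-up four-row dataset (the rows are illustrative only, not the poster's data); -encode- numbers the distinct strings in alphabetical order, starting at 1.

```stata
* Toy data: hypothetical rows standing in for the real accnt_type variable
clear
input str5 accnt_type
"I INP"
"O ER"
"O OB"
"O OUT"
end

* Create the numeric variable and its value label, as in the question
encode accnt_type, generate(visit_type)

* Show the mapping from numeric code to label text
label list visit_type

* Show the underlying numeric codes instead of the labels
tabulate visit_type, nolabel
```

With the real data, the same two commands after -encode- show the code behind each of the 22 account types.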

    Best, Sergiy Radyakin



    • #3
      So, I take it you already have a numeric variable, visit_type, with a value label also called visit_type (the default name for the value label -encode- creates). You can do the recoding as follows:

      Code:
      local oer = "O ER":visit_type
      local oob = "O OB":visit_type
      local oout = "O OUT":visit_type
      recode visit_type (`oer' `oob' = 2) (`oout' = 1) (nonmissing = 0), gen(visit_type_3)
      label define visit_type_3 0 "Other" 1 "O OUT" 2 "O ER/O OB"
      label values visit_type_3 visit_type_3
      Note the use of the equals sign in the local macro definitions here. That is essential in this context. -recode- does not evaluate expressions like "O ER":visit_type, so the local macro must have already done so before -recode- will accept them.

      Caveat: above code not directly tested. Beware of typos and other small errors.

      Added: Crossed with Sergiy's reply. I tacitly assumed the original string variable was no longer available. But, of course, it might be, and if it is, Sergiy's solution is obviously better.
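To illustrate the point about the equals sign in the local macro definitions, here is a small sketch on made-up data (the four rows are hypothetical). With the equals sign, the "label":labelname expression is evaluated, so the macro holds the numeric code, which can then be used wherever a number is expected.

```stata
* Toy data: hypothetical rows standing in for the real accnt_type variable
clear
input str5 accnt_type
"I INP"
"O ER"
"O OB"
"O OUT"
end
encode accnt_type, generate(visit_type)

* The "=" makes Stata evaluate the expression and store the resulting number
local oer = "O ER":visit_type

* Inspect what the macro holds, then use it as an ordinary numeric value
display "code stored for O ER: `oer'"
count if visit_type == `oer'
```

Displaying each macro before passing it to -recode- is a cheap way to catch a mistyped label, since a label that does not exist evaluates to missing.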



      • #4
        Code:
        gen newvar = 0 
        replace newvar = 1 if accnt_type == "O OUT" 
        replace newvar = 2 if inlist(accnt_type, "O OB", "O ER")
        A taste sometimes never acquired points to

        Code:
        gen newvar = cond(inlist(accnt_type, "O OB", "O ER"), 2, cond(accnt_type == "O OUT", 1, 0))
        Always cross-check with

        Code:
        tab accnt_type newvar
        as this code compares strings exactly, so it is sensitive to case, spelling variants, and extra leading or trailing spaces.
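If the cross-tabulation does reveal such variants, one possible remedy is to compare against a cleaned copy of the string. This is only a sketch: the messy rows are invented, accnt_clean is a hypothetical variable name, and strtrim() and stritrim() assume Stata 14 or later (older versions offer trim() and itrim()).

```stata
* Toy data with deliberately messy values (hypothetical, for illustration)
clear
input str6 accnt_type
"O ER"
" O OB"
"o out"
"I INP"
end

* Cleaned copy: strip outer blanks, collapse repeated inner blanks, upper case
generate accnt_clean = upper(stritrim(strtrim(accnt_type)))

* Same recoding as above, but against the cleaned values
generate byte newvar = 0
replace newvar = 1 if accnt_clean == "O OUT"
replace newvar = 2 if inlist(accnt_clean, "O OB", "O ER")

* Cross-check the original strings against the new categories
tabulate accnt_type newvar
```

Cleaning only handles case and spacing; genuine spelling variants would still need to be listed explicitly.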
