  • Recoding variables in loop

    Hi, I am trying to recode the categorical variables in my dataset in the following two ways, but I get an error with each:

    Code:
    foreach var of varlist * {
        replace `var' = . if `var' == "Unsure" | `var' == "Prefer not to say"
    }
    
    type mismatch
    r(109);
    Code:
    foreach var of local vars {
        recode `var' (Unsure = .)
        recode `var' (Prefer not to say = .)
    }
    unknown el Unsure in rule
    r(198);
    Please suggest how this can be corrected.

    Thank you.



  • #2
    The problem you are encountering is that . is the missing value for numeric variables and cannot be used in a string variable. On the other hand, "Unsure" and "Prefer not to say" are string values that cannot appear in numeric variables. Your -replace- command clearly treats `var' as numeric before the -if- and as string after it. That's why you're getting an error message. Whether `var' happens to be numeric or string, either way, the command gets it wrong on one side of the -if-.

    If all of the variables in your data set (since you are iterating over varlist *) are strings, then you need to replace . by "", the missing value for string variables. If they are all numeric, and have attached value labels that include labels "Unsure" and "Prefer not to say", then it is more complicated. It would be something like this:
    Code:
    foreach v of varlist * {
        local lbl: value label `v'
        replace `v' = . if `v' == "Unsure":`lbl' | `v' == "Prefer not to say":`lbl'
    }

    If, as is more common, your data set has a mix of string and numeric variables, rather than doing a single loop over all variables, you are better off doing one loop over the string variables and a separate loop over the numeric variables.
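
    A minimal sketch of that two-loop approach, under stated assumptions: the toy data, the variable names q1 and q2, and the label name q2lbl are all invented here for illustration, and numeric variables without an attached value label are simply skipped. It is not tested against your actual data.

    ```stata
    * Hypothetical toy data: one string variable, one labeled numeric variable
    clear
    input str20 q1 byte q2
    "Yes" 1
    "Unsure" 2
    "Prefer not to say" 3
    "No" 1
    end
    label define q2lbl 1 "Yes" 2 "Unsure" 3 "Prefer not to say"
    label values q2 q2lbl

    * Loop 1: string variables, where the missing value is the empty string
    ds, has(type string)
    local strvars `r(varlist)'
    foreach v of local strvars {
        replace `v' = "" if inlist(`v', "Unsure", "Prefer not to say")
    }

    * Loop 2: numeric variables, looking the codes up in the attached value label
    ds, has(type numeric)
    local numvars `r(varlist)'
    foreach v of local numvars {
        local lbl : value label `v'
        * Skip variables that have no value label attached
        if "`lbl'" == "" continue
        replace `v' = . if `v' == "Unsure":`lbl' | `v' == "Prefer not to say":`lbl'
    }
    ```

    Using -foreach ... of local- rather than -of varlist- means an empty list (say, a data set with no string variables) just results in zero iterations.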

    Added: Your second attempt, using -recode- fails because -recode- works only with numeric variables, and, as already noted, "Unsure" and "Prefer not to say" are not possible values of numeric variables.



    • #3
      There is a user-written package -strrec- by Daniel Klein that implements -recode- for string variables, which might be convenient here. However, while I have it installed, I can't find it via -search strrec- or -search string recode-, so perhaps it's no longer available.



      • #4
        The segregation of variable names into numeric variable names and string variable names alluded to by Clyde Schechter can be achieved in various ways.

        One is to use the official command ds:
        Code:
        . sysuse auto, clear
        (1978 automobile data)
        
        . ds , has(type numeric)
        price         rep78         trunk         length        displacement  foreign
        mpg           headroom      weight        turn          gear_ratio
        
        . local numvars `r(varlist)'
        
        . ds , has(type string)
        make
        
        . local strvars `r(varlist)'

        ds displays a list of variable names satisfying the criterion or criteria specified (or all variables if none is specified) and also leaves the list in r(varlist). But, as the example above shows, r(varlist) is ephemeral and will be overwritten by the next r-class command, which could be another application of ds.

        So, it is a good idea to copy that result quickly to a local macro of your choice.
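
        A small illustration of why the copy matters, offered as a sketch: the choice of summarize as the intervening r-class command is mine, and any other r-class command would do the same.

        ```stata
        sysuse auto, clear

        * Save the list of string variables straight away
        ds, has(type string)
        local strvars `r(varlist)'

        * Any later r-class command replaces the results left behind by ds
        summarize price

        * r(varlist) is now empty, but the local macro still holds the list
        display "r(varlist) is now: [`r(varlist)']"
        display "strvars is still:  [`strvars']"
        ```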

        The manual entry shows that I had a hand in ds, although it was an official command before I did anything and remains an official command. But the has() syntax can be blamed on me, unless you like it. Anyway, my second take was findname, which was and is community-contributed. findname has the syntax I like, and more importantly is more versatile than ds. A search shows the original 2010 reference and various fixes and extensions. If interested, download the latest version of the files, which as I write are from 2020.

        Code:
        . search findname , sj
        
        Search of official help files, FAQs, Examples, and Stata Journals
        
        SJ-20-2 dm0048_4  . . . . . . . . . . . . . . . . Software update for findname
                (help findname if installed)  . . . . . . . . . . . . . . .  N. J. Cox
                Q2/20   SJ 20(2):504
                new options include columns()
        
        SJ-15-2 dm0048_3  . . . . . . . . . . . . . . . . Software update for findname
                (help findname if installed)  . . . . . . . . . . . . . . .  N. J. Cox
                Q2/15   SJ 15(2):605--606
                updated to be able to find strL variables
        
        SJ-12-1 dm0048_2  . . . . . . . . . . . . . . . . Software update for findname
                (help findname if installed)  . . . . . . . . . . . . . . .  N. J. Cox
                Q1/12   SJ 12(1):167
                correction for handling embedded double quote characters
        
        SJ-10-4 dm0048_1  . . . . . . . . . . . . . . . . Software update for findname
                (help findname if installed)  . . . . . . . . . . . . . . .  N. J. Cox
                Q4/10   SJ 10(4):691
                update for not option
        
        SJ-10-2 dm0048  . . . . . . . . . . . . . .  Speaking Stata: Finding variables
                (help findname if installed)  . . . . . . . . . . . . . . .  N. J. Cox
                Q2/10   SJ 10(2):281--296
                produces a list of variable names showing which variables
                have specific properties, such as being of string type, or
                having value labels attached, or having a date format

        In this case, you can get the local macro directly.

        Code:
        . findname , type(numeric) local(mynumvars)
        price         rep78         trunk         length        displacement  foreign
        mpg           headroom      weight        turn          gear_ratio
        
        . findname , type(string) local(mystrvars)
        make
        .



        • #5
          strrec is still on SSC, though for whatever reason, it does not turn up when you search for it.

          I would no longer recommend its use. It has quite complex syntax and sometimes unpredictable results.

          Depending on the ultimate goals, I would encode (better yet: multencode [SSC]) the string variables, then work with the numeric ones.
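
          A rough sketch of the encode route using only official commands, with made-up data: the variable names answer and answer_num are hypothetical, and multencode is not shown.

          ```stata
          * Hypothetical toy data with a single string variable
          clear
          input str20 answer
          "Yes"
          "Unsure"
          "No"
          "Prefer not to say"
          end

          * encode creates a labeled numeric copy; without the label() option,
          * the value label takes the name of the new variable
          encode answer, generate(answer_num)

          * Look up the codes via the value label and set them to missing
          local lbl : value label answer_num
          replace answer_num = . if answer_num == "Unsure":`lbl' | answer_num == "Prefer not to say":`lbl'
          ```

          The value label itself still contains the "Unsure" and "Prefer not to say" entries afterwards; only the observations are set to missing.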
