  • Loop for regexm

    I have a string variable that contains many categories per observation. I'd like to create dummy variables from it, and I am using regular expressions.

    response
    ALLEGES, INJURED BY SOMEONE, UNDER INFLUENCE OF ALCOHOL

    I have written regexm code like this.

    gen alleges=regexm(response, "ALLEGES")
    gen injury=regexm(response, "INJURED")
    gen alcohol=regexm(response, "ALCOHOL")

    But this takes one line of code per search word. I cannot split the string on "," because the categories do not appear in a fixed order.

    Is there a way to write a loop to save some time instead of writing each line one by one?

  • #2
    Like this:
    Code:
    foreach x in ALLEGES INJURED ALCOHOL {
        local varname = lower("`x'")
        gen `varname' = strmatch(response, "*`x'*")
    }
    Note: I used -strmatch()- instead of -regexm- because you are not really taking advantage of the full flexibility of regular expressions. -strmatch()-, though limited to wildcard expressions, will be a bit more efficient as it has a less complicated matching task. But you can go back to using -regexm()- if the full task is more complicated, or even if you just want to.
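
    As a rough illustration of a case where -regexm()- would earn its keep, suppose the text could contain several forms of the same word. The alternatives INJURY and INJURIES below are hypothetical and do not come from the original data; the sketch only assumes a string variable named response:

    ```stata
    * Hypothetical: flag INJURED, INJURY, or INJURIES with one pattern.
    * A single wildcard such as "*INJUR*" would also match these, but it
    * cannot restrict the match to just these three endings.
    gen byte injury = regexm(response, "INJUR(ED|Y|IES)")
    ```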

    I also chose to name the new variable injured rather than injury so that there is a simple, systematic correspondence between the search string and the variable name. If we have to switch between noun and verb forms or the like, then it is not possible to really reduce the code from what you started with.
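
    If the variable names really must differ from the search strings, one possible compromise is to keep two parallel lists and walk through them by position. This is only a sketch: the local macro names (names, patterns, n, v, p) are made up for illustration, and it assumes the two lists have the same length and order.

    ```stata
    * Parallel lists: the i-th name goes with the i-th search string
    local names    alleges injury  alcohol
    local patterns ALLEGES INJURED ALCOHOL

    local n : word count `names'
    forvalues i = 1/`n' {
        local v : word `i' of `names'
        local p : word `i' of `patterns'
        * strpos() returns 0 when the search string is absent
        gen byte `v' = strpos(response, "`p'") > 0
    }
    ```

    This still requires typing every name and every search string once, so the saving over the original three lines is modest until the lists get long.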

    • #3
      Code:
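      * Search words must appear in response as written: "injury" is not
      * a substring of "injured", so the loop uses "injured" instead.
      foreach cat in alleges injured alcohol {
          gen `cat' = regexm(lower(response), "`cat'")
      }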
      Note: Crossed with #2.

      • #4
        If I add `_' (or anything else) to the name of the generated variable, it works perfectly. Thanks for saving me lots of typing!

        foreach x in mental alcohol alleges other disability {
            gen `x'_ = regexm(lower(response), "`x'")
        }
        Last edited by John MacDonald; 10 Mar 2024, 16:53. Reason: Found a solution
