  • Regex help

    Dear all,
    I have patient side-effect descriptions like the ones below. For most patients, the location of the symptom appears in the last set of parentheses in the description (that is how the form generated the response). I would like to capture whatever is inside the last set of parentheses, e.g., "right eye", "left eye", "both eyes", "left side of my head". Any help with -ustrregex- is much appreciated. Thanks.

    Code:
    * Example generated by -dataex-. For more info, type help dataex
    clear
    input str92 symp
    "I had severe migraine from headache it's(continuous) and also had watery eye only(right eye)"
    "I had severe migraine from headache it's(continuous) and also had watery eye only(left eye)"
    "I had severe migraine from headache it's(continuous) and also had watery eye only(both eyes)"
    "I had severe migraine from headache it's(continuous) and (left side of my head)"                 
    "I had severe migraine from headache it's(continuous) and (lack of sleep)"
    "I had severe migraine from headache it's(continuous) and also had (drowsiness)"
    "I had severe migraine from headache it's(continuous) and also had watery eye only(right eye)"
    end


    Roman

  • #2
    Code:
    * Example generated by -dataex-. For more info, type help dataex
    clear
    input str92 symp
    "I had severe migraine from headache it's(continuous) and also had watery eye only(right eye)"
    "I had severe migraine from headache it's(continuous) and also had watery eye only(left eye)" 
    "I had severe migraine from headache it's(continuous) and also had watery eye only(both eyes)"
    "I had severe migraine from headache it's(continuous) and (left side of my head)"             
    "I had severe migraine from headache it's(continuous) and (lack of sleep)"                    
    "I had severe migraine from headache it's(continuous) and also had (drowsiness)"              
    "I had severe migraine from headache it's(continuous) and also had watery eye only(right eye)"
    end
    
    gen wanted = ustrregexra(symp, ".*\((.*)\)$", "$1")
    Result:

    Code:
    . l, sep(0)
    
         +---------------------------------------------------------------------------------------------------------------------+
         |                                                                                         symp                 wanted |
         |---------------------------------------------------------------------------------------------------------------------|
      1. | I had severe migraine from headache it's(continuous) and also had watery eye only(right eye)              right eye |
      2. |  I had severe migraine from headache it's(continuous) and also had watery eye only(left eye)               left eye |
      3. | I had severe migraine from headache it's(continuous) and also had watery eye only(both eyes)              both eyes |
      4. |              I had severe migraine from headache it's(continuous) and (left side of my head)   left side of my head |
      5. |                     I had severe migraine from headache it's(continuous) and (lack of sleep)          lack of sleep |
      6. |               I had severe migraine from headache it's(continuous) and also had (drowsiness)             drowsiness |
      7. | I had severe migraine from headache it's(continuous) and also had watery eye only(right eye)              right eye |
         +---------------------------------------------------------------------------------------------------------------------+
    
    .
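
    One caveat, offered as a hedged sketch rather than as part of the answer above: -ustrregexra()- returns the string unchanged when the pattern does not match, so a description with no trailing parentheses would be copied whole into wanted. If such rows can occur, one option is to test for a match with -ustrregexm()- and read the capture group with -ustrregexs(1)-. The variable name wanted2 and the third example row are made up for illustration, and the optional \s* before $ is an added assumption to tolerate trailing blanks.

    ```stata
    * Illustrative data: the last row has no parentheses at all
    clear
    input str92 symp
    "I had severe migraine from headache it's(continuous) and also had watery eye only(right eye)"
    "I had severe migraine from headache it's(continuous) and (lack of sleep)"
    "I had severe migraine with no location given"
    end

    * Replace-all version: a non-matching row is returned unchanged
    gen wanted = ustrregexra(symp, ".*\((.*)\)$", "$1")

    * Match-then-extract version: a non-matching row is left empty ("")
    * \s* allows optional whitespace after the closing parenthesis (an assumption)
    gen wanted2 = ustrregexs(1) if ustrregexm(symp, ".*\((.*)\)\s*$")

    list wanted wanted2, sep(0)
    ```

    With the match-then-extract form, rows that lack the expected structure are easy to find afterwards, for example with -list symp if wanted2 == ""-.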



    • #3
      As always, a big thanks, Andrew! Knew you would come to the rescue. Now, for my understanding, I read your code as: "\(" = skip the opening parenthesis, "(.*)" = get everything, "\)" = skip the closing parenthesis, "$" = occurring at the end. I cannot figure out the role of ".*" at the beginning and of "$1" at the end. Would you mind explaining these two parts, if that is not too time-consuming? Thank you again.
      Last edited by Roman Mostazir; 30 Sep 2024, 09:29.
      Roman



      • #4
        .* matches any characters leading up to the last set of parentheses. I have to escape the opening and closing parentheses so that they are matched literally (as parentheses have a special meaning in regular expressions). The inner parentheses around the period and asterisk, (.*), define the capture group that captures everything inside the last set of parentheses. Finally, the dollar sign signifies the end of the string. So your interpretation is correct.
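
        On the two parts asked about in #3, a small sketch may help. It is an illustration, not a statement of how the code above was designed. The leading .* is greedy, so it consumes as much text as it can and leaves only the last "(" for \( to match; because it is part of the match, that leading text is also replaced. "$1" in the replacement argument stands for whatever the first capture group matched, so the whole matched string is replaced by the captured text alone. The variable names with_lead and no_lead are made up for the comparison.

        ```stata
        clear
        input str92 symp
        "I had severe migraine from headache it's(continuous) and also had watery eye only(right eye)"
        "I had severe migraine from headache it's(continuous) and (left side of my head)"
        end

        * With the leading .*: everything up to the last "(" is matched and dropped,
        * and $1 puts back only the text inside the last set of parentheses
        gen with_lead = ustrregexra(symp, ".*\((.*)\)$", "$1")

        * Without the leading .*: the match starts at the FIRST "(", so the text
        * before it is kept and the capture group runs from "continuous" to the end
        gen no_lead = ustrregexra(symp, "\((.*)\)$", "$1")

        list with_lead no_lead, sep(0)
        ```

        Comparing the two new variables side by side shows why the leading .* is needed whenever a description contains more than one set of parentheses.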



        • #5
          Thank you Andrew.
          Roman

