
  • Create binary variable if string contains any of the following

    Hi,

    I have a string variable "CaseDescription1" that holds a very long text description of each case. I want to create a new variable "RACF" that is coded 1 if "CaseDescription1" contains any of the following terms, and 0 if it contains none of them: RACF, ACF, HLCNH, NH, nursing home, aged care facility.

    Here is what I have tried so far, using only the term RACF (I figured that if I can't get it working with one term, it won't work with several). As you'll see, it doesn't detect RACF in any of the cases, even though the term is definitely there in a few thousand of them:


    . gen RACF = strpos(CaseDescription1, "RACF")

    . gen RACF_pt = 1 if RACF > 0
    (2,858,239 missing values generated)

    . tab RACF_pt
    no observations

    . drop RACF

    . drop RACF_pt

    . gen byte RACF = strmatch( CaseDescription1 , "*RACF*")

    . tab RACF

           RACF |      Freq.     Percent        Cum.
    ------------+-----------------------------------
              0 |  2,858,239      100.00      100.00
    ------------+-----------------------------------
          Total |  2,858,239      100.00

    . drop RACF

    . gen byte RACF = 1 if strmatch( CaseDescription1 , "*RACF*")
    (2,858,239 missing values generated)

    . drop RACF



    Thank you in advance.

  • #2
    See #3: https://www.statalist.org/forums/for...g-observations
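
    The linked post is not reproduced here. As a minimal sketch of one way to approach this, assuming Stata 14 or later (for ustrregexm()) and assuming the earlier attempts found nothing because of a letter-case mismatch (the thread does not say why they failed), a single case-insensitive regular expression can test for all of the terms at once:

    ```stata
    * Sketch only: flag descriptions that mention any of the listed terms.
    * The third argument (1) asks ustrregexm() for a case-insensitive match.
    * \b marks a word boundary, so "NH" is not matched inside a longer word
    * such as "enhance".
    gen byte RACF = ustrregexm(CaseDescription1, ///
        "\b(RACF|ACF|HLCNH|NH|nursing home|aged care facility)\b", 1)

    * Check how many cases were flagged.
    tab RACF
    ```

    ustrregexm() evaluates to 1 on a match and 0 otherwise, so no separate replace step is needed to fill in the zeros. If the abbreviations can appear glued to other letters or digits in the text, the \b markers would need to be relaxed.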

    • #3
      That worked perfectly! Thank you.
