
  • Creating new variables based on multiple string variables

    Dear all,

    I am working with a dataset in which every observation has a unique identification code, e.g. GEC10B5. The first five characters of this code identify a type of worker, here "GEC10". I would now like to group together all observations that were done by this GEC10 worker (for example GEC10B5 through GEC10B15, which are listed one beneath the other in Stata).
    Is there a way to create a new variable based on part of the string stored in another variable?

    Thanks in advance!

    Theresa

  • #2
    As your variable is a string, its first 5 characters can be the basis of a new variable. For example:

    Code:
    gen first5 = substr(whatever, 1, 5) 
    
    gen isGEC10 = substr(whatever, 1, 5) == "GEC10"
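
    As a minimal sketch of how the prefix could then be used for grouping, the lines below assume the identifier is a string variable called id (a hypothetical name, standing in for whatever above); worker and worker_id are likewise illustrative names only.

    Code:
    * id is a hypothetical name for the string identifier variable
    gen worker = substr(id, 1, 5)

    * number the distinct worker codes 1, 2, 3, ...
    egen worker_id = group(worker)

    * count the observations that fall under each worker code
    tabulate worker

    With this approach every worker type gets its own group in one pass, so there is no need to write a separate indicator for each prefix.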



    • #3
      Code:
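      gen wanted = regexm(trim(lower(id)), "^gec10")
      Here, replace "id" with the name of your identifier variable. The new variable is 1 when the identifier starts with "gec10" (ignoring case and surrounding blanks) and 0 otherwise.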
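
      To illustrate what the lower(), trim() and "^" pieces do, here is a small sketch using display on two made-up literal strings (the strings are assumptions for illustration, not values from the dataset):

      Code:
      * blanks are trimmed and case is ignored, so this one matches and shows 1
      display regexm(trim(lower(" GEC10B5 ")), "^gec10")

      * "^" anchors the match at the start, so this one does not match and shows 0
      display regexm(trim(lower("XGEC10B5")), "^gec10")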



      • #4
        [QUOTE=Nick Cox;n1652373]As your variable is string then the first 5 characters can be the basis of a new variable. For example:

        gen first5 = substr(whatever, 1, 5)

        gen isGEC10 = substr(whatever, 1, 5) == "GEC10"[/QUOTE]

        If I use this code, replacing (whatever, 1, 5) with the list of ID codes I want to group together, Stata gives me the error message "substr not found".
