  • Remove last character if it is a string

    Hello

    I would appreciate any advice with my problem. I have a set of IDs. The child ID has an "A" suffix at the end of a series of integers but the parent ID has matching integers only. Please could someone advise how I remove the "A"?

    I tried the following, but it removed the last character from all IDs. I only wish to remove the last character if it is "A":

    HTML Code:
    generate str id = substr( Subject , 1, strlen( Subject) - 1)
    My data looks like:
    HTML Code:
        Subject   
            
    1.    C001    
    2.    C001A    
    3.    C002    
    4.    C002A    
    5.    C003    
    6.    C003A    
    7.    C004    
    8.    C004A    
    9.    C005    
    10.    C005A

    Many thanks!
    Sara

  • #2

    Code:
    * Example generated by -dataex-. To install: ssc install dataex
    clear
    input str5 Subject
    "C001" 
    "C001A"
    "C002" 
    "C002A"
    "C003" 
    "C003A"
    "C004" 
    "C004A"
    "C005" 
    "C005A"
    end
    
    replace Subject = substr(Subject, 1, length(Subject) - 1) if substr(Subject, -1, 1) ==  "A"
    
    list, sepby(Subject)
    
         +---------+
         | Subject |
         |---------|
      1. |    C001 |
      2. |    C001 |
         |---------|
      3. |    C002 |
      4. |    C002 |
         |---------|
      5. |    C003 |
      6. |    C003 |
         |---------|
      7. |    C004 |
      8. |    C004 |
         |---------|
      9. |    C005 |
     10. |    C005 |
         +---------+


    • #3
      This should work as long as there is only one "A" in each value of the variable Subject:

      Code:
      generate str id = cond(substr(Subject,-1,.)=="A",subinstr(Subject,"A","",.),Subject)
      Some additional trickery would be necessary if "A" can appear anywhere in the string.

      On that note: it would be helpful if -subinstr(s1,s2,s3,n)- allowed negative values for n, similar to -substr()-.



      • #4
        Many thanks! Both worked.


        • #5
          Working out code for doing it on the reversed string and then reversing back is set as an exercise!
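
One possible answer to that exercise, as a minimal sketch: reverse the string so that a trailing "A" becomes the first character, remove only the first occurrence with -subinstr()- (n = 1), and reverse back with -strreverse()-. The IDs "CA01" and "CA01A" are made-up values added to show an "A" elsewhere in the string; they are not in the original data.

```stata
* Made-up test data: "CA01" and "CA01A" are hypothetical IDs with an "A" in the middle
clear
input str5 Subject
"C001"
"C001A"
"CA01"
"CA01A"
end

* If the last character is "A": reverse, drop the first "A" only (n = 1), reverse back.
* Otherwise keep Subject unchanged.
generate str id = cond(substr(Subject, -1, 1) == "A", strreverse(subinstr(strreverse(Subject), "A", "", 1)), Subject)

list
```

Here "CA01A" becomes "CA01" and "CA01" is left alone, because only the first "A" of the reversed string is removed.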


          • #6
            Dear all,

            What if the OP had the following data:

            Code:
             
             clear
             input str5 Subject
             "C001"
             "C001B"
             "C002"
             "C002A"
             "C003"
             "C003C"
             "C004"
             "C004d"
             "C005"
             "C005A"
             end
            How would one tweak the code shown in #2 to remove the last character, if it is a string (any string now, not just "A")?

            Many thanks in advance!


            • #7
              In a string all characters are ... string characters. But you mean removing a character that doesn't have a numeric interpretation.

              Code:
              replace Subject = substr(Subject, 1, strlen(Subject) - 1) if !inrange(substr(Subject, -1, 1), "0", "9")
              might help.
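
An alternative sketch, not taken from the posts above, uses a regular expression: -regexr()- replaces the first match of a pattern, and "[^0-9]$" matches a single non-digit character at the end of the string. The new variable name id is only illustrative, and the sketch assumes plain ASCII IDs like those in the example data.

```stata
* A few of the example IDs from #6
clear
input str5 Subject
"C001"
"C001B"
"C002"
"C004d"
end

* Drop one trailing character if it is not a digit 0-9; strings with no match are returned unchanged
generate str id = regexr(Subject, "[^0-9]$", "")

list
```

Because the pattern is anchored at the end with "$", non-digit characters elsewhere in the ID (such as the leading "C") are not touched.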


              • #8
                Yes, exactly. It was a very poor formulation on my part. That is exactly what I meant.

                Thank you!
