  • Using subinstr to replace the third instance and beyond of a particular character (instead of the first n instances)

    Hi,

    I have a string variable that should consist mostly of digits, apart from two letters ("BH") at the beginning of the string.

    So the string (which is a government ID number) should look something like this: BH-03-012-015-03136300/1684. The length of the last number after the "/" is not necessarily 4 though, so I can't just use a substring function where I only keep the first 27 characters for example.

    Sometimes, because of a data scraping issue, there is some other text concatenated after the ID number, and this text is of varying lengths. I basically want to remove this text at the end.

    What I am trying to do is this:

    Code:
    charlist var
    return list
    local character `r(sepchars)'

    local i = 1

    foreach ch of local character {
        if `i' > 14 { // position 14 is where the letters start
            replace var = subinstr(var, "`ch'", "", .)
        }
        local ++i
    }
    The problem is that this removes all the letters, including the legitimate "BH" at the beginning. I want to know whether I can use subinstr to remove only the third and later instances of a character.

    Any help is appreciated!

    P.S. I know I could just run the code as it is and prepend "BH" again afterwards, but it would be good to know in general whether there is something like subinstr for cases where you don't want to replace the first n instances of something.
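
    On the general question: as far as I know, the fourth argument of subinstr() only limits replacement to the first n instances, and there is no built-in way to skip them. One possible workaround, sketched below, is to protect a leading part of the string by position rather than by instance count: keep the first two characters untouched and apply subinstr() only to the remainder. The example data and the variable name clean are made up for illustration.

    ```stata
    clear
    input str40 id
    "BH-03-012-015-03136300/1684"
    "BH-03-012-015-03136300/3364BHK"
    end

    * Keep characters 1-2 as they are; remove "B" and "H" only from position 3 onward
    generate str40 clean = substr(id, 1, 2) + ///
        subinstr(subinstr(substr(id, 3, .), "B", "", .), "H", "", .)

    list
    ```

    This assumes the legitimate instances always sit in a fixed leading position. If they do not, the position of the n-th instance would have to be located first, for example with repeated calls to strpos().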

  • #2
    You can try regular expressions.

    Code:
    * Example generated by -dataex-. For more info, type help dataex
    clear
    input strL id
    "BH-03-012-015-03136300/1684"      
    "BH-03-012-015-03136300/222250hdP1"
    "BH-03-012-015-03136300/3364KM"    
    end
    
    replace id= ustrregexra(id, "(.*/\d+)([^\d]{1})(\w+)", "$1")
    Result:

    Code:
    . l
    
         +-------------------------------+
         |                            id |
         |-------------------------------|
      1. |   BH-03-012-015-03136300/1684 |
      2. | BH-03-012-015-03136300/222250 |
      3. |   BH-03-012-015-03136300/3364 |
         +-------------------------------+
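
    A caveat on the pattern above: it appears to need at least two trailing characters (one non-digit followed by one or more word characters), and it stops at the first character that is not a word character, so a single stray letter or a suffix containing spaces or hyphens may not be removed completely. A somewhat more permissive variant, offered as a sketch rather than a definitive fix, keeps the slash and the digits that follow it and drops whatever comes after. The extra test rows and the variable name clean are made up for illustration.

    ```stata
    clear
    input str40 id
    "BH-03-012-015-03136300/1684"
    "BH-03-012-015-03136300/1684X"
    "BH-03-012-015-03136300/3364 KM-7"
    end

    * Match "/" plus the digits after it, then everything up to the end of the string;
    * put back only the captured "/digits" part
    generate str40 clean = ustrregexra(id, "(/\d+).*", "$1")

    list
    ```

    Neither pattern can separate the ID from scraped text that itself begins with a digit, since those digits are indistinguishable from the number after the slash.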
    Last edited by Andrew Musau; 09 Dec 2021, 08:45.
