I have a long string variable that contains runs of question marks of varying lengths, like this:
Code:
some words???? words words?another word no????spaces more question marks?? yes? why not????????
Without knowing the length of the longest run of question marks in any row, is it possible to strip out all such runs using regexr?
I know that I can do
Code:
subinstr(text_variable, "?", "", .)
to remove all of the question marks, but that will generate a typo: "words?another" above would become "wordsanother". I want to avoid that.
If possible, it would suffice for my problem to replace every sequence of ?'s with a single question mark:
Code:
replace text_variable = regexr(text_variable, <<identify sequences of \? 2 or longer >>, "?")
and then use regex or subinstr to replace the remaining "?"s with spaces, depending on where they appear relative to a space (to avoid generating the typo above).
I have tried
Code:
replace text_variable = regexr(text_variable, "\?(\?)+\?", "?")
which I thought meant "match one or more of the character \? in between \? and \?", i.e. catching everything like "???" or longer, and
Code:
replace text_variable = regexr(text_variable, "\?([\?])+\?", "?")
which I thought meant "match one or more of the single allowable character \? between \? and \?",
but both of these are defeated by "word???another word", the case where the sequence "???" appears between two words without spaces.
I would love to know what I am misunderstanding about the syntax. Thanks!
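For what it is worth, two things may be at play here: regexr() replaces only the first match in each observation, and both patterns I tried require at least three consecutive question marks, so a run of exactly two is never matched. Below is a minimal sketch of an alternative, assuming Stata 14 or later, where ustrregexra() is available and replaces every match. The one-observation dataset is built from the sample string purely for illustration.

```stata
* Illustrative setup (an assumption): one observation holding the sample string
clear
set obs 1
generate text_variable = "some words???? words words?another word no????spaces more question marks?? yes? why not????????"

* "\?+" matches a run of one or more literal question marks;
* ustrregexra() replaces all such runs, not only the first one
replace text_variable = ustrregexra(text_variable, "\?+", "?")

display text_variable[1]
* some words? words words?another word no?spaces more question marks? yes? why not?
```

Collapsing runs this way leaves "no?spaces" with a single "?" between the words, so the second step (deciding which remaining "?" characters become spaces) would still be needed.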