Hello all, I hope everyone is doing well. I have a dataset with two address variables (add1 and add2), and my intention is to remove the space between 132 and F in the 1st record (see the example data with these 'problem' addresses below), between 1 and A in the 2nd record, and so on.
I flag the affected addresses with the gen flag command shown under the second "Code:" heading below.
Ideally, I would like to replace the portions of add1 and add2 matched by the flag's regex with text of the form "[0-9][A-Z][ ]+", that is, the same digit and letter without the space between them. In my case there is no fixed value to substitute, because the digit and the letter vary from record to record. The standard replace command (as expected) does not work here, and I suspect it has to be done with a loop. I have tried to pinpoint the positions of the unwanted spaces to use as a starting point, but without success so far. I apologise in advance if my search of Statalist has not been exhaustive and someone has already asked this question elsewhere. If anyone could point me in a direction to solve this, it would be great! Thank you very much in advance.
Best, Arpita
Code:
* Example generated by -dataex-. To install: ssc install dataex
clear
input str102(add1 add2)
"132 F LARCHMONT ROAD" "132 F LARCHMONT ROAD"
"FLAT 1 A 1 BRAMLEY ROAD" "FLAT 1 A 1 BRAMLEY ROAD"
"FLAT 1 A 7 WOODLAND AVENUE" "FLAT 1 A 7 WOODLAND AVENUE"
"FLAT 1 A COPTHALL HOUSE GLOUCESTER CRESCENT" "FLAT 1 A COPTHALL HOUSE GLOUCESTER CRESCENT"
"FLAT 1 A HOOD COURT NORTH STREET" "FLAT 1 A HOOD COURT NORTH STREET"
end
Code:
gen flag=1 if regexm(add1, "[0-9][ ][A-Z][ ]+")
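One possible direction, offered only as a minimal sketch: it assumes Stata 14 or later, where the Unicode regex function ustrregexra() is available, and it writes the result to new variables (add1_fix and add2_fix, names made up for this illustration) so the originals stay untouched. The pattern mirrors the flag's regex, using a lookbehind and a lookahead so that only the space between the digit and the single letter is removed, and no loop over character positions is needed.

```stata
* Sketch only: assumes Stata 14+ (ustrregexra() is not available in older versions).
* add1_fix and add2_fix are hypothetical variable names used for illustration.
foreach v of varlist add1 add2 {
    * Delete a space that is preceded by a digit and followed by
    * one capital letter and another space, e.g. "132 F " -> "132F "
    gen `v'_fix = ustrregexra(`v', "(?<=[0-9]) (?=[A-Z] )", "")
}

* Inspect the result against the original values
list add1 add1_fix, clean
```

With the example data, "132 F LARCHMONT ROAD" should become "132F LARCHMONT ROAD" and "FLAT 1 A 1 BRAMLEY ROAD" should become "FLAT 1A 1 BRAMLEY ROAD". Like the flag's regex, this does not touch a single letter that ends the string with no space after it; that case would need a different lookahead.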