I'm using Stata 17 MP on a hosted application (Windows environment), though I'm running test code on the example below in Stata 15 (Windows 10). I've consulted the manuals and other posts but I'm still stuck on something that seems fairly basic.
I'm trying to clean up an error that occurred when a string variable containing commas (var1 = "Fred, Sarah, & Abdul Inc") was split across one to three additional variables during parsing (var1 = "Fred"; var2 = "Sarah"; var3 = "& Abdul Inc"), pushing the rest of that row out by up to three extra columns.
My actual data set contains 1.5 million records and 28 variables. Here is a small example of the problem:
I tried two slightly different approaches, both involving tokenizing the variable list.
APPROACH 1
The errors received were "1 invalid name" and "if not found". OK, I get the first one. I then tried all manner of single parentheses to trigger an evaluation of (i + 1), without success.
APPROACH 2
The error received was "var1 not found". I then tried slightly different syntax of ` and ' without success.
Any suggestions? Kind regards and thank you in advance for your help.
Code:
* Example generated by -dataex-. To install: ssc install dataex
clear
input float visit str20(specialty location address condition v4)
20 "Internal Medicine" "Main Clinic" "245 Oak St" "Diabetes" ""
21 "Pulmonology" "Hospital B" "8211 Peabody St" "COPD" ""
22 "Dermatology" "Cosmetic" "Clinic 2" "3588 King St" "Dermabrasion"
23 "Family Medicine" "Clinic 2" "3588 King St" "Sinus Congestion" ""
24 "Dermatology" "Cosmetic" "Clinic 2" "3588 King St" "Scar Removal"
end
APPROACH 1
Code:
local offset_vars address condition v4
tokenize `offset_vars'
forval i=1/2 {
    replace word `i' = "`i' + 1" if location == "Cosmetic"
}
replace specialty = "Dermatology_Cosmetic" if specialty == "Dermatology"
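For what it's worth, here is a minimal sketch of how the tokenize idea could be made to run, assuming the -dataex- example is already in memory. After tokenize, the variable names sit in the numbered macros `1', `2', and so on, so the i-th name is written with doubled quotes and the next index is computed into its own local first. Adding location to the list, the helper variable shifted, and reusing the "_" separator from the last replace line are my assumptions about the intended result, not something confirmed by the data.

```stata
* Assumes the -dataex- example data are already in memory.
* Flag the displaced rows first, because location itself is overwritten below
* ("shifted" is a hypothetical helper name).
generate byte shifted = location == "Cosmetic"

* Rejoin the two pieces of the split string (the "_" separator is an assumption).
replace specialty = specialty + "_" + location if shifted

* Shift each column one position to the left.
local offset_vars location address condition v4
tokenize `offset_vars'
forvalues i = 1/3 {
    local j = `i' + 1                    // evaluate i + 1 before using it as a macro name
    replace ``i'' = ``j'' if shifted     // ``i'' expands to the i-th variable name
}

* The last column no longer holds real data on the repaired rows.
replace v4 = "" if shifted
drop shifted
```

The flag matters because the condition location == "Cosmetic" stops being true as soon as location is replaced on the first pass through the loop.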
APPROACH 2
Code:
local offset_vars address condition v4
set trace on
forval i=1/2 {
    local var1 `: word `i' of `offset_vars''
    di var1
    local var2 `: word ``i' + 1' of `offset_vars''
    di var2
    replace `var1' = `var2' if location == "Cosmetic"
}
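As far as I can tell, "var1 not found" comes from di var1, which asks Stata to display a variable named var1 rather than the contents of the local macro; and ``i' + 1' appears to look up a macro literally named "1 + 1" instead of computing 2. A hedged sketch of the same idea with the word-of extended macro function, assuming the -dataex- example is in memory, might look like the following. The inline `=exp' expansion does the arithmetic; including location in the list and the helper variable shifted are assumptions on my part.

```stata
* Assumes the -dataex- example data are already in memory.
* "shifted" is a hypothetical helper flag, set before location is overwritten.
generate byte shifted = location == "Cosmetic"
replace specialty = specialty + "_" + location if shifted

local offset_vars location address condition v4
forvalues i = 1/3 {
    local var1 : word `i' of `offset_vars'
    local var2 : word `=`i'+1' of `offset_vars'   // `=exp' evaluates i + 1 inline
    display "`var1' <- `var2'"                    // macro contents need `' quotes
    replace `var1' = `var2' if shifted
}

replace v4 = "" if shifted
drop shifted
```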