Hi all,
I am trying to find a specific word in a string. This seems like it should be simple, but after looking through the documentation and prior forum messages on strpos, substr, and regex, I haven't found anything that works for the data I am using. Example below. I am trying to create a new variable "apple" that flags only observations where the variable fruithashtags contains the word "APPLE" (and, as such, excludes "REAPPLE").
strpos doesn't seem to work because it will also match REAPPLE. The regex commands are tricky for me, and it seems like most of the metacharacters (e.g. ^, ., $) require the word to be in a certain spot in the string(?). I feel like I am misunderstanding the regex documentation, so feel free to correct me there. My issue is that the word could show up anywhere in the string, and APPLE and REAPPLE could show up in the same string. I'm wondering if there is a solution that says: search for a word that starts with "AP", or search for the characters "APPLE" and exclude the match if the word is longer than 5 characters? Any help is much appreciated.
Code:
* Example generated by -dataex-. To install: ssc install dataex
clear
input str33 fruithashtags
"GRAPE FRUIT REAPPLE"
"GRAPE FRUIT REAPPLE REBANANA"
"FRUIT REAPPLE REBANANA REKIWI"
"APPLE CANTALOUPE MELON KIWI"
"APPLE BANANA"
"KIWI APPLE GRAPE REAPPLE"
"KIWI FRUIT REAPPLE REBANANA APPLE"
"CANTALOUPE MELON APPLE BANANA"
"REAPPLE"
"APPLE"
end
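For reference, here is a rough sketch of the kind of result I am after, which may or may not be the best way to do it. It assumes the words in fruithashtags are always separated by single spaces, and the second line assumes Stata 14 or later, where the Unicode regex functions are available. The variable names apple and apple_re are just placeholders.

```stata
* Sketch 1: pad the string with spaces so every word, including the first
* and last, is surrounded by spaces, then search for " APPLE ".
* Assumes single spaces are the only word separator.
generate byte apple = strpos(" " + fruithashtags + " ", " APPLE ") > 0

* Sketch 2 (assumes Stata 14+): \b matches a word boundary, so the
* APPLE inside REAPPLE is not matched.
generate byte apple_re = ustrregexm(fruithashtags, "\bAPPLE\b")

* Compare the two indicators side by side.
list fruithashtags apple apple_re, noobs
```

On the example data, both indicators should be 1 for observations 4, 5, 6, 7, 8, and 10 and 0 for the rest, including the string that is just "REAPPLE".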