Hi, I am using a SIPRI dataset for a time-series analysis covering many countries (100+) and many years (~60). I am also using a few other datasets, and to merge them I need a consistent country identifier: either a country code (numeric or alphabetic) or a standardized country name. SIPRI's dataset has no country code. It does have country names, but many entries contain extra characters such as parentheses and asterisks. I've included three examples below.
I'm familiar with the command "kountry", but even with its stuck option I lose observations for many countries. Does anyone know which naming scheme SIPRI uses for its name variable, or whether they came up with their own? Otherwise, is there a command that can efficiently remove various special characters from a string variable across many observations?
Any help is greatly appreciated!
Code:
* Example generated by -dataex-. For more info, type help dataex
clear
input str37 country_name int year
"ANC (South Africa)*" 1950
"ANC (South Africa)*" 1951
"ANC (South Africa)*" 1952
"ANC (South Africa)*" 1953
"ANC (South Africa)*" 1954
"ANC (South Africa)*" 1955
"ANC (South Africa)*" 1956
"ANC (South Africa)*" 1957
"ANC (South Africa)*" 1958
"ANC (South Africa)*" 1959
end
Code:
* Example generated by -dataex-. For more info, type help dataex
clear
input str37 country_name int year
"African Union**" 1950
"African Union**" 1951
"African Union**" 1952
"African Union**" 1953
"African Union**" 1954
"African Union**" 1955
"African Union**" 1956
"African Union**" 1957
"African Union**" 1958
"African Union**" 1959
end
Code:
* Example generated by -dataex-. For more info, type help dataex
clear
input str37 country_name int year
"Amal (Lebanon)*" 1950
"Amal (Lebanon)*" 1951
"Amal (Lebanon)*" 1952
"Amal (Lebanon)*" 1953
"Amal (Lebanon)*" 1954
"Amal (Lebanon)*" 1955
"Amal (Lebanon)*" 1956
"Amal (Lebanon)*" 1957
"Amal (Lebanon)*" 1958
"Amal (Lebanon)*" 1959
end