
  • Grouping of countries by continent

    I have deforestation data for 265 countries and want to analyse them separately by continent. I usually do the sorting manually, which takes too much time, so I would like to know whether Stata can help with this:


    Code:
    * Example generated by -dataex-. To install: ssc install dataex
    clear
    input str52 country str11(year1990 year2000 year2015)
    "Country Name"                                  "1990"        "2000"        "2015"      
    "Aruba"                                         "2,333333333" "2,333333333" "2,333333333"
    "Afghanistan"                                   "2,067824648" "2,067824648" "2,067824648"
    "Angola"                                        "48,90992219" "47,90887944" "46,40731531"
    "Albania"                                       "28,78832117" "28,07664234" "28,15693431"
    "Andorra"                                       "34,04255319" "34,04255319" "34,04255319"
    "Arab World"                                    "3,684247883" "3,514498043" "2,802680075"
    "United Arab Emirates"                          "2,93062201"  "3,708133971" "3,858851675"
    "Argentina"                                     "12,7135335"  "11,64180086" "9,906858285"
    "Armenia"                                       "11,76677204" "11,69652266" "11,66139796"
    "American Samoa"                                "91,95"       "90,25"       "87,7"      
    "Antigua and Barbuda"                           "23,40909091" "22,72727273" "22,27272727"
    "Australia"                                     "16,73209846" "16,77114927" "16,23875662"
    "Austria"                                       "45,72535723" "46,47614434" "46,88389903"
    "Azerbaijan"                                    "10,23829266" "10,55384057" "13,78367589"
    "Burundi"                                       "11,25389408" "7,710280374" "10,74766355"
    "Belgium"                                       ""            "22,03764861" "22,56935271"
    "Benin"                                         "51,09081234" "44,88293721" "38,23164243"
    "Burkina Faso"                                  "25,0255848"  "22,83625731" "19,55409357"
    "Bangladesh"                                    "11,47729892" "11,27756011" "10,97795191"
    "Bulgaria"                                      "30,07321703" "30,50709572" "35,21554901"
    "Bahrain"                                       "0,31884058"  "0,521126761" "0,778210117"
    "Bahamas, The"                                  "51,44855145" "51,44855145" "51,44855145"
    "Bosnia and Herzegovina"                        "43,33333333" "42,67578125" "42,67578125"
    "Belarus"                                       "38,35535397" "40,78584106" "42,54842048"
    "Belize"                                        "70,84743534" "63,97632617" "59,89916703"
    "Bermuda"                                       "20"          "20"          "20"        
    "Bolivia"                                       "57,96639897" "55,47032216" "50,55294009"
    "Brazil"                                        "65,40988785" "62,36722524" "59,04878358"
    "Barbados"                                      "14,65116279" "14,65116279" "14,65116279"
    "Brunei Darussalam"                             "78,36812144" "75,33206831" "72,10626186"
    "Bhutan"                                        "53,65045053" "65,47738693" "72,27562505"
    "Botswana"                                      "24,20552997" "22,11811621" "19,12727401"
    "Central African Republic"                      "36,21304055" "35,96263122" "35,58701724"
    "Canada"                                        "38,29907264" "38,24727745" "38,16667052"
    "Central Europe and the Baltics"                "31,94439667" "32,4878545"  "34,17912313"
    "Switzerland"                                   "29,11786283" "30,20949297" "31,73398117"
    "Channel Islands"                               "4,210526316" "4,210526316" "4,210526316"
    "Chile"                                         "20,52769753" "21,29565372" "23,8523695"
    "China"                                         "16,73800762" "18,85346743" "22,18966958"
    "Cote d'Ivoire"                                 "32,14465409" "32,47798742" "32,70754717"
    "Cameroon"                                      "51,43957183" "46,78555563" "39,80453132"
    "Congo, Dem. Rep."                              "70,73641958" "69,36282835" "67,3024415"
    "Congo, Rep."                                   "66,54758419" "66,04978038" "65,39970717"
    "Colombia"                                      "58,05948626" "55,69936007" "52,72802163"
    "Comoros"                                       "26,32993015" "24,18054809" "19,88178399"
    "Cabo Verde"                                    "14,33002481" "20,36972705" "22,30769231"
    "Costa Rica"                                    "50,21543282" "46,53349001" "53,97571485"
    "Caribbean small states"                        "86,32263801" "85,7122885"  "85,07768309"
    "Cuba"                                          "19,16201117" "22,67225326" "30,76331475"
    "Curacao"                                       ""            ""            ""          
    "Cayman Islands"                                "52,91666667" "52,91666667" "52,91666667"
    "Cyprus"                                        "17,43614719" "18,57251082" "18,69047619"
    "Czech Republic"                                "34,02355377" "34,12708684" "34,53768454"
    "Germany"                                       "32,36616733" "32,53761284" "32,73242198"
    "Djibouti"                                      "0,241587575" "0,241587575" "0,241587575"
    "Dominica"                                      "66,66666667" "63,10666667" "57,77333333"
    "Denmark"                                       "12,81434301" "13,79919868" "14,57966182"
    "Dominican Republic"                            "22,87311116" "30,75967709" "41,04740219"
    "Algeria"                                       "0,69990847"  "0,662960693" "0,821248331"
    "East Asia & Pacific (excluding high income)"   "28,69120083" "28,51314926" "29,77071544"
    "Early-demographic dividend"                    "24,46613032" "23,05415701" "21,92542137"
    "East Asia & Pacific"                           "25,77104848" "25,68218425" "26,33562637"
    "Europe & Central Asia (excluding high income)" "38,37982277" "38,47491067" "38,89798739"
    "Europe & Central Asia"                         "37,27149582" "37,48816515" "38,03961386"
    "Ecuador"                                       "52,84947984" "55,27830569" "50,52295056"
    "Egypt, Arab Rep."                              "0,044201115" "0,059269677" "0,073333668"
    "Euro area"                                     "34,37000573" "36,52082712" "38,26298262"
    "Eritrea"                                       ""            "15,6039604"  "14,95049505"
    "Spain"                                         "27,64994794" "34,02192385" "36,81850894"
    "Estonia"                                       "52,04057561" "52,91342298" "52,65392781"
    "Ethiopia"                                      "15,19981835" "13,705"      "12,499"    
    "European Union"                                "34,96133281" "36,47796407" "38,0092625"
    "Fragile and conflict affected situations"      "27,77057418" "26,54201991" "24,35091655"
    "Finland"                                       "71,81785351" "73,68922158" "73,10716989"
    "Fiji"                                          "52,15654078" "53,66392994" "55,67597154"
    "France"                                        "26,36394517" "27,92169725" "31,02690679"
    "Faroe Islands"                                 "0,05730659"  "0,05730659"  "0,05730659"
    "Micronesia, Fed. Sts."                         ""            "91,22857143" "91,81428571"
    "Gabon"                                         "85,38052548" "85,38052548" "89,26145845"
    "United Kingdom"                                "11,48266027" "12,21014343" "12,99549456"
    "Georgia"                                       "39,60282055" "39,72657936" "40,61591596"
    "Ghana"                                         "37,91421289" "39,15355542" "41,03454338"
    "Gibraltar"                                     "0"           "0"           "0"          
    "Guinea"                                        "29,56210321" "28,097021"   "25,89939769"
    "Gambia, The"                                   "43,67588933" "45,55335968" "48,22134387"
    "Guinea-Bissau"                                 "78,80512091" "75,39118065" "70,12802276"
    "Equatorial Guinea"                             "66,31016043" "62,13903743" "55,90017825"
    "Greece"                                        "25,59348332" "27,93638479" "31,45073701"
    "Grenada"                                       "49,97058824" "49,97058824" "49,97058824"
    "Greenland"                                     "0,00064384"  "0,000535997" "0,000535997"
    "Guatemala"                                     "44,30757745" "39,26838373" "33,03471445"
    "Guam"                                          "46,2962963"  "46,2962963"  "46,2962963"
    "Guyana"                                        "84,63296927" "84,43992888" "83,9522479"
    "High income"                                   "27,30069335" "27,41498893" "27,55014229"
    "Hong Kong SAR, China"                          ""            ""            ""          
    "Honduras"                                      "72,71427295" "57,12753597" "41,04030744"
    "Heavily indebted poor countries (HIPC)"        "31,87450722" "30,40744168" "27,94081901"
    "Croatia"                                       "33,08889286" "33,71489894" "34,3459614"
    "Haiti"                                         "4,208998549" "3,955007257" "3,519593614"
    end

  • #2
    Since you apparently know what countries are in what continents from doing it by hand, what you need to do is build a "crosswalk" file with two variables: country and continent, and then merge this file to your data by country, so that the continent variable will be added to your data.
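
    A minimal sketch of the crosswalk idea, assuming the main data are saved as master_data.dta (a hypothetical file name) and that the crosswalk is typed in by hand. Only four rows are shown, and crosswalk.dta is likewise an illustrative name:

    ```stata
    * Hypothetical crosswalk: one row per country, entered by hand
    clear
    input str52 country str13 continent
    "Aruba"       "North America"
    "Afghanistan" "Asia"
    "Angola"      "Africa"
    "Albania"     "Europe"
    end
    save crosswalk.dta, replace

    * Attach continent to the main data (master_data.dta is an assumed name)
    use master_data.dta, clear
    merge m:1 country using crosswalk.dta, keep(master match)

    * Countries with no crosswalk entry show up with a missing continent
    tabulate continent, missing
    ```

    Using m:1 rather than 1:1 means the same crosswalk should still work if the main data are later reshaped to one row per country and year.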



    • #3
      The short answer is "Yes, Stata can do it, but you need to teach Stata how to do it".

      1. The country names resemble those used in World Bank Open Data API queries, which report the region each country belongs to (for example, Ukraine). You will need to parse this information out of the XML response. A JSON response is also available, though for something this simple there is hardly any difference.

      Regions are defined here.

      2. Many of the names are not really countries, e.g.
         Heavily indebted poor countries (HIPC)
         High income
         Fragile and conflict affected situations

      3. If you are doing a study using the WB data, perhaps it is easier to use the API to pull out the data you need, rather than fight with file format conversions.
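
      The non-country rows mentioned in point 2 could be flagged and dropped before any grouping. This is only a sketch: the variable name is_aggregate is made up for illustration, and the list covers just the three aggregates named above, not every aggregate in the data.

      ```stata
      * The header row in #1 was read in as data, so remove it first
      drop if country == "Country Name"

      * Flag the aggregates named above (illustrative list, not complete)
      generate byte is_aggregate = inlist(country, ///
          "Heavily indebted poor countries (HIPC)", ///
          "High income", ///
          "Fragile and conflict affected situations")

      * Inspect what was flagged before dropping anything
      list country if is_aggregate
      drop if is_aggregate
      ```

      The /// line continuations work in a do-file. When typing interactively, the generate command needs to go on one line.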

      Best, Sergiy



      • #4
        You may try the user-written kountry command and experiment with its different geo() options, for example:

        Code:
        net install dm0038_1
        kountry country, from(other) geo(marc)
        tab GEO



        • #5
          To follow up on William Lisowski's suggestion in #2, I would look for a list of countries and continents (like this Excel spreadsheet here), import it into Stata, and then merge it into your master data listed in #1. You'll have to do some cleaning so that the names match (in #1 it's "Bahamas, The"; in their data it's just "Bahamas"). The linked spreadsheet has 230 obs, so it will get you most of the way there. Obviously, it won't match some of the region names you have in your data ("Central Europe and the Baltics", "East Asia & Pacific (excluding high income)", etc.), so you will have to decide on those manually.

          Code:
          * Data imported from that spreadsheet
          rename countryenglish country
          rename continentname continent
          sort country
          
          . list in 1/15, noobs
          
            +-------------------------------------+
            |             country       continent |
            |-------------------------------------|
            |         Afghanistan            Asia |
            |             Albania          Europe |
            |             Algeria          Africa |
            |      American Samoa         Oceania |
            |             Andorra          Europe |
            |-------------------------------------|
            |              Angola          Africa |
            |            Anguilla   North America |
            | Antigua and Barbuda   North America |
            |           Argentina   South America |
            |             Armenia            Asia |
            |-------------------------------------|
            |               Aruba   North America |
            |           Australia         Oceania |
            |             Austria          Europe |
            |          Azerbaijan            Asia |
            |             Bahamas   North America |
            +-------------------------------------+
          
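          * Save the crosswalk as country_continent.dta
          save country_continent.dta, replace
          
          * Merge continent onto the master data; unmatched master rows are kept
          use master_data.dta, clear
          merge 1:1 country using country_continent.dta, nonotes keepusing(continent) keep(match master) gen(merge_continent)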
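
          The name cleaning described above might look roughly like the following. It is a sketch that assumes the file names used in this post (master_data.dta, country_continent.dta) and fixes only the one mismatch mentioned in the text. Other spellings would need their own replace lines.

          ```stata
          use master_data.dta, clear

          * Harmonise spellings that differ between the two files
          replace country = "Bahamas" if country == "Bahamas, The"

          * One row per country in both files, so a 1:1 merge on country
          merge 1:1 country using country_continent.dta, keepusing(continent) keep(match master) gen(merge_continent)

          * merge_continent == 1 marks master rows with no match:
          * regional aggregates, or names that still need harmonising
          list country if merge_continent == 1
          ```

          Listing the unmatched names after each run gives a short to-do list: either add another replace line for a spelling difference or decide by hand where an aggregate belongs.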




          • #6
            David Benson, I only recently found your reply in this thread. I would like to ask a follow-up question based on your solution:
            1. Would you be able to show the syntax for importing the specific country and continent data into Stata? Do I need to import the entire spreadsheet you shared here, or is there a way to import only the "country summary information" sheet and only the country and continent columns?
            Similar to the original question posed in #1, I have a dataset with a very long list of countries (indicating where participants were born), and I am trying to figure out how to group them into continents. My Google search brought me here.

            Thank you so much for your time!
            Irina

            P.S. I followed the link in the spreadsheet you uploaded and found an updated version of the country-level information.
            Last edited by Irina Chukhray; 02 Nov 2020, 21:38.
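
            One way the import asked about in #6 might be done is with import excel and its sheet() option, keeping only the two columns afterwards. This is a sketch under assumptions: the file name country_list.xlsx is hypothetical, the sheet name is taken from the wording in #6 and may not match the workbook exactly, and the column names are assumed to come in as countryenglish and continentname, as in the rename lines in #5.

            ```stata
            * List the sheets in the workbook and their cell ranges
            * (country_list.xlsx is a hypothetical file name)
            import excel using "country_list.xlsx", describe

            * Read a single sheet; firstrow takes variable names from row 1,
            * case(lower) converts those names to lowercase
            import excel using "country_list.xlsx", sheet("country summary information") firstrow case(lower) clear

            * Keep only the two columns needed for the crosswalk
            keep countryenglish continentname
            rename countryenglish country
            rename continentname continent

            save country_continent.dta, replace
            ```

            The describe call is worth running first, since it prints the exact sheet names to pass to sheet(). import excel also has a cellrange() option for reading only part of a sheet, but dropping unwanted columns with keep after the import avoids having to know the cell addresses.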

