
  • Quick way to replace multiple variables

    Dear all,

    I am dealing with mismatched country codes. One dataset (the Global Terrorism Database, GTD) identifies countries by number, while the other (World Development Indicators, WDI) uses letter codes, e.g., AFG. I have encoded the WDI country codes and am now changing them, country by country, to match the GTD's numbers. My do-file currently looks like the following, where "ctry" is the encoded WDI country-code variable and "country" is the new GTD country-number variable.
    Code:
    replace country=4 if ctry==3
    replace country=5 if ctry==6
    replace country=6 if ctry==61
    replace country=8 if ctry==5
    replace country=10 if ctry==13
    replace country=11 if ctry==10
    replace country=12 if ctry==11
    replace country=17 if ctry==24
    replace country=18 if ctry==23
    replace country=19 if ctry==21
    replace country=20 if ctry==31
    replace country=35 if ctry==26
    replace country=21 if ctry==18
    replace country=22 if ctry==27
    replace country=23 if ctry==19
    replace country=24 if ctry==28
    replace country=25 if ctry==33
    replace country=26 if ctry==29
    replace country=28 if ctry==25
    replace country=29 if ctry==34
    replace country=31 if ctry==32
    replace country=32 if ctry==22
    replace country=33 if ctry==20
    replace country=34 if ctry==17
    replace country=36 if ctry==124
    replace country=37 if ctry==43
    replace country=38 if ctry==36
    replace country=41 if ctry==35
    This repeats for 190 lines. Is there a more concise way to do this? I understand if there is not, since each pair is unique: no two countries share a number in either variable.

    Many thanks!

  • #2
    You could write a loop that is shorter, but you would end up with code that was much harder to read and much easier to get wrong.

    Sometimes code is just messy, especially in do-files.

    • #3
      There is always -recode-. The traditional solution to this problem is to make a 190-line dataset with the old and new values, and then merge it on the old values. This is clear and concise, and reasonably efficient. See http://back.nber.org/stata/efficient/recode.html for how to use -matrix define- to create a very fast and compact lookup table. It is harder to read and easier to get wrong, but if the conversion happens often enough it can be worthwhile.
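
      A minimal sketch of the first two suggestions, using the first four pairs from #1. These are alternatives, so run one or the other. Both assume "country" does not exist yet (so it is created rather than replaced); the "(else=.)" rule and the tempfile name "crosswalk" are illustrative choices, not part of the original do-file.

      Code:
      * Option 1: recode, one rule per pair; codes without a rule become missing
      recode ctry (3=4) (6=5) (61=6) (5=8) (else=.), generate(country)

      Code:
      * Option 2: build a small crosswalk dataset, then merge it on the old code
      preserve
      clear
      input ctry country
      3 4
      6 5
      61 6
      5 8
      end
      tempfile crosswalk
      save `crosswalk'
      restore
      * Each observation picks up the GTD number that matches its ctry value
      merge m:1 ctry using `crosswalk', keep(master match)
      * Observations with no crosswalk entry show up as "master only"
      tabulate _merge

      With the merge, the full list of 190 pairs lives in one place as data, and the -tabulate- step shows whether any country was left without a match.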

      • #4
        Ah yes I see how I could do that - thanks! I think I may stick with what I have because I can definitely see how the code would be harder to read and potentially wrong.

        • #5
          feenberg - I don't believe the countries are necessarily in the same order. E.g., "South Korea" versus "Korea, Republic of" puts them in different parts of the list, and thus the recode may not work as easily. If there is a way to get around this, though, please let me know!

          • #6
            The "destination" list does not have to be in order, it can jump around.

            Code:
            define matrix x=(2,1)
            gen y=x[n]
            is equivalent to

            Code:
            gen y=2 if n==1
            replace y=1 if n==2
            You just need to have an array large enough to encompass the largest value of ctry that appears in the data.

            • #7
              Detail on #6.

              Stata matrices have rows and columns, even if there is only one of either. In Stata you would need to go

              Code:
               
               matrix x = (2,1)
               gen y = x[1, n]
              Mata allows single subscripts with row or column vectors, but Stata doesn't.

              NB My personal aversion to recode surfaced in not remembering it at all.
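
              Putting #6 and this detail together, here is a sketch of the lookup for the first four pairs in #1. It assumes "country" does not exist yet, and the matrix name "lookup" is only illustrative. Column j holds the new code for old code j, so the matrix needs at least as many columns as the largest value of ctry used.

              Code:
              * One row, 61 columns, all missing to start with
              matrix lookup = J(1, 61, .)
              * Fill in the new code at the column numbered by the old code
              matrix lookup[1, 3] = 4
              matrix lookup[1, 6] = 5
              matrix lookup[1, 61] = 6
              matrix lookup[1, 5] = 8
              * Look up each observation's new code by its ctry value
              generate country = lookup[1, ctry]

              Values of ctry whose column was left at missing yield a missing country. The full list in #1 includes ctry==124, so the real matrix would need at least 124 columns.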

              • #8
                Yes, my bad. Hard to keep different languages straight.
