Announcement

Collapse
No announcement yet.
X
  • Filter
  • Time
  • Show
Clear All
new posts

  • Create a Dummy indicating country pair has appeared previously in different variable

    Code:
    * Example generated by -dataex-. For more info, type help dataex
    clear
    input float(Year country_pair2 BIT_pair)
    1993 4447  1
    1999 2507  2
    2008  583  3
    1994 6339  4
    1993 2171  5
    2010 6813  6
    1994 6917  7
    1995 2198  8
    1993 2198  9
    1997 6687 10
    1995 2196 11
    1991 5720 12
    1991 4500 13
    1996 2546 14
    2002 2783 15
    1996  555 16
    1991 2198 17
    2003 2198 18
    2007 6114 19
    2007 1431 20
    end
    label values country_pair2 country_pair2
    label def country_pair2 555 "Belgium Germany", modify
    label def country_pair2 583 "Belgium Netherlands", modify
    label def country_pair2 1431 "Colombia Colombia", modify
    label def country_pair2 2171 "Germany Brazil", modify
    label def country_pair2 2196 "Germany France", modify
    label def country_pair2 2198 "Germany Germany", modify
    label def country_pair2 2507 "Hong Kong Australia", modify
    label def country_pair2 2546 "Hong Kong Hong Kong", modify
    label def country_pair2 2783 "India India", modify
    label def country_pair2 4447 "Netherlands France", modify
    label def country_pair2 4500 "Netherlands Netherlands", modify
    label def country_pair2 5720 "South Africa South Africa", modify
    label def country_pair2 6114 "Sweden Poland", modify
    label def country_pair2 6339 "Taiwan United States", modify
    label def country_pair2 6687 "United Kingdom India", modify
    label def country_pair2 6813 "United States Australia", modify
    label def country_pair2 6917 "United States Mexico", modify
    label values BIT_pair BIT_pair
    label def BIT_pair 1 "Albania  Austria", modify
    label def BIT_pair 2 "Albania  BLEU (Belgium-Luxembourg Economic Union)", modify
    label def BIT_pair 3 "Albania  Bosnia and Herzegovina", modify
    label def BIT_pair 4 "Albania  Bulgaria", modify
    label def BIT_pair 5 "Albania  China", modify
    label def BIT_pair 6 "Albania  Cyprus", modify
    label def BIT_pair 7 "Albania  Czechia", modify
    label def BIT_pair 8 "Albania  Denmark", modify
    label def BIT_pair 9 "Albania  Egypt", modify
    label def BIT_pair 10 "Albania  Finland", modify
    label def BIT_pair 11 "Albania  France", modify
    label def BIT_pair 12 "Albania  Germany", modify
    label def BIT_pair 13 "Albania  Greece", modify
    label def BIT_pair 14 "Albania  Hungary", modify
    label def BIT_pair 15 "Albania  Iran", modify
    label def BIT_pair 16 "Albania  Israel", modify
    label def BIT_pair 17 "Albania  Italy", modify
    label def BIT_pair 18 "Albania  Korea", modify
    label def BIT_pair 19 "Albania  Kuwait", modify
    label def BIT_pair 20 "Albania  Lithuania", modify
    I'm trying to create a dummy = 1 if a country pair has appeared at a previous date in a separate variable list (indicating a treaty signing), and 0 otherwise.
    Any help is greatly appreciated, and apologies if the dataex is not provided very well.

  • #2
    The -dataex- you provided is fine. And thank you for using it on your very first post.

    But you didn't provide one for the separate data set containing the treaty signing information. And without that, I don't think anyone can advise you.

    Added: Also, I now notice that the country pairs in variable country_pair2 differ from the country pairs in BIT_pair. So is it the country_pair2 variable you want to check against the treaties, or the BIT_pair variable?
    Last edited by Clyde Schechter; 27 Mar 2022, 22:24.

    Comment


    • #3
      Thanks for the reply Clyde, I am trying to check country_pair2 data against BIT_pair, however I have hundreds of thousands of observations so I didn't want to provide a huge dataset

      Comment


      • #4
        I have matched the treaty data and M&A data by year and need to create a dummy = 1 if the country pairs in the M&A variable ("country_pair2") have appeared in any row above in the BIT_pair variable column

        Comment


        • #5
          I believe this will do it:
          Code:
          isid country_pair2 Year, sort
          decode country_pair2, gen(link)
          decode BIT_pair, gen(bit)
          
          preserve
          keep Year bit
          rename bit link
          tempfile treaties
          save `treaties'
          restore
          
          rangejoin Year . -1 using `treaties', by(link)
          by country_pair2 Year, sort: egen wanted = count(Year_U)
          replace wanted = min(wanted, 1)
          by country_pair2 Year: keep if _n == 1
          drop Year_U
          Notes:
          1. -rangejoin- is by Robert Picard and is available from SSC. To use it, you must also install -rangestat- by Robert Picard, Nick Cox, and Roberto Ferrer, also available from SSC.
          2. This code depends crucially on consistent spelling out of country names between the country_pair2 and BIT_pair variables. If one of them says South Africa and the other says S. Africa or Republic of South Africa, the match will not be recognizable to Stata. Similarly, even differences in upper vs lower case or punctuation can be problems. The fact that there is Belgium in one variable and BLEU in the other makes me nervous about this (admittedly Belgium and BLEU are not quite the same thing). There is no way to code around this in Stata, as Stata has no semantic capabilities to recognize alternate names for the same country. So you may have to do a fair amount of observation-by-observation -replace- work to make this viable.
          3. In your example data, there are no instances of a country pair appearing in a BIT before it appears in country_pair2. So wanted comes out 0 in every observation. But this code will, I believe, work appropriately in the full data set (assuming my concerns in 2 are either unwarranted or dealt with appropriately.)

          Comment

          Working...
          X