Announcement

Collapse
No announcement yet.
X
  • Filter
  • Time
  • Show
Clear All
new posts

  • All possible pairwise combinations for specific companies in a specific year and in a specific country.

    Hello, I would like to find all possible pairwise combinations for specific companies in a specific year and in a specific country.
    YEAR COUNTRY AUDIT FIRM CITY
    2012 Belgium EY Brussels
    2012 Belgium EY Antwerp
    2012 Belgium EY Liege
    2013 Belgium KPMG Brussels
    2013 Belgium KPMG Antwerp
    2013 Belgium KPMG Liege
    so that the result is in this format
    YEAR COUNTRY AUDIT FIRM CITY CITY1 CITY2
    2012 Belgium EY Brussels Brussels Antwerp
    2012 Belgium EY Antwerp Antwerp Liege
    2012 Belgium EY Liege Liege Brussels
    2013 Belgium KPMG Brussels Brussels Antwerp
    2013 Belgium KPMG Antwerp Antwerp Liege
    2013 Belgium KPMG Liege Liege Brussels
    You help is appreciated

  • #2
    Code:
    * Example generated by -dataex-. For more info, type help dataex
    clear
    input int year str7 country str4 auditfirm str8 city
    2012 "Belgium" "EY"   "Brussels"
    2012 "Belgium" "EY"   "Antwerp" 
    2012 "Belgium" "EY"   "Liege"   
    2013 "Belgium" "KPMG" "Brussels"
    2013 "Belgium" "KPMG" "Antwerp" 
    2013 "Belgium" "KPMG" "Liege"   
    end
    
    preserve
    rename city city2
    tempfile city
    save `city'
    restore
    joinby year country auditfirm using `city'
    drop if city==city2
    Res.:

    Code:
    . l, sepby(year country auditfirm)
    
         +-------------------------------------------------+
         | year   country   auditf~m       city      city2 |
         |-------------------------------------------------|
      1. | 2012   Belgium         EY   Brussels    Antwerp |
      2. | 2012   Belgium         EY   Brussels      Liege |
      3. | 2012   Belgium         EY    Antwerp   Brussels |
      4. | 2012   Belgium         EY    Antwerp      Liege |
      5. | 2012   Belgium         EY      Liege   Brussels |
      6. | 2012   Belgium         EY      Liege    Antwerp |
         |-------------------------------------------------|
      7. | 2013   Belgium       KPMG   Brussels      Liege |
      8. | 2013   Belgium       KPMG   Brussels    Antwerp |
      9. | 2013   Belgium       KPMG    Antwerp   Brussels |
     10. | 2013   Belgium       KPMG    Antwerp      Liege |
     11. | 2013   Belgium       KPMG      Liege   Brussels |
     12. | 2013   Belgium       KPMG      Liege    Antwerp |
         +-------------------------------------------------+

    Comment


    • #3
      Thank you so much,
      I tried it and worked but still have two things:
      1) if i have a huge dataset, how can i amend the code so that it uses a specific dataset
      2) how can i exclude the repetitive combinations because now i have Brussels to Antwerp and Antwerp to Brussels
      Many thanks for your help

      Comment


      • #4
        2) how can i exclude the repetitive combinations because now i have Brussels to Antwerp and Antwerp to Brussels
        Change the last line of the code to

        Code:
        drop if city>=city2
        Res.:

        Code:
         drop if city>=city2
        (12 observations deleted)
        
        . l, sepby(year country auditfirm)
        
             +-------------------------------------------------+
             | year   country   auditf~m       city      city2 |
             |-------------------------------------------------|
          1. | 2012   Belgium         EY   Brussels      Liege |
          2. | 2012   Belgium         EY    Antwerp   Brussels |
          3. | 2012   Belgium         EY    Antwerp      Liege |
             |-------------------------------------------------|
          4. | 2013   Belgium       KPMG   Brussels      Liege |
          5. | 2013   Belgium       KPMG    Antwerp   Brussels |
          6. | 2013   Belgium       KPMG    Antwerp      Liege |
             +-------------------------------------------------+

        1) if i have a huge dataset, how can i amend the code so that it uses a specific dataset
        You can break the dataset into several datasets, provided that you do not truncate groups (city-year-firm combinations). So if you have 150,000 observations and each group of 50,000 does not truncate groups (you have to do a manual check)

        Code:
        use dataset, clear
        tempfile group1 group2 group3
        keep in 1/50000
        save `group1'
        use dataset, clear
        keep in 50001/100000
        save `group2'
        use dataset, clear
        keep in 100001/150000
        save `group3'
        Then run the code in #1 on these datasets and append the results.

        Code:
        forval i=1/3{
            use `group`i'', clear
            *RUN CODE
            save `group`i'', replace
        }
        use `group1', clear
        append using `group2'
        append using `group3'
        Last edited by Andrew Musau; 12 Dec 2021, 02:39.

        Comment


        • #5
          Thank you so much, that was really helpful
          My dataset is actually 40,000 observations only so i don't need to divide it into 3 groups so the finalized code is the following but i think there is something wrong (the red part):
          type help dataex
          clear
          input int year str7 country str4 auditfirm str8 city
          use sample to test.dta, clear
          end
          preserve
          rename city city2
          tempfile city
          save `city'
          restore
          joinby year country auditfirm using `city'
          drop if city>=city2
          Appreciate your help

          Comment


          • #6
            The dataex part should not be part of your code. First open your dataset and start at preserve. You should change the name of your identifiers if they differ from those in #2. Otherwise, if you cannot make progress, after opening your dataset, copy and paste the output of

            Code:
            dataex in 1/5

            Comment


            • #7
              Thanks a lot, it works now. I really appreciate your help!

              Comment

              Working...
              X