  • Sankey plot - adding labels

    Dear community,

    Does anyone have experience creating Sankey plots in Stata? I created a Sankey plot for FDI data with the 20 largest country pairs, following these instructions by Fernando Rios-Avila, which were extremely helpful. However, I would like to make one addition: in the middle of each connecting line, I want to show the name of the sector in which the most FDI occurs between the origin and host country. Does anyone have an idea how I could add this to the plot?

    I installed sankey, palettes, colrspace, and schemepack from SSC, and before executing my code I ran the two code files (sankey_plot and sankey_i) by Fernando Rios-Avila.

    My code is:
    use gravity_sectorlevel, clear
    collapse (sum) total_fdi_stock_dest = TotalassetsthUSD, by( iso3_d_encode iso3_d country_d iso3_o iso3_o_encode country_o year country_pair)
    collapse (sum) total_fdi_stock_dest, by( iso3_d_encode iso3_d country_d )
    keep if iso3_d == "AUS" | iso3_d == "BEL" | iso3_d == "CHN" | iso3_d == "DEU" | iso3_d == "FRA" | iso3_d == "GBR" | iso3_d == "HKG" | iso3_d == "IRL" | iso3_d == "LUX" | iso3_d == "NLD" | iso3_d == "USA"
    gen fdi_dest_rs = total_fdi_stock_dest / 1000000
    format fdi_dest_rs %9.2f
    save dest_countries.dta, replace

    use gravity_sectorlevel, clear
    collapse (sum) total_fdi_stock_origin = TotalassetsthUSD, by( iso3_d_encode iso3_d country_d iso3_o iso3_o_encode country_o year country_pair)
    collapse (sum) total_fdi_stock_origin, by( iso3_o_encode iso3_o country_o )
    keep if iso3_o == "USA" | iso3_o == "JPN" | iso3_o == "GBR" | iso3_o == "FRA" | iso3_o == "ESP" | iso3_o == "DEU" | iso3_o == "CHE" | iso3_o == "CAN"
    gen fdi_origin_rs = total_fdi_stock_origin / 1000000
    format fdi_origin_rs %9.2f
    save origin_countries.dta, replace


    use gravity_sectorlevel, clear
    collapse (sum) total_fdi_stock = TotalassetsthUSD, by( iso3_d_encode iso3_d country_d iso3_o iso3_o_encode country_o year country_pair)
    collapse (sum) total_fdi_stock, by(country_pair iso3_d_encode iso3_d country_d iso3_o_encode iso3_o country_o)
    sort total_fdi_stock
    gen rank = _n
    keep in -20/-1
    list country_pair total_fdi_stock in 1/10
    merge m:1 iso3_d using dest_countries.dta
    drop _merge
    merge m:1 iso3_o using origin_countries.dta
    drop _merge

    gen fdi_origin_str = string(fdi_origin_rs, "%9.2f")
    gen fdi_dest_str = string(fdi_dest_rs, "%9.2f")

    egen label0 = concat(iso3_o fdi_origin_str), p(" ")
    egen label1 = concat(iso3_d fdi_dest_str), p(" ")

    set scheme stcolor

    gen x0 = 1
    gen x1 = 2
    sankey_plot x0 iso3_o_encode x1 iso3_d_encode, ///
    width0(total_fdi_stock) extra adjust ///
    colorpalette(viridis, opacity(40)) gap(0.1) noline labcolor(black) ///
    label0(label0) label1(label1) ///
    xlabel(1 "Origin" 2 "Host", nogrid) xsize(5) ysize(5.5)

    sankey_plot x0 iso3_o x1 iso3_d, ///
    width0(total_fdi_stock) extra adjust ///
    colorpalette(viridis, opacity(40)) gap(0.1) noline labcolor(black) ///
    label0(label0) ///
    xlabel(1 "Origin" 2 "Host", nogrid) xsize(5) ysize(5.5) // title("Top 20 country pairs by total assets (average over time)")
    My plot looks like this:

    [Image: sankey plot.png]


    What I want is something like this, where the sector is displayed (taken from https://www.usitc.gov/publications/3...d_nov_2023.pdf):
    [Image: sankey2.png]


    I would appreciate any help!

    Best
    Noemi

  • #2
    Hi Noemi
    For something like that you need more layers.
    Specifically, your x0 and x1 will need to go from 1 to 2 (this is what you have) and also from 2 to 3. The 2 would be your middle group.
    Hope this helps
    F
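A minimal sketch of the layered layout described in #2, using made-up flow values. The variable names node0, node1, and flow are hypothetical, and the middle layer is assumed to be the sector. Each row is one link between two adjacent layers, so a flow from origin to host becomes two rows: origin to sector (x0 = 1, x1 = 2) and sector to host (x0 = 2, x1 = 3).

```stata
* Toy link table with hypothetical values: one row per link between adjacent layers
clear
input byte x0 str12 node0 byte x1 str12 node1 double flow
1 "USA"     2 "Finance" 74
1 "JPN"     2 "Finance" 10
2 "Finance" 3 "GBR"     84
end

* Layer 1 -> 2 rows first, then layer 2 -> 3 rows
list, noobs sepby(x0)
```

In this toy table the flows into "Finance" (74 + 10) match the flow out of it (84), which is what keeps the middle node balanced.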



    • #3
      Dear FernandoRios

      Thank you so much for your response. I think my data does not have this structure. So is there no way I could just manually add a label to each connecting segment, stating the sector name?

      Best
      Noemi



      • #4
        You may need to restructure it.
        I cannot say more than that without seeing the data.



        • #5
          This is an example of my data:

          Code:
          * Example generated by -dataex-. For more info, type help dataex
          clear
          input str6 country_pair str3 iso3_o str14 country_o str3 iso3_d str14 country_d str3(iso3_d_encode iso3_o_encode) double total_fdi_stock str12 sector
          "FRAGBR" "FRA" "France"         "GBR" "United Kingdom" "GBR" "FRA"  6150296.337366313 "Finance"     
          "USAAUS" "USA" "United States"  "AUS" "Australia"      "AUS" "USA"  6230860.105399132 "Finance"     
          "GBRAUS" "GBR" "United Kingdom" "AUS" "Australia"      "AUS" "GBR"  6446273.537326217 "Management"  
          "FRABEL" "FRA" "France"         "BEL" "Belgium"        "BEL" "FRA"  6954542.277267169 "FInance"     
          "USADEU" "USA" "United States"  "DEU" "Germany"        "DEU" "USA"   7208773.22272037 "Information "
          "CHEUSA" "CHE" "Switzerland"    "USA" "United States"  "USA" "CHE"   7390831.17666626 "Finance"     
          "USACHN" "USA" "United States"  "CHN" "China"          "CHN" "USA"  7771158.493024005 "Real Estate"
          "DEUGBR" "DEU" "Germany"        "GBR" "United Kingdom" "GBR" "DEU"  8468188.065097764 "Finance"     
          "GBRNLD" "GBR" "United Kingdom" "NLD" "Netherlands"    "NLD" "GBR"  9012846.412998468 "FInance"     
          "GBRFRA" "GBR" "United Kingdom" "FRA" "France"         "FRA" "GBR"   9598786.51034689 "Finance"     
          "JPNUSA" "JPN" "Japan"          "USA" "United States"  "USA" "JPN"  10150002.13390249 "Management"  
          "JPNGBR" "JPN" "Japan"          "GBR" "United Kingdom" "GBR" "JPN" 10435803.109379198 "Finance"     
          "USAIRL" "USA" "United States"  "IRL" "Ireland"        "IRL" "USA" 11598436.177555203 "Finance"     
          "ESPGBR" "ESP" "Spain"          "GBR" "United Kingdom" "GBR" "ESP" 13133246.855458409 "Information "
          "GBRUSA" "GBR" "United Kingdom" "USA" "United States"  "USA" "GBR" 15515183.413017288 "Finance"     
          "CANUSA" "CAN" "Canada"         "USA" "United States"  "USA" "CAN" 16677937.239602685 "Finance"     
          "GBRHKG" "GBR" "United Kingdom" "HKG" "Hong Kong"      "HKG" "GBR"  16703170.70928955 "Finance"     
          "USALUX" "USA" "United States"  "LUX" "Luxembourg"     "LUX" "USA"  20181799.81582506 "Management"  
          "USANLD" "USA" "United States"  "NLD" "Netherlands"    "NLD" "USA"   20239658.5090217 "Real Estate"
          "USAGBR" "USA" "United States"  "GBR" "United Kingdom" "GBR" "USA"  74039827.05130184 "Finance"     
          end
          Unfortunately, I have not managed to restructure it in a way similar to the job market example in your guide. Maybe you have an idea after seeing the data? Please let me know if you need more information about the data.

          Best
          Noemi
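One possible way to restructure data of this shape into three layers (origin, sector, host) is sketched below. It uses four rows copied from the example above so that it runs on its own. The names link, node0, and node1 are hypothetical, and the final sankey_plot call is left commented out because it is an untested assumption that simply mirrors the call pattern from post #1. Note that the sector strings are inconsistent ("FInance", "Information " with a trailing space), so they are harmonised first; otherwise they would become separate middle nodes.

```stata
* Four rows copied from the -dataex- example above
clear
input str6 country_pair str3 iso3_o str3 iso3_d double total_fdi_stock str12 sector
"FRABEL" "FRA" "BEL"  6954542.277267169 "FInance"
"USADEU" "USA" "DEU"   7208773.22272037 "Information "
"USACHN" "USA" "CHN"  7771158.493024005 "Real Estate"
"USAGBR" "USA" "GBR"  74039827.05130184 "Finance"
end

* 1. Harmonise sector labels: trim blanks, then capitalise each word
replace sector = strproper(strtrim(sector))

* 2. Two rows per country pair: link 1 = origin -> sector, link 2 = sector -> host
expand 2
bysort country_pair: gen byte link = _n

gen byte  x0    = link
gen byte  x1    = link + 1
gen str12 node0 = cond(link == 1, iso3_o, sector)
gen str12 node1 = cond(link == 1, sector, iso3_d)

* 3. Inspect the resulting link table
sort link country_pair
list x0 node0 x1 node1 total_fdi_stock, noobs sepby(link)

* 4. Untested assumption: same call pattern as in post #1, now with three layers
* sankey_plot x0 node0 x1 node1, width0(total_fdi_stock) xlabel(1 "Origin" 2 "Sector" 3 "Host", nogrid)
```

One consequence of this design is that all flows sharing a sector pass through the same middle node, so the plot shows origin-to-sector and sector-to-host totals rather than a label on each individual country-pair link.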
