Dear community,
Does anyone have experience creating Sankey plots in Stata? I created a Sankey plot for FDI data with the 10 largest country pairs, following this instruction by Fernando Rios-Avila, which was extremely helpful. However, I would like to make one addition: in the middle of each connecting line, I want to show the name of the sector in which the most FDI occurs between the origin and host country. Does anyone have an idea how I could add this to the plot?
I installed sankey, palettes, colrspace and schemepack from SSC, and before executing my code I ran the two code files (sankey_plot and sankey_i) by Fernando Rios-Avila.
My code is shown in full at the end of this post.
My plot looks like this:
[Image: sankey plot.png]
What I want is something like this, where the sector is displayed (taken from https://www.usitc.gov/publications/3...d_nov_2023.pdf):
[Image: sankey2.png]
I would appreciate any help!
Best
Noemi
My code is:
* Total FDI stock received by each destination (host) country
use gravity_sectorlevel, clear
collapse (sum) total_fdi_stock_dest = TotalassetsthUSD, by( iso3_d_encode iso3_d country_d iso3_o iso3_o_encode country_o year country_pair)
collapse (sum) total_fdi_stock_dest, by( iso3_d_encode iso3_d country_d )
keep if iso3_d == "AUS" | iso3_d == "BEL" | iso3_d == "CHN" | iso3_d == "DEU" | iso3_d == "FRA" | iso3_d == "GBR" | iso3_d == "HKG" | iso3_d == "IRL" | iso3_d == "LUX" | iso3_d == "NLD" | iso3_d == "USA"
gen fdi_dest_rs = total_fdi_stock_dest / 1000000
format fdi_dest_rs %9.2f
save dest_countries.dta, replace
* Total FDI stock sent by each origin country
use gravity_sectorlevel, clear
collapse (sum) total_fdi_stock_origin = TotalassetsthUSD, by( iso3_d_encode iso3_d country_d iso3_o iso3_o_encode country_o year country_pair)
collapse (sum) total_fdi_stock_origin, by( iso3_o_encode iso3_o country_o )
keep if iso3_o == "USA" | iso3_o == "JPN" | iso3_o == "GBR" | iso3_o == "FRA" | iso3_o == "ESP" | iso3_o == "DEU" | iso3_o == "CHE" | iso3_o == "CAN"
gen fdi_origin_rs = total_fdi_stock_origin / 1000000
format fdi_origin_rs %9.2f
save origin_countries.dta, replace
* Total FDI stock by country pair; the 20 largest pairs are kept below
use gravity_sectorlevel, clear
collapse (sum) total_fdi_stock = TotalassetsthUSD, by( iso3_d_encode iso3_d country_d iso3_o iso3_o_encode country_o year country_pair)
collapse (sum) total_fdi_stock, by(country_pair iso3_d_encode iso3_d country_d iso3_o_encode iso3_o country_o)
sort total_fdi_stock
gen rank = _n
keep in -20/-1
list country_pair total_fdi_stock in 1/10
merge m:1 iso3_d using dest_countries.dta
drop _merge
merge m:1 iso3_o using origin_countries.dta
drop _merge
gen fdi_origin_str = string(fdi_origin_rs, "%9.2f")
gen fdi_dest_str = string(fdi_dest_rs, "%9.2f")
egen label0 = concat(iso3_o fdi_origin_str), p(" ")
egen label1 = concat(iso3_d fdi_dest_str), p(" ")
set scheme stcolor
gen x0 = 1
gen x1 = 2
sankey_plot x0 iso3_o_encode x1 iso3_d_encode, ///
width0(total_fdi_stock) extra adjust ///
colorpalette(viridis, opacity(40)) gap(0.1) noline labcolor(black) ///
label0(label0) label1(label1) ///
xlabel(1 "Origin" 2 "Host", nogrid) xsize(5) ysize(5.5)
sankey_plot x0 iso3_o x1 iso3_d, width0(total_fdi_stock) extra adjust colorpalette(viridis, opacity(40)) gap(0.1) noline labcolor(black) label0(label0) xlabel(1 "Origin" 2 "Host", nogrid) xsize(5) ysize(5.5) // title("Top 20 country pairs by total assets (average over time)")
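For the sector labels, a possible first step is to find the sector with the largest FDI for each origin-host pair and merge it into the pair-level data. The sketch below is only a sketch under assumptions: the variable name `sector`, the new names `sector_fdi` and `top_sector`, and the file name top_sector.dta are hypothetical and not part of my dataset description above. It also does not place the text on the plot; whether sankey_plot offers an option for labels on the connecting lines would still need to be checked.

```stata
* Sketch only: "sector" is a hypothetical name for the sector variable
use gravity_sectorlevel, clear

* FDI summed over years for each origin-host-sector combination
collapse (sum) sector_fdi = TotalassetsthUSD, by(iso3_o iso3_d sector)

* Within each country pair, sort by sector total and keep the largest one
* (ties are broken arbitrarily)
bysort iso3_o iso3_d (sector_fdi): keep if _n == _N

rename sector top_sector
keep iso3_o iso3_d top_sector
save top_sector.dta, replace

* Later, with the pair-level data in memory (for example right before
* the sankey_plot call), attach the top sector to each country pair:
* merge m:1 iso3_o iso3_d using top_sector.dta, keep(master match) nogenerate
```

After the merge, each country pair carries a `top_sector` value that could serve as the label text for its connecting line.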