Hi, I am running a series of data-cleaning steps on data from multiple countries. To make this easier, I have defined a program for the repeated parts of the code, and I call it for each country.
My issue is that this program works for the majority of the countries (like Bangladesh), but does not work for some of them (like Yemen). Would you mind helping me with this? Thank you.
I have created sample datasets for Bangladesh and Yemen below using the dataex command. This is the program I have defined:
Code:
program replace_m
    // Some observations are labeled as "Don't Know" and take the value of -9. This function recodes them as missing.
    capture describe d2 n2e n2a n7a l1 l6 a14y n2f n2b n6a a15y n2ra a4a
    if _rc == 0 {
        keep d2 n2e n2a n7a l1 l6 a14y n2f n2b n6a a15y n2ra a4a
        rename (d2 n2e n2a n7a l1 l6 a14y n2f n2b n6a a15y n2ra a4a) ///
            (total_sales total_input_costs total_cost_labor machinery_replacement_value num_perm_worker num_temp_worker interview_year total_cost_fuel total_cost_electricity net_book_value_machinery interview_year_end total_rent industry)
        replace total_sales = . if total_sales < 0
        replace total_input_costs = . if total_input_costs < 0
        replace total_cost_labor = . if total_cost_labor < 0
        replace machinery_replacement_value = . if machinery_replacement_value < 0
        replace total_cost_fuel = . if total_cost_fuel < 0
        replace num_perm_worker = . if num_perm_worker < 0
        replace num_temp_worker = . if num_temp_worker < 0
        replace total_cost_electricity = . if total_cost_electricity < 0
        replace net_book_value_machinery = . if net_book_value_machinery < 0
        replace total_rent = . if total_rent < 0
    }
end
And I call it for two countries as below:
Code:
program clean_yemen
    use "./data/WBES_FIRM/Yemen/Yemen-2010-full-data-.dta", clear
    gen ave_exchange_rate = 202.84666667 // Official average exchange rate of USD per YER for 2009; source: FAO STAT
    gen country_name = "Yemen"
    gen country_iso = "YEM"
    gen year = 2010
    replace_m
    save "./output/country_Yemen.dta", replace
end
Code:
program clean_bangladesh
    use "./data/WBES_FIRM/Bangladesh/Bangladesh-2013-full-data.dta", clear
    gen ave_exchange_rate = 74.1524 // Official average exchange rate of USD per BDT for 2011; source: FAO STAT
    gen country_name = "Bangladesh"
    gen country_iso = "BGD"
    gen year = 2013
    replace_m
    save "./output/country_Bangladesh.dta", replace
end
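In case it helps with diagnosing this, a loop like the one below could be run right after the use command and before replace_m, to list which of the expected variables are absent from a given country's file. It is only a sketch: it assumes the problem is a missing variable, which I have not confirmed, and the message text is just illustrative.

```stata
* Diagnostic sketch (assumption: the failure is caused by a missing variable).
* Checks each variable that replace_m expects and reports the ones not found
* in the dataset currently in memory.
foreach v in d2 n2e n2a n7a l1 l6 a14y n2f n2b n6a a15y n2ra a4a {
    capture confirm variable `v', exact
    if _rc != 0 {
        display as error "variable `v' not found"
    }
}
```

Because replace_m wraps everything in a capture describe check, any single missing variable makes it skip the whole block without an error message, so a check like this would make that case visible.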
Yemen's sample data:
Code:
* Example generated by -dataex-. For more info, type help dataex
clear
input double(d2 n2e n2a n7a) int(l1 l6 a14y) double n2f long n2b double n6a int a15y byte a4a
-9 . -9 . 250 30 2010 . -9 . 2010 52
5000000 4000000 1600000 6000000 5 0 2010 120000 200000 -9 2010 28
8000000 6000000 1556000 -9 7 0 2010 -8 1202000 -9 2010 18
10000000 . 1000000 . 10 0 2010 . 1500000 . 2010 52
-9 -9 -9 -9 10 20 2010 -9 200000 -9 2010 2
30000000 20000000 6000000 40000000 15 0 2010 30000000 2700000 26000000 2010 26
-9 -9 -8 100000 5 0 2010 -8 20000 -9 2010 18
500000 200000 -9 10000 5 0 2010 -8 14400 0 2010 18
-9 . -9 . 18 4 2010 . -9 . 2010 55
-9 -9 -9 -9 10 9 2010 -9 200000 -9 2010 28
-9 . -8 . 6 12 2010 . -8 . 2010 52
1.200e+08 4000000 6000000 35000000 31 7 2010 500000 100000 20000000 2010 26
-9 . 2296000 . 15 11 2010 . -9 . 2010 45
200000 600000 400000 600000 5 0 2010 36000 -9 50000 2010 26
2493925376 . 117896720 . 300 100 2010 . 49738168 . 2010 51
12000000 1400000 2000000 0 5 3 2010 -8 90000 0 2010 2
5000000 . -9 . 5 0 2010 . 660000 . 2010 52
6000000 . 2000000 . 10 0 2010 . -9 . 2010 55
5000000 . 1500000 . 7 2 2010 . 600000 . 2010 52
-9 -9 -9 -9 10 0 2010 -9 -9 -9 2010 2
9846871 . 9617782 . 25 10 2010 . 200000 . 2010 51
-9 -9 -9 -9 160 0 2010 -9 -9 -9 2010 28
1000000 . 200000 . 5 0 2010 . 120000 . 2010 28
-9 7000000 3000000 2.000e+08 60 10 2010 200000 1500000 2.000e+08 2010 28
-9 . -9 . 10 0 2010 . 360000 . 2010 45
5.000e+08 4.980e+08 71000000 1.200e+08 200 0 2010 7000000 5800000 1.200e+08 2010 52
60000000 36000000 15000000 -9 95 20 2010 1000000 0 1.520e+08 2010 26
-9 4.000e+08 1.500e+08 -9 365 45 2010 10000000 -9 -9 2010 24
-9 -8 -8 -9 220 0 2010 1000000 6000000 -8 2010 24
5.840e+08 . 15000000 . 25 4 2010 . -9 . 2010 60
-9 -9 -9 -9 240 0 2010 -9 -9 -9 2010 24
1000000 500000 -9 100000 13 6 2010 -8 60000 -9 2010 18
-9 . -9 . 38 0 2010 . 2000000 . 2010 52
3857281024 . 4000000 . 40 10 2010 . 6000000 . 2010 52
60000000 . 11000000 . 60 30 2010 . 936000 . 2010 52
-9 . -8 . 9 1 2010 . 300000 . 2010 51
4.000e+08 . 20000000 . 5000 200 2010 . 100000 . 2010 45
1.000e+08 . 15000000 . 50 0 2010 . 800000 . 2010 52
-9 . -8 . 10 0 2010 . -8 . 2010 50
1200000 . 2000000 . 7 4 2010 . 300000 . 2010 55
1.800e+09 500000 2.000e+08 5.500e+08 500 250 2010 65000 2200000 4.500e+08 2010 25
1.000e+08 6.000e+08 2880000 1000000 8 10 2010 1000000 200000 2000000 2010 28
50000000 . 12000000 . 40 0 2010 . 180000 . 2010 51
-9 -8 -8 2000000 14 3 2010 -9 600000 4000000 2010 28
2000000 1000000 400000 18000000 15 10 2010 360000 240000 18000000 2010 26
15000000 3000000 600000 6000000 6 0 2010 1440000 840000 5000000 2010 26
1000000 . 1000000 . 5 8 2010 . 500000 . 2010 50
2.400e+10 -8 -8 -9 1357 44 2010 -8 -8 -9 2010 15
11000000512 5.200e+09 2381659904 -9 1696 216 2010 2.800e+08 32000000 2381659904 2010 15
-9 -8 -8 -9 200 10 2010 -8 -8 -8 2010 2
1.4336e+09 . 50000000 . 40 0 2010 . 840000 . 2010 50
-9 -9 -9 -9 -9 0 2010 -9 -9 -9 2010 2
30000000 . 42000000 . 12 4 2010 . 480000 . 2010 55
10000000 -9 3000000 50000000 7 4 2010 -9 840000 50000000 2010 26
20000000 20000000 3000000 50000000 25 8 2010 4000000 72000 50000000 2010 26
9000000 . 1500000 . 5 0 2010 . -9 . 2010 52
1000000 -8 1080000 70000 5 2 2010 -9 72000 70000 2010 26
4800000 . 1640000 . 5 0 2010 . 96000 . 2010 55
15000000 9000000 2200000 4000000 10 0 2010 50000 240000 4000000 2010 2
10000000 6840000 1000000 800000 5 0 2010 580000 60000 1000000 2010 15
2880000 . 216000 . 8 10 2010 . 360000 . 2010 55
3000000 . 300000 . 5 0 2010 . 72000 . 2010 52
2000000 500000 1200000 1000000 5 2 2010 -8 192000 500000 2010 18
6499999744 . 35000000 . 69 12 2010 . 7800000 . 2010 51
5000000 -9 1800000 56000 8 3 2010 2500 36000 56000 2010 18
2.000e+08 . 13000000 . 39 12 2010 . 15000000 . 2010 55
4.320e+09 . 23881630 . 30 4 2010 . 1200000 . 2010 51
5760000 1000000 3012000 300000 24 15 2010 -8 150000 700000 2010 18
-9 . 5000000 . 9 4 2010 . 600000 . 2010 52
3000000 2000000 1200000 3000000 9 10 2010 500000 -9 2000000 2010 26
35000000 10000000 15000000 10000000 25 10 2010 500000 5000000 10000000 2010 26
-9 . 1000000 . 18 1 2010 . 150000 . 2010 51
72000000 . 1500000 . 35 0 2010 . 840000 . 2010 55
4000000 900000 1200000 1000000 7 0 2010 -9 200000 500000 2010 2
12000000 . 2000000 . 5 0 2010 . 3000000 . 2010 52
-9 . -9 . 10 0 2010 . 840000 . 2010 55
7000000 . 3600000 . 40 10 2010 . 400000 . 2010 55
1.360e+08 . 36000000 . 75 10 2010 . 23000000 . 2010 55
-9 . 60000000 . 170 0 2010 . 12000000 . 2010 55
-9 40600000 26200000 80000000 40 0 2010 3050000 960000 86000000 2010 28
-9 . -9 . 20 0 2010 . -9 . 2010 52
-9 . 7200000 . 30 0 2010 . 1200000 . 2010 51
3.000e+09 1.528e+10 255630800 8814000128 500 0 2010 1.170e+08 -9 2344999936 2010 17
3.000e+08 . 2000000 . 6 10 2010 . 240000 . 2010 52
8.800e+09 . 24000000 . 48 0 2010 . 12000000 . 2010 51
4.000e+09 2.000e+09 65460000 3.000e+09 150 10 2010 1.500e+08 10000000 2.000e+09 2010 15
44999999488 4.000e+09 4000000 7.000e+09 500 200 2010 1.200e+09 -9 4.000e+09 2010 15
4.710e+08 1.870e+08 18000000 1.600e+08 121 27 2010 3600000 14400000 14200000 2010 18
75000000 39000000 1.850e+08 -9 38 0 2010 400000 1600000 -9 2010 28
50000000 20000000 18000000 -9 75 0 2010 6000000 4800000 5.000e+08 2010 28
10000000 3600000 4320000 -9 9 0 2010 -8 72000 -8 2010 28
1800000 . 1800000 . 5 1 2010 . 60000 . 2010 52
-9 -9 -9 1800000 5 0 2010 360000 500000 1000000 2010 18
4015000 250000 2281250 500000 6 0 2010 50000 144000 100000 2010 18
3273659904 . 33000000 . 29 0 2010 . 480000 . 2010 51
48000000 . 1500000 . 5 0 2010 . 48000 . 2010 52
3600000 . 1200000 . 5 0 2010 . 360000 . 2010 55
1.560e+08 1.400e+08 16000000 -9 33 30 2010 21000000 10000000 -9 2010 26
4000000 1800000 2160000 -9 5 1 2010 96000 120000 -9 2010 2
18250000 . 24000000 . 10 3 2010 . 1560000 . 2010 55
end
And Bangladesh's sample data:
Code:
* Example generated by -dataex-. For more info, type help dataex
clear
input double(d2 n2e n2a n7a) int(l1 l6 a14y) long(n2f n2b) double n6a int a15y byte a4a
1.934e+08 1.200e+08 35000000 6.000e+09 600 80 2013 400000 5000000 2.000e+09 2013 15
90000000 70000000 4000000 12000000 60 0 2013 30000 60000 8000000 2013 31
5.400e+08 4.000e+08 14000000 60000000 110 0 2013 300000 1000000 40000000 2013 19
1.000e+09 6.000e+08 2.500e+08 2.500e+08 1800 0 2013 10000000 6000000 1.500e+08 2013 18
1.400e+09 9.000e+08 3.200e+08 2.300e+08 3000 0 2013 5000000 6000000 18000000 2013 18
7.200e+08 5.700e+08 90000000 45000000 1100 0 2013 1200000 9000000 30000000 2013 18
3.000e+08 2.700e+08 10000000 60000000 110 0 2013 2000000 600000 50000000 2013 29
1.600e+08 80000000 40000000 12000000 860 0 2013 1300000 1300000 10000000 2013 17
1.200e+08 67000000 24000000 20000000 350 0 2013 1700000 800000 15000000 2013 17
2.000e+08 30000000 50000000 15000000 350 0 2013 10000000 1000000 10000000 2013 18
6.000e+08 1.000e+08 3.000e+08 -9 250 150 2013 1000000 2000000 -9 2013 18
1.800e+08 1.037e+08 6634000 -9 1700 0 2013 2900000 6000000 -9 2013 18
3.054e+08 2.300e+08 40000000 60000000 448 0 2013 1200000 8000000 10900000 2013 18
9600000 6000000 600000 -9 12 0 2013 12000 140000 -9 2013 15
3.415e+08 1.600e+08 1.000e+08 30000000 1000 0 2013 900000 500000 20000000 2013 18
1400000 1200000 200000 70000 5 0 2013 0 20000 50000 2013 27
2.600e+08 1.800e+08 12000000 30000000 500 0 2013 1000000 300000 14000000 2013 18
6.600e+08 5.500e+08 24000000 50000000 500 0 2013 0 884000 30000000 2013 18
6.357e+08 3.552e+08 36200000 2.600e+08 350 0 2013 10400000 300000 1.500e+08 2013 18
1.450e+09 9.700e+08 75000000 1.200e+09 1450 0 2013 30000000 0 8.700e+08 2013 17
1.490e+09 6.400e+08 123423564 1.200e+09 1100 200 2013 5000000 1200000 1.000e+09 2013 17
8.500e+08 3.000e+08 1.500e+08 2.000e+09 900 30 2013 140000000 0 1.500e+09 2013 17
5.400e+08 4.700e+08 36000000 25000000 500 0 2013 10800000 0 20000000 2013 18
1500000 0 50000 60000 4 0 2013 0 12000 40000 2013 27
18500000 2200000 3000000 3000000 91 10 2013 1500000 1000000 2000000 2013 15
7.200e+09 2.000e+09 1.200e+09 3.000e+09 2600 500 2013 500000000 200000000 2.000e+09 2013 24
8.000e+08 4.000e+08 20000000 2.000e+09 1200 0 2013 50000000 50000000 2.000e+09 2013 24
2.400e+08 1.500e+08 30000000 1.500e+08 250 0 2013 400000 4500000 5.000e+08 2013 24
3.400e+08 1.300e+08 70000000 1.500e+08 72 0 2013 2000000 2000000 90000000 2013 24
3.000e+08 20000000 96000000 2.000e+08 400 0 2013 2400000 2600000 1.000e+09 2013 18
2.000e+08 1.500e+08 6500000 -9 100 30 2013 100000 84000 52400000 2013 19
40000000 15000000 2700000 500000 30 30 2013 150000 3500000 150000 2013 19
3000000 1000000 960000 700000 20 0 2013 240000 120000 500000 2013 29
3.600e+08 2.900e+08 45000000 40000000 800 0 2013 7200000 3000000 30000000 2013 18
6.000e+08 4.900e+08 48000000 55000000 830 0 2013 3000000 2500000 60000000 2013 18
6.000e+08 3.000e+08 2.000e+08 6500000 700 200 2013 5000000 10000000 6000000 2013 17
37000000 5000000 12000000 40000000 200 0 2013 1600000 2000000 20000000 2013 15
6.500e+08 5.500e+08 22000000 5.000e+08 150 10 2013 700000 2500000 2.000e+08 2013 19
3.000e+08 1.300e+08 1.200e+08 7.000e+08 550 50 2013 10000000 10000000 5.000e+08 2013 24
8.900e+08 6.500e+08 30000000 45000000 2050 0 2013 5000000 15000000 30000000 2013 18
1.300e+08 60000000 40000000 32000000 470 0 2013 2000000 6000000 28100000 2013 18
3.080e+09 2.050e+09 3.200e+08 1.400e+08 5000 0 2013 30000000 20000000 1.200e+08 2013 18
8.000e+08 4.600e+08 50000000 60000000 550 0 2013 6000000 20000000 50000000 2013 18
12000000 4500000 3600000 30000000 24 0 2013 0 360000 25000000 2013 19
577500 . 80000 . 20 0 2013 . 120000 . 2013 52
1.160e+09 9.200e+08 1.800e+08 1.500e+08 1150 0 2013 8600000 3800000 1.300e+08 2013 17
32500000 10000000 9000000 40000000 110 0 2013 1200000 1800000 20000000 2013 17
3.000e+08 2.630e+08 21000000 15000000 250 0 2013 3744000 1080000 1.200e+08 2013 18
3.600e+08 3.070e+08 26000000 20000000 305 0 2013 3000000 1800000 15000000 2013 18
95000000 . 7100000 . 70 0 2013 . 1300000 . 2013 45
-9 -9 -9 -9 270 0 2013 -9 -9 -9 2013 15
50000000 . 4000000 . 35 5 2013 . 700000 . 2013 52
1.859e+10 -9 -9 -9 1202 0 2013 -9 -9 -9 2013 24
1.900e+08 60000000 1.000e+08 70000000 1350 0 2013 1000000 6600000 50000000 2013 18
1.500e+08 40000000 40000000 20000000 200 0 2013 5000000 2000000 15000000 2013 18
4.800e+08 4.230e+08 38000000 2.500e+08 270 0 2013 3000000 2500000 2.000e+08 2013 18
1.800e+09 1.620e+09 84000000 90000000 1400 0 2013 42000000 600000 70000000 2013 18
35454000 15160000 3700000 1.407e+08 350 0 2013 51000 4054000 10000000 2013 18
1.100e+08 50000000 30000000 50000000 25 15 2013 2000000 4000000 30000000 2013 15
30000000 9000000 8400000 3.000e+08 123 0 2013 2500000 1300000 15000000 2013 31
5.000e+09 2.000e+09 3.000e+08 2.000e+09 1500 0 2013 30000000 30000000 4.500e+08 2013 23
1.000e+08 500000 4000000 15000000 330 0 2013 300000 1000000 10000000 2013 18
1.600e+09 1.000e+09 3.000e+08 15000000 2500 0 2013 10000000 15000000 10000000 2013 17
1.300e+08 69000000 31200000 8.000e+08 300 0 2013 21600000 480000 5.200e+08 2013 18
60000000 27000000 20000000 10000000 350 0 2013 1600000 1200000 8000000 2013 18
1.200e+08 90000000 14400000 50000000 120 0 2013 1800000 4800000 45000000 2013 19
90000000 76000000 6000000 9000000 120 0 2013 600000 264000 6000000 2013 18
3.225e+08 6450000 23400000 -9 1500 30 2013 11500000 200000 -9 2013 18
2.000e+08 1.300e+08 52800000 50000000 550 0 2013 3000000 2160000 40000000 2013 18
6.000e+08 4.780e+08 1.000e+08 50000000 1500 0 2013 6000000 1800000 40000000 2013 17
4.800e+08 2.915e+08 1.500e+08 95000000 1950 0 2013 20000000 3600000 80000000 2013 24
2.400e+08 1.200e+08 20000000 46000000 250 0 2013 84000000 120000 40000000 2013 18
1.200e+08 83000000 25200000 9000000 320 0 2013 108000000 1000000 7000000 2013 18
851772215 732479813 40627181 -9 550 50 2013 4729666 1194353 21920845 2013 17
3.500e+08 -9 -9 -9 275 0 2013 -9 -9 -9 2013 19
3700000 1500000 800000 400000 10 0 2013 -9 120000 200000 2013 27
-9 -9 -9 -9 2200 500 2013 -9 -9 -9 2013 18
6.000e+08 4.870e+08 80000000 86000000 1100 0 2013 13500000 6000000 80000000 2013 18
9.000e+08 70000000 55000000 6.500e+08 600 60 2013 12000000 10000000 6.000e+08 2013 17
7.000e+08 5.000e+08 12500000 2.500e+08 100 30 2013 7000000 6000000 2.000e+08 2013 31
90000000 63000000 13000000 14000000 250 0 2013 3000000 720000 12000000 2013 31
3.600e+08 2.300e+08 82000000 40000000 750 0 2013 8000000 3600000 35000000 2013 24
6000000 4500000 240000 2000000 7 0 2013 60000 18000 1500000 2013 15
4905783521 2.1618e+09 1152050000 7.000e+08 1907 0 2013 354200000 482000000 5.000e+08 2013 17
3.600e+08 3.000e+08 15800000 19000000 110 0 2013 3600000 1200000 17000000 2013 31
2.400e+09 2.100e+09 1.170e+08 1.000e+08 650 0 2013 18000000 6000000 80000000 2013 24
2.000e+08 1.600e+08 12000000 50000000 140 30 2013 2000000 2600000 40000000 2013 17
4.800e+08 4.300e+08 8600000 8000000 60 0 2013 1500000 1200000 6000000 2013 31
5.200e+08 3.640e+08 1.200e+08 15000000 1400 0 2013 4800000 2400000 12000000 2013 18
3.000e+08 2.330e+08 36000000 1.200e+08 2500 0 2013 3600000 1800000 1.000e+08 2013 24
4000000 2400000 480000 1000000 5 0 2013 24000 78000 800000 2013 15
2500000 . 85700 . 4 1 2013 . 9600 . 2013 52
186020000 1.400e+08 28000000 20000000 300 0 2013 800000 1500000 12000000 2013 26
-9 . 13500000 . 120 0 2013 . 200000 . 2013 60
3.000e+08 2.000e+08 30000000 1.800e+08 225 0 2013 1600000 3700000 1.300e+08 2013 31
1.500e+09 1.000e+09 1.200e+08 3.000e+08 2000 0 2013 1500000 4000000 1.500e+08 2013 18
34186960 20000000 3500000 2000000 35 80 2013 200000 300000 1000000 2013 24
15000000 10000000 400000 1000000 4 25 2013 0 150000 600000 2013 15
1.350e+08 62000000 16000000 30000000 200 0 2013 1500000 500000 20000000 2013 18
2.300e+08 1.200e+08 46500000 50000000 500 0 2013 200000 1800000 30000000 2013 18
end
label values d2 LABF
label values l1 LABF
label values n2a LABF
label values n2e LABF
label values n2f LABF
label values n2b LABF
label values n6a LABF
label values n7a LABF
label def LABF -9 "DON'T KNOW", modify
label values l6 L6
label def L6 0 "NO FULL-TIME SEASONAL OR TEMPORTARY WORKERS", modify
label values a4a LABC
label def LABC 15 "Food", modify
label def LABC 17 "Textiles", modify
label def LABC 18 "Garments", modify
label def LABC 19 "Leather", modify
label def LABC 23 "Refined petroleum product", modify
label def LABC 24 "Chemicals", modify
label def LABC 26 "Non metallic mineral products", modify
label def LABC 27 "Basic metals", modify
label def LABC 29 "Machinery and equipment (29 & 30)", modify
label def LABC 31 "Electronics (31 & 32)", modify
label def LABC 45 "Construction Section F: F", modify
label def LABC 52 "Retail", modify
label def LABC 60 "Transport Section I: (60-64) I", modify