Good Day,
I am new to both Stata and econometrics. I am currently researching the impact of institutional quality on exports of wood products over a 21-year period. The key variables in my study are:
- trade_Musd: Trade value in million USD.
- ln_distancebv: Natural logarithm of the distance between trade partners.
- ln_gdp15_Obv: Natural logarithm of the constant 2015 GDP of the reporting country.
- ln_gdp15_Dbv: Natural logarithm of the constant 2015 GDP of the partner country.
- contigbv: Indicator for whether countries share a border (contiguity).
- comlang_offbv: Indicator for whether countries share an official language.
- gee_reporter_5, rqe_reporter_5, rle_reporter_5: Indicators of institutional quality, rescaled from the original range of -2.5 to 2.5 to a range of 0 to 5.
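To make the rescaling explicit: shifting a score from the -2.5 to 2.5 range onto 0 to 5 amounts to adding 2.5. The lines below are only a sketch of that step. The raw variable names (gee_reporter, rqe_reporter, rle_reporter) are hypothetical placeholders, and the generate commands will stop with an error if the _5 variables already exist in memory.

```stata
* Sketch only: gee_reporter, rqe_reporter and rle_reporter are hypothetical
* names for the raw scores on the original -2.5 to 2.5 scale.
foreach v in gee rqe rle {
    * Shift each score by 2.5 so that it lies between 0 and 5
    generate double `v'_reporter_5 = `v'_reporter + 2.5
}

* Check that the rescaled scores fall inside the expected 0 to 5 range
summarize gee_reporter_5 rqe_reporter_5 rle_reporter_5
```

Because this is a pure shift, it changes only the constant term of the model, not the slope coefficients on the three indicators.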
I am using the PPML estimator to accommodate the zero trade flows in the data, which occur in 132 of the 3,044 observations. I specified the model as follows:
. ppmlhdfe trade_Musd YR* Imp_FE* ln_distancebv ln_gdp15_Obv ln_gdp15_Dbv contigbv comlang_offbv gee_reporter_5 rqe_reporter_5 rle_reporter_5, cluster(country_pair)
However, the estimation produced a warning and a collinearity note:
- Warning: The dependent variable takes very low values after standardizing (4.7427e-07).
- Note: Variables YR21 and Imp_FE30 were omitted due to collinearity.
The full output is:
ppmlhdfe trade_Musd YR* Imp_FE* ln_distancebv ln_gdp15_Obv ln_gdp15_Dbv contigbv comlang_offbv gee_reporter_5 rqe_reporter_5 rle_reporter_5 , cluster ( country_pair)
warning: dependent variable takes very low values after standardizing (4.7427e-07)
note: 2 variables omitted because of collinearity: YR21 Imp_FE30
Iteration 1: deviance = 1.2667e+05 eps = . iters = 1 tol = 1.0e-04 min(eta) = -4.26 P
Iteration 2: deviance = 9.6560e+04 eps = 3.12e-01 iters = 1 tol = 1.0e-04 min(eta) = -5.47
Iteration 3: deviance = 9.4215e+04 eps = 2.49e-02 iters = 1 tol = 1.0e-04 min(eta) = -6.10
Iteration 4: deviance = 9.4178e+04 eps = 3.91e-04 iters = 1 tol = 1.0e-04 min(eta) = -6.20
Iteration 5: deviance = 9.4178e+04 eps = 4.23e-07 iters = 1 tol = 1.0e-04 min(eta) = -6.20
Iteration 6: deviance = 9.4178e+04 eps = 4.26e-12 iters = 1 tol = 1.0e-05 min(eta) = -6.20 S O
------------------------------------------------------------------------------------------------------------
(legend: p: exact partial-out s: exact solver h: step-halving o: epsilon below tolerance)
Converged in 6 iterations and 6 HDFE sub-iterations (tol = 1.0e-08)
PPML regression No. of obs = 2,012
Residual df = 144
Statistics robust to heteroskedasticity Wald chi2(57) = 6317.31
Deviance = 94177.76498 Prob > chi2 = 0.0000
Log pseudolikelihood = -50876.06515 Pseudo R2 = 0.7598
Number of clusters (country_pair)= 145
(Std. err. adjusted for 145 clusters in country_pair)
--------------------------------------------------------------------------------
| Robust
trade_Musd | Coefficient std. err. z P>|z| [95% conf. interval]
---------------+----------------------------------------------------------------
YR1 | .5201256 .4521295 1.15 0.250 -.366032 1.406283
YR2 | .5693542 .425712 1.34 0.181 -.265026 1.403735
YR3 | .0812014 .3480763 0.23 0.816 -.6010157 .7634185
YR4 | -.0667056 .3440161 -0.19 0.846 -.7409648 .6075537
YR5 | -.2977823 .3044973 -0.98 0.328 -.894586 .2990213
YR6 | -.453788 .3123138 -1.45 0.146 -1.065912 .1583358
YR7 | -.3775395 .2578755 -1.46 0.143 -.8829662 .1278871
YR8 | -.27286 .258202 -1.06 0.291 -.7789266 .2332065
YR9 | -.4517603 .3093984 -1.46 0.144 -1.05817 .1546495
YR10 | -.5358888 .2348802 -2.28 0.023 -.9962455 -.0755321
YR11 | -.4674764 .2593535 -1.80 0.071 -.9758 .0408472
YR12 | -.2157911 .2293564 -0.94 0.347 -.6653214 .2337392
YR13 | -.0412336 .2089766 -0.20 0.844 -.4508202 .3683529
YR14 | .1085185 .1845772 0.59 0.557 -.2532462 .4702831
YR15 | .3571066 .2162609 1.65 0.099 -.066757 .7809702
YR16 | .5880488 .242982 2.42 0.016 .1118129 1.064285
YR17 | .518406 .2218798 2.34 0.019 .0835296 .9532824
YR18 | .6243174 .2428429 2.57 0.010 .1483541 1.100281
YR19 | .7592175 .3363167 2.26 0.024 .100049 1.418386
YR20 | .6836577 .2278672 3.00 0.003 .2370462 1.130269
YR21 | 0 (omitted)
Imp_FE1 | 1.58989 .4691004 3.39 0.001 .6704697 2.50931
Imp_FE2 | 1.542975 .5017832 3.07 0.002 .5594977 2.526452
Imp_FE3 | .9104672 .4975008 1.83 0.067 -.0646164 1.885551
Imp_FE4 | -.1596206 .4402698 -0.36 0.717 -1.022534 .7032923
Imp_FE5 | 3.851282 .5783316 6.66 0.000 2.717773 4.984792
Imp_FE6 | 1.448788 .5446943 2.66 0.008 .3812065 2.516369
Imp_FE7 | -.2576722 .4723964 -0.55 0.585 -1.183552 .6682078
Imp_FE8 | .2248053 .4897663 0.46 0.646 -.7351189 1.18473
Imp_FE9 | .5332713 .4409573 1.21 0.227 -.3309891 1.397532
Imp_FE10 | 1.640335 .451596 3.63 0.000 .7552227 2.525446
Imp_FE11 | 1.518954 .5736724 2.65 0.008 .3945769 2.643332
Imp_FE12 | .1505784 1.110076 0.14 0.892 -2.025132 2.326288
Imp_FE13 | 1.884956 .574301 3.28 0.001 .7593463 3.010565
Imp_FE14 | .6061414 .7510758 0.81 0.420 -.8659401 2.078223
Imp_FE15 | .3853813 .4876849 0.79 0.429 -.5704635 1.341226
Imp_FE16 | .1135511 .5075052 0.22 0.823 -.8811408 1.108243
Imp_FE17 | 4.574264 .5250587 8.71 0.000 3.545167 5.60336
Imp_FE18 | 2.689692 .4984901 5.40 0.000 1.712669 3.666715
Imp_FE19 | -.148624 .4893643 -0.30 0.761 -1.10776 .8105124
Imp_FE20 | -.4661252 .5843572 -0.80 0.425 -1.611444 .6791938
Imp_FE21 | 1.591687 .7409989 2.15 0.032 .1393559 3.044018
Imp_FE22 | 1.66037 .5329054 3.12 0.002 .6158944 2.704845
Imp_FE23 | 1.034219 .7812646 1.32 0.186 -.4970311 2.56547
Imp_FE24 | 1.489572 .5090167 2.93 0.003 .4919174 2.487226
Imp_FE25 | 1.176602 .5340354 2.20 0.028 .1299117 2.223292
Imp_FE26 | 1.105201 .6090627 1.81 0.070 -.0885398 2.298942
Imp_FE27 | 2.654559 .4340863 6.12 0.000 1.803765 3.505352
Imp_FE28 | 1.531106 .8776364 1.74 0.081 -.1890299 3.251242
Imp_FE29 | .3832464 .6973546 0.55 0.583 -.9835435 1.750036
Imp_FE30 | 0 (omitted)
ln_distancebv | -.8236917 .3705728 -2.22 0.026 -1.550001 -.0973823
ln_gdp15_Obv | 2.136898 4.918015 0.43 0.664 -7.502234 11.77603
ln_gdp15_Dbv | 54.73018 5.862969 9.33 0.000 43.23897 66.22139
contigbv | 1.298234 .6498093 2.00 0.046 .0246307 2.571836
comlang_offbv | -.7467727 .362121 -2.06 0.039 -1.456517 -.0370286
gee_reporter_5 | 3.706266 .5156612 7.19 0.000 2.695588 4.716943
rqe_reporter_5 | -2.172155 .4599424 -4.72 0.000 -3.073626 -1.270684
rle_reporter_5 | -.8153704 .5973095 -1.37 0.172 -1.986075 .3553347
_cons | -.8481844 .5664779 -1.50 0.134 -1.958461 .262092
--------------------------------------------------------------------------------
Given this context, I have several questions:
- Am I using the ppmlhdfe command correctly for my research aims?
- Is the presence of zero trade flows (132 out of 3,044 observations) sufficient justification for using PPML rather than OLS or fixed-effects estimation?
- Should I be concerned about the warning and the collinearity note, and do they call for changes to my model or dataset? Is a Pseudo R2 of 0.7598 too high?
- What tests or diagnostics would you recommend to check the robustness of my model? I have read some of the posts here about using the RESET test for PPML, as suggested by Professor Silva. Are there any other tests I should run?
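For reference, this is my understanding of how the RESET test mentioned in the last question would be run with my specification. It is a sketch, not a verified procedure: the variable names xb_hat and xb_hat2 are new and chosen only for illustration, and I am assuming that predict with the xb option is available after ppmlhdfe when the fixed effects enter as explicit dummies rather than through absorb().

```stata
* Step 1: fit the model exactly as specified above
ppmlhdfe trade_Musd YR* Imp_FE* ln_distancebv ln_gdp15_Obv ln_gdp15_Dbv ///
    contigbv comlang_offbv gee_reporter_5 rqe_reporter_5 rle_reporter_5, ///
    cluster(country_pair)

* Step 2: store the fitted linear index and its square
* (xb_hat and xb_hat2 are hypothetical names introduced for this sketch)
predict double xb_hat if e(sample), xb
generate double xb_hat2 = xb_hat^2

* Step 3: refit the model with the squared index as an extra regressor
ppmlhdfe trade_Musd YR* Imp_FE* ln_distancebv ln_gdp15_Obv ln_gdp15_Dbv ///
    contigbv comlang_offbv gee_reporter_5 rqe_reporter_5 rle_reporter_5 ///
    xb_hat2, cluster(country_pair)

* Step 4: RESET test, H0: the coefficient on the squared index is zero
test xb_hat2
```

If I understand correctly, a small p-value from the last command would point to misspecification of the conditional mean, while a large one would give no evidence against the specification.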
Thank you very much for your assistance.
James