Hello community,
I am using the ppmlhdfe regression to study the impact of common language (comlang_off) on trade (tradeflow_baci) between many African countries.
For that I am using exporter-year fixed effects (exp_year), importer-year fixed effects (imp_year) and country-pair fixed effects (pair_id).
1 - Regression: ppmlhdfe tradeflow_baci fta_wto ln_dist contig comlang_off, a(exp_year imp_year pair_id) cluster (pair_id) nolog
Output with country pair_id fixed effects:
(dropped 283 observations that are either singletons or separated by a fixed effect)
warning: dependent variable takes very low values after standardizing (5.1476e-09)
note: 1 variable omitted because of collinearity: comlang_off
note: ln_dist is probably collinear with the fixed effects (all partialled-out values are close to zero; tol = 1.0e-06)
Converged in 13 iterations and 69 HDFE sub-iterations (tol = 1.0e-08)
HDFE PPML regression                              No. of obs      =      9,051
Absorbing 3 HDFE groups                           Residual df     =      2,047
Statistics robust to heteroskedasticity           Wald chi2(2)    =       1.17
Deviance             =  52472688.32               Prob > chi2     =     0.5574
Log pseudolikelihood = -26271738.18               Pseudo R2       =     0.9676

Number of clusters (pair_id)=      2,048
                            (Std. err. adjusted for 2,048 clusters in pair_id)
------------------------------------------------------------------------------
             |               Robust
tradeflow_~i | Coefficient  std. err.      z    P>|z|     [95% conf. interval]
-------------+----------------------------------------------------------------
     fta_wto |  -.2064625   .2064227    -1.00   0.317    -.6110436    .1981186
     ln_dist |          0  (omitted)
      contig |   .0691352   .1640386     0.42   0.673    -.2523745    .3906449
 comlang_off |          0  (omitted)
       _cons |   13.10739   .1680008    78.02   0.000     12.77812    13.43667
------------------------------------------------------------------------------

Absorbed degrees of freedom:
-----------------------------------------------------+
 Absorbed FE | Categories  - Redundant  = Num. Coefs |
-------------+---------------------------------------|
    exp_year |       306           1         305     |
    imp_year |       308           6         302     |
     pair_id |      2048        2048           0    *|
-----------------------------------------------------+
* = FE nested within cluster; treated as redundant for DoF computation
2 - Regression without country pair_id: ppmlhdfe tradeflow_baci fta_wto ln_dist contig comlang_off, a(exp_year imp_year) cluster (pair_id) nolog
Output without country pair_id fixed effects:
(dropped 2 observations that are either singletons or separated by a fixed effect)
warning: dependent variable takes very low values after standardizing (5.2246e-09)
Converged in 11 iterations and 50 HDFE sub-iterations (tol = 1.0e-08)
HDFE PPML regression                              No. of obs      =      9,332
Absorbing 2 HDFE groups                           Residual df     =      2,328
Statistics robust to heteroskedasticity           Wald chi2(4)    =     732.50
Deviance             =  248048316.2               Prob > chi2     =     0.0000
Log pseudolikelihood = -124060124.6               Pseudo R2       =     0.8487

Number of clusters (pair_id)=      2,329
                            (Std. err. adjusted for 2,329 clusters in pair_id)
------------------------------------------------------------------------------
             |               Robust
tradeflow_~i | Coefficient  std. err.      z    P>|z|     [95% conf. interval]
-------------+----------------------------------------------------------------
     fta_wto |   .7144648   .1703926     4.19   0.000     .3805013    1.048428
     ln_dist |  -.8836056   .1328718    -6.65   0.000     -1.14403   -.6231816
      contig |    .773755   .1805465     4.29   0.000     .4198903     1.12762
 comlang_off |    .670514   .1545234     4.34   0.000     .3676537    .9733744
       _cons |   17.56397   1.064159    16.51   0.000     15.47825    19.64968
------------------------------------------------------------------------------

Absorbed degrees of freedom:
-----------------------------------------------------+
 Absorbed FE | Categories  - Redundant  = Num. Coefs |
-------------+---------------------------------------|
    exp_year |       306           0         306     |
    imp_year |       308           6         302     |
Question: in regression 1 (with pair_id fixed effects), the variable comlang_off (dummy = 1 if the two countries share a common language) is omitted because of collinearity, and I am not sure why. It looks like the country-pair fixed effects are absorbing comlang_off. If so, can I use regression 2, without the country-pair (pair_id) fixed effects, without any concern about endogeneity?
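One way to check whether the pair fixed effects really absorb these variables is to see whether they vary within pair_id. Below is a minimal sketch, assuming pair_id identifies each country pair across years; varies_lang and varies_dist are hypothetical variable names introduced only for this check and are not in my dataset.

```stata
* Flag pairs in which the variable takes more than one value across years.
* varies_lang and varies_dist are hypothetical names used for illustration.
* Note: missing values sort last, so a pair with some missing values is flagged as varying.
bysort pair_id (comlang_off): generate byte varies_lang = comlang_off[1] != comlang_off[_N]
bysort pair_id (ln_dist): generate byte varies_dist = ln_dist[1] != ln_dist[_N]

* If a tabulation shows only 0, that variable is constant within every pair
tabulate varies_lang
tabulate varies_dist
```

If both flags are 0 everywhere, each variable is constant within a pair and is therefore an exact linear combination of the pair_id dummies, which would explain why it is dropped in regression 1.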
Thoughts and comments are welcome.
Thank you in advance