I examine the number of co-publications over the last 31 years by six countries that are members of a regional organization. I'm particularly interested in similarities and differences in the countries' co-authorship patterns, especially in those variables that reflect the relation to the co-authors' countries, such as trade volume or geographical distance. For this purpose, I've decided to apply a population-averaged negative binomial model (-xtnbreg, pa difficult vce(robust)-), following the field-specific literature's recommendation for overdispersed data.
I'm currently at an impasse, however, because including any of these "pair variables" (or virtual proximity variables) results in "no convergence".
Now, this issue is certainly not new, and I have found helpful explanations and advice in previous forum threads [1,2,3,4,5], but I'm not sure I understand all of it properly, and I'm uncertain which of them apply to my specific case. I have summarized the potential issues, their recommended remedies and what I've tried so far to give you a better overview of my current understanding. Please let me know if you see something that I got wrong.
1. It may be the case that there is high collinearity between independent variables [5]. Some year dummies at the end of the time period are dropped due to collinearity, but checking correlations with -pwcorr- shows relatively weak correlations between the pair variables (<0.4), while they are higher between the variables that reflect the domestic dimension (where convergence is achieved). "no convergence" is also an issue if I exclude the year dummies, so I don't think collinearity is the cause.
2. There is the possibility that the maximum likelihood estimator for my model "does not exist" for my data [1]. This seems plausible, as my data do have a large number of 0 values in the dependent variable and in the "pair" independent variables. A potential remedy would be to start with a Poisson regression and plug the estimates into the negative binomial regression [2]. I wasn't sure which model I should use, so I've just run a population-averaged Poisson regression (-xtpoisson, pa-), but it results in "no convergence" as well. I assume using these estimates probably won't help either, right?
3. In case of using an interaction, the model including the interaction may not be identified by the data [4]. I'm not using interactions so that specific issue should not apply here.
4. My model is insufficient and I should try something different.
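The checks behind points 1 and 2 can be sketched roughly as follows. Variable names are taken from the regression output at the end of this post. Treating rtot_trade, colotrad and langcom as the "pair variables" is an assumption made for this sketch, and ever_collab and one_per_target are made-up helper names.

```stata
* Sketch only; assumes the data are already -xtset- with target as panel variable.

* Point 1: pairwise correlations among the (assumed) pair variables
pwcorr rtot_trade colotrad langcom, obs sig

* Point 2: how common are zeros in the outcome and in the pair variables?
count if collab_weight == 0
count if rtot_trade == 0
tabulate colotrad langcom, missing

* Flag partners (targets) with at least one non-zero outcome in any year
egen byte ever_collab = max(collab_weight > 0 & !missing(collab_weight)), by(target)

* Count partners rather than partner-years
egen byte one_per_target = tag(target)
tabulate ever_collab if one_per_target
```

Targets with ever_collab equal to 0 never collaborate within the sample; -xtpoisson, fe- drops such all-zero groups from estimation.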
I've tried the -difficult- option to change the stepping during the iterations [4], but to no avail. I have also tried -xtpoisson, r fe- as a fallback option [3], with the same result. What I haven't thoroughly tried so far is another maximization technique, as I lack a proper understanding of the particularities of the different techniques.
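One variation that could be tried next is sketched here, without any claim that it resolves the problem. In the iteration log at the end of this post the tolerance moves between roughly .04 and .9 without settling, so the sketch swaps the exchangeable working correlation for an independent one and raises the iteration limit. corr() and iterate() are options of -xtnbreg, pa-; the value 300 is arbitrary.

```stata
* Sketch only: same specification as in the output at the end of this post,
* but with an independent working correlation and a higher iteration limit.
xtnbreg collab_weight rtot_trade gdp_pc tertenrol_epol trade_percgdp ///
    mobcell100 colotrad langcom i.year,                              ///
    pa corr(independent) vce(robust) iterate(300)
```

If this version converges, comparing its coefficients with the last iterate of the exchangeable fit would indicate whether the working correlation is what drives the problem.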
I should mention that I had issues with an empty Wald chi² statistic, which I attribute to a scaling problem, as I was able to fix it by re-scaling the problematic pair variables.
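For reference, rescaling of this kind can be sketched as follows. The divisors and the _s suffix are illustrative assumptions, not necessarily the ones I used; they are chosen only so that very small coefficients, such as the 2.60e-09 on rtot_trade in the output at the end of this post, move closer to order one.

```stata
* Sketch only: divisors and new variable names are illustrative assumptions
summarize rtot_trade gdp_pc tertenrol_epol

* Dividing a regressor by c multiplies its coefficient by c
generate double rtot_trade_s = rtot_trade / 1e9
generate double tertenrol_s  = tertenrol_epol / 1e6
generate double gdp_pc_s     = gdp_pc / 1000
```

The rescaled variables would then replace the originals in the model; fit and z statistics are unaffected, only the units of the coefficients change.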
Do you have some recommendations on possible next steps?
I've attached an example of the regression and copied the output below.
Code:
. xtnbreg collab_weight rtot_trade gdp_pc tertenrol_epol trade_percgdp mobcell100 colotrad langcom i.year, pa difficult vce(robust)
note: 2015.year omitted because of collinearity
note: 2016.year omitted because of collinearity
note: 2017.year omitted because of collinearity
note: 2018.year omitted because of collinearity

Iteration 1: tolerance = .31055186
Iteration 2: tolerance = .07928357
Iteration 3: tolerance = .08383659
Iteration 4: tolerance = .04340788
Iteration 5: tolerance = .22333106
Iteration 6: tolerance = .20362621
Iteration 7: tolerance = .52943928
Iteration 8: tolerance = .75065012
Iteration 9: tolerance = .08511348
Iteration 10: tolerance = .10210511
Iteration 11: tolerance = .06861638
Iteration 12: tolerance = .04175124
Iteration 13: tolerance = .57174736
Iteration 14: tolerance = .83240469
Iteration 15: tolerance = .13318422
Iteration 16: tolerance = .07155872
Iteration 17: tolerance = .07823117
Iteration 18: tolerance = .07148965
Iteration 19: tolerance = .03836049
Iteration 20: tolerance = .5207436
Iteration 21: tolerance = .77286907
Iteration 22: tolerance = .08953199
Iteration 23: tolerance = .09903163
Iteration 24: tolerance = .07404575
Iteration 25: tolerance = .03569057
Iteration 26: tolerance = .37159328
Iteration 27: tolerance = .54720453
Iteration 28: tolerance = .14373912
Iteration 29: tolerance = .03682585
Iteration 30: tolerance = .38592322
Iteration 31: tolerance = .57121807
Iteration 32: tolerance = .14028013
Iteration 33: tolerance = .04100945
Iteration 34: tolerance = .22680789
Iteration 35: tolerance = .25970474
Iteration 36: tolerance = .09572691
Iteration 37: tolerance = .11652831
Iteration 38: tolerance = .66254531
Iteration 39: tolerance = .77250927
Iteration 40: tolerance = .25295589
Iteration 41: tolerance = .05215284
Iteration 42: tolerance = .04514844
Iteration 43: tolerance = .0596035
Iteration 44: tolerance = .07818158
Iteration 45: tolerance = .07041597
Iteration 46: tolerance = .03923707
Iteration 47: tolerance = .55936823
Iteration 48: tolerance = .81969253
Iteration 49: tolerance = .12112588
Iteration 50: tolerance = .07921637
Iteration 51: tolerance = .08010801
Iteration 52: tolerance = .06113941
Iteration 53: tolerance = .08624978
Iteration 54: tolerance = .63494462
Iteration 55: tolerance = .85733655
Iteration 56: tolerance = .21802994
Iteration 57: tolerance = .04431795
Iteration 58: tolerance = .05111883
Iteration 59: tolerance = .07124833
Iteration 60: tolerance = .07989025
Iteration 61: tolerance = .04704872
Iteration 62: tolerance = .2078943
Iteration 63: tolerance = .13467526
Iteration 64: tolerance = .82616843
Iteration 65: tolerance = .57358112
Iteration 66: tolerance = .31724385
Iteration 67: tolerance = .39922756
Iteration 68: tolerance = .18022887
Iteration 69: tolerance = .10210485
Iteration 70: tolerance = .07338091
Iteration 71: tolerance = .06084837
Iteration 72: tolerance = .05441135
Iteration 73: tolerance = .05120285
Iteration 74: tolerance = .07191157
Iteration 75: tolerance = .07959718
Iteration 76: tolerance = .0462729
Iteration 77: tolerance = .21305515
Iteration 78: tolerance = .15071179
Iteration 79: tolerance = .89146622
Iteration 80: tolerance = .6153406
Iteration 81: tolerance = .19964479
Iteration 82: tolerance = .05179655
Iteration 83: tolerance = .32744225
Iteration 84: tolerance = .55548529
Iteration 85: tolerance = .30313567
Iteration 86: tolerance = .12112433
Iteration 87: tolerance = .07585909
Iteration 88: tolerance = .06068375
Iteration 89: tolerance = .05399522
Iteration 90: tolerance = .05320866
Iteration 91: tolerance = .07358059
Iteration 92: tolerance = .07842722
Iteration 93: tolerance = .04223677
Iteration 94: tolerance = .22502611
Iteration 95: tolerance = .23854644
Iteration 96: tolerance = .19215449
Iteration 97: tolerance = .23957639
Iteration 98: tolerance = .20948912
Iteration 99: tolerance = .26024878
Iteration 100: tolerance = .10269846

GEE population-averaged model                   Number of obs     =      3,161
Group variable:                     target      Number of groups  =        109
Link:                                  log      Obs per group:
Family:             negative binomial(k=1)                    min =         29
Correlation:                  exchangeable                    avg =       29.0
                                                              max =         29
                                                Wald chi2(31)     =   10794.89
Scale parameter:                         1      Prob > chi2       =     0.0000

                                 (Std. Err. adjusted for clustering on target)
--------------------------------------------------------------------------------
               |             Semirobust
 collab_weight |      Coef.   Std. Err.      z    P>|z|     [95% Conf. Interval]
---------------+----------------------------------------------------------------
    rtot_trade |   2.60e-09   6.37e-09     0.41   0.683    -9.88e-09    1.51e-08
        gdp_pc |   .0000971   .0000247     3.94   0.000     .0000487    .0001454
tertenrol_epol |   2.22e-06   4.01e-07     5.53   0.000     1.43e-06    3.01e-06
 trade_percgdp |  -.0183561   .0116593    -1.57   0.115    -.0412079    .0044958
    mobcell100 |   .0073132   .0011522     6.35   0.000     .0050549    .0095715
      colotrad |   1.594459   .5851681     2.72   0.006     .4475504    2.741367
       langcom |   1.509419   .6002954     2.51   0.012     .3328614    2.685976
--------------------------------------------------------------------------------
          year |
          1991 |   .2856466   .3194495     0.89   0.371    -.3404629    .9117561
          1992 |  -.0231025   .2237049    -0.10   0.918     -.461556     .415351
          1993 |   .4922213   .1874601     2.63   0.009     .1248063    .8596362
          1994 |   .4741249   .2524005     1.88   0.060     -.020571    .9688208
          1995 |   .4691006   .2529398     1.85   0.064    -.0266523    .9648535
          1996 |   .4894541   .1945354     2.52   0.012     .1081717    .8707365
          1997 |    .927893   .2780302     3.34   0.001     .3829638    1.472822
          1998 |   1.250811   .2103598     5.95   0.000     .8385137    1.663109
          1999 |   1.082485   .2243396     4.83   0.000     .6427869    1.522182
          2000 |     .96179   .2667751     3.61   0.000     .4389204     1.48466
          2001 |   .9312558   .1905119     4.89   0.000     .5578593    1.304652
          2002 |   .8492289   .1903188     4.46   0.000     .4762109    1.222247
          2003 |   .9112329   .2544881     3.58   0.000     .4124454     1.41002
          2004 |   .6635822   .2677673     2.48   0.013      .138768    1.188396
          2005 |   .2132622   .2759962     0.77   0.440    -.3276804    .7542048
          2006 |   .2055204   .3177095     0.65   0.518    -.4171788    .8282196
          2007 |   .0070632   .3180742     0.02   0.982    -.6163507    .6304771
          2008 |  -.4258888   .2031861    -2.10   0.036    -.8241263   -.0276514
          2009 |  -.1525135   .2025328    -0.75   0.451    -.5494704    .2444435
          2010 |  -.2249248    .181555    -1.24   0.215     -.580766    .1309164
          2011 |  -.2851735   .1338686    -2.13   0.033    -.5475511   -.0227958
          2012 |    -.54935   .0686513    -8.00   0.000    -.6839041   -.4147959
          2013 |  -.6197236   .0624535    -9.92   0.000    -.7421302    -.497317
          2014 |  -.4995681   .0395978   -12.62   0.000    -.5771784   -.4219578
          2015 |          0  (omitted)
          2016 |          0  (omitted)
          2017 |          0  (omitted)
          2018 |          0  (omitted)
               |
         _cons |  -.7398958   .8294814    -0.89   0.372    -2.365649    .8858579
--------------------------------------------------------------------------------
convergence not achieved
r(430);
[1] https://www.statalist.org/forums/for...binomial-model
[2] https://www.statalist.org/forums/for...sson-estimates
[3] https://www.statalist.org/forums/for...-fixed-effects
[4] https://www.statalist.org/forums/for...ial-regression
[5] https://www.stata.com/statalist/arch.../msg00288.html