Interpretation of coefficients in gravity model using PPML (dummies)

Ridwan Sheikh

Join Date: Apr 2021

Posts: 163
#91

27 Sep 2023, 23:49

Thank you very much Joao Santos Silva for your help.

Thanks and regards,
(Ridwan)
Comment
Andyx Zhang

Join Date: Dec 2023

Posts: 5
#92

28 Dec 2023, 11:30

Joao Santos Silva

Dear Professor Silva,

I apologize in advance if this question comes off as naive, as I am only a high school student getting into econometrics. Using the PPMLHDFE command with year fixed effects, I am investigating how recipient-country factors affect the zero-inflated, continuous, positive absolute amounts of foreign aid given by donor countries. From reading your previous responses in this thread, I have come to the following conclusions:

1. Log-log: Interpret the coefficient directly as a percentage change when the regressor is logged
2. Log-linear: calculate (e^b-1)*100% when the regressor is an index or dummy variable

However, I do not understand exactly why interpreting log-log and log-linear requires different sets of calculations. Aren't they both inherent in coefficient values, given the status of the PPML log-link function and regressors?

To illustrate, these are my results:

Iteration 1: deviance = 2.5387e+11 eps = . iters = 1 tol = 1.0e-04 min(eta) = -5.42 P
Iteration 2: deviance = 2.0741e+11 eps = 2.24e-01 iters = 1 tol = 1.0e-04 min(eta) = -7.42
Iteration 3: deviance = 1.9975e+11 eps = 3.84e-02 iters = 1 tol = 1.0e-04 min(eta) = -9.20
Iteration 4: deviance = 1.9887e+11 eps = 4.46e-03 iters = 1 tol = 1.0e-04 min(eta) = -10.08
Iteration 5: deviance = 1.9878e+11 eps = 4.10e-04 iters = 1 tol = 1.0e-04 min(eta) = -10.35
Iteration 6: deviance = 1.9878e+11 eps = 1.50e-05 iters = 1 tol = 1.0e-04 min(eta) = -10.39
Iteration 7: deviance = 1.9878e+11 eps = 4.53e-08 iters = 1 tol = 1.0e-05 min(eta) = -10.39 S
Iteration 8: deviance = 1.9878e+11 eps = 7.41e-13 iters = 1 tol = 1.0e-06 min(eta) = -10.39 S O
------------------------------------------------------------------------------------------------------------
(legend: p: exact partial-out s: exact solver h: step-halving o: epsilon below tolerance)
Converged in 8 iterations and 8 HDFE sub-iterations (tol = 1.0e-08)

HDFE PPML regression                              No. of obs      =        542
Absorbing 1 HDFE group                            Residual df     =        512
                                                  Wald chi2(10)   =     270.65
Deviance             =  1.98781e+11               Prob > chi2     =     0.0000
Log pseudolikelihood = -9.93903e+10               Pseudo R2       =     0.6693
---------------------------------------------------------------------------------------
| Robust
oofLike | Coefficient std. err. z P>|z| [95% conf. interval]
----------------------+----------------------------------------------------------------
ungaVoting | -.2929826 .7861342 -0.37 0.709 -1.833777 1.247812
taiwan | -1.833995 1.073804 -1.71 0.088 -3.938612 .2706208
ln_oresMetalsReal | .1760836 .0576731 3.05 0.002 .0630464 .2891208
ln_mineralProduction | .0143097 .0616158 0.23 0.816 -.106455 .1350743
lag_democracy | -.5238319 .1328634 -3.94 0.000 -.7842393 -.2634244
lag_corruptionControl | -1.015097 .2056114 -4.94 0.000 -1.418088 -.6121056
lag_polStability | .9343729 .2116334 4.42 0.000 .519579 1.349167
lag_debtGDP | -.0081825 .0074118 -1.10 0.270 -.0227094 .0063443
lag_ln_gdpCapita | .7254903 .2236881 3.24 0.001 .2870696 1.163911
lag_ln_population | .8607137 .1360158 6.33 0.000 .5941277 1.1273
_cons | -2.471449 2.422612 -1.02 0.308 -7.219681 2.276784
---------------------------------------------------------------------------------------

My interpretation is as follows:
1. For oresMetalsReal, which is logged, a 10% increase in the regressor is associated with a roughly 1.7% increase in the dependent variable
2. For Taiwan, which is a binary dummy indicator (1 or 0), I calculate (e^-1.8-1)*100% = -84%, i.e. an 84% decrease in the dependent variable
3. For polStability, which is a discrete index between -2.5 and 2.5, I calculate (e^.934-1)*100% = 154%, i.e. a 154% increase in the dependent variable for each unit increase in the independent variable

Are these interpretations correct, given that they differ drastically in magnitude? Could you also elaborate on why the calculations/interpretations of coefficients for semi-elasticities and elasticities are different, please?
Further, I have read your previous comments here: https://www.statalist.org/forums/for...ative-binomial. regarding the disutility of calculating partial effects for Poisson regressions with fixed effects. To confirm, is it due to the nonlinearity of GLM models and the incidental parameter problem with year fixed effects?

Could you also comment on directly reporting the incidence rate ratios (exp(b)) for regressors in the context of a multiplicative model (positive effect when greater than 1, negative when less than 1)? Is this applicable for both logged and non-logged covariates?

Kind regards,
Andy

Last edited by Andyx Zhang; 28 Dec 2023, 12:07.
Comment
Joao Santos Silva

Join Date: Apr 2014

Posts: 2995
#93

29 Dec 2023, 00:18

Dear Andyx Zhang,

I start by noting that I am not the author of the command ppmlhdfe; as detailed in the help file, the authors are Sergio Correia, Paulo Guimarães and Tom Zylkin.

Anyway, your interpretation of the coefficients appears to be correct. Just to be clear, in 1, a 1% increase in oresMetalsReal is associated with a 0.176% increase in the expectation of the dependent variable. The reason why the interpretation is different for variables that are logged is exactly because of that: if the variable enters the model in a different way, its effect takes a different form.
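As a rough check on the three interpretations discussed above, the arithmetic can be reproduced in a few lines of Python. This is only a sketch: the coefficients are copied from the output in #92, while the function names `elasticity_pct` and `semi_elasticity_pct` are made up for illustration and are not part of ppmlhdfe or Stata.

```python
import math


def elasticity_pct(beta, pct_increase):
    """Percent change in E[y|x] when a logged regressor's underlying
    variable rises by pct_increase percent (exact, not linearised)."""
    return ((1.0 + pct_increase / 100.0) ** beta - 1.0) * 100.0


def semi_elasticity_pct(beta, delta=1.0):
    """Percent change in E[y|x] when a regressor in levels (index or
    dummy) rises by delta units."""
    return (math.exp(beta * delta) - 1.0) * 100.0


# Coefficients copied from the PPML output in #92.
b_ores = 0.1760836      # ln_oresMetalsReal (logged regressor)
b_taiwan = -1.833995    # taiwan (0/1 dummy)
b_polstab = 0.9343729   # lag_polStability (index in levels)

print(round(elasticity_pct(b_ores, 1), 3))      # effect of a 1% increase
print(round(elasticity_pct(b_ores, 10), 2))     # effect of a 10% increase
print(round(semi_elasticity_pct(b_taiwan), 1))  # dummy switching 0 -> 1
print(round(semi_elasticity_pct(b_polstab), 1)) # one-unit increase in index
```

Under these assumptions the effects come out at roughly 0.18% for a 1% increase in oresMetalsReal (about 1.7% for a 10% increase), about -84% for the Taiwan dummy and about +155% for a one-unit increase in polStability. The large difference in magnitude mostly reflects the size of the change being considered: 1% of a regressor versus a full unit of an index or a switch of a dummy.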

Finally, indeed the computation of partial effects can be affected by the IPP and therefore it is better to avoid it (and it is generally not needed). I am afraid I do not work with IRRs so I will not comment on that.

Best wishes,

Joao
Comment
Andyx Zhang

Join Date: Dec 2023

Posts: 5
#94

29 Dec 2023, 02:13

Joao Santos Silva
Dear Professor Santos,

Thank you for that explanation. I also have an additional question. Referring to your comments here (https://www.statalist.org/forums/for...uated-at-means), I estimate a secondary probit model with year dummies to consider initial selection.

1. I have 542 total observations with 48 entities (countries) in each time period (year). There are 22 years. Is this problematic?
2. How would I interpret the coefficients of said probit model? Is it with marginal effects? I've included my results below. If you could provide the calculations/explanations/code for interpreting continuous logged variables and index or dummy variables, that would be wonderful.
3. Could you recommend some statistical diagnostics/code I could run on the PPML and probit models to check for model robustness and fit? My understanding is that traditional diagnostics don't work on non-linear models.

Thank you so much in advance for your help!

Kind regards,
Andy

Fitting comparison model:

Iteration 0: Log likelihood = -368.83397
Iteration 1: Log likelihood = -293.84043
Iteration 2: Log likelihood = -293.07702
Iteration 3: Log likelihood = -293.07649
Iteration 4: Log likelihood = -293.07649

Fitting full model:

rho = 0.0 Log likelihood = -293.07649
rho = 0.1 Log likelihood = -283.67416
rho = 0.2 Log likelihood = -282.8983
rho = 0.3 Log likelihood = -284.29889

Iteration 0: Log likelihood = -282.90118
Iteration 1: Log likelihood = -280.97241
Iteration 2: Log likelihood = -280.93819
Iteration 3: Log likelihood = -280.93817

Random-effects probit regression Number of obs = 542
Group variable: country_id Number of groups = 32

Random effects u_i ~ Gaussian Obs per group:
min = 2
avg = 16.9
max = 20

Integration method: mvaghermite Integration pts. = 12

Wald chi2(27) = 76.40
Log likelihood = -280.93817 Prob > chi2 = 0.0000

---------------------------------------------------------------------------------------
oofLike_binary | Coefficient Std. err. z P>|z| [95% conf. interval]
----------------------+----------------------------------------------------------------
ln_oresMetalsReal | -.0142638 .0438314 -0.33 0.745 -.1001718 .0716442
ln_mineralProduction | .1234686 .0505275 2.44 0.015 .0244366 .2225006
lag_democracy | .0555519 .1254174 0.44 0.658 -.1902616 .3013654
lag_corruptionControl | -.0117526 .241321 -0.05 0.961 -.4847331 .461228
lag_polStability | -.1830842 .1744381 -1.05 0.294 -.5249766 .1588082
lag_debtGDP | -.0028961 .0033415 -0.87 0.386 -.0094453 .003653
lag_ln_gdpCapita | .1538641 .2024738 0.76 0.447 -.2429773 .5507055
lag_ln_population | .272665 .1289058 2.12 0.034 .0200143 .5253157
|
year |
2003 | .5410429 .4741918 1.14 0.254 -.388356 1.470442
2004 | .4515605 .4798015 0.94 0.347 -.4888332 1.391954
2005 | .6890558 .4746356 1.45 0.147 -.2412129 1.619324
2006 | .2662411 .4939528 0.54 0.590 -.7018886 1.234371
2007 | .80612 .4694938 1.72 0.086 -.1140709 1.726311
2008 | .9448596 .4610701 2.05 0.040 .0411788 1.84854
2009 | 1.292398 .4609122 2.80 0.005 .3890265 2.195769
2010 | .8754572 .4833168 1.81 0.070 -.0718263 1.822741
2011 | 1.461005 .4767169 3.06 0.002 .5266572 2.395353
2012 | 1.464459 .4746362 3.09 0.002 .534189 2.394729
2013 | 1.254003 .4737252 2.65 0.008 .3255183 2.182487
2014 | 1.401906 .4744211 2.95 0.003 .4720579 2.331755
2015 | 1.199736 .4557932 2.63 0.008 .3063983 2.093075
2016 | 1.106459 .4577559 2.42 0.016 .2092739 2.003644
2017 | 1.291948 .4591265 2.81 0.005 .3920761 2.191819
2018 | 1.527548 .4628055 3.30 0.001 .6204663 2.43463
2019 | 1.698583 .4723852 3.60 0.000 .7727253 2.624441
2020 | .9000742 .469809 1.92 0.055 -.0207346 1.820883
2021 | 1.152581 .4984003 2.31 0.021 .1757343 2.129428
|
_cons | -8.23449 2.53318 -3.25 0.001 -13.19943 -3.269548
----------------------+----------------------------------------------------------------
/lnsig2u | -1.251739 .4453194 -2.124549 -.3789294
----------------------+----------------------------------------------------------------
sigma_u | .5347961 .1190775 .3456686 .8274019
rho | .2223992 .0770126 .1067336 .4063851
---------------------------------------------------------------------------------------
LR test of rho=0: chibar2(01) = 24.28 Prob >= chibar2 = 0.000
Comment
Joao Santos Silva

Join Date: Apr 2014

Posts: 2995
#95

29 Dec 2023, 03:40

Dear Andyx Zhang,

I am afraid I do not understand the purpose of estimating a probit here. It may be better to stick to the Poisson regression.

Best wishes,

Joao
Comment
Andyx Zhang

Join Date: Dec 2023

Posts: 5
#96

29 Dec 2023, 08:44

Joao Santos Silva

It is a secondary model that estimates the dichotomous yes/no outcome of whether a country is eligible for aid allocation or not. Given that, how would I proceed with my interpretation/diagnostics?

Kind regards,
Andy
Comment
Joao Santos Silva

Join Date: Apr 2014

Posts: 2995
#97

30 Dec 2023, 02:49

Dear Andyx Zhang,

I am afraid I would need to know much more about what you are doing to be able to comment on this.

Best wishes,

Joao
Comment
Rethabile Molapo

Join Date: Jan 2024

Posts: 10
#98

08 Jan 2024, 10:00

Dear Professor Joao Santos Silva
I am attempting to run a ppml regression with the variables exports, gdp, common official language, distance, and common trade agreement.

I am using observations from 2022 for 167 countries.

I have used the following command: ppml exports gdp commonofficial language distance common trade agreement

Considering that this model has fixed effects, is it enough to only use the ppml command?
Could you also assist me with commands for the Park test and the Ramsey RESET test?

I hope you will assist.
Comment
Joao Santos Silva

Join Date: Apr 2014

Posts: 2995
#99

08 Jan 2024, 23:42

Dear Rethabile Molapo,

I suggest you use the ppmlhdfe command instead of ppml. The results should be exactly the same, but ppmlhdfe is much faster.

I would not perform the Park test as it is not needed. For the RESET, please see the example here; it is for the poisson command, but for ppmlhdfe it is not very different.

Best wishes,

Joao
Comment
Rethabile Molapo

Join Date: Jan 2024

Posts: 10
#100

09 Jan 2024, 01:32

Thank you Prof

Last edited by Rethabile Molapo; 09 Jan 2024, 01:39.
1 like
Comment
Gerome Retamal

Join Date: Apr 2024

Posts: 4
#101

21 May 2024, 12:58

Hello! A question about interpretation: I have exports (in levels) as the dependent variable and the log of tariffs as an independent variable. How do we interpret the coefficient on tariffs when the PPML estimator is used?

A 1% increase in tariffs leads to an xx% decrease in exports, OR
a 1 percentage point increase in tariffs leads to an xx% decrease in exports?

Thank you very much for the help!
Comment
Joao Santos Silva

Join Date: Apr 2014

Posts: 2995
#102

21 May 2024, 14:53

If tariffs are in logs the coefficient is an elasticity, so the first option is the correct one.
Comment
Noemi Seng

Join Date: Jan 2024

Posts: 90
#103

24 Jul 2024, 09:33

Originally posted by Joao Santos Silva

Dear OA Stata,

The coefficients on logged regressors are elasticities and there is no need to transform those. For regressors not in logs, the semi-elasticity is given by 100*(exp(beta) - 1)%. This is negative for negative beta and positive for positive beta; it is also approximately equal to 100*(beta)% for beta close to zero.

About #2, note that it should be (e^(-0.4)-1)*100 = -0.33%.

Best wishes,

Joao

Dear Joao Santos Silva

I was just reading through this thread and I'm a bit confused. I have learned in my econometrics classes that in a regression with the dependent variable in levels, the dummy coefficient doesn't have to be transformed with (exp(beta)-1)*100%. The coefficient just indicates the difference in the dependent variable between the two groups that are separated with it in the unit of the dependent variable. The transformation would only be necessary if our dependent variable was in log (which we don't do with ppml). Also, the coefficients of logged regressors don't represent elasticities (as would be the case with a logged DV), but only semi-elasticities.

Maybe I'm just confusing things but that's how I learned it in class - could you shed some more light on this?

Thank you so much.
Best
Noemi
Comment
Joao Santos Silva

Join Date: Apr 2014

Posts: 2995
#104

24 Jul 2024, 10:39

Dear Noemi Seng,

What you learned applies to linear models, whereas PPML estimates an exponential model. Therefore, the interpretation of PPML estimates is exactly as if the dependent variable was in logs.
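To spell out the reasoning behind this, here is a short sketch of the derivation. The notation is an assumption made for illustration (a logged regressor $x_1$, a regressor in levels $x_2$ and a dummy $d$); it does not come from any specific model in this thread.

```latex
% PPML fits an exponential conditional mean, not a linear one:
\[
  E[y \mid x] = \exp\!\left(\beta_0 + \beta_1 \ln x_1 + \beta_2 x_2 + \beta_3 d\right)
\]
% Taking logs of the conditional mean gives a linear index:
\[
  \ln E[y \mid x] = \beta_0 + \beta_1 \ln x_1 + \beta_2 x_2 + \beta_3 d
\]
% Logged regressor: the coefficient is an elasticity.
\[
  \frac{\partial \ln E[y \mid x]}{\partial \ln x_1} = \beta_1
\]
% Regressor in levels: the coefficient is a semi-elasticity
% (approximately 100*beta_2 percent per unit, for small beta_2).
\[
  \frac{\partial \ln E[y \mid x]}{\partial x_2} = \beta_2
\]
% Dummy: the exact proportional change when d goes from 0 to 1.
\[
  \frac{E[y \mid d = 1] - E[y \mid d = 0]}{E[y \mid d = 0]} = e^{\beta_3} - 1
\]
```

The dependent variable itself is never logged; it is the log of its conditional expectation that is linear in the regressors, which is why the coefficients read the same way as in a log-linear model.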

Best wishes,

Joao
Comment
Noemi Seng

Join Date: Jan 2024

Posts: 90
#105

25 Jul 2024, 04:33

Originally posted by Joao Santos Silva

Dear Julia Veje,

If I am not mistaken, (e^-0,4-1)*100 = -0.33%, so the effect is negative. More generally, the coefficient and the effect always have the same sign because e^0-1=0.

Best wishes,

Joao

Dear Joao Santos Silva ,

Thank you so much for this key clarification. Just one remaining question: in your calculation above, wouldn't it be -33% instead of -0.33%?
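For what it is worth, the figure in question is easy to verify. The snippet below is a minimal sketch with a hypothetical helper name (`exact_pct_effect`); it also compares the exact effect with the 100*beta approximation mentioned in the quoted post.

```python
import math


def exact_pct_effect(beta):
    """Exact percent change in E[y|x] for a one-unit change in a
    regressor that enters the exponential model in levels."""
    return (math.exp(beta) - 1.0) * 100.0


# The coefficient discussed in the quoted post.
beta = -0.4
print(round(exact_pct_effect(beta), 2))  # exact effect, in percent
print(100.0 * beta)                      # linear approximation, in percent

# The approximation is only close when beta is near zero.
small_beta = -0.04
print(round(exact_pct_effect(small_beta), 2), 100.0 * small_beta)
```

With beta = -0.4 the exact effect is about -33%, not -0.33% (e^-0.4 - 1 is roughly -0.33 before multiplying by 100), while the linear approximation would give -40%. With beta = -0.04 the two are nearly identical, at about -3.9% versus -4%.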

Best wishes
Noemi
Comment
