Dear community,
I think I have come across an issue that has not been discussed in the forum but is really important.
As I am conducting an analysis using an ordered logistic model, I am interested in calculating the margins/marginal effects to quantify the results.
However, using the -margins- command, I find that I get different results for what seems to be the same calculation of the margins.
Let me show you a quick example of what I mean. I regress expectations of a firm's growth prospects on its current realisation and a set of dummies. Both expectations and realisation are categorical variables that take on three different values: -1 (deteriorate), 0 (remain stable), or 1 (improve). Because factor variables do not allow negative values, I recode realisation to 1 2 3.
Code:
    Variable |        Obs        Mean    Std. Dev.       Min        Max
-------------+---------------------------------------------------------
 expectation |    105,693    .0208339     .622453         -1          1
 realisation |     97,551    1.987442    .6193167          1          3
I first run the regression treating realisation as continuous:
Code:
qui ologit expectation realisation i.size i.sector, vce(cluster permid)
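When I run the margins, I have two options: treating the independent variable as continuous or as categorical (factor variable).
First, I go with the first option and calculate the margins for expectation = -1, using several specifications: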
1) Calculating the margin of realisation at the mean of realisation:
Code:
margins, at(realisation) predict(outcome(-1))
Expression   : Pr(expectation==-1), predict(outcome(-1))
at           : realisation     =    1.992652 (mean)
------------------------------------------------------------------------------
             |            Delta-method
             |     Margin   Std. Err.      z    P>|z|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
       _cons |    .155702   .0012985   119.91   0.000     .1531571    .1582469
------------------------------------------------------------------------------
2) Calculating the margin of realisation at the mean of all variables:
Code:
margins, at(realisation) atmeans predict(outcome(-1))
Expression   : Pr(expectation==-1), predict(outcome(-1))
at           : realisation     =    1.992652 (mean)
               1.size          =    .3245139 (mean)
               2.size          =    .3193684 (mean)
               3.size          =    .2653454 (mean)
               4.size          =    .0907724 (mean)
               1.sector        =    .2788699 (mean)
               2.sector        =    .1064883 (mean)
               3.sector        =    .2574927 (mean)
               4.sector        =     .357149 (mean)
------------------------------------------------------------------------------
             |            Delta-method
             |     Margin   Std. Err.      z    P>|z|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
       _cons |   .1553722    .001295   119.97   0.000     .1528339    .1579104
------------------------------------------------------------------------------
3) Now, I use the "over" option to predict the margins for all values of realisation:
Code:
margins, over(realisation) predict(outcome(-1))
Expression   : Pr(expectation==-1), predict(outcome(-1))
over         : realisation
------------------------------------------------------------------------------
             |            Delta-method
             |     Margin   Std. Err.      z    P>|z|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
 realisation |
          1  |   .4234504   .0036308   116.63   0.000     .4163341    .4305668
          2  |   .1544277   .0012941   119.33   0.000     .1518912    .1569642
          3  |   .0432777   .0008042    53.82   0.000     .0417016    .0448539
------------------------------------------------------------------------------
4) Lastly, I manually predict the margins for all values of realization:
Code:
margins, at(realisation=(1 2 3)) predict(outcome(-1))
Expression   : Pr(expectation==-1), predict(outcome(-1))
1._at        : realisation     =           1
2._at        : realisation     =           2
3._at        : realisation     =           3
------------------------------------------------------------------------------
             |            Delta-method
             |     Margin   Std. Err.      z    P>|z|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
         _at |
          1  |    .419882   .0036298   115.68   0.000     .4127678    .4269963
          2  |   .1543755   .0012932   119.37   0.000     .1518409    .1569102
          3  |   .0439714   .0008218    53.51   0.000     .0423608     .045582
------------------------------------------------------------------------------
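A possible reading of the gap between (3) and (4), as far as the margins documentation suggests: over() averages the predicted probabilities within the subsample that actually has each value of realisation, whereas at() sets realisation to the given value for every observation and averages over the whole estimation sample. The two agree only if the other covariates are distributed identically across the realisation groups. Below is a minimal Python sketch of that arithmetic on made-up numbers. The cut point, the coefficients, the second covariate z, and the nine data rows are all hypothetical, and this is not Stata's implementation.

```python
import math

def pr_lowest(x, z, cut=0.0, b_x=1.2, b_z=0.8):
    """Ordered-logit style Pr(y = lowest category) = logistic(cut - index).

    cut, b_x and b_z are hypothetical values chosen for illustration.
    """
    return 1.0 / (1.0 + math.exp(-(cut - b_x * x - b_z * z)))

def mean(values):
    values = list(values)
    return sum(values) / len(values)

# Made-up sample of (x, z) pairs; z tends to be higher when x is higher.
data = [(1, -1.0), (1, -0.5), (1, 0.0),
        (2, -0.5), (2, 0.0), (2, 0.5),
        (3, 0.0), (3, 0.5), (3, 1.0)]

def over_style(level):
    """Average the predictions within the subsample where x == level."""
    return mean(pr_lowest(x, z) for x, z in data if x == level)

def at_style(level):
    """Set x = level for every observation, keep z as observed, then average."""
    return mean(pr_lowest(level, z) for _, z in data)

for level in (1, 2, 3):
    print(level, round(over_style(level), 4), round(at_style(level), 4))
```

In this toy sample the two columns differ at every level because z is correlated with x, so the subsample average and the whole-sample counterfactual average are taken over different z values.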
If I now specify the independent variable as a categorical variable, running the regression:
Code:
qui ologit expectation i.realisation i.size i.sector, vce(cluster permid)
I then get the following:
5) Calculating the margin of realisation at the mean of realisation:
Code:
margins, at(realisation) predict(outcome(-1))
Expression   : Pr(expectation==-1), predict(outcome(-1))
at           : 1.realisat~n    =    .1966162 (mean)
               2.realisat~n    =    .6141154 (mean)
               3.realisat~n    =    .1892684 (mean)
------------------------------------------------------------------------------
             |            Delta-method
             |     Margin   Std. Err.      z    P>|z|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
       _cons |    .155136   .0013074   118.66   0.000     .1525736    .1576984
------------------------------------------------------------------------------
6) Calculating the margin of realisation at the mean of all variables:
Code:
margins, at(realisation) atmeans predict(outcome(-1))
Expression   : Pr(expectation==-1), predict(outcome(-1))
at           : 1.realisat~n    =    .1966162 (mean)
               2.realisat~n    =    .6141154 (mean)
               3.realisat~n    =    .1892684 (mean)
               1.size          =    .3245139 (mean)
               2.size          =    .3193684 (mean)
               3.size          =    .2653454 (mean)
               4.size          =    .0907724 (mean)
               1.sector        =    .2788699 (mean)
               2.sector        =    .1064883 (mean)
               3.sector        =    .2574927 (mean)
               4.sector        =     .357149 (mean)
------------------------------------------------------------------------------
             |            Delta-method
             |     Margin   Std. Err.      z    P>|z|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
       _cons |   .1548055   .0013037   118.75   0.000     .1522503    .1573606
------------------------------------------------------------------------------
7) Now, I use the "over" option to predict the margins for all values of realisation:
Code:
margins, over(i.realisation) predict(outcome(-1))
Expression   : Pr(expectation==-1), predict(outcome(-1))
over         : realisation
------------------------------------------------------------------------------
             |            Delta-method
             |     Margin   Std. Err.      z    P>|z|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
 realisation |
          1  |   .4454591   .0045552    97.79   0.000      .436531    .4543871
          2  |   .1466192   .0014192   103.31   0.000     .1438377    .1494007
          3  |   .0468167   .0009317    50.25   0.000     .0449907    .0486427
------------------------------------------------------------------------------
8) Lastly, I manually predict the margins for all values of realization:
Code:
margins, at(realisation=(1 2 3)) predict(outcome(-1))
Expression   : Pr(expectation==-1), predict(outcome(-1))
1._at        : realisation     =           1
2._at        : realisation     =           2
3._at        : realisation     =           3
------------------------------------------------------------------------------
             |            Delta-method
             |     Margin   Std. Err.      z    P>|z|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
         _at |
          1  |   .4418217   .0045592    96.91   0.000     .4328858    .4507576
          2  |   .1465671   .0014182   103.35   0.000     .1437875    .1493468
          3  |   .0475713   .0009506    50.04   0.000     .0457081    .0494345
------------------------------------------------------------------------------
These results confuse me, because several of these specifications should, as far as I can see, yield the same margins for realisation. For example:
1) Why do I get different results when calculating margins with the "over" option and by manually setting all the values ((3) and (4))?
2) Why do the margins at a given value of realisation differ between the continuous and the i.realisation specifications ((3) and (4) versus (7) and (8))?
3) Lastly, in specification (2), I predict the margins with all other variables in the regression set at their means, but what happens to those variables in specification (1)?
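On question 3, the "at" legend in the output for (1) lists only realisation, which suggests that size and sector stay at their observed values there and that the predicted probabilities are then averaged over the estimation sample, whereas atmeans in (2) computes a single prediction at the covariate means. Because the probability is a nonlinear function of the index, an average of predictions generally differs from a prediction at the averages. The Python sketch below illustrates only that arithmetic. The cut point, the coefficients, the second covariate z, and the data are hypothetical, and this is not Stata's implementation.

```python
import math

def pr_lowest(x, z, cut=0.0, b_x=1.2, b_z=0.8):
    """Ordered-logit style Pr(y = lowest category) = logistic(cut - index).

    cut, b_x and b_z are hypothetical values chosen for illustration.
    """
    return 1.0 / (1.0 + math.exp(-(cut - b_x * x - b_z * z)))

def mean(values):
    values = list(values)
    return sum(values) / len(values)

# Made-up sample of (x, z) pairs.
data = [(1, -1.0), (1, -0.5), (1, 0.0),
        (2, -0.5), (2, 0.0), (2, 0.5),
        (3, 0.0), (3, 0.5), (3, 1.0)]

x_bar = mean(x for x, _ in data)
z_bar = mean(z for _, z in data)

# Like specification (1): x fixed at its mean, z left as observed,
# one prediction per observation, then the average of those predictions.
avg_of_predictions = mean(pr_lowest(x_bar, z) for _, z in data)

# Like specification (2): every covariate at its mean, a single prediction.
prediction_at_means = pr_lowest(x_bar, z_bar)

print(round(avg_of_predictions, 4), round(prediction_at_means, 4))
```

With these made-up numbers the average of predictions is the larger of the two, since the logistic curve is convex in the region where the probabilities are small.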
In the end, I get a different predicted margin for each specification, and I do not understand how Stata calculates these values in the background.
Can anybody shed some light on this?
Thanks a lot
Carsten
(This post builds on an earlier post of mine about marginal effects: https://www.statalist.org/forums/for...ossible-values)