Dear community,
I am puzzled about which specification of marginal effects to use when the independent variable is categorical and has three possible values.
I am analysing survey data to measure how firms' subjective assessment of past economic developments (here, the availability of bank loans) affects their expectations for the future. I use an ordered logit model because the dependent variable, the expectation (exp_bankloan), takes the values -1 (situation will deteriorate), 0 (...remain stable), or 1 (...improve). The independent variable, the realization (real_bankloan), can also take three values: -1 (situation has deteriorated), 0 (...remained stable), or 1 (...improved). I also include firm controls such as size, sector and country. Because factor variables do not allow negative values, I recode the independent variable to the values 1, 2 and 3 (real_bankloan_recode).
Code:
    Variable |        Obs        Mean    Std. Dev.       Min        Max
-------------+---------------------------------------------------------
exp_bankloan |     98,802     .011771    .6245559         -1          1
real_bankl~n |     90,806   -.0258683     .622418         -1          1
real_bankl~e |     90,806    1.974132     .622418          1          3
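A minimal sketch of such a recode on toy data, assuming a simple shift by 2. That assumption is consistent with the summary statistics above (-.0258683 + 2 = 1.974132, same standard deviation), but the toy data, the label name real3 and the label texts are purely illustrative.

```stata
* Toy data only: three equally likely categories, not the survey data.
clear
set seed 12345
set obs 1000
gen real_bankloan = floor(3*runiform()) - 1      // values -1, 0, 1

* Shift by 2 so that the variable is non-negative and can be used with i.
gen real_bankloan_recode = real_bankloan + 2     // values 1, 2, 3

* Hypothetical value labels, so that margins output shows the categories
label define real3 1 "deteriorated" 2 "remained stable" 3 "improved"
label values real_bankloan_recode real3

* Check the mapping: every -1 should be 1, every 0 should be 2, every 1 should be 3
tabulate real_bankloan real_bankloan_recode
```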
As interpreting the ologit coefficients is not straightforward, I am considering marginal effects. However, I am not sure which specification of margins to use, given that my independent variable is neither continuous nor binary (0 or 1).
I tried three different specifications, but I cannot figure out which one is the right one to use or how they differ.
1) First, I use the factor specification for the independent variable:
Code:
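// 1 with factor specification i.
qui ologit exp_bankloan i.real_bankloan_recode i.size i.sector, vce(cluster permid)
margins , dydx(i.real_bankloan_recode) predict(outcome(-1)) atmeans post
--------------------------------------------------------------------------------------
                     |            Delta-method
                     |      dy/dx   Std. Err.      z    P>|z|     [95% Conf. Interval]
---------------------+----------------------------------------------------------------
real_bankloan_recode |
                  2  |  -.2923866   .0048704   -60.03   0.000    -.3019324   -.2828408
                  3  |  -.3933965   .0047711   -82.45   0.000    -.4027478   -.3840453
--------------------------------------------------------------------------------------

Here, the interpretation is clear: if a firm reports that the availability of bank loans "remained stable" (0, recoded as 2), its probability of expecting a deterioration is 29.2 percentage points lower than if it reports that the availability "deteriorated" (-1, recoded as 1), with the other covariates held at their means.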
2) However, technically one could also choose not to declare real_bankloan_recode as a factor variable:
Code:
// 2 without factor specification i.:
qui ologit exp_bankloan real_bankloan_recode i.size i.sector, vce(cluster permid)
margins real_bankloan_recode, predict(outcome(-1)) atmeans post
// -> does not work .factor 'real_bankloan_recode' not found in list of covariates
// this does not work.

qui ologit exp_bankloan real_bankloan_recode i.size i.sector, vce(cluster permid)
margins , dydx(real_bankloan_recode) predict(outcome(-1)) atmeans post
--------------------------------------------------------------------------------------
                     |            Delta-method
                     |      dy/dx   Std. Err.      z    P>|z|     [95% Conf. Interval]
---------------------+----------------------------------------------------------------
real_bankloan_recode |  -.1848426   .0020565   -89.88   0.000    -.1888733    -.180812
--------------------------------------------------------------------------------------
But I am puzzled about how to interpret this result, since the plain margins command (margins real_bankloan_recode) is not even possible for this specification of the variable.
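For reference, a sketch of what dydx() computes under the two specifications, written in standard ordered-logit notation. The symbols are shorthand introduced here, not part of the output: x is real_bankloan_recode, z collects the other covariates, \kappa_1 is the first cutpoint and \Lambda is the logistic distribution function.

```latex
% Assumed notation: x = real_bankloan_recode, z = other covariates,
% \kappa_1 = first cutpoint, \Lambda(t) = 1/(1+e^{-t}).
\begin{align*}
\text{continuous } x:\quad
  \Pr(y=-1 \mid x, z) &= \Lambda(\kappa_1 - \beta x - z'\gamma), \\
  \frac{\partial \Pr(y=-1 \mid x, z)}{\partial x}
    &= -\beta\,\Lambda(\kappa_1 - \beta x - z'\gamma)\,
       \bigl[1-\Lambda(\kappa_1 - \beta x - z'\gamma)\bigr]; \\[6pt]
\text{factor } x:\quad
  \Pr(y=-1 \mid x, z) &= \Lambda\bigl(\kappa_1 - \beta_2\,1[x=2] - \beta_3\,1[x=3] - z'\gamma\bigr), \\
  \Delta_k &= \Pr(y=-1 \mid x=k, z) - \Pr(y=-1 \mid x=1, z), \qquad k = 2, 3.
\end{align*}
```

With atmeans, both quantities are evaluated with the covariates at their sample means. In the continuous case that includes x itself, so -.1848 would be a slope at the mean of x rather than a change between two categories.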
3) One could calculate the margins at certain values of the independent variable:
Code:
// or:
qui ologit exp_bankloan real_bankloan_recode i.size i.sector, vce(cluster permid)
margins , at(real_bankloan_recode=(1 2 3)) predict(outcome(-1)) atmeans post
------------------------------------------------------------------------------
             |            Delta-method
             |     Margin   Std. Err.      z    P>|z|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
         _at |
          1  |   .4230018   .0037166   113.81   0.000     .4157173    .4302862
          2  |    .157907   .0013481   117.13   0.000     .1552648    .1605492
          3  |   .0457687   .0008784    52.11   0.000     .0440471    .0474903
------------------------------------------------------------------------------

qui ologit exp_bankloan real_bankloan_recode i.size i.sector, vce(cluster permid)
margins , at(real_bankloan_recode=(1)) predict(outcome(-1)) atmeans post
------------------------------------------------------------------------------
             |            Delta-method
             |     Margin   Std. Err.      z    P>|z|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
       _cons |   .4230018   .0037166   113.81   0.000     .4157173    .4302862
------------------------------------------------------------------------------

But here, I get totally different values. Given all these different specifications, I am a bit confused about which one is the most appropriate. Also, for the last command in 3), what does the constant (_cons) tell me?
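To put the three commands side by side, here is a minimal sketch on simulated data. Everything in it is an assumption made for illustration (the variable names x and y, the coefficient 0.8, the cutoffs 1 and 2.5); it only shows which kind of quantity each command returns, not which one is appropriate for the survey data.

```stata
* Toy data only: names and parameter values are made up for illustration.
clear
set seed 12345
set obs 5000
gen x = floor(3*runiform()) + 1                        // categorical regressor coded 1, 2, 3
gen u = runiform()
gen ystar = 0.8*x + ln(u/(1 - u))                      // latent index plus standard logistic error
gen y = cond(ystar < 1, -1, cond(ystar < 2.5, 0, 1))   // ordered outcome -1, 0, 1

* (1) x as a factor variable
quietly ologit y i.x
margins x, predict(outcome(-1))            // predicted Pr(y = -1) at each level of x
margins, dydx(x) predict(outcome(-1))      // discrete changes relative to the base level x = 1

* (2) x as a continuous variable
quietly ologit y x
margins, dydx(x) predict(outcome(-1)) atmeans   // slope of Pr(y = -1) in x at the mean of x

* (3) x continuous, evaluated at chosen values
margins, at(x=(1 2 3)) predict(outcome(-1))     // predicted Pr(y = -1) at x = 1, 2, 3 (levels)
```

In the output of 3) above, the differences between the rows are .157907 - .4230018 = -.2650948 and .0457687 - .4230018 = -.3772331. These differences, rather than the levels themselves, are the quantities that are roughly comparable to the two dy/dx rows in 1).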
Is anybody able to give a quick overview of the differences between these three specifications?
Thanks for taking the time to read my post. I really appreciate your help!
Carsten