How do I interpret insignificant levels of categorical predictors in binomial logistic regression?

Melissa Fiona

Join Date: Jul 2021
Posts: 7

08 Jul 2021, 06:17

Hello,

I have a problem with interpreting the results of my binomial logistic regression.
I am investigating the relationship between the perceived corruption of members of parliament (none, some of them, most of them, all of them) and satisfaction with democracy (coded as a dummy with 1 = satisfied) using the logit command.

When I run the analysis without control variables, perceived corruption is statistically significant at p<0.05 for the levels "most of them" and "all of them" ("none" is the reference). However, after I introduce my control variables (shown to be relatively good predictors of satisfaction with democracy by lrtest as well as by theory), all levels of perceived corruption, which is my variable of interest, and most of the levels of my control variables are statistically insignificant.
I am now wondering how to interpret these results.

Does it mean that considering some, most, or all of parliament's members to be corrupt does not significantly affect a respondent's chance of being satisfied with democracy compared to those in the reference category “None”?

This is my output:

Code:

logit sat_dem1 i.corrupt_mp winner anc ib3.well ib3.economy ib4.gov_economy ib4.gov_emp ib4.gov_crime ib3.trust

note: 8.corrupt_mp != 0 predicts success perfectly
8.corrupt_mp dropped and 2 obs not used

note: 8.economy != 0 predicts failure perfectly
8.economy dropped and 2 obs not used

Iteration 0: log likelihood = -1242.3333
Iteration 1: log likelihood = -1131.5192
Iteration 2: log likelihood = -1131.02
Iteration 3: log likelihood = -1130.9368
Iteration 4: log likelihood = -1130.9235
Iteration 5: log likelihood = -1130.9204
Iteration 6: log likelihood = -1130.9197
Iteration 7: log likelihood = -1130.9196
Iteration 8: log likelihood = -1130.9195

Logistic regression                             Number of obs     =      1,817
                                                LR chi2(36)       =     222.83
                                                Prob > chi2       =     0.0000
Log likelihood = -1130.9195                     Pseudo R2         =     0.0897

-------------------------------------------------------------------------------------------------
                       sat_dem1 |      Coef.   Std. Err.      z    P>|z|     [95% Conf. Interval]
--------------------------------+----------------------------------------------------------------
                     corrupt_mp |
                1. Some of them |   .1375941   .2266695     0.61   0.544    -.3066698    .5818581
                2. Most of them |   -.123168   .2379548    -0.52   0.605    -.5895508    .3432149
                 3. All of them |  -.1250717   .2497171    -0.50   0.616    -.6145081    .3643648
                     8. Refused |          0  (empty)
   9. Don't know/ Haven’t heard |   .3542393   .2941798     1.20   0.229    -.2223425     .930821
                                |
                         winner |   .2195418    .124617     1.76   0.078    -.0247029    .4637866
                            anc |   .3254136   .1357487     2.40   0.017      .059351    .5914761
                                |
                      wellbeing |
                    1. Very Bad |  -.2953002   .1774831    -1.66   0.096    -.6431606    .0525602
                  2. Fairly Bad |   .0878332   .1808186     0.49   0.627    -.2665648    .4422311
                 4. Fairly Good |   .2964598    .172577     1.72   0.086    -.0417849    .6347045
                   5. Very good |   .1039665   .2075327     0.50   0.616    -.3027901     .510723
                  9. Don't know |   .3214488     .73491     0.44   0.662    -1.118948    1.761846
                                |
                        economy |
                       Very Bad |    -.42181    .185135    -2.28   0.023    -.7846679    -.058952
                     Fairly Bad |  -.2583114   .1944428    -1.33   0.184    -.6394124    .1227895
                    Fairly Good |   .1408969    .211429     0.67   0.505    -.2734964    .5552901
                      Very good |   .1239676   .2494882     0.50   0.619    -.3650203    .6129556
                        Refused |          0  (empty)
                     Don't know |  -.8153346   .4969947    -1.64   0.101    -1.789426    .1587571
                                |
                    gov_economy |
                  1. Very Badly |  -.6374424   .2485057    -2.57   0.010    -1.124505   -.1503801
                2. Fairly Badly |  -.4355983   .2539268    -1.72   0.086    -.9332857    .0620891
                 3. Fairly Well |  -.1935279   .2466126    -0.78   0.433    -.6768797     .289824
                     8. Refused |   13.79155   1031.079     0.01   0.989    -2007.087     2034.67
 9. Don't know / Haven’t hear.. |   -.852639   .3284448    -2.60   0.009    -1.496379    -.208899
                                |
                        gov_emp |
                  1. Very Badly |  -.1599517   .3000783    -0.53   0.594    -.7480943    .4281909
                2. Fairly Badly |  -.2164558   .3088984    -0.70   0.483    -.8218855     .388974
                 3. Fairly Well |   -.144183   .3084507    -0.47   0.640    -.7487353    .4603692
                     8. Refused |  -28.50171   1230.658    -0.02   0.982    -2440.546    2383.543
 9. Don't know / Haven’t hear.. |   .8198749    .555076     1.48   0.140    -.2680541    1.907804
                                |
                      gov_crime |
                  1. Very Badly |   .2947696   .2524195     1.17   0.243    -.1999635    .7895027
                2. Fairly Badly |   .2154838   .2660535     0.81   0.418    -.3059715     .736939
                 3. Fairly Well |   .6979382   .2648499     2.64   0.008     .1788419    1.217034
                     8. Refused |   14.45072   671.8568     0.02   0.983    -1302.364    1331.266
 9. Don't know / Haven’t hear.. |     .12304   .5217258     0.24   0.814    -.8995237    1.145604
                                |
                          trust |
                  0. Not at all |  -.7140299   .1679543    -4.25   0.000    -1.043214   -.3848454
               1. Just a little |  -.2366771   .1615504    -1.47   0.143      -.55331    .0799557
                    2. Somewhat |  -.1290299    .178848    -0.72   0.471    -.4795655    .2215057
                     8. Refused |  -.2341579   1.489882    -0.16   0.875    -3.154274    2.685958
 9. Don’t know/Haven’t heard .. |  -.3406794   .3233705    -1.05   0.292     -.974474    .2931151
                                |
                          _cons |   .2401685   .4511124     0.53   0.594    -.6439957    1.124333
-------------------------------------------------------------------------------------------------

Last edited by Melissa Fiona; 08 Jul 2021, 06:32.

Tags: categorical, insignificant, interpretation, logit, logitic regression

Maarten Buis

Join Date: Mar 2014

Posts: 3456
#2

08 Jul 2021, 07:42

In order to include a variable as a control variable, that variable needs to influence the outcome and it needs to influence the explanatory variable of interest. Notice the direction of these relationships, which is also very important: only when your control variable causes both the outcome and the explanatory variable of interest is it a confounding variable, or common cause, that needs to be in the model. If it is the explanatory variable of interest that causes the potential control variable, then that potential control variable should not be in the model. It is then an intervening variable.
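
The difference between the two cases can be seen in a small simulation. The following is only a sketch under stated assumptions: the variables x, y, z, m, x2 and y2 are hypothetical and simulated, not taken from the survey data, and linear regression is used instead of logit purely to keep the arithmetic easy to follow.

```stata
* Simulated illustration; all variables here are hypothetical
clear
set seed 12345
set obs 5000

* Case 1: z is a common cause of x and y, and x has no effect on y
generate z = rnormal()
generate x = 0.8*z + rnormal()
generate y = 0.8*z + rnormal()
* Without z, x picks up a spurious association with y
regress y x
* With z in the model, the coefficient of x should be close to zero
regress y x z

* Case 2: m lies on the path from x2 to y2 (x2 -> m -> y2)
generate x2 = rnormal()
generate m  = 0.8*x2 + rnormal()
generate y2 = 0.8*m + rnormal()
* Without m, the coefficient of x2 estimates its total effect on y2
regress y2 x2
* With m in the model, the real effect of x2 is controlled away
regress y2 x2 m
```

Under these assumptions the coefficient of x in the first regression is about 0.39 in expectation even though x does nothing, which is why a confounder belongs in the model. In the second case the total effect of x2 is about 0.64 in expectation, and adding the intervening variable m pushes it towards zero, which is why an intervening variable does not.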

So when you said "proven to be relatively good predictors of satisfaction with democracy using lrtest as well as by the theory", you have only done half of what is necessary to justify their inclusion in your model. You also need to argue that those control variables influence the perceived corruption.

Looking at the model, it seems to me that you have more variables in your model than your data can support. So you will want to find a way to reduce that. Maybe combine some categories, or combine some variables in an index of some sort, or remove some variables.
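
One way to see where the data run thin is to cross-tabulate each categorical predictor against the outcome on the estimation sample. This is a sketch that assumes the logit command from the first post has just been run, so that e(sample) is available; the variable names are those in that command. The "predicts success perfectly" notes and the standard errors above 600 on some "Refused" levels are the kind of pattern that typically appears when a category holds only a handful of observations that nearly all share one outcome.

```stata
* Run directly after the logit command so that e(sample) marks the observations used
* Look for rows with very few observations or with a zero in one outcome column
tabulate corrupt_mp sat_dem1 if e(sample)
tabulate economy sat_dem1 if e(sample)
tabulate gov_economy sat_dem1 if e(sample)
tabulate gov_emp sat_dem1 if e(sample)
tabulate gov_crime sat_dem1 if e(sample)
tabulate trust sat_dem1 if e(sample)
```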

It is also important to consider the raw proportion of people who are satisfied with democracy. From a statistical point of view 50% being satisfied would be ideal, in the sense that it ensures maximum variance in the dependent variable and thus maximum power. (From a citizen's point of view 100% would be ideal, but that is a different story.)
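
The reasoning behind the 50% figure is that a 0/1 variable with proportion p has variance p(1 - p), which peaks at p = 0.5. A minimal sketch, assuming sat_dem1 is coded 0/1 as described in the first post; the three values of p are illustrative only:

```stata
* The mean of a 0/1 variable is the proportion coded 1
summarize sat_dem1

* Variance p*(1 - p) at a few illustrative proportions
display 0.5*(1 - 0.5)
display 0.3*(1 - 0.3)
display 0.1*(1 - 0.1)
```

The variance is 0.25 at p = 0.5, 0.21 at p = 0.3 and 0.09 at p = 0.1, so moderate departures from an even split cost little, while very lopsided outcomes cost a lot.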

---------------------------------
Maarten L. Buis
University of Konstanz
Department of history and sociology
box 40
78457 Konstanz
Germany
http://www.maartenbuis.nl
---------------------------------
Melissa Fiona

Join Date: Jul 2021

Posts: 7
#3

09 Jul 2021, 02:41

Hello,

First of all, thank you for your reply!

My control variables as seen in the model are: belonging to the electoral majority (winner), being close to the governing party (anc), personal wellbeing (well), evaluations of the national economy (economy), evaluation of the government handling of the economy (gov_economy), evaluation of how well the government is creating jobs (gov_emp) and reducing crime (gov_crime) as well as institutional trust (trust).
From what you said, I gather that winner and anc are both confounding variables, while trust is an intervening variable, meaning it's fine to leave them in the model. However, all economic indicators and those of government performance should be left out as I can't show that they influence perceptions of corruption?
When I said that the control variables were proven to be good predictors of satisfaction I meant that a large strand of literature has shown them to be important. Indeed, many of the previous studies focused on corruption and satisfaction with democracy have included controls for economic predictors/performance indicators in their model. Additionally, they significantly improve the fit of my model.

That said, I am still unsure how to interpret the results of my regression, as it confuses me that most of my control variables are significant at only one level. Does that mean that I can interpret predicted probabilities for all levels, or only for that one level compared to the reference category?
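
For reference, each z-test in the output compares a single level with the reference category only. Whether a categorical variable matters as a whole is a separate question, usually addressed with a joint test of all its levels, and predicted probabilities can be computed for every level regardless of which individual coefficients are significant. A sketch of both, assuming the model from the first post; it is one possible approach, not the only one:

```stata
* Refit the model from the first post without repeating the output
quietly logit sat_dem1 i.corrupt_mp winner anc ib3.well ib3.economy ib4.gov_economy ib4.gov_emp ib4.gov_crime ib3.trust

* Joint Wald test that all corrupt_mp coefficients are zero
testparm i.corrupt_mp

* Average predicted probability of being satisfied at each level of corrupt_mp,
* reported with standard errors and confidence intervals
margins corrupt_mp
```

The confidence intervals from margins show how precisely each level's probability is estimated, which is usually more informative than whether a single coefficient crosses the 0.05 threshold.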

Unfortunately, I can't follow your advice about combining categories or forming indexes, as this is my bachelor thesis and my supervising professor wants me to keep the variables as they are in the dataset.

Concerning the data, my sample has n = 1,821, with 786 satisfied and 1,035 dissatisfied. To me that distribution does not seem so extreme.

Thank you again

Melissa
Maarten Buis

Join Date: Mar 2014

Posts: 3456
#4

09 Jul 2021, 03:00

Well, you seem to have a selection of variables you should remove from your model. That should help.

Melissa Fiona

Join Date: Jul 2021

Posts: 7
#5

09 Jul 2021, 03:16

Ok, thank you.
One more thing, maybe it's a dumb question, but how can it be that others who have investigated this relationship were able to include economic predictors in their model? And how can I justify not using one of the main predictors identified by the literature?
Having insignificant results and rejecting my hypothesis is not my main concern, as long as I am able to interpret the results and can show that I have specified the model correctly. For that reason, dropping them doesn't seem like a possibility...
Can you understand my concerns?
Maarten Buis

Join Date: Mar 2014

Posts: 3456
#6

09 Jul 2021, 03:45

To include a variable in your model it has to be a confounder. If you argue that it is not a confounder, then it should not be in your model. So you need to read up on the distinction between confounding and intervening variables, and use that as a justification for what is and is not included in your model. Don't rely on what others have done before you, they could be wrong.

Melissa Fiona

Join Date: Jul 2021

Posts: 7
#7

09 Jul 2021, 04:22

Thank you!