Logit Model if prob>chi2 is not significant

Patty Naibaho

Join Date: Nov 2019
Posts: 9

25 Nov 2019, 23:22

Dear all,

I found that my model's Prob > chi2 is not significant (0.27). The purpose of my study is to find the relation between the interaction variable and my dependent variable.

Code:

 

. logit briedu kis##edu age gender urban education business employment value religius1 $controls[pw = BOT_NAS_JBR_JTG], robust nolog

Logistic regression Number of obs = 778
Wald chi2(11) = 13.29
Prob > chi2 = 0.2747
Log pseudolikelihood = -152.1861 Pseudo R2 = 0.0334

briedu Coef. Robust Std. Err. z P>|z| [95% Conf. Interval]
1.kis .1149079 .3068558 0.37 0.708 -.4865184 .7163343
1.edu 1.095056 .6147318 1.78 0.075 -.1097966 2.299908
kis#edu
1 1 -1.011685 1.228102 -0.82 0.410 -3.418721 1.395352
age -.026757 .0151822 -1.76 0.078 -.0565136 .0029996
gender -.2006767 .3634839 -0.55 0.581 -.9130922 .5117387
urban .0471323 .3164125 0.15 0.882 -.5730249 .6672895
education .04647 .1658182 0.28 0.779 -.2785276 .3714676
business .4497326 .4321611 1.04 0.298 -.3972876 1.296753
employment -.0773021 .4214165 -0.18 0.854 -.9032633 .7486591
value -.4337076 .3170597 -1.37 0.171 -1.055133 .187718
religius1 -.2829598 .2103322 -1.35 0.179 -.6952034 .1292838
_cons -.0085307 .8841085 -0.01 0.992 -1.741352 1.72429

What I find confusing is that several variables in the model are significant. So I want to ask: can I use this model in my research and just explain the variables that are significant?

Can someone explain to me the usual reason why this happens?

Really grateful for your enlightenment on this matter.


Maarten Buis

Join Date: Mar 2014

Posts: 3456
#2

26 Nov 2019, 01:05

None of your variables are individually significantly different from 0 (using a significance level of 0.05), so it is no surprise that they also aren't jointly significant.

You have a variable called edu and another one called education in your model. Is that a mistake?

After estimating your model can you type: tab briedu if e(sample) and show us the result?

---------------------------------
Maarten L. Buis
University of Konstanz
Department of history and sociology
box 40
78457 Konstanz
Germany
http://www.maartenbuis.nl
---------------------------------

Andrew Musau

Join Date: Oct 2014
Posts: 10195

26 Nov 2019, 01:11

Your output is hard to read. The code delimiters will automatically format Stata's default results if you copy and paste directly.

What I find confusing is that several variables in the model are significant. So I want to ask: can I use this model in my research and just explain the variables that are significant?

You are thinking in terms of the linear model, where Prob > F is the p-value for the test that all slope coefficients are jointly zero (so one expects the F-statistic to be significant if at least one coefficient is clearly significant). In nonlinear models such as logit and probit, the LR chi2 statistic is a comparison of log likelihoods; by default, the estimated model is compared with the intercept-only model. It is calculated as 2*(log likelihood of the estimated model - log likelihood of the intercept-only model). Usually, if individual coefficients are significant, this difference is significant as well, but that is not guaranteed.

Code:

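webuse lbw
* full model: keep its log likelihood
logit low age smoke i.race
scalar ll1 = e(ll)
* intercept-only model: keep its log likelihood
logit low
scalar ll2 = e(ll)
* LR statistic = 2*(ll of full model - ll of intercept-only model)
local Chi2 = 2*(ll1 - ll2)
di `Chi2'
* 4 df (number of restrictions being tested)
di chi2tail(4, `Chi2')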

Res.:

Code:

.
. logit low age smoke i.race

Iteration 0:   log likelihood =   -117.336  
Iteration 1:   log likelihood = -109.57893  
Iteration 2:   log likelihood = -109.43115  
Iteration 3:   log likelihood =  -109.4311  
Iteration 4:   log likelihood =  -109.4311  

Logistic regression                             Number of obs     =        189
                                                LR chi2(4)        =      15.81
                                                Prob > chi2       =     0.0033
Log likelihood =  -109.4311                     Pseudo R2         =     0.0674

------------------------------------------------------------------------------
         low |      Coef.   Std. Err.      z    P>|z|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
         age |  -.0348828   .0334024    -1.04   0.296    -.1003502    .0305847
       smoke |    1.10055   .3719453     2.96   0.003     .3715511     1.82955
             |
        race |
      black  |   1.011413   .4934234     2.05   0.040     .0443209    1.978505
      other  |    1.05673   .4059583     2.60   0.009     .2610665    1.852394
             |
       _cons |  -1.007554   .8616628    -1.17   0.242    -2.696382    .6812744
------------------------------------------------------------------------------

.
. di `Chi2'
15.809799

. di chi2tail(4, `Chi2')
.0032853

Last edited by Andrew Musau; 26 Nov 2019, 01:58.


Patty Naibaho

Join Date: Nov 2019
Posts: 9

26 Nov 2019, 02:09

Dear Maarten,

No, that is not a mistake. I use the variable "edu" for quality of education and the variable "education" for the respondent's last level of education.

I see, so the significance level needs to be 5% (that is, 95% confidence).

Here is the result of tab briedu:

Code:

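. tab briedu if e(sample)

     briedu |      Freq.     Percent        Cum.
------------+-----------------------------------
          0 |        694       89.20       89.20
          1 |         84       10.80      100.00
------------+-----------------------------------
      Total |        778      100.00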

Do you think the number of respondents who answered 1 is too small, and that this is why the result is not significant? Can you tell what the problem is here?

Hi, Andrew,

I already tried to copy and paste directly, but the table still did not come out right, so I entered it manually. I hope it is clear now.

Code:

. logit briedu kis##edu age gender urban education business employment value religius1 $controls[pw = BOT_NAS_JBR_JTG], robust nolog

Logistic regression                             Number of obs     =        778
                                                Wald chi2(11)     =      13.29
                                                Prob > chi2       =     0.2747
Log pseudolikelihood = -152.1861                Pseudo R2         =      0.033

------------------------------------------------------------------------------
             |               Robust
      briedu |      Coef.   Std. Err.      z    P>|z|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
       1.kis |   .1149079   .3068558     0.37   0.708    -.4865184    .7163343
       1.edu |   1.095056   .6147318     1.78   0.075    -.1097966    2.299908
             |
     kis#edu |
        1 1  |  -1.011685   1.228102    -0.82   0.410    -3.418721    1.395352
             |
         age |   -.026757   .0151822    -1.76   0.078    -.0565136    .0029996
      gender |  -.2006767   .3634839    -0.55   0.581    -.9130922    .5117387
       urban |   .0471323   .3164125     0.15   0.882    -.5730249    .6672895
   education |     .04647   .1658182     0.28   0.779    -.2785276    .3714676
    business |   .4497326   .4321611     1.04   0.298    -.3972876    1.296753
  employment |  -.0773021   .4214165    -0.18   0.854    -.9032633    .7486591
       value |  -.4337076   .3170597    -1.37   0.171    -1.055133     .187718
   religius1 |  -.2829598   .2103322    -1.35   0.179    -.6952034    .1292838
       _cons |  -.0085307   .8841085    -0.01   0.992    -1.741352     1.72429
------------------------------------------------------------------------------

So, do you mean I have to test the variables one by one to see whether they are significant based on the log likelihood?
But I use robust in my model, so the log likelihood changes to Wald chi2. Is the calculation still the same?

If the model is not significant based on the log likelihood, can I still interpret the variable, e.g. age (since its p-value is significant)?

Many thanks for your response


Maarten Buis

Join Date: Mar 2014

Posts: 3456
#5

26 Nov 2019, 02:45

The significance level is arbitrary. I am used to 5%, but if you want to use 1% or 10% then that is perfectly fine too. If you feel fancy you could choose \(1-\frac{e}{\pi}\approx .1347\). I don't know why one would make that choice, but you could. So, the choice of significance level is completely arbitrary, which is one of the problems with significance, and why many statisticians are putting quite a bit of effort into removing the use of significance altogether. Regardless, if you are going to use the term significance, you do need to say what level you are using. Saying something is significant without specifying the significance level is just meaningless.

Only 84 successes is probably not enough for that many explanatory variables and an interaction. I know that this is frustrating, but if the information is not present in the data, then no amount of statistical torture can extract it. So your options are: either simplify the model (remove explanatory variables, probably starting with the interaction effect), or find better data.
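To make the point about too few successes concrete, a minimal sketch is to count the successes per estimated parameter and then refit a smaller model. The variable names are taken from the command posted above; the reduced specification is purely hypothetical (it is not a recommendation about which variables to keep), and the often-quoted figure of roughly 10 events per parameter is a rule of thumb, not a formal requirement.

```stata
* 84 successes spread over 11 slope parameters (the Wald chi2 has 11 df)
di 84/11

* hypothetical simplification: drop the interaction and some controls
* (assumes the same data and weight variable as in the original command)
logit briedu i.kis i.edu age gender [pw = BOT_NAS_JBR_JTG], vce(robust) nolog
```

With fewer than 8 successes per parameter, wide confidence intervals like those in the posted output are to be expected.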

Last edited by Maarten Buis; 26 Nov 2019, 02:53.

Carlo Lazzaro

Join Date: Apr 2014

Posts: 17711
#6

26 Nov 2019, 02:45

Patty:
as an aside to the previous excellent replies, please note that the -age- coefficient is not significant (p=0.078; 95%CI: -.0565136; .0029996).
That said, I fail to get what you mean by

...can I still interpret the variable...

Kind regards,
Carlo
(Stata 19.0)
Andrew Musau

Join Date: Oct 2014

Posts: 10195
#7

26 Nov 2019, 02:49

But I use robust in my model, so the log likelihood changes to Wald chi2. Is the calculation still the same?

With robust standard errors, you do not have a true log likelihood, so Stata reports a log pseudolikelihood. The model test then defaults to a Wald test of joint significance, in your case,

Code:

testparm * 1.kis 1.edu 1.kis#1.edu

If the model is not significant based on the log likelihood, can I still interpret the variable, e.g. age (since its p-value is significant)?

Yes. Goodness-of-fit statistics and the like are difficult for nonlinear models with binary dependent variables. Very low pseudo R2 statistics are commonplace, but this does not stop anyone from interpreting and making sense of these estimated models.
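As a rough check of the point about the Wald test, here is a minimal sketch using the lbw example data from earlier in this thread rather than Patty's data. It assumes that, with robust standard errors, the chi2 in the header (stored in e(chi2)) is the Wald test that all coefficients except the constant are zero, so a joint testparm on the same terms should reproduce it.

```stata
webuse lbw, clear

* robust standard errors: the header reports Wald chi2 instead of LR chi2
logit low age smoke i.race, vce(robust) nolog

* joint Wald test that all slope coefficients are zero
testparm age smoke i.race

* compare the joint test statistic with the one in the header
di r(chi2)
di e(chi2)

* a single coefficient can still be tested on its own
test age
```

The two displayed statistics should agree, which is why a non-significant header test and an interpretable individual coefficient can coexist in the same output.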
Patty Naibaho

Join Date: Nov 2019

Posts: 9
#8

26 Nov 2019, 03:27

Carlo,

I was only taught to judge significance by the p-value. For my own education, could you please explain to me why age is not significant?
I know it is very basic.

Thank you all for your clear explanation.
Carlo Lazzaro

Join Date: Apr 2014

Posts: 17711
#9

26 Nov 2019, 05:12

Patty:
as Maarten said, if you (arbitrarily) choose the 0.05 reference value (as you did in your regression model), 0.078>0.05: hence, -age- is not significant.
Conversely, if you (arbitrarily) choose the 0.10 reference value, 0.078<0.10: hence, -age- is significant.
That said, it's a good habit to take a look at the CI bounds, too: in your model, the 95%CI for -age- straddles the null value (zero): hence, -age- is not significant at the 0.05 level.
Finally, I would consider non-significant coefficients as informative as significant ones.
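The figures for -age- can be reproduced by hand from the coefficient and robust standard error in the posted output. This is only a sketch of the arithmetic, assuming the normal approximation that -logit- uses for its z statistics; the scalar names are arbitrary.

```stata
* coefficient and robust standard error for age, copied from the output
scalar b  = -.026757
scalar se = .0151822

* z statistic and two-sided p-value
scalar z = b/se
di z
di 2*normal(-abs(z))

* 95% confidence interval: b +/- invnormal(.975)*se (about 1.96*se)
di b - invnormal(.975)*se
di b + invnormal(.975)*se

* 90% confidence interval, which corresponds to the 0.10 level
di b - invnormal(.95)*se
di b + invnormal(.95)*se
```

The 95% interval includes zero while the 90% interval does not, which is the same information as 0.05 < 0.078 < 0.10.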

Kind regards,
Carlo
(Stata 19.0)
Patty Naibaho

Join Date: Nov 2019

Posts: 9
#10

27 Nov 2019, 06:56

I see. Thank you very much, Carlo.