  • comparing methods for separation problem

    Hi dear Statalist,
    I have a real dataset with quasi-complete separation: 3 independent variables and 53 observations. I'm trying to show which of the Firth, exact, and Bayesian logistic solutions models the data best, using the coefficients, the ORs and their significance, the SEs, and the CIs.
    My questions:
    -- How can the constant be excluded from the Firth method (to make it comparable with the exact model; Bayesian logistic provides an option for suppressing the constant term), and is it appropriate to do this?
    -- One of the variables (the one causing the separation) gives very large ORs, from about 550 up into the thousands, and larger still in the Bayesian logistic. Is that possible under separation, and does it invalidate my study?
    Many thanks.

  • #2
    I don't personally use -firthlogit- in my work, so I don't know it well.

    1. Run -help firthlogit- to read the help file and see if there is a -nocons- option. Many estimation commands have that. If there is one, then use it; that would be the most direct solution.

    2. If not, then after running your -firthlogit- command as is, run
    Code:
    test _cons = 0, coef
    Stata will then test the hypothesis that _cons = 0 (which you may or may not care about) and it will also show you what the results would be if constrained to have _cons = 0.

    I can't guarantee that this will work. All official Stata commands are designed to work with -test-, but -firthlogit- is written by Joseph Coveney, one of the best Stata users around, and I would imagine that his program is designed to behave in all ways like a normal official Stata estimation command; if so, this will work.



    • #3
      -firthlogit- used to have a -noconstant- option as I recall, but what is now up on SSC doesn't. You can fit a no-constant model with a -constraint-
      Code:
      constraint define 1 _b[_cons] = 0
      firthlogit . . ., constraints(1)
      (I'm not really sure what -test _cons = 0, coef- does, but it's not the same as either fitting a model with the constant term explicitly constrained to zero using -constraint- beforehand or using the -noconstant- option, at least with -logit-. See below.)

      I've attached a new version of the firthlogit.ado file that allows a -noconstant- option. You can substitute it for the corresponding ADO file that was installed from SSC if you really want to go about it with an option instead of a constraint.

      A couple of questions:

      1. I'm curious as to how you can "show which . . . is the best . . . by using coefficients, ORs and their significances, SEs and CIs".

      2. Why are you omitting the constant?

      Code:
      . version 15.1

      . clear *

      . sysuse auto
      (1978 Automobile Data)

      . quietly logit foreign c.headroom, nolog

      . test _cons, coef

       ( 1)  [foreign]_cons = 0

                 chi2(  1) =    2.47
               Prob > chi2 =    0.1158


      Constrained coefficients

      ------------------------------------------------------------------------------
                   |      Coef.   Std. Err.      z    P>|z|     [95% Conf. Interval]
      -------------+----------------------------------------------------------------
      foreign      |
          headroom |  -.3183087   .0927322    -3.43   0.001    -.5000604    -.136557
             _cons |          0  (omitted)
      ------------------------------------------------------------------------------

      . constraint define 1 _b[_cons] = 0

      . logit foreign c.headroom, constraints(1) nolog

      Logistic regression                             Number of obs     =         74
                                                      Wald chi2(1)      =      14.28
      Log likelihood = -42.958607                     Prob > chi2       =     0.0002

       ( 1)  [foreign]_cons = 0
      ------------------------------------------------------------------------------
           foreign |      Coef.   Std. Err.      z    P>|z|     [95% Conf. Interval]
      -------------+----------------------------------------------------------------
          headroom |  -.3300145   .0873327    -3.78   0.000    -.5011835   -.1588455
             _cons |          0  (omitted)
      ------------------------------------------------------------------------------

      . logit foreign c.headroom, noconstant nolog

      Logistic regression                             Number of obs     =         74
                                                      Wald chi2(1)      =      14.28
      Log likelihood = -42.958607                     Prob > chi2       =     0.0002

      ------------------------------------------------------------------------------
           foreign |      Coef.   Std. Err.      z    P>|z|     [95% Conf. Interval]
      -------------+----------------------------------------------------------------
          headroom |  -.3300145   .0873327    -3.78   0.000    -.5011835   -.1588455
      ------------------------------------------------------------------------------

      . exit

      end of do-file



      • #4
        Hmm. That's odd. I'm not sure what's going on either. The help file for -test- explains the -coef- option as:

        coef specifies that the constrained coefficients be displayed.

        And with -regress- the technique does produce the correct coefficients, though the standard errors are different:
        Code:
        . sysuse auto, clear
        (1978 Automobile Data)
        
        . regress price mpg headroom, nocons
        
              Source |       SS           df       MS      Number of obs   =        74
        -------------+----------------------------------   F(2, 72)        =    129.68
               Model |  2.6987e+09         2  1.3493e+09   Prob > F        =    0.0000
            Residual |   749173087        72  10405181.8   R-squared       =    0.7827
        -------------+----------------------------------   Adj R-squared   =    0.7767
               Total |  3.4478e+09        74  46592355.7   Root MSE        =    3225.7
        
        ------------------------------------------------------------------------------
               price |      Coef.   Std. Err.      t    P>|t|     [95% Conf. Interval]
        -------------+----------------------------------------------------------------
                 mpg |   40.36135   39.07451     1.03   0.305    -37.53227     118.255
            headroom |   1680.574   277.2453     6.06   0.000     1127.896    2233.253
        ------------------------------------------------------------------------------
        
        . regress price mpg headroom
        
              Source |       SS           df       MS      Number of obs   =        74
        -------------+----------------------------------   F(2, 71)        =     10.44
               Model |   144280501         2  72140250.4   Prob > F        =    0.0001
            Residual |   490784895        71  6912463.32   R-squared       =    0.2272
        -------------+----------------------------------   Adj R-squared   =    0.2054
               Total |   635065396        73  8699525.97   Root MSE        =    2629.2
        
        ------------------------------------------------------------------------------
               price |      Coef.   Std. Err.      t    P>|t|     [95% Conf. Interval]
        -------------+----------------------------------------------------------------
                 mpg |  -259.1057   58.42485    -4.43   0.000    -375.6015   -142.6098
            headroom |  -334.0215   399.5499    -0.84   0.406    -1130.701    462.6585
               _cons |   12683.31   2074.497     6.11   0.000     8546.885    16819.74
        ------------------------------------------------------------------------------
        
        . test _cons = 0, coef
        
         ( 1)  _cons = 0
        
               F(  1,    71) =   37.38
                    Prob > F =    0.0000
        
        
        Constrained coefficients
        
        ------------------------------------------------------------------------------
                     |      Coef.   Std. Err.      z    P>|z|     [95% Conf. Interval]
        -------------+----------------------------------------------------------------
                 mpg |   40.36135   31.84822     1.27   0.205    -22.06001    102.7827
            headroom |   1680.574   225.9726     7.44   0.000     1237.676    2123.472
               _cons |          0  (omitted)
        ------------------------------------------------------------------------------



        • #5
          Originally posted by ece bacaksiz View Post
          How can the constant be excluded . . . for ensuring to be the same with exact model
          exlogistic doesn't exclude the constant. You can see that by executing the following code.
          Code:
          version 15.1
          
          clear *
          
          set seed `=strreverse("1447250")'
          
          quietly set obs 53
          
          foreach var of newlist predictor1 predictor2 predictor3 {
              generate byte `var' = runiform() < 0.5
          }
          egen double xb = rowtotal(p*)
          quietly replace xb = xb / 5 + logit(0.75)
          generate byte response = rbinomial(1, invlogit(xb))
          
          exlogistic response predictor1 predictor2 predictor3, coef nolog
          
          logit response predictor1 predictor2 predictor3, nolog
          
          logit response predictor1 predictor2 predictor3, noconstant nolog
          
          exit
          Although an estimate for the intercept is not reported by exlogistic, it is clearly accounted for in forming the estimates for the remaining regression coefficients. So, if your interest is in comparing the performance of the three methods, then you will need to include the constant in the other two, as well.
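          For concreteness, retaining the constant in all three, the fits might be set up as follows. This is only a sketch using the simulated variables above; the -bayes:- prefix line is my suggestion for the Bayesian fit and uses Stata's default priors, which you would want to choose deliberately in practice.
          Code:
          firthlogit response predictor1 predictor2 predictor3
          exlogistic response predictor1 predictor2 predictor3, coef
          bayes: logit response predictor1 predictor2 predictor3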

          Toward that larger question, you might want to take a look at the attached log file. It shows the results of a comparison of test size and power between exlogistic, firthlogit, and a Bayesian-like penalization method using log-F(1, 1) priors (as implemented in the user-written penlogit command available from the Stata Journal website), for a model with three indicator-variable predictors, your sample size of 53, and a baseline response rate of 50%.

          I was rooting for penlogit, myself, but it looks as if firthlogit's frequentist operating characteristics are slightly better under those circumstances: it has a test size (nominal 0.05) of 0.049 (compared to exlogistic's 0.037—as expected—and penlogit's 0.041) and power to detect a log-odds coefficient of one of 0.412 (compared to 0.358 for exlogistic—again, as expected—and 0.389 for penlogit with the zero-centered prior).

          To be fair, the class of log-F() priors is proposed by its advocates on the basis of interpretability, not any particular frequentist property. After some experimentation with the log-F(1, 1) prior in a few contexts, I've grown to appreciate it, and one of these days I'll get around to adding a request for its implementation as a built-in prior family in Stata's Bayesian command suite to the wishlist thread.



          • #6
            Dear Schechter and Coveney,
            I'm working through all of your precious responses now, trying to understand and apply them. Then I will post my new questions.
            Thanks. Many thanks.



            • #7
              Dear Coveney,
              Answers here:
              1. I couldn't find any other common way to compare these three methods except parameter significance, odds ratios, and their confidence intervals, because they all use different estimation methods. In the end, I want to reach conclusions like "Firth and exact estimate more significant parameters (and control the type I error better)", or "the odds ratios in Bayes have narrower confidence intervals", and maybe "in data like this, it is better to use method X" (according to the results).
              2. For example in regression, if we want to compare models, we want them to have the same number of parameters. That is why I omitted the constant in Firth: exact estimates without a constant, Bayes has a "suppress constant term" option, and I thought that if I excluded the constant in Firth, they would all have the same number of parameters and the comparison would be stronger.

              I need your advice about that. Am I on the right track? What would your advice be? I want to do the most accurate job.
              Thanks.



              • #8
                Although an estimate for the intercept is not reported by exlogistic . . . you will need to include the constant in the other two, as well
                The first question is answered, as far as I understand.
                Thanks.



                • #9
                  Sorry, I meant the second question.



                  • #10
                    I'm not following this thread closely, but perhaps this handout (and the articles it links to) will help.

                    https://www3.nd.edu/~rwilliam/stats3/RareEvents.pdf

                    In particular, see

                    https://www.europeansurveyresearch.o...ionLeitg_b.pdf
                    -------------------------------------------
                    Richard Williams, Notre Dame Dept of Sociology
                    StataNow Version: 19.5 MP (2 processor)

                    EMAIL: [email protected]
                    WWW: https://www3.nd.edu/~rwilliam



                    • #11
                      Thanks, Sir Williams.
                      These articles are pioneering ones for understanding logistic solutions, and I will read them again. One remaining issue is that I cannot get the Bayesian fit to match the ML estimates for the comparison.
                      Thanks again.



                      • #12
                        Hi again,
                        What I could and could not do:
                        1)
                        Code:
                        constraint define 1 _b[_cons] = 0
                        firthlogit . . ., constraints(1)
                        worked, and I excluded the constant, but in the end I did not keep it that way, because, as you said in post #5,
                        Although an estimate for the intercept is not reported by exlogistic . . . you will need to include the constant in the other two
                        This is OK.
                        2) From post #5, I couldn't run the command that starts with "foreach var of new..." (in both attached files), so I couldn't see the results of Firth and exact together. But it's clear that Firth is better with your data, and Firth also gives cleaner results than penlogit. So I got it, and this is OK, too.

                        Questions:
                        Do the terms "Bayesian-like penalization method" and "Bayesian logistic" have the same meaning?
                        Do you have any recommendations for postestimation, power, or goodness-of-fit indicators common to all three? In other words, am I right in thinking of SEs, CIs, etc.?
                        Would it help if I shared my data with you?

                        Thanks. Many thanks.



                        • #13
                          Originally posted by ece bacaksiz View Post
                          Do the terms "Bayesian-like penalization method" and "Bayesian logistic" have the same meaning?
                          Take a look at the citation that I gave, and at the Stata Journal article that accompanies the user-written command penlogit.

                          Do you have any recommendations for postestimation, power, or goodness-of-fit indicators common to all three? In other words, am I right in thinking of SEs, CIs, etc.?
                          Power was illustrated in #5 above. You can use a search engine to look at other indicators of model performance, such as prediction under various circumstances.
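                          For example, one widely used prediction-based indicator is the Brier score, the mean squared difference between the observed outcome and the predicted probability (smaller is better). A minimal sketch after official -logit-, using the simulated variables from #5; whether -predict- after -firthlogit- or the other commands returns probabilities or the linear predictor is something to check in each command's help file.
                          Code:
                          quietly logit response predictor1 predictor2 predictor3
                          predict double phat, pr
                          generate double sqerr = (response - phat)^2
                          summarize sqerr, meanonly
                          display "Brier score = " r(mean)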



                          • #14
                            Although -firthlogit- doesn't provide one as a postestimation command, we can get the area under the ROC curve by typing
                            Code:
                            predict fitted
                            roctab response fitted
                            Is there a similar command after exact and Bayesian logistic?

