Firth/penalized regression

john jose

Join Date: Aug 2015

Posts: 24
#1

22 Dec 2016, 21:07

When I use the firthlogit command with binary variables, I get the message "string variable not allowed". How do I use Firth/penalized regression in Stata for categorical variables?
Joseph Coveney

Join Date: Apr 2014

Posts: 4410
#2

22 Dec 2016, 23:06

Originally posted by john jose

When I use the firthlogit command with binary variables, I get the message "string variable not allowed".

Don't use string variables. Type

Code:

help encode

at Stata's command line to see how to avoid using them.
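As a minimal sketch of what that help file describes, assuming a hypothetical string variable named predictor_str (not a name from the original data), the conversion might look like the following. Note that encode numbers the categories 1, 2, ... in sorted order of the strings, so the result is not automatically a 0/1 indicator.

Code:

* predictor_str is a hypothetical string variable holding the categories
encode predictor_str, generate(predictor)

* Show which integer code was assigned to each original string
label list predictor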

Originally posted by john jose

How do I use Firth/penalized regression in Stata for categorical variables?

Code:

firthlogit response i.predictor

See also this about user-written commands.
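For readers new to user-written commands, a minimal sketch of obtaining firthlogit, assuming an internet connection and that the package is still distributed through SSC under that name:

Code:

* Install the user-written command from SSC, then read its help file
ssc install firthlogit
help firthlogit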
john jose

Join Date: Aug 2015

Posts: 24
#3

23 Dec 2016, 03:40

Thanks, Joseph Coveney. I encoded them as numeric, as suggested in help encode, and got the following:

. firthlogit response i.predictor1 predictor2 predictor3 predictor4 predictor5 predictor6 predictor7 predictor8 predictor9 predictor10 predictor11

initial:       penalized log likelihood = -5.3709737
rescale:       penalized log likelihood = -5.3709737
Iteration 0:   penalized log likelihood = -5.3709737  (not concave)
Iteration 1:   penalized log likelihood = -4.6393957  (not concave)
Iteration 2:   penalized log likelihood = -4.5758953  (not concave)
Iteration 3:   penalized log likelihood = -4.2169041  (not concave)
Iteration 4:   penalized log likelihood = -4.1528348  (not concave)
Iteration 5:   penalized log likelihood = -4.1261668  (not concave)
Iteration 6:   penalized log likelihood =  -4.117472  (not concave)
Iteration 7:   penalized log likelihood = -4.1100772
Iteration 8:   penalized log likelihood = -4.0970614
Iteration 9:   penalized log likelihood = -4.0675991  (not concave)
Iteration 10:  penalized log likelihood = -4.0635757
Iteration 11:  penalized log likelihood = -4.0450612  (not concave)
Iteration 12:  penalized log likelihood = -4.0440434
Iteration 13:  penalized log likelihood = -4.0427497
Iteration 14:  penalized log likelihood = -4.0371908
Iteration 15:  penalized log likelihood = -4.0371525  (not concave)
Iteration 16:  penalized log likelihood = -4.0369904
Iteration 17:  penalized log likelihood = -4.0354216
Iteration 18:  penalized log likelihood =  -4.035416
Iteration 19:  penalized log likelihood =  -4.035416

                                                Number of obs     =        625
                                                Wald chi2(11)     =       2.37
Penalized log likelihood = -4.035416            Prob > chi2       =     0.9967

------------------------------------------------------------------------------
    response |      Coef.  Std. Err.        z   P>|z|    [95% Conf.   Interval]
-------------+----------------------------------------------------------------
  predictor1 |
          1  |  -1.532435   2.141242    -0.72   0.474    -5.729193    2.664323
  predictor2 |  -.4538237   1.910201    -0.24   0.812    -4.197748    3.290101
  predictor3 |   .1414892   2.077999     0.07   0.946    -3.931315    4.214293
  predictor4 |   .9791276    1.80847     0.54   0.588    -2.565409    4.523664
  predictor5 |  -1.382989   1.686319    -0.82   0.412    -4.688114    1.922136
  predictor6 |   .1535513   2.312497     0.07   0.947    -4.378859    4.685961
  predictor7 |  -.1791151   1.894222    -0.09   0.925    -3.891721    3.533491
  predictor8 |  -.3405104   1.871929    -0.18   0.856    -4.009423    3.328402
  predictor9 |   .7565706   2.023236     0.37   0.708    -3.208898    4.722039
 predictor10 |  -.3020743   2.063096    -0.15   0.884    -4.345668    3.741519
 predictor11 |  -.4753602   2.274041    -0.21   0.834    -4.932398    3.981677
       _cons |   5.725377   9.766529     0.59   0.558    -13.41667    24.86742
------------------------------------------------------------------------------

. firthlogit, or

                                                Number of obs     =        625
                                                Wald chi2(11)     =       2.37
Penalized log likelihood = -4.035416            Prob > chi2       =     0.9967

------------------------------------------------------------------------------
    response | Odds Ratio  Std. Err.        z   P>|z|    [95% Conf.   Interval]
-------------+----------------------------------------------------------------
  predictor1 |
          1  |   .2160091   .4625277    -0.72   0.474     .0032497    14.35822
  predictor2 |   .6351947   1.213349    -0.24   0.812     .0150294    26.84556
  predictor3 |   1.151988    2.39383     0.07   0.946     .0196179    67.64634
  predictor4 |   2.662133   4.814388     0.54   0.588     .0768877    92.17271
  predictor5 |   .2508277   .4229757    -0.82   0.412      .009204    6.835544
  predictor6 |   1.165968   2.696296     0.07   0.947     .0125397    108.4144
  predictor7 |   .8360097   1.583588    -0.09   0.925     .0204102    34.24331
  predictor8 |   .7114071   1.331703    -0.18   0.856     .0181439    27.89374
  predictor9 |   2.130956   4.311426     0.37   0.708     .0404011    112.3973
 predictor10 |   .7392831   1.525212    -0.15   0.884     .0129629    42.16198
 predictor11 |   .6216611   1.413682    -0.21   0.834     .0072092    53.60687
       _cons |   306.5489   2993.918     0.59   0.558     1.49e-06    6.31e+10
------------------------------------------------------------------------------

Which is the p-value here? Is it the P>|z| column? With ordinary logistic regression, two of the variables came out as predictors with p = 0.002. With Firth, I don't see any with a small P>|z|. Am I doing something wrong?

john jose

Join Date: Aug 2015

Posts: 24
#4

23 Dec 2016, 03:45

Code:

 * Example generated by -dataex-. To install: ssc install dataex clear input long(response predictor1 predictor2 predictor3 predictor4 predictor5 predictor6 predictor7 predictor8 predictor9 predictor10 predictor11) 2 1 1 2 2 1 2 2 1 2 2 1 2 1 2 1 2 1 2 2 2 2 2 1 2 2 2 2 2 1 1 2 2 2 2 2 2 2 1 2 2 1 2 1 1 1 2 1 2 1 1 2 2 1 2 2 2 2 2 2 2 2 1 2 2 1 2 2 1 1 2 2 2 1 2 2 2 1 2 2 2 2 2 1 2 1 1 2 2 1 2 2 2 2 2 2 2 1 2 1 2 1 1 2 1 2 2 2 2 1 1 1 2 1 2 2 1 2 2 1 2 2 1 2 2 1 1 2 2 2 2 2 2 1 1 1 2 1 2 2 1 2 2 2 2 1 1 1 2 1 2 2 2 2 2 1 2 1 1 1 2 1 2 2 1 2 1 1 2 1 2 1 2 1 2 2 1 1 2 1 2 1 2 1 2 2 2 2 1 2 2 1 2 1 1 1 2 1 2 2 1 2 2 1 2 2 1 2 2 1 2 2 2 2 2 2 2 1 2 1 2 1 2 2 1 2 2 2 2 1 1 1 2 1 1 2 1 1 2 2 2 1 1 1 2 2 1 2 2 2 2 2 2 1 2 1 2 2 2 2 1 2 2 2 2 1 2 1 2 1 2 2 2 2 2 1 2 1 1 1 2 1 1 2 1 2 2 2 2 1 1 1 2 1 1 2 1 1 2 2 2 1 1 1 2 1 2 2 2 2 2 1 2 1 1 1 2 1 1 2 2 2 2 2 2 1 1 2 2 2 1 2 2 1 2 2 2 1 1 1 2 1 1 2 2 1 2 2 2 1 1 1 2 1 2 2 2 2 2 1 2 1 2 2 2 1 2 2 2 2 2 2 2 1 1 1 2 1 2 2 1 1 2 1 2 2 1 2 1 1 1 2 1 1 2 2 2 2 2 2 1 1 2 2 1 2 2 2 2 2 1 1 1 1 2 2 2 2 2 1 2 1 1 2 1 1 2 2 1 2 2 1 2 1 2 1 1 1 1 2 2 2 2 2 2 1 1 1 1 2 2 2 1 2 2 1 2 1 1 1 1 2 2 2 1 2 2 2 2 1 2 2 1 1 2 1 2 2 2 2 2 1 1 1 2 1 2 2 2 2 2 1 2 1 1 1 2 1 2 2 1 1 1 1 2 1 1 1 2 1 1 2 1 2 1 2 2 1 1 1 2 1 1 2 2 2 1 2 1 1 1 1 2 1 2 2 1 2 1 1 2 1 1 1 2 2 2 2 1 1 1 1 1 1 1 1 2 2 2 2 2 1 1 1 2 1 1 1 2 1 2 2 2 2 1 2 2 1 1 1 2 1 2 2 2 1 1 1 2 1 2 2 2 1 1 2 2 2 1 2 2 1 1 2 2 1 1 2 1 2 1 2 2 1 1 2 2 1 2 2 2 2 1 1 2 1 1 1 2 1 2 2 2 1 1 1 2 1 2 1 2 1 2 2 2 2 1 1 2 1 1 1 2 1 2 2 1 2 1 1 2 1 1 1 2 1 1 2 1 2 1 1 2 1 1 1 2 2 2 2 2 2 1 . 
1 1 1 1 2 1 2 2 2 2 1 1 2 1 2 1 2 1 2 2 2 2 1 1 2 1 2 1 2 1 1 1 1 2 1 2 2 1 1 1 2 1 2 2 1 2 1 1 2 1 1 1 2 1 1 2 2 1 1 2 2 1 2 1 2 1 1 1 1 1 2 2 2 1 1 2 2 1 2 2 1 2 1 1 1 1 2 1 1 1 2 2 2 2 1 1 2 1 1 1 1 1 2 2 2 2 1 1 2 1 1 1 1 2 2 2 1 2 1 1 2 1 1 1 1 1 2 2 2 2 1 1 2 1 1 2 1 1 2 2 1 2 1 2 2 1 2 1 1 1 1 2 1 2 1 2 2 1 2 1 1 1 1 2 1 2 1 2 1 1 1 1 2 1 2 2 2 2 1 1 2 1 2 2 2 1 2 1 1 2 2 1 2 1 1 2 2 1 2 2 2 2 2 1 2 1 1 2 2 2 2 2 1 2 2 1 2 1 1 2 2 1 2 2 2 2 2 1 2 1 2 2 2 1 1 2 2 2 2 2 2 1 2 2 2 1 1 2 2 2 2 2 2 1 1 2 2 1 2 2 2 2 2 1 2 1 1 2 2 1 1 2 1 2 2 2 2 1 2 2 1 1 2 1 1 2 2 1 2 1 2 2 1 1 2 1 1 2 2 1 2 1 1 2 1 1 1 2 2 2 2 2 2 1 2 2 1 1 2 1 1 2 2 1 2 1 2 2 1 1 2 1 1 2 2 1 2 1 1 2 1 1 2 2 1 2 2 1 2 1 1 2 1 2 1 2 2 2 2 2 2 1 1 2 2 1 2 1 1 2 1 1 2 1 1 2 2 1 2 2 1 2 1 2 2 1 1 2 2 1 2 2 1 2 1 1 2 1 2 2 2 1 1 2 2 2 1 2 2 1 1 2 2 2 2 1 2 2 1 2 1 1 1 2 2 1 2 2 1 2 1 1 2 1 1 2 2 1 2 1 1 2 1 1 2 1 2 2 2 1 2 2 2 2 2 2 1 2 1 2 2 1 2 1 1 1 2 1 2 1 2 1 2 1 2 1 1 2 2 2 2 2 1 2 2 1 2 1 2 2 2 2 2 2 2 2 2 2 2 1 2 1 2 1 2 1 1 2 2 1 1 2 2 1 2 2 end label values response response label def response 1 "1", modify label def response 2 "2", modify label values predictor1 predictor1 label def predictor1 1 "0", modify label def predictor1 2 "1", modify label values predictor2 predictor2 label def predictor2 1 "1", modify label def predictor2 2 "2", modify label values predictor3 predictor3 label def predictor3 1 "1", modify label def predictor3 2 "2", modify label values predictor4 predictor4 label def predictor4 1 "1", modify label def predictor4 2 "2", modify label values predictor5 predictor5 label def predictor5 1 "1", modify label def predictor5 2 "2", modify label values predictor6 predictor6 label def predictor6 1 "1", modify label def predictor6 2 "2", modify label values predictor7 predictor7 label def predictor7 1 "1", modify label def predictor7 2 "2", modify label values predictor8 predictor8 label def predictor8 1 "1", modify label def predictor8 2 "2", modify label values predictor9 
predictor9 label def predictor9 1 "1", modify label def predictor9 2 "2", modify label values predictor10 predictor10 label def predictor10 1 "1", modify label def predictor10 2 "2", modify label values predictor11 predictor11 label def predictor11 1 "1", modify label def predictor11 2 "2", modify


john jose

Join Date: Aug 2015

Posts: 24
#5

23 Dec 2016, 03:54

I used the following command:

firthlogit response i.predictor1 predictor2 predictor3 predictor4 predictor5 predictor6 predictor7 predictor8 predictor9 predictor10 predictor11

Above I have pasted a sample of 100 observations from my data. Am I doing something wrong?
Nick Cox

Join Date: Mar 2014

Posts: 35698
#6

23 Dec 2016, 04:01

Your sample data are unreadable. Please try again and use the Preview feature to check that line breaks define separate lines.
Joseph Coveney

Join Date: Apr 2014

Posts: 4410
#7

23 Dec 2016, 20:21

As Nick said, no one can really read your example well, but it seems as if every one of your variables takes on values of 1 or 2, including the response variable.

There is a passage in the help file for the official command, logit:

depvar equal to nonzero and nonmissing (typically depvar equal to one) indicates a positive outcome, whereas depvar equal to zero indicates a negative outcome.

This applies to the user-written command firthlogit (SSC), too.

Also, read the help file for Stata's factor variable notation. Type

Code:

help factor variables

at the command line.

Afterward, I'm guessing that you'll end up with something like

Code:

generate byte response2 = response == 2
firthlogit response2 i.(predictor*), nolog
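
A cautious footnote to the recode above, offered as a sketch rather than as part of Joseph's suggestion: in Stata a logical expression such as response == 2 evaluates to 0 when response is missing, so any missing responses would be counted as negative outcomes. Assuming missing values should stay missing, one variant (response3 is a hypothetical name) is:

Code:

* Recode 2 -> 1 and 1 -> 0, leaving missing responses missing
generate byte response3 = response == 2 if !missing(response)

* Cross-check the new variable against the original coding
tabulate response response3, missing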
john jose

Join Date: Aug 2015

Posts: 24
#8

24 Dec 2016, 07:12

Thanks, Coveney. I encoded the string variables as numeric. Then I recoded the response to 1 = positive outcome and 0 = negative outcome. Similarly, I recoded the predictors to 1 = present and 0 = absent. Then I tried:

firthlogit response i.predictor1 i.predictor2 i.predictor3 i.predictor4 i.predictor5 i.predictor6 i.predictor7 i.predictor8 i.predictor9 i.predictor10 i.predictor11
john jose

Join Date: Aug 2015

Posts: 24
#9

24 Dec 2016, 10:12

outcome Coef. Std. Err. z P>z [95% Conf. Interval]
predictor1 1 2.503442 .7247074 3.45 0.001 1.083042 3.923843
predictor2 1 -.1585433 .5442104 -0.29 0.771 -1.225176 .9080895
predictor3 1 1.830119 .623115 2.94 0.003 .608836 3.051402
predictor4 1 -.8968628 .7170882 -1.25 0.211 -2.30233 .5086042
predictor5 1 -.1005266 .6822045 -0.15 0.883 -1.437623 1.23657
predictor6 1 .6214674 .6967634 0.89 0.372 -.7441637 1.987099
predictor7 1 -.3120722 .5484386 -0.57 0.569 -1.386992 .7628477
predictor8 1 -.0808703 .5309698 -0.15 0.879 -1.121552 .9598113
predictor9 1 .2397349 .656976 0.36 0.715 -1.047915 1.527384
predictor10 1 .052267 .5521066 0.09 0.925 -1.029842 1.134376
predictor11 1 3.941909 1.549027 2.54 0.011 .9058712 6.977947
_cons -7.685536 1.940974 -3.96 0.000 -11.48978 -3.881297
john jose

Join Date: Aug 2015

Posts: 24
#10

24 Dec 2016, 10:15

Sorry. Somehow I cannot post my results as a table. Is this 'P>z' column the p value?
Marcos Almeida

Join Date: Apr 2014

Posts: 4047
#11

24 Dec 2016, 10:44

Yes.

I kindly recommend that you post the commands and output under CODE delimiters, as recommended in the FAQ. This way, they will be easily readable. Thanks.

Best regards,

Marcos
Joseph Coveney

Join Date: Apr 2014

Posts: 4410
#12

24 Dec 2016, 17:44

Originally posted by john jose

Sorry. Somehow I cannot post my results as a table. Is this 'P>z' column the p value?

If you're using firthlogit because of separation or quasiseparation, then as described in the help file you shouldn't be using Wald tests. Use the likelihood-ratio test as shown in the command's help file and ancillary files.
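
The approach described above can be sketched roughly as follows. This is an illustration under stated assumptions, not a copy of the help file's example: it assumes that firthlogit accepts a constraints() option and stores results usable by estimates store and lrtest, that the outcome and predictors are already coded 0/1, and it uses response2, predictor1 and predictor3 purely as example names. The constrained model keeps the same specification and fixes the coefficient of interest at zero. Check help firthlogit for the exact syntax before relying on it.

Code:

* Full model
firthlogit response2 predictor1 predictor3, nolog
estimates store Full

* Same model with the coefficient on predictor1 constrained to zero
constraint 1 predictor1 = 0
firthlogit response2 predictor1 predictor3, constraints(1) nolog
estimates store Constrained

* Penalized likelihood-ratio test for predictor1
lrtest Full Constrained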
john jose

Join Date: Aug 2015

Posts: 24
#13

25 Dec 2016, 08:34

Thanks, Coveney & Almeida. I was using Firth regression because I have a sample size of 600 and the event occurred in only 17 cases (a rare outcome). There are no zero counts.
john jose

Join Date: Aug 2015

Posts: 24
#14

28 Dec 2016, 00:46

I installed the ancillary file for Firth regression. How do I open it in Stata? (Sorry, I am quite new to Stata.)
