  • Firth/penalized regression

    When I use the firthlogit command with binary variables, I get the message "string variable not allowed". How do I use Firth/penalized regression in Stata for categorical variables?

  • #2
    Originally posted by john jose
    When I use the firthlogit command with binary variables, I get the message "string variable not allowed".
    Don't use string variables. Type
    Code:
    help encode
    at Stata's command line to see how to avoid using them.
    Originally posted by john jose
    How do I use Firth/penalized regression in Stata for categorical variables?
    Code:
    firthlogit response i.predictor
    See also this about user-written commands.
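
    As a minimal sketch of the two steps above, assuming a hypothetical string variable named predictor_str (it does not appear in the original question), one could create a numeric copy with encode and then pass it to firthlogit using factor-variable notation. Because firthlogit is user-written, the sketch installs it from SSC first.

    ```stata
    * Sketch only: predictor_str is a hypothetical string variable.
    ssc install firthlogit                       // user-written command, from SSC
    encode predictor_str, generate(predictor)    // numeric copy with value labels
    label list predictor                         // which number maps to which category
    firthlogit response i.predictor
    ```

    Note that encode numbers the categories 1, 2, ... in alphabetical order of the strings, so a binary outcome encoded this way ends up coded 1/2 rather than 0/1.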



    • #3
      Thanks, Joseph Coveney. I encoded them as numeric, as suggested in help encode, and got the following:

      . firthlogit response i.predictor1 predictor2 predictor3 predictor4 predictor5 predictor6 predictor7 predictor8 predictor9 predictor10 predictor11

      initial:       penalized log likelihood = -5.3709737
      rescale:       penalized log likelihood = -5.3709737
      Iteration 0:   penalized log likelihood = -5.3709737  (not concave)
      Iteration 1:   penalized log likelihood = -4.6393957  (not concave)
      Iteration 2:   penalized log likelihood = -4.5758953  (not concave)
      Iteration 3:   penalized log likelihood = -4.2169041  (not concave)
      Iteration 4:   penalized log likelihood = -4.1528348  (not concave)
      Iteration 5:   penalized log likelihood = -4.1261668  (not concave)
      Iteration 6:   penalized log likelihood =  -4.117472  (not concave)
      Iteration 7:   penalized log likelihood = -4.1100772
      Iteration 8:   penalized log likelihood = -4.0970614
      Iteration 9:   penalized log likelihood = -4.0675991  (not concave)
      Iteration 10:  penalized log likelihood = -4.0635757
      Iteration 11:  penalized log likelihood = -4.0450612  (not concave)
      Iteration 12:  penalized log likelihood = -4.0440434
      Iteration 13:  penalized log likelihood = -4.0427497
      Iteration 14:  penalized log likelihood = -4.0371908
      Iteration 15:  penalized log likelihood = -4.0371525  (not concave)
      Iteration 16:  penalized log likelihood = -4.0369904
      Iteration 17:  penalized log likelihood = -4.0354216
      Iteration 18:  penalized log likelihood =  -4.035416
      Iteration 19:  penalized log likelihood =  -4.035416

                                                      Number of obs     =        625
                                                      Wald chi2(11)     =       2.37
      Penalized log likelihood = -4.035416            Prob > chi2       =     0.9967

      ------------------------------------------------------------------------------
          response |      Coef.  Std. Err.        z   P>|z|     [95% Conf. Interval]
      -------------+----------------------------------------------------------------
        predictor1 |
                1  |  -1.532435   2.141242    -0.72   0.474    -5.729193    2.664323
        predictor2 |  -.4538237   1.910201    -0.24   0.812    -4.197748    3.290101
        predictor3 |   .1414892   2.077999     0.07   0.946    -3.931315    4.214293
        predictor4 |   .9791276    1.80847     0.54   0.588    -2.565409    4.523664
        predictor5 |  -1.382989   1.686319    -0.82   0.412    -4.688114    1.922136
        predictor6 |   .1535513   2.312497     0.07   0.947    -4.378859    4.685961
        predictor7 |  -.1791151   1.894222    -0.09   0.925    -3.891721    3.533491
        predictor8 |  -.3405104   1.871929    -0.18   0.856    -4.009423    3.328402
        predictor9 |   .7565706   2.023236     0.37   0.708    -3.208898    4.722039
       predictor10 |  -.3020743   2.063096    -0.15   0.884    -4.345668    3.741519
       predictor11 |  -.4753602   2.274041    -0.21   0.834    -4.932398    3.981677
             _cons |   5.725377   9.766529     0.59   0.558    -13.41667    24.86742
      ------------------------------------------------------------------------------

      . firthlogit, or

                                                      Number of obs     =        625
                                                      Wald chi2(11)     =       2.37
      Penalized log likelihood = -4.035416            Prob > chi2       =     0.9967

      ------------------------------------------------------------------------------
          response | Odds Ratio  Std. Err.        z   P>|z|     [95% Conf. Interval]
      -------------+----------------------------------------------------------------
        predictor1 |
                1  |   .2160091   .4625277    -0.72   0.474     .0032497    14.35822
        predictor2 |   .6351947   1.213349    -0.24   0.812     .0150294    26.84556
        predictor3 |   1.151988    2.39383     0.07   0.946     .0196179    67.64634
        predictor4 |   2.662133   4.814388     0.54   0.588     .0768877    92.17271
        predictor5 |   .2508277   .4229757    -0.82   0.412      .009204    6.835544
        predictor6 |   1.165968   2.696296     0.07   0.947     .0125397    108.4144
        predictor7 |   .8360097   1.583588    -0.09   0.925     .0204102    34.24331
        predictor8 |   .7114071   1.331703    -0.18   0.856     .0181439    27.89374
        predictor9 |   2.130956   4.311426     0.37   0.708     .0404011    112.3973
       predictor10 |   .7392831   1.525212    -0.15   0.884     .0129629    42.16198
       predictor11 |   .6216611   1.413682    -0.21   0.834     .0072092    53.60687
             _cons |   306.5489   2993.918     0.59   0.558     1.49e-06    6.31e+10
      ------------------------------------------------------------------------------

      Which is the p-value here? Is it the P>|z| column? With the usual logistic regression, two of them came out as predictors with p = 0.002. With Firth I don't see any. Am I doing something wrong?



      • #4
        Code:
         * Example generated by -dataex-. To install: ssc install dataex clear input long(response predictor1 predictor2 predictor3 predictor4 predictor5 predictor6 predictor7 predictor8 predictor9 predictor10 predictor11) 2 1 1 2 2 1 2 2 1 2 2 1 2 1 2 1 2 1 2 2 2 2 2 1 2 2 2 2 2 1 1 2 2 2 2 2 2 2 1 2 2 1 2 1 1 1 2 1 2 1 1 2 2 1 2 2 2 2 2 2 2 2 1 2 2 1 2 2 1 1 2 2 2 1 2 2 2 1 2 2 2 2 2 1 2 1 1 2 2 1 2 2 2 2 2 2 2 1 2 1 2 1 1 2 1 2 2 2 2 1 1 1 2 1 2 2 1 2 2 1 2 2 1 2 2 1 1 2 2 2 2 2 2 1 1 1 2 1 2 2 1 2 2 2 2 1 1 1 2 1 2 2 2 2 2 1 2 1 1 1 2 1 2 2 1 2 1 1 2 1 2 1 2 1 2 2 1 1 2 1 2 1 2 1 2 2 2 2 1 2 2 1 2 1 1 1 2 1 2 2 1 2 2 1 2 2 1 2 2 1 2 2 2 2 2 2 2 1 2 1 2 1 2 2 1 2 2 2 2 1 1 1 2 1 1 2 1 1 2 2 2 1 1 1 2 2 1 2 2 2 2 2 2 1 2 1 2 2 2 2 1 2 2 2 2 1 2 1 2 1 2 2 2 2 2 1 2 1 1 1 2 1 1 2 1 2 2 2 2 1 1 1 2 1 1 2 1 1 2 2 2 1 1 1 2 1 2 2 2 2 2 1 2 1 1 1 2 1 1 2 2 2 2 2 2 1 1 2 2 2 1 2 2 1 2 2 2 1 1 1 2 1 1 2 2 1 2 2 2 1 1 1 2 1 2 2 2 2 2 1 2 1 2 2 2 1 2 2 2 2 2 2 2 1 1 1 2 1 2 2 1 1 2 1 2 2 1 2 1 1 1 2 1 1 2 2 2 2 2 2 1 1 2 2 1 2 2 2 2 2 1 1 1 1 2 2 2 2 2 1 2 1 1 2 1 1 2 2 1 2 2 1 2 1 2 1 1 1 1 2 2 2 2 2 2 1 1 1 1 2 2 2 1 2 2 1 2 1 1 1 1 2 2 2 1 2 2 2 2 1 2 2 1 1 2 1 2 2 2 2 2 1 1 1 2 1 2 2 2 2 2 1 2 1 1 1 2 1 2 2 1 1 1 1 2 1 1 1 2 1 1 2 1 2 1 2 2 1 1 1 2 1 1 2 2 2 1 2 1 1 1 1 2 1 2 2 1 2 1 1 2 1 1 1 2 2 2 2 1 1 1 1 1 1 1 1 2 2 2 2 2 1 1 1 2 1 1 1 2 1 2 2 2 2 1 2 2 1 1 1 2 1 2 2 2 1 1 1 2 1 2 2 2 1 1 2 2 2 1 2 2 1 1 2 2 1 1 2 1 2 1 2 2 1 1 2 2 1 2 2 2 2 1 1 2 1 1 1 2 1 2 2 2 1 1 1 2 1 2 1 2 1 2 2 2 2 1 1 2 1 1 1 2 1 2 2 1 2 1 1 2 1 1 1 2 1 1 2 1 2 1 1 2 1 1 1 2 2 2 2 2 2 1 . 
1 1 1 1 2 1 2 2 2 2 1 1 2 1 2 1 2 1 2 2 2 2 1 1 2 1 2 1 2 1 1 1 1 2 1 2 2 1 1 1 2 1 2 2 1 2 1 1 2 1 1 1 2 1 1 2 2 1 1 2 2 1 2 1 2 1 1 1 1 1 2 2 2 1 1 2 2 1 2 2 1 2 1 1 1 1 2 1 1 1 2 2 2 2 1 1 2 1 1 1 1 1 2 2 2 2 1 1 2 1 1 1 1 2 2 2 1 2 1 1 2 1 1 1 1 1 2 2 2 2 1 1 2 1 1 2 1 1 2 2 1 2 1 2 2 1 2 1 1 1 1 2 1 2 1 2 2 1 2 1 1 1 1 2 1 2 1 2 1 1 1 1 2 1 2 2 2 2 1 1 2 1 2 2 2 1 2 1 1 2 2 1 2 1 1 2 2 1 2 2 2 2 2 1 2 1 1 2 2 2 2 2 1 2 2 1 2 1 1 2 2 1 2 2 2 2 2 1 2 1 2 2 2 1 1 2 2 2 2 2 2 1 2 2 2 1 1 2 2 2 2 2 2 1 1 2 2 1 2 2 2 2 2 1 2 1 1 2 2 1 1 2 1 2 2 2 2 1 2 2 1 1 2 1 1 2 2 1 2 1 2 2 1 1 2 1 1 2 2 1 2 1 1 2 1 1 1 2 2 2 2 2 2 1 2 2 1 1 2 1 1 2 2 1 2 1 2 2 1 1 2 1 1 2 2 1 2 1 1 2 1 1 2 2 1 2 2 1 2 1 1 2 1 2 1 2 2 2 2 2 2 1 1 2 2 1 2 1 1 2 1 1 2 1 1 2 2 1 2 2 1 2 1 2 2 1 1 2 2 1 2 2 1 2 1 1 2 1 2 2 2 1 1 2 2 2 1 2 2 1 1 2 2 2 2 1 2 2 1 2 1 1 1 2 2 1 2 2 1 2 1 1 2 1 1 2 2 1 2 1 1 2 1 1 2 1 2 2 2 1 2 2 2 2 2 2 1 2 1 2 2 1 2 1 1 1 2 1 2 1 2 1 2 1 2 1 1 2 2 2 2 2 1 2 2 1 2 1 2 2 2 2 2 2 2 2 2 2 2 1 2 1 2 1 2 1 1 2 2 1 1 2 2 1 2 2 end label values response response label def response 1 "1", modify label def response 2 "2", modify label values predictor1 predictor1 label def predictor1 1 "0", modify label def predictor1 2 "1", modify label values predictor2 predictor2 label def predictor2 1 "1", modify label def predictor2 2 "2", modify label values predictor3 predictor3 label def predictor3 1 "1", modify label def predictor3 2 "2", modify label values predictor4 predictor4 label def predictor4 1 "1", modify label def predictor4 2 "2", modify label values predictor5 predictor5 label def predictor5 1 "1", modify label def predictor5 2 "2", modify label values predictor6 predictor6 label def predictor6 1 "1", modify label def predictor6 2 "2", modify label values predictor7 predictor7 label def predictor7 1 "1", modify label def predictor7 2 "2", modify label values predictor8 predictor8 label def predictor8 1 "1", modify label def predictor8 2 "2", modify label values predictor9 
predictor9 label def predictor9 1 "1", modify label def predictor9 2 "2", modify label values predictor10 predictor10 label def predictor10 1 "1", modify label def predictor10 2 "2", modify label values predictor11 predictor11 label def predictor11 1 "1", modify label def predictor11 2 "2", modify



        • #5
          I used the following command:

          firthlogit response i.predictor1 predictor2 predictor3 predictor4 predictor5 predictor6 predictor7 predictor8 predictor9 predictor10 predictor11

          Above I have posted a sample of 100 observations from my data. Am I doing something wrong?



          • #6
            Your sample data are unreadable. Please try again and use the Preview feature to check that line breaks define separate lines.



            • #7
              As Nick said, no one can really read your example well, but it seems as if every one of your variables takes on values of 1 or 2, including the response variable.

              There is a passage in the help file for the official command, logit:
              depvar equal to nonzero and nonmissing (typically depvar equal to one) indicates a positive outcome, whereas depvar equal to zero indicates a negative outcome.
              This applies to the user-written command firthlogit (SSC), too.

              Also, read the help file for Stata's factor variable notation. Type
              Code:
              help factor variables
              at the command line.

              Afterward, I'm guessing that you'll end up with something like
              Code:
              generate byte response2 = response == 2
              firthlogit response2 i.(predictor*), nolog
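
              One caveat with the generate line above: the expression response == 2 evaluates to 0 when response is missing, so missing outcomes would be counted as negative outcomes. A hedged variant, using the same variable names and assuming the outcome may contain missing values, checks the raw coding first and keeps missing values missing:

              ```stata
              * Sketch only: variable names follow the thread.
              tabulate response, nolabel missing       // show the underlying numeric codes
              generate byte response2 = response == 2 if !missing(response)
              tabulate response response2, missing     // confirm 1 -> 0, 2 -> 1, missing stays missing
              firthlogit response2 i.(predictor*), nolog
              ```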



              • #8
                Thanks, Coveney. I encoded the string variables as numeric. Then I recoded the response so that 1 is a positive outcome and 0 is a negative outcome. Similarly, I recoded the predictors so that 1 is present and 0 is absent. Then I tried:

                firthlogit response i.predictor1 i.predictor2 i.predictor3 i.predictor4 i.predictor5 i.predictor6 i.predictor7 i.predictor8 i.predictor9 i.predictor10 i.predictor11



                • #9
                  outcome Coef. Std. Err. z P>z [95% Conf. Interval] predictor1 1 2.503442 .7247074 3.45 0.001 1.083042 3.923843 predictor2 1 -.1585433 .5442104 -0.29 0.771 -1.225176 .9080895 predictor3 1 1.830119 .623115 2.94 0.003 .608836 3.051402 predictor4 1 -.8968628 .7170882 -1.25 0.211 -2.30233 .5086042 predictor5 1 -.1005266 .6822045 -0.15 0.883 -1.437623 1.23657 predictor6 1 .6214674 .6967634 0.89 0.372 -.7441637 1.987099 predictor7 1 -.3120722 .5484386 -0.57 0.569 -1.386992 .7628477 predictor8 1 -.0808703 .5309698 -0.15 0.879 -1.121552 .9598113 predictor9 1 .2397349 .656976 0.36 0.715 -1.047915 1.527384 predictor10 1 .052267 .5521066 0.09 0.925 -1.029842 1.134376 predictor11 1 3.941909 1.549027 2.54 0.011 .9058712 6.977947 _cons -7.685536 1.940974 -3.96 0.000 -11.48978 -3.881297 .



                  • #10
                    Sorry. Somehow I cannot post my results as a table. Is this 'P>z' column the p value?



                    • #11
                      Yes.

                      I kindly recommend that you post the commands and output within CODE delimiters, as recommended in the FAQ. That way, they will be easy to read. Thanks.
                      Best regards,

                      Marcos



                      • #12
                        Originally posted by john jose
                        Sorry. Somehow I cannot post my results as a table. Is this 'P>z' column the p value?
                        If you're using firthlogit because of separation or quasiseparation, then as described in the help file you shouldn't be using Wald tests. Use the likelihood-ratio test as shown in the command's help file and ancillary files.
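
                        A rough sketch of such a likelihood-ratio test for a single predictor follows. It assumes the outcome and predictors are already coded 0/1, uses a shortened predictor list for brevity, and assumes that firthlogit accepts a constraints() option and cooperates with lrtest in the way its help file illustrates; check help firthlogit before relying on it. The stored-estimate names Full and Constrained are arbitrary.

                        ```stata
                        * Sketch only: penalized likelihood-ratio test for predictor1.
                        * Assumes 0/1 coding and that firthlogit supports constraints()
                        * as described in its help file.
                        firthlogit response predictor1 predictor2 predictor3, nolog
                        estimates store Full

                        constraint 1 predictor1 = 0
                        firthlogit response predictor1 predictor2 predictor3, constraints(1) nolog
                        estimates store Constrained

                        lrtest Full Constrained
                        ```

                        The constrained fit keeps predictor1 in the model with its coefficient fixed at zero, rather than dropping it from the variable list.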



                        • #13
                          Thanks, Coveney and Almeida. I was using Firth regression because I have a sample size of about 600 and the event occurred in only 17 observations (a rare outcome). There are no null counts.



                          • #14
                            I installed the ancillary file for firthlogit. How do I open it in Stata? (Sorry, I am quite new to Stata.)


