
  • Logit message meaning requested

    I am unable to post a data sample, as I am getting the following message:
    dataex in 1/101
    input statement exceeds linesize limit. Try specifying fewer variables
    r(1000);




    I undertook the following logistic regression:

    logit AssuredOrNot Words BoilerWords Fog HardInfoMix1000 RedundantWords Specificity1000 Polarity1000 Subjectivity1000 LnTotalAssets roa lev GovSysAtt Indpres FX_INC SP_ESG_SCORE INSTOWN_PERCENT ESI ib0.NumIndsector Fiscalyear, vce(cluster CompName) baselevels

    where NumIndsector is a categorical variable with levels 0-9 and the following value label:
    Indsectorlabel:
    0 Communication Services
    1 Consumer Discretionary
    2 Energy
    3 Industrials
    4 Health Care
    5 Information Technology
    6 Financials
    7 Materials
    8 Consumer Staples
    9 Utilities
    10 (Invalid Identifier)

    The tab command provides the following:

    tab NumIndsector
    Ind sector               |     Freq.     Percent        Cum.
    -------------------------+----------------------------------
    Communication Services   |        38        3.08        3.08
    Consumer Discretionary   |       102        8.27       11.35
    Energy                   |        89        7.22       18.57
    Industrials              |       187       15.17       33.74
    Health Care              |       161       13.06       46.80
    Information Technology   |       152       12.33       59.12
    Financials               |       189       15.33       74.45
    Materials                |        75        6.08       80.54
    Consumer Staples         |       126       10.22       90.75
    Utilities                |       114        9.25      100.00
    -------------------------+----------------------------------
    Total                    |     1,233      100.00

    The output includes the following message. I would be very grateful if someone could explain what '0.NumIndsector != 0 predicts failure perfectly' means and how I should interpret it:

    note: 0.NumIndsector != 0 predicts failure perfectly;
    0.NumIndsector omitted and 37 obs not used.
    note: 9.NumIndsector omitted because of collinearity.
    Iteration 0: log pseudolikelihood = -589.832
    Iteration 1: log pseudolikelihood = -456.34971
    Iteration 2: log pseudolikelihood = -439.17782
    Iteration 3: log pseudolikelihood = -438.79576
    Iteration 4: log pseudolikelihood = -438.79559
    Iteration 5: log pseudolikelihood = -438.79559
    Logistic regression                                     Number of obs =  1,171
                                                            Wald chi2(26) = 107.06
                                                            Prob > chi2   = 0.0000
    Log pseudolikelihood = -438.79559                       Pseudo R2     = 0.2561

                                  (Std. err. adjusted for 186 clusters in CompName)
    -------------------------+-------------------------------------------------------------------
                             |                   Robust
    AssuredOrNot             |  Coefficient   std. err.       z   P>|z|      [95% conf. interval]
    -------------------------+-------------------------------------------------------------------
    Words                    |     .0000148    .0000111    1.34   0.182    -6.93e-06     .0000366
    BoilerWords              |     .1125526    2.702291    0.04   0.967    -5.183841     5.408946
    Fog                      |    -.0051466    .0569717   -0.09   0.928    -.1168091      .106516
    HardInfoMix1000          |     .0228301    .0080042    2.85   0.004     .0071422     .0385181
    RedundantWords           |     3.532224    8.404199    0.42   0.674     -12.9397     20.00415
    Specificity1000          |    -.0364986    .0115707   -3.15   0.002    -.0591767    -.0138204
    Polarity1000             |    -.0250452    .0077966   -3.21   0.001    -.0403262    -.0097642
    Subjectivity1000         |    -.0073231    .0046093   -1.59   0.112    -.0163571      .001711
    LnTotalAssets            |      .223048    .1701327    1.31   0.190    -.1104059     .5565019
    roa                      |     3.545522    2.142119    1.66   0.098    -.6529535     7.743997
    leverage                 |    -.9055115    1.217426   -0.74   0.457    -3.291623       1.4806
    GovSysAtt                |     .5243119    .2818438    1.86   0.063    -.0280918     1.076716
    Indpres                  |    -.3471263    .0902281   -3.85   0.000    -.5239701    -.1702825
    FX_INC                   |     2.68e-06    1.72e-06    1.56   0.119    -6.85e-07     6.04e-06
    SP_ESG_SCORE             |     .0123406    .0099661    1.24   0.216    -.0071925     .0318738
    INSTOWN_PERCENT          |    -.0119581    .0270028   -0.44   0.658    -.0648827     .0409664
    ESI                      |    -.1851819     .624183   -0.30   0.767    -1.408558     1.038194
                             |
    NumIndsector             |
      Communication Services |            0  (empty)
      Consumer Discretionary |     3.439047    1.510276    2.28   0.023     .4789607     6.399133
      Energy                 |     2.107597    1.255424    1.68   0.093     -.352989     4.568183
      Industrials            |     2.482482    1.472968    1.69   0.092    -.4044817     5.369447
      Health Care            |      .982115    1.384626    0.71   0.478    -1.731702     3.695933
      Information Technology |     .4398975    1.439663    0.31   0.760    -2.381791     3.261586
      Financials             |     .8451889    1.567424    0.54   0.590    -2.226905     3.917283
      Materials              |     1.804325      1.3748    1.31   0.189    -.8902331     4.498882
      Consumer Staples       |     2.030461    1.370176    1.48   0.138    -.6550339     4.715956
      Utilities              |            0  (omitted)
                             |
    Fiscalyear               |     .1213966    .0554497    2.19   0.029     .0127171      .230076
    _cons                    |    -242.4183    111.6316   -2.17   0.030    -461.2121     -23.6244
    -------------------------+-------------------------------------------------------------------
    .
    end of do-file

  • #2
    Sunita:
    Logistic regression works well if the regressand is composed of 0 and 1. If, for any reason, a given predictor causes the regressand to be always 0 or always 1, the MLE machinery cannot work.
    Hence, the culprit level of the categorical predictor is omitted, along with the observations related to it.
    Kind regards,
    Carlo
    (StataNow 18.5)
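
    The reason the MLE machinery breaks down can be written out in a few lines. This is only a sketch in notation assumed for illustration (none of it appears in the thread): d_i = 1 marks a Communication Services observation, x_i collects the remaining regressors, gamma is the coefficient on d_i, and Lambda is the logistic function.

    ```latex
    % Assumed notation: d_i = 1 for Communication Services, x_i = other regressors,
    % \gamma = coefficient on d_i, \Lambda(z) = 1/(1+e^{-z}).
    \begin{align*}
    \ell(\beta,\gamma)
      &= \sum_{i:\,d_i=0}\Bigl[\,y_i\log\Lambda(x_i'\beta)
           + (1-y_i)\log\bigl(1-\Lambda(x_i'\beta)\bigr)\Bigr]
       + \sum_{i:\,d_i=1}\log\bigl(1-\Lambda(x_i'\beta+\gamma)\bigr),
       \qquad\text{since } y_i = 0 \text{ whenever } d_i = 1, \\[4pt]
    \log\bigl(1-\Lambda(x_i'\beta+\gamma)\bigr)
      &= -\log\bigl(1+e^{\,x_i'\beta+\gamma}\bigr)
       \;\longrightarrow\; 0 \quad\text{as } \gamma\to-\infty .
    \end{align*}
    ```

    Each term in the second sum is negative and rises towards 0 as gamma falls, so the log likelihood keeps increasing and no finite gamma maximizes it. That is why the level is dropped rather than estimated.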



    • #3
      Every observation in Communication Services has AssuredOrNot==0 so it is not possible to estimate a coefficient for Communication Services in a logistic model (all failures implies the coefficient is negative infinity).

      Because of that, the Communication Services observations are omitted from the model.
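
      One way to check this directly is to cross-tabulate the sector variable against the outcome. The lines below are a minimal sketch using the variable names from the logit command in the first post. Note that tab reports 38 Communication Services observations while the note drops 37, presumably because one has a missing value in some model variable, so the full-data counts can differ slightly from the estimation sample.

      ```stata
      * Sketch only: variable names are taken from the logit command in the first post.

      * Sector by outcome in the full data; a perfectly predicted sector shows up
      * as a row in which one of the two outcome cells is 0.
      tabulate NumIndsector AssuredOrNot

      * Count Communication Services observations with a positive outcome.
      * A count of 0 is consistent with the "predicts failure perfectly" note;
      * a nonzero count could only come from observations excluded from the
      * estimation sample (e.g. for missing values).
      count if NumIndsector == 0 & AssuredOrNot == 1
      ```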



      • #4
        Thank you so much! This is so helpful.
