  • Logit and 'convergence not achieved'

    Hello Statalist users,

    I am getting a strange instance of the error "Convergence not achieved" which does not occur with my controls for my whole analytical sample, but does occur when examining educational subgroups.

    The controls which work are as follows:

    Code:
    logit lfs sex##survmnth##loneyg i.noc_40 i.naics_21 i.ftpt i.cowmain c.tenlfs i.age_12 i.marstat i.immig i.prov i.edu [pweight=finalwt], or
    But they break down under a couple of conditions. Below are the controls applied to a subgroup (those with high school education or less):

    Code:
    logit lfs sex##survmnth##loneyg i.noc_40 i.naics_21 i.ftpt i.cowmain c.tenlfs i.age_12 i.marstat i.immig i.prov if edu==0 [pweight=finalwt], or
    With this code, Stata iterates until the "Convergence" error occurs. The -difficult- option does not help. If I knew how many iterations to expect, I would simply let them run or set the option "iterate(number)", but the iteration count already exceeds the number of observations in the subsample.

    There seem to be two offending elements in the second line of code that may be responsible.

    1) Without "naics_21" (21 industry categories from the North American Industry Classification System), Stata gives output with no issue. Curiously, if I prefix it with c. (instead of i.), the output also comes back with no issue, but of course treating "naics_21" as continuous does not seem like an appropriate solution.

    2) On the other hand, it is the subsample ("if edu==0") which also triggers numerous "iterations" in output. When this is removed (as well as "edu") it comes back fine, but this is not substantively very different from the first line of code in this post, and so is not meaningful or useful.

    Is it possible either to find out how many iterations to expect, or to run the above controls for the selected subgroup without causing all these iterations?

  • #2
    I don't know the answer for certain, but I can think of two possibilities. One is that perhaps there is very little outcome variation in the edu == 0 subset. How does -tab lfs if e(sample)- (run after the logit finally gives up) look? Another question: in the output table that you finally get from -logit- (which is not usable as a solution to the problem but can be informative for diagnosing problems like this), are there any variables where the coefficient or standard error is missing or has an absurdly large magnitude (whether positive or negative)? Those often indicate troublesome explanatory variables. A variable can be troublesome in a subset if it is nearly perfectly predictive there even though in the full sample it is not.

    If this doesn't help and if nobody else responds with a more effective answer, I would recommend you post the output that logit gave you (omitting the iteration log) but showing all other messages, and the full regression table including its header. Sometimes there are clues there.
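
    A minimal sketch of those checks, assuming the variable names from #1 and the edu==0 subsample used there. The iterate(20) cap is an arbitrary value chosen for illustration:

    Code:
    * Unweighted cell counts: levels with very few observations, or with (nearly) all outcomes in one column, are the likely culprits
    tabulate noc_40 lfs if edu==0
    tabulate naics_21 lfs if edu==0
    tabulate cowmain lfs if edu==0

    * Cap the iterations so the model stops early; -capture noisily- lets a do-file continue past r(430)
    capture noisily logit lfs sex##survmnth##loneyg i.noc_40 i.naics_21 i.ftpt i.cowmain c.tenlfs i.age_12 i.marstat i.immig i.prov if edu==0 [pweight=finalwt], or iterate(20)

    * 0 means the optimizer did not converge
    display e(converged)
    tabulate lfs if e(sample)
    The cross-tabulations are deliberately unweighted, since it is the raw number of observations per cell that drives perfect or near-perfect prediction.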

    • #3
      Here is the output for -tab lfs if e(sample)- below, after the logit exhausts itself.

      Code:
              lfs |      Freq.     Percent        Cum.
      ------------+-----------------------------------
              not |        351       24.95       24.95
         Employed |      1,056       75.05      100.00
      ------------+-----------------------------------
            Total |      1,407      100.00
      As to the second question: there are indeed absurdly large magnitude values for coefficients and standard errors. Here is the output for the logit below.

      Code:
      note: 1.noc_40 != 0 predicts success perfectly;
            1.noc_40 omitted and 1 obs not used.
      
      note: 2.noc_40 != 0 predicts success perfectly;
            2.noc_40 omitted and 12 obs not used.
      
      note: 12.noc_40 != 0 predicts success perfectly;
            12.noc_40 omitted and 6 obs not used.
      
      note: 19.noc_40 != 0 predicts success perfectly;
            19.noc_40 omitted and 3 obs not used.
      
      note: 2.naics_21 != 0 predicts failure perfectly;
            2.naics_21 omitted and 2 obs not used.
      
      note: 13.naics_21 != 0 predicts success perfectly;
            13.naics_21 omitted and 23 obs not used.
      
      note: 21.naics_21 != 0 predicts success perfectly;
            21.naics_21 omitted and 41 obs not used.
      
      note: 3.cowmain != 0 predicts success perfectly;
            3.cowmain omitted and 26 obs not used.
      
      note: 4.cowmain != 0 predicts success perfectly;
            4.cowmain omitted and 19 obs not used.
      
      Logistic regression                                     Number of obs =  1,407
                                                              Wald chi2(89) =      .
                                                              Prob > chi2   =      .
      Log pseudolikelihood = -163993.8                        Pseudo R2     = 0.3044
      
      --------------------------------------------------------------------------------------------------------------------------------------------------
                                                                                       |               Robust
                                                                                   lfs | Odds ratio   std. err.      z    P>|z|     [95% conf. interval]
      ---------------------------------------------------------------------------------+----------------------------------------------------------------
                                                                                   sex |
                                                                               Female  |   .6995441   .5349821    -0.47   0.640     .1562614    3.131688
                                                                                       |
                                                                              survmnth |
                                                                                  Mar  |   .3275475   .2522893    -1.45   0.147     .0723847    1.482182
                                                                                  Apr  |   .0490638   .0357137    -4.14   0.000     .0117806    .2043403
                                                                                  May  |   .2335203   .2031665    -1.67   0.095      .042439    1.284943
                                                                                       |
                                                                          sex#survmnth |
                                                                           Female#Mar  |   1.323036   1.164614     0.32   0.750     .2356645    7.427609
                                                                           Female#Apr  |   2.887731   2.468737     1.24   0.215     .5405801      15.426
                                                                           Female#May  |   .2876588   .2785515    -1.29   0.198     .0431147    1.919243
                                                                                       |
                                                                                loneyg |
                                                               Lone parents, yg child  |   .4605332   .3875742    -0.92   0.357     .0884931    2.396694
                                                                                       |
                                                                            sex#loneyg |
                                                        Female#Lone parents, yg child  |   1.522158   1.483498     0.43   0.666     .2253601    10.28116
                                                                                       |
                                                                       survmnth#loneyg |
                                                           Mar#Lone parents, yg child  |   3.759255   4.020351     1.24   0.216     .4621474    30.57898
                                                           Apr#Lone parents, yg child  |   20.82179   24.95907     2.53   0.011     1.986934    218.1991
                                                           May#Lone parents, yg child  |   .9214646   1.113021    -0.07   0.946     .0863629    9.831732
                                                                                       |
                                                                   sex#survmnth#loneyg |
                                                    Female#Mar#Lone parents, yg child  |   .6686817   .8290576    -0.32   0.745     .0588663    7.595779
                                                    Female#Apr#Lone parents, yg child  |   .0654888   .0898653    -1.99   0.047     .0044476    .9642856
                                                    Female#May#Lone parents, yg child  |   2.879818   3.990955     0.76   0.445     .1904317    43.55027
                                                                                       |
                                                                                noc_40 |
                                                        Senior management occupations  |          1  (empty)
                                            Specialized middle management occupations  |          1  (empty)
      Middle management occupations in retail and wholesale trade and customer serv..  |   17.34758   20.14149     2.46   0.014     1.782162    168.8614
      Middle management occupations in trades, transportation, production and utili..  |   38.47737   59.16709     2.37   0.018     1.889309    783.6242
                                     Professional occupations in business and finance  |   23.42781    30.7169     2.41   0.016     1.793477    306.0325
              Administrative and financial supervisors and administrative occupations  |   11.77561   13.20844     2.20   0.028       1.3068    106.1103
                   Finance, insurance and related business administrative occupations  |   8.636436   13.99537     1.33   0.183     .3605453    206.8756
                                                           Office support occupations  |   11.64522   11.12322     2.57   0.010     1.791027    75.71703
                      Distribution, tracking and scheduling co-ordination occupations  |    3.06502    4.04801     0.85   0.396     .2302719    40.79678
                             Professional occupations in natural and applied sciences  |    38.1829   55.05665     2.53   0.012     2.262065    644.5144
                        Technical occupations related to natural and applied sciences  |   14.87765   16.82461     2.39   0.017     1.621589    136.4986
                                                  Professional occupations in nursing  |          1  (empty)
                                  Professional occupations in health (except nursing)  |   .1333804    .199972    -1.34   0.179     .0070619    2.519214
                                                      Technical occupations in health  |   2.577426   3.451454     0.71   0.480     .1867818    35.56624
                                  Assisting occupations in support of health services  |   1.067632   1.434506     0.05   0.961     .0766865    14.86362
                                       Professional occupations in education services  |   2.143288   3.339632     0.49   0.625     .1011007    45.43674
        Professional occupations in law and social, community and government services  |   7.117452   12.76229     1.09   0.274      .211854     239.118
      Paraprofessional occupations in legal, social, community and education services  |   .5282123   .6145161    -0.55   0.583      .054017    5.165195
                                 Occupations in front-line public protection services  |          1  (empty)
      Care providers and educational, legal and public protection support occupations  |   .7194621   .8977096    -0.26   0.792      .062362    8.300334
                                          Professional occupations in art and culture  |   3.994771   6.123403     0.90   0.366     .1980268    80.58607
                          Technical occupations in art, culture, recreation and sport  |   4.868091   6.596362     1.17   0.243     .3419506    69.30332
                           Retail sales supervisors and specialized sales occupations  |   23.27399   29.44919     2.49   0.013     1.949051    277.9192
                              Service supervisors and specialized service occupations  |   3.145253   3.341709     1.08   0.281     .3920009    25.23621
                  Sales representatives and salespersons - wholesale and retail trade  |   13.28969   15.37566     2.24   0.025     1.376282    128.3283
         Service representatives and other customer and personal services occupations  |   2.659485   2.755492     0.94   0.345      .349034     20.2641
                                                            Sales support occupations  |   8.290963   8.886706     1.97   0.048     1.014474    67.75934
                                Service support and other service occupations, n.e.c.  |   2.528624   2.642393     0.89   0.375     .3261257    19.60575
                                       Industrial, electrical and construction trades  |   1.147315   1.321866     0.12   0.905     .1199447     10.9745
                                           Maintenance and equipment operation trades  |   .4873786   .5872475    -0.60   0.551     .0459457    5.169969
                      Other installers, repairers and servicers and material handlers  |    7.53042   9.624366     1.58   0.114     .6150754    92.19558
          Transport and heavy equipment operation and related maintenance occupations  |   32.85529   37.59408     3.05   0.002     3.488455    309.4408
                       Trades helpers, construction labourers and related occupations  |   .5435093   .6966689    -0.48   0.634     .0440696     6.70309
      Supervisors and technical occupations in natural resources, agriculture and r..  |   1.13e+11   2.62e+11    10.94   0.000     1.18e+09    1.08e+13
                     Workers in natural resources, agriculture and related production  |   233.3595    438.288     2.90   0.004     5.879328    9262.398
                              Harvesting, landscaping and natural resources labourers  |   .9604335   1.506756    -0.03   0.979     .0443688    20.79013
      Processing, manufacturing and utilities supervisors and central control opera..  |   3.609545   4.216615     1.10   0.272     .3656723    35.62976
        Processing and manufacturing machine operators and related production workers  |   4.693355   4.254429     1.71   0.088      .794121    27.73831
                                                          Assemblers in manufacturing  |    4.70765   4.881877     1.49   0.135     .6167372    35.93421
                                 Labourers in processing, manufacturing and utilities  |          1  (omitted)
                                                                                       |
                                                                              naics_21 |
                             Forestry and logging and support activities for forestry  |          1  (empty)
                                                        Fishing, hunting and trapping  |   3.95e-10          .        .       .            .           .
                                        Mining, quarrying, and oil and gas extraction  |   1227.356   2588.956     3.37   0.001     19.65472    76643.35
                                                                         Construction  |   74.62585   111.5709     2.88   0.004     3.983669    1397.962
                                                        Manufacturing - durable goods  |   104.1612   165.7325     2.92   0.004      4.60624    2355.405
                                                    Manufacturing - non-durable goods  |   91.85928     149.02     2.79   0.005     3.821637    2207.988
                                                                      Wholesale trade  |   333.0223   567.4645     3.41   0.001     11.80429    9395.218
                                                                         Retail trade  |   81.54912   133.1837     2.69   0.007     3.320941    2002.523
                                                       Transportation and warehousing  |   50.73692   85.46613     2.33   0.020     1.868438    1377.747
                                                                Finance and insurance  |   525.6366   929.4223     3.54   0.000     16.42875     16817.7
                                                   Real estate and rental and leasing  |          1  (empty)
                                      Professional, scientific and technical services  |   73.95773   122.9317     2.59   0.010     2.845283    1922.391
                                        Business, building and other support services  |    181.355   294.8485     3.20   0.001     7.492705    4389.554
                                                                 Educational services  |   60.78326   112.3185     2.22   0.026     1.625132    2273.418
                                                    Health care and social assistance  |   499.2151   839.6662     3.69   0.000     18.47523    13489.18
                                                  Information, culture and recreation  |   28.02678   47.45054     1.97   0.049     1.014968    773.9165
                                                      Accommodation and food services  |    62.5179   101.9251     2.54   0.011     2.560122     1526.68
                                        Other services (except public administration)  |   361.1071   616.1155     3.45   0.001     12.74468    10231.59
                                                                Public administration  |          1  (empty)
                                                                                       |
                                                                                2.ftpt |   .8635246   .2109848    -0.60   0.548     .5349347    1.393955
                                                                                       |
                                                                               cowmain |
                                                             Private sector employees  |   .4667557   .3272446    -1.09   0.277     .1181159    1.844468
                                           Self-employed incorporated, with paid help  |          1  (empty)
                                             Self-employed incorporated, no paid help  |          1  (empty)
                                         Self-employed unincorporated, with paid help  |   .2249334   .2768805    -1.21   0.225     .0201499    2.510936
                                           Self-employed unincorporated, no paid help  |   1.196081   .9677199     0.22   0.825     .2449484    5.840457
                                                                                       |
                                                                                tenlfs |   1.009405   .0019731     4.79   0.000     1.005546     1.01328
                                                                                       |
                                                                                age_12 |
                                                                       30 to 34 years  |   .6280167   .2022119    -1.44   0.149     .3341158    1.180444
                                                                       35 to 39 years  |   .4926083   .1720472    -2.03   0.043     .2484352    .9767656
                                                                       40 to 44 years  |   1.037089   .4046799     0.09   0.926      .482693    2.228236
                                                                       45 to 49 years  |    4.75889   2.455082     3.02   0.002     1.731315    13.08083
                                                                       50 to 54 years  |   .4432627   .3067673    -1.18   0.240     .1141756    1.720875
                                                                                       |
                                                                               marstat |
                                                                            Separated  |    4.46145    3.56719     1.87   0.061     .9308735    21.38264
                                                                             Divorced  |    4.23225    3.50403     1.74   0.081     .8352787    21.44427
                                                                Single, never married  |   2.463139   1.921003     1.16   0.248     .5341174    11.35903
                                                                                       |
                                                                                 immig |
                                         Immigrant, landed more than 10 years earlier  |    .323313   .2413428    -1.51   0.130     .0748563    1.396426
                                                                        Non-immigrant  |    .450517   .3124952    -1.15   0.250     .1156875    1.754429
                                                                                       |
                                                                                  prov |
                                                                 Prince Edward Island  |   4.547499   2.995419     2.30   0.021     1.250517    16.53696
                                                                          Nova Scotia  |   1.258353   .6585286     0.44   0.661     .4511766    3.509605
                                                                        New Brunswick  |   .8832761   .4702095    -0.23   0.816     .3111442    2.507444
                                                                               Quebec  |   4.695723   2.205846     3.29   0.001     1.870015    11.79125
                                                                              Ontario  |   1.474982   .6204908     0.92   0.356     .6467045    3.364089
                                                                             Manitoba  |   3.408187   1.784817     2.34   0.019      1.22113    9.512288
                                                                         Saskatchewan  |   3.175139   1.632446     2.25   0.025     1.159128    8.697493
                                                                              Alberta  |   2.313089   1.065131     1.82   0.069     .9380607    5.703659
                                                                     British Columbia  |   2.575941   1.277215     1.91   0.056     .9747412     6.80742
                                                                                       |
                                                                                 _cons |   .0225189   .0488349    -1.75   0.080     .0003211     1.57936
      --------------------------------------------------------------------------------------------------------------------------------------------------
      Note: _cons estimates baseline odds.
      Note: 4 failures and 12 successes completely determined.
      convergence not achieved
      r(430);
      In this case, it seems the troublesome variables are "noc_40" and "naics_21". They don't pose any trouble for logits with the main sample, but create significant trouble when used as independent variables for a logit in a subset of the sample.

      So then, I wonder if there is any way to usefully retain these variables as controls when running logits for subsamples. If not, I imagine I should also remove them from the controls for the main sample (and the other subsamples), for the sake of consistency.

      • #4
        The reason those two variables are giving you trouble in the subsample but not in the whole sample is that they both have a large number of levels. It is one thing to take your entire sample and chop it up into a large number of pieces like that. But it is quite another to do that with a subsample of ~1400 observations. So, just as some of the levels of those variables are empty and get omitted, there will be some that have only a handful of observations. And if all but one or two of that handful turn out to have a 0 outcome, or if all but one have a 1 outcome, then you have near perfect prediction in a small group: the maximum likelihood estimate of the coefficient (the log odds ratio) for such a small group will be "close to infinity" if the outcomes are mostly 1, and "close to negative infinity" if mostly 0, so the reported odds ratio is either astronomically large or nearly zero. And when Stata tries to home in on the exact values of lots of these problematic coefficients, it ultimately gives up because the task requires more precision than can be squeezed out of the available data. When you don't restrict to edu == 0, the sample is, apparently, large enough to avoid these problems (or avoid enough of them that Stata doesn't quit out of desperation).

        Rather than omitting these variables, my thought would be, if possible, to coarsen them so they have fewer levels, by combining levels that are related. Could you, for example, combine Manufacturing - durable goods and Manufacturing - non-durable goods into a single category of Manufacturing without doing violence to the theory underlying how these variables are related to your outcome? If you could reduce each of these variables to just, say, half a dozen levels by grouping the current levels into coherent bunches, you would probably have no difficulty working in your edu == 0 subsample. And you could rerun the model for the whole sample using these reduced variables as well so everything is modeled the same way. I don't know if this approach is feasible. But if it is, that's what I would do.
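
        A rough sketch of that kind of coarsening with -recode-. The numeric codes and the six groupings below are placeholders, not the actual coding of naics_21; check the real codes with -label list- and choose groups that make substantive sense. The new variable name naics_6 is also hypothetical:

        Code:
        * Placeholder codes and groupings: replace with the real naics_21 coding
        recode naics_21 (1/4 = 1 "Primary industries") (5 = 2 "Construction") (6 7 = 3 "Manufacturing") (8/10 = 4 "Trade and transportation") (11/14 = 5 "Business services") (15/21 = 6 "Other services"), generate(naics_6)

        * Confirm that every original level landed in the intended group
        tabulate naics_21 naics_6, missing

        * Check cell sizes in the subsample before refitting
        tabulate naics_6 lfs if edu==0
        noc_40 would need the same treatment before the model is refit with the coarsened variables in both the full sample and the subsample.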
        Last edited by Clyde Schechter; 14 Apr 2022, 20:04.

        • #5
          I see what you mean about the fineness of the troublesome variables/the number of categories in them. I surmised a solution along the lines you have described in #4, using the variable "noc_10", which reduces the 40 categories of "noc_40" down to a mere 10. This seems to work for all my subsamples, and returns non-absurd values for the coefficients and standard errors of "noc_10".

          So it seems (knock on wood), I've settled my issues with these controls! Many thanks for your help, most appreciated.
