  • Margins become inestimable

    Dear Statalist users,

    I have been working with a logit model for some weeks now. For some unclear reason it has stopped working, while the same model with a different recode of the dependent variable works fine. (Both are recodes of the same labour force status variable, "lfsstat": for the troublesome one, "lfs", 1 = 1 | 2 and 0 = 3 | 4; for the functioning one, "lfs1", 1 = 1 and 0 = 2 | 3 | 4.) The troublesome logit and margins commands, with their outputs, are immediately below.

    Code:
    logit lfs sex##survmnth##loneyg if edu==2 [pweight=finalwt], or
    margins loneyg [pweight=finalwt], atmeans
    margins loneyg [pweight=finalwt], at (survmnth=(2 3 4 5)) atmeans dydx(sex)
    
    
    . logit lfs sex##survmnth##loneyg if edu==2 [pweight=finalwt], or
    
    note: 1.sex#2.survmnth != 0 predicts success perfectly;
          1.sex#2.survmnth omitted and 69 obs not used.
    
    note: 2.sex#5.survmnth omitted because of collinearity.
    note: 2.sex#5.survmnth#1.loneyg omitted because of collinearity.
    Iteration 0:   log pseudolikelihood = -157751.54  
    Iteration 1:   log pseudolikelihood = -151719.07  
    Iteration 2:   log pseudolikelihood =  -150521.5  
    Iteration 3:   log pseudolikelihood = -150491.63  
    Iteration 4:   log pseudolikelihood = -150491.33  
    Iteration 5:   log pseudolikelihood = -150491.33  
    
    Logistic regression                                     Number of obs =  1,275
                                                            Wald chi2(13) =  24.96
                                                            Prob > chi2   = 0.0234
    Log pseudolikelihood = -150491.33                       Pseudo R2     = 0.0460
    
    ----------------------------------------------------------------------------------------------------
                                       |               Robust
                                   lfs | Odds ratio   std. err.      z    P>|z|     [95% conf. interval]
    -----------------------------------+----------------------------------------------------------------
                                   sex |
                               Female  |   .1029113   .1109299    -2.11   0.035     .0124434    .8511155
                                       |
                              survmnth |
                                  Mar  |   .0498319    .066795    -2.24   0.025     .0036021    .6893888
                                  Apr  |    .025142   .0316874    -2.92   0.003     .0021262    .2973024
                                  May  |   .5296605    .281006    -1.20   0.231     .1872411    1.498283
                                       |
                          sex#survmnth |
                             Male#Feb  |          1  (empty)
                           Female#Mar  |   9.421238   12.45168     1.70   0.090     .7064943     125.634
                           Female#Apr  |   16.90247    20.8673     2.29   0.022     1.503425    190.0285
                           Female#May  |          1  (omitted)
                                       |
                                loneyg |
               Lone parents, yg child  |   .2630576   .4563918    -0.77   0.441     .0087752    7.885751
                                       |
                            sex#loneyg |
        Female#Lone parents, yg child  |   2.712564   4.343995     0.62   0.533     .1175536    62.59275
                                       |
                       survmnth#loneyg |
           Mar#Lone parents, yg child  |    1.88169   4.047537     0.29   0.769     .0277718    127.4947
           Apr#Lone parents, yg child  |   11.26953   23.96171     1.14   0.255     .1746014    727.3838
           May#Lone parents, yg child  |   .4649369   .3974549    -0.90   0.370     .0870438     2.48342
                                       |
                   sex#survmnth#loneyg |
     Male#Feb#Lone parents, old child  |          1  (empty)
      Male#Feb#Lone parents, yg child  |          1  (empty)
    Female#Mar#Lone parents, yg child  |   1.022111   2.187296     0.01   0.992     .0154151    67.77209
    Female#Apr#Lone parents, yg child  |   .0507766     .10542    -1.44   0.151     .0008678      2.9709
    Female#May#Lone parents, yg child  |          1  (omitted)
                                       |
                                 _cons |   179.3486   206.0314     4.52   0.000     18.87375    1704.268
    ----------------------------------------------------------------------------------------------------
    Note: _cons estimates baseline odds.
    
    . margins loneyg [pweight=finalwt], atmeans
    
    Adjusted predictions                                     Number of obs = 1,275
    Model VCE: Robust
    
    Expression: Pr(lfs), predict()
    At: 1.sex      = .1505247 (mean)
        2.sex      = .8494753 (mean)
        2.survmnth =  .209814 (mean)
        3.survmnth = .2792208 (mean)
        4.survmnth = .2633722 (mean)
        5.survmnth = .2475929 (mean)
        0.loneyg   = .7147883 (mean)
        1.loneyg   = .2852117 (mean)
    
    ------------------------------------------------------------------------------------------
                             |            Delta-method
                             |     Margin   std. err.      z    P>|z|     [95% conf. interval]
    -------------------------+----------------------------------------------------------------
                      loneyg |
    Lone parents, old child  |          .  (not estimable)
     Lone parents, yg child  |          .  (not estimable)
    ------------------------------------------------------------------------------------------
    
    . margins loneyg [pweight=finalwt], at (survmnth=(2 3 4 5)) atmeans dydx(sex)
    
    Conditional marginal effects                             Number of obs = 1,275
    Model VCE: Robust
    
    Expression: Pr(lfs), predict()
    dy/dx wrt:  2.sex
    1._at: 1.sex    = .1505247 (mean)
           2.sex    = .8494753 (mean)
           survmnth =        2
           0.loneyg = .7147883 (mean)
           1.loneyg = .2852117 (mean)
    2._at: 1.sex    = .1505247 (mean)
           2.sex    = .8494753 (mean)
           survmnth =        3
           0.loneyg = .7147883 (mean)
           1.loneyg = .2852117 (mean)
    3._at: 1.sex    = .1505247 (mean)
           2.sex    = .8494753 (mean)
           survmnth =        4
           0.loneyg = .7147883 (mean)
           1.loneyg = .2852117 (mean)
    4._at: 1.sex    = .1505247 (mean)
           2.sex    = .8494753 (mean)
           survmnth =        5
           0.loneyg = .7147883 (mean)
           1.loneyg = .2852117 (mean)
    
    --------------------------------------------------------------------------------------------
                               |            Delta-method
                               |      dy/dx   std. err.      z    P>|z|     [95% conf. interval]
    ---------------------------+----------------------------------------------------------------
    1.sex                      |  (base outcome)
    ---------------------------+----------------------------------------------------------------
    2.sex                      |
                    _at#loneyg |
    1#Lone parents, old child  |          .  (not estimable)
     1#Lone parents, yg child  |          .  (not estimable)
    2#Lone parents, old child  |          .  (not estimable)
     2#Lone parents, yg child  |          .  (not estimable)
    3#Lone parents, old child  |          .  (not estimable)
     3#Lone parents, yg child  |          .  (not estimable)
    4#Lone parents, old child  |          .  (not estimable)
     4#Lone parents, yg child  |          .  (not estimable)
    --------------------------------------------------------------------------------------------
    Note: dy/dx for factor levels is the discrete change from the base level.
    
    .
    The same code, but with the more restricted dependent variable (lfs1), is immediately below.

    Code:
    logit lfs1 sex##survmnth##loneyg if edu==2 [pweight=finalwt], or
    margins loneyg [pweight=finalwt], atmeans
    margins loneyg [pweight=finalwt], at (survmnth=(2 3 4 5)) atmeans dydx(sex)
    
    Iteration 0:   log pseudolikelihood = -267248.24  
    Iteration 1:   log pseudolikelihood =  -252916.2  
    Iteration 2:   log pseudolikelihood =  -252160.9  
    Iteration 3:   log pseudolikelihood = -252137.97  
    Iteration 4:   log pseudolikelihood = -252137.76  
    Iteration 5:   log pseudolikelihood = -252137.76  
    
    Logistic regression                                     Number of obs =  1,344
                                                            Wald chi2(15) =  46.32
                                                            Prob > chi2   = 0.0000
    Log pseudolikelihood = -252137.76                       Pseudo R2     = 0.0565
    
    ----------------------------------------------------------------------------------------------------
                                       |               Robust
                                  lfs1 | Odds ratio   std. err.      z    P>|z|     [95% conf. interval]
    -----------------------------------+----------------------------------------------------------------
                                   sex |
                               Female  |   .4293814   .3171548    -1.14   0.252     .1009529    1.826282
                                       |
                              survmnth |
                                  Mar  |   .1535048     .12249    -2.35   0.019     .0321293    .7334032
                                  Apr  |   .1617239   .1287386    -2.29   0.022     .0339769     .769776
                                  May  |   .3252317   .3111625    -1.17   0.240      .049867    2.121154
                                       |
                          sex#survmnth |
                           Female#Mar  |   1.745711   1.520462     0.64   0.522     .3166658    9.623731
                           Female#Apr  |   3.286283   2.883827     1.36   0.175     .5884994    18.35117
                           Female#May  |   2.421548   2.502893     0.86   0.392     .3193728    18.36066
                                       |
                                loneyg |
               Lone parents, yg child  |   2.579623   3.264031     0.75   0.454     .2160328    30.80299
                                       |
                            sex#loneyg |
        Female#Lone parents, yg child  |    .197058   .2630922    -1.22   0.224     .0143931    2.697949
                                       |
                       survmnth#loneyg |
           Mar#Lone parents, yg child  |   .3008346   .4618945    -0.78   0.434     .0148393    6.098779
           Apr#Lone parents, yg child  |   1.969173   3.417777     0.39   0.696        .0656    59.11043
           May#Lone parents, yg child  |   .1082959   .1880019    -1.28   0.200     .0036052    3.253111
                                       |
                   sex#survmnth#loneyg |
    Female#Mar#Lone parents, yg child  |   6.021667   9.863883     1.10   0.273     .2428807    149.2933
    Female#Apr#Lone parents, yg child  |   .4975627   .9112254    -0.38   0.703     .0137397    18.01854
    Female#May#Lone parents, yg child  |   4.552324   8.388058     0.82   0.411     .1229757    168.5183
                                       |
                                 _cons |   16.27199   11.08744     4.09   0.000      4.28003    61.86347
    ----------------------------------------------------------------------------------------------------
    
    Adjusted predictions                                     Number of obs = 1,344
    Model VCE: Robust
    
    Expression: Pr(lfs1), predict()
    At: 1.sex      =  .193045 (mean)
        2.sex      =  .806955 (mean)
        2.survmnth = .2493665 (mean)
        3.survmnth = .2652445 (mean)
        4.survmnth = .2501892 (mean)
        5.survmnth = .2351997 (mean)
        0.loneyg   = .7181926 (mean)
        1.loneyg   = .2818074 (mean)
    
    ------------------------------------------------------------------------------------------
                             |            Delta-method
                             |     Margin   std. err.      z    P>|z|     [95% conf. interval]
    -------------------------+----------------------------------------------------------------
                      loneyg |
    Lone parents, old child  |   .8049439   .0177846    45.26   0.000     .7700867     .839801
     Lone parents, yg child  |   .7137239    .033422    21.35   0.000      .648218    .7792297
    ------------------------------------------------------------------------------------------
    
    . margins loneyg [pweight=finalwt], at (survmnth=(2 3 4 5)) atmeans dydx(sex)
    
    Conditional marginal effects                             Number of obs = 1,344
    Model VCE: Robust
    
    Expression: Pr(lfs1), predict()
    dy/dx wrt:  2.sex
    1._at: 1.sex    =  .193045 (mean)
           2.sex    =  .806955 (mean)
           survmnth =        2
           0.loneyg = .7181926 (mean)
           1.loneyg = .2818074 (mean)
    2._at: 1.sex    =  .193045 (mean)
           2.sex    =  .806955 (mean)
           survmnth =        3
           0.loneyg = .7181926 (mean)
           1.loneyg = .2818074 (mean)
    3._at: 1.sex    =  .193045 (mean)
           2.sex    =  .806955 (mean)
           survmnth =        4
           0.loneyg = .7181926 (mean)
           1.loneyg = .2818074 (mean)
    4._at: 1.sex    =  .193045 (mean)
           2.sex    =  .806955 (mean)
           survmnth =        5
           0.loneyg = .7181926 (mean)
           1.loneyg = .2818074 (mean)
    
    --------------------------------------------------------------------------------------------
                               |            Delta-method
                               |      dy/dx   std. err.      z    P>|z|     [95% conf. interval]
    ---------------------------+----------------------------------------------------------------
    1.sex                      |  (base outcome)
    ---------------------------+----------------------------------------------------------------
    2.sex                      |
                    _at#loneyg |
    1#Lone parents, old child  |   -.067308   .0485446    -1.39   0.166    -.1624537    .0278377
     1#Lone parents, yg child  |    -.19643   .0594253    -3.31   0.001    -.3129014   -.0799585
    2#Lone parents, old child  |  -.0622599   .0963162    -0.65   0.518    -.2510363    .1265164
     2#Lone parents, yg child  |  -.0267685   .1871766    -0.14   0.886    -.3936279    .3400909
    3#Lone parents, old child  |   .0631986   .0910159     0.69   0.487    -.1151893    .2415865
     3#Lone parents, yg child  |  -.2813428   .1032107    -2.73   0.006    -.4836321   -.0790536
    4#Lone parents, old child  |   .0051437   .0962974     0.05   0.957    -.1835958    .1938832
     4#Lone parents, yg child  |  -.0168634   .2516372    -0.07   0.947    -.5100632    .4763364
    --------------------------------------------------------------------------------------------
    Why do the margins work for "lfs1", but not for "lfs"? This especially has me stumped as both have worked for weeks as expected, but these commands with "lfs" are now giving useless results.

    Finally, -dataex- below (only where edu==2), for whomever it might be helpful in solving this.

    Code:
    * Example generated by -dataex-. For more info, type help dataex
    clear
    input float(lfs lfs1) byte(sex survmnth) float loneyg
    1 1 1 2 0
    1 1 1 2 1
    1 1 2 2 0
    1 0 1 2 1
    1 1 2 2 0
    1 0 2 2 1
    1 1 1 2 0
    1 1 2 2 0
    1 1 2 2 0
    1 1 1 2 0
    1 1 2 2 0
    1 1 2 2 1
    1 1 2 2 0
    1 1 2 2 1
    1 0 2 2 1
    1 1 2 2 0
    1 0 2 2 1
    1 1 1 2 0
    1 1 2 2 1
    1 1 2 2 0
    1 1 2 2 1
    1 1 2 2 0
    1 0 2 2 1
    1 1 1 2 1
    1 0 1 2 0
    1 0 2 2 1
    1 1 1 2 0
    1 1 2 2 1
    1 1 2 2 0
    1 1 1 2 0
    1 1 2 2 1
    1 1 2 2 0
    1 1 1 2 0
    1 1 2 2 0
    1 0 2 2 1
    1 1 1 2 0
    1 1 2 2 1
    1 1 2 2 0
    1 1 2 2 0
    1 1 2 2 1
    1 1 2 2 0
    1 1 2 2 0
    1 0 2 2 1
    0 0 2 2 0
    1 1 2 2 1
    1 1 2 2 1
    1 1 2 2 0
    1 1 2 2 1
    1 1 2 2 1
    1 1 2 2 0
    1 1 2 2 1
    1 0 2 2 1
    1 1 2 2 1
    1 1 2 2 1
    1 1 2 2 0
    1 1 2 2 0
    1 0 2 2 0
    1 0 2 2 0
    1 1 1 2 0
    1 1 2 2 0
    1 0 2 2 1
    1 1 2 2 0
    1 1 2 2 1
    1 1 2 2 1
    1 1 2 2 0
    1 1 2 2 1
    1 1 2 2 1
    1 1 2 2 0
    1 1 2 2 0
    1 1 1 2 0
    0 0 2 2 0
    1 1 2 2 1
    1 1 2 2 0
    1 1 2 2 0
    1 1 2 2 0
    1 1 2 2 1
    1 1 2 2 0
    1 1 2 2 0
    0 0 2 2 1
    1 1 2 2 1
    1 1 2 2 0
    1 1 2 2 0
    1 1 1 2 1
    1 1 2 2 1
    1 1 2 2 0
    1 1 2 2 0
    1 1 2 2 0
    1 1 2 2 0
    1 1 2 2 0
    1 1 2 2 1
    1 0 2 2 0
    1 1 2 2 1
    1 1 1 2 0
    1 1 2 2 0
    1 1 2 2 0
    1 1 2 2 0
    0 0 2 2 1
    1 1 2 2 0
    1 1 1 2 0
    1 1 2 2 0
    end
    label values lfs lfs
    label def lfs 0 "not", modify
    label def lfs 1 "Employed", modify
    label values lfs1 lfs1
    label def lfs1 0 "not", modify
    label def lfs1 1 "Employed", modify
    label values sex SEX
    label def SEX 1 "Male", modify
    label def SEX 2 "Female", modify
    label values survmnth survmnth
    label def survmnth 2 "Feb", modify
    label values loneyg loneyg
    label def loneyg 0 "Lone parents, old child", modify
    label def loneyg 1 "Lone parents, yg child", modify

  • #2
    The answer can be found in the warnings that Stata is giving you in the -logit- output:
    Code:
    note: 1.sex#2.survmnth != 0 predicts success perfectly;
    1.sex#2.survmnth omitted and 69 obs not used.
    This says that because the outcome lfs = 1 whenever sex == 1 & survmnth == 2, all observations with those values of sex and survmnth had to be removed from the estimation sample. Because loneyg appears in the model interacted with sex and survmnth, -margins- needs to look at all possible combinations of sex and survmnth to calculate the margins and marginal effects you are asking for. But because some of those combinations are not available in the estimation sample, the calculations cannot be done. Hence "not estimable." A somewhat longer way to say that is that the margins you are asking Stata to calculate actually do not exist under this model.
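
    One way to see exactly which cells are affected is to tabulate the outcome within each sex-by-month cell, restricted to the estimation sample. This is only a sketch, meant to be run right after the -logit- command and using the variable names from #1. The second command assumes that every observation with edu == 2 that is outside e(sample) was dropped either for perfect prediction or for a missing value.

    Code:
    * outcome by loneyg within each sex-by-month cell, estimation sample only
    bysort sex survmnth: tabulate lfs loneyg if e(sample)
    * observations with edu == 2 that -logit- did not use
    tabulate sex survmnth if edu==2 & !e(sample)

    If the note in the -logit- output is the whole story, the second table should contain only the 69 observations with sex == 1 & survmnth == 2.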

    You can work around this problem by adding the -emptycells(reweight)- option to your -margins- command. That will enable -margins- to make an alternative calculation of the margins you are asking for. But the results have to be used with caution because they are not a true -margins- calculation, since they exclude a subset of the data.
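
    For what it is worth, my recollection of the -margins- documentation is that -emptycells()- only affects factors that are treated as balanced, so it would have to be combined with -asbalanced- rather than -atmeans-. A sketch under that assumption, with no guarantee that every margin becomes estimable in this model:

    Code:
    * treat the factor variables as balanced and reweight over the observed cells
    margins loneyg [pweight=finalwt], asbalanced emptycells(reweight)
    * effect of sex by month, leaving out February (survmnth == 2),
    * where the sex == 1 cell was dropped from the estimation sample
    margins loneyg [pweight=finalwt], at(survmnth=(3 4 5)) asbalanced emptycells(reweight) dydx(sex)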



    • #3
      Hi Clyde Schechter, thanks for your insight.

      The weird thing about the warnings is that they pertain to something I thought I had resolved, namely, that the February margins are inestimable because there is no case for February where lfs = 0 among those with sex = 1 and edu = 2 (see: https://www.statalist.org/forums/for...imable-margins, where I was troubleshooting a preliminary version of this logit). But I still managed before to get outputs for the rest of the months (see the attached image).

      To my knowledge, I am using the same code as when I generated this graph in Stata, on 16 April. I'm glad I still have the graph, but of course, I can't report on it without the values of the margins outputs.

      In any case, I tried the option -emptycells(reweight)- and I show outputs below. Unfortunately no success by this route.

      Code:
      logit lfs sex##survmnth##loneyg if edu==2 [pweight=finalwt], or
      margins loneyg [pweight=finalwt], atmeans emptycells(reweight)
      margins loneyg [pweight=finalwt], at (survmnth=(2 3 4 5)) atmeans dydx(sex) emptycells(reweight)
      
      . margins loneyg [pweight=finalwt], atmeans emptycells(reweight)
      
      Adjusted predictions                                     Number of obs = 1,275
      Model VCE: Robust
      
      Expression:  Pr(lfs), predict()
      Empty cells: reweight
      At: 1.sex      = .1505247 (mean)
          2.sex      = .8494753 (mean)
          2.survmnth =  .209814 (mean)
          3.survmnth = .2792208 (mean)
          4.survmnth = .2633722 (mean)
          5.survmnth = .2475929 (mean)
          0.loneyg   = .7147883 (mean)
          1.loneyg   = .2852117 (mean)
      
      ------------------------------------------------------------------------------------------
                               |            Delta-method
                               |     Margin   std. err.      z    P>|z|     [95% conf. interval]
      -------------------------+----------------------------------------------------------------
                        loneyg |
      Lone parents, old child  |          .  (not estimable)
       Lone parents, yg child  |          .  (not estimable)
      ------------------------------------------------------------------------------------------
      
      . margins loneyg [pweight=finalwt], at (survmnth=(2 3 4 5)) atmeans dydx(sex) emptycells(reweight)
      
      Conditional marginal effects                             Number of obs = 1,275
      Model VCE: Robust
      
      Expression:  Pr(lfs), predict()
      dy/dx wrt:   2.sex
      Empty cells: reweight
      1._at: 1.sex    = .1505247 (mean)
             2.sex    = .8494753 (mean)
             survmnth =        2
             0.loneyg = .7147883 (mean)
             1.loneyg = .2852117 (mean)
      2._at: 1.sex    = .1505247 (mean)
             2.sex    = .8494753 (mean)
             survmnth =        3
             0.loneyg = .7147883 (mean)
             1.loneyg = .2852117 (mean)
      3._at: 1.sex    = .1505247 (mean)
             2.sex    = .8494753 (mean)
             survmnth =        4
             0.loneyg = .7147883 (mean)
             1.loneyg = .2852117 (mean)
      4._at: 1.sex    = .1505247 (mean)
             2.sex    = .8494753 (mean)
             survmnth =        5
             0.loneyg = .7147883 (mean)
             1.loneyg = .2852117 (mean)
      
      --------------------------------------------------------------------------------------------
                                 |            Delta-method
                                 |      dy/dx   std. err.      z    P>|z|     [95% conf. interval]
      ---------------------------+----------------------------------------------------------------
      1.sex                      |  (base outcome)
      ---------------------------+----------------------------------------------------------------
      2.sex                      |
                      _at#loneyg |
      1#Lone parents, old child  |          .  (not estimable)
       1#Lone parents, yg child  |          .  (not estimable)
      2#Lone parents, old child  |          .  (not estimable)
       2#Lone parents, yg child  |          .  (not estimable)
      3#Lone parents, old child  |          .  (not estimable)
       3#Lone parents, yg child  |          .  (not estimable)
      4#Lone parents, old child  |          .  (not estimable)
       4#Lone parents, yg child  |          .  (not estimable)
      --------------------------------------------------------------------------------------------


      • #4
        The weird thing about the warnings is that they pertain to something I thought I had resolved, namely, that the February margins are inestimable because there is no case for February where lfs = 0 among those with sex = 1 and edu = 2 (see: https://www.statalist.org/forums/for...imable-margins, where I was troubleshooting a preliminary version of this logit).
        Well, those -logit- messages are telling you that you didn't actually resolve that problem. It may be that you did something that appeared to resolve the problem in your entire data set, but that doesn't do so in the estimation sample. Remember that any observation that has a missing value for any variable mentioned in the regression command is omitted from the estimation sample. Is it possible that you managed to get some observations with lfs = 0, sex = 1, survey month = 2 and edu = 2, but they don't have any values for the variable finalwt?
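
        A quick check along these lines, sketched with the variable names from #1, would be:

        Code:
        * observations in the cell that would break the perfect prediction
        count if lfs==0 & sex==1 & survmnth==2 & edu==2
        * of those, the ones with no weight, which -logit- would drop
        count if lfs==0 & sex==1 & survmnth==2 & edu==2 & missing(finalwt)
        * missing values in any variable named in the -logit- command
        misstable summarize lfs sex survmnth loneyg finalwt if edu==2

        If the first count is zero, the cell is empty in the data as a whole. If it is positive and the second count equals it, the missing weights are what remove those observations from the estimation sample.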



        • #5
          In fact, by "resolved", I meant I had taken your advice in #8 of the prior thread (https://www.statalist.org/forums/for...imable-margins), to simply forgo the February estimations (reflected in the attached picture in #3 of this thread). I am glad to do that, except that now the margins outputs for all months are gone.

          I think the issue for February is that there are no respondents of the kind you have described (lfs = 0, sex = 1, survey month = 2, edu = 2). The tabulation below shows the February sample with edu = 2, and since that cell is empty, I don't think finalwt is causing the issue. I think that because there are such respondents in later survey months, the code I used before was able to generate the margins and marginsplot for the image in #3. Apologies if I am misunderstanding what you've asked.

          Code:
          . tab lfs sex if edu==2 & survmnth==2
          
                     |   Sex of respondent
                 lfs |      Male     Female |     Total
          -----------+----------------------+----------
                 not |         0         13 |        13 
            Employed |        69        277 |       346 
          -----------+----------------------+----------
               Total |        69        290 |       359 
          .
