Inestimable margins

Alex McIntosh

Join Date: Mar 2022
Posts: 17

03 Apr 2022, 11:06

Hello,

I am trying to run -margins-, but the dy/dx for one level of a key independent variable (month) is coming back "not estimable", while many similar -margins- calls that use many of the same variables come back fine. When I run the code

Code:

logistic lfs sex##survmnth##loneyg if loneyg==1 & edu==2, or
margins i.survmnth#loneyg, dydx(sex)

I obtain:

Code:

Conditional marginal effects                               Number of obs = 416
Model VCE: OIM

Expression: Pr(lfs), predict()
dy/dx wrt:  2.sex

---------------------------------------------------------------------------------------------
                            |            Delta-method
                            |      dy/dx   std. err.      z    P>|z|     [95% conf. interval]
----------------------------+----------------------------------------------------------------
1.sex                       |  (base outcome)
----------------------------+----------------------------------------------------------------
2.sex                       |
            survmnth#loneyg |
Feb#Lone parents, yg child  |          .  (not estimable)
Mar#Lone parents, yg child  |  -.0047096   .0789112    -0.06   0.952    -.1593727    .1499536
Apr#Lone parents, yg child  |  -.1016548   .1129416    -0.90   0.368    -.3230164    .1197067
May#Lone parents, yg child  |   .0138889    .146612     0.09   0.925    -.2734654    .3012432
---------------------------------------------------------------------------------------------

How can I make the dy/dx for February estimable in this case?

I include some output from -dataex- also, for reference:

Code:

* Example generated by -dataex-. For more info, type help dataex
clear
input byte(sex survmnth) float(loneyg edu)
1 5 . 2
1 3 . 1
1 3 . 1
1 2 . 1
1 3 . 0
1 5 . 2
1 2 . 1
1 4 . 2
1 2 . 2
1 4 . 0
1 5 . 0
1 4 . 1
1 2 . 1
1 2 . 0
1 2 . 0
1 2 . 1
1 2 . 0
1 4 . 1
1 2 . 1
1 3 . 1
1 5 . 0
1 4 . 2
1 5 . 2
1 2 . 1
1 2 . 1
1 3 . 0
1 3 . 0
1 5 . 0
1 4 . 1
1 2 . 0
1 2 0 1
1 3 . 2
1 5 . 0
1 5 0 0
1 4 . 1
1 5 . 1
1 3 . 1
1 3 . 0
1 2 . 2
1 5 . 1
1 2 . 0
1 3 . 0
1 4 . 1
1 4 . 1
1 5 . 1
1 4 . 0
1 2 . 2
1 5 . 1
1 5 . 2
1 2 . 0
1 4 . 1
1 3 . 1
1 2 . 1
1 5 . 0
1 3 . 1
1 2 . 0
1 3 . 0
1 5 . 2
1 3 . 1
1 5 . 2
1 2 . 1
1 2 . 1
1 4 . 2
1 2 . 2
1 2 . 1
1 3 . 1
1 5 . 1
1 3 . 2
1 2 . 1
1 2 . 0
1 4 . 1
1 5 . 1
1 4 . 0
1 5 . 0
1 4 . 1
1 5 . 1
1 4 . 1
1 4 . 2
1 4 . 1
1 2 . 1
1 2 . 0
1 3 . 0
1 5 . 1
1 4 . 1
1 3 . 0
1 3 . 1
1 5 . 1
1 3 . 0
1 4 . 0
1 3 . 1
1 5 . 0
1 2 . 2
1 3 . 2
1 5 . 1
1 2 . 1
1 4 . 0
1 4 . 1
1 3 . 2
1 2 . 0
1 3 . 0
1 2 . 0
1 5 . 2
1 4 . 1
1 2 . 1
1 4 . 2
1 4 . 0
1 5 . 0
1 2 . 1
1 3 . 1
1 5 . 2
1 2 . 0
1 3 . 1
1 3 . 0
1 5 . 2
1 4 . 1
1 2 . 0
1 4 . 2
1 3 . 2
1 5 . 1
1 3 . 1
1 3 . 0
1 5 . 2
1 4 . 1
1 2 . 0
1 4 . 0
1 4 . 0
1 2 . 0
1 5 1 1
1 2 . 2
1 3 . 1
1 5 . 1
1 3 . 1
1 5 . 1
1 2 . 0
1 5 . 2
1 5 . 2
1 2 . 1
1 2 . 2
1 4 . 1
1 3 . 1
1 3 . 1
1 5 . 0
1 3 . 1
1 2 . 2
1 3 . 2
1 3 . 1
1 4 . 1
1 5 . 2
1 2 . 1
1 3 . 0
end
label values sex SEX
label def SEX 1 "Male", modify
label values survmnth survmnth
label def survmnth 2 "Feb", modify
label def survmnth 3 "Mar", modify
label def survmnth 4 "Apr", modify
label def survmnth 5 "May", modify
label values loneyg loneyg
label def loneyg 0 "Lone parents, old child", modify
label def loneyg 1 "Lone parents, yg child", modify
label values edu edu
label def edu 0 "(<)HS", modify
label def edu 1 "some uni/college deg/trades", modify
label def edu 2 "BA degree+", modify

William Lisowski

Join Date: Dec 2014

Posts: 10150
#2

03 Apr 2022, 13:14

To understand the output of your margins command it would help to see the output of your logistic command.

Although, without seeing it, I don't understand how you can include loneyg as an independent variable in the model when you restrict your estimation sample to a single value of loneyg. I expect the logistic command will show that no coefficient was estimated on loneyg, and thus in your margins command survmnth#loneyg reduces to survmnth, and February is the omitted month in the logistic model results.

Alex McIntosh

Join Date: Mar 2022
Posts: 17

03 Apr 2022, 14:51

The output of the -logistic- command

Code:

logistic lfs sex##survmnth##loneyg if loneyg==1 & edu==2

is as follows:

Code:

note: 1.sex#2.survmnth != 0 predicts success perfectly;
      1.sex#2.survmnth omitted and 12 obs not used.

note: 2.sex#5.survmnth omitted because of collinearity.
note: 1.loneyg omitted because of collinearity.
note: 2.sex#1.loneyg omitted because of collinearity.
note: 3.survmnth#1.loneyg omitted because of collinearity.
note: 4.survmnth#1.loneyg omitted because of collinearity.
note: 5.survmnth#1.loneyg omitted because of collinearity.
note: 2.sex#3.survmnth#1.loneyg omitted because of collinearity.
note: 2.sex#4.survmnth#1.loneyg omitted because of collinearity.
note: 2.sex#5.survmnth#1.loneyg omitted because of collinearity.

Logistic regression                                     Number of obs =    416
                                                        LR chi2(6)    =  20.32
                                                        Prob > chi2   = 0.0024
Log likelihood = -148.51315                             Pseudo R2     = 0.0640

----------------------------------------------------------------------------------------------------
                               lfs | Odds ratio   Std. err.      z    P>|z|     [95% conf. interval]
-----------------------------------+----------------------------------------------------------------
                               sex |
                           Female  |   1.085714   .9257697     0.10   0.923     .2041319    5.774579
                                   |
                          survmnth |
                              Mar  |   .6797516   .9578863    -0.27   0.784       .04294    10.76066
                              Apr  |   .4531677   .6452557    -0.56   0.578     .0278132    7.383572
                              May  |   .1982609   .1010105    -3.18   0.001     .0730406    .5381578
                                   |
                      sex#survmnth |
                         Male#Feb  |          1  (empty)
                       Female#Mar  |   .8634868   1.204715    -0.11   0.916     .0560636    13.29935
                       Female#Apr  |   .4259868   .5895851    -0.62   0.538      .028268    6.419447
                       Female#May  |          1  (omitted)
                                   |
                            loneyg |
           Lone parents, yg child  |          1  (omitted)
                                   |
                        sex#loneyg |
    Female#Lone parents, yg child  |          1  (omitted)
                                   |
                   survmnth#loneyg |
       Mar#Lone parents, yg child  |          1  (omitted)
       Apr#Lone parents, yg child  |          1  (omitted)
       May#Lone parents, yg child  |          1  (omitted)
                                   |
               sex#survmnth#loneyg |
  Male#Feb#Lone parents, yg child  |          1  (empty)
Female#Mar#Lone parents, yg child  |          1  (omitted)
Female#Apr#Lone parents, yg child  |          1  (omitted)
Female#May#Lone parents, yg child  |          1  (omitted)
                                   |
                             _cons |   17.65351   16.77018     3.02   0.003     2.742969    113.6164
----------------------------------------------------------------------------------------------------

It may be hard to see in the dataex above because of so few non-missing values, but loneyg does take the value of either 0/1 (parents of older children or younger children) - for 0 "older", n = 3,678 and for 1 "younger", n = 1,919.

Also, some output from -dataex- that is more suitable:

Code:

* Example generated by -dataex-. For more info, type help dataex
clear
input byte(sex survmnth) float(edu loneyg lfs)
1 2 1 1 1
1 2 0 0 1
1 3 0 1 1
1 3 2 0 1
1 3 0 1 1
1 4 2 0 0
1 2 2 1 1
1 3 0 0 0
1 5 2 0 1
1 4 0 1 0
1 3 1 0 1
1 2 1 0 1
1 2 0 0 1
1 2 0 1 0
1 4 2 0 1
1 2 1 1 1
1 4 2 0 1
1 5 0 1 1
1 4 2 0 1
1 2 0 1 1
1 3 1 1 1
1 5 2 0 1
1 2 1 0 1
1 5 1 1 0
1 2 0 0 1
1 3 1 1 0
1 4 1 1 1
1 3 2 0 1
1 3 1 0 1
1 4 2 0 1
1 3 0 0 1
1 2 1 1 1
1 4 0 1 1
1 5 1 1 1
1 3 1 1 1
1 2 0 0 1
1 5 1 0 0
1 3 1 0 0
1 3 1 1 0
1 3 0 0 1
1 5 1 0 1
1 4 1 0 1
1 3 0 0 1
1 2 1 0 1
1 4 0 1 1
1 2 1 0 1
1 5 2 0 1
1 2 1 0 1
1 3 1 1 1
1 2 1 0 1
1 3 1 1 1
1 4 0 0 1
1 5 1 1 1
1 2 1 0 1
1 2 0 0 1
1 3 0 0 1
1 3 1 1 1
1 4 2 0 1
1 5 2 0 0
1 3 1 0 1
1 5 0 0 1
1 3 0 0 1
1 4 1 0 1
1 3 2 0 1
1 5 1 0 1
1 3 1 0 1
1 4 1 0 0
1 3 0 0 1
1 2 1 1 1
1 2 0 0 1
1 4 2 0 1
1 4 1 1 1
1 2 0 0 1
1 3 1 1 1
1 5 1 1 1
1 5 1 1 0
1 4 2 0 1
1 2 1 1 1
1 4 1 1 1
1 2 1 0 1
1 2 2 0 1
1 4 1 0 1
1 2 1 1 1
1 3 1 0 1
1 2 1 0 1
1 5 0 1 1
1 4 1 1 1
1 5 1 0 1
1 5 1 0 0
1 4 0 1 1
1 4 2 0 1
1 5 0 1 0
1 4 2 0 1
1 4 1 0 1
1 2 1 0 1
1 3 1 0 1
1 2 2 0 1
1 5 0 0 0
1 4 1 0 1
1 5 1 0 1
end
label values sex SEX
label def SEX 1 "Male", modify
label values survmnth survmnth
label def survmnth 2 "Feb", modify
label def survmnth 3 "Mar", modify
label def survmnth 4 "Apr", modify
label def survmnth 5 "May", modify
label values edu edu
label def edu 0 "(<)HS", modify
label def edu 1 "some uni/college deg/trades", modify
label def edu 2 "BA degree+", modify
label values loneyg loneyg
label def loneyg 0 "Lone parents, old child", modify
label def loneyg 1 "Lone parents, yg child", modify
label values lfs lfs
label def lfs 0 "not", modify
label def lfs 1 "Employed", modify

Last edited by Alex McIntosh; 03 Apr 2022, 15:19. Reason: Edited for -dataex-

William Lisowski

Join Date: Dec 2014

Posts: 10150
#4

03 Apr 2022, 15:37

It may be hard to see in the dataex above because of so few non-missing values, but loneyg does take the value of either 0/1 (parents of older children or younger children) - for 0 "older", n = 3,678 and for 1 "younger", n = 1,919.

No, it was apparent in the dataex output. But in your logistic command and its output we see clearly

Code:

logistic lfs sex##survmnth##loneyg if loneyg==1 & edu==2
...
note: 1.loneyg omitted because of collinearity.

So among the 416 observations included in your logistic regression, all of them have loneyg==1 which means loneyg is collinear with the constant and thus loneyg is omitted from the model, and thus all the interactions within which it appears are also omitted.
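
One way to see this directly, as a minimal sketch that assumes the -logistic- command above has just been run, is to tabulate loneyg within the estimation sample:

Code:

* e(sample) marks the observations actually used by the last estimation command
tabulate loneyg if e(sample)

With the if loneyg==1 restriction in place, only a single row should appear, which is why the main effect and every interaction containing loneyg are dropped.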

Alex McIntosh

Join Date: Mar 2022
Posts: 17

03 Apr 2022, 16:24

Originally posted by William Lisowski

in your logistic command and its output we see clearly

Code:

logistic lfs sex##survmnth##loneyg if loneyg==1 & edu==2
...
note: 1.loneyg omitted because of collinearity.

So among the 416 observations included in your logistic regression, all of them have loneyg==1 which means loneyg is collinear with the constant and thus loneyg is omitted from the model, and thus all the interactions within which it appears are also omitted.

So this means I should remove if loneyg? Trying this out:

Code:

logistic lfs sex##survmnth##loneyg if edu==2

obtains:

Code:

note: 1.sex#2.survmnth != 0 predicts success perfectly;
      1.sex#2.survmnth omitted and 76 obs not used.

note: 2.sex#5.survmnth omitted because of collinearity.
note: 2.sex#5.survmnth#1.loneyg omitted because of collinearity.

Logistic regression                                     Number of obs =  1,397
                                                        LR chi2(13)   =  37.54
                                                        Prob > chi2   = 0.0003
Log likelihood = -418.21098                             Pseudo R2     = 0.0430

----------------------------------------------------------------------------------------------------
                               lfs | Odds ratio   Std. err.      z    P>|z|     [95% conf. interval]
-----------------------------------+----------------------------------------------------------------
                               sex |
                           Female  |   .2295918   .2395319    -1.41   0.158       .02971    1.774232
                                   |
                          survmnth |
                              Mar  |   .1933405   .2335623    -1.36   0.174     .0181151    2.063504
                              Apr  |   .1001228   .1164968    -1.98   0.048     .0102359    .9793544
                              May  |   .5921053   .2465027    -1.26   0.208     .2618366    1.338959
                                   |
                      sex#survmnth |
                         Male#Feb  |          1  (empty)
                       Female#Mar  |   2.650585   3.149767     0.82   0.412     .2581275    27.21756
                       Female#Apr  |   4.205364   4.794297     1.26   0.208     .4501912     39.2835
                       Female#May  |          1  (omitted)
                                   |
                            loneyg |
           Lone parents, yg child  |   .2133211   .3088487    -1.07   0.286     .0124927    3.642612
                                   |
                        sex#loneyg |
    Female#Lone parents, yg child  |   4.728889   6.371783     1.15   0.249       .33716    66.32575
                                   |
                   survmnth#loneyg |
       Mar#Lone parents, yg child  |   3.515826   6.525732     0.68   0.498     .0924933    133.6424
       Apr#Lone parents, yg child  |   4.526121   8.322717     0.82   0.412     .1231744    166.3151
       May#Lone parents, yg child  |   .3348406   .2203067    -1.66   0.096     .0922135    1.215855
                                   |
               sex#survmnth#loneyg |
 Male#Feb#Lone parents, old child  |          1  (empty)
  Male#Feb#Lone parents, yg child  |          1  (empty)
Female#Mar#Lone parents, yg child  |   .3257722   .5970293    -0.61   0.541     .0089733    11.82704
Female#Apr#Lone parents, yg child  |   .1012961    .181636    -1.28   0.202      .003015    3.403292
Female#May#Lone parents, yg child  |          1  (omitted)
                                   |
                             _cons |   82.75556   90.41691     4.04   0.000     9.722837    704.3707
----------------------------------------------------------------------------------------------------

I tried to use if loneyg==1 to observe the employment gender gap among lone parents of children <6 (loneyg=1), with university education (edu=2). For this model, I am only trying to examine the gap among those with younger children. Not sure if I should use different syntax to go about selecting this specific subgroup for examination, but that is my goal for this model. I wonder how I might reduce collinearity while still examining this subgroup?

But the same issue of an inestimable survmnth (February) persists with this line and its output, when if loneyg==1 is removed.

For whatever reason, if my dependent variable is coded slightly more exclusively (to include only those who were employed and at work during COVID), a line of code which is otherwise the same produces output where February is estimable. Hence, the following code

Code:

logistic lfs1 sex##survmnth##loneyg if loneyg==1 & edu==2

gives the result:

Code:

logistic lfs1 sex##survmnth##loneyg if loneyg==1 & edu==2
note: 1.loneyg omitted because of collinearity.
note: 2.sex#1.loneyg omitted because of collinearity.
note: 3.survmnth#1.loneyg omitted because of collinearity.
note: 4.survmnth#1.loneyg omitted because of collinearity.
note: 5.survmnth#1.loneyg omitted because of collinearity.
note: 2.sex#3.survmnth#1.loneyg omitted because of collinearity.
note: 2.sex#4.survmnth#1.loneyg omitted because of collinearity.
note: 2.sex#5.survmnth#1.loneyg omitted because of collinearity.

Logistic regression                                     Number of obs =    428
                                                        LR chi2(7)    =  16.38
                                                        Prob > chi2   = 0.0219
Log likelihood = -253.76826                             Pseudo R2     = 0.0313

----------------------------------------------------------------------------------------------------
                              lfs1 | Odds ratio   Std. err.      z    P>|z|     [95% conf. interval]
-----------------------------------+----------------------------------------------------------------
                               sex |
                           Female  |   .3490911   .3729439    -0.99   0.325     .0430107    2.833358
                                   |
                          survmnth |
                              Mar  |   .2045456    .246477    -1.32   0.188     .0192794    2.170132
                              Apr  |   .7272732   1.082614    -0.21   0.831     .0393192     13.4521
                              May  |   .1818183   .2293296    -1.35   0.177     .0153464    2.154113
                                   |
                      sex#survmnth |
                       Female#Mar  |   2.507714   3.120189     0.74   0.460      .218868    28.73253
                       Female#Apr  |   .6937658   1.055278    -0.24   0.810     .0351934    13.67616
                       Female#May  |   2.005207   2.613292     0.53   0.593     .1558936     25.7923
                                   |
                            loneyg |
           Lone parents, yg child  |          1  (omitted)
                                   |
                        sex#loneyg |
    Female#Lone parents, yg child  |          1  (omitted)
                                   |
                   survmnth#loneyg |
       Mar#Lone parents, yg child  |          1  (omitted)
       Apr#Lone parents, yg child  |          1  (omitted)
       May#Lone parents, yg child  |          1  (omitted)
                                   |
               sex#survmnth#loneyg |
Female#Mar#Lone parents, yg child  |          1  (omitted)
Female#Apr#Lone parents, yg child  |          1  (omitted)
Female#May#Lone parents, yg child  |          1  (omitted)
                                   |
                             _cons |   10.99999   11.48912     2.30   0.022     1.420174    85.20071
----------------------------------------------------------------------------------------------------

So I guess at this point I have a twofold issue of 1) the original question: how to make February estimable?; and 2) how to deal with the collinearity while selecting a certain subgroup for comparison?

William Lisowski

Join Date: Dec 2014

Posts: 10150
#6

03 Apr 2022, 16:58

So this means I should remove if loneyg?
...
I tried to use if loneyg==1 to observe the employment gender gap among lone parents of children <6 (loneyg=1), with university education (edu=2). For this model, I am only trying to examine the gap among those with younger children. Not sure if I should use different syntax to go about selecting this specific subgroup for examination, but that is my goal for this model.

To me that suggests your model should perhaps be

Code:

logistic lfs sex##survmnth if loneyg==1 & edu==2, or
margins i.survmnth, dydx(sex)

For whatever reason, if my dependent variable is coded slightly more exclusively (to include only those who were employed and at work during COVID), a line of code which is otherwise the same produces output where February is estimable.

With the original dependent variable you see

Code:

note: 1.sex#2.survmnth != 0 predicts success perfectly;
      1.sex#2.survmnth omitted and 12 obs not used.
...
Logistic regression                                     Number of obs =    416

while with the revised dependent variable you see

Code:

Logistic regression Number of obs = 428

because it is no longer the case that all 12 observations with sex==1 and survmnth==2 have lfs==1.

When I read "for whatever reason" in your explanation, it suggests to me that you are grasping at straws, trying whatever you think of to get some results, regardless of the interpretation of what you are trying.

Your understanding of logistic regression and the interpretation of its output and the margins that result would benefit from the time spent reviewing the first three lectures in the Categorical Data Analysis course notes prepared by Richard Williams, a frequent contributor here, at https://www3.nd.edu/~rwilliam/xsoc73994/index.html.
2 likes

Alex McIntosh

Join Date: Mar 2022
Posts: 17

03 Apr 2022, 18:02

When I read "for whatever reason" in your explanation, it suggests to me that you are grasping at straws, trying whatever you think of to get some results, regardless of the interpretation of what you are trying.

Apologies, I must admit, I am not proficient at Stata. Still, I can't imagine why I would be on statalist.org if I were paying no regard to the interpretation of my results. In any case, I appreciate the recommendation of the resources from the Categorical Data Analysis course, as I am always glad to learn more.

That said, when I make use of the suggested code

Code:

logistic lfs sex##survmnth if loneyg==1 & edu==2, or

the output is

Code:

note: 1.sex#2.survmnth != 0 predicts success perfectly;
      1.sex#2.survmnth omitted and 12 obs not used.

note: 2.sex#5.survmnth omitted because of collinearity.

Logistic regression                                     Number of obs =    416
                                                        LR chi2(6)    =  20.32
                                                        Prob > chi2   = 0.0024
Log likelihood = -148.51315                             Pseudo R2     = 0.0640

------------------------------------------------------------------------------
         lfs | Odds ratio   Std. err.      z    P>|z|     [95% conf. interval]
-------------+----------------------------------------------------------------
         sex |
     Female  |   1.085714   .9257697     0.10   0.923     .2041319    5.774579
             |
    survmnth |
        Mar  |   .6797516   .9578863    -0.27   0.784       .04294    10.76066
        Apr  |   .4531677   .6452557    -0.56   0.578     .0278132    7.383572
        May  |   .1982609   .1010105    -3.18   0.001     .0730406    .5381578
             |
sex#survmnth |
   Male#Feb  |          1  (empty)
 Female#Mar  |   .8634868   1.204715    -0.11   0.916     .0560636    13.29935
 Female#Apr  |   .4259868   .5895851    -0.62   0.538      .028268    6.419447
 Female#May  |          1  (omitted)
             |
       _cons |   17.65351   16.77018     3.02   0.003     2.742969    113.6164
------------------------------------------------------------------------------

So the original issue persists, and February remains inestimable. According to

Code:

1.sex#2.survmnth omitted and 12 obs not used.

it seems those 12 observations with lfs==0 are not used here (or at least those observations where sex==1 and survmnth==2). I do not know why using the revised dependent variable would include these 12, but using the original DV excludes them (also unsure what is causing the collinearity in sex==2 & survmnth==5).

Clyde Schechter

Join Date: Apr 2014

Posts: 29809
#8

03 Apr 2022, 19:48

In the logistic regression output in #7, all of the observations for male sex and February, (1.sex#2.survmnth) have been omitted due to perfect prediction: that is, because lfs is always 1 for that combination of month and sex (at least in the subpopulation you are trying to estimate in, namely with edu == 2 and loneyg == 1). So you have no such observations in the estimation sample. Because of that, effects involving February become inestimable due to lack of information about February with respect to males. That this doesn't happen with the revised variable tells me that with the revised version of lfs there are at least some observations with male sex in February for which lfs = 0.

The fact that there are no observations with male sex in February in the estimation sample is also the reason why you cannot get a marginal effect estimate for February: the marginal effect of sex in any month is the difference between the predicted probabilities for females and males in that month, but you have no males available to estimate the male side of that contrast, so Stata honestly confesses that what you have asked it to do is not possible.
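
A quick cross-tabulation can confirm whether lfs is always 1 for males in February within this subpopulation. The following is only a sketch using the variable names that appear in this thread:

Code:

* outcome by month among males (sex==1) in the subpopulation of interest
tabulate survmnth lfs if sex==1 & loneyg==1 & edu==2

* same table for the revised outcome, for comparison
tabulate survmnth lfs1 if sex==1 & loneyg==1 & edu==2

If the Feb row has no observations with lfs==0 but does have some with lfs1==0, that accounts for the difference between the two models.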

One option is to simply forgo trying to get the February estimate. Is it important? If you really need it, then you have to abandon using logistic regression. Two alternatives that might work for you are to use a linear probability model, or to use -firthlogit- (by Joseph Coveney, available from SSC). A linear probability model might be dicey for this data: if a large subset of the data offer near-perfect prediction, then you may have a substantial part of the data set where predicted probabilities are very close to 1. Linear probability models don't work that well near 1 or 0. So -firthlogit- might be a better bet. It fits a logistic regression model, but instead of estimating by maximum likelihood, it estimates with penalized maximum likelihood. And that enables it to tolerate perfect prediction without having to expel any observations from the estimation sample.
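
As a rough sketch of the two alternatives (not tested on these data): the linear probability model below uses only official commands. The -firthlogit- lines rest on two assumptions, namely that the installed version accepts factor-variable notation and that -margins- has to be asked for probabilities through the linear prediction, so check help firthlogit before relying on them.

Code:

* linear probability model with robust standard errors
regress lfs i.sex##i.survmnth if loneyg==1 & edu==2, vce(robust)
margins survmnth, dydx(sex)

* penalized-likelihood logistic regression (community-contributed, from SSC)
ssc install firthlogit
firthlogit lfs i.sex##i.survmnth if loneyg==1 & edu==2
* assumption: convert the linear prediction to a probability by hand
margins survmnth, dydx(sex) expression(invlogit(predict(xb)))

In both cases the 12 male February observations should remain in the estimation sample, which is what makes a February contrast possible at all.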

Added: I think the reason 2.sex#5.survmnth becomes collinear is that, but for its elimination due to perfect prediction, 1.sex#2.survmnth would ordinarily be the reference category for the sex#survmnth interaction terms. Since there are no such observations retained in the estimation sample, you are left with a complete set of indicators for all the interaction combinations, with no omitted reference category--so they are all collinear with the constant term. So one of them has to be eliminated to break that collinearity; and 2.sex#5.survmnth being the "last" one gets picked for that distinction. You will see this whenever the anticipated reference category for some group of indicators gets omitted from the estimation sample--some other category must also be omitted to break the collinearity with the constant term that results.

Last edited by Clyde Schechter; 03 Apr 2022, 19:53.
3 likes