  • Constraining a parameter estimate doesn't solve the identification issue

    Dear Statalisters,

    I'm using Stata 17.0.

    I have data clustered into 7 centers, and I am testing whether any center-level information remains after controlling for the (multiple) individual-level variables and the (single) center-level variable. One test I considered is a fixed-effects model that includes the center-level variable with its coefficient fixed at a predetermined value (through the "constraint" option), so that multicollinearity does not force Stata to exclude a parameter. This does not work: Stata excludes one parameter anyway. I have prepared the simplest example that shows the problem. This is the Stata output when I include only fixed effects for center:

    Code:
    mi estimate, saving(level_only, replace) eform: logit MINI_ever_6m_extended  i.center_num
    
    Multiple-imputation estimates                   Imputations       =         70
    Logistic regression                             Number of obs     =      1,101
                                                    Average RVI       =     0.3043
                                                    Largest FMI       =     0.3202
    DF adjustment:   Large sample                   DF:     min       =     681.55
                                                            avg       =   1,879.02
                                                            max       =   2,911.00
    Model F test:       Equal FMI                   F(   6, 7047.0)   =       5.69
    Within VCE type:          OIM                   Prob > F          =     0.0000
    
    ---------------------------------------------------------------------------------------
    MINI_ever_6m_extended |     exp(b)   Std. err.      t    P>|t|     [95% conf. interval]
    ----------------------+----------------------------------------------------------------
               center_num |
                       2  |   2.604526   .8952626     2.78   0.005     1.327127    5.111461
                       3  |   .3797197   .2548682    -1.44   0.150     .1016528    1.418426
                       4  |   3.228002   1.432298     2.64   0.008     1.351679    7.708928
                       5  |     1.3551    .599254     0.69   0.492     .5693079    3.225489
                       6  |   2.784751   1.248195     2.28   0.022     1.156235    6.706977
                       7  |   3.238115   .8276384     4.60   0.000     1.961645    5.345201
                          |
                    _cons |   .1869149   .0443115    -7.07   0.000     .1174269    .2975229
    ---------------------------------------------------------------------------------------
    Here Stata behaves as expected: the first level of a categorical variable with k levels is used as the reference, so there are k-1 parameters.
    Then I add one center-level variable, with its coefficient constrained to equal 0, so that it is included in the model in name only.

    Code:
    . constraint 1 GDP_K=0
    
    . mi estimate, saving(level_only, replace) eform: logit MINI_ever_6m_extended  GDP_K i.center_num, constraint(1)
    
    Multiple-imputation estimates                   Imputations       =         70
    Logistic regression                             Number of obs     =      1,101
                                                    Average RVI       =     0.3195
                                                    Largest FMI       =     0.3405
    DF adjustment:   Large sample                   DF:     min       =     602.76
                                                            avg       =   1,548.05
                                                            max       =   2,583.16
    Model F test:       Equal FMI                   F(   5, 5291.2)   =       2.77
    Within VCE type:          OIM                   Prob > F          =     0.0168
    
     ( 1)  [MINI_ever_6m_extended]GDP_K = 0
    ---------------------------------------------------------------------------------------
    MINI_ever_6m_extended |     exp(b)   Std. err.      t    P>|t|     [95% conf. interval]
    ----------------------+----------------------------------------------------------------
                    GDP_K |          1  (omitted)
                          |
               center_num |
                       2  |   .9730252   .2499621    -0.11   0.915     .5878483    1.610582
                       3  |   .1418595   .0898179    -3.08   0.002     .0409108    .4919018
                       4  |   1.205949   .4733273     0.48   0.633     .5581092    2.605788
                       5  |    .506252   .1910981    -1.80   0.071     .2414948     1.06127
                       6  |   1.040356    .410396     0.10   0.920     .4798984    2.255352
                       7  |          1  (omitted)
                          |
                    _cons |   .5003208   .0412343    -8.40   0.000      .425657    .5880813
    ---------------------------------------------------------------------------------------
    Stata behaves as if the extra parameter had to be estimated: one other parameter (here, center 7) is excluded from the regression. Why does this happen? Does including a parameter with a constrained estimate impose some restriction (for example, because of the relationship between the corresponding variable and the error term), so that in a fixed-effects model no higher-level variable can be included, even with a constrained coefficient?

  • #2
    After an email exchange with StataCorp, I received the solution to this issue: the "logit" command accepts the "collinear" option, which also works with "mi estimate". This option stops Stata from omitting collinear predictors, leaving the user responsible for checking that the model is identified.
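
    For reference, a minimal sketch of how the call from #1 might look with that option added. It reuses the variable names from #1 and assumes the data are still mi set as before. I have left out saving() here so that the earlier estimates file is not overwritten.

    Code:
    * Constrain the center-level coefficient to 0, as in #1
    constraint 1 GDP_K = 0

    * collinear asks logit not to omit collinear predictors before fitting;
    * the constraint is then what has to identify the model
    mi estimate, eform: logit MINI_ever_6m_extended GDP_K i.center_num, constraints(1) collinear

    Because GDP_K is constrained to 0, this model should be equivalent to the first one in #1, so comparing the center odds ratios against that output is one quick way to confirm that the constraint, and not an omitted level, is doing the identifying.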
