  • How to Solve "Omitted because of Collinearity"?

    Hi, everyone.

    I am doing a linear regression with 15 dummy variables.
    When I ran the regress command, the output showed that the last category of each of my dummy variables was omitted because of collinearity.
    Why does that happen, and how can I solve it?

    Code:
    . regress setara_hak prs_ayah dukunghub_ayah libat_ayah fisik_bayi fisik_umum hub_ibuayah peng_ayah ibu_kakek ayah_kakek ibn.gender ibn.usia ibn.prodi ibn.smt ibn.domisili ibn.SD ibn.SMP ibn.SMA ibn.ukt ibn.uang_jjn ibn.pddkn_ayah ibn.psntrn_ayah ibn.pddkn_ibu ibn.psntrn_ibu ibn.aff_agama 
    note: 2.gender omitted because of collinearity
    note: 24.usia omitted because of collinearity
    note: 2.prodi omitted because of collinearity
    note: 8.smt omitted because of collinearity
    note: 2.domisili omitted because of collinearity
    note: 2.SD omitted because of collinearity
    note: 2.SMP omitted because of collinearity
    note: 3.SMA omitted because of collinearity
    note: 7.ukt omitted because of collinearity
    note: 4.uang_jjn omitted because of collinearity
    note: 6.pddkn_ayah omitted because of collinearity
    note: 2.psntrn_ayah omitted because of collinearity
    note: 6.pddkn_ibu omitted because of collinearity
    note: 2.psntrn_ibu omitted because of collinearity
    note: 6.aff_agama omitted because of collinearity
    Code:
          Source |       SS       df       MS              Number of obs =     315
    -------------+------------------------------           F( 52,   262) =    1.18
           Model |  59.5815843    52   1.1457997           Prob > F      =  0.2029
        Residual |  254.418416   262  .971062656           R-squared     =  0.1898
    -------------+------------------------------           Adj R-squared =  0.0289
           Total |         314   314           1           Root MSE      =  .98543
    
    --------------------------------------------------------------------------------
        setara_hak |      Coef.   Std. Err.      t    P>|t|     [95% Conf. Interval]
    ---------------+----------------------------------------------------------------
          prs_ayah |  -.2095765   .1230398    -1.70   0.090    -.4518491    .0326962
    dukunghub_ayah |   .0630295   .1129592     0.56   0.577     -.159394     .285453
        libat_ayah |   .1737381   .1178075     1.47   0.141     -.058232    .4057081
        fisik_bayi |  -.0709772   .0756123    -0.94   0.349    -.2198623     .077908
        fisik_umum |  -.0494432   .0727011    -0.68   0.497    -.1925959    .0937095
       hub_ibuayah |   .0784077   .1172691     0.67   0.504    -.1525021    .3093175
         peng_ayah |   .1246556    .075395     1.65   0.099    -.0238018    .2731129
         ibu_kakek |   .0210194    .068921     0.30   0.761    -.1146901    .1567289
        ayah_kakek |  -.2258749   .0761587    -2.97   0.003    -.3758359   -.0759139
                   |
            gender |
                1  |  -.2315363   .1314557    -1.76   0.079    -.4903804    .0273078
                2  |          0  (omitted)
                   |
              usia |
               18  |   .7021237   .8391361     0.84   0.404    -.9501854    2.354433
               19  |   .3517069   .8133291     0.43   0.666    -1.249787      1.9532
               20  |   .5562574    .792089     0.70   0.483    -1.003413    2.115928
               21  |   .5096164    .760594     0.67   0.503    -.9880385    2.007271
               22  |   .2072689   .7644801     0.27   0.787    -1.298038    1.712576
               23  |   .7572285   .7959871     0.95   0.342    -.8101176    2.324575
               24  |          0  (omitted)
                   |
             prodi |
                0  |  -.4330367   .1501185    -2.88   0.004    -.7286289   -.1374445
                1  |   -.344707   .1450419    -2.38   0.018    -.6303031   -.0591109
                2  |          0  (omitted)

  • #2
    There is nothing to solve here -- unless you

    * don't follow that, with a constant in the model, omitting at least one level of each categorical predictor is always going to happen

    * prefer now to remove some predictors as possibly not helpful, which might have the side effect of fewer predictors being omitted for collinearity

    * want different binary predictors to be omitted, in which case you need to spell that out using factor variable notation (see the sketch after this list).

    * want to examine whether some indicators (you say dummies) are positive for very small subsamples so that groups might be better amalgamated.
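
    For the third and fourth points, a minimal sketch reusing variables from the command above (the choice of base level here is only an illustration, not a recommendation):

    Code:
    . * ib#. chooses which level is treated as the omitted base category:
    . * i.gender uses the lowest level (1) as the base; ib2.gender uses level 2
    . regress setara_hak i.gender
    . regress setara_hak ib2.gender

    . * tabulate shows how many observations fall in each level,
    . * which helps spot very small groups that might be better amalgamated
    . tabulate usia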


    • #3
      Let's take gender and assume there are only two categories (men and women). In that case the coefficient of an indicator (dummy) variable for women tells you the difference between that category and the reference category (in this case men). If you add another indicator variable for men, then that would tell you the difference between men and the reference category (men), which makes no sense. So your problem was that you wanted a coefficient for all categories, which (normally) makes no sense. As always there are exceptions, but learning those should not be the priority when you are surprised by this output. First understand the "standard" way of using indicator variables, and only afterwards start learning the exceptions. So don't use the ibn. prefix; just use the i. prefix for categorical variables.
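
      A minimal sketch of the difference, using just gender from the original command (the other predictors would work the same way):

      Code:
      . * ibn.gender asks for an indicator for every level with no base category,
      . * so one of them has to be dropped once the constant is in the model
      . regress setara_hak ibn.gender

      . * i.gender treats the lowest level as the base (reference) category,
      . * so no indicator needs to be omitted for collinearity
      . regress setara_hak i.gender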
      ---------------------------------
      Maarten L. Buis
      University of Konstanz
      Department of history and sociology
      box 40
      78457 Konstanz
      Germany
      http://www.maartenbuis.nl
      ---------------------------------

      • #4
        Thank you for the replies and the very clear explanations from Nick Cox and Maarten Buis. They helped me a lot.
