  • How to Solve "Omitted because of Collinearity"?

    Hi, everyone.

    I am doing a linear regression with 15 dummy variables.
    When I ran the regress command, the output showed that the last category of each of my dummy variables was omitted because of collinearity.
    Why does that happen, and how can I solve it?

    Code:
    . regress setara_hak prs_ayah dukunghub_ayah libat_ayah fisik_bayi fisik_umum hub_ibuayah peng_ayah ibu_kakek ayah_kakek ibn.gender ibn.usia ibn.prodi ibn.smt ibn.domisili ibn.SD ibn.SMP ibn.SMA ibn.ukt ibn.uang_jjn ibn.pddkn_ayah ibn.psntrn_ayah ibn.pddkn_ibu ibn.psntrn_ibu ibn.aff_agama 
    note: 2.gender omitted because of collinearity
    note: 24.usia omitted because of collinearity
    note: 2.prodi omitted because of collinearity
    note: 8.smt omitted because of collinearity
    note: 2.domisili omitted because of collinearity
    note: 2.SD omitted because of collinearity
    note: 2.SMP omitted because of collinearity
    note: 3.SMA omitted because of collinearity
    note: 7.ukt omitted because of collinearity
    note: 4.uang_jjn omitted because of collinearity
    note: 6.pddkn_ayah omitted because of collinearity
    note: 2.psntrn_ayah omitted because of collinearity
    note: 6.pddkn_ibu omitted because of collinearity
    note: 2.psntrn_ibu omitted because of collinearity
    note: 6.aff_agama omitted because of collinearity
    Code:
          Source |       SS       df       MS              Number of obs =     315
    -------------+------------------------------           F( 52,   262) =    1.18
           Model |  59.5815843    52   1.1457997           Prob > F      =  0.2029
        Residual |  254.418416   262  .971062656           R-squared     =  0.1898
    -------------+------------------------------           Adj R-squared =  0.0289
           Total |         314   314           1           Root MSE      =  .98543
    
    --------------------------------------------------------------------------------
        setara_hak |      Coef.   Std. Err.      t    P>|t|     [95% Conf. Interval]
    ---------------+----------------------------------------------------------------
          prs_ayah |  -.2095765   .1230398    -1.70   0.090    -.4518491    .0326962
    dukunghub_ayah |   .0630295   .1129592     0.56   0.577     -.159394     .285453
        libat_ayah |   .1737381   .1178075     1.47   0.141     -.058232    .4057081
        fisik_bayi |  -.0709772   .0756123    -0.94   0.349    -.2198623     .077908
        fisik_umum |  -.0494432   .0727011    -0.68   0.497    -.1925959    .0937095
       hub_ibuayah |   .0784077   .1172691     0.67   0.504    -.1525021    .3093175
         peng_ayah |   .1246556    .075395     1.65   0.099    -.0238018    .2731129
         ibu_kakek |   .0210194    .068921     0.30   0.761    -.1146901    .1567289
        ayah_kakek |  -.2258749   .0761587    -2.97   0.003    -.3758359   -.0759139
                   |
            gender |
                1  |  -.2315363   .1314557    -1.76   0.079    -.4903804    .0273078
                2  |          0  (omitted)
                   |
              usia |
               18  |   .7021237   .8391361     0.84   0.404    -.9501854    2.354433
               19  |   .3517069   .8133291     0.43   0.666    -1.249787      1.9532
               20  |   .5562574    .792089     0.70   0.483    -1.003413    2.115928
               21  |   .5096164    .760594     0.67   0.503    -.9880385    2.007271
               22  |   .2072689   .7644801     0.27   0.787    -1.298038    1.712576
               23  |   .7572285   .7959871     0.95   0.342    -.8101176    2.324575
               24  |          0  (omitted)
                   |
             prodi |
                0  |  -.4330367   .1501185    -2.88   0.004    -.7286289   -.1374445
                1  |   -.344707   .1450419    -2.38   0.018    -.6303031   -.0591109
                2  |          0  (omitted)

  • #2
    There is nothing to solve here -- unless you

    * don't follow that, with a constant in the model, omitting at least one level of each categorical predictor is always going to happen

    * prefer now to remove some predictors as possibly not helpful, which might have the side effect of fewer predictors being omitted for collinearity

    * want different binary predictors to be omitted, in which case you need to spell that out using factor variable notation (see the sketch after this list).

    * want to examine whether some indicators (you say dummies) are positive for very small subsamples so that groups might be better amalgamated.
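
    For the third and fourth points, a minimal sketch reusing variables from the command above (the choice of base level here is only an illustration, not a recommendation):

    Code:
    . * ib#. chooses which level is treated as the omitted base category:
    . * i.gender uses the lowest level (1) as the base; ib2.gender uses level 2
    . regress setara_hak i.gender
    . regress setara_hak ib2.gender

    . * tabulate shows how many observations fall in each level,
    . * which helps spot very small groups that might be better amalgamated
    . tabulate usia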


    • #3
      Let's take gender and assume there are only two categories (men and women). In that case the coefficient of an indicator (dummy) variable for women tells you the difference between that category and the reference category (in this case men). If you add another indicator variable for men, then that would tell you the difference between men and the reference category (men), which makes no sense. So your problem was that you wanted a coefficient for all categories, which (normally) makes no sense. As always there are exceptions, but learning those should not be the priority when you are surprised by this output. First understand the "standard" way of using indicator variables, and only afterwards start learning the exceptions. So don't use the ibn. prefix; just use the i. prefix for categorical variables.
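
      A minimal sketch of the difference, using just gender from the original command (the other predictors would work the same way):

      Code:
      . * ibn.gender asks for an indicator for every level with no base category,
      . * so one of them has to be dropped once the constant is in the model
      . regress setara_hak ibn.gender

      . * i.gender treats the lowest level as the base (reference) category,
      . * so no indicator needs to be omitted for collinearity
      . regress setara_hak i.gender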
      ---------------------------------
      Maarten L. Buis
      University of Konstanz
      Department of history and sociology
      box 40
      78457 Konstanz
      Germany
      http://www.maartenbuis.nl
      ---------------------------------

      • #4
        Thank you for the replies and the very clear explanations from Nick Cox and Maarten Buis. They helped me a lot.
