  • Constraining panel data regression to deal with multicollinearity

    Dear All,

    I'm struggling with the following problem. I have a panel dataset with return and market cap data on various securities. I also have industry and country identifiers for each security. See the sample below:
    Code:
    input int(date _j) double return long marketcap byte industry str14 country
    14252 295 -.029411764705882495  207874 3 "Austria"
    14259 319  -.03891233005157064  552000 2 "Austria"
    14266 339 -.025473071324599684 4299390 7 "Portugal"
    14273 301                    0  609427 8 "Portugal"
    14280 327                    .  130248 3 "Germany"
    14287 319  -.08069414316702818  552000 2 "Germany"
    14294 307 -.004160887656033322  228750 8 "France"
    14301 337                    0   46500 8 "France"
    I want to run a regression with a dummy for every country and every industry, without omitting any dummies. I want to achieve this by constraining the regression so that the sum of the industry dummy betas, each multiplied by the number of securities in that industry, is zero, and likewise for the country dummy betas. Mathematically this solves the multicollinearity problem, but I don't know how to implement it in Stata.
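
    The restriction described above can be written out as follows. This is only a sketch: the symbols are notation assumed here for illustration, not taken from the post, and the intercept \alpha is an assumption as well (the code further down is run with noconstant).

    ```latex
    % Notation assumed for illustration; not part of the original post.
    % r_{it}: return of security i at date t
    % I_{k,it}, C_{c,it}: dummies for industry k and country c
    % n_k, m_c: number of securities (the weights) in industry k and country c
    r_{it} = \alpha + \sum_{k} \beta_k I_{k,it} + \sum_{c} \gamma_c C_{c,it} + \varepsilon_{it},
    \qquad
    \sum_{k} n_k \beta_k = 0,
    \qquad
    \sum_{c} m_c \gamma_c = 0 .
    ```

    With an intercept and a full set of dummies for both groupings there are two exact linear dependencies among the regressors, and the two weighted-sum constraints are what pin the coefficients down.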

    I already tried the following code:
    Code:
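    * Drop observations with no industry code
    drop if industry == .
    
    * country1 is assumed to be a numeric encoding of the string variable country
    levelsof country1, local(country1)
    levelsof industry, local(industry)
    
    * Each constraint is built up as a weighted sum: 0 + N_1*cc1 + N_2*cc2 + ...
    local country1_constraint 0
    local industry_constraint 0
    
    * One dummy per country; the weight is the number of observations in that country
    foreach c of local country1 {
        count if country1 == `c'
        local country1_constraint `country1_constraint' + `r(N)'*cc`c'
        gen byte cc`c' = `c'.country1
    }
    
    * One dummy per industry; the weight is the number of observations in that industry
    foreach i of local industry {
        count if industry == `i'
        local industry_constraint `industry_constraint' + `r(N)'*ii`i'
        gen byte ii`i' = `i'.industry
    }
    display `"`industry_constraint'"'
    
    * Weighted sums of the country and industry coefficients must both equal zero
    constraint def 1 `country1_constraint' = 0
    constraint def 2 `industry_constraint' = 0
    
    cnsreg ret cc* ii*, noconstant constraints(1 2)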
    The problem with this code is that it does not account for the panel structure of the data, and it still omits two dummies: one industry and one country.

    Perhaps some of you can help me with this interesting case?

  • #2
    FYI this is the output of my code:
    Code:
    cnsreg ret cc* ii*, noconstant constraints(1 2)
    note: cc12 omitted because of collinearity
    note: ii9 omitted because of collinearity
    
    Constrained linear regression                   Number of obs     =  1,318,853
                                                    F(  18,1318835)   =      32.37
                                                    Prob > F          =     0.0000
                                                    Root MSE          =     0.0570
    
     ( 1)  46463*cc1 + 87208*cc2 + 47568*cc3 + 237840*cc4 + 236090*cc5 + 48559*cc6 +
           33694*cc7 + 145677*cc8 + 94145*cc9 + 44595*cc10 + 101082*cc11 +
           518293*o.cc12 = 0
     ( 2)  44581*ii0 + 98129*ii1 + 340901*ii2 + 193278*ii3 + 96144*ii4 + 208134*ii5 +
           27733*ii6 + 56496*ii7 + 472727*ii8 + 103091*o.ii9 = 0
    ------------------------------------------------------------------------------
          return |      Coef.   Std. Err.      t    P>|t|     [95% Conf. Interval]
    -------------+----------------------------------------------------------------
             cc1 |   .0003957   .0002856     1.39   0.166    -.0001641    .0009556
             cc2 |   .0000548   .0002046     0.27   0.789    -.0003461    .0004557
             cc3 |   .0005556   .0002751     2.02   0.043     .0000163    .0010948
             cc4 |   .0009013   .0001129     7.98   0.000       .00068    .0011226
             cc5 |   .0009292   .0001164     7.98   0.000      .000701    .0011574
             cc6 |   .0003177   .0002664     1.19   0.233    -.0002046    .0008399
             cc7 |   .0009946   .0003508     2.84   0.005      .000307    .0016822
             cc8 |  -.0006696    .000161    -4.16   0.000    -.0009851   -.0003541
             cc9 |  -.0041605   .0001988   -20.92   0.000    -.0045502   -.0037707
            cc10 |   .0002611   .0002857     0.91   0.361    -.0002988    .0008211
            cc11 |  -.0005411   .0002012    -2.69   0.007    -.0009354   -.0001467
            cc12 |          0  (omitted)
             ii0 |   .0001002   .0002989     0.34   0.738    -.0004857    .0006861
             ii1 |   .0004428    .000195     2.27   0.023     .0000607    .0008249
             ii2 |   .0002788   .0000955     2.92   0.004     .0000917     .000466
             ii3 |   .0002778   .0001318     2.11   0.035     .0000195    .0005361
             ii4 |  -.0001704   .0002131    -0.80   0.424    -.0005881    .0002474
             ii5 |  -.0001667   .0001314    -1.27   0.205    -.0004243    .0000909
             ii6 |  -.0009099   .0003944    -2.31   0.021    -.0016829    -.000137
             ii7 |  -.0009552    .000261    -3.66   0.000    -.0014667   -.0004437
             ii8 |  -.0001405   .0000786    -1.79   0.074    -.0002944    .0000135
             ii9 |          0  (omitted)
    ------------------------------------------------------------------------------
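
    The two "omitted because of collinearity" notes appear because cnsreg, like other Stata estimation commands, omits collinear regressors by default. One possible direction, offered as a minimal sketch rather than a tested solution, is the collinear option, which asks the command to leave such variables in the model. The sketch assumes that constraints 1 and 2 and the cc*/ii* dummies from the code in the first post are still defined. Dropping noconstant is also an assumption made here: with an intercept, the two constraints correspond to the two redundancies among the dummies.

    ```stata
    * Sketch only: assumes constraints 1 and 2 and the cc*/ii* dummies
    * created by the code in the first post already exist.
    * collinear asks cnsreg not to omit collinear regressors; the intercept is
    * kept (no noconstant) so the two constraints match the two redundancies.
    cnsreg return cc* ii*, constraints(1 2) collinear
    ```

    This does not address the panel structure of the data; it only concerns the omitted dummies.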


    • #3
      Cross-posted at https://stats.stackexchange.com/ques...lticolinearity

      Please note our policy on cross-posting: You are asked to tell us about it.


      • #4
        Dear Nick Cox

        My apologies for the inconvenience. Thank you for posting the link!

        I still hope the question gets answered.

        Kind regards,

        Ruben
