  • Interaction terms being dropped because of collinearity - confused as to why?

    Hello,

    I am currently writing a thesis on how female ownership of firms affects firm performance in SSA using firm-level WBES data.

    I have run regressions of log(sales) and sales growth using 'Female' as the key independent variable, where Female = 1 when a female is present in a firm's ownership and 0 otherwise. These results indicated no significant difference between male- and female-owned firms.

    So my next stage of analysis was to start running some interaction terms.

    My first interaction term is Female*Fshare (where Fshare is a continuous variable giving the % of female ownership; male-owned firms are 0%).

    I ran the following code:

    reg lnmsales i.Female##c.Fshare Fmanager Registration Certification Export foreign Web Overdraft Credit Transport Elec medium large age manufacturing retail Experience ib10.Region, robust cluster(Region)

    Stata returned:

    note: 1.Female#c.Fshare omitted because of collinearity.
    Log (sales)        Coefficient  std. err.      t    P>|t|    [95% conf. interval]
    Female
      Yes                 .201533   .1214493     1.66   0.131    -.0732044    .4762704
    Fshare              -.0048666   .0024718    -1.97   0.080    -.0104583    .0007251
    Female#c.Fshare
      Yes                       0  (omitted)

    So, my understanding is that Stata has returned Female and Fshare on their own and omitted the interaction term. Why has Stata omitted the interaction term?

    I then added the noomitted option to the end of the command, as I saw this suggested on another Statalist thread. It returns the same thing.

    I then tried a different interaction term, Female*Fmajority, where Fmajority is a dummy which = 1 if Fshare > 49%. Stata still omits the interaction term.

    I am stuck and cannot figure out why.

    Apologies in advance if this is a simple fix, but I have read what seems like every Statalist thread on this and nothing has worked.

    Huge thanks in advance for any help.

    Best,
    Sam



  • #2
    Sam:
    welcome to this forum.
    The best advice I can give you is to start off by coding your regression with only the dependent variable and the interaction (##) and see if it works. Then add one predictor at a time and see when the collinearity issue bites.
    Kind regards,
    Carlo
    (Stata 19.0)



    • #3
      Hello Carlo,

      Appreciate the quick response!

      The code:
      reg lnmsales i.Female##c.Fshare

      returns this:
      note: 1.Female#c.Fshare omitted because of collinearity.


      So the issue is (I assume) at the heart of the interaction term? Does this mean it just isn't plausible statistically?

      Best,
      Sam



      • #4
        Sam:
        let's get rid of this interaction, keep Female and Fshare, add one predictor at a time and see what happens.
        Kind regards,
        Carlo
        (Stata 19.0)



        • #5
          Hello Carlo,

          I added one predictor at a time until I had the full regression which is:

          Reg1: reg lnmsales Female Fshare Fmanager Registration Certification Export foreign Web Overdraft Credit Transport Elec medium large age manufacturing retail Experience ib10.Region, robust cluster(Region)

          There was no collinearity in any of the regressions (for each predictor).

          Reg1 reports Female insignificant and Fshare significant. So does this mean that there is some threshold where the % of Female ownership matters for firm performance?


          Then when I run Reg2 which is:

          reg lnmsales i.Female##c.Fshare Fmanager Registration Certification Export foreign Web Overdraft Credit Transport Elec medium large age manufacturing retail Experience ib10.Region, robust cluster(Region)

          Stata omits the interaction term:

          note: 1.Female#c.Fshare omitted because of collinearity.

          Stata reports the same results for Female and Fshare on their own in Reg1 and Reg2.



          • #6
            Sam:
            please share via CODE delimiters what Stata gave you back. Thanks.
            Kind regards,
            Carlo
            (Stata 19.0)



            • #7
              I ran the regression:
              Code:
              reg lnmsales i.Female##c.Fshare Fmanager Registration Certification Export foreign Web Overdraft Credit Transport Elec medium large age manufacturing retail Experience shareholding partnership ib10.Region, robust cluster(Region)
              Stata returned:
              Code:
              reg lnmsales Female##c.Fshare Fmanager Registration Certification Export foreign Web Overdraft Credit Transport Elec medium large age manufacturing retail Experience shareholding partnership ib10.Region, robust cluster(Region)
              note: 1.Female#c.Fshare omitted because of collinearity.
              
              Linear regression                               Number of obs     =        766
                                                              F(8, 9)           =          .
                                                              Prob > F          =          .
                                                              R-squared         =     0.5525
                                                              Root MSE          =      1.458

                                              (Std. err. adjusted for 10 clusters in Region)

                                             Robust
                     lnmsales   Coefficient  std. err.      t    P>|t|     [95% conf. interval]

                       Female
                          Yes     -.1114652   .1211944    -0.92   0.382    -.3856261    .1626957
                       Fshare     -.0006571   .0021939    -0.30   0.771      -.00562    .0043058

              Female#c.Fshare
                          Yes             0  (omitted)
              Sorry, I am new to this forum. Have I used the CODE delimiters correctly? Apologies in advance if I haven't.



              • #8
                As a side note, I have already taken out the robust standard errors to see if they were causing the issue, and the problem still occurred. I have also tried the noomitted option; this did not help either. Thanks for your help!



                • #9
                  Sam:
                  1) with only 10 clusters, go back to default standard errors;
                  2) my gut feeling is that the terms of the interaction are collinear with one another. Double-check your dataset;
                  3) to check whether or not you used CODE delimiters the right way, you should see the code and the results of your Stata session. Using -Preview- before posting your reply can give you an idea of how interested listers will receive your post.
                  Kind regards,
                  Carlo
                  (Stata 19.0)
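
                  A minimal sketch of how the dataset check in point 2 might look, assuming (as described in #1) that Fshare is 0 for every firm with Female == 0. Under that assumption Female*Fshare equals Fshare in every observation, so the interaction column duplicates the Fshare main effect and one of the two has to be dropped. FxFshare is a hypothetical helper variable, not part of the original dataset:

                  ```stata
                  * Sketch only; FxFshare is a hypothetical helper variable
                  generate double FxFshare = Female * Fshare

                  * A count of 0 means the interaction equals Fshare in every observation
                  count if FxFshare != Fshare & !missing(Female, Fshare)

                  * Fshare should show no variation (all zeros) when Female == 0
                  summarize Fshare if Female == 0

                  * Same idea for the dummy: Fmajority == 1 should not occur with Female == 0
                  tabulate Female Fmajority
                  ```

                  If the count is 0, the omission is a property of how the variables are defined rather than of the other predictors, which would also be consistent with the interaction being dropped in the two-variable regression in #3.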



                  • #10
                    Thank you for your help Carlo, much appreciated.
