  • Relationship between a dependent variable and independent variables that are perfectly multicollinear

    Dear all,

    I want to know the relationship between a dependent variable and multiple independent variables (about 9). However, the (continuous) independent variables have a linear relationship between them (they add up to 100). To be more specific, I want to know whether each independent variable is associated with a positive or negative, small or large deviation from the average of the dependent variable.

    - I don't want to omit one of the independent variables since I don't know its effect on the dependent variable, so I don't know whether the coefficients truly reflect the positive/negative correctly.
    - I don't want to remove the constant term (same argument as above).

    Any suggestions? Is it for instance possible to force an intercept which equals the average?

    Thanks in advance

  • #2
    - I don't want to omit one of the independent variables since I don't know its effect on the dependent variable, so I don't know whether the coefficients truly reflect the positive/negative correctly.
    - I don't want to remove the constant term (same argument as above).
    You are asking the impossible. It is not just impossible in Stata, it is impossible in the universe.
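
    To see why, here is a minimal sketch in Python with NumPy (not Stata), using simulated data as an assumption: nine hypothetical variables that sum to 100 in every observation, plus a constant term. Because the constant is an exact linear combination of the nine variables, the design matrix has one more column than its rank, so no software can identify all ten coefficients.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical data: 50 observations, 9 variables that sum to 100 in each row.
n, k = 50, 9
shares = rng.dirichlet(np.ones(k), size=n) * 100.0

# Design matrix with a constant term plus all nine variables.
X = np.column_stack([np.ones(n), shares])

# The constant column equals (sum of the nine variables) / 100,
# so one column is redundant and the rank is below the column count.
n_columns = X.shape[1]
rank = np.linalg.matrix_rank(X)
print(f"columns: {n_columns}, rank: {rank}")
```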



    • #3
      You might want to reconsider your substantive question. If these data reflect something true about the phenomenon of interest, then your substantive question might not make a lot of sense. It would be interesting to know something about the phenomenon you are investigating here, so we could better discuss this.

      Best
      Daniel



      • #4
        Forcing a value for the intercept gives you almost the same problems as you get if you omit the intercept (i.e., force the intercept to equal zero). It makes all the parameter values estimable but dependent on this arbitrary value you've set the intercept to.
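
        A minimal sketch of this point, in Python with NumPy and on simulated data (the variable names and the data-generating process are assumptions for illustration only). It fits the model twice, once with the intercept forced to zero and once with it forced to the mean of the dependent variable. The fitted values agree, but every coefficient shifts by the forced intercept divided by 100, so the signs and sizes of the coefficients reflect the arbitrary choice rather than the data.

```python
import numpy as np

rng = np.random.default_rng(1)

# Hypothetical data: 9 variables summing to 100, and a simulated outcome.
n, k = 200, 9
shares = rng.dirichlet(np.ones(k), size=n) * 100.0
y = shares @ rng.normal(size=k) + rng.normal(size=n)

def fit_with_fixed_intercept(c):
    """Least squares of (y - c) on all nine variables, with no free constant."""
    coef, *_ = np.linalg.lstsq(shares, y - c, rcond=None)
    return coef

b_zero = fit_with_fixed_intercept(0.0)       # intercept forced to zero
b_mean = fit_with_fixed_intercept(y.mean())  # intercept forced to mean of y

# The two models produce the same fitted values ...
fitted_zero = shares @ b_zero
fitted_mean = y.mean() + shares @ b_mean

# ... but each coefficient differs by the same constant, -c / 100.
shift = b_mean - b_zero
print(np.allclose(fitted_zero, fitted_mean), shift)
```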

        When you say "To be more specific, I want to know whether an independent variable has a positive/negative small/big deviation from the average (dependent variable)", it sounds like you are thinking of something that is not a regression of the dependent variable on these independent variables. Maybe you don't want a normal regression model at all.

        As Daniel noted, you need to rethink what you're doing.




        • #5
          I want to investigate whether voting preferences in a municipality are related to a dependent variable (for instance, education level). I have the percentage of votes for each party (about 9 parties) as the independent variables (so they do add up to 100), and the education level score as the dependent variable. (NOTE: I don't need causality, just correlation.)

          I figured that a normal regression might not be the answer, but I don't know what other option I have...



          • #6
            Patrick:
            If by normal regression you mean OLS, that is probably not the way to go, as education level is not a continuous variable; perhaps -ologit- may support what you're after.
            That said, I fear that your regression model may suffer from reverse causality (read: endogeneity), in that education level may well help explain differences in voting preferences.
            As far as I understand your query, you are probably safer with some sort of correlation.
            Kind regards,
            Carlo
            (Stata 19.0)



            • #7
              From a substantive point of view, you should carefully think about what exactly you mean by voting preferences. Preferences usually arise from comparing at least two goals. Thus, excluding one party from the model will accurately represent the preferences for all other parties in comparison to the omitted one.

              The (probably impossible) model that you ask for here would instead try to estimate the effect of a one-unit (one percentage point of votes) increase for one party, holding constant the percentage of votes for all others. This is impossible in reality, since increasing the votes for one party must decrease the votes for at least one other party. It is also not necessarily informative, since the effects might well depend on which of the other parties loses votes.
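
              To illustrate the point about the omitted party, here is a minimal sketch in Python with NumPy on simulated data (the data and the helper name `fit_omitting` are hypothetical). It fits the model twice, each time with a constant and eight of the nine vote shares, omitting a different party. Each slope is a contrast with the omitted party, so changing the reference shifts every slope by the same amount and can flip its sign.

```python
import numpy as np

rng = np.random.default_rng(2)

# Hypothetical data: vote shares for 9 parties (summing to 100) and an outcome.
n, k = 200, 9
shares = rng.dirichlet(np.ones(k), size=n) * 100.0
y = shares @ rng.normal(size=k) + rng.normal(size=n)

def fit_omitting(ref):
    """OLS with a constant and all shares except party `ref`.

    Returns a dict mapping each included party index to its slope.
    """
    keep = [j for j in range(k) if j != ref]
    X = np.column_stack([np.ones(n), shares[:, keep]])
    coef, *_ = np.linalg.lstsq(X, y, rcond=None)
    return dict(zip(keep, coef[1:]))

slopes_ref0 = fit_omitting(0)  # party 0 is the reference
slopes_ref1 = fit_omitting(1)  # party 1 is the reference

# Party 1 relative to party 0 is the negative of party 0 relative to party 1.
print(slopes_ref0[1], slopes_ref1[0])

# For every other party, switching the reference shifts the slope
# by the contrast between the two reference parties.
shifts = [slopes_ref0[j] - slopes_ref1[j] for j in range(2, k)]
print(shifts)
```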

              From a statistical point of view, you could still estimate what you (seemingly) want, by using variance between municipalities in the data or variance within municipalities over time.

              Best
              Daniel
