  • multicollinearity in Poisson regression

    Hello. I will run a Poisson regression analysis using Stata, but I want to check for collinearity first.
    In linear regression, we can check collinearity using the VIF and TOL values from the output.
    Can we also use VIF and TOL to check collinearity in Poisson regression?

    How do I detect multicollinearity in a Poisson regression using Stata?

  • #2
    Collinearity is a property of the independent variables--it has nothing to do with the type of regression that you apply to them. For some reason, Stata has only made VIF available after -regress-. But all you have to do is run whatever model you were planning to do with -regress- instead and then run -estat vif-.
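
    A minimal sketch of that workflow, with hypothetical outcome y and predictors x1-x3 standing in for your actual variables:

    Code:
    * the Poisson model you actually plan to report
    poisson y x1 x2 x3
    * refit the same specification with -regress- purely to obtain the VIFs
    regress y x1 x2 x3
    estat vif

    The -regress- fit here is a throwaway; only the collinearity diagnostics from -estat vif- are of interest.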



    • #3
      Thank you so much for your response. That was a huge help.



      • #4
        Dear Professor Schechter,

        If you allow, I would also have a question related to this topic.

        You said that:
        Originally posted by Clyde Schechter:
        But all you have to do is run whatever model you were planning to do with -regress- instead and then run -estat vif-.

        I'm running my model (xtpoisson) with country-pair fixed effects.
        I guess that if I want to run the same model with -regress-, I have to include i.CountryPairID variables to control for the country-pair fixed effects.
        But because of my Stata edition's limit on the number of variables, I run into the "maxvar" error.
        Do you have a suggestion for how I could still test for collinearity?

        Thank you in advance for your time and also for your previous posts (also in other threads).

        Best
        Onur



        • #5
          So, first, you might look into some other posts of mine on multicollinearity testing: my primary advice is that it is a waste of time and pixels and you just shouldn't do it at all. Find a copy of Arthur Goldberger's econometrics textbook and read the chapter on multicollinearity--he writes a very amusing and devastating takedown of the whole concept.

          But assuming somebody is forcing you to do this, here is a workaround.

          Let's say your original regression is -xtpoisson y x1 x2 x3, fe- and your data have been -xtset- with p as the panel variable.

          Code:
          preserve
          xtdata y x1 x2 x3, fe clear
          regress y x1 x2 x3
          estat vif
          restore
          The -xtdata- command will replace your data with a de-meaned version. Running -regress- on the demeaned data is equivalent to running -xtreg, fe-.

          Let me also note, as an aside, that including i.CountryPairID variables is not equivalent to having CountryPairID fixed effects for a Poisson regression. That trick is only correct for linear models. Nevertheless, for the purpose of identifying multicollinearity, it is not a problem to do this.



          • #6
            Dear Professor Schechter,

            Thank you very much for this great and helpful post.

            Despite the critiques, I will implement your suggestions first.
            But I have already reserved the book and can perhaps discuss that part with my supervisor.

            Thank you again for this specific and incredibly fast response!

            Best wishes
            Onur



            • #7
              Dear Professor Schechter,

              Sorry for bothering you with the concept of multicollinearity again, but I was given a further task today and have one more question:

              Now I have to check the pairwise correlation matrix of the variables.
              Would it be correct, for my setup, to replace my data with the de-meaned version (via -xtdata-) and then run "pwcorr var1 var2 ..."?

              Thank you for your time in advance!

              Best wishes
              Onur

              P.S.: I have read the chapter you suggested (of Arthur Goldberger's book). Since I still have to check the multicollinearity, I think about implementing his arguments in my thesis. Thank you again for your input!



              • #8
                Well, I think you have to ask the person who is requesting this pairwise correlation matrix what they are looking for. In my mind, the correlations of the demeaned data are meaningful and reflective of the correlations that actually influence the regression analysis. By contrast, the correlations of the original data are distorted by the fixed effects and, unless the fixed effects are negligibly small, those correlations have little relationship to what -xtreg, fe- calculates. So the correlations of the demeaned variables would be what I would want to see.

                But I don't know what your supervisor has in mind. Evidently they see things differently from me with regard to multicollinearity, so they may see this differently as well. At least for now, you have to either do things your supervisor's way or persuade them differently. So I would ask them this question explicitly. And if they respond that they want a table of correlations of the original variables, you might ask them to explain why and how it is helpful (though only you can judge how tolerant they are of being questioned/challenged.)
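
                If the demeaned correlations are indeed what is wanted, the same -xtdata- trick from #5 applies, just with -pwcorr- in place of -regress- (variable names again hypothetical):

                Code:
                preserve
                xtdata y x1 x2 x3, fe clear   // replace data with the de-meaned version
                pwcorr x1 x2 x3, sig          // pairwise correlations of the demeaned variables
                restore

                The -sig- option additionally prints significance levels for each correlation, which supervisors often ask for alongside the matrix.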



                • #9
                  Correct me if I'm wrong, but multicollinearity can be a real problem, no?

                  Here's my situation. I'm interested in x1 (in a Poisson regression). I regress y on x1 with -xi: poisson- and get a coefficient of .007. x1 is very highly correlated with x2. When I include x2, the coefficient on x1 becomes .028 (4x as large). The standard errors also go up by 6x, but the coefficient is nevertheless still significant. The coefficient on x2 is large, negative, and also highly significant. That's a problem, no? And this is a control-function IV case where the control is x2, so it can't easily be dropped. Solutions?



                  • #10
                    Yes, when the collinearity involves the variable of interest, not just some covariates included to reduce omitted variable bias, then it can be a problem. And in your case it very well may be. A 6-fold inflation of the standard error is a large loss of precision. But the question remains: with the (reduced) precision you still have, is that good enough to answer your research question? If so, it's not a problem.

                    But if your study has become inconclusive by virtue of the reduced precision, then, yes, it's a problem. And there are no solutions other than to do a new study with either much more data or a sampling design that breaks the collinearity between x1 and x2.
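
                    One way to see the precision loss described here is a small simulation with two nearly collinear regressors (all variable names and parameter values are illustrative, not taken from the study above):

                    Code:
                    clear
                    set seed 12345
                    set obs 1000
                    gen x1 = rnormal()
                    gen x2 = x1 + 0.1*rnormal()                 // x2 is almost collinear with x1
                    gen y = rpoisson(exp(0.5 + 0.2*x1 - 0.1*x2))
                    poisson y x1                                 // omitting x2: small standard error on x1
                    poisson y x1 x2                              // including x2: standard error on x1 inflates sharply

                    Comparing the two outputs shows the standard error on x1 blowing up once the near-duplicate regressor enters the model, which is exactly the precision loss at issue.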



                    • #11
                      Clyde Schechter, thanks for this; just seeing your response now. I also concluded it was a problem. This was from an AER replication we did of a paper by Aghion, Van Reenen, and Zingales here: https://drive.google.com/file/d/1XMl...6kr8UiQer/view
