
  • variance inflation factor/eigenvalues: GEE and logistic regression

    Hi Statalisters,

    Does anyone by any chance know the command for variance inflation factors and eigenvalues for GEE analysis as well as logistic regression?

    I looked through the postestimation commands for both GEE and logit but couldn't find any. Surprisingly, however, a paper that I am using as a basis for my analysis mentions calculating VIFs and eigenvalues for GEE and logit.

    Any help will be highly appreciated!

    Best,
    Mohsin

  • #2
    VIF and associated statistics refer to the right-hand-side (independent) variables only. You can run a regular regression and then get what you want with estat vif, or you can use a user-written routine called collin. I can't tell from your note whether you have longitudinal data in long form, in which case you may want to rearrange your data. See http://www.ats.ucla.edu/stat/stata/faq/svycollin.htm for details.
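
    A minimal sketch of that suggestion, assuming hypothetical variable names (y for the outcome, x1-x3 for the predictors). The outcome only serves to let regress run; the VIFs depend on the predictors alone.

    ```stata
    * Sketch only: y, x1, x2, x3 are hypothetical variable names
    * Fit an ordinary regression with the same right-hand-side variables
    regress y x1 x2 x3

    * Variance inflation factors for the predictors (available after regress)
    estat vif

    * collin is user-written; locate and install it first, e.g. with: findit collin
    collin x1 x2 x3
    ```

    Note that collin takes the list of predictors directly, so no model needs to be fit before calling it.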
    Richard T. Campbell
    Emeritus Professor of Biostatistics and Sociology
    University of Illinois at Chicago



    • #3
      I used collin on long form and it worked. Thank you for your guidance.



      • #4
        The VIF statistics provided by collin measure variance inflation exactly only for OLS models, not for GEE or for logistic models (Hill and Adkins, 2003). The reason: collin operates on the X'X matrix, which is proportional to the inverse of the variance-covariance matrix only for OLS.
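
        The reasoning can be sketched with standard textbook formulas (these are not quoted from the cited chapter, and the notation is an assumption for illustration: W is the diagonal weight matrix of fitted logistic variances, S_jj is the centered sum of squares of predictor j, and R_j^2 comes from regressing predictor j on the other predictors):

        ```latex
        % OLS: the coefficient covariance is proportional to (X'X)^{-1}
        \operatorname{Var}(\hat\beta_{\text{OLS}}) = \sigma^2 (X'X)^{-1},
        \qquad
        \operatorname{Var}(\hat\beta_j) = \frac{\sigma^2}{S_{jj}} \cdot \frac{1}{1 - R_j^2}
        = \frac{\sigma^2}{S_{jj}} \, \mathrm{VIF}_j

        % Logistic regression (maximum likelihood): weights depend on the fitted probabilities
        \operatorname{Var}(\hat\beta_{\text{logit}}) \approx (X' W X)^{-1},
        \qquad
        W = \operatorname{diag}\bigl(\hat p_i (1 - \hat p_i)\bigr)
        ```

        Because W varies across observations, (X'WX)^{-1} is in general not proportional to (X'X)^{-1}, so a VIF computed from X'X alone no longer equals the actual inflation of the coefficient variances. In GEE the link function and the working correlation enter the variance as well.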

        I would prefer condition numbers computed with John Hendrickx's coldiag2 (SSC), still applied to X'X. John's perturb (SSC) gets at collinearity for any model by studying the effects of small changes in the predictors on the estimated coefficients (Schall and Dunne, 1992).
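
        A sketch of the coldiag2 route, using the predictor names that appear in the xtgee command later in this thread. It assumes coldiag2 accepts a plain variable list; check help coldiag2 after installing for the exact syntax and options.

        ```stata
        * Install the user-written command from SSC (one time only)
        ssc install coldiag2

        * Condition indexes and variance-decomposition proportions for the predictors
        * (assumption: coldiag2 is called with the list of right-hand-side variables)
        coldiag2 ten_1 tmt_1 ari_1 lemp_1 td_1 oc_1
        ```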

        See also the discussion at: http://www.statalist.org/forums/foru...4-vif-with-svy

        References:
        Hill, R. Carter, and Lee C. Adkins. 2003. Chapter 12: Collinearity. In A Companion to Theoretical Econometrics, ed. B. H. Baltagi, 256-278. Oxford: Blackwell Publishing.

        Schall, Robert, and Timothy T. Dunne. 1992. A note on the relationship between parameter collinearity and local influence. Biometrika 79, no. 2: 399-404.
        Last edited by Steve Samuels; 12 Sep 2015, 16:03.
        Steve Samuels
        Statistical Consulting

        Stata 14.2



        • #5
          Hi Steve,

          Thank you for replying. I am trying to run perturb with an xtgee model, but I get a "conformability error":

          Code:
          perturb: xtgee cino ten_1 tmt_1 ari_1 lemp_1 td_1 oc_1, poptions(pvars(ten_1 tmt_1 ari_1 lemp_1 td_1) prange
          > (5 5 5 5 5) pfac(oc_1) pc(96))
          
          Iteration 1: tolerance = .1939647
          Iteration 2: tolerance = .01492608
          Iteration 3: tolerance = .00187656
          Iteration 4: tolerance = .00024871
          Iteration 5: tolerance = .00003319
          Iteration 6: tolerance = 4.434e-06
          Iteration 7: tolerance = 5.923e-07
          
          GEE population-averaged model                   Number of obs      =       480
          Group variable:                         id      Number of groups   =        96
          Link:                             identity      Obs per group: min =         5
          Family:                           Gaussian                     avg =       5.0
          Correlation:                  exchangeable                     max =         5
                                                          Wald chi2(6)       =      8.63
          Scale parameter:                  .0525701      Prob > chi2        =    0.1955
          
          ------------------------------------------------------------------------------
                  cino |      Coef.   Std. Err.      z    P>|z|     [95% Conf. Interval]
          -------------+----------------------------------------------------------------
                 ten_1 |   .0016143   .0021414     0.75   0.451    -.0025828    .0058114
                 tmt_1 |   .0008179   .0041992     0.19   0.846    -.0074123    .0090481
                 ari_1 |   .1836229   .1100982     1.67   0.095    -.0321655    .3994113
                lemp_1 |   .0189954   .0128328     1.48   0.139    -.0061563    .0441472
                  td_1 |  -.0711932   .0454898    -1.57   0.118    -.1603515    .0179651
                  oc_1 |  -.0229941   .0276349    -0.83   0.405    -.0771575    .0311693
                 _cons |   .0359777    .045286     0.79   0.427    -.0527812    .1247366
          ------------------------------------------------------------------------------
          
          Perturb variables:
          --------------------------------------------------
          ten_1                             normal(0,5)
          tmt_1                             normal(0,5)
          ari_1                             normal(0,5)
          lemp_1                            normal(0,5)
          td_1                              normal(0,5)
          
          Perturb factors:
          --------------------------------------------------
          
          Reclassification probabilities for oc_1:
          
          -------------------------------
          original  |
          variable  |reclassifed variable
          oc_1      |     1      2  Total
          ----------+--------------------
                  1 | 0.960  0.040  1.000
                  2 | 0.040  0.960  1.000
          -------------------------------
          
          Initial expected table based on the reclassification probabilities:
          
          -------------------------------------
          original  |
          variable  |   reclassifed variable   
          oc_1      |       1        2    Total
          ----------+--------------------------
                  1 | 312.960   13.040  326.000
                  2 |   6.160  147.840  154.000
                    | 
              Total | 319.120  160.880  480.000
          -------------------------------------
          
          The reclassification probabilities will be adjusted to let
          the expected frequencies of the reclassified variable be equal to those of oc_1
          
          The expected table will be quasi-independent
          ln(q)=:         3.178
          
          Adjusted expected table:
          
          -------------------------------------
          original  |
          variable  |   reclassifed variable   
          oc_1      |       1        2    Total
          ----------+--------------------------
                  1 | 317.064    8.936  326.000
                  2 |   8.936  145.064  154.000
                    | 
              Total | 326.000  154.000  480.000
          -------------------------------------
          
          Final reclassification probabilities:
          
          -------------------------------
          original  |
          variable  |reclassifed variable
          oc_1      |     1      2  Total
          ----------+--------------------
                  1 | 0.973  0.027  1.000
                  2 | 0.058  0.942  1.000
          -------------------------------
          conformability error
          r(503);

          I read the description of the error, and it says: "You have issued a matrix command attempting to combine two matrices that are not conformable, for example, multiplying a 3x2 matrix by a 3x3 matrix. You will also get this message if you attempt an operation that requires a square matrix and the matrix is not square."

          I am not sure what to make of this. When I remove the factor variable, it works.

          Can you please guide me in the right direction?

          Thank you,
          Mohsin



          • #6
            Yes, perturb can be flaky, as the posts in the other thread show. Also, I believe it was written before Stata introduced the new factor-variable notation. I suspect a problem that can't be fixed, but it's also possible that your syntax is incorrect. Hendrickx no longer maintains the Stata version of perturb, but he does actively maintain the R version, if that's accessible to you.
            Steve Samuels
            Statistical Consulting

            Stata 14.2



            • #7
              Thank you for replying, Steven. I am using four measures of prior performance (two of which have a strong, significant correlation of greater than 0.5 with other dependent variables). Luckily, I can still carry out the analysis with the other two prior-performance measures, so it shouldn't be a serious problem if perturb doesn't work out.

              Thank you,
              Mohsin



              • #8
                That sounds like a good plan. I also suggested coldiag2. I take it by "dependent" you mean "independent". My objection is to presenting "variance inflation factors" which don't actually measure variance inflation.
                Steve Samuels
                Statistical Consulting

                Stata 14.2



                • #9
                  Thank you, Steve. I am in a bit of a time crunch, so I decided to make some changes to the thesis to focus on things that I am already comfortable with. And yes, indeed, I meant independent variables.
