
  • Multicollinearity, VIF and interaction

    Hello
    I have read many posts (questions and responses) on multicollinearity, VIF and interaction, but I still have not found an answer to my question.
    Background: I am using a large survey data set and want to test for multicollinearity between two independent variables (both are dummies). My outcome (dependent) variable is also a dummy.

    My questions are
    (Q1) Can I use VIF in this situation? Many posts here and many publications use VIF, but mostly with continuous outcome (dependent) variables.
    I am also aware that an interaction term can be used to check for collinearity, but the examples I have seen mostly use "regress", again with continuous outcome variables.
    (Q2) For now, I have used an interaction term to see whether the two variables interact.
    Here is my code. Is it correct for checking interaction when the outcome is a dummy variable and the two independent variables being checked for collinearity are also dummies?

    . svy, subpop(if COUTYP4==3): logistic illyr i.new_amdeyr##i.anysuideation2, or

    Below is the output.
    -------------------------------------------------------------------------------------------
                        illyr | Odds ratio   std. err.      t    P>|t|     [95% conf. interval]
    --------------------------+----------------------------------------------------------------
                   new_amdeyr |
                          Yes |   2.368648   .1665209    12.27   0.000     2.056722    2.727881
             1.anysuideation2 |   4.202448   .3737046    16.14   0.000     3.515055    5.024265
                              |
    new_amdeyr#anysuideation2 |
                        Yes#1 |   .5716726   .0724573    -4.41   0.000     .4431859    .7374096
                              |
                        _cons |    .155493   .0033261   -87.01   0.000     .1489538    .1623192
    -------------------------------------------------------------------------------------------



    (Q2.1) If it is not correct, can anyone suggest the correct code?

    (Q3) If it is correct, can the same code be used to check for interaction when the outcome is a categorical variable and the independent variables being checked for collinearity are dummy or categorical variables?


    Thank you so much.
    Last edited by Wah Myint; 10 Jan 2023, 21:30.

  • #2
    Wah:
    Foreword: I have never worked with survey data myself. Therefore, please consider what follows as a tentative reply.
    That said, I would rely on -estat vce, corr- to grasp the correlation patterns:
    Code:
    . use https://www.stata-press.com/data/r17/nhanes2b.dta
    
    . svy: logistic highbp height weight age female
    (running logistic on estimation sample)
    
    Survey: Logistic regression
    
    Number of strata = 31                            Number of obs   =      10,351
    Number of PSUs   = 62                            Population size = 117,157,513
                                                     Design df       =          31
                                                     F(4, 28)        =      368.33
                                                     Prob > F        =      0.0000
    
    ------------------------------------------------------------------------------
                 |             Linearized
          highbp | Odds ratio   std. err.      t    P>|t|     [95% conf. interval]
    -------------+----------------------------------------------------------------
          height |   .9657022   .0051511    -6.54   0.000     .9552534    .9762654
          weight |   1.053023   .0026902    20.22   0.000     1.047551    1.058524
             age |   1.050059   .0019761    25.96   0.000     1.046037    1.054097
          female |   .6272129   .0368195    -7.95   0.000     .5564402     .706987
           _cons |    .716868   .6106878    -0.39   0.699     .1261491    4.073749
    ------------------------------------------------------------------------------
    Note: _cons estimates baseline odds.
    
    . estat vce, corr
    
    Correlation matrix of coefficients of logistic model
    
                 | highbp                                           
            e(V) |   height    weight       age    female     _cons 
    -------------+--------------------------------------------------
    highbp       |                                                  
          height |   1.0000                                         
          weight |  -0.6406    1.0000                               
             age |   0.4553   -0.0988    1.0000                     
          female |   0.5294   -0.1320    0.4688    1.0000           
           _cons |  -0.9741    0.4880   -0.5754   -0.6242    1.0000 
    
    .
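
    Applied to the model from the original post, the same idea might look like the sketch below. It assumes that the data are already -svyset- and reuses the variable names and subpopulation condition from post #1; the -svy: tabulate- step is an addition that only describes how the two dummies overlap. None of this has been run on the poster's data.
    Code:
    * Assumption: the data are already svyset, as in the command from post #1
    svy, subpop(if COUTYP4==3): logistic illyr i.new_amdeyr##i.anysuideation2
    
    * Correlations among the estimated coefficients, including the interaction term
    estat vce, corr
    
    * Design-based cross-tabulation of the two dummies in the same subpopulation
    svy, subpop(if COUTYP4==3): tabulate new_amdeyr anysuideation2, row
    
    Some correlation between the main-effect coefficients and the interaction coefficient is expected by construction, so it should not be read on its own as evidence of a collinearity problem.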
    Kind regards,
    Carlo
    (Stata 19.0)



    • #3
      Carlo: Thank you so much. I appreciate it. I will try this way.
