Dear Stata forum,
I have imputed a data set consisting of continuous and binary variables, and I am building a conditional logistic regression model of the independent variables associated with recurrence of TB infection (recurrence being my dependent variable). I believe that some of the variables are highly correlated, e.g. interruption of drug treatment and reaction to medication. When I search online for methods to detect collinearity and multicollinearity, papers suggest using the VIF, the condition index, and/or treating an unexpected direction of association between the outcome and an explanatory variable as an important sign of collinearity and multicollinearity (http://www.nature.com/bdj/journal/v199/n7/full/4812743a.html). Using the last recommendation, I believe I have detected collinearity, but I cannot use the VIF or the condition index with multiply imputed data. Is there a better approach to assessing my conditional logistic regression model for collinear and multicollinear variables when working with multiply imputed data?
Many thanks for your help