Hello everyone,
I'm working with a binary outcome variable and have encountered an interesting issue regarding model specification using both logit and probit regression models in Stata. I have a specific predictor variable that I've found significant in both models. Initially, the link test was significant for both the logit and probit models, indicating potential misspecification.
When I include a polynomial term of this predictor in the probit model, the link test becomes insignificant, suggesting that the non-linearity may be adequately captured. However, in the logit model, the link test remains significant even after adding the polynomial term. Interestingly, when I include only the polynomial term in the logit model (excluding the base term), the sign flips and the polynomial term's coefficient becomes positive, but when both the polynomial and the base term are included, both coefficients are negative. Additionally, the probit and logit adjusted R2 values are very comparable: the lowest R2 comes from the model with only the polynomial, followed by the model with only the base term, and the highest R2 comes from the model with both the base term and the polynomial. The predictor variable also appears in an interaction with another continuous variable, VIF shows no problem with multicollinearity, and all continuous variables are centered where relevant. Furthermore, I have tried many different specifications, including polynomials of other variables and interaction terms with and between other variables, but the source of non-linearity is clearly the variable I'm discussing.
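For concreteness, here is a minimal sketch of the sequence described above. The variable names are hypothetical placeholders: y is the binary outcome, x is the predictor in question, and z is the other continuous variable. I am also assuming the polynomial is a squared term and that x and z are already centered; the real models contain further covariates that are not shown here.

```stata
* Hypothetical names: y (binary outcome), x (predictor in question),
* z (other continuous variable). x and z are assumed already centered.
generate x2 = x^2      // squared term (assumed form of the polynomial)
generate xz = x*z      // interaction of x with z

* Base specification; linktest refits the model on _hat and _hatsq,
* and a significant _hatsq is the sign of possible misspecification
logit y x z xz
linktest

probit y x z xz
linktest

* Add the squared term and repeat the link test for each model
logit y x x2 z xz
linktest

probit y x x2 z xz
linktest
```

In each linktest output, the row to look at is _hatsq: this is the test that becomes insignificant for the probit model but stays significant for the logit model.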
Given this context, how should I interpret the significance of the link test in both models? Why might the probit model capture the relationship better with the polynomial while the logit model does not? Also, is a 10% significance level too lenient for model evaluation in this scenario? Any insights or recommendations on how to proceed, which model to select or any other relevant matters would be greatly appreciated!
Thank you!