Hello Statalist community,
I'd like your advice on whether to treat a specific independent variable in my model as continuous or as ordinal. From a purely theoretical perspective, I'd say that the variable in question, gorigin (5 ordered groups of social origin), should be treated as an ordered categorical variable. However, when I compare the fit of a first model that treats gorigin as continuous (c.gorigin) with a second model that uses factor-variable notation to treat it as categorical (i.gorigin), the likelihood-ratio test indicates that the second model does not fit my data significantly better than the first.
The test I ran relies heavily on a technical paper by Richard Williams of the University of Notre Dame: https://www3.nd.edu/~rwilliam/stats3...ndependent.pdf . I did essentially the same thing, only applied to my own dataset. As in Richard's example, my data show no significant difference between the two models.
His example can be easily replicated using the code given below.
My question, which may be a naive one, is whether I should stick with my original plan and treat the groups of social origin as ordinal, or follow my data and treat the variable as continuous. I'm concerned about the correctness of my approach. Could a reviewer tell me that it is a mistake to treat gorigin as ordinal when the test clearly showed that I can go either way? Isn't it the more conservative approach not to treat an ordered categorical variable as continuous?
Code:
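* Load the example dataset
webuse nhanes2f, clear
* Model 1: health entered as a continuous (linear) predictor
logit diabetes c.health, nolog
est store m1
* Model 2: health entered as a set of category indicators
logit diabetes i.health, nolog
est store m2
* Likelihood-ratio test of m1 (nested) against m2
lrtest m1 m2, stats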
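Applied to gorigin, the same comparison can be sketched as follows. This is only a sketch: the outcome name y and the use of logit are placeholders assumed for illustration, and the stored names m_lin and m_cat are arbitrary.

```stata
* Sketch only: y is a hypothetical binary outcome, not a variable from the post
* Linear specification: one slope for gorigin
logit y c.gorigin, nolog
est store m_lin
* Categorical specification: one indicator per non-base category of gorigin
logit y i.gorigin, nolog
est store m_cat
* LR test of the linear model (nested) against the categorical model
lrtest m_lin m_cat, stats
```

With 5 categories, the categorical model estimates 4 indicator coefficients where the linear model estimates 1 slope, so the test has 3 degrees of freedom. A non-significant result means the data do not reject the equal-spacing constraint; it does not by itself show that the linear specification is the correct one.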
Any input from your side would be highly appreciated! I'm a bit afraid of making a mistake here, even though I think my approach is theoretically sound.
Thanks
Patrick