Hello everybody!
I have a question about the multinomial logit model.
I have a dataset of 25,000 observations (individuals), and I am using a multinomial logit regression to test the effect of some control variables (age, income, etc.) on a dependent variable denoting which of 5 heating systems each individual chose (gas, electric, wood, carbon, other). The problem is that in some cases an individual chose more than one category (e.g., gas and wood; or gas, carbon and other), so I was wondering how to treat these special cases.
According to the theory behind the multinomial logit model (if I'm not wrong), the choices in the dependent variable are assumed to be independent of each other.
To cope with this issue, I thought about creating additional choice categories that cover the multiple responses (e.g., gas and electric; gas, carbon and other; ...), but if I proceed this way the overall number of alternatives increases considerably! Luckily, there are only a few cases where a person selects more than one alternative, but I still think this is a problem.
Alternatively, I thought about adding a dummy variable to the controls that flags individuals who selected more than one alternative (although I am not sure whether this approach is correct).
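To make the two ideas above concrete, here is a minimal sketch in Python. It is only an illustration: the toy records and the names `combined_category` and `multi_dummy` are made up for this example and are not my actual data or variables.

```python
from itertools import combinations
from collections import Counter

SYSTEMS = ["gas", "electric", "wood", "carbon", "other"]

# Upper bound on combined categories: every non-empty subset of the 5 systems
n_combined = sum(
    1 for k in range(1, len(SYSTEMS) + 1) for _ in combinations(SYSTEMS, k)
)

# Hypothetical toy records: the set of systems each individual reported
records = [
    {"gas"},
    {"gas", "wood"},
    {"electric"},
    {"gas", "carbon", "other"},
    {"gas", "wood"},
]

def combined_category(chosen):
    """Option 1: one label per distinct combination of systems.

    The label follows the order in SYSTEMS, so the same combination
    always gets the same label.
    """
    return "+".join(s for s in SYSTEMS if s in chosen)

labels = [combined_category(r) for r in records]

# Option 2: a 0/1 dummy flagging individuals with more than one system
multi_dummy = [int(len(r) > 1) for r in records]

# How many individuals fall into each combined category
observed = Counter(labels)

print(n_combined)
print(labels)
print(multi_dummy)
print(observed)
```

With 5 systems there are at most 31 non-empty combinations, but only the combinations that actually occur in the data would need to become categories, which is why tabulating `observed` first seems useful to me.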
If you could provide some help on this, I would be extremely grateful.
Thank you very much.
With best regards,
Kodi