
  • Problems with multilevel multinomial logistic regression syntax

    Hello Everyone,
    I am working on the determinants of employment status choice and tried a multilevel multinomial logistic regression, but I am not sure I am doing it correctly. First, I tried the simplified, non-multilevel version of the command:
    gsem ( 2.employment_status <- i.dist i.sex schooling Total_hrs occupation industry), mlogit nocapslatent
    Stata converged to a result. Then I used the multilevel version:
    gsem ( 2.employment_status <- i.dist i.sex schooling Total_hrs occupation industry M1[hhid]@1), mlogit nocapslatent latent(M1)
    Stata converged on this as well.
    I put the number 2 in front of employment_status because Stata did not converge when I typed i.employment_status; instead I used the given category number of the dependent variable. I am therefore not sure whether the command above makes any difference.
    Second, I used hhid (household number) because individuals are naturally nested within households.
    I think households are also nested within districts, and here I am stuck. This looks like a three-level model. If so, how do I estimate it? This is my first time working with this model and I am not sure I am on the right path.
    Last edited by Sabina Thapa Magar; 13 Aug 2019, 15:44.

  • #2
    Hello again; I am updating my question from above.
    I have household data. My dependent variable is employment status, a categorical variable with 5 choices, and I want to run a multilevel multinomial logistic regression. I first ran the simple version of the command:
    gsem ( i.employment_status <- i.sex age schooling Total_hrs occupation industry), mlogit nocapslatent

    Now I want to fit a three-level model: level 1 = individual, level 2 = household, and level 3 = district. Here I am not quite sure about the Stata command. I ran the command below, but I am not sure it is correct:
    gsem ( i.employment_status <- i.sex age schooling Total_hrs occupation industry M1[ newdist ]@1 M2[ newdist >hhid]@2), mlogit nocapslatent latent(M1 M2)

    newdist = 75 districts
    hhid = household numbers

    Could anyone please help me with the command for three-level hierarchical data? Moreover, is it better to perform step-wise?
    Last edited by Sabina Thapa Magar; 14 Aug 2019, 15:41.



    • #3
      Your syntax is almost correct. It should be:
      Code:
      gsem ( i.employment_status <- i.sex age schooling Total_hrs occupation industry M1[ newdist ]@1 M2[ newdist >hhid]@1), mlogit nocapslatent latent(M1 M2)
      When doing random intercepts in -gsem-, the coefficient for the random intercept should always be constrained to 1, no matter how many of them there are. The underlying equation for a nested random intercept model with three levels looks like this:
      Code:
      outcome = (function of) b0 + b1*x1 + ... + bn*xn + ui + vij + eijk
      Notice that the coefficients of ui and vij are both 1. That is what the @1 in the syntax tells -gsem-: that the coefficients of ui and vij must be constrained to be 1. By setting one of them to @2, you are estimating a different model, outcome = (function of) b0 + b1*x1 + ... + bn*xn + 2*ui + vij + eijk, and although it could be interpreted, asking Stata to estimate half the value of the random intercept seems pointless.
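      To make the @1 constraint concrete, here is a minimal sketch in Python/NumPy (not Stata; all numbers are invented for illustration) of the linear predictor for a three-level nested random-intercept model. It shows that putting a coefficient of 2 on a random intercept does not describe a different data structure; it just forces the estimated intercept to absorb a rescaling:

```python
import numpy as np

rng = np.random.default_rng(0)

# Invented dimensions: 5 districts, 10 households each, 3 individuals each.
n_dist, hh_per_dist, ind_per_hh = 5, 10, 3

# Level-3 (district) and level-2 (household) random intercepts,
# each entering the linear predictor with coefficient 1 (the @1 constraint).
u = rng.normal(0, 0.5, n_dist)                      # u_i
v = rng.normal(0, 0.3, (n_dist, hh_per_dist))       # v_ij

b0, b1 = 0.2, 0.8                                   # invented fixed effects
x = rng.normal(size=(n_dist, hh_per_dist, ind_per_hh))

# eta_ijk = b0 + b1*x_ijk + 1*u_i + 1*v_ij  -- the model gsem fits with @1
eta = b0 + b1 * x + u[:, None, None] + v[:, :, None]

# With @2 on the district intercept, the model is b0 + b1*x + 2*M1 + v_ij;
# to describe the same eta, the latent M1 must equal half the district effect.
eta_at2 = b0 + b1 * x + 2 * (u / 2)[:, None, None] + v[:, :, None]

print(np.allclose(eta, eta_at2))  # True: same eta, rescaled latent intercept
```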

      (I think perhaps you got the idea of using @1 and @2 from syntax used for estimating latent growth models. But that is a different thing altogether and those coefficients are applied to something entirely different.)


      Moreover, is it better to perform step-wise?
      I'm not sure what you mean by performing this stepwise. If you mean would it be better to try it with two levels first before going on to three, it might be. For example, if you run into convergence problems with the three level model, you might be able to overcome those by running the two-level model and using those results as starting estimates for the three level and get convergence.

      If you mean, should you use stepwise variable selection, the answer is a resounding no for a variety of reasons:

      1. Most important, stepwise variable selection is a statistically nonsensical procedure that produces models that don't replicate and provides p-values that are beyond meaningless. See https://www.stata.com/support/faqs/s...sion-problems/ for a succinct explanation of the problems with stepwise variable selection. In my vehemently held opinion, it should never be used to draw conclusions: at best it might occasionally be helpful for exploratory data analysis, though I think it would always be my last resort even for that.

      2. The Stata -stepwise- command does not support -gsem-, so you would have to, in effect, write your own specialized -stepwise- command to implement it.



      • #4
        Thank you very much for your help; I understood your explanation. I will change my syntax and constrain the coefficient to 1. Regarding stepwise: yes, I meant stepwise variable selection. Thank you for the clarification; I will follow your suggestion.
        I have one last question. Do you think running a multilevel model is a better idea for studying household data? Will it be more effective to treat household and district as random effects?

        Thanks a lot again. I really appreciate your time and effort.



        • #5
          Do you think running a multilevel model is a better idea for studying household data? Will it be more effective to treat household and district as random effects?
          Better or more effective than what alternative? As compared to a non-hierarchical multinomial logistic regression model, it's much better. Ignoring the hierarchical structure of the data would, at best, give you distorted standard errors and confidence intervals.



          • #6
            Yes, as compared to a non-hierarchical model that ignores the variation across districts and households.
            Thank you very much for the clarification; I was really confused before. Your comments have given me more confidence in using this model. Once again, I am very thankful for your contribution.

            Best regards,
            Sabina



            • #7
              Originally posted by Clyde Schechter:
              Your syntax is almost correct. It should be:
              Code:
              gsem ( i.employment_status <- i.sex age schooling Total_hrs occupation industry M1[ newdist ]@1 M2[ newdist >hhid]@1), mlogit nocapslatent latent(M1 M2)
              Hi Clyde,

              I found the answer in this post helpful for my question too, but I would like to ask where I should insert the weights in this code. I am aware that in a survey design it would be wrong to assume equiprobability of participants or households being chosen.


              I am working with data from the Survey of Health, Ageing and Retirement in Europe (SHARE), where individuals are nested in households and the latter are nested in countries. When I was considering a binary outcome I wrote the following code, including weights:

              melogit depressed agec i.sex i.partner i.edu i.work i.social i.cac010_ i.cac013_ i.sph i.chroni2 i.newchronic death_c string_c gini_c gdp_c [pw=cciw_w8_ca] || country: || hhid8:, pw(cchw_w8_ca)


              Later, I had to reconsider the choice of outcome, so now I need to use a nominal outcome, and in that case I don't know where to insert the weights in the code.

              gsem (i.nominal_dep <- agec i.sex i.partner i.edu i.work i.social i.cac010_ i.cac013_ i.sph i.chronic2 i.newchr string_c death_c gdp_c gini_c M1[country]@1 M2[country>hhid]@1), mlogit nocapslatent latent(M1 M2)



              • #8
                I don't know. I have no experience with -gsem- emulating multilevel models when there are different weights to be used at multiple levels. The -gsem- syntax I am familiar with allows including a weighting variable, but I don't think it accepts weights at multiple levels.

                I hope somebody else will respond, as I would like to learn this myself.



                • #9
                  Though I have never done this myself, my reading of the help file
                  Code:
                  help gsem_estimation_options
                  leads me to believe this is possible; here is a quote:
                  pweights(varlist), fweights(varlist), and iweights(varlist) specify
                  weight variables to be applied from the observation (first) level
                  to the highest level groups. Only one of these options may be
                  specified, and none of them are allowed with crossed models or
                  intmethod(laplace).
                  and there is more detail there.



                  • #10
                    Thank you! I will try it and let you know if and how it works. For now, Stata is still running the model without weights (over 3 hours now).



                    • #11
                      Hello,

                      I was wondering if someone could help me with a question I also have regarding multilevel multinomial logistic regression. I am new to running this kind of model in Stata and could use some guidance. I have a data structure where patients are nested within counties. My outcome variable is nihss_cat3 (4 different categories of stroke). Currently my code for running this model is:
                      Code:
                      gsem (i.nihss_cat3<- age i.sex i.new_race##i.dual elx_tot_grp_nihss i.ccw_stroke_hist i.thrombectomy i.hypertnsn i.alcohol i.diab i.obese i.hprlpdma i.tobaco i.np_phys_prop_county_rank i.pct_pov_county_rank i.pct_hseduc_county_rank i.county_stroke_belt M1[bene_county]@1), mlogit
                      Question 1: Does the above code look correct? Is there anything else I should add to this to improve the model in any way?

                      After running the model, I follow it with:
                      Code:
                      ereturn display, eform(or)
                      Question 2: I want my results displayed as odds ratios. From the Stata manual, I am a little confused as to whether the above command gives me odds or odds ratios. And if it does not give odds ratios, how do I get them?

                      Question 3: My last question is regarding the interaction term new_race#dual. The new_race variable has 4 categories: Non-Hispanic White, Non-Hispanic Black, Hispanic, Other races. The variable dual has two categories: Non-Duals and Duals. From the interaction term, I want my base category to be Whites who are non-Duals, and I want to compare all the other 7 categories to the base. However, the interaction term only returns estimates for Black-Duals, Hispanic-Duals and Other-Duals. What command will get me the odds ratio estimates for all the following (with White-nonDuals as the base): White-Duals, Black-Duals, Black-nonDuals, Hispanic-Duals, Hispanic-nonDuals, Other-Duals, Other-nonDuals?

                      Thank you in advance for your time and assistance.

                      Best,
                      Indrakshi



                      • #12
                        Question 1: It is syntactically correct. Whether it is a correct model in terms of accurately reflecting the real-world data-generating process: probably not. Simple models seldom are. But what you might need to do to make it a better model is not a statistical or Stata question: it is a question about the underlying substantive science and the meaning of those variables.

                        Question 2: Yes, because you have specified the -eform()- option, the output from ereturn display will show odds ratios for all of the exogenous variables.

                        Question 3: When working with an interaction, just as ordinary indicator ("dummy") variables have an omitted category, the interaction has an omitted category for each of the variables in the interaction--it does not have only a single reference category. So if you have an interaction i.X##i.Y, and X has nx levels, and Y has ny levels, then there will be nx-1 indicators for X and ny-1 indicators for Y, so the number of interaction terms will be (nx-1)*(ny-1). This is exactly what you see in your situation, where nx = 4 and ny = 2: you get 3*1 = 3 interaction terms reported. Now, you ask "What command will get me the odds ratio estimates for all the following (with White-nonDuals as the base)?" The first thing to recognize is that while the outputs from -ereturn display, eform(or)- for the variables are odds ratios, the outputs for the interaction terms are not odds ratios. They are ratios of odds ratios. A ratio of odds ratios is the multiplicative analog of a difference in differences.

                        Added: Actually, in a multinomial model, there are no odds ratios at all: the correct terminology is relative risk ratio. An odds ratio is, strictly speaking, only defined in a binomial (logistic) model. And the things shown for the interaction terms are ratios of relative risk ratios. But hopefully no confusion arises here from using the term odds ratio or ratio of odds ratios.

                        So, let's clear up what the odds ratios you did get mean. For the moment, just focus on one level of the outcome--the fact that you have a multinomial rather than binomial outcome variable doesn't really change this aspect of interpretation, and there is no point in repeating the same language for each multinomial level. White is your race reference category. Non-Dual is your dual reference category. Now, what does the odds ratio given for, say, Black mean? It is not the Black:White odds ratio. In fact, by virtue of having adopted an interaction model, you have stipulated that there is no such thing as the Black:White odds ratio. There are, in fact, four Black:White odds ratios, depending on whether the Black and White under consideration are Dual or non-Dual. The odds ratio that is labeled Black is in fact the Black:White odds ratio conditional on both of them being non-Dual. Similarly, the odds ratio shown for Dual is not the Dual:non-Dual odds ratio because, again, there is no such thing in an interaction model. The odds ratio shown for Dual is the Dual:non-Dual odds ratio conditional on both of them being White. To get odds ratios that are not directly shown, you have to multiply the shown odds ratios to get the exact combination you want. For example, if you want the odds ratio for Black Dual compared to White non-Dual, that would be the OR labeled Black times the OR labeled Dual times the OR labeled Black#Dual. You will see that all of the 7 odds ratios compared to White non-Duals can be calculated in this way, either as odds ratios directly shown or as a product of 2 or 3 of them. Now, you don't have to do these multiplications by hand. The -lincom- command will do them for you, and given that you want the results in odds ratio metric, you specify the -or- option in -lincom-. Do read -help lincom- and the PDF manual section on it for more information about using -lincom-.
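                        The multiplication works because the displayed ORs live on a multiplicative (exponentiated) scale, while the underlying model coefficients are additive on the log-odds scale. A minimal Python sketch with hypothetical coefficient values (not from any real model):

```python
import math

# Hypothetical log-odds-scale coefficients for one outcome level (illustration only).
b_black, b_dual, b_black_dual = 0.40, 0.25, 0.10

or_black = math.exp(b_black)       # Black:White OR among non-Duals
or_dual = math.exp(b_dual)         # Dual:non-Dual OR among Whites
ror = math.exp(b_black_dual)       # ratio of odds ratios (the interaction term)

# OR for Black Dual vs White non-Dual = product of the three displayed quantities,
# because log-odds are additive: exp(b1 + b2 + b12) = exp(b1)*exp(b2)*exp(b12).
combined = or_black * or_dual * ror
print(math.isclose(combined, math.exp(b_black + b_dual + b_black_dual)))  # True
```

This is exactly the sum that -lincom- with the -or- option computes and exponentiates for you.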

                        Moving beyond your original question, I want to suggest a different way of looking at your results. Because, frankly, I can never really understand the outputs of multinomial models presented as odds ratios, and I have never met anyone else who can either, though I suppose there are a handful of people out there who can. I strongly suggest that you look at your model in terms of the model's predicted probabilities of each of the outcome levels conditional on all combinations of new_race and dual. The command -margins new_race#dual- will do that. (Note #, not ##, in this command. Also, be patient--the calculations can be slow in a large data set.) You will get these, along with standard errors and confidence intervals, for all 8 combinations of race and dual, in combination with each possible outcome level. You may find them surprising. Sometimes an odds ratio is > 1 but the corresponding probabilities are in the opposite order. Don't be surprised if that happens. One of the complications of multinomial models is that the sum of the predicted probabilities of all outcome levels for any given combination of predictors must be 1. So sometimes, even though an odds ratio makes you think that a certain probability should be higher than another, that constraint on the total being 1 may not "leave room" for that to happen: another category with an even bigger odds ratio may "crowd out" the expansion. But probabilities are much easier to understand than odds ratios, and -margins- gives them to you in a straightforward way: you don't have to figure out what pieces to put together, and you don't have to use -lincom- to do that. And the best part is that whereas conclusions drawn by looking at the odds ratios may be misleading, as just explained, the conclusions drawn from probabilities are not.
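                        The "crowding out" point can be illustrated numerically. In this minimal Python sketch with invented linear predictors (softmax is the inverse link of the multinomial logit, with the base category's predictor fixed at 0), category 1's odds against the base rise from group A to group B, yet its predicted probability falls because category 2 expands even more:

```python
import math

def probs(etas):
    """Multinomial predicted probabilities from linear predictors (softmax)."""
    exps = [math.exp(e) for e in etas]
    total = sum(exps)
    return [e / total for e in exps]

# Three outcome categories; the base category has eta fixed at 0.
p_a = probs([0.0, 1.0, 1.0])   # group A
p_b = probs([0.0, 1.5, 3.0])   # group B: both non-base etas are higher than in A

# Odds of category 1 vs the base rise (exp(1.5) > exp(1.0)), i.e. OR > 1 ...
odds_ratio_cat1 = (p_b[1] / p_b[0]) / (p_a[1] / p_a[0])

# ... yet category 1's predicted probability goes DOWN, crowded out by category 2.
print(round(p_a[1], 3), round(p_b[1], 3), round(odds_ratio_cat1, 3))
# → 0.422 0.175 1.649
```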


                        Last edited by Clyde Schechter; 19 Jan 2022, 23:57.



                        • #13
                          Hi Clyde, thank you so much for your detailed response; it was very, very helpful.

                          As for the last part, I completely agree with you on reporting predicted probabilities vs. odds ratios. While I strongly prefer predicted probabilities over odds ratios for such models, my team members, who are from a more clinical background, are more familiar with odds ratios and are of the opinion that clinical journals will not be favorable to predicted probabilities. I shall try to convince them one more time.

                          Thank you again for your response.

