Multinomial Logistic Regression

andy macrobarty

Join Date: Jan 2017

Posts: 65
#1

Multinomial Logistic Regression

14 Aug 2017, 11:54

In multinomial logistic regression, we can apply the robust or cluster option.

Do we need to set up a panel to make use of the robust option?

mlogit $ylist $xlist ib3.Industry11_num ib1.advisor11_num, robust
mlogit $ylist $xlist ib3.Industry11_num ib1.advisor11_num, vce (cluster firmid)

Without any robust or cluster option, it changes from LR chi2(69) to Wald chi2(69) (the values become 20,000 etc.). Is this a matter of concern? Sometimes I even get blank values with no cluster option. Is it something to worry about?

Hope to get a reply soon.

Regards,
Andy
Clyde Schechter

Join Date: Apr 2014

Posts: 29959
#2

14 Aug 2017, 13:10

Without any robust or cluster option, it changes from LR chi2(69) to Wald chi2(69) (the values become 20,000 etc.). Is this a matter of concern? Sometimes I even get blank values with no cluster option. Is it something to worry about?

This description does not provide enough information to answer the question. Please show the entire output you get from Stata. When doing so, please use code delimiters so that it aligns in a readable way. (If you are not familiar with code delimiters, please read FAQ #12 for instructions.)
Comment
Lina Saltik

Join Date: Aug 2017

Posts: 4
#3

15 Aug 2017, 04:38

Hello guys, I am trying to run a multinomial logistic regression to investigate the determinants of the availability of essential medicines (the dependent variable has 4 categories: very low, low, middle, and high availability). I have to admit that I am relatively inexperienced in econometrics and only started working with Stata a few months ago. I hope you can help me with the following problem.

I have created my own dataset, and it has a sample size of only 55 countries. Still, I was hoping to use mlogit to see which of the independent variables (about 20, including dummy variables, continuous variables, and indexes) have an impact on the availability of medicines. Unfortunately, I discovered that almost none of the independent variables is statistically significant. I have already tried reducing the number of categories to 3, but the result was the same. Then I reduced it to 2 and ran a logit, and the same thing happened: nothing.

I am pretty sure that my dataset is consistent, i.e., that the data for the variables I included are correct, so I do not know what I did wrong. Do you think that the relatively small sample size is the problem? To give you an idea, please see some of the miserable outputs below. I would be very happy to get some help.

Lina

Multinomial logistic regression Number of obs = 34
LR chi2(12) = 15.23
Prob > chi2 = 0.2292
Log likelihood = -37.368318 Pseudo R2 = 0.1693

------------------------------------------------------------------------------
Availab~bgen | Coef. Std. Err. z P>|z| [95% Conf. Interval]
-------------+----------------------------------------------------------------
Very_low |
Corruption | 1.807573 .9688877 1.87 0.062 -.0914117 3.706558
IPR | -.3211244 .920429 -0.35 0.727 -2.125132 1.482883
ICESCR_rat | -.5113196 1.227471 -0.42 0.677 -2.917118 1.894479
GINI | .0420208 .0725674 0.58 0.563 -.1002087 .1842502
_cons | .0384586 3.580522 0.01 0.991 -6.979235 7.056152
-------------+----------------------------------------------------------------
Low | (base outcome)
-------------+----------------------------------------------------------------
Middle |
Corruption | .3591732 .963486 0.37 0.709 -1.529225 2.247571
IPR | -1.849635 1.195705 -1.55 0.122 -4.193174 .4939039
ICESCR_rat | .953146 1.417671 0.67 0.501 -1.825438 3.73173
GINI | .0395393 .0860171 0.46 0.646 -.129051 .2081297
_cons | 2.595281 3.444002 0.75 0.451 -4.154839 9.345401
-------------+----------------------------------------------------------------
High |
Corruption | -.0180616 1.541592 -0.01 0.991 -3.039527 3.003404
IPR | .7446991 .978602 0.76 0.447 -1.173326 2.662724
ICESCR_rat | 15.49492 2941.657 0.01 0.996 -5750.046 5781.036

Multinomial logistic regression Number of obs = 25
LR chi2(8) = 11.94
Prob > chi2 = 0.1538
Log likelihood = -11.722996 Pseudo R2 = 0.3375

----------------------------------------------------------------------------------
Affordabili~vgen | Coef. Std. Err. z P>|z| [95% Conf. Interval]
-----------------+----------------------------------------------------------------
not_affordable | (base outcome)
-----------------+----------------------------------------------------------------
affordable |
GDP_PC | -.0011941 9.933169 -0.00 1.000 -19.46985 19.46746
Corruption | -91.71389 27916.48 -0.00 0.997 -54807.01 54623.59
pov_gap_national | -2.413926 2023.87 -0.00 0.999 -3969.125 3964.298
unemployment | -2.487889 2806.815 -0.00 0.999 -5503.745 5498.769
_cons | -82.77787 32532.14 -0.00 0.998 -63844.6 63679.05
-----------------+----------------------------------------------------------------
very_affordable |
GDP_PC | -.0004514 .0004038 -1.12 0.264 -.0012428 .0003401
Corruption | -.5333177 1.380451 -0.39 0.699 -3.238951 2.172316
pov_gap_national | -.1266533 .11099 -1.14 0.254 -.3441896 .0908831
unemployment | -.008122 .0945659 -0.09 0.932 -.1934677 .1772238
_cons | .6709298 1.900513 0.35 0.724 -3.054007 4.395866
----------------------------------------------------------------------------------

Last edited by Lina Saltik; 15 Aug 2017, 04:59.
Comment
Richard Williams

Join Date: Apr 2014

Posts: 4947
#4

15 Aug 2017, 07:13

Hi Lina. Welcome to Statalist. Some comments:

First off, your output would be much easier to read with code tags; see pt. 12 in the FAQ. As it is, 2 or more consecutive spaces get stripped down to one space, so things don't line up correctly.

Most things I've read say you should have at least 100 cases for a maximum likelihood analysis, and more if the model is complex. Further, you said you had 55 countries, but only 34 show up in your first table and 25 in your second, so you must have a lot of missing data.
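A minimal sketch of how one might check how much of a sample survives listwise deletion. It uses Stata's shipped auto dataset rather than the poster's data, so the variable names (price, mpg, rep78) and the helper variable nmiss are stand-ins chosen only for illustration:

```stata
* Illustrative only: auto.dta stands in for the poster's country dataset
sysuse auto, clear

* Report which of the listed variables have missing values, and how many
misstable summarize price mpg rep78

* Count the observations that are complete on all three variables,
* i.e. the estimation sample a model using them would end up with
egen nmiss = rowmiss(price mpg rep78)
count if nmiss == 0
```

Running the same two steps on the actual outcome and predictors should show which variables are responsible for the drop from 55 countries to 34 or 25.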

My guess is you'll have to settle for a simpler analysis. Use descriptive stats, maybe bivariate models. If you can fill in those missing values that may help a little too.
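One way such a simpler analysis might look is sketched below. Again this assumes Stata's shipped auto dataset as a stand-in, so foreign, rep78, price, and mpg are placeholders for the poster's own outcome and predictors:

```stata
* Illustrative only: auto.dta stands in for the poster's data
sysuse auto, clear

* Descriptive statistics of continuous predictors within outcome groups
tabstat price mpg, by(foreign) statistics(n mean sd)

* Cross-tabulation of two categorical variables with Fisher's exact test,
* which does not rely on large-sample approximations
tabulate rep78 foreign, exact

* A bivariate model: one predictor at a time instead of many at once
logit foreign mpg
```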

Finally, in the future I would suggest starting a new thread. True, your question is about mlogit but it is different than the original Q. Somebody who has read what was here before may have decided they aren't interested in any followups, so you could lose a potential audience that could help you.

-------------------------------------------
Richard Williams, Notre Dame Dept of Sociology
StataNow Version: 19.5 MP (2 processor)
WWW: https://www3.nd.edu/~rwilliam
1 like
Comment
andy macrobarty

Join Date: Jan 2017

Posts: 65
#5

16 Aug 2017, 15:57

Please find the attached document and assist me.
Attached Files

stata.docx (52.3 KB, 2 views)
Comment
Marcos Almeida

Join Date: Apr 2014

Posts: 4047
#6

16 Aug 2017, 19:16

andy macrobarty Please act according to the advice given in #2.

In short, you may use the CODE delimiters or install the SSC dataex so as to share data, command and output.
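As a rough sketch of the dataex route, assuming Stata's shipped auto dataset in place of real data (the variable list and the 1/10 range are arbitrary choices for illustration):

```stata
* Only needed where dataex is not already available; recent versions
* of Stata include it, older ones can install it from SSC
ssc install dataex

* Illustrative only: auto.dta stands in for the data to be shared
sysuse auto, clear

* Produce a copy-and-paste data example for a few variables and rows
dataex price mpg foreign in 1/10
```

The text that dataex writes to the Results window can then be pasted into a post as is, followed by the exact command and its output between CODE delimiters.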

Please also take a look at the FAQ where you will find a recommendation concerning 'foreign' extensions such as .doc and .xls.

Thanks.

Best regards,

Marcos
Comment
andy macrobarty

Join Date: Jan 2017

Posts: 65
#7

18 Aug 2017, 13:18

Can you please point me to a guide on how to do the following?

In short, you may use the CODE delimiters or install the SSC dataex so as to share data, command and output.
Comment
Richard Williams

Join Date: Apr 2014

Posts: 4947
#8

18 Aug 2017, 14:11

See the FAQ, especially pt. 12. The link is near the top of the page.

-------------------------------------------
Richard Williams, Notre Dame Dept of Sociology
StataNow Version: 19.5 MP (2 processor)
WWW: https://www3.nd.edu/~rwilliam
Comment
andy macrobarty

Join Date: Jan 2017

Posts: 65
#9

18 Aug 2017, 14:26

mlogit $ylist $xlist ib3.Industry, ib1.advisor

EPS 0.119*** -0.120** 0.051

(0.028) (0.058) (0.038)

TSR -3.673*** -2.193 -1.726

(1.326) (1.334) (1.984)

MTB 0.000 -0.001 -0.060*

Observations 1931

LR chi2 1129***

Log likelihood -1834

Pseudo R-squared 0.230

Model 2
mlogit $ylist $xlist ib3.Industry ib1.advisor, robust

1 2 3

Column 1 Column 2 Column 3

EPS 0.119*** -0.120** 0.051

(0.021) (0.058) (0.033)

TSR -3.673*** -2.193** -1.726

(1.304) (1.102) (1.642)

MTB 0.000 -0.001 -0.060**

LR chi2 (69) 19998***

Log likelihood -1834

Pseudo R-squared 0.2355

mlogit $ylist $xlist ib3.Industry ib1.advisor, vce (cluster Industry11_num)

If I do this, Prob > chi2 is blank and Wald chi2(6) is blank as well. Please assist me with that.
Comment
Marcos Almeida

Join Date: Apr 2014

Posts: 4047
#10

19 Aug 2017, 08:23

Please do really read the FAQ. You were supposed to use the CODE delimiters. To present commands and output under code delimiters, you just need to click on the A button, which you see in the top right corner of each message you write. Then click on the hashtag button. Finally, copy and paste the output between the CODE delimiters. Thanks.

That said, I wish to make a few comments. Maybe the lack of a p-value for the omnibus test in the third model is due to "issues" related to (few? single? none?) clusters. Additionally, the results presented above seem to be somewhat "edited", so to speak. For example, the comma before ib1.advisor would probably elicit an error message, yet you managed to get results... What is more, you showed the number of observations in the first part and hid it in the second part. Last but not least, the dfs for the LR chi2 are also hidden in one of the halves of the output. Not to forget, there should be a third output.
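A hedged sketch of how the cluster conjecture could be checked. It assumes the sysdsn1 example dataset (fetched with webuse, so it needs an internet connection) and clusters on site only because that variable has very few distinct values; none of this is the poster's data or model:

```stata
* Illustrative only: sysdsn1 is a Stata example dataset, and site is
* used as the cluster variable just to obtain a handful of clusters
webuse sysdsn1, clear
mlogit insure age male nonwhite, vce(cluster site)

* Compare the number of clusters with the number of coefficients
* that the model (omnibus) test has to cover
display "number of clusters:    " e(N_clust)
display "model test df:         " e(df_m)
display "rank of e(V):          " e(rank)
display "Wald chi2 (may be .):  " e(chi2)
```

When the number of clusters is small relative to the number of coefficients being tested, the cluster-robust variance matrix does not have enough rank for the joint test, which would be consistent with a blank Wald chi2 and Prob > chi2 in the header.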

This is to say, again, that the best way to get a truly helpful reply is, basically, to follow the FAQ advice.

Best regards,

Marcos
1 like
Comment
andy macrobarty

Join Date: Jan 2017

Posts: 65
#11

19 Aug 2017, 15:20

For example, even if I do this, I want my data to stay confidential; I just want to show the output to get feedback. I understand how delimiters help.
Comment

andy macrobarty

Join Date: Jan 2017
Posts: 65

#12

19 Aug 2017, 18:12

Code:

. mlogit $ylist $xlist ib3.Industry11_num ib1.advisor11_num  n,    robust

Iteration 0:   log pseudolikelihood = -2568.9947  
Iteration 1:   log pseudolikelihood = -2097.6237  
Iteration 2:   log pseudolikelihood =   -2004.56  
Iteration 3:   log pseudolikelihood = -1988.8591  
Iteration 4:   log pseudolikelihood = -1987.1632  
Iteration 5:   log pseudolikelihood = -1986.7815  
Iteration 6:   log pseudolikelihood = -1986.6959  
Iteration 7:   log pseudolikelihood = -1986.6753  
Iteration 8:   log pseudolikelihood = -1986.6711  
Iteration 9:   log pseudolikelihood = -1986.6704  
Iteration 10:  log pseudolikelihood = -1986.6703  
Iteration 11:  log pseudolikelihood = -1986.6703  

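Multinomial logistic regression                   Number of obs   =       2064
                                                  Wald chi2(63)   =   18076.25
                                                  Prob > chi2     =     0.0000
Log pseudolikelihood = -1986.6703                 Pseudo R2       =     0.2267

----------------------------------------------------------------------------------
                 |               Robust
            TSR1 |      Coef.   Std. Err.      z    P>|z|     [95% Conf. Interval]
-----------------+----------------------------------------------------------------
1                |
          epsv55 |    .124748   .0206923     6.03   0.000     .0841918    .1653042
         tsrv441 |  -3.629497    1.21833    -2.98   0.003     -6.01738   -1.241614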

Comment

andy macrobarty

Join Date: Jan 2017
Posts: 65

#13

19 Aug 2017, 18:13

Code:

. mlogit $ylist $xlist ib3.Industry11_num ib1.advisor11_num

Iteration 0:   log likelihood = -2568.9947  
Iteration 1:   log likelihood = -2130.3895  
Iteration 2:   log likelihood = -2069.8776  
Iteration 3:   log likelihood = -2064.4225  
Iteration 4:   log likelihood = -2063.7197  
Iteration 5:   log likelihood = -2063.5607  
Iteration 6:   log likelihood = -2063.5215  
Iteration 7:   log likelihood = -2063.5136  
Iteration 8:   log likelihood = -2063.5119  
Iteration 9:   log likelihood = -2063.5115  
Iteration 10:  log likelihood = -2063.5114  
Iteration 11:  log likelihood = -2063.5114  

Multinomial logistic regression                   Number of obs   =       2064
                                                  LR chi2(48)     =    1010.97
                                                  Prob > chi2     =     0.0000
Log likelihood = -2063.5114                       Pseudo R2       =     0.1968

----------------------------------------------------------------------------------
            TSR1 |      Coef.   Std. Err.      z    P>|z|     [95% Conf. Interval]
-----------------+----------------------------------------------------------------
1                |
          epsv55 |   .1319641   .0281565     4.69   0.000     .0767784    .1871498
         tsrv441 |  -3.516992   1.209637    -2.91   0.004    -5.887838   -1.146147

Comment

andy macrobarty

Join Date: Jan 2017
Posts: 65

#14

19 Aug 2017, 18:14

HTML Code:

. mlogit $ylist $xlist ib3.Industry11_num ib1.advisor11_num, vce (cluster Industry11_num)

Iteration 0:   log pseudolikelihood = -2568.9947  
Iteration 1:   log pseudolikelihood = -2130.3895  
Iteration 2:   log pseudolikelihood = -2069.8776  
Iteration 3:   log pseudolikelihood = -2064.4225  
Iteration 4:   log pseudolikelihood = -2063.7197  
Iteration 5:   log pseudolikelihood = -2063.5607  
Iteration 6:   log pseudolikelihood = -2063.5215  
Iteration 7:   log pseudolikelihood = -2063.5136  
Iteration 8:   log pseudolikelihood = -2063.5119  
Iteration 9:   log pseudolikelihood = -2063.5115  
Iteration 10:  log pseudolikelihood = -2063.5114  
Iteration 11:  log pseudolikelihood = -2063.5114  

Multinomial logistic regression                   Number of obs   =       2064
                                                  Wald chi2(6)    =          .
                                                  Prob > chi2     =          .
Log pseudolikelihood = -2063.5114                 Pseudo R2       =     0.1968

                             (Std. Err. adjusted for 9 clusters in Industry11_num)
----------------------------------------------------------------------------------
                 |               Robust
            TSR1 |      Coef.   Std. Err.      z    P>|z|     [95% Conf. Interval]
-----------------+----------------------------------------------------------------
1                |
          epsv55 |   .1319641   .0246742     5.35   0.000     .0836036    .1803246
         tsrv441 |  -3.516992   .9354278    -3.76   0.000    -5.350397   -1.683588

Comment

andy macrobarty

Join Date: Jan 2017

Posts: 65
#15

19 Aug 2017, 18:17

I have done as you said. Now, the question is which model to choose, given that I have a set of controls. Clustering by industry does not give me Prob > chi2, and Wald chi2(6) is blank as well. However, with robust I get Wald chi2(63) = 18076.25. Is that a matter of concern? I heard that robust does not make sense in non-linear models?
Comment

