Multilevel Analysis

Alejandro Torres

Join Date: Jan 2018

Posts: 152
#1

04 Jan 2018, 10:15

Hello Everyone,

I am working on my thesis, which has three models: one whose dependent variable is a proportion that includes zeros (the proportion of women on the board; there are no 1s, because no board is made up entirely of women), and two whose dependent variables are dichotomous. My independent variable is dichotomous, and I am testing its interaction with culture (Hofstede), along with several control variables.

I have measures for 44 countries over eight years (2007 - 2014) and a total of 5400 companies, in an unbalanced panel. I was asked to test a three-level HLM: firm-year, firm and country. I am not clear how to treat firm-year as a level; could it be level 1? I also have information about industry, and to me it would be more natural to think of firm, industry, country, but I was asked for firm-year. I am using Stata 15. I am now reading to understand multilevel analysis from scratch and to learn the commands for testing my models.

With all that explained, could you please help me understand how these firm-year/firm/country levels work, and point me to any paper or summary of the Stata commands for this combination? I saw the melogit command for the dichotomous variables, but what if the data are a panel? I used fracreg for the simple regression of the first model, but I don't see any mefracreg command to use.

Thank you very much for any help.

Best Regards !!
Clyde Schechter

Join Date: Apr 2014

Posts: 30100
#2

04 Jan 2018, 11:00

Well, if your data contains (at most) one observation for each firm and each year, then firm-year is just the bottom level of your data and you don't need to do anything explicit to include it in the model. It would just be:

Code:

appropriate_me_command fixed_effects || country: || firm:

If you have multiple observations for firms in each year, then that is a different story and you would need to create a variable identifying firm-year combinations (-egen firm_year = group(firm year)-) and then include that as another level under firm.
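To make that check concrete, here is a minimal sketch with made-up toy data (the values and the variable name extra are illustrative only). It shows one way to see whether firm and year jointly identify observations, and how a firm-year identifier could be built if they do not.

```stata
* Toy data (hypothetical): firm 2 appears twice in 2008
clear
input firm year
1 2007
1 2008
2 2007
2 2008
2 2008
end

* Summarize how often each firm-year combination occurs
duplicates report firm year

* Flag rows that share a firm-year with another row;
* extra == 0 everywhere would mean firm-year is the bottom level
duplicates tag firm year, generate(extra)
count if extra > 0

* Only needed when duplicates are genuine: an identifier for the extra level,
* grouping on the two variables firm and year
egen firm_year = group(firm year)

* The extra level would then sit under firm, along the lines of
*   appropriate_me_command fixed_effects || country: || firm: || firm_year:
```

If the count is zero in the real data, the two-level random part shown above (country, then firm) already describes the structure.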

Assuming that each firm belongs to only one industry, and it is the same industry at each time point in the data, then I am inclined to agree with you that this is really a four-level model, with firms nested in industries, but industries would then probably be crossed with, rather than nested in, country. Or perhaps a simpler approach would be to include industry among the fixed effects, rather than as a separate level in the model (especially if the number of industries in your data is small). I can imagine arguments for doing this either way, and it would depend on the details of your particular research question and the details of your data. If your adviser has told you to just ignore the industry level altogether, there may well be a very good reason for it, but I think you should ask him or her to explain why. Your tuition pays your advisor to teach you. Issuing directives without explaining them is not teaching. Insist on getting value for your money.

You are correct that there is no mefracreg command, and I'm not aware of any user-written command that would do multi-level fractional logistic regression either. If all you have is the proportion itself, you might just use -mixed-. If the proportions observed in the actual data come near 0 or 1, then this will have the limitation that predicted proportions might be outside the 0 to 1 range. In that case you might consider doing a transformation: -gen new_dep_var = logit(proportion)-, and run -mixed- with new_dep_var as the dependent variable. If you have, or can get, not just the proportion but the numerator and denominators that make the proportion, then you can run -melogit numerator other_variables || country: || firm:, binomial(denominator)-.
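One caveat with the transformation matters here, because the proportions described in this thread include zeros: logit(p) is ln(p/(1-p)), which is undefined at 0, so Stata's logit() function returns missing for those observations. A minimal sketch with made-up numbers (the toy values are assumptions, not the poster's data):

```stata
* Toy proportions (made up); the first one is exactly zero
clear
input proportion
0
.25
.5
end

* logit() is defined only strictly between 0 and 1, so the zero becomes missing
generate new_dep_var = logit(proportion)

* How many observations would the transformation lose?
count if missing(new_dep_var) & !missing(proportion)
list proportion new_dep_var

* Template for the real data (placeholders, not run here):
*   mixed new_dep_var other_variables || country: || firm:
```

Observations with a missing dependent variable are dropped from estimation, so it is worth counting the zeros before relying on this route; with many zeros, the numerator/denominator approach with -melogit- may be the more natural choice.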
Alejandro Torres

Join Date: Jan 2018

Posts: 152
#3

30 Jan 2018, 11:26

Dear Clyde,
Thank you very much for your answer; it is an excellent one. I apologise for not replying sooner: I was waiting for an email notification that I had received an answer.

I see your answer now, and I had done the same as you said. For the model I had fitted with fracreg, I used "xtmixed depvar indvar var1 .... industry1......industry9 year2007.....year2014 || country: || firm:" and it worked perfectly. For the binary dependent variables I tried "melogit depvar indvar var1 .... industry1......industry9 year2007.....year2014 || country: || firm:" and "meglm depvar indvar var1 .... industry1......industry9 year2007.....year2014 || country: || firm:, family(binomial) link(logit)", but I got flat-region messages and the models did not converge. I tried "technique(bfgs)" and other techniques too, with the same problem. Finally, I decided to limit the iterations to 6 or 10, which gives interesting output but without convergence.
What do you think about this? I can't see anything strange in my data, nor any missing values.
Finally, I have one observation for each firm and each year, although some firms do not have observations in every year.
As for asking, I can't find anyone near me who works with multilevel analysis; my advisor told me that someone else told him that I should do this.
Thank you very much for your time; I have followed your answers to other questions.
Best Regards.
Alejandro Torres

Join Date: Jan 2018

Posts: 152
#4

30 Jan 2018, 13:24

I keep trying different approaches. I suspected a GDP variable, because its values were large compared with the others, so I applied a cube root transformation that I read about in another post, but I received a message saying "cannot compute an improvement -- discontinuous region encountered". How serious is the problem if I set the iterations to 6? I am worried that my advisors will reject this, and I have no way to justify the decision.
Thank you so much again.
Weiwen Ng

Join Date: Jun 2015

Posts: 1241
#5

30 Jan 2018, 13:53

If the model doesn't converge, then your results are invalid. In general, you should find the manual entry on meglm, and read the section on diagnosing convergence problems. However, your syntax looks like you are including one dummy for each year. The syntax is potentially problematic if you didn't leave out one year (i.e. you need a base level). Normally, most of us would have one variable called -year-, taking on the values 2007, 2008, 2009... 2014. We'd type the command like this:

Code:

melogit depvar indvar var1 i.industry i.year || country: || firm:

So, you have year and industry fixed effects. Each firm gets a random intercept, and there's a random intercept for country also. Type

Code:

help fvvarlist

for more information on the -i.- syntax I used, but this tells Stata that the variable is a categorical variable, so it picks the first level (you can specify otherwise, though), and in the regression output you get coefficients for, say, year 2008 versus the base year (2007).
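As a sketch of the "you can specify otherwise" part: the base level can be changed with the ib prefix. The variable names are the placeholders used in this thread, the data are assumed to be in memory, and industry_id is a hypothetical name.

```stata
* i. needs a numeric variable with nonnegative integer values.
* If industry is stored as a string, encode it first (not run here):
*   encode industry, generate(industry_id)

* Default base: the lowest value of each factor (2007 for year)
melogit depvar indvar var1 i.industry i.year || country: || firm:

* Same model with 2010 as the reference year instead
melogit depvar indvar var1 i.industry ib2010.year || country: || firm:
```

Changing the base only changes which year the other coefficients are compared with; it is not expected to fix a convergence problem by itself.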

Discontinuous region is a big problem. It's as big as the model not converging.

Last, your advisor told you to use multilevel analysis, because someone told him it was a good idea. This is not very encouraging. As you can see, multilevel analysis is harder than you think. I'd strongly encourage you to find someone you can ask in person. It is hard to explain things over the Internet. Perhaps you can get your advisor to refer you to the person who told him he should use multilevel analysis.

Be aware that it can be very hard to answer a question without sample data. You can use the dataex command for this. Type help dataex at the command line.

When presenting code or results, please use the code delimiters to format them. Use the # button on the formatting toolbar, between the " (double quote) and <> buttons.
Alejandro Torres

Join Date: Jan 2018

Posts: 152
#6

30 Jan 2018, 18:08

Dear Weiwen,
Thank you very much for your answer. I am going to read about the i. prefix and will try it that way, without dummies. I appreciate the time all of you take to answer.
Finally, unfortunately the person who recommended the multilevel approach is not from my country, and it has been hard to find someone who works with multilevel analysis.
thank you very much again!!
Alejandro Torres

Join Date: Jan 2018

Posts: 152
#7

30 Jan 2018, 19:49

Hello again !!
I ran "melogit depvar indvar var1 i.industry i.year || country: || firm:" but I again got the "discontinuous region encountered" message. Is there something else I can do, please? Why does Stata run into this kind of region?
Thank you very much for your help.
Best regards
Weiwen Ng

Join Date: Jun 2015

Posts: 1241
#8

30 Jan 2018, 21:08


So, glad to hear it wasn't the dummies. I have no idea what a discontinuous region is, honestly, and I haven't seen anyone offer a good explanation either.

If not for the dummies, the next step I'd think about is fitting a simplified version of your model. For example, take out the random intercept for country. Does that converge? If so, save the coefficients like I described, and try re-fitting the model with the random intercept for country. Alternatively, is year something you can treat as continuous? That would simplify the model as well, and it would probably not be unreasonable. Another thing to do is see if you can determine which variables are problematic. Say that when fitting the model, it runs 5 iterations and then hits a discontinuous region. Try this:

Code:

melogit depvar indvar var1 i.industry i.year || country: || firm:, iterate(5)

That will make Stata run 5 iterations and then present the estimated parameters and variances. If you see a missing standard error, that can mean a problem with that variable. Or a parameter estimate going to an unreasonable value. It may help if you can present the code exactly as you typed it, plus output, within the code delimiters - it's the # button on the formatting toolbar.
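The post refers to saving coefficients without showing how. One possible reading, sketched here as an assumption rather than as the author's exact procedure, is to store the estimates from the simpler model and pass them as starting values through the from() maximization option. The matrix name b0 is arbitrary, and the variable names are the placeholders used in the thread.

```stata
* Step 1: drop the country level and fit firm random intercepts only
melogit depvar indvar var1 i.industry i.year || firm:

* Keep the estimated coefficient vector from step 1
matrix b0 = e(b)

* Step 2: refit the full model, starting from the step 1 estimates.
* skip ignores any entries of b0 whose names do not appear in this model.
melogit depvar indvar var1 i.industry i.year || country: || firm:, from(b0, skip)

* Alternative simplification: treat year as continuous (one slope)
melogit depvar indvar var1 i.industry c.year || country: || firm:
```

Step 2 is only worth running if step 1 itself converges; if it does not, the problem is unlikely to lie in the country level.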

Last edited by Weiwen Ng; 30 Jan 2018, 21:21.

Alejandro Torres

Join Date: Jan 2018

Posts: 152
#9

31 Jan 2018, 07:09

Hello Weiwen, thank you so much again.
Actually, I did what you said before, and I can't see any unreasonable output (only one industry and one year are omitted from the output). What do you think?

melogit csr lib lth llt roe bs bi lev siz Polrightinv religdiver lingdiver ethnic democracy autocracy RuleofLaw ShRights CredRight mcap lngdp femeduc genderquotas i.industry i.year ||cid: ||firm:, technique(bfgs) iterate (6)

I am sending the output.

In the meantime I am going to try year as a continuous variable.

Finally, I am not clear on what you mean by taking out the random intercept for country and saving the coefficients.

Thank you very much !!!
Alejandro Torres

Join Date: Jan 2018

Posts: 152
#10

31 Jan 2018, 07:40

You can see that I used technique(); without it, I get the message that an improvement cannot be computed because of a flat region.
Thank you
Alejandro Torres

Join Date: Jan 2018

Posts: 152
#11

31 Jan 2018, 09:23

Hello again,

I just tested a multilevel model, replacing firm with industry at the second level:
melogit depvar indvar var1 i.industry i.year || country: || industry:
I didn't have the same problem, so I wonder: could the problem be with the firm variable? I am just guessing now, but I started all of the analysis (pooled and multilevel) with "xtset firm year"; is it possible that the firm variable can't work in the multilevel model because of that? I keep getting the flat region problem in every test with firm as the second level. I would appreciate some extra help. Best regards!!
Weiwen Ng

Join Date: Jun 2015

Posts: 1241
#12

31 Jan 2018, 14:19


It is very difficult to say more without seeing your data, or at least a subset of the data. If you have Stata 15.1, type

Code:

help dataex

And if not, you can install -dataex- from SSC:

Code:

ssc install dataex

This command produces a summary of your data, which you can display on the forum in code delimiters, and people can easily copy it and run a test in their own copy of Stata. Please note, I mentioned the code delimiters before, and I'll mention them again because they're really helpful. If you hit the # button on the formatting toolbar, you will insert a pair of code delimiters, and anything you type between them will show up in a nice box like in my post.
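For illustration, a minimal call might look like the sketch below. It uses a dataset that ships with Stata so that it can be run as is; the variable list in the final comment is only a guess at what would be relevant for this thread.

```stata
* Runnable illustration with a dataset shipped with Stata
sysuse auto, clear

* Post-ready listing of three variables for the first 10 observations
dataex make price mpg in 1/10

* With the thesis data, list the model variables instead, for example:
*   dataex firm cid year industry csr lib in 1/50
```

Restricting the variable list and the observation range keeps the listing small enough to paste between code delimiters.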

I think that your xtset command looks like it should, given your description, unless there was some problem with firm or year. I also don't think that -xtset- will affect the -melogit- command.

That said, you need to think about how the data are structured. Clearly, firm-year is nested in firm, i.e. every firm has one or more observations in different years. I was assuming that firms are nested in countries. Your code above suggests that industry is nested in country. That seems wrong; it seems like countries should have one or more industries. You're accounting for that with the fixed effect for industry, but if you must put that into a random effects structure, it is probably a crossed random effect; don't try experimenting with that unless you understand what these are.
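Both nesting assumptions can be checked directly in the data. The sketch below uses made-up toy values, and the names pair and n_countries are hypothetical; with the real data, only the lines after the input block would be needed.

```stata
* Toy data (hypothetical): industry 1 appears in countries 1 and 2
clear
input firm country industry
1 1 1
1 1 1
2 1 2
3 2 1
3 2 1
end

* Firms nested in countries: assert stops with an error
* if any firm is recorded in more than one country
bysort firm (country): assert country[1] == country[_N]

* Industries nested in countries? Count distinct countries per industry;
* a count above 1 means the industry spans countries (crossed, not nested)
egen pair = tag(industry country)
egen n_countries = total(pair), by(industry)
tabulate industry n_countries
```

If industries span countries, as in the toy data, writing || country: || industry: makes Stata treat each country-industry combination as its own group, which is a different model from one with a single effect per industry.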

I asked you to take out the random intercept for country and save the coefficients to make the model a bit more simple. If it converges, that's a good sign. If it does not converge, ... I am not sure what to say, but you would probably want to take a look at your data and check for problems.

Also, I forgot to mention, try the command -meqrlogit-. It may converge where -melogit- fails to. The manual says that -meqrlogit- "may aid convergence when variance components are near the boundary of the parameter space" - they mean when the variances of the random effects are close to 0.
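A sketch of what that would look like with the placeholder variable names from this thread (data assumed to be in memory). The second call uses the laplace option, which replaces the default quadrature with a Laplacian approximation; that is offered here only as a further thing to try, not as something recommended in the post.

```stata
* Same specification as the melogit call, fitted with the QR-based estimator
meqrlogit depvar indvar var1 i.industry i.year || country: || firm:

* Optional: Laplacian approximation (one integration point), usually faster
* per iteration but a cruder approximation to the likelihood
meqrlogit depvar indvar var1 i.industry i.year || country: || firm:, laplace
```

If either call converges, estimated variance components close to zero would be consistent with the boundary explanation quoted from the manual.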

Alejandro Torres

Join Date: Jan 2018

Posts: 152
#13

31 Jan 2018, 18:55

Thank you very much again, Weiwen. I will try the dataex command to show the data. Your explanation has made things clearer to me.
Best regards !!
Andre Altenberg

Join Date: May 2021

Posts: 4
#14

02 May 2021, 06:25

Hi everyone, I have a question similar to Alejandro Torres's. I am also finishing my thesis, using an unbalanced sample of 6831 firm-year observations from 22 countries, covering 1895 firms from 2013 to 2019. I am investigating how Hofstede's cultural values (moderators standardized between 0 and 100%) affect the relationship between board gender diversity (0-100%) and corporate social performance (0-100%). The IV, DV and moderating variables are all continuous; the control variables are not. My professor also advised me to use a multilevel model.

I interpreted the MLM as follows.

Firm year observations (level 1)
Industry (level 2)
Country (level 3)

As I couldn't find papers that model this kind of moderation, in which a firm-level relationship is affected by a third-level variable, I decided to exclude the second level (industry) and created a dummy variable for the industry component to account for industry effects.

However, firms are nested in industries. Accordingly, my professor stated that the multilevel model should account for the industry aspect as well.

Is this the right way to go, or should I create an industry level?

How can I do this? In my opinion, the industry aspect may not be relevant to my research question, so perhaps it does not need to be implemented as a level in the multilevel model?

Last edited by Andre Altenberg; 02 May 2021, 07:00.