marginal results for melogit

Rachel Springer

Join Date: Apr 2019

Posts: 4
#1

08 Apr 2019, 12:32

I am modeling a proportion (numerator/denominator where the denominator represents the number of trials and the numerator the number of successes), comparing performance between 2 groups of subjects over time, with melogit as follows:

Code:

melogit numerator i.time##i.group [pweight = w], || subject:time, covariance(unstructured) binomial(denominator)

The problem I'm running into is that when I try to get the marginal predictions:

Code:

margins i.time#i.group, predict(fixedonly)

Then I get the predictions in terms of the numerator. How do I get the marginal predictions instead as proportions?
Clyde Schechter

Join Date: Apr 2014

Posts: 30097
#2

08 Apr 2019, 18:46

Code:

margins i.time#i.group, expression(predict(fixedonly)/denominator)
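With -binomial(denominator)-, the predicted mean is on the count scale (expected number of successes), so dividing it by the number of trials returns the proportion. A possible alternative, assuming that -predict, xb- after -melogit- returns the fixed-portion linear predictor on the logit scale (this is my reading of the postestimation documentation and was not tested in this thread), is to back-transform the linear predictor directly:

```stata
* Sketch only: assumes predict(xb) gives the fixed-effects linear predictor
* (log odds of success per trial), so invlogit() maps it to a proportion.
margins i.time#i.group, expression(invlogit(predict(xb)))
```

For a model whose only fixed effects are time and group, this should give the same point estimates as dividing the fixed-only mean by the denominator.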
Rachel Springer

Join Date: Apr 2019

Posts: 4
#3

08 Apr 2019, 19:46

Thank you so much! This works perfectly.

Bruce Weaver

Join Date: May 2014

Posts: 1132
#4

17 Dec 2021, 12:57

Let me add my belated thanks to Clyde Schechter for the code in #2. I am currently working on a dataset with 10 yes/no dichotomies, and want to use -melogit numerator- with -binomial(denom)- to do the analysis. Like Rachel, I noticed that -margins- was giving me the mean for the numerator, and I eventually figured out how to include expression() to make it show the mean proportions.

For my initial explorations, denom was constant at 10. And in that situation, I found that the margins following -melogit- matched the margins following -fracreg logit-, and that both of them matched the mean proportions obtained via -tabstat-. That seemed right, and made me feel more confident in what I was doing.

However, in the real data, there are some missing data points, and denom ranges from 8 to 10. Therefore, I inserted some missing values in my toy data and tried again. What I found was that -margins- following -fracreg logit- still gave results matching the mean proportions from -tabstat-, but -margins- following -melogit- gave slightly different results. Anyone who wants the details can execute the code pasted below. But my question(s) boil down to this:

What is -margins- doing differently after -melogit- with -binomial(denom)- than it is doing after -fracreg logit-? Why do the values not match exactly when denom is variable?

Thanks Clyde, or anyone else who might be able to help.

Cheers,
Bruce

Code:

* ================================================================
*  File:    melogit_margins.do
*  Date:    17-Dec-2021
* ================================================================

* CONTEXT
* I have a set of 10 yes/no items, and I want to use
* the proportion of YES responses as my outcome.  
* The real data has multiple years of observation for
* each subject, which is why I want to use -melogit-.
* But for now, I'll stick to a one-level model,
* as it is all that I need to figure out how -margins- works.

* Suppose group (3 levels) is the only explanatory variable.
* Generate some "toy" data.
clear
set obs 120
generate byte group = mod(_n,3)
tabulate group
* Generate variables x1-x10, with p(Yes) varying by group
forvalues i = 1/10 {
    generate byte x`i'= .
}
forvalues i = 1/10 {
    forvalues g = 0/2 {
        replace x`i' = rbinomial(1,0.4+`g'/10) if group==`g'
    }
}
egen byte nyes = rowtotal(x1-x10) // # of Yes responses  
egen pyes = rowmean(x1-x10)       // proportion of Yes responses
summarize

* I want to use -melogit- with -binomial(10)-, but first,
* let's get the group means for pyes:
tabstat pyes, by(group)
* Compare those means to the -margins- from -fracreg logit-
quietly fracreg logit pyes i.group
margins group
* The -margins- from -fracreg logit- match the means from -tabstat-.

* For my actual data, I have multiple years per subject, so
* I want to estimate a multilevel model via -melogit- with
* the -binomial()- option.  But first, I want to make sure
* I understand what -margins- after -melogit- with -binomial()-
* is showing me.  Let's try it.
quietly melogit nyes i.group, binomial(10)
margins group, predict(mu) // mu is the default statistic for predict()
tabstat pyes, by(group)
* Hmm.  The margins from -melogit- are showing me the mean
* number of Yes responses, not the mean proportion of responses.
* If I add an -expression()- option, I should be able to
* display mean proportions rather than mean numbers.
margins group, expression(predict(mu)/10)
* Okay, now the -margins- values = mean proportions.

* BUT...in my actual data, there are some missing data points,
* and I want to include only those folks who have responded
* to at least 8 of the 10 items.  Let's try to mimic this situation.

* Insert some missing values in the x1-x10 variables
summarize x1-x10
forvalues i = 1/10 {
    replace x`i'= . if runiform() < 0.05
}
summarize x1-x10

egen byte denom2 = rownonmiss(x1-x10)
* Generate the next two variables only if denom2 > 7
egen byte nyes2 = rowtotal(x1-x10) if denom2 > 7
egen pyes2 = rowmean(x1-x10) if denom2 > 7
summarize denom2-pyes2

* Compare results from tabstat & fracreg, as before
tabstat pyes2, by(group)
quietly fracreg logit pyes2 i.group
margins group, cformat(%9.7f)
* The values match (to 7 decimals, at least).
* Now use -melogit- with -binomial(denom2)-.
* For -margins-, use the method suggested by Clyde Schechter in post #2 here:
* https://www.statalist.org/forums/forum/general-stata-discussion/general/1492391-marginal-results-for-melogit
quietly melogit nyes2 i.group, binomial(denom2)
margins group, expression(predict(fixedonly)/denom2) cformat(%9.7f)
* The values are close, but not a perfect match to the values
* from -fracreg logit- and -tabstat-.

* Q. What is -margins- doing differently after -melogit- with
* -binomial(denom)- than it is doing after -fracreg logit-?
* Why do the values not match exactly?  

* ================================================================

--
Bruce Weaver
Email: [email protected]
Version: Stata/MP 19.5 (Windows)

Clyde Schechter

Join Date: Apr 2014

Posts: 30097
#5

17 Dec 2021, 14:23

It's not about what -margins- is doing. If you remove the -quietly-s, you will see that -fracreg logit- and -melogit- are producing different coefficients. From there, I would expect the -margins- results to also differ.

At the risk of muddying the waters further, let me point out yet another way to do the analysis:

Code:

glm numerator IVs..., link(logit) family(binomial denominator)

When I apply this method to the model in #4 the results are identical to those using -melogit-, for both the regressions themselves and for -margins-.

I think the differences are as follows. -fracreg- has no ability to take into account the varying size of the denominators in the different observations; it isn't even aware of them. So it must be treating all observations as equivalent in that respect. But that is not appropriate when the denominators actually vary: observations based on larger denominators ought to be more influential, and -melogit- and -glm- take that into account.
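One way to see this for the one-level model in #4, where group is the only predictor, is to compare the estimating equations for the group-level probability. The notation below is introduced here for illustration and is not from the thread: y_i is the number of Yes responses for subject i, n_i the number of items answered, N_g the number of subjects in group g, and p_g the probability for group g.

```latex
% Sketch for a model with group indicators only (no other covariates).
\begin{align*}
\text{binomial logit (melogit, glm):}\quad
  & \sum_{i \in g} (y_i - n_i p_g) = 0
  \;\Rightarrow\; \hat{p}_g = \frac{\sum_{i \in g} y_i}{\sum_{i \in g} n_i} \\
\text{fracreg logit, unweighted:}\quad
  & \sum_{i \in g} \left(\frac{y_i}{n_i} - p_g\right) = 0
  \;\Rightarrow\; \hat{p}_g = \frac{1}{N_g} \sum_{i \in g} \frac{y_i}{n_i} \\
\text{fracreg logit, weights } n_i\text{:}\quad
  & \sum_{i \in g} n_i \left(\frac{y_i}{n_i} - p_g\right) = 0
  \;\Rightarrow\; \hat{p}_g = \frac{\sum_{i \in g} y_i}{\sum_{i \in g} n_i}
\end{align*}
```

The binomial fit returns the pooled proportion (total Yes responses over total items answered), while the unweighted fractional fit returns the mean of the per-subject proportions, which is what -tabstat- reports. The two coincide when n_i is the same for every subject, which is why everything matched when denom was constant at 10.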

In fact, I just experimented a little. If you rerun the -fracreg- command adding [iweight = denom2], it produces results that match exactly the results of -melogit- and -glm-.
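The comparison can be sketched as follows, assuming the toy data from #4 are still in memory (variables nyes2, pyes2, denom2, and group). The -glm- line fills in the template above with those variable names; the final -tabstat- line is an addition for checking the pooled proportions by hand.

```stata
* Unweighted fractional logit: every subject counts equally
fracreg logit pyes2 i.group
margins group, cformat(%9.7f)

* Fractional logit weighted by the number of items answered
fracreg logit pyes2 i.group [iweight = denom2]
margins group, cformat(%9.7f)

* Binomial GLM with a varying denominator
glm nyes2 i.group, link(logit) family(binomial denom2)
margins group, expression(predict(mu)/denom2) cformat(%9.7f)

* Group totals: sum(nyes2)/sum(denom2) is the pooled proportion per group
tabstat nyes2 denom2 if denom2 > 7, by(group) statistics(sum)
```

Only the point estimates are being compared here; the standard errors may differ across commands because their default variance estimators are not necessarily the same.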
Bruce Weaver

Join Date: May 2014

Posts: 1132
#6

17 Dec 2021, 15:45

Hi Clyde. The results of your little experiment are interesting and useful. Thanks.
