Dear Statalist:
I would like to calculate marginal effects at several levels of a factor variable and then average them together. Of course, that average only makes sense if it is weighted by the distribution of the factor variable.
As an example, imagine a hypothetical race/ethnicity variable, race_ethnic, with five categories: white, black, hispanic/latino, east asian, and south asian.
If I wanted to find the marginal effect for individuals who are white, I could use something like:
Code:
margins i(1).race_ethnic
If I wanted to see the difference between white, black, hispanic/latino, etc., I could do:
Code:
margins i(1 2 3 4 5).race_ethnic
But what if I wanted to see the difference between white and all of the other categories combined? In the past I have recoded race_ethnic into a white/nonwhite variable and put that directly in the regression equation. That allows me to use the following:
Code:
margins i(0 1).white
That's probably the cleanest way to do a simple, one-off comparison. But it gets really messy if you want to compare white to all other categories, and black to all other categories, and hispanic/latino to all other categories, etc. That would require a separate regression for each comparison, when the better approach is to keep race_ethnic as one larger factor variable. Furthermore, grouping the other categories together creates a rather heterogeneous reference group, and pooling them might have some strange effects on the other coefficients in the model.
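To make the target quantity concrete, here is one way to write it down. The notation is illustrative only and is not something -margins- reports under these names: let m_k be the margin for category k of race_ethnic and n_k the number of respondents in that category, assuming the coding runs from 1 (white) through 5.

```latex
\bar{m}_{-j} = \frac{\sum_{k \neq j} n_k \, m_k}{\sum_{k \neq j} n_k},
\qquad
\Delta_j = m_j - \bar{m}_{-j}
```

Setting j = 1 gives the white versus everyone-else comparison, and cycling j through the five categories gives the full set of "this category versus all others" comparisons from a single model.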
I suppose I could get what I want in a more disaggregated way, by predicting the effect of race/ethnicity for each respondent and then collapsing the dataset while taking the mean of that effect. That sounds like a mess for a number of reasons, and a whole lot could go wrong doing it that way if you're not careful. I was hoping that there might be an option in -margins- that I missed that does this directly. Or maybe there are other commands, user-written or otherwise, that might help.
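A minimal sketch of one reading of that manual route, for illustration only. The outcome y and covariate x are hypothetical placeholders, the coding 1 = white is an assumption, and the fitted value stands in for "the effect" here, which may not be exactly the quantity of interest:

```stata
* Sketch only: y and x are hypothetical placeholders for the outcome and a covariate.
* Assumes race_ethnic is coded 1 = white, 2-5 = the other categories.
preserve
regress y i.race_ethnic x
predict double yhat, xb                  // fitted value for each respondent
generate byte white = (race_ethnic == 1) if !missing(race_ethnic)
collapse (mean) yhat (count) n = yhat, by(white)
list white yhat n
restore
```

Because the mean is taken over respondents, the non-white average is implicitly weighted by how many respondents fall in each category. The sketch gives no standard errors, which is one of the reasons this route is unattractive.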
Thank you for your time,
Jonathan