  • Controlling for differences caused by occupation

    Hello - I am new to the world of econometrics and am working on a small research paper. I am trying to identify the difference in hourly wages between second-generation Hispanic and Asian Americans.

    A big thing that I've identified through my research is that Asian Americans and Hispanic Americans tend to enter different occupations, so for my analysis I think it's important to add occupation as a control. I am unsure of how to do this without listing out a dummy variable for each of the 25 occupation groups that I've put together. I came across the idea of adding i.occupation in my regression... is this the way to go?

  • #2
    Adding i.occupation to your regression will cause Stata to include virtual indicators for 24 of the 25 occupation groups in your model; the remaining group (by default, the lowest-numbered one) is omitted as the reference category. This is the usual way of adjusting for the effects of occupation. (Although it is common to refer to "controlling" for occupation, in observational data you never "control" anything; at best you adjust for it.)
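
    As a minimal sketch of what that looks like in practice: the variable names hourly_wage, ethnic_group, and occupation below are placeholders I have made up, not names from your data, and I am assuming occupation is stored as a numeric variable coded 1 through 25.

    ```stata
    * Hypothetical variable names: hourly_wage, ethnic_group, occupation.
    * If occupation is a string variable, encode it first, for example:
    *   encode occupation, generate(occ_code)
    * and then use i.occ_code in place of i.occupation.

    * Stata adds 24 occupation indicators and omits the base level
    regress hourly_wage i.ethnic_group i.occupation

    * Same model with a different reference occupation (here, level 3)
    regress hourly_wage i.ethnic_group ib3.occupation

    * Joint test that all occupation indicators are zero
    testparm i.occupation
    ```

    The choice of reference occupation changes the individual occupation coefficients but not the fit of the model or the ethnic group coefficient.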

    That said, if you go this route, you need to state your research question carefully. If you were to draw a diagram of the putative causal effects going on, would it not look something like this:

    Ethnic Group ==> Education/Social (Dis)Advantage ==> Occupation ==> Wages

    Occupation certainly does not "cause" ethnicity. So Occupation is not a confounder of the Ethnic Group:Wages relationship. It lies, instead, directly on the causal path between them: it mediates the relationship. If you adjust for occupation, you are not estimating the causal effect of ethnic group on wages. So you need to be clear that you are not studying the total causal effect of ethnicity on wages, but only the part of that effect which is not mediated by occupation. That's a perfectly fine thing to do and will answer (and probably raise) important questions. But just be clear that this is what you are doing.
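
    One way to see the distinction is to fit the model with and without occupation and put the two ethnic group coefficients side by side. This is only a sketch: hourly_wage, ethnic_group, and occupation are hypothetical variable names, and the choice of log wages and robust standard errors is my assumption, not something your data dictate.

    ```stata
    * Hypothetical variable names throughout
    generate ln_wage = ln(hourly_wage)

    * Unadjusted: the overall ethnic group difference in log wages
    regress ln_wage i.ethnic_group, vce(robust)
    estimates store total

    * Adjusted: the difference not accounted for by occupation
    regress ln_wage i.ethnic_group i.occupation, vce(robust)
    estimates store adjusted

    * Coefficients and standard errors from both models in one table
    estimates table total adjusted, b(%9.3f) se(%9.3f)
    ```

    The gap between the two ethnic group coefficients gives a rough sense of how much of the wage difference runs through occupation, although neither number is a causal estimate unless the remaining confounding has been dealt with.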

    Indeed, I don't know how the field of econometrics views it, but this type of question strikes me as best approached with structural equation modeling, where you have both a direct path from ethnicity to wages and an indirect path mediated by occupation, and you can estimate the direct and indirect effects simultaneously.
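
    To give a flavor of the syntax, here is a deliberately simplified sketch. It assumes things that are not true of your data as described: a single continuous occupation score (occ_score, say an occupational wage or prestige index) standing in for the 25 categories, and a 0/1 indicator asian for the ethnic group comparison. Both names, and ln_wage, are hypothetical. A 25-category mediator would need a more elaborate model than this.

    ```stata
    * Hypothetical variables:
    *   asian     = 1 if Asian American, 0 if Hispanic American
    *   occ_score = continuous occupation score (simplifying assumption)
    *   ln_wage   = log of hourly wage

    * Path 1: ethnic group -> occupation score
    * Path 2: ethnic group and occupation score -> log wage
    sem (occ_score <- asian) (ln_wage <- occ_score asian)

    * Decompose into direct, indirect, and total effects
    estat teffects
    ```

    In the estat teffects output, the indirect effect of asian on ln_wage is the part transmitted through occ_score, and the direct effect is the part that is not.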
