Time-invariant variables in Fixed-effects model

Leon Schmidt

Join Date: Apr 2018

Posts: 98
#31

13 Oct 2021, 09:46

Thanks a lot Clyde for the quick reply!
Comment
Leon Schmidt

Join Date: Apr 2018

Posts: 98
#32

14 Oct 2021, 09:25

Upon thinking further about my post (29) in this thread, I still have trouble understanding the interaction effect. Specifically, should the effects be viewed as absolute changes or relative changes?

Some background on my thoughts: Normally, when using dummy variables, one category gets omitted and all values are then expressed relative to that category. Yet here I get values for all three values of the dummy variable (with the main effect - hours - being the coefficient for the base category - whites -). And when changing the base category, the calculated marginal effects remain the same.

So I tend to view the results as absolute changes (e.g. whites saw an absolute increase in wages of 0.0015..., blacks saw a decrease relative to whites of 0.0037... and an absolute change of 0.0015... - 0.0037...)

However, the p-values on the interaction change when altering the base category. So the effects are still kind of relative to the choice of base category.

Can someone help me with the interpretation here?

Thank you very much!
Comment
Clyde Schechter

Join Date: Apr 2014

Posts: 29962
#33

14 Oct 2021, 10:01

First, there is a problem because the term "relative" is overloaded in ordinary English, and both of its meanings potentially arise in your situation.

The difference between absolute and relative change has to do with the use of a log-transformed variable. It would be the same thing whether you had an interaction model or not. There is a correspondence between absolute change in ln X and percentage ("relative") change in X.

The interpretation of coefficients in interaction models is somewhat complicated and many people find it confusing. But it has nothing to do with relative vs absolute.

There is a different sense of relative vs absolute that I think you are raising in #32: are the coefficients describing results directly associated with a variable, or are they describing the difference between that variable and a base (usually omitted) category--which is sometimes colloquially described as "relative to the base category." Understanding this is made even more complicated in interaction models because the interaction variables are actually not attributes of any of the groups they are attached to. While all of this can be worked out with a little algebra, it is much easier to ignore the regression output and rely instead on the output of -margins-.

-margins race, at(hours = (interesting list of values of hours))- will give you the expected value of lnwage in each race at each level of hours; these results are absolute, not relative to a baseline.

And if you are interested in the differences between whites (the base category in your example) and other racial groups, you can get that with -margins, dydx(race) at(hours = (interesting list of values of hours))-. These results are relative to the reference group whites.

If you want a comparison of all racial groups with each other, -margins race, at(hours = (interesting list of values of hours)) pwcompare- will give you that. These are all explicitly comparisons between two groups.

[If you omit the -at(hours = (interesting list of values of hours))- part, you get the marginal outcomes or differences in all of these.]
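The three -margins- calls above can be sketched concretely as follows. This is only an illustration: it assumes the nlswork model that appears in #37 of this thread rather than the model from #29, and the hours values 20, 30 and 40 are arbitrary choices standing in for the "interesting list of values of hours". It has not been run here, so treat the exact behavior after -xtreg, fe- as something to verify.

```stata
* Assumed setup: the nlswork example model from #37
webuse nlswork, clear
xtset idcode year
xtreg ln_wage c.hours i.race#c.hours i.year, fe

* Expected ln_wage for each race at each chosen level of hours
* (20, 30, 40 are illustrative values, not from the thread)
margins race, at(hours = (20 30 40))

* Differences from the reference group (white) at each level of hours
margins, dydx(race) at(hours = (20 30 40))

* All pairwise comparisons between racial groups at each level of hours
margins race, at(hours = (20 30 40)) pwcompare
```

The first call reports levels, the second reports contrasts against the base category, and the third reports every two-group contrast.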
Comment
Leon Schmidt

Join Date: Apr 2018

Posts: 98
#34

15 Oct 2021, 08:15

Thanks a lot Clyde for this explanation! I think I understand it now somewhat better.

What do you mean with "... because the interaction variables are actually not attributes of any of the groups they are attached to."?

Given your answer, I realize the usefulness of the margins-command. But if I run the regression in post (29) and then do

Code:

margins race, dydx(hours)

, can't I still view those as "absolute" changes because if I change hours by one unit, all groups in "race" experience a change in terms of the marginal effect?
Comment
Clyde Schechter

Join Date: Apr 2014

Posts: 29962
#35

15 Oct 2021, 09:39

What do you mean with "... because the interaction variables are actually not attributes of any of the groups they are attached to."?

The interaction coefficients are a different kind of animal from the other coefficients. In a non-interaction model, the coefficient of a variable like hours represents the marginal effect of hours. When you have an interaction model, where hours is part of the interaction, the interaction coefficient isn't the marginal effect of anything. It isn't the marginal effect of hours, and it isn't the marginal effect of race or any specific race either. It represents, if you will, a "correction" to the non-interaction model. It is not a first derivative of anything: it is a mixed partial second derivative. It's a qualitatively different beast. And it doesn't say anything about either race or hours separately: it only has meaning with respect to the two variables jointly.
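One way to see this is that the slope of ln_wage on hours for a given race is the hours coefficient plus that race's interaction coefficient, so the interaction coefficient by itself is only the difference between two slopes. A minimal sketch, assuming the nlswork model from #37 and assuming white (race == 1) is the level Stata treats as the base for the interaction:

```stata
* Assumed setup: the nlswork example model from #37
webuse nlswork, clear
xtset idcode year
xtreg ln_wage c.hours i.race#c.hours i.year, fe

* Slope of ln_wage on hours for whites: the hours coefficient alone
lincom hours

* Slope for blacks: hours coefficient plus the black#hours "correction"
lincom hours + 2.race#c.hours

* The same race-specific slopes, obtained directly from -margins-
margins race, dydx(hours)
```

If the assumption about the base level holds, the two -lincom- results should match the white and black rows of the -margins- output.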

can't I still view those as "absolute" changes because if I change hours by one unit, all groups in "race" experience a change in terms of the marginal effect?

Again, this question is confusing because of the ambiguous term absolute (or relative). The margins outputs are absolute in the sense that each of the outputs reflects an inherent attribute of one of the races, not a comparison of that race to some base race category. They are also absolute as effects of hours on ln(wage). But they are relative as effects of hours on wage.
Comment
Leon Schmidt

Join Date: Apr 2018

Posts: 98
#36

15 Oct 2021, 09:46

Thank you very much Clyde, got it now!
Comment
Leon Schmidt

Join Date: Apr 2018

Posts: 98
#37

02 Dec 2021, 06:39

Thank you very much Clyde Schechter for your help so far. Unfortunately, I have another question about the interpretation of such interaction effects, based on the following example:

Code:

webuse nlswork.dta, clear
xtset idcode year
xtreg ln_wage c.hours i.race#c.hours i.year, fe // same result as xtreg ln_wage i.race##c.hours i.year, fe but here - margins - does not work
margins race, dydx(hours)
...
Average marginal effects                        Number of obs     =     28,467
Model VCE    : Conventional

Expression   : Linear prediction, predict()
dy/dx w.r.t. : hours

------------------------------------------------------------------------------
             |            Delta-method
             |      dy/dx   Std. Err.      z    P>|z|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
hours        |
        race |
      white  |    .001518   .0002699     5.63   0.000     .0009891    .0020469
      black  |  -.0021844   .0005306    -4.12   0.000    -.0032243   -.0011445
      other  |  -.0000541   .0023919    -0.02   0.982    -.0047421     .004634
------------------------------------------------------------------------------

As discussed above, I would interpret the results as follows (using the log approximation, which is not exact):

- a one hour increase is associated with an increase in wages of whites by 0.001518 * 100 percent
- a one hour increase is associated with a decrease in wages of blacks by 0.0021844 * 100 percent
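For reference, the approximation can be compared with the exact percentage change implied by a log outcome, which is 100 * (exp(b) - 1). A small sketch using the two dy/dx values from the output above:

```stata
* Approximate percent change: 100 * b
display "white, approximate: " 100 * .001518
display "black, approximate: " 100 * (-.0021844)

* Exact percent change implied by a log outcome: 100 * (exp(b) - 1)
display "white, exact:       " 100 * (exp(.001518) - 1)
display "black, exact:       " 100 * (exp(-.0021844) - 1)
```

With coefficients this close to zero the two versions differ only slightly, which is why the approximation is commonly used.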

Also as discussed above, these marginal effects are "absolute" in the sense that they are an attribute of each race.

Now I am wondering whether these are the only changes taking place in the wage of individuals if one increases - hours - . Or can it still be the case that increasing hours leads to (unobserved) wage increases for all (whites and blacks) but then there are the marginal effects on top which differ across race? So in effect I am wondering whether increasing hours can still benefit all in terms of their wage even though this might be less so for blacks compared to whites (which the results suggest)?

Thank you very much again!

Last edited by Leon Schmidt; 02 Dec 2021, 06:42.
Comment
Clyde Schechter

Join Date: Apr 2014

Posts: 29962
#38

02 Dec 2021, 09:33

The notion that there is some kind of effect of hours on wages for everybody, with a separate additional increment for whites and a separate decrement for blacks superimposed on it, is not one that can be approached through this kind of modeling. In fact, I don't think this kind of question can be answered at all. I don't see how one would be able to distinguish in data between two situations. In one, a one hour increase is associated with a latent (i.e. unobserved) across-the-board increase in wages of, say, 0.10%, and then there is some additional effect of 0.0518% for whites and a corresponding decrement for blacks. In the other, there is no across-the-board increase in wages, and we have a specific 0.1518% increase for whites, etc. The data would look the same in both scenarios. You could stipulate any value for the latent across-the-board effect of hours on wages, and the race-specific effects would adjust themselves accordingly to net out the same way as the model in #37.
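The arithmetic behind that point can be sketched as follows. The latent values 0, 0.10 and 0.50 are arbitrary stipulations for illustration; the totals 0.1518 and -0.21844 are the race-specific figures (in percent) from #37.

```stata
* For any stipulated latent across-the-board effect (in percent),
* the race-specific increments adjust so the totals never change
foreach latent of numlist 0 0.10 0.50 {
    display "latent = " `latent' ///
        "   white increment = " 0.1518 - `latent' ///
        "   black increment = " -0.21844 - `latent' ///
        "   white total = " `latent' + (0.1518 - `latent') ///
        "   black total = " `latent' + (-0.21844 - `latent')
}
```

Every row yields the same two totals, so the data cannot tell the stipulated latent values apart.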

One could, of course, run -margins, dydx(hours)- to get an average marginal effect of hours, and, if you wish, you could conceptualize that this is the unobserved across-the-board effect, on top of which racial effects can be applied. But I don't know that there is any reality to that.
Comment
Leon Schmidt

Join Date: Apr 2018

Posts: 98
#39

03 Dec 2021, 02:37

OK, thanks a lot Clyde! But how should I then interpret the results from - margins - if there might or might not be an underlying across-the-board effect?

As I interpret them now, all races are at one level of - wage - before - hours - is increased by one unit. If one then increases - hours - by one unit, this increases wages for whites by 0.1518% and decreases them for blacks by 0.21844%. Is that correct?
Comment
Leon Schmidt

Join Date: Apr 2018

Posts: 98
#40

03 Dec 2021, 05:11

And if one would like to contextualize the results a bit more: Would it make sense to relate the change in percent to the overall mean wage (so e.g. say that we have an overall mean wage of X and 0.1518% of this is Y for whites) or should one relate it to the mean wage for each race (so e.g. 0.1518% of mean wage for whites is ... Dollars, 0.218% of mean wage for blacks is ... Dollars)?

Sorry for all these questions, I just find the interpretation very difficult. Thank you very much for all the great help!
Comment
Clyde Schechter

Join Date: Apr 2014

Posts: 29962
#41

03 Dec 2021, 12:19

As I interpret them now, all races are at one level of - wage - before - hours - is increased by one unit. If one then increases - hours - by one unit, this increases wages for whites by 0.1518% and decreases them for blacks by 0.21844%. Is that correct?

Well, one thing that is not correct is the use of causal language here. The data are observational. So we cannot say that increasing hours by one unit "increases wages for whites" etc. What we can say is that, on average, given two whites whose hours differ by one unit, the one with higher hours will have 0.1518% higher wages. (I don't think I would carry four decimal places here--I doubt that level of implied precision is justified, but that's a more minor issue.) Similarly, given two blacks whose hours differ by one unit, the one with the higher hours will, on average, have 0.218% lower wages.

Would it make sense to relate the change in percent to the overall mean wage (so e.g. say that we have an overall mean wage of X and 0.1518% of this is Y for whites) or should one relate it to the mean wage for each race (so e.g. 0.1518% of mean wage for whites is ... Dollars, 0.218% of mean wage for blacks is ... Dollars)?

Because you used a log transformed outcome measure, your model implicitly constrains the semi-elasticity of wages on hours to be constant. So, according to that model, regardless of the baseline wage you refer to, a one unit hours increase will be associated with the same percentage increase/decrease in wages. So the 0.1518% and 0.218% figures apply equally well to the overall mean wage, the mean wage of each specific racial group, or any other wage you might be interested in looking at. All of those are equally valid. As to how to contextualize it, that depends on what question you are trying to answer and who you are trying to explain your answer to. Given that you have chosen to look at separate effects by race, it probably makes more sense to contextualize by referring to the race-specific mean wages when illustrating your outcomes, particularly if you want to focus on race differences. But most likely the distributions of wages in whites and blacks and others all overlap extensively, so that it would also be meaningful to take a mean across all races and explain what it means in dollars for such an average person, pointing out that it means different things due to the different semi-elasticities by race.

So it really boils down to what you are trying to put into sharpest focus, the baseline differences in wages among the groups, or the difference in semi-elasticities among the groups. That's neither a Stata question nor a statistical question. It's a "what am I doing this research for in the first place" question, and only you can answer it.
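Either way of contextualizing can be sketched like this. The variable name wage_level is hypothetical (nlswork ships only ln_wage), and the slopes are the dy/dx values reported in #37; results are in whatever wage units the dataset uses.

```stata
webuse nlswork, clear

* wage_level is a hypothetical name; recover wage from its log
generate wage_level = exp(ln_wage)

* Race-specific and overall mean wages
tabstat wage_level, by(race) statistics(mean)

* Wage change associated with one more hour, at race-specific means
summarize wage_level if race == 1, meanonly
display "white, at white mean:   " r(mean) * (exp(.001518) - 1)
summarize wage_level if race == 2, meanonly
display "black, at black mean:   " r(mean) * (exp(-.0021844) - 1)

* The same percentages applied to the overall mean wage
summarize wage_level, meanonly
display "white, at overall mean: " r(mean) * (exp(.001518) - 1)
display "black, at overall mean: " r(mean) * (exp(-.0021844) - 1)
```

The percentages are identical in both versions; only the reference wage they are applied to changes.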
Comment
