
  • Time fixed effect and brand fixed effect

    Hi everyone,

    I have a question about adding a time fixed effect and a brand fixed effect in a random effects model. My current regression code is something like -xtreg y x1 x2 x3 ..., re-.
    However, I now need to add date dummies to the model (about 400 unique dates) and brand dummies (around 7,000 unique brands).

    First, I want to ask whether it makes sense to add, for instance, daily time dummies to the model, or whether -xtreg- already takes the date fixed effect into account in the regression. I've only seen people add year and month dummies to a model, never date dummies.
    Second, if it makes sense to add these dummies, how can I suppress the regression output for them while keeping only the main IVs (x1, x2, x3 in my case)?
    Third, I wonder if you know whether, in Stata, the random effect is on the intercept only, or on the intercept and the slopes?
    [Attachment: random_effect.png]



    Thanks a lot, and I look forward to your reply.

  • #2
    First, I want to ask whether it makes sense to add, for instance, daily time dummies to the model, or whether -xtreg- already takes the date fixed effect into account in the regression.
    Whether it makes sense to add the daily time indicators depends on what you are modeling. If your outcome is subject to daily shocks large enough that you need to account for them, then yes. Bear in mind that this is only doable if you have more than one observation per date within each brand. If brand and date uniquely identify observations, then adding the date effects is like adding an observation-level indicator, and the model will be meaningless.
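
    A quick way to check that point (variable names brand and date are illustrative, standing in for your panel and daily-date variables):

    Code:
    * how many observations share each brand-date pair?
    duplicates report brand date
    * alternatively, -isid- exits without error (_rc == 0) only if brand and
    * date uniquely identify observations -- the case where date fixed
    * effects would make the model meaningless
    capture isid brand date
    display _rc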

    -xtreg- does not automatically include a time effect. If you want a time effect, you have to put it into the varlist.
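
    For instance, a minimal sketch with illustrative names (y for the outcome, date as a daily date variable):

    Code:
    * i.date adds one indicator per unique daily date to the varlist
    xtreg y x1 x2 x3 i.date, re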

    I've only seen people add year and month dummies to a model, never date dummies.
    In principle there is no reason one cannot do this with daily dates. It's a question of whether you have enough data to make it feasible, and whether there is enough variation at the daily frequency level to warrant so expansive a model. With that many fixed effects this is going to be very slow at best, and possibly will exceed some limits and not run at all.

    how can I suppress the regression output for these dummies while keeping only the main IVs (x1, x2, x3 in my case)?
    I don't think you can suppress output selectively in -xtreg- itself. What you could do is run -xtreg- -quietly- and store the estimates. Then use a pretty-printing command like -estout- or -esttab-, which have -drop()- and -keep()- options that let you restrict the output to what you are interested in.
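
    A sketch of that workflow (variable names are illustrative; -esttab- is part of the user-written estout package, installable with ssc install estout):

    Code:
    quietly xtreg y x1 x2 x3 i.date i.brand, re
    estimates store m1
    * report only the main IVs; the date and brand indicators are omitted
    esttab m1, keep(x1 x2 x3) se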

    Third, I wonder if you know whether, in Stata, the random effect is on the intercept only, or on the intercept and the slopes?
    -xtreg, re- estimates models with random intercepts and fixed slopes. If you want random slopes (with or without intercepts), you have to use -mixed- instead.
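
    For example, assuming brand is the panel variable (names are illustrative):

    Code:
    * random intercept only -- comparable in spirit to xtreg, re:
    mixed y x1 x2 x3 || brand:
    * random intercept plus a random slope on x1:
    mixed y x1 x2 x3 || brand: x1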



    • #3
      Originally posted by Clyde Schechter View Post
      Whether it makes sense to add the daily time indicator depends on what you are modeling. […]
      Hi Clyde,

      Thanks for your reply. I'll try adding the dummies and see whether Stata will handle it. Your comments really helped a lot.

      Best wishes
      Meng



      • #4
        Originally posted by Clyde Schechter View Post
        I don't think you can suppress output selectively in -xtreg- itself. […]
        Hi Clyde,

        Thanks for your previous comments. The approach of running -xtreg- quietly and using the -esttab- -keep()- option works for me.

        However, when I use the code below, I can't get output for the adjusted R-squared. Do you have any idea why this happens?

        Code:
        esttab, ar2 label keep(x1, x2,x3)
        [Attachment: 1.PNG]


        Another question I have is that when I add i.brand to the regression using -reg-, running -vif- took more than 3 hours, and in the end I stopped the process manually. I wonder if you have any suggestions for checking multicollinearity in the model.

        Thanks a lot!



        • #5
          I think the code for adjusted R2 in esttab is r2_a, not ar2. Try that. That said, adjusted R2 is not part of the regular output of -xtreg-, so it may not be possible to get it at all.
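
          A sketch of that suggestion (variable names are illustrative; r2_a is requested here through -esttab-'s stats() option):

          Code:
          * after -regress-, the scalar e(r2_a) is stored and can be reported:
          quietly regress y x1 x2 x3 i.brand
          estimates store m1
          esttab m1, keep(x1 x2 x3) stats(r2_a N, labels("Adj. R-sq" "Obs"))
          * after -xtreg, re-, e(r2_a) is not stored, so that cell stays empty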

          As for multicollinearity, it's a waste of time looking for it even when it runs quickly. It is seldom a problem, and when it is, there is nothing you can do about it anyway except get a much larger sample, so there's nothing to be gained by testing for it. Read the chapter in Arthur Goldberger's econometrics textbook about it. Or, for a much shorter version, look at https://www.econlib.org/archives/200...ollineari.html where Bryan Caplan reviews the matter.

          The gist of it is this: multicollinearity does not introduce any bias into the coefficient estimates. What it does is inflate the standard errors. (That's why the standard test is called the variance inflation factor.) It only affects the variables that actually participate in the multicollinearity. Often, the only variables involved are those included only to adjust for their confounding effects, not the actual variables of interest. In that case, the multicollinearity has no importance at all: the results for the variables of interest are unaffected.

          Now, if a variable of interest is involved in the multicollinearity, there may be a problem. The problem will show up as a large standard error, a correspondingly wide confidence interval, a smaller test statistic, and a larger p-value. So just look at the output for your variable of interest. If the standard error is low enough, i.e. your confidence interval narrow enough, that you can draw conclusions that answer your research question, then you have no problem. If the confidence interval is too wide to enable you to answer your research question one way or another, then you have a problem. But it is a problem with no solution other than getting a much larger data set, or getting an altogether new data set sampled in such a way as to break the multicollinearity.



          • #6
            Originally posted by Clyde Schechter View Post
            As for multicollinearity, it's a waste of time looking for it even when it runs quickly. […]
            Thanks Clyde, I'll read the article you attached. One thing confused me: suppose that in the core regression model with the variables of interest, -reg y x1 x2 x3-, there is no severe multicollinearity. Now I want to incorporate brand as a control. If, after adding this variable, the VIF increases a lot, should I keep brand as a control or simply drop it from the model? If it won't affect the coefficients of x1, x2, x3, I can still keep it, right?



            • #7
              Brand is a categorical variable with several levels. The indicator variables ("dummies") that are created to represent it when you use i.brand are always necessarily highly collinear. But precisely because you are including it only "as control" that collinearity is completely irrelevant. The inclusion of brand does affect the estimates for x1, x2, and x3. And assuming there was good reason to want to include it "as control" in the first place, you should include it. But the collinearity among the brand variables does not affect the estimates for x1, x2, and x3. In short, you have exactly the situation where the multicollinearity is expected and is of no importance whatsoever. Keep brand in the model, and don't even give multicollinearity a thought. You've already wasted more of your time on it than it's worth.



              • #8
                Originally posted by Clyde Schechter View Post
                Brand is a categorical variable with several levels. […]
                I see. Thanks a lot for the detailed explanation, Clyde!

