use code -robust-, despite getting better results without it

Stan Breetman

Join Date: Jun 2017

Posts: 31
#1


17 Jul 2017, 14:19

Hi everybody.

My question is: should I use the -robust- option for a fixed-effects panel regression when the regression gives better results without it?

What are the exact implications? Is it allowed?

Sorry, I'm a total beginner in statistics.
Marcos Almeida

Join Date: Apr 2014

Posts: 4047
#2

17 Jul 2017, 14:25

I'm wondering what "better results" means. That said, we select "robust" standard errors because of the characteristics of the data/model, in order to (somewhat) curb certain issues: clustered data, heteroscedasticity, etc.

You may wish to take a look at this FAQ.

Best regards,

Marcos
Comment
Stan Breetman

Join Date: Jun 2017

Posts: 31
#3

17 Jul 2017, 14:43

By better results I mean that one of my main variables becomes significant if I don't use -robust-.

So if I use -robust-, there should no longer be any heteroscedasticity?

If the correlations between my variables are low, as in the following, can I automatically say that there is no heteroscedasticity?
Comment
Marcos Almeida

Join Date: Apr 2014

Posts: 4047
#4

17 Jul 2017, 17:44

With regard to 'better results': ideally, results should mirror the issues raised by the data. Moreover, when the model is consistent enough, we get values in the same range.

The theme relates to core knowledge concerning regression as well as panel data particularities.

Apart from a good book on stats, you may wish to take a look at this thread:

https://www.statalist.org/forums/for...tandard-errors

Finally, I kindly suggest following the advice in the FAQ.

That includes the recommendation to share command, output and data (full, abridged or mock). The odds are you will get a much clearer answer if you follow this suggestion.

Best regards,

Marcos
Comment
Carlo Lazzaro

Join Date: Apr 2014

Posts: 17678
#5

18 Jul 2017, 00:53

Stan:
you seem to mix up collinearity with heteroskedasticity.
Which one concerns you?
Please note that under -xtreg- robustified/clustered standard errors deal with both heteroskedasticity and serial correlation, but cannot shelter you from quasi-extreme multicollinearity (whereas perfect multicollinearity is dealt with automatically by Stata via variable omission).
As an aside, regression models are expected to give a fair and true view of the data generating process rather than to achieve "good-looking" coefficients.
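
A minimal sketch of the comparison under discussion, assuming Stata's nlswork example dataset and an arbitrary set of regressors (stand-ins, not Stan's data or model). The same fixed-effects model is fitted with default and with robust standard errors; the coefficients should be identical in both fits, and only the standard errors (and hence the p-values) change:

```stata
* Illustrative only: nlswork and these regressors are assumptions, not the poster's data
webuse nlswork, clear
xtset idcode year

* Fixed effects with default (conventional) standard errors
xtreg ln_wage age tenure ttl_exp, fe
estimates store fe_default

* Same model; with -xtreg, fe-, vce(robust) amounts to clustering on the panel id
xtreg ln_wage age tenure ttl_exp, fe vce(robust)
estimates store fe_robust

* Side-by-side table of coefficients and standard errors
estimates table fe_default fe_robust, b(%9.4f) se(%9.4f)
```

The choice between the two sets of standard errors should follow from the data (heteroskedasticity, serial correlation within panels), not from which one makes a coefficient significant.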

Kind regards,
Carlo
(Stata 19.0)
Comment
Stan Breetman

Join Date: Jun 2017

Posts: 31
#6

18 Jul 2017, 02:41

Thank you, Mr. Almeida: where can I see, or how can I analyze, whether the model is consistent? And if my model is consistent enough, do I not have to use the -robust- option?

Yes, but my question is: am I getting the right regression model by not using -robust-, or not? Of course I would rather have a correct "good-looking" regression than a correct "bad-looking" regression. My question now is which regression is the correct one: the one with -robust- or the one without?

There was a note in the forum that relates to my regression, where the one main variable flips from significant to a p-value of 0.2 or 0.3. So what could my problem be?
"Or we are doing more exploratory (or, dare I say, data-mining-style) research to determine the appropriate concepts to measure and the best ways to measure them. Under these conditions, multicollinearity should be more of a concern -- especially if, as Clyde points out, it causes signs and significances to flip."

Last edited by Stan Breetman; 18 Jul 2017, 02:46.
Comment
Marcos Almeida

Join Date: Apr 2014

Posts: 4047
#7

18 Jul 2017, 03:00

Presenting command and output, as recommended in the FAQ and underlined in #4, is the best way to get helpful advice.

Best regards,

Marcos
Comment
Stan Breetman

Join Date: Jun 2017

Posts: 31
#8

18 Jul 2017, 06:33

1. The first model, with -robust-: highly significant variables, but not gini_12. R2 is 0.33, which seems OK.

2. Now the panel regression without -robust-. Here, for example, gini_12 becomes significant, and R2 is now 0.37. So that is my question: should I proceed with or without -robust- in the command?

3. The third thing is that when I add a control variable, gini_12 in turn becomes very insignificant.
So which model should I use as my main model?

4. Is it normal to get such highly significant values for all the other variables, or could there be a problem with heteroscedasticity or something else?

It's difficult to find solutions on YouTube or on the internet, and since I am a beginner, I would be happy if anybody could give me good advice.
I would be glad to find a solution soon, because I am running out of time.

Thank you Carlo and Mr. Almeida
Attached Files

Last edited by Stan Breetman; 18 Jul 2017, 06:39.
Comment
Carlo Lazzaro

Join Date: Apr 2014

Posts: 17678
#9

18 Jul 2017, 06:52

Stan:
sorry, but I fail to get your research approach.
You started with the -fe- specification, then you switched to the -re- one.
Please note that they focus on different variances (-fe-: within variance; -re-: between variance).
Besides, the decision to choose -fe- vs -re- (or the other way round) is based on the -hausman- test (if you do not have robustified/clustered standard errors) or the user-written command -xtoverid- (type -search xtoverid- from within Stata to install it) if you have robustified/clustered standard errors.
Unfortunately, -xtoverid- does not support -fvvarlist- notation, so you have to prefix your equation with -xi- (see -help xi- for more details).
As an aside, I would say that the -robust- standard errors do not differ that much from the default ones in 1. and 2.
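
A hedged sketch of the two routes described above, again assuming the nlswork example dataset and illustrative regressors rather than Stan's data. The first part runs -hausman- after models with default standard errors; the second installs the user-written -xtoverid- (assumed here to be available from SSC) and runs it after a random-effects model with clustered standard errors, using the -xi:- prefix because of the categorical regressor:

```stata
* Illustrative only: dataset and regressors are assumptions
webuse nlswork, clear
xtset idcode year

* Route 1: -hausman-, valid only with default standard errors
xtreg ln_wage age tenure ttl_exp, fe
estimates store fe
xtreg ln_wage age tenure ttl_exp, re
estimates store re
hausman fe re

* Route 2: -xtoverid- after -re- with clustered standard errors
* (user-written; behaviour may depend on the installed version)
ssc install xtoverid
xi: xtreg ln_wage age tenure ttl_exp i.race, re vce(cluster idcode)
xtoverid
```

In both cases a rejection of the null hypothesis points towards the -fe- specification.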

Kind regards,
Carlo
(Stata 19.0)
Comment
Stan Breetman

Join Date: Jun 2017

Posts: 31
#10

18 Jul 2017, 07:25

Sorry Carlo, totally my fault. Here are the results with fe; re should not be considered.

A. What is your opinion now?
B. The control variable used in 2 and 4 adds nothing to the R2 but makes gini_12 very insignificant.
Should adding a variable not increase the R2? And should I not control for this variable?

1. is the main regression, with -robust- (as is 2).
2. is the main regression with the control variable wins_previous_season, also with the -robust- option.
3. and 4. are the same regressions as 1 and 2, but without the -robust- option => 3 = 1 and 4 = 2 (without robust).

1. to 4.: [screenshots of the regression output, not preserved]
Comment
Carlo Lazzaro

Join Date: Apr 2014

Posts: 17678
#11

18 Jul 2017, 07:30

Stan:
I will focus on regression models 2 and 4, as they show a higher within R-sq than regression models 1 and 3.
In the end, I would select regression model #4 with default standard errors (all in all, the robustified and default standard errors are very similar).

Kind regards,
Carlo
(Stata 19.0)
Comment
daniel klein

Join Date: Mar 2014

Posts: 3824
#12

18 Jul 2017, 07:38

I did not follow this closely and do not intend to, but I see from the screenshots (deprecated on Statalist, as stated in the FAQ that Stan has now been referred to several times) that the outcome/response/dependent variable wins seems to be used as a predictor in lagged form (wins_previous_season). This practice will, despite giving a higher R-squared, result in biased coefficients. The bias might not be large given the minimum of 9 observed periods. This topic has been discussed many times in the literature and here on Statalist. For starters see Paul Allison's blog entry.

Best
Daniel
Comment
Carlo Lazzaro

Join Date: Apr 2014

Posts: 17678
#13

18 Jul 2017, 08:28

Daniel made (as usual) a good point, which I badly overlooked in my previous reply.
Using a lagged dependent variable as a predictor takes the researcher to the edge of the realm of dynamic panel data models (see -xtabond-).
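
For orientation only, a minimal sketch of what a dynamic panel model looks like with -xtabond-, assuming Stata's abdata example dataset (variables n, w and k) rather than Stan's data. Whether this estimator suits a panel with a small number of teams is a separate question that the sketch does not address:

```stata
* Illustrative only: abdata and its variables are stand-ins for the poster's data
webuse abdata, clear
xtset id year

* Arellano-Bond estimator with one lag of the dependent variable n as a regressor
xtabond n w k, lags(1) vce(robust)

* Test for serial correlation in the first-differenced errors
estat abond
```

Here the lagged outcome enters through the lags() option and is instrumented by deeper lags, instead of being added by hand as an ordinary regressor in -xtreg, fe-.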

Kind regards,
Carlo
(Stata 19.0)
Comment
Stan Breetman

Join Date: Jun 2017

Posts: 31
#14

18 Jul 2017, 08:50

But according to the author of the article "Don’t Put Lagged Dependent Variables in Mixed Models", it refers only to mixed models.

Now I am a bit confused. My model is a fixed-effects model.
Comment
daniel klein

Join Date: Mar 2014

Posts: 3824
#15

18 Jul 2017, 09:00

Please do read the material we refer you to carefully. You did not do so with the FAQ (otherwise you would not have posted screenshots) and you did not do so with Allison's blog entry either, which states

By the way, although I’ve emphasized random effects models in this post, the same problem occurs in standard fixed-effects models. You can’t put a lagged dependent variable on the right-hand side.

If you are asking others to invest their time helping you, you should show that you are willing to invest time as well, not just go for a quick yes or no answer, which you will not and cannot get anyway. I cannot speak for others, but I find this kind of behavior, while understandable, not very respectful.

Best
Daniel

Last edited by daniel klein; 18 Jul 2017, 09:04.
Comment
