R-Squared (within, between, overall)

Bart DeBacker

Join Date: Oct 2015

Posts: 2
#1


24 Oct 2015, 10:16

Dear Stata users,

I am building a model to predict firm return volatility when historical returns are not available. The model is based on firm characteristics such as size, industry, D/E ratio, etc.
I want to estimate the coefficients on a panel dataset of US firms covering 2003 to 2012. Afterwards I want to see how well the resulting model works in other years (2000-2001 and 2013-2014).

My regression is something like this:
xtreg volatility size d/e industry

within .5628
between .5012
overall .5820

The Stata output gives me three different values of R-squared: within, between and overall. I am not sure which of these I should interpret. I want to be able to say: XX% of the differences in volatility is explained by the model.
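
For reference, a minimal sketch of where these three numbers are stored after -xtreg-, together with the hold-out check described above (estimate on some years, evaluate on others). It runs on Stata's shipped Grunfeld example panel (webuse grunfeld) rather than on the volatility data, so the variables invest, mvalue and kstock, the 1949 cut-off year, and the new variable xb_hat are all stand-ins chosen for illustration. It also assumes the random-effects estimator, which is -xtreg-'s default.

```stata
* Illustrative only: the Grunfeld example data stand in for the firm-volatility panel
webuse grunfeld, clear
xtset company year

* Estimate on the "training" years only (the cut-off year is an arbitrary assumption)
xtreg invest mvalue kstock if year <= 1949, re

* The three R-squared values in the output header are available as stored results
display "within  = " e(r2_w)
display "between = " e(r2_b)
display "overall = " e(r2_o)

* Hold-out check: linear prediction for all years, then the squared correlation
* between actual and predicted values in the years not used for estimation
predict double xb_hat, xb
correlate invest xb_hat if year > 1949
display "hold-out squared correlation = " r(rho)^2
```

A squared correlation ignores any shift in level or scale between predicted and actual values, so it may be worth looking at the mean squared prediction error in the hold-out years as well.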

Thanks in advance!

Best regards,
Bart de Backer
Carlo Lazzaro

Join Date: Apr 2014

Posts: 17673
#2

24 Oct 2015, 10:47

Bart:
welcome to the list.
A good place to start is the -xtreg- entry in the Stata .pdf manual (also downloadable at http://www.stata.com/manuals13/xtxtreg.pdf).
Please also note that answers to your query are usually covered in any decent panel data econometrics textbook.

Kind regards,
Carlo
(Stata 19.0)
Bart DeBacker

Join Date: Oct 2015

Posts: 2
#3

24 Oct 2015, 12:34

Dear Carlo,

Thanks for your answer.

I have already read the manual. Unfortunately I am not trained as an econometrician, so it is hard for me to interpret everything written there. I also searched the internet for hours but did not find an answer to this problem. I hope that someone here can tell me which type of R-squared I should interpret.

Best regards,
Bart de Backer
Ariel Karlinsky

Join Date: Jun 2015

Posts: 491
#4

11 Apr 2016, 07:15

Sorry for bumping, but this thread came up in a Google search.
My answer to your query would be something along the lines of: all of these matter.

The between R2 is "How much of the variance between separate panel units does my model account for?"
The within R2 is "How much of the variance within the panel units does my model account for?"
The overall R2 reflects both sources of variation at once. Loosely speaking it is a blend of the two, though not an exact weighted average.

So if there's a factor that accounts for how the dependent variable changes over time within each of the panel units (say, education's effect on income), this goes to the within R2.
But if a factor accounts for the differences between panel units (say, gender), this goes to the between R2.

Of course some factors vary both over time and across units and so contribute a bit to both, but I think this example clarifies the distinction.
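
As a rough illustration of these definitions: for -xtreg-, the Stata manual describes the three values as squared correlations between fitted and actual values, taken on the deviations from panel means (within), on the panel means (between), and on the raw data (overall). The sketch below recomputes them by hand after a fixed-effects fit on the balanced Grunfeld example panel (webuse grunfeld). The dataset, the choice of the fe estimator, and all newly created variable names are illustrative assumptions, not part of the original question.

```stata
* Illustrative only: balanced example panel shipped with Stata
webuse grunfeld, clear
xtset company year
xtreg invest mvalue kstock, fe

* Values reported by -xtreg- itself
display "reported: within = " e(r2_w) "  between = " e(r2_b) "  overall = " e(r2_o)

* Fitted values from the linear prediction
predict double xb, xb

* Panel means of the outcome and of the fitted values
bysort company: egen double y_mean  = mean(invest)
bysort company: egen double xb_mean = mean(xb)

* Overall: raw outcome against fitted values
correlate invest xb
display "overall = " r(rho)^2

* Between: panel means against panel means, one row per company
egen byte one_per_firm = tag(company)
correlate y_mean xb_mean if one_per_firm
display "between = " r(rho)^2

* Within: deviations from the panel means
generate double y_dev  = invest - y_mean
generate double xb_dev = xb - xb_mean
correlate y_dev xb_dev
display "within  = " r(rho)^2
```

On a balanced panel like this one, the hand-computed values should line up closely with the reported ones. With an unbalanced panel or a different estimator the details may differ, so treat this as a way to build intuition rather than a replacement for the manual's formulas.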
Carlo Lazzaro

Join Date: Apr 2014

Posts: 17673
#5

11 Apr 2016, 08:33

Bart:
you may want to take a look at this link: https://www.princeton.edu/~otorres/Panel101.pdf

Kind regards,
Carlo
(Stata 19.0)
Ariel Karlinsky

Join Date: Jun 2015

Posts: 491
#6

14 Apr 2016, 07:26

I also highly recommend this great presentation regarding linear panel models