  • R-Squared (within, between, overall)

    Dear Stata users,

    I am building a model to predict firm return volatility when historical returns are not available. My model is based on firm characteristics such as size, industry, d/e ratio, etc.
    I want to estimate the coefficients with a dataset containing US firms in the period 2003 to 2012 (panel data). Afterwards I want to see how well the resulting model works in other years (2000-2001 and 2013-2014).

    My regression is something like this:
    xtreg volatility size de_ratio i.industry

    within .5628
    between .5012
    overall .5820

    Now the Stata output gives me three different values of R-squared: within, between and overall. I am not sure which of these I should interpret. I want to be able to say: XX% of the variation in volatility is explained by the model.

    Thanks in advance!


    Best regards,
    Bart de Backer




  • #2
    Bart:
    Welcome to the list.
    A good place to start is the -xtreg- entry in the Stata .pdf manual (also downloadable at http://www.stata.com/manuals13/xtxtreg.pdf).
    Please also note that questions like yours are covered in any decent panel data econometrics textbook.
    Kind regards,
    Carlo
    (Stata 19.0)



    • #3
      Dear Carlo,

      Thanks for your answer.

      I have already read the manual. Unfortunately I am not trained as an econometrician, so it is hard for me to interpret everything that is written there. I also searched the internet for hours but did not find an answer to this problem. I hope that someone here is able to tell me which type of R-squared I should interpret.

      Best regards,
      Bart de Backer



      • #4
        Sorry for bumping, but this thread came up in a Google search.
        My answer to your query would be something along the lines of: all of these matter.

        The between R2 is "How much of the variance between separate panel units does my model account for?"
        The within R2 is "How much of the variance within the panel units, over time, does my model account for?"
        The overall R2 is "How much of the total variance, pooling all observations and ignoring the panel structure, does my model account for?" It is not a simple weighted average of the other two; in your output it is higher than both.

        So if there's a factor that accounts for how the dependent variable changes over time within each panel unit (say education's effect on income), this goes to the within R2.
        But if a factor accounts for the differences between panel units (say gender), this goes to the between R2.

        Of course, many factors vary both over time and across units, so they contribute a bit to both, but I think this example makes the distinction clear.
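
        To make these three definitions concrete, here is a minimal sketch that rebuilds each value by hand as a squared correlation after a random-effects -xtreg-. It uses the nlswork example data that ships with Stata (fetched with -webuse-, so it needs an internet connection), not your firm data, and the choice of regressors (age, tenure) is purely illustrative. The hand-computed numbers should agree with the reported ones, although I have not checked how unbalanced panels are weighted in the between measure.

        ```stata
        * Sketch: rebuild the three R-squared values reported by -xtreg, re-.
        * Data and regressors are illustrative assumptions, not the poster's.
        webuse nlswork, clear
        xtset idcode year

        xtreg ln_wage age tenure, re
        display "within  = " e(r2_w)
        display "between = " e(r2_b)
        display "overall = " e(r2_o)

        * Keep the estimation sample and compute the linear prediction x*b
        keep if e(sample)
        predict double xb, xb

        * Overall: squared correlation of y and x*b over all observations
        quietly correlate ln_wage xb
        display "overall (by hand) = " r(rho)^2

        * Panel means of the outcome and of the prediction
        bysort idcode: egen double ybar  = mean(ln_wage)
        bysort idcode: egen double xbbar = mean(xb)

        * Within: squared correlation of deviations from the panel means
        generate double ydev  = ln_wage - ybar
        generate double xbdev = xb - xbbar
        quietly correlate ydev xbdev
        display "within (by hand)  = " r(rho)^2

        * Between: squared correlation of panel means, one row per panel
        egen byte tag = tag(idcode)
        quietly correlate ybar xbbar if tag
        display "between (by hand) = " r(rho)^2
        ```

        For a question like "how well does the model predict volatility for a firm without a return history", the overall or between value is probably the more relevant one, since the within value only describes changes over time inside firms you already observe.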



        • #5
          Bart:
          you may want to take a look at this link: https://www.princeton.edu/~otorres/Panel101.pdf
          Kind regards,
          Carlo
          (Stata 19.0)



          • #6
            I also highly recommend this great presentation regarding linear panel models
