Pooled OLS vs Panel approach

Jo Smith

Join Date: May 2015

Posts: 92
#1


12 Jun 2015, 17:13

Hello,
Could you please shed some light on the difference between pooled and panel regression models, and on when I can't use the pooled approach?
Clyde Schechter

Join Date: Apr 2014

Posts: 29956
#2

12 Jun 2015, 17:48

When you have panel data, with more than one observation per panel, it will usually be the case that the observations in the data set are not all independent, because traits of the panel that are not represented by other variables will typically cause some within-panel correlation (usually positive, though in some special circumstances negative). In that case, standard errors (and tests based on them) calculated in a pooled regression model that treats the observations as independent will be incorrect.
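
As a rough illustration of the standard-error problem, here is a minimal Stata sketch. The nlswork example data set and the variables ln_wage, age, and tenure are assumptions chosen purely for illustration; they are not from the original question. It fits the same pooled model twice, once treating all observations as independent and once allowing for correlation within panels:

```stata
* Assumed example data: Stata's nlswork (individuals observed in several years)
webuse nlswork, clear

* Pooled OLS: standard errors assume every observation is independent
regress ln_wage age tenure

* Same coefficients, but standard errors allow correlation within each idcode
regress ln_wage age tenure, vce(cluster idcode)
```

The coefficients are identical in the two fits; only the standard errors differ. Comparing them gives a sense of how much the independence assumption matters in a given data set.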

So, in general, if you have panel data you should use a panel regression model. Pooled analysis is most suitable when each observation is independent of any other.
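
A minimal sketch of what fitting a panel model looks like in Stata, assuming the nlswork example data and illustrative variable choices (ln_wage, age, tenure) that are not from the original question:

```stata
* Assumed example data: Stata's nlswork
webuse nlswork, clear

* Declare the panel identifier and the time variable
xtset idcode year

* Random-effects panel regression
xtreg ln_wage age tenure, re

* Fixed-effects (within) panel regression
xtreg ln_wage age tenure, fe
```

Which of these, or some other panel estimator, is appropriate depends on the research question and the data.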

That said, sometimes when you perform a panel regression, you find that the actual extent of within-panel correlation of observations is negligibly small. In that case, if you prefer, you can go back and just use a pooled regression model for that. Also, if you are not interested in within-panel relationships, and just want to understand relationships between a panel's mean outcome and the mean values of the panel's predictor variables, you can calculate those means, reducing the panel data set to one observation per panel, and then do pooled regression. (Of course, this really only makes sense for continuous outcome variables, and in that case it is probably easier to just use -xtreg, be-, which does all that for you automatically, than to go through explicitly coding the calculations of all the variables' means.)
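
Both points can be sketched in Stata. This is only an illustration under assumptions: the nlswork example data and the variables ln_wage, age, and tenure are my own choices, not part of the original question. The first part looks at how much of the variance is attributed to the panel level; the second compares -xtreg, be- with a by-hand regression on panel means:

```stata
* Assumed example data: Stata's nlswork
webuse nlswork, clear
xtset idcode year

* Random-effects fit; the output reports rho, the share of the
* variance attributed to the panel-level effect
xtreg ln_wage age tenure, re
display "rho = " e(rho)

* Breusch-Pagan Lagrange multiplier test that the panel-level variance is zero
xttest0

* Between estimator: regression on panel means, done automatically
xtreg ln_wage age tenure, be

* The same thing by hand: keep complete cases so the means are taken
* over the same observations, average within panel, then regress
keep if !missing(ln_wage, age, tenure)
collapse (mean) ln_wage age tenure, by(idcode)
regress ln_wage age tenure
```

The final -regress- should reproduce the coefficients from -xtreg, be-. Note that -collapse- replaces the data in memory, so it belongs at the end or inside a -preserve-/-restore- pair.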
Carlo Lazzaro

Join Date: Apr 2014

Posts: 17673
#3

13 Jun 2015, 01:21

Jo:
Clyde's excellent points are covered in depth (along with much else) in: http://www.stata.com/bookstore/micro...ata/index.html

Kind regards,
Carlo
(Stata 19.0)
Nam Le

Join Date: May 2019

Posts: 5
#4

03 May 2019, 08:27

Best explanation: easy to understand, and we can teach it to students in simple words that keep the exact meaning.

Thanks.
