Unbalanced panel data inconsistent pooled estimates

Giulio Zarai

Join Date: Jan 2022

Posts: 2
#1


21 Jan 2022, 11:10

Hello,

I have an unbalanced panel dataset with two observations for about 60% of individuals and only one observation for the remaining 40%. The issue is that when I run pooled OLS, the estimated coefficient on the covariate is positive, whereas with the entity fixed-effects estimator it has the opposite sign. Both results are statistically significant. I understand that Stata effectively drops the singletons when using xtreg, fe, which may explain the discrepancy between the estimates. Is it correct to assume, then, that the pooled estimates are biased and that the covariate is most likely endogenous and correlated with the error term?
If, hypothetically, the pooled and FE estimates were consistent with each other, could I interpret the pooled results as a robustness check for the fixed-effects results?

Thank you in advance.
Carlo Lazzaro

Join Date: Apr 2014

Posts: 17603
#2

21 Jan 2022, 12:27

Giulio:
welcome to this forum.
First, please act on the FAQ on how to post more effectively. Thanks.
Please also note that Stata codes and outcome tables are way more informative than a qualitative description of what we perceive is going on with our results (personal impressions are often deceptive).
That said:
1) you have an unbalanced panel dataset and panels have, at best, two waves of data. So far, so good, as Stata can handle both balanced and unbalanced panel datasets;
2) pooled OLS and -xtreg,fe- are not interchangeable: pooled OLS is inconsistent if -fe- is the way to go. In addition, pooled OLS is not the first choice tool when dealing with panel datasets;
3) the -fe- machinery wipes out time-invariant variables;
4) endogeneity may arise from different causes: latent variables; reverse causation; measurement error in one or more covariates; simultaneous equations; in addition, its detection requires a deep knowledge of the data generating process.

I would start with -fe- and then compare it with -re- specification via -hausman-.
As usual, you should check whether a) a group-wise effect actually exists, b) your standard errors need to be cluster-robust and, c) last but by no means least, the functional form of your regressand suffers from misspecification.
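
The -fe- versus -re- comparison can be sketched as follows. This is only an illustration under stated assumptions: it uses the -nlswork- example dataset shipped with Stata as a stand-in for the original data, and the variables ln_wage, tenure and age are placeholders for your own regressand and covariates.

```stata
* Stand-in example data (assumption: not the poster's dataset)
webuse nlswork, clear
xtset idcode year

* Fixed effects with default standard errors; the "F test that all u_i=0"
* at the foot of the output speaks to whether a group-wise effect exists
xtreg ln_wage tenure age, fe
estimates store fe

* Random effects, followed by the Breusch-Pagan LM test for random effects
xtreg ln_wage tenure age, re
xttest0
estimates store re

* Hausman test: consistent estimator first, efficient estimator second
hausman fe re
```

Note that -hausman- does not support cluster-robust standard errors, so both models above are fitted with the default ones.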

As an aside, I would recommend taking a look at any decent textbook on panel data econometrics.

While Jeff Wooldridge's is obviously a "must have", Statalisters dealing with this stuff also like https://www.stata.com/bookstore/microeconometrics-stata.

Kind regards,
Carlo
(StataNow 18.5)
Giulio Zarai

Join Date: Jan 2022

Posts: 2
#3

21 Jan 2022, 13:16

Thank you Carlo for welcoming me and for the answers.
I am sorry for the imprecise question and for the poor technical lexicon.

I didn't post any output because I thought it wasn't that informative. I will upload regression results in the future.
I acknowledge that pooled and fixed are not interchangeable, but I've seen many papers showing results for both methods as a way to confer robustness to the results and control for endogeneity.
From previous posts here on Statalist, I understood that when using xtreg on a panel dataset with singletons, these are not considered in the estimation of the parameter. Is this correct?
I tried several model specifications to guard against a misspecified functional form, and compared them on goodness of fit (AIC/BIC) and adjusted R-squared. The Hausman test points to FE.
In the pooled regression, I use clustered standard errors:

Code:

vce(cluster id)

I am using the Wooldridge handbook in my econometrics course; I will check the other source you suggested.
Carlo Lazzaro

Join Date: Apr 2014

Posts: 17603
#4

21 Jan 2022, 15:13

Giulio:
you should have posted your code and Stata results between CODE delimiters in your last post, as they can be more informative than a qualitative description of what's going on with your inferential procedure.
Please also note that, while it's up to you to abide by the (mild) rules of this forum, it's up to other listers to leave your queries unreplied.
That said:
1) you're correct that due to demeaning singletons are not considered by the -fe- estimator;
2) the -fe- estimator is always consistent but is less efficient if -re- is the way to go (this is basically what -hausman- tests);
3) the no correlation between the vector of regressors and the u term of the composite error in -re- is often untenable;
4) as per 2) and 3) (along with other considerations) it is frequent, as you argue, to read in the literature different estimators applied to the same dataset. While it makes sense to test different standard errors for the same estimator, it is probably less helpful to test different estimators if we know from the beginning that some of them are not consistent (exception made for Mundlak and CRE). In addition, I fail to get how applying different estimators can get rid of endogeneity (the only exception that springs to my mind is the -fe- estimator as far as the endogeneity related to time-invariant variables is concerned);
5) a procedure similar to -linktest- is useful to search for possible misspecification of the functional form of the regressand;
6) it is correct to impose -vce(cluster clusterid)- standard errors in pooled OLS (provided that the number of clusters is large enough), as the observations each panel is composed of are not independent;
7) if you detect heteroskedasticity and/or autocorrelation of the epsilon term of the composite error in -xtreg-, you should invoke -robust- or -vce(cluster idcode)- standard errors and test -fe- vs -re- specifications via the community-contributed module -xtoverid-.
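
A rough sketch of points 1), 5), 6) and 7) follows. It rests on assumptions: the -nlswork- example dataset stands in for the original data, ln_wage, tenure and age are placeholder variables, and the squared-prediction regression is just one possible reading of "a procedure similar to -linktest-". -xtoverid- is community-contributed and, depending on your installation, may ask for supporting packages.

```stata
* Stand-in example data (assumption: not the poster's dataset)
webuse nlswork, clear
xtset idcode year

* 1) Flag panels with a single row; after demeaning they carry no
*    within-panel variation (counted on raw rows, before any listwise deletion)
bysort idcode: generate byte singleton = (_N == 1)
count if singleton

* 6) Pooled OLS with standard errors clustered on the panel identifier
regress ln_wage tenure age, vce(cluster idcode)

* 5) Linktest-like check: a significant squared linear prediction
*    hints at a misspecified functional form
xtreg ln_wage tenure age, fe
predict double xb_hat, xb
generate double xb_hat_sq = xb_hat^2
xtreg ln_wage xb_hat xb_hat_sq, fe vce(cluster idcode)
test xb_hat_sq

* 7) -re- with cluster-robust standard errors, then the robust
*    -fe- vs -re- test (install once with: ssc install xtoverid)
xtreg ln_wage tenure age, re vce(cluster idcode)
xtoverid
```

As with -hausman-, rejecting the null in -xtoverid- points to -fe-.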

Kind regards,
Carlo
(StataNow 18.5)