Dear all,
I would like to ask for your advice on how to treat my data and choose a suitable model. I have nearly 30,000 firm-year observations (9,164 firms over 36 years). However, most firms appear only a few times (95% of firms have 10 or fewer yearly observations, and not necessarily in consecutive years). My first question is whether I should treat the data as a panel. The distribution of observations per firm is shown in the xtdes output below.
I have estimated pooled OLS and panel regressions with fixed effects (FE) and random effects (RE). The Breusch-Pagan LM test (xttest0) does not reject Var(u) = 0, which suggests that pooled OLS is preferable to RE. The Hausman test favors FE over RE. The F test reported at the end of the FE regression (that all u_i = 0) has a statistic of 1.22 and is still significant, which I understand to mean that FE is also preferable to pooled OLS. The full output is shown below.
To be honest, I would prefer to use pooled OLS, adding dummies to control for time and industry fixed effects. FE produces results that I cannot explain (although I know that this should not be the reason for choosing a model).
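A minimal sketch of what such a pooled OLS specification might look like, for concreteness. The variable name industry is a placeholder assumed for illustration (no industry variable appears in the output in this post), and clustering the standard errors by firm is only one common choice, not something established above.

```stata
* Pooled OLS with year and industry dummies (a sketch, not a recommendation).
* "industry" is a hypothetical variable holding an industry code.
* i.year and i.industry expand into one dummy per year and per industry.
* vce(cluster ID) clusters the standard errors by firm.
regress return ctrl1 ctrl2 ctrl3 ctrl4 ctrl5 ctrl6 ctrl7 ctrl8 ///
    i.year i.industry, vce(cluster ID)
```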
Code:
. xtset ID year
       panel variable:  ID (unbalanced)
        time variable:  year, 1978 to 2013, but with gaps
                delta:  1 unit

. xtdes, pattern(0)

      ID:  10058972, 10093022, ..., 2.969e+11                n =       9164
    year:  1978, 1979, ..., 2013                             T =         36
           Delta(year) = 1 unit
           Span(year)  = 36 periods
           (ID*year uniquely identifies each observation)

Distribution of T_i:   min      5%     25%       50%       75%     95%     max
                         1       1       1         2         4      10      33
Code:
. reg return ctrl1 ctrl2 ctrl3 ctrl4 ctrl5 ctrl6 ctrl7 ctrl8

      Source |       SS           df       MS      Number of obs   =    29,594
-------------+----------------------------------   F(8, 29585)     =    131.37
       Model |  15.0985952         8   1.8873244   Prob > F        =    0.0000
    Residual |  425.038721    29,585  .014366697   R-squared       =    0.0343
-------------+----------------------------------   Adj R-squared   =    0.0340
       Total |  440.137316    29,593  .014873021   Root MSE        =    .11986

------------------------------------------------------------------------------
      return |      Coef.   Std. Err.      t    P>|t|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
       ctrl1 |   .3460469   .0183427    18.87   0.000     .3100944    .3819994
       ctrl2 |  -.0389356   .0043692    -8.91   0.000    -.0474994   -.0303718
       ctrl3 |  -.0088315   .0028865    -3.06   0.002    -.0144891   -.0031739
       ctrl4 |   .0234995   .0044168     5.32   0.000     .0148423    .0321566
       ctrl5 |    .030962   .0032483     9.53   0.000     .0245951    .0373289
       ctrl6 |  -.1079416   .0103048   -10.47   0.000    -.1281394   -.0877438
       ctrl7 |   .3695356   .0381858     9.68   0.000     .2946898    .4443815
       ctrl8 |  -.0097221   .0059955    -1.62   0.105    -.0214735    .0020292
       _cons |   .0390474   .0032028    12.19   0.000     .0327698     .045325
------------------------------------------------------------------------------

. xtreg return ctrl1 ctrl2 ctrl3 ctrl4 ctrl5 ctrl6 ctrl7 ctrl8, re

Random-effects GLS regression                   Number of obs     =     29,594
Group variable: ID                              Number of groups  =      9,140

R-sq:                                           Obs per group:
     within  = 0.0008                                         min =          1
     between = 0.0522                                         avg =        3.2
     overall = 0.0343                                         max =         33

                                                Wald chi2(8)      =    1050.94
corr(u_i, X)   = 0 (assumed)                    Prob > chi2       =     0.0000

------------------------------------------------------------------------------
      return |      Coef.   Std. Err.      z    P>|z|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
       ctrl1 |   .3460469   .0183427    18.87   0.000     .3100959     .381998
       ctrl2 |  -.0389356   .0043692    -8.91   0.000     -.047499   -.0303722
       ctrl3 |  -.0088315   .0028865    -3.06   0.002    -.0144888   -.0031741
       ctrl4 |   .0234995   .0044168     5.32   0.000     .0148427    .0321563
       ctrl5 |    .030962   .0032483     9.53   0.000     .0245954    .0373286
       ctrl6 |  -.1079416   .0103048   -10.47   0.000    -.1281386   -.0877446
       ctrl7 |   .3695356   .0381858     9.68   0.000     .2946928    .4443784
       ctrl8 |  -.0097221   .0059955    -1.62   0.105     -.021473    .0020287
       _cons |   .0390474   .0032028    12.19   0.000       .03277    .0453248
-------------+----------------------------------------------------------------
     sigma_u |          0
     sigma_e |   .1159866
         rho |          0   (fraction of variance due to u_i)
------------------------------------------------------------------------------

. xttest0

Breusch and Pagan Lagrangian multiplier test for random effects

        return[ID,t] = Xb + u[ID] + e[ID,t]

        Estimated results:
                         |       Var     sd = sqrt(Var)
                ---------+-----------------------------
                  return |    .014873        .121955
                       e |   .0134529       .1159866
                       u |          0              0

        Test:   Var(u) = 0
                             chibar2(01) =     0.00
                          Prob > chibar2 =   1.0000

. xtreg return ctrl1 ctrl2 ctrl3 ctrl4 ctrl5 ctrl6 ctrl7 ctrl8, fe

Fixed-effects (within) regression               Number of obs     =     29,594
Group variable: ID                              Number of groups  =      9,140

R-sq:                                           Obs per group:
     within  = 0.0021                                         min =          1
     between = 0.0031                                         avg =        3.2
     overall = 0.0016                                         max =         33

                                                F(8,20446)        =       5.47
corr(u_i, Xb)  = -0.0823                        Prob > F          =     0.0000

------------------------------------------------------------------------------
      return |      Coef.   Std. Err.      t    P>|t|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
       ctrl1 |  -.0323696   .0569229    -0.57   0.570     -.143943    .0792038
       ctrl2 |  -.0097785   .0090738    -1.08   0.281    -.0275638    .0080068
       ctrl3 |    -.01408   .0063627    -2.21   0.027    -.0265514   -.0016086
       ctrl4 |   .0448623   .0115237     3.89   0.000     .0222748    .0674497
       ctrl5 |   .0088277    .005608     1.57   0.115    -.0021644    .0198198
       ctrl6 |  -.0415465   .0167647    -2.48   0.013    -.0744066   -.0086865
       ctrl7 |  -.1646823   .1814929    -0.91   0.364    -.5204228    .1910582
       ctrl8 |  -.0127077   .0263921    -0.48   0.630    -.0644383    .0390229
       _cons |   .0278964   .0085206     3.27   0.001     .0111955    .0445974
-------------+----------------------------------------------------------------
     sigma_u |  .08704298
     sigma_e |   .1159866
         rho |  .36028087   (fraction of variance due to u_i)
------------------------------------------------------------------------------
F test that all u_i=0: F(9139, 20446) = 1.22                 Prob > F = 0.0000

. est store fe

. qui xtreg return ctrl1 ctrl2 ctrl3 ctrl4 ctrl5 ctrl6 ctrl7 ctrl8, re

. est store re

. hausman fe

                 ---- Coefficients ----
             |      (b)          (B)            (b-B)     sqrt(diag(V_b-V_B))
             |       fe           re         Difference          S.E.
-------------+----------------------------------------------------------------
       ctrl1 |   -.0323696     .3460469       -.3784165        .0538866
       ctrl2 |   -.0097785    -.0389356        .0291571        .0079526
       ctrl3 |     -.01408    -.0088315       -.0052485        .0056703
       ctrl4 |    .0448623     .0234995        .0213628        .0106437
       ctrl5 |    .0088277      .030962       -.0221343        .0045714
       ctrl6 |   -.0415465    -.1079416         .066395        .0132237
       ctrl7 |   -.1646823     .3695356       -.5342179        .1774303
       ctrl8 |   -.0127077    -.0097221       -.0029856        .0257021
------------------------------------------------------------------------------
                           b = consistent under Ho and Ha; obtained from xtreg
            B = inconsistent under Ha, efficient under Ho; obtained from xtreg

    Test:  Ho:  difference in coefficients not systematic

                  chi2(8) = (b-B)'[(V_b-V_B)^(-1)](b-B)
                          =      107.90
                Prob>chi2 =      0.0000
Could you please give me some advice on treating the data and choosing the model? Thank you very much in advance.
Best regards,
Wendy