Hi,
A few days ago I opened a thread asking for advice on my model specification, where Carlo Lazzaro gave very useful feedback. The thread can be found here (with example data).
Until now I was pretty sure that -fe- would be the right approach, as 99% of the research done on my research question applies -fe-.
Just to be sure, I spent the last few days playing around with some tests for deciding which model to choose.
Since I can assume heteroskedasticity and intragroup correlation, I cannot really rely on the Hausman test, so I followed the advice to use the command -xtoverid- first. The output (-xtreg, re- followed by -xtoverid-, P-value = 0.0487) is shown in the code further below.
As done by nearly every single paper on my research topic, I winsorized all variables at the 1st and 99th percentiles (even though I know there is a heated debate on winsorizing data). However, after doing so, the result of -xtoverid- changed substantially (P-value = 0.2555; see the second -xtoverid- output further below).
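For anyone who wants to reproduce the winsorizing step (this is not the -xtoverid- output), here is a minimal sketch of what I mean by it. It runs on simulated data so that it is self-contained; the variable `x`, the scalars `lo` and `hi`, and the `_w` suffix are placeholder names for illustration, not names from my dataset.

```stata
* Toy data so the sketch runs on its own; x stands in for any
* continuous variable (hypothetical, not from my dataset)
clear
set seed 12345
set obs 1000
generate double x = rnormal()

* Get the 1st and 99th percentiles and keep them as scalars
quietly summarize x, detail
scalar lo = r(p1)
scalar hi = r(p99)

* Winsorize: values beyond the cutoffs are set to the cutoffs,
* not dropped; missing values stay missing
generate double x_w = min(max(x, scalar(lo)), scalar(hi)) if !missing(x)

* Count how many observations were changed
count if x_w != x & !missing(x)
```

In my data the same steps are applied variable by variable, so the cutoffs are pooled over all firms and years rather than computed within groups.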
I was even more confused after applying the -mundlak- command (on the winsorized data; see the -mundlak- call further below).
Here, Prob > chi2 = 0.0002, strongly suggesting that -fe- is the way to go.
However, I am not sure whether the -mundlak- command as used here accounts for heteroskedasticity and intragroup correlation. As I couldn't find a -robust- option in the help file, I tried to rebuild the approach manually with cluster-robust standard errors, following this post: https://blog.stata.com/2015/10/29/fi...dlak-approach/
Doing so results in Prob > chi2 = 0.3076, which is again far from the 0.0002 I got from -mundlak-.
My questions would be:
1. Is it the case that xtoverid is very sensitive to outliers? (Without seeing my entire dataset this question may have no obvious answer)
2. Do you have any ideas on why the results between xtoverid and mundlak differ so much?
3. What did I do wrong when applying the manual mundlak approach that would explain the huge difference between both mundlak approaches?
Thanks in advance
Input:
Code:
. xtreg Acq_CAR_1_1_ES2 CFO_PaySlice CFO_No_Boardsitze CFO_No_Deals CFO_Perc_Own_Dir CFO_Board CFO_Age CFO_Gender CFO_MBA CFO_CPA CFO_Tenure Deal_Value Targ_Listed Deal_S
> tructure Deal_No_Bidders Deal_Div_FF12 Acq_MktValue Acq_Leverage Acq_ROA Acq_Cash_holdings Acq_TobinsQ Acq_FCF Acq_No_Deals, re

Random-effects GLS regression                   Number of obs     =      2,521
Group variable: Acq_ID                          Number of groups  =        980

R-squared:                                      Obs per group:
     Within  = 0.0212                                         min =          1
     Between = 0.0521                                         avg =        2.6
     Overall = 0.0331                                         max =         75

                                                Wald chi2(20)     =          .
corr(u_i, X) = 0 (assumed)                      Prob > chi2       =          .

-----------------------------------------------------------------------------------
  Acq_CAR_1_1_ES2 | Coefficient  Std. err.      z    P>|z|     [95% conf. interval]
------------------+----------------------------------------------------------------
     CFO_PaySlice |   .0106131   .0198141     0.54   0.592    -.0282218    .0494481
CFO_No_Boardsitze |  -.0000494    .001944    -0.03   0.980    -.0038597    .0037609
     CFO_No_Deals |  -.0002787   .0006399    -0.44   0.663    -.0015329    .0009754
 CFO_Perc_Own_Dir |  -.0000603   .0002286    -0.26   0.792    -.0005082    .0003877
        CFO_Board |   .0060153   .0070377     0.85   0.393    -.0077783     .019809
          CFO_Age |   1.08e-06   .0002584     0.00   0.997    -.0005054    .0005076
       CFO_Gender |   .0039131    .005748     0.68   0.496    -.0073527     .015179
          CFO_MBA |  -.0054712   .0032979    -1.66   0.097    -.0119349    .0009926
          CFO_CPA |   .0008035   .0032792     0.25   0.806    -.0056236    .0072305
       CFO_Tenure |  -9.74e-07   1.48e-06    -0.66   0.511    -3.88e-06    1.93e-06
       Deal_Value |  -1.81e-12   4.01e-13    -4.51   0.000    -2.60e-12   -1.02e-12
      Targ_Listed |  -.0133341   .0037269    -3.58   0.000    -.0206386   -.0060295
   Deal_Structure |   .0000302    .000617     0.05   0.961     -.001179    .0012394
  Deal_No_Bidders |    .023375   .0123418     1.89   0.058    -.0008145    .0475645
    Deal_Div_FF12 |  -.0007084   .0032532    -0.22   0.828    -.0070846    .0056678
     Acq_MktValue |   3.92e-11   4.11e-11     0.95   0.341    -4.14e-11    1.20e-10
     Acq_Leverage |   .0058658   .0080689     0.73   0.467    -.0099489    .0216806
          Acq_ROA |   .0101078   .0252952     0.40   0.689    -.0394698    .0596855
Acq_Cash_holdings |   -.019994   .0094826    -2.11   0.035    -.0385794   -.0014085
      Acq_TobinsQ |  -.0013971   .0003883    -3.60   0.000    -.0021581    -.000636
          Acq_FCF |   .0430743   .0285144     1.51   0.131    -.0128129    .0989614
     Acq_No_Deals |  -.0004628   .0002621    -1.77   0.077    -.0009765    .0000508
            _cons |  -.0189153   .0196203    -0.96   0.335    -.0573704    .0195397
------------------+----------------------------------------------------------------
          sigma_u |  .03276371
          sigma_e |  .06290979
              rho |  .21336491   (fraction of variance due to u_i)
-----------------------------------------------------------------------------------

. xtoverid, robust cluster(Acq_ID)

Test of overidentifying restrictions: fixed vs random effects
Cross-section time-series model: xtreg re robust cluster(Acq_ID)
Sargan-Hansen statistic  31.518  Chi-sq(20)   P-value = 0.0487
Code:
. xtoverid, robust cluster(Acq_ID)

Test of overidentifying restrictions: fixed vs random effects
Cross-section time-series model: xtreg re robust cluster(Acq_ID)
Sargan-Hansen statistic  23.705  Chi-sq(20)   P-value = 0.2555
Code:
mundlak Acq_CAR_1_1_ES2 CFO_PaySlice CFO_No_Boardsitze CFO_No_Deals CFO_Perc_Own_Dir CFO_Board CFO_Age CFO_Gender CFO_MBA CFO_CPA CFO_Tenure Deal_Value Targ_Listed Deal_Structure Deal_No_Bidders Deal_Div_FF12 Acq_MktValue Acq_Leverage Acq_ROA Acq_Cash_holdings Acq_TobinsQ Acq_FCF Acq_No_Deals
estimates replay Mundlak test
Code:
//Mundlak manually
bysort Acq_ID: egen mean_x2 = mean(CFO_PaySlice)
bysort Acq_ID: egen mean_x3 = mean(CFO_No_Boardsitze)
bysort Acq_ID: egen mean_x4 = mean(CFO_No_Deals)
bysort Acq_ID: egen mean_x5 = mean(CFO_Perc_Own_Dir)
bysort Acq_ID: egen mean_x6 = mean(CFO_Board)
bysort Acq_ID: egen mean_x7 = mean(CFO_Age)
bysort Acq_ID: egen mean_x8 = mean(CFO_Gender)
bysort Acq_ID: egen mean_x9 = mean(CFO_MBA)
bysort Acq_ID: egen mean_x10 = mean(CFO_CPA)
bysort Acq_ID: egen mean_x11 = mean(CFO_Tenure)
bysort Acq_ID: egen mean_x12 = mean(Deal_Value)
bysort Acq_ID: egen mean_x13 = mean(Targ_Listed)
bysort Acq_ID: egen mean_x14 = mean(Deal_Structure)
bysort Acq_ID: egen mean_x15 = mean(Deal_No_Bidders)
bysort Acq_ID: egen mean_x16 = mean(Deal_Div_FF12)
bysort Acq_ID: egen mean_x17 = mean(Acq_MktValue)
bysort Acq_ID: egen mean_x18 = mean(Acq_Leverage)
bysort Acq_ID: egen mean_x19 = mean(Acq_ROA)
bysort Acq_ID: egen mean_x20 = mean(Acq_Cash_holdings)
bysort Acq_ID: egen mean_x21 = mean(Acq_TobinsQ)
bysort Acq_ID: egen mean_x22 = mean(Acq_FCF)
bysort Acq_ID: egen mean_x23 = mean(Acq_No_Deals)

quietly xtreg Acq_CAR_1_1_ES2 CFO_PaySlice CFO_No_Boardsitze CFO_No_Deals ///
    CFO_Perc_Own_Dir CFO_Board CFO_Age CFO_Gender CFO_MBA CFO_CPA CFO_Tenure ///
    Deal_Value Targ_Listed Deal_Structure Deal_No_Bidders Deal_Div_FF12 ///
    Acq_MktValue Acq_Leverage Acq_ROA Acq_Cash_holdings Acq_TobinsQ Acq_FCF ///
    Acq_No_Deals mean_x*, vce(cluster Acq_ID)
estimates store mundlak

test mean_x2 mean_x3 mean_x4 mean_x5 mean_x6 mean_x7 mean_x8 mean_x9 mean_x10 ///
    mean_x11 mean_x12 mean_x13 mean_x14 mean_x15 mean_x16 mean_x17 mean_x18 ///
    mean_x19 mean_x20 mean_x21 mean_x22 mean_x23
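For reference, the manual approach above can be written more compactly with a loop. The sketch below is only an illustration on simulated data so that it runs on its own: `id`, `t`, `y`, `x1`, `x2`, `touse` and the `m_` prefix are placeholder names, not variables from my dataset. It also restricts the group means to rows with no missing values in the outcome or regressors, which is an extra precaution I added and not something taken from the blog post.

```stata
* Simulated unbalanced-free toy panel (hypothetical data, for illustration only)
clear
set seed 2024
set obs 200
generate long id = _n
* a is the unobserved unit effect
generate double a = rnormal()
expand 5
bysort id: generate int t = _n
* x1 is correlated with the unit effect, x2 is not
generate double x1 = rnormal() + 0.5*a
generate double x2 = rnormal()
generate double y  = 1 + 0.3*x1 - 0.2*x2 + a + rnormal()
xtset id t

local depvar y
local xvars  x1 x2

* Mark the rows with complete data in the outcome and all regressors
mark touse
markout touse `depvar' `xvars'

* Build the group mean of every regressor on that common sample
local means
foreach v of local xvars {
    bysort id: egen double m_`v' = mean(`v') if touse
    local means `means' m_`v'
}

* Random-effects regression augmented with the group means,
* with standard errors clustered on the panel identifier
xtreg `depvar' `xvars' `means' if touse, re vce(cluster id)

* Joint Wald test that all group-mean coefficients are zero
test `means'
```

With my data, `depvar` would be Acq_CAR_1_1_ES2, the panel identifier Acq_ID, and `xvars` the 22 regressors listed above. Regressors that do not vary within a firm would have group means identical to the variable itself and be dropped as collinear, so the degrees of freedom of the test may be lower than the number of regressors.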