Heteroskedasticity on Fixed Effect Model Won't Be Cured

Arnola Putri

Join Date: Apr 2022
Posts: 12


18 Apr 2022, 07:59

Hello, my name is Arnola and I would like to ask some questions.

I have a small panel data set with N = 34 and T = 8, with 1 dependent variable and 7 independent variables. I use panel regression and found that the best model to use is the FE model. I ran "vif, uncentered" after running "xtreg dep indep, fe" and it shows that my model has multicollinearity. I cured this by transforming all my variables into first differences and running the model again, after which there is no multicollinearity. But when I then tested for heteroskedasticity, it appeared that my model has it. Some posts told me to transform all my variables to natural logarithms, so I did, and I reran my model from the very first step (Hausman, etc.). But my model is still heteroskedastic. I also tested for autocorrelation and the results show there is no serial correlation. What should I do? Is there another method I should try?

Notes:
- I transformed ALL my variables to natural logarithms to avoid heteroskedasticity (even though 3 of the variables are already in percentage format). Is this okay? Some posts said it is not necessary.
- This is my first time in the forum; pardon me if I posted my question in the wrong way.
- I already read some posts related to my question and I still can't figure this out.

Thank you in advance.

Below is the code I'm using:

Code:

*transform variables to ln
gen lnCHL=ln(CHL)
gen lnGOV=ln(GOV)
gen lnTPT=ln(TPT)
gen lnPENG=ln(PENG)
gen lnAMH=ln(AMH)
gen lnAPM=ln(APM)
gen lnPOV=ln(POV)

*stating panel data
xtset Provinsi Tahun

*run PLS FE RE
reg lnCHL lnGOV lnTPT lnPENG lnAMH lnAPM lnPOV
xtreg lnCHL lnGOV lnTPT lnPENG lnAMH lnAPM lnPOV, fe
xtreg lnCHL lnGOV lnTPT lnPENG lnAMH lnAPM lnPOV, re
*restricted F-test
reg lnCHL lnGOV lnTPT lnPENG lnAMH lnAPM lnPOV i.Provinsi
testparm i.Provinsi

F( 33,   196) =   58.13
 Prob > F =    0.0000

* hausman test
xtreg lnCHL lnGOV lnTPT lnPENG lnAMH lnAPM lnPOV, fe
estimates store FEM
xtreg lnCHL lnGOV lnTPT lnPENG lnAMH lnAPM lnPOV, re
estimates store REM
hausman FEM REM

 Test:  Ho:  difference in coefficients not systematic

                  chi2(6) = (b-B)'[(V_b-V_B)^(-1)](b-B)
                          =      194.40
                Prob>chi2 =      0.0000
                (V_b-V_B is not positive definite)

*LM test
xtreg lnCHL lnGOV lnTPT lnPENG lnAMH lnAPM lnPOV, re
xttest0

chibar2(01) =   363.60
Prob > chibar2 =   0.0000

*Multicol test
. vif, uncentered

    Variable |       VIF       1/VIF  
-------------+----------------------
      lnPENG |   4765.04    0.000210
       lnAMH |   4194.83    0.000238
       lnAPM |   2926.06    0.000342
       lnGOV |    796.88    0.001255
       lnPOV |     22.72    0.044023
       lnTPT |     21.09    0.047412
-------------+----------------------
    Mean VIF |   2121.10

*transform first difference and run it again
 vif, uncentered

    Variable |       VIF       1/VIF  
-------------+----------------------
      dlnAPM |      3.02    0.330674
      dlnGOV |      2.15    0.464344
     dlnPENG |      2.10    0.476792
      dlnAMH |      1.56    0.639131
      dlnPOV |      1.30    0.766340
      dlnTPT |      1.08    0.926223
-------------+----------------------
    Mean VIF |      1.87

*Modified Wald test
. xttest3

Modified Wald test for groupwise heteroskedasticity
in fixed effect regression model

H0: sigma(i)^2 = sigma^2 for all i

chi2 (34)  =    3.3e+05
Prob>chi2 =      0.0000

*Wooldridge test
. xtserial dlnCHL dlnGOV dlnTPT dlnPENG dlnAMH dlnAPM dlnPOV

Wooldridge test for autocorrelation in panel data
H0: no first-order autocorrelation
    F(  1,      33) =      0.027
           Prob > F =      0.8705
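
Note: the first-difference step is described above but its commands are not shown. A minimal sketch of how the d-prefixed variables could be built, assuming the data are already declared with -xtset Provinsi Tahun- and the ln-variables exist (the loop and the naming pattern are assumptions inferred from the output, not the poster's actual code):

Code:

* assumes -xtset Provinsi Tahun- has been run and lnCHL ... lnPOV exist
foreach v in CHL GOV TPT PENG AMH APM POV {
    * D. is the time-series difference operator: x(t) - x(t-1) within each panel
    gen dln`v' = D.ln`v'
}
* the first year of every panel is missing after differencing,
* and so is any year whose own or previous value is missing

Differencing therefore costs at least one period per panel, which matters in a panel with only 8 years.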

Tags: fixed effects, panel, panel data, regression, Suggestion

Jared Greathouse

Join Date: Sep 2021

Posts: 2170
#2

18 Apr 2022, 09:17

Hey Arnola. I do think you should read the FAQ to learn how to ask better questions (at least, if you'll frequent Statalist), but either way, I'm not understanding the problem we have here.

Stata handles multicollinearity by itself (as does most stats software, if I remember correctly). Heteroskedasticity can be addressed in about a trillion different ways, but the simplest as far as I'm aware is clustered robust standard errors.

It would be better for you to hyper-link to the posts you read so we can look at them, as well as (and especially!) you presenting your dataset using the dataex command so we can see exactly how your dataset looks.

Welcome to Statalist, Arnola.

Oh and just a technical note, I wouldn't use the word cured to describe potential solutions to statistical problems. If I could develop a cure for... I don't know, missing data, where all our woes were solved (something folks like Carlo Lazzaro might appreciate), then I would retire tomorrow at the age of 24, a wealthy and happy man. In my business at least, it's understood that in stats, there are rarely standard solutions, only standard problems.
Carlo Lazzaro

Join Date: Apr 2014

Posts: 17613
#3

18 Apr 2022, 09:22

Arnola:
welcome to this forum.
Some comments about your query:
1) if you detect heteroskedasticity and/or autocorrelation after -xtreg-, just invoke -robust- or -vce(cluster panelid)- standard errors (both options do the very same job under -xtreg- and take both nuisances into account);
2) given the above, you should switch from -hausman- (which does not support non-default standard errors) to the community-contributed module -xtoverid- to compare the -fe- vs. -re- specifications.

Kind regards,
Carlo
(StataNow 18.5)
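
The two points above can be sketched as follows. This is only an illustration under assumptions: it reuses the variable names from the first post and assumes they exist in memory, and -xtoverid- is a community-contributed command that has to be installed first (it may also need other SSC packages such as -ivreg2-).

Code:

* assumes the ln-variables from the first post exist
xtset Provinsi Tahun

* 1) fixed effects with cluster-robust standard errors
xtreg lnCHL lnGOV lnTPT lnPENG lnAMH lnAPM lnPOV, fe vce(cluster Provinsi)

* 2) FE vs. RE comparison that tolerates clustered standard errors:
*    fit the RE model with the same vce() option, then run -xtoverid-
* ssc install xtoverid
xtreg lnCHL lnGOV lnTPT lnPENG lnAMH lnAPM lnPOV, re vce(cluster Provinsi)
xtoverid

A rejection in the -xtoverid- output points toward the -fe- specification, in the same spirit as a significant -hausman- result.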
Maxence Morlet

Join Date: Mar 2021

Posts: 634
#4

18 Apr 2022, 14:05

In addition to the excellent advice already provided, I would really recommend you read the following paper: https://www.nber.org/papers/w24003

It is quite closely related to your problem

Arnola Putri

Join Date: Apr 2022
Posts: 12

18 Apr 2022, 22:38

Jared Greathouse, thank you so much for your reply, and I'm sorry for using the wrong word. My problem is that I don't have any solution for the heteroskedasticity in my model. Here is the dataset I use:

Code:

* Example generated by -dataex-. To install: ssc install dataex
clear
input byte Provinsi int Tahun long CHL double(TPT APM AMH POV GOV)
 1 2012   8168  9.06 78.61 96.99 19.46  23099.13
 1 2013      . 10.12 82.57 97.04  17.6  23228.59
 1 2014   8959  9.02  85.2 97.42 18.05  23129.04
 1 2015   7046  9.93 85.55 97.63 17.08  22524.31
 1 2016   7920  7.57 85.73 97.74 16.73  22835.29
 1 2017  17962  6.57 86.31 97.94 16.89   23362.9
 1 2018  16111  6.34 86.38 98.03 15.97  24013.79
 1 2019   8153  6.17 86.48 98.21 15.32   24842.3
 2 2012 114706  6.28 70.57 97.51 10.67  28036.88
 2 2013      .  6.45 73.98 97.84 10.06  29339.21
 2 2014  81981  6.23 78.33 98.57  9.38  30477.07
 2 2015  74977  6.71 78.48 98.68 10.53  31637.41
 2 2016 128032  5.84 78.71 98.88 10.35  32885.09
 2 2017 118313   5.6 79.12 98.89 10.22  34183.58
 2 2018 124277  5.55 79.25 99.07  9.22   35570.5
 2 2019 118840  5.39 80.26 99.15  8.83  36853.59
 3 2012  15880  6.65 70.08 97.23  8.19  23744.01
 3 2013      .  7.02 72.56 97.38  8.14  24857.64
 3 2014  13108   6.5 75.61 98.44  7.41  25982.83
 3 2015  15010  6.89    76 98.56  7.31  27080.76
 3 2016  19125  5.09 76.19 98.81  7.09  28164.93
 3 2017  24719  5.58 76.47 98.85  6.87  29312.17
 3 2018  22896  5.66 77.08 99.07  6.65   30470.8
 3 2019  23816  5.38  78.1 99.17  6.42  31427.29
 4 2012  16798  4.37 70.18 98.45  8.22  72396.34
 4 2013      .  5.48 74.23 98.48  7.72  72297.05
 4 2014  12083  6.56 77.67 98.75  8.12  72390.88
 4 2015  16552  7.83 78.22 98.87  8.42  70769.78
 4 2016  13181  7.43 78.53 99.07  7.98  70569.36
 4 2017  13658  6.22 78.87 99.17  7.78  70740.43
 4 2018  18478  5.98 79.12  99.2  7.39  70736.77
 4 2019  16775  5.76 79.94 99.21  7.08  72509.14
 5 2012   7715   3.2 69.56  96.2  8.42  32417.72
 5 2013      .  4.76 73.23 96.85  8.07   34012.1
 5 2014   5235  5.08 77.34 97.77  7.92  35878.09
 5 2015   7198  4.34 77.94 97.84  8.86  36753.52
 5 2016   9223     4 78.09 98.01  8.41   37728.8
 5 2017   9483  3.87 78.57 98.09  8.19  38833.87
 5 2018  10270  3.73 79.38 98.15  7.92  40025.52
 5 2019   7334  4.06 79.48  98.2   7.6  41812.35
 6 2012  21133  5.66 67.94  97.5 13.78  28577.89
 6 2013      .  4.84 72.06 97.62 14.24  29656.76
 6 2014  15341  4.96 75.87 98.14 13.91  30636.27
 6 2015  11136  6.07 76.18 98.22 14.25   31549.3
 6 2016  16365  4.31 76.43 98.46 13.54   32699.5
 6 2017  24937  4.39 76.89 98.54 13.19  34059.71
 6 2018  21706  4.27 76.91 98.66  12.8  35659.82
 6 2019  18441  4.53 77.58 98.76 12.71  37125.75
 7 2012   5889  3.62 71.97 95.69  17.7  18143.51
 7 2013      .  4.61 73.07 96.55 18.34   18919.3
 7 2014   5631  3.47 76.44 97.52 17.48  19626.72
 7 2015   2990  4.91 76.88 97.63 17.88  20302.48
 7 2016   6540   3.3 77.02 97.75 17.32  21039.84
 7 2017   4406  3.74 77.85  97.9 16.45  21751.64
 7 2018   7020  3.35 78.03 97.91 15.43  22494.84
 7 2019   7138  3.26 78.81 98.01 15.23  23504.53
 8 2012  38623   5.2 72.08 95.13 16.18  21794.83
 8 2013      .  5.69 74.96 95.92 14.86  22770.68
 8 2014  20217  4.79 77.98 96.54 14.28  23647.27
 8 2015  20500  5.14  78.2 96.67 14.35  24581.78
 8 2016  25253  4.62 78.34 96.78 14.29  25568.57
 8 2017  22993  4.33 79.24 96.89 13.69  26614.88
 8 2018  35282  4.04 80.23 96.93 13.14  27736.26
 8 2019  27654  4.03  80.4 97.11 12.62   28894.5
 9 2012   4904  3.43 63.28 95.88  5.53  31172.42
 9 2013      .  3.65 63.83 96.44  5.21   32081.3
 9 2014   3502  5.14 71.83  97.6  5.36  32859.64
 9 2015   3494  6.29 72.42 97.63   5.4  33480.38
 9 2016   4475   2.6 72.75 97.66  5.22  34132.87
 9 2017   5235  3.78 73.06 97.79   5.2  34933.52
 9 2018   4990  3.61 73.96 97.76  5.25  35762.04
 9 2019   3223  3.58 74.13 98.09  4.62  37173.14
10 2012   3213  5.08 78.67  97.8  7.11     70930
10 2013      .  5.63 83.31 98.07  6.46  73743.33
10 2014   2049  6.69 83.36 98.71   6.7  76313.81
10 2015   1634   6.2 83.77 98.79  6.24  78625.43
10 2016   1390  7.69 84.06 98.84  5.98   80295.6
10 2017   3818  7.16 84.28 98.83  6.06  79743.68
10 2018   4361  8.04 84.59 98.87   6.2   81206.2
10 2019   1227   7.5 85.54    99   5.9  81138.52
11 2012   7146  9.67 70.31 99.21  3.69 123962.38
11 2013      .  8.63 75.46 99.22  3.55 130060.31
11 2014   6539  8.47 79.61 99.54  3.92 136312.34
11 2015   6966  7.23  80.2 99.59  3.93 142913.61
11 2016   5691  6.12 80.35 99.64  3.75  149831.9
11 2017   8341  7.14 80.72 99.67  3.77  157636.6
11 2018   3340  6.65 80.81 99.72  3.57 165768.99
11 2019   2699  6.54 81.68 99.74  3.47 174812.51
12 2012  51485  9.08 73.54 96.39 10.09     23036
12 2013      .  9.16 76.76 96.87  9.52  24118.31
12 2014  43302  8.45  79.3 97.96  9.44  24966.86
12 2015  21429  8.72 79.55 98.01  9.53   25845.5
12 2016  26070  8.89 79.76 98.22  8.95  26923.51
12 2017  97165  8.22 80.29 98.23  8.71  27970.92
12 2018  80154  8.23 81.01 98.48  7.45  29160.06
12 2019  52039  8.04 81.26 98.53  6.91  30413.37
13 2012  79834  5.61 72.52 90.45 15.34  20950.62
13 2013      .  6.01 74.94 91.71 14.56  21844.87
13 2014  54222  5.68 78.57 92.98 14.46  22819.16
13 2015  33841  4.99 78.66 93.12 13.58  23887.06
end

Arnola Putri

Join Date: Apr 2022
Posts: 12

18 Apr 2022, 22:46

Carlo Lazzaro, hello and thank you for your response. I already tried your suggestion of adding -robust- or -vce(cluster panelid)- after -xtreg-. Is this right? Below are my commands:

Code:

*Modified Wald test for groupwise heteroskedasticity
xtreg dlnCHL dlnGOV dlnTPT dlnAMH dlnAPM dlnPOV, fe
xttest3
*vif is 0.0000 (there is heteroskedasticity)

*add robust to the FE regression and re-run Modified Wald test 
xtreg dlnCHL dlnGOV dlnTPT dlnAMH dlnAPM dlnPOV, fe robust
xttest3
*vif is 0.0000 (there is heteroskedasticity)

*add vce(cluster panelid) to the FE regression and re-run Modified Wald test
xtreg dlnCHL dlnGOV dlnTPT dlnAMH dlnAPM dlnPOV, fe vce(cluster Provinsi)
xttest3
*vif is 0.0000 (there is heteroskedasticity)

Arnola Putri

Join Date: Apr 2022

Posts: 12
#7

18 Apr 2022, 22:50

Maxence Morlet Thank you so much, i will look into it!

Carlo Lazzaro

Join Date: Apr 2014
Posts: 17613

19 Apr 2022, 02:33

Arnola:
I fail to get what -vif- has to do with the outcome of the community-contributed module -xttest3-:

Code:

. use "https://www.stata-press.com/data/r17/nlswork.dta"
(National Longitudinal Survey of Young Women, 14-24 years old in 1968)

. xtreg ln_wage c.age##c.age, fe

Fixed-effects (within) regression               Number of obs     =     28,510
Group variable: idcode                          Number of groups  =      4,710

R-squared:                                      Obs per group:
     Within  = 0.1087                                         min =          1
     Between = 0.1006                                         avg =        6.1
     Overall = 0.0865                                         max =         15

                                                F(2,23798)        =    1451.88
corr(u_i, Xb) = 0.0440                          Prob > F          =     0.0000

------------------------------------------------------------------------------
     ln_wage | Coefficient  Std. err.      t    P>|t|     [95% conf. interval]
-------------+----------------------------------------------------------------
         age |   .0539076   .0028078    19.20   0.000     .0484041    .0594112
             |
 c.age#c.age |  -.0005973   .0000465   -12.84   0.000    -.0006885   -.0005061
             |
       _cons |    .639913   .0408906    15.65   0.000     .5597649    .7200611
-------------+----------------------------------------------------------------
     sigma_u |   .4039153
     sigma_e |  .30245467
         rho |  .64073314   (fraction of variance due to u_i)
------------------------------------------------------------------------------
F test that all u_i=0: F(4709, 23798) = 8.74                 Prob > F = 0.0000

. xttest3

Modified Wald test for groupwise heteroskedasticity
in fixed effect regression model

H0: sigma(i)^2 = sigma^2 for all i

chi2 (4710)  =  4.4e+35
Prob>chi2 =      0.0000


.

That said, just invoke -robust- or -vce(cluster panelid)- standard errors and go on with your analysis.
Please note that repeating -xttest3- after that means wasting your time, as the test is performed on the variance of the epsilon distribution, which is not affected by the non-default standard errors.

Kind regards,
Carlo
(StataNow 18.5)

Arnola Putri

Join Date: Apr 2022

Posts: 12
#9

19 Apr 2022, 03:53

Carlo Lazzaro, I am so sorry, I wasn't focused. Yes, vif has nothing to do with -xttest3-. But I'm still confused about what -robust- or -vce(cluster panelid)- standard errors do to my model.
Carlo Lazzaro

Join Date: Apr 2014

Posts: 17613
#10

19 Apr 2022, 06:41

Arnola:
they simply modify your standard errors so that they can take heteroskedasticity and/or autocorrelation of the epsilon into account.

Kind regards,
Carlo
(StataNow 18.5)
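
To make the point above concrete, here is a minimal sketch that fits the same fixed-effects model twice on the nlswork data used earlier in this thread (loaded with -webuse-, which is an assumption about what is available on your machine). The stored names fe_default and fe_cluster are arbitrary labels chosen for illustration.

Code:

webuse nlswork, clear
xtset idcode year

* default (conventional) standard errors
xtreg ln_wage c.age##c.age, fe
estimates store fe_default

* same model, cluster-robust standard errors
xtreg ln_wage c.age##c.age, fe vce(cluster idcode)
estimates store fe_cluster

* side by side: the coefficients match, only the standard errors differ
estimates table fe_default fe_cluster, b(%9.4f) se(%9.4f)

Because the residuals are the same in both fits, a test on the residual variance such as -xttest3- returns the same result after either one.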
Samia Aourid

Join Date: Apr 2022

Posts: 5
#11

20 Apr 2022, 16:38

Hello,
I'm sorry if I am a bit intrusive but I am facing the same issue as Arnola. What if the -robust- command doesn't resolve the issue?

Last edited by Samia Aourid; 20 Apr 2022, 17:22.
Jared Greathouse

Join Date: Sep 2021

Posts: 2170
#12

20 Apr 2022, 19:26

Robust isn't a command, it's a way of adjusting your standard errors for heteroskedasticity. There's no reason that I can think of off the top of my head as to why they wouldn't be sensible or appropriate here, unless there's a particular problem you're having?
Samia Aourid

Join Date: Apr 2022

Posts: 5
#13

21 Apr 2022, 09:21

Hello Jared Greathouse, thank you for your reply. Apologies for not using the right terminology; my knowledge of econometrics is very limited. Would you mind checking my post on the issue I am currently facing? I tried to adjust my standard errors using -cluster()- after -xtreg-, as mentioned in the paper by Driscoll and Kraay (1998), "Consistent Covariance Matrix Estimation with Spatially Dependent Panel Data". I am not sure if it is the right way to resolve the issue.
Carlo Lazzaro

Join Date: Apr 2014

Posts: 17613
#14

21 Apr 2022, 09:34

Samia:
see the community-contributed module -xtscc- (just type -search xtscc- from within Stata to find it and follow the instructions to install it).

Kind regards,
Carlo
(StataNow 18.5)
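
For orientation, a minimal sketch of what a call to -xtscc- can look like once it is installed. The Grunfeld data (10 firms, 20 years, so T>N) are only a stand-in chosen for illustration, and the lag length of 2 is an arbitrary example value, not a recommendation.

Code:

* ssc install xtscc
webuse grunfeld, clear
xtset company year

* fixed effects with Driscoll-Kraay standard errors, default lag choice
xtscc invest mvalue kstock, fe

* same model with an explicitly chosen maximum lag
xtscc invest mvalue kstock, fe lag(2)

The coefficients are the same within estimates you would get from -xtreg, fe-; what changes is the standard errors.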
Samia Aourid

Join Date: Apr 2022

Posts: 5
#15

21 Apr 2022, 10:41

Hello Carlo Lazzaro, thank you for your help. I've managed to install -xtscc-. Sorry to bother you a bit more, but I regressed my model using -xtscc- and found that some variables are insignificant (although the F-test suggests that the overall model is significant at the 5% significance level). Is that an issue? Also, I am dealing with panel data with T>N; is it okay to use -xtscc-?

Last edited by Samia Aourid; 21 Apr 2022, 11:02.