Testing for Normality using Jarque-Bera in Panel Data (xtsktest + jb)

Robin Klomp

Join Date: Apr 2022
Posts: 18

02 May 2022, 05:59

Dear Stata community,

I am currently trying to decide whether my data are normally distributed using the Jarque-Bera test. The test runs fine; however, after reading the manual, looking at other posts (most of which are unanswered), and watching YouTube videos, I have not found out how to interpret the results for panel data. I would be very grateful if somebody could help me with this.

Code:

xtsktest resid
(running _xtsktest_calculations on estimation sample)

Bootstrap replications (50)
----+--- 1 ---+--- 2 ---+--- 3 ---+--- 4 ---+--- 5 
..................................................    50

Tests for skewness and kurtosis                          Number of obs = 3,683
                                                         Replications  =    50

                                  (Replications based on 559 clusters in firm)
------------------------------------------------------------------------------
             |   Observed   Bootstrap                         Normal-based
             | coefficient  std. err.      z    P>|z|     [95% conf. interval]
-------------+----------------------------------------------------------------
  Skewness_e |  -.0002604   .0001401    -1.86   0.063     -.000535    .0000142
  Kurtosis_e |   .0003008   .0001085     2.77   0.006     .0000882    .0005134
  Skewness_u |   .0001761    .000113     1.56   0.119    -.0000454    .0003976
  Kurtosis_u |  -7.41e-06   .0000286    -0.26   0.795    -.0000634    .0000486
------------------------------------------------------------------------------
Joint test for Normality on e:        chi2(2) =  11.14    Prob > chi2 = 0.0038
Joint test for Normality on u:        chi2(2) =   2.49    Prob > chi2 = 0.2873
------------------------------------------------------------------------------



. jb resid
Jarque-Bera normality test:   2878 Chi(2)      0
Jarque-Bera test for Ho: normality:

Thank you very much!

Best,
Robin

Carlo Lazzaro

Join Date: Apr 2014

Posts: 17673
#2

02 May 2022, 06:08

Robin:
first, with such a large number of observations, analytical tests easily reject the null (which, in your case, is rejected for the epsilon error only).
Hence, if you're planning to go -xtreg-, you can invoke the -robust- or -vce(cluster panelid)- option (they do the very same job under -xtreg-) to deal with heteroskedasticity.
In addition, with 559 clusters, you should go non-default standard errors anyhow.
As an aside, please notify the list when you use a community-contributed module (like -xtsktest-), for reasons that are well explained in the FAQ. Thanks.

Kind regards,
Carlo
(Stata 19.0)
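
For readers unsure what the e and u rows in the -xtsktest- output refer to: e is the idiosyncratic (epsilon) error and u is the panel-level effect. The following is only a rough sketch with official commands, assuming the -nlswork- example dataset rather than the data in this thread; the names e_hat, u_hat and first are hypothetical. Note that -sktest- treats observations as independent, so it is not a substitute for -xtsktest-.

```stata
* Illustrative data only (an assumption, not the data from this thread)
webuse nlswork, clear
xtset idcode year

* Fixed-effects fit, then recover the two error components
xtreg ln_wage tenure age, fe
* idiosyncratic error (epsilon)
predict double e_hat if e(sample), e
* panel-level effect (u_i)
predict double u_hat if e(sample), u

* u_hat is constant within panel, so flag one observation per panel
egen byte first = tag(idcode) if e(sample)

* Skewness/kurtosis tests; these ignore the panel structure
sktest e_hat
sktest u_hat if first
```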
Robin Klomp

Join Date: Apr 2022

Posts: 18
#3

02 May 2022, 07:18

Dear Carlo,

Thank you for your prompt reply.

As I am not sure how to interpret these results, could you please specify which values you are taking into account when concluding "reject the null (that, in your case relates to the epsilon error only)"?

Also, what do you mean by "you should go non-default standard errors anyhow"?

And I will be sure to mention community-contributed modules in the future.

Thanks,
Robin
Carlo Lazzaro

Join Date: Apr 2014

Posts: 17673
#4

02 May 2022, 07:31

Robin:
1) I refer to:

Code:

Joint test for Normality on e:        chi2(2) =  11.14    Prob > chi2 = 0.0038

2) setting heteroskedasticity aside, with 559 clusters, within-panel serial correlation of the epsilon error is likely; that's why non-default standard errors should be the way to go.

Kind regards,
Carlo
(Stata 19.0)
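
As a small check on how the two joint-test lines can be read, the reported p-values are the upper-tail probabilities of a chi-squared variate with 2 degrees of freedom, evaluated at the printed statistics. A minimal sketch, using only the numbers from the -xtsktest- output in #1:

```stata
* Joint test on e: chi2(2) = 11.14
display chi2tail(2, 11.14)

* Joint test on u: chi2(2) = 2.49
display chi2tail(2, 2.49)
```

The first value is about 0.0038, below 0.05, so normality of e is rejected. The second is about 0.288 (slightly off the printed 0.2873 because the printed statistic is rounded), so normality of u is not rejected.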
Robin Klomp

Join Date: Apr 2022

Posts: 18
#5

02 May 2022, 07:55

So, if I understood correctly, I have heteroskedasticity and you would therefore recommend running my fixed-effects model with the robust option (which requests non-default standard errors), i.e. my code would look like this:

Code:

xtreg ROA2 crt_InvestorPressureScore, fe robust
Carlo Lazzaro

Join Date: Apr 2014

Posts: 17673
#6

02 May 2022, 08:05

Robin:
correct.
In addition, please note that, unlike under -regress-, the -robust- and -vce(cluster panelid)- options do the very same job under -xtreg- and take serial correlation into account too (if it exists).

Kind regards,
Carlo
(Stata 19.0)
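
The equivalence described in #6 can be checked directly. The following is a sketch under assumptions: it uses the -nlswork- example dataset (not the data in this thread), and the stored-estimate names fe_robust and fe_cluster are hypothetical.

```stata
* Illustrative data only (an assumption, not the data from this thread)
webuse nlswork, clear
xtset idcode year

* Fixed effects with -robust-
xtreg ln_wage tenure age, fe robust
estimates store fe_robust

* Fixed effects with standard errors clustered on the panel identifier
xtreg ln_wage tenure age, fe vce(cluster idcode)
estimates store fe_cluster

* Coefficients and standard errors side by side
estimates table fe_robust fe_cluster, se
```

Both columns should show the same standard errors, and both regression headers should carry the "Std. err. adjusted for ... clusters" note.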
Robin Klomp

Join Date: Apr 2022

Posts: 18
#7

02 May 2022, 08:30

Dear Carlo,
Thank you for the clarification. I have now also looked into serial correlation and found the community-contributed program xtserial (downloaded from the Stata Journal), which tests for its presence.

Running it, I got the following result:

Code:

. xtserial InvestorPressureScore EnvironmentalPillarScore ROA1 IndependentBoardScore CeoDuality FirmAge GicSectorCode at Financ
> ialLeverage2 UNPRISignatoryScore EnvironmentalInnovationScore

Wooldridge test for autocorrelation in panel data
H0: no first-order autocorrelation
    F(  1,     413) =      8.892
           Prob > F =      0.0030

If I interpret the results correctly, the null of no first-order autocorrelation is rejected (Prob > F = 0.0030), which means that there is serial correlation.

Judging by another comment you made in a thread https://www.statalist.org/forums/for...tion-in-panels, it should not be too big of an issue for my dataset, right? N = 5,142; T = 10 (years 2010-2020)

Originally posted by Carlo Lazzaro:

Jesse:
in the light of what I have (self)learnt in these years, serial correlation is a nasty issue when you have small N, large T panel dataset.
If the reverse holds, it is a minor issue and clustering standard errors can be all that is required to deal with it.
Please consider that I usually work with linear regression models for panel dataset (-xtreg- in Stata).
Carlo Lazzaro

Join Date: Apr 2014

Posts: 17673
#8

02 May 2022, 08:55

Robin:
it is not an issue at all if you invoke -robust- or -vce(cluster panelid)- options for standard errors.

Kind regards,
Carlo
(Stata 19.0)
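
For reference, the p-value of the Wooldridge test reported in #7 is the upper-tail probability of an F(1, 413) variate at the printed statistic. A minimal sketch using only the numbers from that output:

```stata
* Wooldridge test in #7: F(1, 413) = 8.892
display Ftail(1, 413, 8.892)
```

The result should come out close to the reported 0.0030, i.e. the null of no first-order autocorrelation is rejected at the 5% level, which is the situation the cluster-robust standard errors are meant to handle.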
Robin Klomp

Join Date: Apr 2022

Posts: 18
#9

02 May 2022, 08:59

Thank you for your help, Carlo! I misunderstood your last comment and thought you advised me to take serial correlation into account as well, rather than xtreg robust doing it automatically.

Enjoy the rest of your day
Robin

Robin Klomp

Join Date: Apr 2022
Posts: 18

#10

03 May 2022, 01:40

Good morning Carlo,

Now that I am rerunning my regressions with the fe robust option, the denominator degrees of freedom of the F-test decrease considerably. What could this be due to? It seems that the test is counting groups instead of observations in this case.

fe Model without robust (, fe)
F(18,3106) = 11.12

Code:

 xtreg ROA2 EnvironmentalPillarScore crt_InvestorPressureScore IndependentBoardScore CeoDuality FirmAge at FinancialLeverage2 
> UNPRISignatoryScore EnvironmentalInnovationScore i.GicSectorCod i.Year, fe 
note: 15.GicSectorCode omitted because of collinearity.
note: 20.GicSectorCode omitted because of collinearity.
note: 25.GicSectorCode omitted because of collinearity.
note: 30.GicSectorCode omitted because of collinearity.
note: 35.GicSectorCode omitted because of collinearity.
note: 40.GicSectorCode omitted because of collinearity.
note: 45.GicSectorCode omitted because of collinearity.
note: 50.GicSectorCode omitted because of collinearity.
note: 55.GicSectorCode omitted because of collinearity.
note: 60.GicSectorCode omitted because of collinearity.
note: 2020.Year omitted because of collinearity.

Fixed-effects (within) regression               Number of obs     =      3,683
Group variable: firm                            Number of groups  =        559

R-squared:                                      Obs per group:
     Within  = 0.0605                                         min =          1
     Between = 0.0000                                         avg =        6.6
     Overall = 0.0004                                         max =         11

                                                F(18,3106)        =      11.12
corr(u_i, Xb) = -0.6165                         Prob > F          =     0.0000

----------------------------------------------------------------------------------------------
                        ROA2 | Coefficient  Std. err.      t    P>|t|     [95% conf. interval]
-----------------------------+----------------------------------------------------------------
    EnvironmentalPillarScore |  -.0000365   .0000992    -0.37   0.713    -.0002311    .0001581
   crt_InvestorPressureScore |  -.0000136   7.86e-06    -1.74   0.083     -.000029    1.77e-06
       IndependentBoardScore |   6.55e-06   .0000532     0.12   0.902    -.0000977    .0001108
                  CeoDuality |    .000341   .0028032     0.12   0.903    -.0051554    .0058373
                     FirmAge |  -.0018165   .0005633    -3.22   0.001     -.002921    -.000712
                          at |  -6.71e-08   3.13e-08    -2.15   0.032    -1.28e-07   -5.77e-09
          FinancialLeverage2 |  -.0706042   .0097865    -7.21   0.000    -.0897929   -.0514155
         UNPRISignatoryScore |  -.0062503   .0212308    -0.29   0.768    -.0478782    .0353775
EnvironmentalInnovationScore |   .0000614   .0000674     0.91   0.363    -.0000708    .0001935
                             |
               GicSectorCode |
                         15  |          0  (omitted)
                         20  |          0  (omitted)
                         25  |          0  (omitted)
                         30  |          0  (omitted)
                         35  |          0  (omitted)
                         40  |          0  (omitted)
                         45  |          0  (omitted)
                         50  |          0  (omitted)
                         55  |          0  (omitted)
                         60  |          0  (omitted)
                             |
                        Year |
                       2011  |   .0066602     .00353     1.89   0.059    -.0002612    .0135817
                       2012  |   .0004499   .0034014     0.13   0.895    -.0062193    .0071191
                       2013  |   .0020875   .0033524     0.62   0.534    -.0044856    .0086606
                       2014  |   .0082337   .0033715     2.44   0.015     .0016232    .0148443
                       2015  |  -.0027242   .0034076    -0.80   0.424    -.0094056    .0039573
                       2016  |   .0066789   .0037564     1.78   0.076    -.0006865    .0140442
                       2017  |   .0139444    .004142     3.37   0.001     .0058229    .0220658
                       2018  |   .0193495   .0037429     5.17   0.000     .0120106    .0266884
                       2019  |    .011117   .0037461     2.97   0.003     .0037719    .0184621
                       2020  |          0  (omitted)
                             |
                       _cons |   .2405236   .0186362    12.91   0.000     .2039831    .2770641
-----------------------------+----------------------------------------------------------------
                     sigma_u |  .09701257
                     sigma_e |  .04712117
                         rho |  .80910984   (fraction of variance due to u_i)
----------------------------------------------------------------------------------------------
F test that all u_i=0: F(558, 3106) = 14.69                  Prob > F = 0.0000

fe Model with robust (, fe robust)
F(18,558) = 6.62

Code:

. xtreg ROA2 EnvironmentalPillarScore crt_InvestorPressureScore IndependentBoardScore CeoDuality FirmAge at FinancialLeverage2 
> UNPRISignatoryScore EnvironmentalInnovationScore i.GicSectorCod i.Year, fe robust
note: 15.GicSectorCode omitted because of collinearity.
note: 20.GicSectorCode omitted because of collinearity.
note: 25.GicSectorCode omitted because of collinearity.
note: 30.GicSectorCode omitted because of collinearity.
note: 35.GicSectorCode omitted because of collinearity.
note: 40.GicSectorCode omitted because of collinearity.
note: 45.GicSectorCode omitted because of collinearity.
note: 50.GicSectorCode omitted because of collinearity.
note: 55.GicSectorCode omitted because of collinearity.
note: 60.GicSectorCode omitted because of collinearity.
note: 2020.Year omitted because of collinearity.

Fixed-effects (within) regression               Number of obs     =      3,683
Group variable: firm                            Number of groups  =        559

R-squared:                                      Obs per group:
     Within  = 0.0605                                         min =          1
     Between = 0.0000                                         avg =        6.6
     Overall = 0.0004                                         max =         11

                                                F(18,558)         =       6.62
corr(u_i, Xb) = -0.6165                         Prob > F          =     0.0000

                                                 (Std. err. adjusted for 559 clusters in firm)
----------------------------------------------------------------------------------------------
                             |               Robust
                        ROA2 | Coefficient  std. err.      t    P>|t|     [95% conf. interval]
-----------------------------+----------------------------------------------------------------
    EnvironmentalPillarScore |  -.0000365   .0001551    -0.24   0.814    -.0003413    .0002682
   crt_InvestorPressureScore |  -.0000136   .0000107    -1.27   0.204    -.0000347    7.43e-06
       IndependentBoardScore |   6.55e-06   .0000763     0.09   0.932    -.0001433    .0001564
                  CeoDuality |    .000341   .0036031     0.09   0.925    -.0067364    .0074184
                     FirmAge |  -.0018165   .0009821    -1.85   0.065    -.0037455    .0001125
                          at |  -6.71e-08   3.83e-08    -1.75   0.081    -1.42e-07    8.18e-09
          FinancialLeverage2 |  -.0706042   .0559752    -1.26   0.208    -.1805521    .0393437
         UNPRISignatoryScore |  -.0062503    .008872    -0.70   0.481    -.0236769    .0111763
EnvironmentalInnovationScore |   .0000614   .0000935     0.66   0.512    -.0001222    .0002449
                             |
               GicSectorCode |
                         15  |          0  (omitted)
                         20  |          0  (omitted)
                         25  |          0  (omitted)
                         30  |          0  (omitted)
                         35  |          0  (omitted)
                         40  |          0  (omitted)
                         45  |          0  (omitted)
                         50  |          0  (omitted)
                         55  |          0  (omitted)
                         60  |          0  (omitted)
                             |
                        Year |
                       2011  |   .0066602   .0021041     3.17   0.002     .0025273    .0107932
                       2012  |   .0004499   .0027208     0.17   0.869    -.0048944    .0057942
                       2013  |   .0020875   .0027596     0.76   0.450    -.0033329    .0075079
                       2014  |   .0082337   .0033115     2.49   0.013     .0017292    .0147383
                       2015  |  -.0027242   .0056082    -0.49   0.627      -.01374    .0082917
                       2016  |   .0066789   .0037742     1.77   0.077    -.0007345    .0140922
                       2017  |   .0139444   .0047863     2.91   0.004     .0045431    .0233456
                       2018  |   .0193495   .0039963     4.84   0.000     .0114999    .0271991
                       2019  |    .011117   .0033113     3.36   0.001     .0046129     .017621
                       2020  |          0  (omitted)
                             |
                       _cons |   .2405236    .024762     9.71   0.000     .1918855    .2891616
-----------------------------+----------------------------------------------------------------
                     sigma_u |  .09701257
                     sigma_e |  .04712117
                         rho |  .80910984   (fraction of variance due to u_i)
----------------------------------------------------------------------------------------------

Best,
Robin

Carlo Lazzaro

Join Date: Apr 2014

Posts: 17673
#11

03 May 2022, 01:48

Robin:
1) if you go -xtreg, fe-, time-invariant variables, such as industry (-GicSectorCode- in your model), are wiped out (as expected) by the -fe- estimator: hence, there's no gain in plugging them into the right-hand side of your regression equation;
2) you're right: if you go cluster-robust standard errors (SEs), the denominator degrees of freedom of the F-statistic are the number of clusters (or groups, as you wrote) minus one, here 559 - 1 = 558. Please note that there's nothing sinister in that. In addition, most of your non-default SEs are larger (as expected) than their default counterparts. Stick with them and move forward.

Kind regards,
Carlo
(Stata 19.0)
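
The two denominator degrees of freedom in #10 can be reproduced by hand from the regression headers: 3,683 observations, 559 groups and 18 estimated slope coefficients. The second half of the sketch reads the same quantities from stored results; it assumes the -nlswork- example dataset, not the data in this thread.

```stata
* Default FE: observations - groups - model degrees of freedom
display 3683 - 559 - 18
* Cluster-robust FE: number of clusters - 1
display 559 - 1

* Same quantities from stored results, on illustrative data (an assumption)
webuse nlswork, clear
xtset idcode year
xtreg ln_wage tenure age, fe
display e(N) " " e(N_g) " " e(df_m) " " e(df_r)
xtreg ln_wage tenure age, fe vce(cluster idcode)
display e(N_clust) " " e(df_r)
```

The first two lines display 3106 and 558, matching F(18,3106) and F(18,558) in the two outputs.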
Robin Klomp

Join Date: Apr 2022

Posts: 18
#12

03 May 2022, 05:07

Thank you, Carlo! All clear