Normalization in panel data

Pranshu Tripathi

Join Date: Oct 2022

Posts: 62
#1

15 Nov 2022, 03:23

I have balanced panel data for 1600 companies for 8 years and 6 variables. Do I need to check for normality?
I applied the Hausman test without normalizing it, and it suggested using a fixed-effect model.
Carlo Lazzaro

Join Date: Apr 2014

Posts: 17673
#2

15 Nov 2022, 04:03

Pranshu:
what follows holds for cross-sectional datasets, too: normality is a (weak) requirement for epsilon (and u, in panel data analysis) only.
Therefore, you do not have to test/adjust for normality and can live with your data as they are.
In addition, with 1600 panels, you should switch from default to cluster-robust standard errors (just add the -robust- or the -vce(cluster idcode)- option; they do the very same job under -xtreg-).
As -hausman- does not allow non-default standard errors, you should test the -re- (only) specification via the community-contributed module -xtoverid- (its null is that -re- is the way to go).
Otherwise, having a balanced panel dataset, you can test the -fe- vs. the -re- specification via the Mundlak approach (see https://blog.stata.com/2015/10/29/fi...ndlak-approach).
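
The point above that -robust- and -vce(cluster idcode)- do the same job under -xtreg- can be checked with a minimal sketch. It assumes the -nlswork- example dataset that -webuse- can download, and the regressors (tenure, age) are picked for illustration only; they are not the variables from the original question.

Code:

```stata
* Illustrative only: nlswork and these regressors are assumptions, not the poster's data
webuse nlswork, clear
xtset idcode year

* Fixed effects with -vce(robust)-
quietly xtreg ln_wage tenure age, fe vce(robust)
matrix V_robust = e(V)

* Same model with standard errors clustered on the panel identifier
quietly xtreg ln_wage tenure age, fe vce(cluster idcode)
matrix V_cluster = e(V)

* Relative difference between the two variance matrices
display mreldif(V_robust, V_cluster)
```

If the two options are indeed equivalent here, the displayed relative difference should be zero.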

Kind regards,
Carlo
(Stata 19.0)
Pranshu Tripathi

Join Date: Oct 2022

Posts: 62
#3

15 Nov 2022, 05:12

Thanks, Mr. Lazzaro.
If I am interpreting your suggestion correctly, you mean that we cannot run the Hausman test with clustered standard errors, so we should go with

xtreg y x1 x2 x3, re vce(cluster id)
xtoverid

OR
bysort id: egen mean_x2 = mean(x2)
bysort id: egen mean_x3 = mean(x3)
quietly xtreg y x1 x2 x3 mean_x2 mean_x3, vce(robust)
estimates store mundlak
test mean_x2 mean_x3
( 1) mean_x2 = 0
( 2) mean_x3 = 0
(credit: https://blog.stata.com/2015/10/29/fi...dlak-approach/)
and they do the same job.

Last edited by Pranshu Tripathi; 15 Nov 2022, 05:17.
Carlo Lazzaro

Join Date: Apr 2014

Posts: 17673
#4

15 Nov 2022, 06:16

Pranshu:
correct.
The more efficient code is the one that includes -xtoverid-. Unfortunately, being glorious but a bit old-fashioned, -xtoverid- does not support -fvvarlist- notation. The usual fix is to prefix your -xtreg, re- code with -xi:- for categorical variables and to create interactions by hand.
As an aside, please call me Carlo, as all on (and many more off) this forum do. Thanks.

Kind regards,
Carlo
(Stata 19.0)
Pranshu Tripathi

Join Date: Oct 2022

Posts: 62
#5

15 Nov 2022, 06:21

Thanks, Carlo, for clearing my doubts.
Pranshu Tripathi

Join Date: Oct 2022

Posts: 62
#6

16 Nov 2022, 05:53

Hey Carlo,
I have a third-order interaction as an explanatory variable.
It combines two categorical variables and one continuous variable.
To find the appropriate model, I am proceeding as follows:

xi:xtreg inv tobq size tang lev cash categoricaldummy1 categoricaldummy2 categoricaldummy1categoricaldummy2 tobqcategoricaldummy1 tobqcategoricaldummy2 tobqcategoricaldummy1categoricaldummy2, re vce(cluster id)

xtoverid

Is this the correct code to select the appropriate model?

And can I also cluster by time?

Carlo Lazzaro

Join Date: Apr 2014
Posts: 17673
#7

16 Nov 2022, 07:06

Pranshu:
1) -xi:- handles categorical variables only;
2) interactions should be created by hand, as in the following toy example:

Code:

. use "C:\Program Files\Stata17\ado\base\a\auto.dta"
(1978 automobile data)

. g mpg_weight= mpg*weight

. regress price mpg weight mpg_weight

      Source |       SS           df       MS      Number of obs   =        74
-------------+----------------------------------   F(3, 70)        =     13.11
       Model |   228430463         3  76143487.7   Prob > F        =    0.0000
    Residual |   406634933        70  5809070.47   R-squared       =    0.3597
-------------+----------------------------------   Adj R-squared   =    0.3323
       Total |   635065396        73  8699525.97   Root MSE        =    2410.2

------------------------------------------------------------------------------
       price | Coefficient  Std. err.      t    P>|t|     [95% conf. interval]
-------------+----------------------------------------------------------------
         mpg |   396.7844   185.2023     2.14   0.036     27.41003    766.1587
      weight |   5.067008   1.378057     3.68   0.000      2.31856    7.815455
  mpg_weight |  -.1916795   .0711936    -2.69   0.009    -.3336706   -.0496885
       _cons |  -5944.881   4525.706    -1.31   0.193    -14971.12    3081.356
------------------------------------------------------------------------------

which, as expected, gives back the very same results as:

Code:

. regress price c.mpg##c.weight

      Source |       SS           df       MS      Number of obs   =        74
-------------+----------------------------------   F(3, 70)        =     13.11
       Model |   228430463         3  76143487.7   Prob > F        =    0.0000
    Residual |   406634933        70  5809070.47   R-squared       =    0.3597
-------------+----------------------------------   Adj R-squared   =    0.3323
       Total |   635065396        73  8699525.97   Root MSE        =    2410.2

--------------------------------------------------------------------------------
         price | Coefficient  Std. err.      t    P>|t|     [95% conf. interval]
---------------+----------------------------------------------------------------
           mpg |   396.7844   185.2023     2.14   0.036     27.41003    766.1587
        weight |   5.067008   1.378057     3.68   0.000      2.31856    7.815455
               |
c.mpg#c.weight |  -.1916795   .0711936    -2.69   0.009    -.3336706   -.0496885
               |
         _cons |  -5944.881   4525.706    -1.31   0.193    -14971.12    3081.356
--------------------------------------------------------------------------------

.

3) multi-way clustering is not supported by -xtoverid- (nor by -xtreg-). You may want to keep -i.timevar- as a predictor, though.
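
Putting the three points together, a hedged sketch of the workflow for a third-order interaction (two dummies and one continuous variable) might look as follows. It assumes the -nlswork- example dataset; union, south and tenure merely stand in for the two categorical dummies and the continuous variable, and all generated variable names are hypothetical. -xtoverid- is community-contributed and must be installed first (it may also require -ivreg2- and -ranktest- from SSC).

Code:

```stata
* Illustrative only: nlswork and these variables are stand-ins, not the poster's data
* ssc install xtoverid   // community-contributed; may also need ivreg2 and ranktest
webuse nlswork, clear
xtset idcode year

* Interactions created by hand, since -xtoverid- does not accept factor-variable notation
generate union_south     = union*south
generate ten_union       = tenure*union
generate ten_south       = tenure*south
generate ten_union_south = tenure*union*south

* -xi:- expands i.year into year dummies; clustering is on the panel identifier only
xi: xtreg ln_wage tenure union south union_south ten_union ten_south ///
    ten_union_south i.year, re vce(cluster idcode)

* Null hypothesis: the -re- specification is appropriate
xtoverid
```

Here the year dummies take the place of clustering by time, in line with point 3.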

Kind regards,
Carlo
(Stata 19.0)


Pranshu Tripathi

Join Date: Oct 2022

Posts: 62
#8

16 Nov 2022, 11:01

Thanks, Carlo, for bearing with me.
Carlo Lazzaro

Join Date: Apr 2014

Posts: 17673
#9

16 Nov 2022, 11:27

Pranshu:
no bearing with you at all; we're simply rehearsing together!

Kind regards,
Carlo
(Stata 19.0)