  • Normalization in panel data

    I have balanced panel data for 1600 companies for 8 years and 6 variables. Do I need to check for normality?
    I applied the Hausman test without normalizing it, and it suggested using a fixed-effect model.

  • #2
    Pranshu:
    what follows holds for cross-sectional datasets, too: normality is a (weak) requirement for epsilon (and u, in panel data analysis) only.
    Therefore, you do not have to test or adjust for normality and can live with your data as they are.
    In addition, with 1600 panels, you should switch from default to cluster-robust standard errors (just add the -robust- or the -vce(cluster idcode)- option; they do the very same job under -xtreg-).
    As -hausman- does not allow non-default standard errors, you should test the -re- (only) specification via the community-contributed module -xtoverid- (its null is that -re- is the way to go).
    Otherwise, having a balanced panel dataset you can test the -fe- vs. the -re- specification via the Mundlak approach (see https://blog.stata.com/2015/10/29/fi...ndlak-approach).
    Kind regards,
    Carlo
    (Stata 19.0)
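
    The point that -robust- and -vce(cluster idcode)- do the same job under -xtreg- can be checked with a minimal sketch. It assumes Stata's nlswork example dataset (not the poster's data), whose panel identifier is idcode; the stored-estimate names fe_robust and fe_cluster are made up for illustration.

    ```stata
    * Sketch only: uses the nlswork example dataset, not the poster's data
    webuse nlswork, clear
    xtset idcode year

    * Fixed effects with -vce(robust)-
    xtreg ln_wage age tenure, fe vce(robust)
    estimates store fe_robust

    * Fixed effects with standard errors clustered on the panel identifier
    xtreg ln_wage age tenure, fe vce(cluster idcode)
    estimates store fe_cluster

    * Side-by-side coefficients and standard errors; the two columns should coincide
    estimates table fe_robust fe_cluster, se
    ```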

    • #3
      Thanks, Mr. Lazzaro.
      If I am not wrong in interpreting what you have suggested, you mean that we cannot run the Hausman test with clustered standard errors, so we should go with

      xtreg y x1 x2 x3, re vce(cluster id)
      xtoverid

      OR
      bysort id: egen mean_x2 = mean(x2)
      bysort id: egen mean_x3 = mean(x3)
      quietly xtreg y x1 x2 x3 mean_x2 mean_x3, vce(robust)
      estimates store mundlak
      test mean_x2 mean_x3
      ( 1) mean_x2 = 0
      ( 2) mean_x3 = 0 (credit:https://blog.stata.com/2015/10/29/fi...dlak-approach/)
      and they do the same job.
      Last edited by Pranshu Tripathi; 15 Nov 2022, 05:17.

      • #4
        Pranshu:
        correct.
        The more efficient code is the one that includes -xtoverid-. Unfortunately, being glorious but a bit old-fashioned, -xtoverid- does not support -fvvarlist- notation. The usual fix is to prefix your -xtreg, re- code with -xi:- for categorical variables and to create interactions by hand.
        As an aside, please call me Carlo, as all on (and many more off) this forum do. Thanks.
        Kind regards,
        Carlo
        (Stata 19.0)
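
        A minimal sketch of the -xi:- workaround described above, assuming Stata's nlswork example dataset (not the poster's data) and that the community-contributed -xtoverid- is already installed (for instance from SSC, possibly along with the packages it depends on). The choice of regressors is purely illustrative.

        ```stata
        * Sketch only: nlswork is an example dataset; race is its categorical variable
        webuse nlswork, clear
        xtset idcode year

        * -xi:- expands i.race into dummies, so no factor-variable notation
        * reaches -xtoverid-
        xi: xtreg ln_wage age tenure i.race, re vce(cluster idcode)

        * Null hypothesis: the -re- specification is appropriate
        xtoverid
        ```

        A rejection of the null would point toward the -fe- specification, in line with the reading of -xtoverid- given earlier in the thread.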

        • #5
          Thanks, Carlo, for clearing my doubts.

          • #6
            Hey Carlo,
            I have a third-order interaction as an explanatory variable.
            It has two categorical and one continuous variable.
            To find the appropriate model, I am running the code given below:

            xi:xtreg inv tobq size tang lev cash categoricaldummy1 categoricaldummy2 categoricaldummy1categoricaldummy2 tobqcategoricaldummy1 tobqcategoricaldummy2 tobqcategoricaldummy1categoricaldummy2, re vce(cluster id)

            xtoverid


            Is this the correct code to select the appropriate model?

            And can I do this with clustering by time also?

            • #7
              Pranshu:
              1) -xi:- handles categorical variables only;
              2) interactions should be created by hand as in the following toy-example:
              Code:
              . use "C:\Program Files\Stata17\ado\base\a\auto.dta"
              (1978 automobile data)
              
              . g mpg_weight= mpg*weight
              
              . regress price mpg weight mpg_weight
              
                    Source |       SS           df       MS      Number of obs   =        74
              -------------+----------------------------------   F(3, 70)        =     13.11
                     Model |   228430463         3  76143487.7   Prob > F        =    0.0000
                  Residual |   406634933        70  5809070.47   R-squared       =    0.3597
              -------------+----------------------------------   Adj R-squared   =    0.3323
                     Total |   635065396        73  8699525.97   Root MSE        =    2410.2
              
              ------------------------------------------------------------------------------
                     price | Coefficient  Std. err.      t    P>|t|     [95% conf. interval]
              -------------+----------------------------------------------------------------
                       mpg |   396.7844   185.2023     2.14   0.036     27.41003    766.1587
                    weight |   5.067008   1.378057     3.68   0.000      2.31856    7.815455
                mpg_weight |  -.1916795   .0711936    -2.69   0.009    -.3336706   -.0496885
                     _cons |  -5944.881   4525.706    -1.31   0.193    -14971.12    3081.356
              ------------------------------------------------------------------------------
              which, as expected, gives back the very same results as:
              Code:
              . regress price c.mpg##c.weight
              
                    Source |       SS           df       MS      Number of obs   =        74
              -------------+----------------------------------   F(3, 70)        =     13.11
                     Model |   228430463         3  76143487.7   Prob > F        =    0.0000
                  Residual |   406634933        70  5809070.47   R-squared       =    0.3597
              -------------+----------------------------------   Adj R-squared   =    0.3323
                     Total |   635065396        73  8699525.97   Root MSE        =    2410.2
              
              --------------------------------------------------------------------------------
                       price | Coefficient  Std. err.      t    P>|t|     [95% conf. interval]
              ---------------+----------------------------------------------------------------
                         mpg |   396.7844   185.2023     2.14   0.036     27.41003    766.1587
                      weight |   5.067008   1.378057     3.68   0.000      2.31856    7.815455
                             |
              c.mpg#c.weight |  -.1916795   .0711936    -2.69   0.009    -.3336706   -.0496885
                             |
                       _cons |  -5944.881   4525.706    -1.31   0.193    -14971.12    3081.356
              --------------------------------------------------------------------------------
              
              .
              3) n-way clustering is not supported by -xtoverid- (nor by -xtreg-). You may want to keep -i.timevar- as a predictor, though.
              Kind regards,
              Carlo
              (Stata 19.0)
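
              Points 2) and 3) can be combined in a rough sketch for a third-order interaction between one continuous and two 0/1 variables. It assumes Stata's nlswork example dataset (not the poster's data), with tenure as the continuous variable and union and south as the dummies; all generated variable names are hypothetical, and -xtoverid- is assumed to be installed.

              ```stata
              * Sketch only: nlswork is an example dataset, not the poster's data
              webuse nlswork, clear
              xtset idcode year

              * Build every two-way product and the three-way product by hand
              generate union_south     = union*south
              generate ten_union       = tenure*union
              generate ten_south       = tenure*south
              generate ten_union_south = tenure*union*south

              * -xi:- expands i.year into time dummies (time enters as a predictor,
              * not as a second clustering dimension)
              xi: xtreg ln_wage tenure union south union_south ten_union ten_south ///
                  ten_union_south i.year, re vce(cluster idcode)

              * Null hypothesis: the -re- specification is appropriate
              xtoverid
              ```

              All lower-order terms are kept alongside the three-way product, mirroring what the c.mpg##c.weight notation does automatically in the toy example.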

              • #8
                Thanks, Carlo, for bearing with me.

                • #9
                  Pranshu:
                  no bearing with you at all; we're simply rehearsing together!
                  Kind regards,
                  Carlo
                  (Stata 19.0)
