  • Diagnostics test in Panel Data Analysis

    Hi everyone!

    I am writing my master's thesis using panel data regression. Before proceeding to estimation, I ran the following diagnostic tests:

    1) normality test
    2) multicollinearity test
    3) heteroskedasticity test.

    Out of these 3 tests, my data did not pass the normality and heteroskedasticity tests. Also, what is the difference between performing the tests before panel data estimation and after it? This forum suggested doing them before, while my supervisor said to perform the tests after the panel data regression.

    Secondly, how can I now deal with non-normality and heteroskedasticity in my panel data? Can you please provide detailed steps for handling these problems?
    I tried to run fixed- and random-effects models with xtreg and the "robust" option to control for heteroskedasticity, and the results are no longer statistically significant. What should I do now? Run a pooled OLS regression?

    I will really appreciate your answers.

    With kind regards,
    Temirlan.
    Last edited by Temirlan Talap; 03 Nov 2021, 16:23.

  • #2
    If a panel, then cluster on the id. If the variable isn't statistically significant, then that's the way the cookie crumbles.

    Code:
    reghdfe Y X1 X2, absorb(ID YEAR) cluster(ID)
    I wouldn't stress about normality, but if it bugs you, then cluster on id then use boottest.

    G
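
    A minimal sketch of that sequence, assuming the community-contributed reghdfe and boottest packages are installed (for example via ssc install) and using the same placeholder names Y, X1, X2, ID and YEAR as in the post above. The replication count and seed are arbitrary choices:

    ```stata
    * Fixed effects for ID and YEAR, standard errors clustered on ID
    reghdfe Y X1 X2, absorb(ID YEAR) cluster(ID)

    * Wild cluster bootstrap test of H0: coefficient on X1 equals zero
    boottest X1, reps(999) seed(12345) nograph
    ```

    The bootstrap p-value does not rely on normally distributed errors, which is why boottest is suggested for the normality concern.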


    • #3
      Temirlan, more information about your data would help others better understand your question. How large are N and T? What are the dependent and independent variables? What type is the dependent variable (continuous, count, or binary)?


      • #4
        Thank you all.

        Dear Fei,

        N is 52 companies and T is 9 years, for 459 observations in total. The dependent variable is the ESG score (on a 1-100 scale), and the independent variables are the financial ratios ROA, ROE and Tobin's Q. I guess the dependent variable is continuous.


        • #5
          Also, I just read that if I have serial correlation and heteroskedasticity, I should run the panel regression with -xtgls-. Is that true?

          This got me really confused. I thought I should run xtreg with -fe- and -re- and use the Hausman test to choose between the models. Now, after the diagnostic tests, it appears that I should run -xtgls-.

          When should the diagnostic tests be done: before or after getting the panel data results? I think everybody is confused about that.


          • #6
            Temirlan, as you have small T and (somewhat) large N, I would recommend -xtreg- with the -fe- option.

            Code:
            xtset companyid year
            xtreg ESGscore ROA ROE TobinQ i.year, fe vce(cluster companyid)
            I feel it's not that important to test heteroskedasticity, etc. I would simply use methods that are robust to these issues, as above. I would not use -xtreg- with -re- (so will not bother to use Hausman test), as panel unobserved heterogeneity can hardly be uncorrelated with covariates. I would not recommend -xtgls- either because it's not robust enough to general forms of autocorrelation and further requires relatively small N -- not your case.


            • #7
              Thank you very much Fei.

              One more question regarding your xtreg command: what do i.year and vce(cluster companyid) mean?


              • #8
                Dear Fei,

                One more thing: if you suggest not doing the Hausman test, what rationale can I give in my thesis? The Hausman test seems to be obligatory.


                • #9
                  xtreg is a regression command for panel data. i.year tells Stata to include year dummies. vce(cluster companyid) tells Stata to use standard errors clustered at the company level, which makes them robust to both heteroskedasticity and autocorrelation within a company.

                  For me, I prefer reghdfe since you can have multiple fixed effects and absorb both companyid and year.

                  Code:
                  reghdfe Y X1 X2, absorb(ID YEAR) cluster(ID)


                  • #10
                    Thanks, George.

                    To Temirlan: Hausman test is not necessary to justify the use of fixed effect estimation. I would argue that it's harmless to use FE but RE could be very wrong when panel unobserved heterogeneity is correlated with regressors.


                    • #11
                      Thank you very much George and Fei.

                      So, is it better to use -reghdfe- in my regression analysis? I am asking because xtreg and reghdfe give me different results.


                      • #12
                        Shouldn't be different, except for maybe the t-stats (due to the absorption of the year thus altering the DF).
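
                        A quick way to check this is to run both commands on the same specification and compare them side by side. The sketch below uses Stata's nlswork example dataset rather than the thesis data, and assumes reghdfe is installed (ssc install reghdfe, which also needs ftools). The xtreg line must include i.year to match absorb(idcode year); leaving the year dummies out of one of the two models is a common reason for differing results.

                        ```stata
                        webuse nlswork, clear
                        xtset idcode year

                        * Two-way fixed effects via xtreg: panel FE plus year dummies
                        xtreg ln_wage tenure ttl_exp i.year, fe vce(cluster idcode)
                        estimates store m_xtreg

                        * Same model via reghdfe, absorbing both sets of fixed effects
                        reghdfe ln_wage tenure ttl_exp, absorb(idcode year) vce(cluster idcode)
                        estimates store m_reghdfe

                        * Side-by-side coefficients and standard errors
                        estimates table m_xtreg m_reghdfe, keep(tenure ttl_exp) b(%9.5f) se(%9.5f)
                        ```

                        The slope coefficients should agree. The standard errors can differ slightly because reghdfe drops singleton groups and applies a different degrees-of-freedom adjustment.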


                        • #13
                          George,

                          Can I please clarify one point about the regression? In my regression, the dependent variable is ROE, and the independent variables are ESG_total, ENV, SOC and GOV. When I run the regression, should I include each independent variable separately?

                          I ask because when I run "reghdfe ROE ESG_total" on its own, the relationship is positive, but when I run "reghdfe ROE ESG_total ENV SOC GOV" together, the relationship between ROE and ESG_total is negative and the p-values are different.

                          What is the correct way to regress in such case?

                          Thank you very much.

                          With kind regards,
                          Temirlan.


                          • #14
                            When you change the model, you change the results if the independent variables are correlated. The richer model is better if you believe the additional variables are relevant (does theory say so?). But that's got nothing to do with reghdfe.

                            Absolutely do not run the variables separately. This is the purpose of multivariate regression: getting each coefficient under the ceteris paribus assumption.
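
                            One way to see why the sign can flip is to look at how strongly the regressors move together. A hedged sketch using the variable names from the question above (ID and YEAR are placeholder names for the company and year identifiers, and reghdfe is assumed to be installed):

                            ```stata
                            * Pairwise correlations among the ESG measures
                            correlate ESG_total ENV SOC GOV

                            * Total score on its own versus alongside the other scores
                            reghdfe ROE ESG_total, absorb(ID YEAR) vce(cluster ID)
                            reghdfe ROE ESG_total ENV SOC GOV, absorb(ID YEAR) vce(cluster ID)
                            ```

                            If the correlations are high, the coefficient on ESG_total in the second model measures its association with ROE holding ENV, SOC and GOV fixed, which is a different quantity from the coefficient in the first model.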


                            • #15
                              George,
                              let me explain: the purpose of my paper is to test the relationship between ROA and total ESG. ESG in turn consists of Environment, Social and Governance scores, which I want to test separately in my research. So I think I should run a regression for each variable separately.

                              Additionally, can you please advise on the Hausman test?

                              Because all the papers in my literature used the Hausman test, I want to be in line with them and do it as well. My approach is to run xtreg with -fe- and -re-, then run -hausman- and choose. Is that correct? Should I take heteroskedasticity and serial correlation into account when I run xtreg with -fe- and -re-? For example, I read that I can add robust to xtreg and use -xtoverid- to choose between fixed and random effects.

                              Sorry, I am a complete beginner in statistics and am doing this for the first time.
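
                              The two approaches described in this post can be sketched as follows. This uses Stata's nlswork example dataset with illustrative regressors, not the thesis data, and assumes the community-contributed xtoverid command (and the packages it depends on) is installed from SSC:

                              ```stata
                              webuse nlswork, clear
                              xtset idcode year

                              * Classic Hausman test: requires the default (non-robust) variance estimates
                              xtreg ln_wage tenure ttl_exp, fe
                              estimates store fe
                              xtreg ln_wage tenure ttl_exp, re
                              estimates store re
                              hausman fe re

                              * Cluster-robust alternative: run xtoverid right after RE with clustered SEs
                              xtreg ln_wage tenure ttl_exp, re vce(cluster idcode)
                              xtoverid
                              ```

                              In both tests, a small p-value is evidence against the random-effects specification. The hausman command does not accept estimates computed with vce(robust) or vce(cluster), which is why the robust variant goes through xtoverid instead.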
