
  • Assumptions Fixed Effects

    Dear Community,

    I hope this is the right place to ask this question, but I have received so much help here that I hope I will find someone who can answer this very basic question, as I am still a Stata beginner...

    So my problem is this: I have panel data, I performed the Hausman test, and it indicated that I should use a fixed-effects model.

    Then I checked some assumptions that I thought have to be fulfilled for any model I want to use, and since they turned out not to be fulfilled, I asked myself:

    1. Are these the right assumptions for a fixed-effects model?
    2. If they are the right ones and they are not fulfilled, what can I do to still be able to use my results?

    What I did is the following:


    *** 1. Deleting outliers that influence my data

    I ran avplots and deleted outliers that I saw were influencing my results. As a basis for this, I performed a plain OLS regression with all the variables I want to use in my final regression (sketched below).
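
    In a minimal sketch (y and x1-x3 are placeholder names, not my actual variables), this step looked roughly like:

    regress y x1 x2 x3 // plain OLS with all variables of the final model
    avplots // added-variable plots to spot observations that drive the fit
    * the observations that clearly stood out in the plots were then dropped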

    *** 2. Checking Normality of Residuals

    I ran an xtreg regression (fixed effects) and then performed:

    predict rs, e // the -e- option is needed after -xtreg, fe-: without it, -predict- returns the linear prediction, not the residuals
    kdensity rs, normal // kernel density of the residuals with a normal overlay
    pnorm rs // standardized normal probability plot
    swilk rs // Shapiro-Wilk test for normality

    * I saw that the residuals are not normally distributed

    *** 3. Homoscedasticity tests

    I performed a plain regression with all the variables I want to use as a basis and then ran:

    rvfplot, yline(0) // residual-versus-fitted plot shows a negative linear pattern
    estat imtest, white // White's test for heteroscedasticity
    estat hettest // Breusch-Pagan test indicates that we cannot accept the constant-variance assumption
    * I saw that there is heteroscedasticity in the data

    *** 4. Multicollinearity
    vif // variance inflation factors (works after -regress-, not after -xtreg-)
    * seems to be okay

    So, can I still use my data, or do you know any method that would allow me to get normally distributed residuals and data without heteroscedasticity?

    Thank you so much in advance!!!

    Btw: I googled all this before, but the answers in other forums are a bit confusing to me and I still don't know what to do, which is why this is my last hope.



    Best

    Lisa

  • #2
    1. Deleting outliers is usually a bad idea. It is an especially bad idea if you are deleting outliers on the outcome variable. If it is predictor variables and they are really distorting the modeling results, then perhaps separate models for inlying and outlying values are a reasonable way to go. Sometimes a transformation of the variable solves the problem (see the sketch after point 4). But in any case it is best to try to understand why the outliers are there and what they mean, rather than just arbitrarily cutting them out.

    2. Normality of residuals is not important if your sample size is reasonably large: the central limit theorem implies that the regression coefficients really will be normally distributed around the true values to a good approximation. If your sample size is small, then the tests you used are not particularly sensitive and may not help you anyway.

    3. Heteroscedasticity is best dealt with by just using the -vce(robust)- option in your regression (see the sketch after point 4). Again, the data set needs to be reasonably large for this to work properly (in particular, the number of panels should not be too small).

    4. Multicollinearity. Almost always unimportant unless it involves your main predictor variable(s), and in that case, there isn't really any good solution to the problem. See Arthur Goldberger's Introductory Econometrics book for more information about why investigating multicollinearity is a waste of time.
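
    To illustrate the transformation idea from point 1 (only a sketch; y, x1, x2, and x3 are placeholder names, and whether a log transform is sensible depends on your data):

    generate ln_x1 = ln(x1) // compress a heavily right-skewed predictor instead of deleting its outliers
    xtreg y ln_x1 x2 x3, fe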
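
    And the sketch mentioned in point 3, again with placeholder variable names:

    xtreg y x1 x2 x3, fe vce(robust)
    * after -xtreg, fe- the -vce(robust)- option is computed as cluster-robust on the panel identifier, which is why the number of panels should not be too small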



    • #3
      Thank you so much for that very quick answer, Clyde Schechter!



      • #4
        Clyde, what is "reasonably large" for normality and the -vce(robust)- option that you suggested in points 2 and 3?



        • #5
          With a sample size of even 30, in most situations the central limit theorem will provide the assurances that one seeks from normality--only severely skewed distributions still fail at that size. By the time you reach a sample size of 100, for practical purposes, the central limit theorem will always rescue the regression. (That is not to say one could not contrive a bizarre example where it fails, but such things rarely arise in real life.)
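
          As a toy illustration of that point (the sample size, coefficients, and error distribution below are arbitrary choices), one can simulate the sampling distribution of an OLS slope with n = 30 and deliberately non-normal errors:

          clear all
          set seed 12345
          program define simslope, rclass
              clear
              set obs 30
              generate x = rnormal()
              generate y = 1 + 2*x + (rchi2(1) - 1) // heavily skewed, decidedly non-normal errors
              regress y x
              return scalar b = _b[x]
          end
          simulate b = r(b), reps(2000) nodots: simslope
          histogram b, normal // compare the simulated sampling distribution of the slope with a normal density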

          As for robust variance estimates, the most permissive guideline I have seen says 15 is enough, but most people prefer a larger number such as 30 or 50 or even 100.



          • #6
            Does it not bother you, Clyde, that there is such huge subjectivity? How can maths be subjective? In these cases it seems like you can bend the rules to get the results you want. Shouldn't there be a clear-cut definition of what counts as large?



            • #7
              Devashish:
              Disappointing as it may sound, the same issue creeps up with the -ttest- requirements (see "Analysis of cost data in randomized trials: an application of the non-parametric bootstrap", PubMed (nih.gov), page 3223).
              For more on clustered standard errors, see https://cameron.econ.ucdavis.edu/res...5_February.pdf.
              Kind regards,
              Carlo
              (StataNow 18.5)



              • #8
                Originally posted by Devashish Singh:
                Does it not bother you [...] that there is such huge subjectivity? How can maths be subjective?
                Math is hardly subjective. Econometrics is not math, though. Much more generally speaking, one of the cornerstones of science is inter-subjectivity, not objectivity. Whether the latter can be achieved, or even exists in the first place, has been discussed by various philosophers.

