  • Fixed Effects and Clustering the Standard Error

    Hello,

    I need your help!

    I am trying to understand the meaning of fixed effects and of clustering standard errors.

    I have a regression with a dependent variable and several independent variables: reg y x1 x2 x3 x4 x5 x6 i.date i.country. Why do I need these fixed effects?
    My professor also said I have to cluster by country and year, but I do not understand the purpose of clustering the standard errors by country and year!

    I hope somebody can help me!
    Thanks in Advance!

  • #2
    Jenny:
    welcome to the list.
    If you're dealing with panel data and you use -regress- (which is usually not your best bet with such a data structure; see -xtreg- instead), you should -cluster- your standard errors on your -panelid- (probably -country- in your case), as your observations are not independent.
    Kind regards,
    Carlo
    (Stata 19.0)
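
    As a minimal sketch of the two approaches mentioned above, and assuming the -nlswork- example dataset shipped with Stata as a stand-in for a country-year panel (the variables -idcode-, -year-, -ln_wage-, -age- and -tenure- belong to that dataset, not to Jenny's data), the commands could look like this:

    ```stata
    * Illustrative only: nlswork stands in for a country-year panel
    webuse nlswork, clear

    * Pooled OLS: cluster on the panel identifier, because repeated
    * observations on the same unit are not independent
    regress ln_wage age tenure i.year, vce(cluster idcode)

    * Panel estimator: declare the panel structure, then fit fixed effects
    xtset idcode year
    xtreg ln_wage age tenure i.year, fe vce(cluster idcode)
    ```

    With country-level data, -idcode- would presumably be replaced by the country identifier.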



    • #3
      You can get some ideas about clustering by looking at Figures 1 (at firm level) and 6 (at both firm and year level) of the following paper: https://academic.oup.com/rfs/article...nce-Panel-Data.

      Ho-Chuan (River) Huang
      Stata 19.0, MP(4)



      • #4
        It's mostly standard to cluster only by your panelid, in this case country. However, I have heard that people now also cluster by both panelid and timevar, in this case country and year. I personally have never done it and have never seen an explanation of how to do it.

        To run a two-way fixed-effects model with clustered standard errors, the code in your case would be:

        Code:
        xtset country year
        xtreg y x i.year, fe vce(cluster country)



        • #5
          So far as I know,
          Code:
          xtreg y x i.year, fe vce(robust)
          xtreg y x i.year, fe vce(cluster country)
          produce identical (cluster-robust) standard errors.
          Last edited by River Huang; 05 Sep 2017, 19:45.
          Ho-Chuan (River) Huang
          Stata 19.0, MP(4)
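
          One way to check this claim side by side is sketched below. It assumes the -nlswork- example dataset shipped with Stata (so -idcode- plays the role of -country-), and the stored-estimate names -fe_rob- and -fe_clu- are arbitrary labels chosen for illustration:

          ```stata
          * Illustrative only: nlswork stands in for the poster's panel
          webuse nlswork, clear
          xtset idcode year

          * Fixed effects with vce(robust)
          xtreg ln_wage age tenure i.year, fe vce(robust)
          estimates store fe_rob

          * Fixed effects with standard errors clustered on the panel variable
          xtreg ln_wage age tenure i.year, fe vce(cluster idcode)
          estimates store fe_clu

          * Show coefficients and standard errors of both fits in one table
          estimates table fe_rob fe_clu, se keep(age tenure)
          ```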



          • #6
            To cluster at both the firm (id) and year (year) level, see the example below (please first run ssc install reghdfe):
            Code:
            reghdfe y x1 x2, absorb(id year) vce(cluster id year)
            Ho-Chuan (River) Huang
            Stata 19.0, MP(4)
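
            A runnable variant of the same idea, sketched on the -nlswork- example dataset shipped with Stata (an assumption made purely for illustration: -idcode- takes the place of the firm identifier), might be:

            ```stata
            * reghdfe is community-contributed; install it once from SSC
            * (it may also require the ftools package):
            * ssc install ftools
            * ssc install reghdfe

            * Illustrative only: nlswork stands in for a firm-year panel
            webuse nlswork, clear

            * Absorb unit and year fixed effects, cluster on both dimensions
            reghdfe ln_wage age tenure, absorb(idcode year) vce(cluster idcode year)
            ```

            Note that two-way clustering relies on having a reasonable number of clusters in each dimension, so results with only a handful of years should be read with caution.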



            • #7
              Thanks for help and the quick responses!

              Can I also use the command xtivreg2, or ivreg2 y x1 x2 x3 ... i.year i.country, cluster(country date)?
              Last edited by Jenny Seinsch; 06 Sep 2017, 00:05.



              • #8
                Jenny:
                I do not think that -xtivreg2- can accommodate two-way clustering, as you can see from the following toy example:
                Code:
                . use http://fmwww.bc.edu/ec-p/data/macro/abdata.dta
                (Layard & Nickell, Unemployment in Britain, Economica 53, 1986 from Ox dist)
                
                . tsset id year
                       panel variable:  id (unbalanced)
                        time variable:  year, 1976 to 1984
                                delta:  1 unit
                
                .  xtivreg2 ys k (n=l2.n l3.n), fe small cluster( ind yr1980)
                cluster():  too many variables specified
                r(103);
                However, the main point is that you seem to consider -ivreg2- equivalent to -xtivreg2-.
                Let's set aside instrumental variables for a while.
                Your original post focused on two-way clustered standard errors using -regress- with panel data (which should not be your first choice, since -xtreg- is usually better suited than -regress- to panel data).
                If you intend to use -regress- with panel data, clustering your standard errors on -panelid- is mandatory. Otherwise, Stata would not be informed that you're analysing non-independent observations.
                Things are different with -xtreg-, as you are required to -xtset- your data first: hence, Stata knows from the start that you're dealing with panel data.
                Hence, clustering/robustifying standard errors with -xtreg- is not mandatory: it makes sense if you suspect heteroskedasticity and/or autocorrelation in your data (by the way, the latter is quite immaterial as long as you're dealing with a large N, small T panel dataset, as frequently appears to be the case on this forum); otherwise, default standard errors are enough.
                Kind regards,
                Carlo
                (Stata 19.0)



                • #9
                  Hi Carlo,
                  My problem is that if I run any of the following:
                  • cluster2 y x x x x, fcluster(country) tcluster(date)
                  • reghdfe y x x x, a(country date) vce(cluster country date)
                  • xtreg y x x x x i.date, fe vce(cluster country)
                  I do not get many statistically significant explanatory variables, yet I discuss them in my work.
                  I am completely new to Stata and do not yet understand all this stuff. For example, if I use only fixed effects without clustering, I get good models with a lot of statistically significant variables. Hence my question: what is the purpose of clustering? Do I really need it?

                  Thanks a lot!


                  • #10
                    Jenny:
                    some remarks about your last query:
                    - statistical significance is usually oversold (an opinion widely shared on this list). You would do better to look at the 95% confidence intervals of the coefficients instead.
                    Besides, a lack of statistical significance can stem from several issues, the most trivial being a limited sample size;
                    - clustering is mandatory when you have panel data and you (often wrongly) decide to analyze them via a regression model conceived for cross-sectional (one-wave) data (say, -regress-; -logit-; -poisson-). Otherwise, Stata would not be informed that you're analysing non-independent observations;
                    - if you're dealing with panel data and (as is often correct) you use one of the -xt- suite models (say, -xtreg-; -xtlogit-; -xtpoisson-), clustering (if feasible) should be considered when you suspect heteroskedasticity and/or autocorrelation in your data.

                    I would recommend that you discuss these topics with your supervisor before starting your statistical analysis, to avoid wasting time and panicking as the deadline gets nearer.
                    Kind regards,
                    Carlo
                    (Stata 19.0)



                    • #11
                      Originally posted by River Huang
                      To cluster at both firm (id) and year (year) level, see below for an example: (please first ssc install reghdfe):
                      Code:
                      reghdfe y x1 x2, a(id year) vce(cluster id year)
                      Dear Carlo and River,

                      Your posts in this thread are really helpful. I still have a question about the implementation in Stata.
                      I was trying to include two types of fixed effects and to cluster on both dimensions. When I implemented this in Stata, an error message popped up (see attachment).
                      Do you know how to resolve this?

                      FYI, the panelid is industries (industrynr is the actual variable) and the timevar is Observation, which runs from 1 to 242 for each industry. I did it this way because the dates had both a year and a month component, so I converted them into a running observation number for each industry, which seems to work.

                      Thank you in advance.

                      Regards,

                      Daniel
                      Attached Files



                      • #12
                        Daniel:
                        you might be interested in the following thread: https://www.statalist.org/forums/for...ects-undefined.
                        Kind regards,
                        Carlo
                        (Stata 19.0)



                        • #13
                          Dear Carlo,

                          Thank you for your help. I didn't see that thread before because I am new to this forum.
                          The problem is resolved. Appreciate it!

                          Kind regards,

                          Daniel



                          • #14
                            Dear Carlo
                            I have a question, please.

                            If I use
                            xtreg y l.y x1 x2.....x15 i.year i.country i.industry, robust cl(company_id)

                            (5 years, unbalanced panel, 379 observations)
                            does it mean that I am using mixed effects plus random effects together?




                            • #15
                              Mohammad:
                              not quite.
                              I'd say that you're coding something similar to dynamic panel data analysis (the lagged regressand is plugged in as a predictor).
                              Kind regards,
                              Carlo
                              (Stata 19.0)

