  • Difference between robust and non-robust?

    Hi statisticians!

    Can anyone explain when we should use the robust option, and for which kinds of models? In which cases will the robust and non-robust standard errors differ little? Thanks so much!

    I know SAS has the empirical option. Does anyone know which option or package in R gives the robust results? Thanks a ton!

  • #2
    The robust variance estimator is robust to heteroscedasticity. It should be used when heteroscedasticity is, or is likely to be, present.

    In some commands (-xtreg, fe- and -xtpoisson, fe- come to mind; there may be others I'm not thinking of off the top of my head), specifying -vce(robust)- leads to the cluster robust variance estimator. This one, in addition to being robust to heteroscedasticity, is also robust to correlation of errors within the specified clusters (the panel variable when invoked automatically by the command itself) and to serial correlation. It should be used when these are present or suspected, and when the number of clusters is large enough for it to be valid. As a practical matter, most real world panel data has these problems, and it is easier to pre-emptively deal with them by specifying -robust- than it is to try to test for their presence. So the cluster robust variance estimator is generally a good idea in any panel data analysis with a sufficient number of clusters. There is no universal agreement about the minimum number of clusters needed. I have seen rules of thumb suggesting a minimum of 10, or a minimum of 25, or of 50 in order for the cluster robust variance estimator to actually be an improvement over the ordinary variance estimator.
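
To make the distinction concrete, here is a minimal sketch in plain Python (not Stata output) of the three standard errors for the slope in a one-regressor OLS model. The function names, the toy data, and the firm ids are all made up for illustration. The sketch deliberately omits the small-sample adjustments that statistical packages typically apply, so the numbers would not match a package's output exactly.

```python
import math
from collections import defaultdict

def fit_line(x, y):
    """OLS of y on a constant and x; returns slope, residuals, weights, sxx."""
    n = len(x)
    x_bar = sum(x) / n
    y_bar = sum(y) / n
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    slope = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y)) / sxx
    intercept = y_bar - slope * x_bar
    resid = [yi - intercept - slope * xi for xi, yi in zip(x, y)]
    # w_i is the weight observation i's error receives in the slope estimate
    weights = [(xi - x_bar) / sxx for xi in x]
    return slope, resid, weights, sxx

def se_classical(x, y):
    """Ordinary SE: assumes one common error variance for all observations."""
    _, resid, _, sxx = fit_line(x, y)
    s2 = sum(e * e for e in resid) / (len(x) - 2)
    return math.sqrt(s2 / sxx)

def se_robust(x, y):
    """Heteroscedasticity-robust (sandwich) SE, no small-sample correction."""
    _, resid, w, _ = fit_line(x, y)
    return math.sqrt(sum((wi * ei) ** 2 for wi, ei in zip(w, resid)))

def se_cluster(x, y, ids):
    """Cluster-robust SE: sum w_i * e_i within each cluster, then square."""
    _, resid, w, _ = fit_line(x, y)
    score = defaultdict(float)
    for wi, ei, g in zip(w, resid, ids):
        score[g] += wi * ei
    return math.sqrt(sum(s * s for s in score.values()))

# Toy data: 4 hypothetical firms observed twice each
x = [1, 2, 3, 4, 5, 6, 7, 8]
y = [1.1, 2.3, 2.8, 4.5, 4.9, 6.8, 6.1, 9.0]
firm = [1, 1, 2, 2, 3, 3, 4, 4]

print("classical:", se_classical(x, y))
print("robust:   ", se_robust(x, y))
print("cluster:  ", se_cluster(x, y, firm))
```

When every observation is its own cluster, `se_cluster` reduces to `se_robust`, which is one way to see that the cluster version generalizes the heteroscedasticity-robust one.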

    I do not use R and cannot answer the second question. But there are others on the Forum who use both Stata and R and might respond to that.
    Last edited by Clyde Schechter; 22 Aug 2017, 12:03.



    • #3
      Hello Dan,

      This blog post might also be helpful

      http://blog.stata.com/2016/08/30/two...andard-errors/



      • #4
        Thanks, Enrique. That was very informative. I wasn't aware of that.



        • #5
          Thank you everyone!!! It's really helpful!!!



          • #6
            I personally always use cluster robust options when analyzing panel data. This is standard in my field (econ and public policy). While I understand that it may be technically permissible in some situations to not use these commands, reviewers would always question it, and I've never seen a paper that tried to make an excuse for not using it.



            • #7
              Hi Schechter,

              Is there any problem if we use -xtreg, fe vce(cluster id)- when there is no heteroscedasticity or autocorrelation in our data?
              --------------------
              (Stata 15.1 MP)



              • #8
                As long as you have enough clusters, it should not be a problem.



                • #9
                  My panel data covers more than 200 firms over 10 years, so I will have more than 200 clusters. That is OK, isn't it?

                  In addition, could you please give me some theoretical background so that I can answer when a reviewer asks, "Why should it not be a problem if I have enough clusters?"

                  Thanks in advance!
                  --------------------
                  (Stata 15.1 MP)



                  • #10
                    The robust standard errors are consistent whether you have heteroskedasticity or not. The cluster-robust standard errors are consistent whether you have cluster correlation as you have specified, or only heteroskedasticity, or no cluster correlation and no heteroskedasticity at all.
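
A small simulation can illustrate this point. The sketch below is plain Python with made-up parameters (sample size, seed, coefficients, and the form of the heteroskedasticity are all assumptions chosen for illustration). It compares the classical and heteroskedasticity-robust standard errors of an OLS slope, once with constant error variance and once with error variance that grows with |x|. In a large sample one would expect the two standard errors to be close in the first case and the robust one to be clearly larger in the second.

```python
import math
import random

def slope_ses(x, y):
    """Return (classical SE, heteroskedasticity-robust SE) of the OLS slope."""
    n = len(x)
    x_bar = sum(x) / n
    y_bar = sum(y) / n
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    slope = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y)) / sxx
    intercept = y_bar - slope * x_bar
    resid = [yi - intercept - slope * xi for xi, yi in zip(x, y)]
    # Classical: a single error variance estimate divided by sxx
    classical = math.sqrt(sum(e * e for e in resid) / (n - 2) / sxx)
    # Robust: each squared residual weighted by its own (x_i - x_bar)^2
    robust = math.sqrt(
        sum(((xi - x_bar) * e) ** 2 for xi, e in zip(x, resid))
    ) / sxx
    return classical, robust

def simulate(heteroskedastic, n=5000, seed=12345):
    """Draw one sample from y = 1 + 2x + error and return both slope SEs."""
    rng = random.Random(seed)
    x = [rng.gauss(0, 1) for _ in range(n)]
    if heteroskedastic:
        # Error spread grows with |x|
        errors = [abs(xi) * rng.gauss(0, 1) for xi in x]
    else:
        # Same error spread for every observation
        errors = [rng.gauss(0, 1) for _ in x]
    y = [1.0 + 2.0 * xi + e for xi, e in zip(x, errors)]
    return slope_ses(x, y)

homo_classical, homo_robust = simulate(heteroskedastic=False)
het_classical, het_robust = simulate(heteroskedastic=True)

print("homoskedastic, robust/classical:  ", homo_robust / homo_classical)
print("heteroskedastic, robust/classical:", het_robust / het_classical)
```

This only covers the heteroskedasticity-robust case; the same logic extends to cluster-robust standard errors, where the comparison depends on the number of clusters rather than the number of observations.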

                    For an accessible theoretical background you can look up this paper: Cameron, A. Colin, and Douglas L. Miller. "A Practitioner’s Guide to Cluster-Robust Inference." Journal of Human Resources 50, no. 2 (2015): 317-372.

                    Also note that in Stata -xtreg, fe vce(cluster id)- is equivalent to -xtreg, fe robust- when id is the panel variable. In other words, Stata will not let you compute heteroskedasticity-only consistent standard errors and variances in the xtreg suite; it automatically switches to cluster robust even if you have specified only robust.




                    • #11
                      Thanks so much Joro
                      --------------------
                      (Stata 15.1 MP)
