
  • Robust regression in Stata

    Hi,

    I have a few questions regarding robust regression in Stata and the package -mmregress-. The paper is published here: http://www.stata-journal.com/article.html?article=st0173

    1- Can I use -mmregress- for time series regressions? I am currently using OLS regression and want to check whether the results are robust to potential outliers and extreme observations in an independent variable.
    2- The results seem to change each time I run -mmregress-. The problem disappears when I -set seed- first, which lets me replicate the results. Is this the right approach?

    Thanks

  • #2
    A search of Stata leads to a message that mmregress has been superseded by robreg (from SSC).

    My distant understanding is that neither of these commands pays any special attention to whether your data are time series, but I am open to qualification or contradiction on that point.

    A problem here is one of tribal habits. A mainstream statistics tradition is, exactly as you phrase it, to be sensitive to "the potential existence of outliers and extreme observations", except that I think most people would want to add "in any of the variables included in the model".

    A more recent tradition, especially in econometrics, focuses on robustness largely in terms of first getting honest standard errors in the face of autocorrelation, heteroscedasticity and so forth.



    • #3
      The general advice is "look at the reputable literature in your field, and what they have done before you." If nobody has used robust regression before, either you are breaking new ground, or you are doing something that does not make sense.

      Another piece of general advice is that which method is appropriate does not depend on what type of data you have, but rather on what kind of assumptions you are ready to make about your data. E.g., OLS methods are appropriate for time series data if you are ready to make the OLS assumptions for your data (and in particular that they are iid in the time series dimension).

      I do not know what -mmregress- does, but getting different results on every run certainly means that there is randomness in the method; it is something like a bootstrap or a simulation. And yes, the right approach is to set the seed. You then get the same results because the pseudo-random numbers are the same on every run.
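
      The point about seeding can be illustrated with a minimal Python sketch. It is not the -mmregress- algorithm: it is a toy random-subset search (fit a line through randomly drawn pairs of points and keep the one with the smallest median squared residual), and the function name `lms_random_subsets` and the simulated data are assumptions made purely for illustration.

```python
import random
import statistics


def lms_random_subsets(x, y, n_subsets=50, seed=None):
    """Toy random-subset search (hypothetical helper, not mmregress).

    Fits a line through randomly drawn pairs of points and keeps the
    line with the smallest median squared residual.
    """
    rng = random.Random(seed)  # seed=None gives different draws on each call
    best = None
    for _ in range(n_subsets):
        i, j = rng.sample(range(len(x)), 2)
        if x[i] == x[j]:
            continue  # a vertical line has no finite slope; skip it
        slope = (y[j] - y[i]) / (x[j] - x[i])
        intercept = y[i] - slope * x[i]
        crit = statistics.median(
            (yk - intercept - slope * xk) ** 2 for xk, yk in zip(x, y)
        )
        if best is None or crit < best[0]:
            best = (crit, intercept, slope)
    return best[1], best[2]


# Simulated data (an assumption for illustration): a linear relation
# with a handful of contaminated y values.
data_rng = random.Random(0)
x = [data_rng.gauss(0, 1) for _ in range(200)]
y = [1.0 + 2.0 * xi + data_rng.gauss(0, 0.5) for xi in x]
y[:10] = [50.0] * 10

# Same seed, same random subsets, hence identical estimates.
fit_a = lms_random_subsets(x, y, seed=12345)
fit_b = lms_random_subsets(x, y, seed=12345)
print(fit_a == fit_b)
```

      With `seed=None` the subsets, and typically the estimates, differ from run to run, which is the behaviour described in #1.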



      • #4
        I was able to download mmregress today and saw that other researchers have used it in my field too. However, these papers used panel data. In my case, I have time series regressions.

        I use OLS regressions throughout the research. But as I noticed that there could be some extreme observations, I thought that running a robust regression and reporting the MM estimator using -mmregress- would be helpful, but only as a robustness test.

        My understanding from the replies is that mmregress is appropriate for time series regressions. Please correct me if this is not right.

        Thanks



        • #5
          I am probably in a minority of 1 here, but when we use OLS we are estimating a conditional mean or an approximation to it, so the results should be sensitive to outliers in y because by definition the mean is sensitive to outliers in y. If we do not want our results to be sensitive to outliers in y, then we should estimate something that is not a mean (e.g., the mode or the median) rather than trying to find a robust estimator for the mean, which is a bit of an oxymoron. Anyway, we should keep in mind that in many cases it is important that the outliers affect the results. For example, a "robust" model of stock prices will not take into account outliers such as the financial crises or the pandemic, leading to an overly optimistic (and dangerous) view of the stock market performance.
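
          A minimal Python sketch of the sensitivity argument above, using made-up daily returns (the numbers are hypothetical, chosen only for illustration): one crash-like observation moves the mean a lot and the median very little.

```python
import statistics

# Hypothetical daily returns, for illustration only.
returns = [0.01, -0.02, 0.015, 0.005, -0.01, 0.02, -0.005]
with_crash = returns + [-0.40]  # add one extreme observation in y

# How far does each summary move when the extreme value is added?
mean_shift = statistics.mean(with_crash) - statistics.mean(returns)
median_shift = statistics.median(with_crash) - statistics.median(returns)

print(f"shift in mean:   {mean_shift:.4f}")
print(f"shift in median: {median_shift:.4f}")
```

          Whether the smaller shift is desirable depends, as argued above, on whether the extreme observation is part of what the analysis is meant to capture.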



          • #6
            My concern is about outliers in x. I simply estimate a time series regression using high-frequency data (daily and intraday data: regressions of asset returns on an index, where the index (x) may have some extreme observations). I use OLS regressions and report Newey-West t-statistics.

            Just for comfort, and to ensure that extreme observations are not affecting my time series regression results, I estimate robust regressions with the MM estimator (with -mmregress-). I find that inferences do not change.
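
            As a rough illustration of why extreme values in x can matter for OLS, here is a small Python sketch with invented, noise-free data (the helper `ols_slope` and all numbers are assumptions for illustration). An extreme x whose y lies off the line pulls the slope strongly, while an extreme x whose y lies on the line leaves it unchanged.

```python
def ols_slope(x, y):
    """Slope of a simple OLS regression of y on x (with intercept)."""
    mean_x = sum(x) / len(x)
    mean_y = sum(y) / len(y)
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    return sxy / sxx


# Invented data lying exactly on y = 2x.
x = [float(i) for i in range(-5, 6)]
y = [2.0 * xi for xi in x]
slope_clean = ols_slope(x, y)

# One extreme x whose y is far from the line ("bad" leverage point).
slope_bad = ols_slope(x + [30.0], y + [0.0])

# One extreme x whose y is on the line ("good" leverage point).
slope_good = ols_slope(x + [30.0], y + [60.0])

print(slope_clean, slope_bad, slope_good)
```

            Comparing OLS with a robust fit, as described above, is one way to check which of the two situations the data resemble.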

            Does this sound OK? I assume that -mmregress- is appropriate for time series regressions, isn't it?!



            • #7
              In the published paper linked in my post #1, the authors state: "In particular, development of robust procedures for panel-data and time-series models would be of major interest for applied economic research. The time-series setting will give rise to new problems; for example, selecting random p-subsets will not be appropriate because they break the temporal structure of the data." This is also why I am not sure whether I need to do anything differently when using mmregress for time series regressions.

              That said, the authors seem to suggest using robust regressions for linear regression analysis without being specific about whether it has to be cross sectional or time series.

              I look forward to getting more help.
              Last edited by Lisa Wilson; 04 Jan 2023, 06:18.



              • #8
                Lisa:
                why not go with -qreg- then (Joao's advice sounds wise and helpful to me)?
                Another issue rests on the data generating process behind your x variables: are "extreme values" allowed (something along the lines of a Gamma distribution, with a long right tail) or not?
                Finally: if I got your description right, you performed a simple OLS. Are the results of your OLS informative enough with one predictor only?
                Kind regards,
                Carlo
                (Stata 19.0)



                • #9
                  Originally posted by Carlo Lazzaro
                  Lisa:
                  why not go with -qreg- then (Joao's advice sounds wise and helpful to me)?
                  Another issue rests on the data generating process behind your x variables: are "extreme values" allowed (something along the lines of a Gamma distribution, with a long right tail) or not?
                  Finally: if I got your description right, you performed a simple OLS. Are the results of your OLS informative enough with one predictor only?

                  There seem to be strong recommendations for using mmregress in the literature I am following but that was not specific on whether the linear regression should be cross-sectional or time series. The aim is just to test whether there is a relation between stock return and the sentiment index (I am mainly interested in the sign of this relation and of course its significance, but no predictions are made).


                  Note: I posted #7 at the same time as Carlo Lazzaro was responding.

                  I can see why mmregress might be inappropriate for time series, but I thought it would be very helpful to get Statalist users' opinions too.

                  Hope to hear from you more.



                  • #10
                    Lisa:
                    I'd echo my previous advice about going -qreg-.
                    Kind regards,
                    Carlo
                    (Stata 19.0)



                    • #11
                      Thanks! But as mentioned, the literature recommends robust regression with the MM estimator (e.g. mmregress), so my question is whether it is appropriate for time series regressions. It shows superiority over qreg.

                      Any advice on mmregress would be much appreciated!

                      Looking forward to getting further assistance.



                      • #12
                        Around 1972 I had the distinct impression that doing regression robustly was the top problem in statistics and that the next big advance would be a robust regression method almost everyone liked. Fortunately I kept quiet about that, as I was wrong. Since then there have been many suggestions of robust regression methods, but no one much likes any particular method except its creators, and even their affection can be brief, not least because their prestige depends on producing good ideas again and again.

                        People in favour of robust regression would, I think, all agree with Joao that "If we do not want our results to be sensitive to outliers in y, then we should estimate something that is not a mean", but they would underline that what is estimated doesn't necessarily have a simple description.



                        • #13
                          Lisa:
                          tribal habits and myths have their own importance, as Nick wisely reminded us in #2.
                          If you're planning to submit a paper on your research to a journal in your field, go with -mmregress-, as you can easily justify your methodological choice (it's more a matter of publication strategy than a statistical issue).
                          That said, you have also received different takes on alternatives to -mmregress- (which I do not know, so I cannot say in which respects it outperforms -qreg-).
                          What I cannot do is make a decision on your behalf.
                          Kind regards,
                          Carlo
                          (Stata 19.0)



                          • #14
                            Thanks a lot, Nick and Carlo for the advice. I appreciate that.

                            My problem is still whether -mmregress- is appropriate for time series regressions. As mentioned, the literature uses mmregress, but most papers have either pooled regressions or cross-sectional regressions. I use time series regressions.

                            Therefore I would like to use mmregress for the time series regressions in a robustness test, but I really need to know whether it is appropriate for time series, as I do not want to do something wrong!



                            Hope to hear from more users too who may also be familiar with robust regressions using mmregress.



                            • #15
                              I and some others have tried to answer your question to the extent possible, but whether the time series aspect of your data is consistent with, or conversely undermines, your use of mmregress is too hard to call remotely without knowing precisely how you are using mmregress and what your goals are. What is the context here? Are you a student who should be seeking support within your university, or a more experienced researcher?

                              FWIW, quantile regression does not seem to pay any special attention to time series aspects either.

