  • Difference in # of observations AR(1) using ARIMA vs. OLS with lag using 'regress'

    Hi all,

    I'm working with a dataset with several Datastream indices (panels). I want to run the following AR(1) regression for each country individually: r_{i,t} = b_{i0} + b_{i1} r_{i,t-1} + b_{i2} monday_{i,t} + b_{i3} tax_{i,t}.

    In the papers I have read, the authors use OLS to estimate these coefficients for individual countries, and OLS with panel-corrected standard errors (PCSE) for the whole panel. For now I'm only interested in estimating the coefficients for each individual country. As far as I can tell, an AR(1) model can be run in two ways: using regress with a lagged dependent variable, or using arima.

    The time-series arima command in Stata estimates by maximum likelihood. That is not a problem in itself, but I found that a simple regression for one country (using 'regress') with an added lag uses far fewer observations than the same model with the lag specified through arima:
    8739 observations versus 9582, to be precise, where 9582 is indeed the total number of observations for the return variable.
    Could anyone explain this difference in the number of observations?
    Thank you in advance.

    Best regards,

    RJ

  • #2
    Could you please post the code you used?
    Jorge Eduardo Pérez Pérez
    www.jorgeperezperez.com


    • #3
      Hi Jorge,

      This is the code for the linear regression with a lag, which basically is AR(1) (number of observations = 8739).
      Code:
      regress RI_RTpct tax monday tempdailyC lag_RI_RTpct if country_ID == 1
      And this is the code for the ARIMA regression (number of observations = 9582).
      Code:
      arima RI_RTpct tax monday tempdailyC if country_ID == 1, arima(1,0,0)
      8739 makes sense to me, because there are some gaps in the dataset.
      So sometimes the lagged value is empty on a date where the return itself is observed, for example:
      Date        RI_RT    RI_RT(-1)
      1-1-2003    0.025    0.013
      2-1-2003    -        0.025
      3-1-2003    -0.03    -
      The linear regression only includes observations that are non-missing for all variables, hence it excludes the observation for 3-1-2003.

      But why does arima report that it uses 9582 observations? When it constructs the AR lag, does arima not take into account that the lag is empty in a specific period (like 3-1-2003 in the example above), and simply ignore this?

      Furthermore, when I run the same regression (an AR(1) model) in EViews, it also uses 8739 observations.

      Best regards,

      Robbert Jan
      Last edited by RJ Bremer; 15 Jul 2015, 03:45.


      • #4
        This difference arises because arima uses the Kalman filter to estimate the model, filtering over the missing observations. This shows up in the command output:

        Code:
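        sysuse sp500, clear
        * generate random gaps (arbitrary seed, only for reproducibility)
        set seed 12345
        gen u = runiform()
        replace close = . if u < 0.05
        tsset date
        * arima: the Kalman filter handles the missing values
        arima close, arima(1,0,0)
        * ols: drops observations where close or its lag is missing
        reg close l.close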


        To see the details, check the manual entry on arima, pages 92 to 95.
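
        As a rough check, a sketch along these lines compares the two sample sizes directly. It reuses the sp500 example; the seed is an arbitrary assumption, and e(N) is the number of observations each command stores after estimation.

        ```stata
        sysuse sp500, clear
        set seed 12345                      // arbitrary seed (assumption)
        replace close = . if runiform() < 0.05
        tsset date

        * observations with a non-missing value of close
        count if !missing(close)
        * observations where both close and its lag are non-missing
        count if !missing(close, L.close)

        * sample size actually used by OLS with a lagged dependent variable
        quietly regress close L.close
        display "regress N = " e(N)

        * sample size reported by arima
        quietly arima close, arima(1,0,0)
        display "arima N = " e(N)
        ```

        If the explanation above holds, the regress count should line up with the second count, while arima should report more observations.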


        Jorge Eduardo Pérez Pérez
        www.jorgeperezperez.com


        • #5
          Ah, I see. Thank you very much for the clarification, although I'm not familiar with that filter.
          For this specific example, given the number of missing observations, would you prefer one approach over the other?


          • #6
            I don't have enough experience to recommend one approach over the other. My intuition is that the filtering approach is fine for small gaps scattered over the series, but it would not work well if there are long runs of consecutive missing observations. Maybe this will help:

            http://www.stata.com/statalist/archi.../msg00893.html

            Jorge Eduardo Pérez Pérez
            www.jorgeperezperez.com
