  • Interpretation of ACF and PACF result

    Dear all,

    I have 4 data sets and I am trying to figure out which AR processes I have.

    Below is my first result:
    Code:
    . varsoc csad
    
       Selection-order criteria
       Sample:  11jan2002 - 31dec2010, but with gaps
                                                    Number of obs      =       469
      +---------------------------------------------------------------------------+
      |lag |    LL      LR      df    p      FPE       AIC      HQIC      SBIC    |
      |----+----------------------------------------------------------------------|
      |  0 | -643.473                      .914306   2.74829   2.75177   2.75714  |
      |  1 | -476.467  334.01    1  0.000  .450451   2.04037   2.04733   2.05807  |
      |  2 | -473.181  6.5712*   1  0.010  .446082*  2.03062*  2.04107*  2.05717* |
      |  3 | -472.631  1.0995    1  0.294  .446939   2.03254   2.04647   2.06794  |
      |  4 | -472.533  .19679    1  0.657  .448661   2.03639    2.0538   2.08064  |
      +---------------------------------------------------------------------------+
       Endogenous:  csad
        Exogenous:  _cons
    
    . ac csad
    (note: time series has 469 gaps) 
    [Graph attached: ac1.png (ACF of csad)]
    . pac csad
    (note: time series has 469 gaps)
    [Graph attached: pac1.png (PACF of csad)]
    . dfuller csad, lag(2)

    Augmented Dickey-Fuller test for unit root         Number of obs   =     938

                                 ---------- Interpolated Dickey-Fuller ---------
                    Test         1% Critical      5% Critical      10% Critical
                 Statistic           Value            Value             Value
    ------------------------------------------------------------------------------
     Z(t)            -6.591           -3.430           -2.860           -2.570
    ------------------------------------------------------------------------------
    MacKinnon approximate p-value for Z(t) = 0.0000

    . dfuller csad, lag(3)

    Augmented Dickey-Fuller test for unit root         Number of obs   =     469

                                 ---------- Interpolated Dickey-Fuller ---------
                    Test         1% Critical      5% Critical      10% Critical
                 Statistic           Value            Value             Value
    ------------------------------------------------------------------------------
     Z(t)            -5.369           -3.442           -2.871           -2.570
    ------------------------------------------------------------------------------
    MacKinnon approximate p-value for Z(t) = 0.0000
    Below is my second result:
    Code:
    . ac csad
    (note: time series has 469 gaps) 
    [Graph attached: ac2.png (ACF of csad)]
    . pac csad
    (note: time series has 469 gaps)
    [Graph attached: pac2.png (PACF of csad)]
    . graph save Graph "\\ads.bris.ac.uk\filestore\MyFiles\StudentUG15\zl15509\Documents\ac2.gph"
    (file \\ads.bris.ac.uk\filestore\MyFiles\StudentUG15\zl15509\Documents\ac2.gph saved)

    . varsoc csad

       Selection-order criteria
       Sample:  11jan2002 - 31dec2010, but with gaps
                                                    Number of obs      =       469
      +---------------------------------------------------------------------------+
      |lag |    LL      LR      df    p      FPE       AIC      HQIC      SBIC    |
      |----+----------------------------------------------------------------------|
      |  0 | -526.243                      .554601   2.24837   2.25185   2.25722  |
      |  1 | -398.007  256.47    1  0.000  .322358   1.70579   1.71275   1.72349  |
      |  2 |  -381.31  33.394    1  0.000  .301487   1.63885   1.64929    1.6654  |
      |  3 | -369.076  24.468*   1  0.000  .287384*  1.59094*  1.60487*  1.62634* |
      |  4 | -368.558  1.0358    1  0.309  .287976     1.593   1.61041   1.63725  |
      +---------------------------------------------------------------------------+
       Endogenous:  csad
        Exogenous:  _cons

    . dfuller csad, lag(2)

    Augmented Dickey-Fuller test for unit root         Number of obs   =     938

                                 ---------- Interpolated Dickey-Fuller ---------
                    Test         1% Critical      5% Critical      10% Critical
                 Statistic           Value            Value             Value
    ------------------------------------------------------------------------------
     Z(t)            -8.450           -3.430           -2.860           -2.570
    ------------------------------------------------------------------------------
    MacKinnon approximate p-value for Z(t) = 0.0000

    . dfuller csad, lag(3)

    Augmented Dickey-Fuller test for unit root         Number of obs   =     469

                                 ---------- Interpolated Dickey-Fuller ---------
                    Test         1% Critical      5% Critical      10% Critical
                 Statistic           Value            Value             Value
    ------------------------------------------------------------------------------
     Z(t)            -4.264           -3.442           -2.871           -2.570
    ------------------------------------------------------------------------------
    MacKinnon approximate p-value for Z(t) = 0.0005
    I have shown 2 of my 4 data sets. In my first result, why do the information criteria suggest a lag of 2 while the PACF correlogram points to an AR(3) process? In my second result, by contrast, the IC suggest a lag of 3, which agrees with what I see in the PACF correlogram. I am also curious about the gaps in my data: I cannot find any visible 'gaps', and, strangely, the number of observations coincides with the number of gaps (according to Stata, anyway) in my ADF test when the lag is set to 3, in both cases.

    I am really confused when it comes to selecting the correct lags; any help would be very much appreciated.
    Last edited by sladmin; 09 Apr 2018, 08:52. Reason: anonymize poster

  • #2
    Hi Guest,

    The ACF and PACF describe the autocorrelation structure of the series, while the Dickey-Fuller test is a test of stationarity of a time series, which is important to establish first. For selecting the most appropriate lag length, my personal approach is to fit AR(p) processes with successively fewer lags and compare them using the AIC. I use the ACF and PACF to check for autocorrelation, but the choice of the best model shouldn't rely only on these two functions. The strange coincidence you noticed is definitely worth investigating, but I can't help on that specific issue, at least not without an example of your dataset.

    Stefano
    Last edited by sladmin; 09 Apr 2018, 08:52. Reason: anonymize poster



    • #3
      Hi,

      I did not include example data for fear of making my question unwieldy; here it is:

      Code:
      * Example generated by -dataex-. To install: ssc install dataex
      clear
      input int date float(csad r_mt abs_r_mt)
      15341         .          .         .
      15342         0          0         0
      15343         0          0         0
      15344  1.121359  -1.244825  1.244825
      15347  .8384604 -1.8260676 1.8260676
      15348  .8142346  -.7415656  .7415656
      15349 2.3549974  -1.363532  1.363532
      15350  1.550921   .9361056  .9361056
      15351 1.1240761 -2.6176565 2.6176565
      15354 1.8118427 -3.3451836 3.3451836
      15355  2.969135  -.0896658  .0896658
      15356 2.1570632  -.3391402  .3391402
      15357 2.2170117 -4.1681747 4.1681747
      15358  2.644805  -.3118701  .3118701
      15361  2.963015  -3.454361  3.454361
      15362 2.2707293  -.4424261  .4424261
      15363 2.3031635   6.198806  6.198806
      15364 1.4451684   .8063558  .8063558
      15365 1.3950977   -.344738   .344738
      15368  3.902349  -6.505327  6.505327
      15369 1.1566843   2.423436  2.423436
      15370 1.0549119  .26137948 .26137948
      15371  1.254906    6.52339   6.52339
      15372 1.5201653  -.4173109  .4173109
      15375  .9825116   1.702328  1.702328
      15376  1.114676   .4786668  .4786668
      15377 1.5348548 -2.1573632 2.1573632
      15378  .7537856  1.9775778 1.9775778
      15379 1.0132658  -.5581059  .5581059
      15382         0          0         0
      15383         0          0         0
      15384         0          0         0
      15385         0          0         0
      15386         0          0         0
      15389         0          0         0
      15390         0          0         0
      15391         0          0         0
      15392         0          0         0
      15393         0          0         0
      15396  1.274694  1.5701783 1.5701783
      15397 1.3580036  .10401993 .10401993
      15398  .8076649   .4514593  .4514593
      15399  .8925744  -.9193029  .9193029
      15400 1.1542493 -1.4786975 1.4786975
      15403  .9701719  1.6151924 1.6151924
      15404 1.1633363  2.1370106 2.1370106
      15405 1.0109228   .5902937  .5902937
      15406     1.439  2.7598126 2.7598126
      15407 1.2160257    1.68069   1.68069
      15410 1.6176355  1.4480067 1.4480067
      15411 1.3672187  -1.204716  1.204716
      15412 1.5481066  -1.240187  1.240187
      15413  2.288033   2.526972  2.526972
      15414 1.8057197  -2.683378  2.683378
      15417  .8521762  .16768247 .16768247
      15418    1.1117  2.5364754 2.5364754
      15419   .854178  1.0444176 1.0444176
      15420 1.0258094   .1653281  .1653281
      15421  .9014975  -.9187028  .9187028
      15424   .774503   .3416858  .3416858
      15425  .8466411  -.6586901  .6586901
      15426 1.0115734   -.628175   .628175
      15427  .8829954  -.2470474  .2470474
      15428 1.0818111  -2.898383  2.898383
      15431  .9583073   .2838773  .2838773
      15432 1.5708042 -1.6087356 1.6087356
      15433  .6156698   .7381983  .7381983
      15434 1.0369376   2.944683  2.944683
      15435  .7619497  -.4269994  .4269994
      15438  1.073394   .3779377  .3779377
      15439 1.2923234   1.636865  1.636865
      15440  .8183972   .6197558  .6197558
      15441  .9553066 -1.7046663 1.7046663
      15442  .6897609   .5968718  .5968718
      15445  .9276596 -1.1327618 1.1327618
      15446 1.0392244  .08547807 .08547807
      15447  .5828557  .23829854 .23829854
      15448  .7659038  -.9333453  .9333453
      15449  .8302611   .4145174  .4145174
      15452 1.1097444 -.45854485 .45854485
      15453  .8099011 -.20033753 .20033753
      15454 1.1097099  -.8491672  .8491672
      15455 1.1383946   .2953311  .2953311
      15456  2.040586     1.4821    1.4821
      15459  3.339945  1.5277324 1.5277324
      15460 2.2982426  .13717596 .13717596
      15461         0          0         0
      15462         0          0         0
      15463         0          0         0
      15466         0          0         0
      15467         0          0         0
      15468 1.5502427  -.9300161  .9300161
      15469  .8445619 -.10962622 .10962622
      15470 1.0004123  -.7513041  .7513041
      15473  1.352603  -.8735946  .8735946
      15474 1.0092524 -.48211205 .48211205
      15475  .9904096 -1.1693865 1.1693865
      15476 1.0718229  -3.157136  3.157136
      15477   .834774  1.1713375 1.1713375
      15480 1.1608008  -1.686772  1.686772
      end
      format %td date
      I would like to first assure you that the zeros you see in my data are completely innocuous: the estimates are calculated by subtracting the stock price at time t-1 from the stock price at time t, hence getting a sense of the 'returns'. Did Stata detect the zeros as 'gaps'?
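A possible explanation, sketched below under the assumption that Stata's daily -tsset- flags any jump of more than one day between consecutive dates as a gap: the weekend jumps in the dataex dates above (e.g. 15344 to 15347) would then each count as a gap, independently of the zero returns. This is an illustrative Python sketch, not Stata's actual implementation.

```python
# Hypothetical gap count: treat any jump of more than one day between
# consecutive %td dates as a gap, as a daily tsset declaration would.
dates = [15341, 15342, 15343, 15344, 15347, 15348]  # first rows of the dataex listing
gaps = sum(1 for a, b in zip(dates, dates[1:]) if b - a > 1)
print(gaps)  # the Fri -> Mon jump 15344 -> 15347 counts as one gap
```

If that mechanism is right, a trading-day series declared as daily accumulates one gap per weekend, which can add up to numbers of the order of the sample size itself.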

      Yes, I do realise that the ADF and PACF are not to be mixed up; I was just trying to point out a strange phenomenon. Thanks for your answer concerning the lags, but would you mind elaborating on the 'comparing the AR(p) processes with the AIC' part? How do you actually 'compare' them to get an insight into the correct lags? As mentioned in my first post, I know how to run the information-criteria command; I am just wondering what the intuition behind it is and how it contrasts with your method of choosing lags, that is, alternating AR(p) lags.

      Thank you very much.



      • #4
        Also, I find it very strange that my time-series line graph and my correlogram suggest different conclusions.
        The left graph is my y variable over time, and the right one is the first difference of my y variable. It seems to me that they are pretty stationary, especially after first-differencing.
        [Graph attached: newcsad.png (csad over time)]
        [Graph attached: newdcsad.png (first difference of csad over time)]


        Below is my correlogram. The autocorrelations never really die out; does that not suggest a 'trend' and hence that my series is non-stationary? However, my ADF test supports the view that my data are in fact stationary. I am really confused as to which one I should trust.

        [Graph attached: ac1.png (correlogram of csad)]
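For reference, the sample autocorrelation that the correlogram plots can be computed by hand. A minimal Python sketch of the textbook formula (not Stata's -ac- implementation) shows that a pure deterministic trend alone already produces high, slowly decaying autocorrelations, which is why a slowly dying ACF is read as persistence or trend:

```python
# Sample autocorrelation at lag k (textbook formula, illustrative only).
def acf(x, k):
    n = len(x)
    m = sum(x) / n
    denom = sum((v - m) ** 2 for v in x)
    return sum((x[t] - m) * (x[t + k] - m) for t in range(n - k)) / denom

trend = list(range(20))  # a pure deterministic trend, no noise
print([round(acf(trend, k), 2) for k in (1, 2, 5)])  # high at lag 1, decaying slowly
```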



        • #5
          First of all, regarding the number of zero returns: even though it can happen, in my experience it is very rare for closing prices to be exactly the same on two consecutive days, let alone several days in a row. A couple of years ago, for example, trading on the Athens stock exchange was suspended for a month, and you would see a similar situation. Moreover, I have experienced issues when downloading daily data on calendar days instead of trading days, so that Saturdays and Sundays show zero returns: there is no trading, so the price is the same as Friday's. If you're confident your data are correct, then go ahead; I won't insist.

          I agree with you in your post #4 that the time series is stationary; in fact, this is also confirmed by the ADF test you reported in #1. Moreover, the second plot looks more stationary simply because you are first-differencing, which is what you should do to deal with mean-reversion issues.
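A toy illustration of what first-differencing does: differencing a linearly trending series leaves a constant, trend-free series (Python sketch with made-up numbers):

```python
# First-differencing removes a linear trend.
y = [0.5 * t + 1.0 for t in range(10)]   # trending level series
dy = [b - a for a, b in zip(y, y[1:])]   # first differences
print(dy)  # constant 0.5 at every step: the trend is gone
```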

          As for your main question, the Akaike information criterion (AIC) and the Schwarz Bayesian information criterion (SBIC) are used to choose the "best" model as a trade-off between goodness of fit and number of regressors. The decision rule is simple: you go for the model that minimises the criterion. That is why in your first table you see an * next to the model with 2 lags: the decision criteria are minimised there.
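As a check on this decision rule, the per-observation AIC that -varsoc- prints can be reproduced from the log-likelihoods in the first table of post #1, assuming the formula AIC = (-2*LL + 2*k)/N with k the number of estimated parameters (p lag coefficients plus the constant); the reproduced values match the table, and lag 2 is indeed the minimiser:

```python
# Reproduce varsoc's AIC column from the log-likelihoods in post #1
# (assumed formula: AIC = (-2*LL + 2*k)/N, with k = p + 1 parameters).
N = 469  # observations reported by varsoc
log_lik = {0: -643.473, 1: -476.467, 2: -473.181, 3: -472.631, 4: -472.533}

def aic(ll, p, n=N):
    k = p + 1                      # AR coefficients plus the constant
    return (-2 * ll + 2 * k) / n   # per-observation AIC, as varsoc prints it

aics = {p: aic(ll, p) for p, ll in log_lik.items()}
best = min(aics, key=aics.get)
print(best, round(aics[best], 5))  # lag 2 minimises the criterion
```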

          It is important to check for autocorrelation, but it is even more important to ensure stationarity. You can also perform a Durbin-Watson test to check for autocorrelation and clarify your ideas. There are various ways of dealing with autocorrelation; it may be due to omitted variables or a wrong functional form. The most common remedies include adding a dummy variable, Generalized Least Squares estimation, or including a linear (trend) term if the residuals show a consistently increasing or decreasing pattern.
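The Durbin-Watson statistic mentioned above is simple enough to compute by hand; a Python sketch of the textbook formula DW = sum((e_t - e_{t-1})^2) / sum(e_t^2) (illustrative only; in Stata one would use -estat dwatson- after -regress-). Values near 2 suggest no first-order autocorrelation, values near 0 positive autocorrelation, and values near 4 negative autocorrelation:

```python
# Textbook Durbin-Watson statistic on a residual series (illustrative).
def durbin_watson(e):
    num = sum((e[t] - e[t - 1]) ** 2 for t in range(1, len(e)))
    return num / sum(v ** 2 for v in e)

print(durbin_watson([1.0, 1.0, 1.0, 1.0]))    # identical residuals -> 0 (positive autocorrelation)
print(durbin_watson([1.0, -1.0, 1.0, -1.0]))  # alternating residuals -> toward 4 (negative)
```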

          Hope this helps

          S



          • #6
            Thank you, I have sorted it out!



            • #7
              Stefano, I'm having some issues with the ACF, PACF, -dfuller-, and choosing the lag length. I will make a thread now; could you please look at it via my profile, under my threads? I will call it "help with lags and autocorrelation".

