  • Interpretation of ACF and PACF result

    Dear all,

    I have 4 data sets and I am trying to figure out which AR processes I have.

    Below is my first result:
    Code:
    . varsoc csad
    
       Selection-order criteria
       Sample:  11jan2002 - 31dec2010, but with gaps
                                                    Number of obs      =       469
      +---------------------------------------------------------------------------+
      |lag |    LL      LR      df    p      FPE       AIC      HQIC      SBIC    |
      |----+----------------------------------------------------------------------|
      |  0 | -643.473                      .914306   2.74829   2.75177   2.75714  |
      |  1 | -476.467  334.01    1  0.000  .450451   2.04037   2.04733   2.05807  |
      |  2 | -473.181  6.5712*   1  0.010  .446082*  2.03062*  2.04107*  2.05717* |
      |  3 | -472.631  1.0995    1  0.294  .446939   2.03254   2.04647   2.06794  |
      |  4 | -472.533  .19679    1  0.657  .448661   2.03639    2.0538   2.08064  |
      +---------------------------------------------------------------------------+
       Endogenous:  csad
        Exogenous:  _cons
    
    . ac csad
    (note: time series has 469 gaps) 
    [Graph attached: ac1.png (ACF of csad)]
    . pac csad
    (note: time series has 469 gaps)
    [Graph attached: pac1.png (PACF of csad)]
    . dfuller csad, lag(2)

    Augmented Dickey-Fuller test for unit root         Number of obs   =     938

                                 ---------- Interpolated Dickey-Fuller ---------
                    Test         1% Critical      5% Critical      10% Critical
                 Statistic           Value            Value             Value
    ------------------------------------------------------------------------------
     Z(t)            -6.591           -3.430           -2.860           -2.570
    ------------------------------------------------------------------------------
    MacKinnon approximate p-value for Z(t) = 0.0000

    . dfuller csad, lag(3)

    Augmented Dickey-Fuller test for unit root         Number of obs   =     469

                                 ---------- Interpolated Dickey-Fuller ---------
                    Test         1% Critical      5% Critical      10% Critical
                 Statistic           Value            Value             Value
    ------------------------------------------------------------------------------
     Z(t)            -5.369           -3.442           -2.871           -2.570
    ------------------------------------------------------------------------------
    MacKinnon approximate p-value for Z(t) = 0.0000
    Below is my second result:
    Code:
    . ac csad
    (note: time series has 469 gaps) 
    [Graph attached: ac2.png (ACF of csad)]
    . pac csad
    (note: time series has 469 gaps)
    [Graph attached: pac2.png (PACF of csad)]
    . graph save Graph "\\ads.bris.ac.uk\filestore\MyFiles\StudentUG15\zl15509\Documents\ac2.gph"
    (file \\ads.bris.ac.uk\filestore\MyFiles\StudentUG15\zl15509\Documents\ac2.gph saved)

    . varsoc csad

       Selection-order criteria
       Sample:  11jan2002 - 31dec2010, but with gaps
                                                    Number of obs      =       469
      +---------------------------------------------------------------------------+
      |lag |    LL      LR      df    p      FPE       AIC      HQIC      SBIC    |
      |----+----------------------------------------------------------------------|
      |  0 | -526.243                      .554601   2.24837   2.25185   2.25722  |
      |  1 | -398.007  256.47    1  0.000  .322358   1.70579   1.71275   1.72349  |
      |  2 |  -381.31  33.394    1  0.000  .301487   1.63885   1.64929    1.6654  |
      |  3 | -369.076  24.468*   1  0.000  .287384*  1.59094*  1.60487*  1.62634* |
      |  4 | -368.558  1.0358    1  0.309  .287976     1.593   1.61041   1.63725  |
      +---------------------------------------------------------------------------+
       Endogenous:  csad
        Exogenous:  _cons

    . dfuller csad, lag(2)

    Augmented Dickey-Fuller test for unit root         Number of obs   =     938

                                 ---------- Interpolated Dickey-Fuller ---------
                    Test         1% Critical      5% Critical      10% Critical
                 Statistic           Value            Value             Value
    ------------------------------------------------------------------------------
     Z(t)            -8.450           -3.430           -2.860           -2.570
    ------------------------------------------------------------------------------
    MacKinnon approximate p-value for Z(t) = 0.0000

    . dfuller csad, lag(3)

    Augmented Dickey-Fuller test for unit root         Number of obs   =     469

                                 ---------- Interpolated Dickey-Fuller ---------
                    Test         1% Critical      5% Critical      10% Critical
                 Statistic           Value            Value             Value
    ------------------------------------------------------------------------------
     Z(t)            -4.264           -3.442           -2.871           -2.570
    ------------------------------------------------------------------------------
    MacKinnon approximate p-value for Z(t) = 0.0005
    I have shown 2 of my 4 data sets. In my first result, why do the information criteria suggest a lag of 2 while the PACF correlogram points to an AR(3) process? In my second result, by contrast, the IC suggest a lag of 3, which agrees with what I see in the PACF correlogram. I am also curious about the gaps in my data: I cannot find any visible 'gaps', and, strangely, the number of observations coincides with the number of gaps (according to Stata, anyway) in my ADF test when the lag is set to 3, in both cases.

    I am really confused when it comes to selecting the correct lags; any help would be very much appreciated.
    Last edited by sladmin; 09 Apr 2018, 08:52. Reason: anonymize poster

  • #2
    Hi Guest,

    The ACF and PACF describe the autocorrelation structure of the series, while the Dickey-Fuller test is a test of stationarity of a time series, which is important to establish first. For selecting the most appropriate lag length, my personal approach is to fit AR(p) processes with successively fewer lags and compare them using the AIC. I use the ACF and PACF to check for autocorrelation, but the choice of the best model shouldn't rely only on these two functions. The strange coincidence you noticed is definitely worth investigating, but I can't help on that specific issue, at least not without an example of your dataset.

    Stefano
    Last edited by sladmin; 09 Apr 2018, 08:52. Reason: anonymize poster



    • #3
      Hi,

      I did not include example data for fear of making my question unwieldy; here it is:

      Code:
      * Example generated by -dataex-. To install: ssc install dataex
      clear
      input int date float(csad r_mt abs_r_mt)
      15341         .          .         .
      15342         0          0         0
      15343         0          0         0
      15344  1.121359  -1.244825  1.244825
      15347  .8384604 -1.8260676 1.8260676
      15348  .8142346  -.7415656  .7415656
      15349 2.3549974  -1.363532  1.363532
      15350  1.550921   .9361056  .9361056
      15351 1.1240761 -2.6176565 2.6176565
      15354 1.8118427 -3.3451836 3.3451836
      15355  2.969135  -.0896658  .0896658
      15356 2.1570632  -.3391402  .3391402
      15357 2.2170117 -4.1681747 4.1681747
      15358  2.644805  -.3118701  .3118701
      15361  2.963015  -3.454361  3.454361
      15362 2.2707293  -.4424261  .4424261
      15363 2.3031635   6.198806  6.198806
      15364 1.4451684   .8063558  .8063558
      15365 1.3950977   -.344738   .344738
      15368  3.902349  -6.505327  6.505327
      15369 1.1566843   2.423436  2.423436
      15370 1.0549119  .26137948 .26137948
      15371  1.254906    6.52339   6.52339
      15372 1.5201653  -.4173109  .4173109
      15375  .9825116   1.702328  1.702328
      15376  1.114676   .4786668  .4786668
      15377 1.5348548 -2.1573632 2.1573632
      15378  .7537856  1.9775778 1.9775778
      15379 1.0132658  -.5581059  .5581059
      15382         0          0         0
      15383         0          0         0
      15384         0          0         0
      15385         0          0         0
      15386         0          0         0
      15389         0          0         0
      15390         0          0         0
      15391         0          0         0
      15392         0          0         0
      15393         0          0         0
      15396  1.274694  1.5701783 1.5701783
      15397 1.3580036  .10401993 .10401993
      15398  .8076649   .4514593  .4514593
      15399  .8925744  -.9193029  .9193029
      15400 1.1542493 -1.4786975 1.4786975
      15403  .9701719  1.6151924 1.6151924
      15404 1.1633363  2.1370106 2.1370106
      15405 1.0109228   .5902937  .5902937
      15406     1.439  2.7598126 2.7598126
      15407 1.2160257    1.68069   1.68069
      15410 1.6176355  1.4480067 1.4480067
      15411 1.3672187  -1.204716  1.204716
      15412 1.5481066  -1.240187  1.240187
      15413  2.288033   2.526972  2.526972
      15414 1.8057197  -2.683378  2.683378
      15417  .8521762  .16768247 .16768247
      15418    1.1117  2.5364754 2.5364754
      15419   .854178  1.0444176 1.0444176
      15420 1.0258094   .1653281  .1653281
      15421  .9014975  -.9187028  .9187028
      15424   .774503   .3416858  .3416858
      15425  .8466411  -.6586901  .6586901
      15426 1.0115734   -.628175   .628175
      15427  .8829954  -.2470474  .2470474
      15428 1.0818111  -2.898383  2.898383
      15431  .9583073   .2838773  .2838773
      15432 1.5708042 -1.6087356 1.6087356
      15433  .6156698   .7381983  .7381983
      15434 1.0369376   2.944683  2.944683
      15435  .7619497  -.4269994  .4269994
      15438  1.073394   .3779377  .3779377
      15439 1.2923234   1.636865  1.636865
      15440  .8183972   .6197558  .6197558
      15441  .9553066 -1.7046663 1.7046663
      15442  .6897609   .5968718  .5968718
      15445  .9276596 -1.1327618 1.1327618
      15446 1.0392244  .08547807 .08547807
      15447  .5828557  .23829854 .23829854
      15448  .7659038  -.9333453  .9333453
      15449  .8302611   .4145174  .4145174
      15452 1.1097444 -.45854485 .45854485
      15453  .8099011 -.20033753 .20033753
      15454 1.1097099  -.8491672  .8491672
      15455 1.1383946   .2953311  .2953311
      15456  2.040586     1.4821    1.4821
      15459  3.339945  1.5277324 1.5277324
      15460 2.2982426  .13717596 .13717596
      15461         0          0         0
      15462         0          0         0
      15463         0          0         0
      15466         0          0         0
      15467         0          0         0
      15468 1.5502427  -.9300161  .9300161
      15469  .8445619 -.10962622 .10962622
      15470 1.0004123  -.7513041  .7513041
      15473  1.352603  -.8735946  .8735946
      15474 1.0092524 -.48211205 .48211205
      15475  .9904096 -1.1693865 1.1693865
      15476 1.0718229  -3.157136  3.157136
      15477   .834774  1.1713375 1.1713375
      15480 1.1608008  -1.686772  1.686772
      end
      format %td date
      I would like to first assure you that the zeros you see in my data are completely innocuous: the estimates are calculated by subtracting the stock price at time t-1 from the stock price at time t, hence getting a sense of the 'returns'. Did Stata detect the zeros as 'gaps'?
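A possible explanation, sketched below under the assumption that Stata's daily -tsset- flags any jump of more than one day between consecutive dates as a gap: the weekend jumps in the dataex dates above (e.g. 15344 to 15347) would then each count as a gap, independently of the zero returns. This is an illustrative Python sketch, not Stata's actual implementation.

```python
# Hypothetical gap count: treat any jump of more than one day between
# consecutive %td dates as a gap, as a daily tsset declaration would.
dates = [15341, 15342, 15343, 15344, 15347, 15348]  # first rows of the dataex listing
gaps = sum(1 for a, b in zip(dates, dates[1:]) if b - a > 1)
print(gaps)  # the Fri -> Mon jump 15344 -> 15347 counts as one gap
```

If that mechanism is right, a trading-day series declared as daily accumulates one gap per weekend, which can add up to numbers of the order of the sample size itself.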

      Yes, I do realise that the ADF and PACF are not to be mixed up; I was just trying to point out a strange phenomenon. Thanks for your answer concerning the lags, but would you mind elaborating on the 'comparing the AR(p) processes with the AIC' part? How do you actually 'compare' them to get an insight into the correct lags? As mentioned in my first post, I know how to run the information-criteria command; I am just wondering what the intuition behind it is and how it contrasts with your method of choosing lags, that is, alternating AR(p) lags.

      Thank you very much.



      • #4
        Also, I find it very strange that my time-series line graph and my correlogram suggest different conclusions.
        The left graph is my y variable over time, and the right one is the first difference of my y variable. It seems to me that they are pretty stationary, especially after first-differencing.
        [Graph attached: newcsad.png (csad over time)]
        [Graph attached: newdcsad.png (first difference of csad over time)]


        Below is my correlogram. The autocorrelations never really die out; does that not suggest a 'trend' and hence that my series is non-stationary? However, my ADF test supports the view that my data are in fact stationary. I am really confused as to which one I should trust.

        [Graph attached: ac1.png (correlogram of csad)]
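For reference, the sample autocorrelation that the correlogram plots can be computed by hand. A minimal Python sketch of the textbook formula (not Stata's -ac- implementation) shows that a pure deterministic trend alone already produces high, slowly decaying autocorrelations, which is why a slowly dying ACF is read as persistence or trend:

```python
# Sample autocorrelation at lag k (textbook formula, illustrative only).
def acf(x, k):
    n = len(x)
    m = sum(x) / n
    denom = sum((v - m) ** 2 for v in x)
    return sum((x[t] - m) * (x[t + k] - m) for t in range(n - k)) / denom

trend = list(range(20))  # a pure deterministic trend, no noise
print([round(acf(trend, k), 2) for k in (1, 2, 5)])  # high at lag 1, decaying slowly
```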



        • #5
          First of all, regarding the number of zero returns: even though it can happen, in my experience it is very rare for closing prices to be exactly the same on two consecutive days, let alone several days in a row. A couple of years ago, for example, trading on the Athens stock exchange was suspended for a month, and you would see a similar situation. Moreover, I have experienced issues when downloading daily data on calendar days instead of trading days, so that Saturdays and Sundays show zero returns: there is no trading, so the price is the same as Friday's. If you're confident your data are correct, then go ahead; I won't insist.

          I agree with you in your post #4 that the time series is stationary; in fact, this is also confirmed by the ADF test you reported in #1. Moreover, the second plot looks more stationary simply because you are first-differencing, which is what you should do to deal with mean-reversion issues.
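A toy illustration of what first-differencing does: differencing a linearly trending series leaves a constant, trend-free series (Python sketch with made-up numbers):

```python
# First-differencing removes a linear trend.
y = [0.5 * t + 1.0 for t in range(10)]   # trending level series
dy = [b - a for a, b in zip(y, y[1:])]   # first differences
print(dy)  # constant 0.5 at every step: the trend is gone
```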

          As for your main question, the Akaike information criterion (AIC) and the Schwarz Bayesian information criterion (SBIC) are used to choose the "best" model as a trade-off between goodness of fit and number of regressors. The decision rule is simple: you go for the model that minimises the criterion. That is why in your first table you see an * next to the model with 2 lags: the decision criteria are minimised there.
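As a check on this decision rule, the per-observation AIC that -varsoc- prints can be reproduced from the log-likelihoods in the first table of post #1, assuming the formula AIC = (-2*LL + 2*k)/N with k the number of estimated parameters (p lag coefficients plus the constant); the reproduced values match the table, and lag 2 is indeed the minimiser:

```python
# Reproduce varsoc's AIC column from the log-likelihoods in post #1
# (assumed formula: AIC = (-2*LL + 2*k)/N, with k = p + 1 parameters).
N = 469  # observations reported by varsoc
log_lik = {0: -643.473, 1: -476.467, 2: -473.181, 3: -472.631, 4: -472.533}

def aic(ll, p, n=N):
    k = p + 1                      # AR coefficients plus the constant
    return (-2 * ll + 2 * k) / n   # per-observation AIC, as varsoc prints it

aics = {p: aic(ll, p) for p, ll in log_lik.items()}
best = min(aics, key=aics.get)
print(best, round(aics[best], 5))  # lag 2 minimises the criterion
```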

          It is important to check for autocorrelation, but it is even more important to ensure stationarity. You can also perform a Durbin-Watson test to check for autocorrelation and clarify your ideas. There are various ways of dealing with autocorrelation; it may be due to omitted variables or a wrong functional form. The most common remedies include adding a dummy variable, Generalized Least Squares estimation, or including a linear (trend) term if the residuals show a consistently increasing or decreasing pattern.
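The Durbin-Watson statistic mentioned above is simple enough to compute by hand; a Python sketch of the textbook formula DW = sum((e_t - e_{t-1})^2) / sum(e_t^2) (illustrative only; in Stata one would use -estat dwatson- after -regress-). Values near 2 suggest no first-order autocorrelation, values near 0 positive autocorrelation, and values near 4 negative autocorrelation:

```python
# Textbook Durbin-Watson statistic on a residual series (illustrative).
def durbin_watson(e):
    num = sum((e[t] - e[t - 1]) ** 2 for t in range(1, len(e)))
    return num / sum(v ** 2 for v in e)

print(durbin_watson([1.0, 1.0, 1.0, 1.0]))    # identical residuals -> 0 (positive autocorrelation)
print(durbin_watson([1.0, -1.0, 1.0, -1.0]))  # alternating residuals -> toward 4 (negative)
```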

          Hope this helps

          S



          • #6
            Thank you, I have sorted it out!



            • #7
              Stefano, I'm having some issues with the ACF, PACF, -dfuller-, and choosing the lag length. I will make a thread now; could you please look at it via my profile, under my threads? I will call it "help with lags and autocorrelation".

