  • wanhaiyou
    replied
Originally posted by Sebastian Kripfganz:
    The optimal lag orders are found based on the sample which sets aside the first 10 observations. If you set aside fewer initial observations, it is possible that you get different optimal lag orders. For comparison of the models with selection criteria (AIC, BIC), the same sample must be used.

    Once you have chosen the optimal lag order, you could then use all observations as in your second code for the subsequent analysis.
I see. Many thanks for your timely help!

    Bests,
    wanhai



  • Sebastian Kripfganz
    replied
    The optimal lag orders are found based on the sample which sets aside the first 10 observations. If you set aside fewer initial observations, it is possible that you get different optimal lag orders. For comparison of the models with selection criteria (AIC, BIC), the same sample must be used.

    Once you have chosen the optimal lag order, you could then use all observations as in your second code for the subsequent analysis.
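A minimal sketch of this two-step procedure, using the simulated y and x1 series that appear elsewhere in this thread (the names are placeholders; lag selection is done on the reduced sample, the final estimation on the full sample):

Code:
* step 1: lag selection, comparing all candidate models on the same
* reduced sample that sets aside the first 10 observations
ardl y x1, maxlag(10) aic ec1
matrix optlags = e(lags)         // e.g. (1, 1)
* step 2: re-estimate with the chosen lag order on all observations
ardl y x1, lags(1 1) ec1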



  • wanhaiyou
    replied
Originally posted by Sebastian Kripfganz:
    In your first code, you have set
    Code:
    local maxlags = 10
    This sets aside the first 10 observations in the estimation sample for computing the lags, even if the optimal lag order is smaller than 10.

    In your second code, you pre-specify the lag order to be (1, 1). Here, only 1 observation is set aside.
Many thanks! Yes, I understand this, but my question is about which one is "right".

In fact, although the maximum lag is set to 10 for choosing the optimal lag order, the sample used for estimation should be based on the final optimal lag order. Is that right?

    Thanks for your time!

    Bests,
    wanhai



  • Sebastian Kripfganz
    replied
    In your first code, you have set
    Code:
    local maxlags = 10
    This sets aside the first 10 observations in the estimation sample for computing the lags, even if the optimal lag order is smaller than 10.

    In your second code, you pre-specify the lag order to be (1, 1). Here, only 1 observation is set aside.
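To see the difference directly, one can compare the estimation sample sizes stored in e(N) after each call (a sketch, assuming a tsset series y and regressor x1 with 1000 observations, as in the simulated example below):

Code:
ardl y x1, maxlag(10) aic ec1   // first 10 observations set aside
display e(N)                    // 990 in the example below
ardl y x1, lags(1 1) ec1        // only 1 observation set aside
display e(N)                    // 999 in the example below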



  • wanhaiyou
    replied
Originally posted by Sebastian Kripfganz:
    The equations on page 12 are derived from the equation on page 5. With ec1, when the lag order of ln_inc is zero, its first lag is included in the error-correction form. However, in terms of the coefficients of the equation on page 5, this first lag has a coefficient equal to zero. Because this coefficient equals zero, we need the restriction for the coefficients on page 12, which I mentioned in my previous post. Put differently, for your model there is a total of 8 coefficients in the level equation on page 5:
    Code:
    ardl ln_consump ln_inc ln_inv, lags(1 0 4)
    In the ec1 representation, page 12, you have 9 coefficients:
    Code:
    ardl ln_consump ln_inc ln_inv, lags(1 0 4) ec1
    For the two models to coincide, there must be 1 restriction on the coefficients in the latter version of the model.

    If you do not want to have this restriction, either estimate it with option ec, which again gives you 8 coefficients:
    Code:
    ardl ln_consump ln_inc ln_inv, lags(1 0 4) ec
    or allow for 1 unrestricted lag of ln_inc in the model:
    Code:
    ardl ln_consump ln_inc ln_inv, lags(1 1 4)
    ardl ln_consump ln_inc ln_inv, lags(1 1 4) ec1
    ardl ln_consump ln_inc ln_inv, lags(1 1 4) ec
    In the latter case, no restriction is needed. There are 9 coefficients in each version of the model. But obviously, because we allow for a nonzero coefficient of the first lag in the level version, the estimates differ.
    Hi dear Prof Kripfganz,
I am running an ARDL model with the following two commands, and I don't understand why they produce different results.

    Code:
    clear
    set seed 1234
    set obs 1000
    gen y = uniform()
    gen x1 =  rt(5)
    gen time = _n
    tsset time
    local ylist y
    local xlist x1 
    local quantile "0.1 0.25 0.5"
    
        
    local nq: word count `quantile'
    di `nq'
    
    local xnum: word count `xlist'  
    di `xnum'
    
    tempname opt
    
    local maxlags = 10
    if ("`maxlags'" != "") {
      ardl `ylist' `xlist', maxlag(`maxlags') aic ec1
      mat `opt' = e(lags)
    }
    
    mat list `opt'
    The results are as follows
    Code:
    ARDL(1,1) regression
    
    Sample:        11 -      1000                   Number of obs     =        990
                                                    R-squared         =     0.5314
                                                    Adj R-squared     =     0.5300
    Log likelihood = -180.88505                     Root MSE          =     0.2911
    
    ------------------------------------------------------------------------------
             D.y |      Coef.   Std. Err.      t    P>|t|     [95% Conf. Interval]
    -------------+----------------------------------------------------------------
    ADJ          |
               y |
             L1. |  -1.062711   .0318044   -33.41   0.000    -1.125124   -1.000299
    -------------+----------------------------------------------------------------
    LR           |
              x1 |
             L1. |   .0191763   .0099109     1.93   0.053    -.0002725    .0386252
    -------------+----------------------------------------------------------------
    SR           |
              x1 |
             D1. |   .0070096    .007535     0.93   0.352    -.0077768     .021796
                 |
           _cons |   .5206504   .0181471    28.69   0.000      .485039    .5562619
    ------------------------------------------------------------------------------
    .   mat `opt' = e(lags)
    . }
    
    . 
    . mat list `opt'
    
    __000000[1,2]
         y  x1
    r1   1   1
    As shown above, the optimal lags are 1 and 1. Then I run

    Code:
    ardl y x1, lags(1 1) ec1  // optimal lag equals to `opt'
    ARDL(1,1) regression
    
    Sample:         2 -      1000                   Number of obs     =        999
                                                    R-squared         =     0.5334
                                                    Adj R-squared     =     0.5320
    Log likelihood =  -186.9914                     Root MSE          =     0.2924
    
    ------------------------------------------------------------------------------
             D.y |      Coef.   Std. Err.      t    P>|t|     [95% Conf. Interval]
    -------------+----------------------------------------------------------------
    ADJ          |
               y |
             L1. |  -1.066707   .0316454   -33.71   0.000    -1.128807   -1.004608
    -------------+----------------------------------------------------------------
    LR           |
              x1 |
             L1. |   .0186711   .0098246     1.90   0.058    -.0006082    .0379504
    -------------+----------------------------------------------------------------
    SR           |
              x1 |
             D1. |   .0084251   .0074896     1.12   0.261    -.0062722    .0231223
                 |
           _cons |   .5239664   .0181209    28.92   0.000     .4884069     .559526
    ------------------------------------------------------------------------------
I don't understand why the numbers of observations used differ: one regression uses 990, the other 999.
Which answer is correct?

    Bests,
    wanhai





  • wanhaiyou
    replied
Originally posted by Sebastian Kripfganz:
    The equations on page 12 are derived from the equation on page 5. With ec1, when the lag order of ln_inc is zero, its first lag is included in the error-correction form. However, in terms of the coefficients of the equation on page 5, this first lag has a coefficient equal to zero. Because this coefficient equals zero, we need the restriction for the coefficients on page 12, which I mentioned in my previous post. Put differently, for your model there is a total of 8 coefficients in the level equation on page 5:
    Code:
    ardl ln_consump ln_inc ln_inv, lags(1 0 4)
    In the ec1 representation, page 12, you have 9 coefficients:
    Code:
    ardl ln_consump ln_inc ln_inv, lags(1 0 4) ec1
    For the two models to coincide, there must be 1 restriction on the coefficients in the latter version of the model.

    If you do not want to have this restriction, either estimate it with option ec, which again gives you 8 coefficients:
    Code:
    ardl ln_consump ln_inc ln_inv, lags(1 0 4) ec
    or allow for 1 unrestricted lag of ln_inc in the model:
    Code:
    ardl ln_consump ln_inc ln_inv, lags(1 1 4)
    ardl ln_consump ln_inc ln_inv, lags(1 1 4) ec1
    ardl ln_consump ln_inc ln_inv, lags(1 1 4) ec
    In the latter case, no restriction is needed. There are 9 coefficients in each version of the model. But obviously, because we allow for a nonzero coefficient of the first lag in the level version, the estimates differ.
Dear Prof. Kripfganz, I see. Many thanks for your kind help, and thanks again for your excellent routine ardl.

    Bests,
    wanhai you




  • Sebastian Kripfganz
    replied
    The equations on page 12 are derived from the equation on page 5. With ec1, when the lag order of ln_inc is zero, its first lag is included in the error-correction form. However, in terms of the coefficients of the equation on page 5, this first lag has a coefficient equal to zero. Because this coefficient equals zero, we need the restriction for the coefficients on page 12, which I mentioned in my previous post. Put differently, for your model there is a total of 8 coefficients in the level equation on page 5:
    Code:
    ardl ln_consump ln_inc ln_inv, lags(1 0 4)
    In the ec1 representation, page 12, you have 9 coefficients:
    Code:
    ardl ln_consump ln_inc ln_inv, lags(1 0 4) ec1
    For the two models to coincide, there must be 1 restriction on the coefficients in the latter version of the model.

    If you do not want to have this restriction, either estimate it with option ec, which again gives you 8 coefficients:
    Code:
    ardl ln_consump ln_inc ln_inv, lags(1 0 4) ec
    or allow for 1 unrestricted lag of ln_inc in the model:
    Code:
    ardl ln_consump ln_inc ln_inv, lags(1 1 4)
    ardl ln_consump ln_inc ln_inv, lags(1 1 4) ec1
    ardl ln_consump ln_inc ln_inv, lags(1 1 4) ec
    In the latter case, no restriction is needed. There are 9 coefficients in each version of the model. But obviously, because we allow for a nonzero coefficient of the first lag in the level version, the estimates differ.
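The coefficient counts can be checked by inspecting the width of the stored coefficient vector after each specification (a sketch using the lutkepohl2 data that appears elsewhere in this thread):

Code:
webuse lutkepohl2, clear
ardl ln_consump ln_inc ln_inv, lags(1 0 4)
display colsof(e(b))            // 8 coefficients in the level form
ardl ln_consump ln_inc ln_inv, lags(1 0 4) ec1
display colsof(e(b))            // 9 coefficients in the ec1 form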



  • wanhaiyou
    replied
Originally posted by Sebastian Kripfganz:
    The difficulty here is that the optimal lag order for ln_inc is zero, but you are forcing it to appear in the first lag in the EC representation (option ec1). This is achieved by effectively imposing a constraint as follows:
    Code:
    ardl ln_consump ln_inc ln_inv, lags(1 0 4) ec1
    constraint 1 L.ln_inc = D.ln_inc
    cnsreg D.ln_consump L.ln_consump L.ln_inc L.ln_inv D.ln_inc L(0/3)D.ln_inv if e(sample), constraints(1)
    The long-run coefficients in the ardl output are a nonlinear coefficient combination:
    Code:
    nlcom (- _b[L.ln_inc] / _b[L.ln_consump]) (- _b[L.ln_inv] / _b[L.ln_consump])
    If you use the ec instead of the ec1 option, this complication does not arise and you can replicate the results directly with regress:
    Code:
    ardl ln_consump ln_inc ln_inv, lags(1 0 4) ec
    reg D.ln_consump L.ln_consump ln_inc ln_inv L(0/3)D.ln_inv if e(sample)
    nlcom (- _b[ln_inc] / _b[L.ln_consump]) (- _b[ln_inv] / _b[L.ln_consump])
Thanks very much for your reply, dear Prof. Yes, I have read page 12 of your slides. I think the formula on page 5 is equivalent to the formula on page 12. Whether or not the lag order of ln_inc is zero,
x_{t-1} is always included in the EC representation, so the first lag of ln_inc is included. However, I don't follow why we need the constraint L.ln_inc = D.ln_inc.

Additionally, I think the EC1 representation is more common in the literature.

    Thanks again for your help.

    Bests,
    Wanhai You


    Reference:

Kripfganz, S. and D. C. Schneider (2018). ardl: Estimating autoregressive distributed lag and equilibrium correction models. Proceedings of the 2018 London Stata Conference.



  • Sebastian Kripfganz
    replied
    The difficulty here is that the optimal lag order for ln_inc is zero, but you are forcing it to appear in the first lag in the EC representation (option ec1). This is achieved by effectively imposing a constraint as follows:
    Code:
    ardl ln_consump ln_inc ln_inv, lags(1 0 4) ec1
    constraint 1 L.ln_inc = D.ln_inc
    cnsreg D.ln_consump L.ln_consump L.ln_inc L.ln_inv D.ln_inc L(0/3)D.ln_inv if e(sample), constraints(1)
    The long-run coefficients in the ardl output are a nonlinear coefficient combination:
    Code:
    nlcom (- _b[L.ln_inc] / _b[L.ln_consump]) (- _b[L.ln_inv] / _b[L.ln_consump])
    If you use the ec instead of the ec1 option, this complication does not arise and you can replicate the results directly with regress:
    Code:
    ardl ln_consump ln_inc ln_inv, lags(1 0 4) ec
    reg D.ln_consump L.ln_consump ln_inc ln_inv L(0/3)D.ln_inv if e(sample)
    nlcom (- _b[ln_inc] / _b[L.ln_consump]) (- _b[ln_inv] / _b[L.ln_consump])



  • wanhaiyou
    replied
Originally posted by Sebastian Kripfganz:
    I do not really know enough about the Westerlund test, so cannot give you a definite answer to this question.

    I(0) variables can potentially affect the long-run level relationship if the I(1) dependent variable is cointegrated with other I(1) regressors. In that case, there exists a linear combination between the dependent variable and those I(1) regressors which is I(0). The other I(0) regressors can then have a long-run effect on this I(0) linear relationship. In that regard, significant long-run coefficients of I(0) variables could still make sense.
    Dear Prof Kripfganz,
I want to replicate the ardl estimates with the regress command; however, the results are inconsistent. The corresponding output is as follows:
    Code:
    . webuse lutkepohl2,clear
    (Quarterly SA West German macro data, Bil DM, from Lutkepohl 1993 Table E.1)
    
    . ardl ln_consump ln_inc ln_inv, lags(1 0 4) ec1    // (1, 0, 4)
    
    ARDL(1,0,4) regression
    
    Sample: 1961q1 - 1982q4                         Number of obs     =         88
                                                    R-squared         =     0.5385
                                                    Adj R-squared     =     0.4981
    Log likelihood =  306.72241                     Root MSE          =     0.0078
    
    ------------------------------------------------------------------------------
    D.ln_consump |      Coef.   Std. Err.      t    P>|t|     [95% Conf. Interval]
    -------------+----------------------------------------------------------------
    ADJ          |
      ln_consump |
             L1. |  -.3335022    .043545    -7.66   0.000    -.4201595    -.246845
    -------------+----------------------------------------------------------------
    LR           |
          ln_inc |
             L1. |    1.00357    .032044    31.32   0.000     .9398006     1.06734
                 |
          ln_inv |
             L1. |  -.0417399   .0373449    -1.12   0.267    -.1160586    .0325788
    -------------+----------------------------------------------------------------
    SR           |
          ln_inc |
             D1. |   .3346929    .044616     7.50   0.000     .2459042    .4234815
                 |
          ln_inv |
             D1. |   .0556284   .0196031     2.84   0.006     .0166169      .09464
             LD. |   .0167529   .0200739     0.83   0.406    -.0231955    .0567012
            L2D. |   .0620601   .0198688     3.12   0.002     .0225198    .1016003
            L3D. |     .03124   .0194042     1.61   0.111    -.0073756    .0698555
                 |
           _cons |    .037445   .0118549     3.16   0.002     .0138529     .061037
    ------------------------------------------------------------------------------
    
    . 
    . // OLS estimator
    . reg d.ln_consump l.ln_consump l.ln_inc l.ln_inv d.ln_inc d.ln_inv l(1/3).d.ln_inv
    
          Source |       SS           df       MS      Number of obs   =        88
    -------------+----------------------------------   F(8, 79)        =     11.68
           Model |  .005677968         8  .000709746   Prob > F        =    0.0000
        Residual |  .004801841        79  .000060783   R-squared       =    0.5418
    -------------+----------------------------------   Adj R-squared   =    0.4954
           Total |  .010479809        87  .000120458   Root MSE        =     .0078
    
    ------------------------------------------------------------------------------
    D.ln_consump |      Coef.   Std. Err.      t    P>|t|     [95% Conf. Interval]
    -------------+----------------------------------------------------------------
      ln_consump |
             L1. |  -.3064987   .0563429    -5.44   0.000    -.4186463    -.194351
                 |
          ln_inc |
             L1. |   .3078729   .0570286     5.40   0.000     .1943604    .4213855
                 |
          ln_inv |
             L1. |  -.0130074   .0125833    -1.03   0.304    -.0380537     .012039
                 |
          ln_inc |
             D1. |    .382558   .0773686     4.94   0.000     .2285596    .5365565
                 |
          ln_inv |
             D1. |   .0542619   .0197379     2.75   0.007     .0149746    .0935491
             LD. |   .0116764   .0212116     0.55   0.584    -.0305443    .0538972
            L2D. |   .0586942   .0204104     2.88   0.005     .0180683      .09932
            L3D. |   .0295406   .0195846     1.51   0.135    -.0094415    .0685227
                 |
           _cons |   .0337566   .0128433     2.63   0.010     .0081927    .0593205
    ------------------------------------------------------------------------------
    
    . 
    end of do-file
Thanks for your kind help!


    Best regards,
    wanhai you



  • Simra Temmy
    replied
    Your interpretation is really very helpful. Many thanks!



  • Sebastian Kripfganz
    replied
    I do not really know enough about the Westerlund test, so cannot give you a definite answer to this question.

    I(0) variables can potentially affect the long-run level relationship if the I(1) dependent variable is cointegrated with other I(1) regressors. In that case, there exists a linear combination between the dependent variable and those I(1) regressors which is I(0). The other I(0) regressors can then have a long-run effect on this I(0) linear relationship. In that regard, significant long-run coefficients of I(0) variables could still make sense.
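The first part of this argument can be illustrated with a small hypothetical simulation (not from the original discussion): two I(1) series that share a common stochastic trend are cointegrated, so a linear combination of them is I(0):

Code:
clear
set seed 1234
set obs 500
gen time = _n
tsset time
gen trend = sum(rnormal())      // common random walk, I(1)
gen y = trend + rnormal()       // I(1)
gen x = trend + rnormal()       // I(1), cointegrated with y
gen z = y - x                   // the linear combination is I(0)
dfuller z                       // ADF test should reject a unit root in z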



  • Simra Temmy
    replied
Originally posted by Sebastian Kripfganz:
    1. You do not have to carry out separate cointegration tests. If the ARDL bounds test provides evidence in favor of a long-run level relationship, this would correspond to a cointegrating relationship if all involved variables are I(1).
    2. Cointegration can only exist between I(1) variables. This does not rule out that other I(0) variables affect the short-run dynamics. In fact, including I(0) variables can help to purge the error term from serial correlation. Absence of residual serial correlation is usually an assumption underlying those tests.
    3. The PMG estimator does not require that the time series are I(1). If they are I(1) but not cointegrated, then no long-run level relationship can exist among them.
    4. Based on the bounds test, we may conclude that there is evidence for a long-run level relationship. If the variables are I(1), we simply call this a cointegrating relationship. If we do not know that the variables are I(1), we should not call it cointegration.
5. If you find no evidence of a long-run relationship (whether we call it cointegration or not), you could estimate a simpler ARDL model in first differences only; that is, the model would contain only short-run terms. You do not necessarily have to re-estimate the model but could also just stick to the original ARDL model (which might be less efficient as it would contain unnecessary long-run terms).
    Dear Professor,
For #3, say the dependent variable is I(1) and the regressors are mixed: two of them are I(0) and the rest are I(1).
Are you saying it would be better to include all the variables in the Westerlund cointegration test, not only the I(1) ones, as I have seen in some PMG studies? (I tried both; no cointegration.)
And if we get significant long-run results for I(0) regressors from the PMG estimation, do these long-run coefficients make any sense?

Thank you so much for your replies.



  • Sebastian Kripfganz
    replied
    1. You do not have to carry out separate cointegration tests. If the ARDL bounds test provides evidence in favor of a long-run level relationship, this would correspond to a cointegrating relationship if all involved variables are I(1).
    2. Cointegration can only exist between I(1) variables. This does not rule out that other I(0) variables affect the short-run dynamics. In fact, including I(0) variables can help to purge the error term from serial correlation. Absence of residual serial correlation is usually an assumption underlying those tests.
    3. The PMG estimator does not require that the time series are I(1). If they are I(1) but not cointegrated, then no long-run level relationship can exist among them.
    4. Based on the bounds test, we may conclude that there is evidence for a long-run level relationship. If the variables are I(1), we simply call this a cointegrating relationship. If we do not know that the variables are I(1), we should not call it cointegration.
5. If you find no evidence of a long-run relationship (whether we call it cointegration or not), you could estimate a simpler ARDL model in first differences only; that is, the model would contain only short-run terms. You do not necessarily have to re-estimate the model but could also just stick to the original ARDL model (which might be less efficient as it would contain unnecessary long-run terms).



  • Simra Temmy
    replied
    Hi Professor,
Your comments really help us a lot. With the ARDL, something is still not clear to me. If you can answer my questions, you will once again be a great help both to me and to those who search for answers to these questions on Google.
1- If ARDL is the best option for series with a mixed order of integration, why do we still apply cointegration tests? As you already mentioned, cointegration can only exist among I(1) series.
2- Following on from question 1, some panel cointegration tests, such as Westerlund (2008), allow series with a mixed order of integration, right? How is it possible to find cointegration with those tests even if the series are a mix of I(1) and I(0)?
3- More precisely, for panel data with series of mixed integration order, if we apply the Westerlund test and find no cointegration, can we still proceed with PMG for both the short and the long run? If yes, what is the rationale for this procedure?
4- For time series, if we apply the ARDL bounds test to series of mixed integration order, how is it possible to conclude cointegration, as some studies do? Or is what they found in fact just a relationship, not cointegration?
5- For time series, if no cointegration is found for series of mixed integration order, should we keep only the short-run coefficients from the ARDL model, or is it still OK to continue with the full ARDL model even without cointegration?

I have no doubt that your answers will help everyone. Thank you very much in advance.

