ARDL Modelling

Jonno WIlkes

#1

06 Apr 2020, 12:13

I am running some analysis looking at the effect of a currency depreciation on commodity trade. My variables are non-stationary, so I applied first differences to them, and one variable was still non-stationary. I applied varsoc to the variables in original log form and to the first-differenced variables, and the optimal number of lags was 4. I then applied vecrank, and it showed that the variables are cointegrated with rank 2. I applied ARDL with 4 lags. The following error then occurs:

ardl lnrex lnx lnm lnukgdp lnchinagdp, lags(4 4 4 4 4) ec
note: L.lnchinagdp omitted because of collinearity
note: L3.lnchinagdp omitted because of collinearity
note: L4.lnchinagdp omitted because of collinearity
Collinear variables detected.

What does this mean? I am able to generate ARDL results using 3 lags, which is confusing.

Justin Niakamal

#2

06 Apr 2020, 13:12

ardl is a user-written command by Sebastian Kripfganz and Daniel Schneider.

"My variables are non-stationary, so I applied first differences to them, and one variable was still non-stationary."

This can be problematic: ardl's bounds testing requires that no variable is I(2).

I would look through Sebastian's slides here:
http://repec.org/usug2018/uk18_Kripfganz.pdf

Note from slide 2:

"The existence of a long-run / cointegrating relationship can be tested based on the EC representation. A bounds testing procedure is available to draw conclusive inference without knowing whether the variables are integrated of order zero or one, I(0) or I(1)"

Lastly, ardl can select the optimal lag lengths for you; specify the aic or bic option.

Jonno WIlkes

#3

07 Apr 2020, 05:11

Thanks for the advice, Justin, but I still get the same error. Please see my error as well as the varsoc tests.
Attached Files

Justin Niakamal

#4

07 Apr 2020, 06:51

Remove the lags() option and run:

Code:

ardl lnrex lnx lnm lnukgdp lnchinagdp, aic ec
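
To check which lag orders the information criterion actually picked, the selected orders can be listed after estimation. This is only a minimal sketch: it assumes the ardl package is installed and the data have already been declared as time series with tsset, and it reuses the variable names from post #1.

```stata
* Assumes: ardl installed, data tsset, variable names as in post #1.
ardl lnrex lnx lnm lnukgdp lnchinagdp, aic ec
matrix list e(lags)    // lag order selected for each variable
```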

Sebastian Kripfganz

#5

07 Apr 2020, 06:58

This error typically occurs in models with many variables and few time periods. Note that an ARDL model with 4 independent variables and 4 lags for each independent variable and the dependent variable has 25 coefficients (including the intercept). Given your small number of observations, estimating such a model is impossible. In addition to Justin's recommendation, you need to restrict the maximum number of lags substantially with the maxlags() option.
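
The count of 25 follows from simple arithmetic: 4 lags of the dependent variable, the contemporaneous value plus 4 lags of each of the 4 independent variables, and the intercept. A small sketch of that calculation (the scalar names are illustrative only and are not part of ardl):

```stata
* Illustrative arithmetic only; the scalar names are hypothetical.
scalar p = 4                         // lags of the dependent variable
scalar k = 4                         // number of independent variables
scalar q = 4                         // lags of each independent variable
scalar ncoef = p + k*(q + 1) + 1     // q + 1 adds lag 0; final + 1 is the intercept
display "estimated coefficients: " scalar(ncoef)    // 25
```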

More on ARDL estimation:
ARDL: updated Stata command for the estimation of autoregressive distributed lag and error correction models

Kripfganz, S. and D. C. Schneider (2018). ardl: Estimating autoregressive distributed lag and equilibrium correction models. Proceedings of the 2018 London Stata Conference.

https://www.kripfganz.de/stata/

Jonno WIlkes

#6

08 Apr 2020, 06:32

Okay, great, thanks for that! I am still unsure why, when I test for the number of lags to use with the varsoc command, it suggests I use 4 lags, which doesn't work. Yet if I use 3 lags, I am able to get ARDL results.

Shruti Gungabissoon

#7

21 Apr 2020, 13:44

Hello, I am writing a thesis on the impact of foreign direct investment on employment in Mauritius, and I adopted the ARDL model because there was a mix of I(1) and I(0) variables. All my tests are significant and good; the only issue is multicollinearity. Does multicollinearity matter in ARDL, or can I ignore it?

Justin Niakamal

#8

21 Apr 2020, 14:29

Hi Shruti,

Short answer: it generally doesn't matter. Don't take my word for it; see Dave Giles' "Some Questions About ARDL Models", where he addresses multicollinearity in point #3 of his post. Of course, make sure you have enough observations relative to the number of parameters you're estimating. If you're worried about overfitting in terms of lags, the bic option (which is the default) will generally result in a more parsimonious fit; also consider the maxlags() option. Hope this helps.

Shruti Gungabissoon

#9

22 Apr 2020, 05:24

Thank you Justin Blasongame. I will have a look and revert back if any issues.

Jonno WIlkes

#10

05 May 2020, 08:15

Hi there,

I am running an ARDL model with 4 independent variables and 4 lags on 26 years of data. Is it not possible to carry out the F test for joint significance of the lagged variables, as a sign of cointegration, without a minimum of 30 time periods? I ask because Narayan (2005) produces critical values for small samples.

Is there an alternative test to the F test I can carry out?

Sebastian Kripfganz

#11

05 May 2020, 08:51

The small-sample critical values implemented in the ardl postestimation command estat ectest are not from Narayan but from the following paper:
Kripfganz, S. and D. C. Schneider (2020). Response surface regressions for critical value bounds and approximate p-values in equilibrium correction models. Oxford Bulletin of Economics and Statistics, forthcoming.

These critical values are available for any number of time periods. However, as a precautionary measure, the command will not perform the bounds test if the number of observations is not at least twice as high as the number of estimated coefficients. With just 30 time periods and 4 independent variables, you need to limit the number of lags substantially to avoid overfitting of the model.
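
One way to compare the sample size with the number of estimated coefficients before looking at the bounds test is sketched below. It assumes the variable names from post #1, an installed ardl package, and tsset data. The cap of 2 lags per variable is an arbitrary choice for illustration, and colsof(e(b)) is used here as a rough coefficient count that may not match exactly the count the command applies internally.

```stata
* Assumes: ardl installed, data tsset; the lag cap of 2 is illustrative only.
ardl lnrex lnx lnm lnukgdp lnchinagdp, maxlags(2 2 2 2 2) bic ec
display "observations = " e(N) ", coefficients = " colsof(e(b))
estat ectest           // bounds test; needs observations >= 2 x coefficients
```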

Last edited by Sebastian Kripfganz; 05 May 2020, 08:54.

https://www.kripfganz.de/stata/

John Costopoulos

#12

18 Dec 2020, 12:54

I have two questions regarding post #11.
1. Is the calculation of the number of estimated coefficients based on the number of maximum lags (e.g. 4), or on the actual ARDL model (e.g. ARDL(1,2,2,1,2,3))?
2. Let's assume that we have 30 estimated coefficients on a sample of 90 observations, a situation which conforms to the recommendation of having a number of observations at least twice as high as the number of estimated coefficients. In this case, however, we have only 3 observations per estimated coefficient (regressor) when a common rule of thumb in regression analysis is to have at least 10 observations per regressor. Aren't these two recommendations contradictory?

Last edited by John Costopoulos; 18 Dec 2020, 13:32.

Sebastian Kripfganz

#13

19 Dec 2020, 08:50

It is based on the actual ARDL model.

I definitely agree with you that 3 observations per estimated coefficient is usually far too few. Having at least twice as many observations as the number of estimated coefficients is not a "recommendation" we make. It is merely a technical restriction that we impose. Below that, no bounds test results will be displayed. But you are right, in practice one should typically have many more observations. How many? Ultimately, that becomes a decision of the user, not one we want to impose.

https://www.kripfganz.de/stata/

John Costopoulos

#14

19 Dec 2020, 12:06

Dear Professor Kripfganz,

Thank you very much for your prompt and enlightening response. I'm sorry for any inconvenience caused to you but I would appreciate it very much if you could provide some guidance on the second part of your answer.

I have noticed that many empirical studies using the ARDL approach struggle for data. A usual case is to have about 30-50 annual observations. Let's say that we have a sample of 40 observations and 3 independent variables. If we used an ARDL model with no exogenous independent variables (e.g. trend) and max lags=2, then we might arrive, for example, at the ARDL(1,1,1,1) and ARDL(1,2,2,2) models, using the BIC and AIC criteria respectively. Although the ARDL model with the AIC criterion provides richer information on delayed SR dynamics, let's assume that we finally preferred the more parsimonious ARDL(1,1,1,1) model based on BIC. In this case, we would get 1 ADJ coefficient + 3 LR coefficients + 4 SR coefficients = 8 coefficients. In other words, we would have 40/8=5 observations per coefficient (or 40/9≈4.4 if we took into account the coefficient of the constant term). I wonder how we can decide whether it is a reliable model or not. Would a good Adj R-squared value (e.g. 60) be enough to encourage us to proceed further?

Last edited by John Costopoulos; 19 Dec 2020, 12:12.

Sebastian Kripfganz

#15

20 Dec 2020, 02:09

I am not aware of a universally accepted rule of thumb to decide whether the sample size is large enough.
An adj. R-squared of 60 might be encouraging for one application but not for another.

If you find some good reference about this issue elsewhere, it would be nice if you could post it here.

https://www.kripfganz.de/stata/