  • Asha
    started a topic ARDL in stata

    ARDL in stata

    Does anyone know how to estimate an autoregressive distributed lag (ARDL) model in Stata? It is also known as the bounds testing method (Pesaran, Shin, and Smith 2001).

  • Sebastian Kripfganz
    replied
    If there is a structural break, then the bounds test for the full sample would be unreliable - with or without dummy/interaction terms.

    Other than that, things look okay. You might have to re-optimize the lag orders when including the interaction terms.


  • Sarah Magd
    replied
    Dear Prof. Sebastian Kripfganz

    Thank you very much for the constructive answers.


    Based on your answer, would the following steps be correct?

    Step one: run the bounds test on the full sample and the subsamples without any dummy variables.
    Step two: run the ARDL model on the full sample with a dummy variable for the subsample as an exogenous variable, but the interaction term would appear in the regular list of independent variables.


    For the first step, the code will be:
    Code:
    * Bounds test for the second subsample (t > 200)
    ardl Y X1 X2 X3 if t > 200, maxlags(6) aic maxcombs(15000) fast
    matrix list e(lags)
    ardl Y X1 X2 X3 if t > 200, ec1 lags(4 0 1 2)
    estat ectest
    * Bounds test for the first subsample (t <= 200, so that observation 200 is not dropped)
    ardl Y X1 X2 X3 if t <= 200, maxlags(6) aic maxcombs(15000) fast
    matrix list e(lags)
    ardl Y X1 X2 X3 if t <= 200, ec1 lags(2 0 3 4)
    estat ectest
    * Bounds test for the full sample
    ardl Y X1 X2 X3, maxlags(6) aic maxcombs(15000) fast
    matrix list e(lags)
    ardl Y X1 X2 X3, ec1 lags(6 0 5 4)
    estat ectest


    For the second step, the code will be:
    Code:
     
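    * d_s2 is a dummy variable that equals 1 if t > 200
    generate X1_d = X1*d_s2
    generate X2_d = X2*d_s2
    generate X3_d = X3*d_s2
    * lags() needs one entry per variable; "." leaves the lag order of
    * the interaction terms to the information criterion (here the AIC)
    ardl Y X1 X2 X3 X1_d X2_d X3_d, ec1 exog(d_s2) aic lags(6 0 5 4 . . .)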

    Could you please let me know if there are any mistakes in my code? Should I also control for the structural break by adding a dummy variable as an exogenous variable when testing for cointegration in the full sample?


  • Sebastian Kripfganz
    replied
    (i) You can do that; just note that the ardl command does not accept factor variable notation. You would need to create separate variables for the interaction terms first. Consequently, the command will not know that these are interaction terms and therefore computes any long-run effects as if they were separate variables. However, this is probably okay in your case because the structural break should also imply different long-run effects in the two subsamples.

    (ii) There is really just one "endogenous" variable in an ARDL model, which is the dependent variable. You can specify the dummy variable itself with the exog() option, but interaction terms should usually still appear in the regular list of independent variables.

    (iii) The main issue would be that the bounds test for existence of a long-run level relationship is no longer valid for a model with a structural break. It would only be applicable on each subsample individually.


  • Sarah Magd
    replied
    Dear Prof. Sebastian Kripfganz

    I am investigating the drivers of a dependent variable shown in the figure below. In the first part of my analysis, I found a structural break in May 2023. Therefore, I ran the initial analysis on two subsamples.

    However, to analyze the drivers of this variable, I want to conduct the analysis for the entire sample using the ARDL model. My questions are:

    (i) Given that my dependent variable has a structural break, can I include a dummy variable in my ARDL model and interact it with the independent variables I have?
    (ii) If so, should I treat the dummy variable for this structural break and the interaction terms as exogenous or endogenous variables?
    (iii) Would there be any issues if I use this model with daily data?

    Could you please provide guidance on this?


    Thanks.

    [Figure: figure.png]


  • Sebastian Kripfganz
    replied
    The lags of the first-differenced terms in the error-correction representation (option ec or ec1) are determined as the number of lags in the level representation (as specified with option lags()) minus 1. If you want 3 lags of the first-differenced terms, specify option lags(4).
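
    As a minimal sketch of this rule, the lines below use the lutkepohl2 example dataset that appears elsewhere in this thread. The variables and the lag orders are purely illustrative assumptions. With lags(4 4), the short-run part of the ec1 output should contain first-differenced terms up to the third lag. The second call fixes the regressor at 4 lags in levels and writes "." for the dependent variable, so that its lag order is left to the information criterion.
    Code:
    webuse lutkepohl2, clear
    * 4 lags in levels for both variables, hence up to 3 lags of the first differences
    ardl ln_consump ln_inc, lags(4 4) ec1
    * fix only the regressor at 4 lags; "." lets the AIC choose the lag order of ln_consump
    ardl ln_consump ln_inc, lags(. 4) aic ec1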


  • lorenabarberia
    replied
    Sebastian Kripfganz In the ardl command, it is possible to specify the number of lags for the dependent variable and the regressors using "lags(1 1 . . . .)". This is very helpful in some scenarios. I am wondering how to include lags of the first differences using the same logic. Is there a way to implement this manually, so that a certain number of lags of the first differences of specific regressors is ensured?


  • joseph Mgaya
    replied
    Sorry, I meant q=0.


  • joseph Mgaya
    replied
    I used the Newey-West standard error estimator for the short-run coefficients with the ec1 option, but the results are missing for the variables with d=0. How can I get these results?


  • Sebastian Kripfganz
    replied
    The critical values of the bounds test rely on the assumption of iid errors. If there is remaining serial correlation in the errors, the bounds test is not reliable.

    Note that the newey command does not correct the coefficient estimates. It only produces robust standard errors. Hence, if there was serial correlation in the residuals before running the newey command, there will still be serial correlation in the residuals afterwards. The residuals are unchanged.
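
    A sketch of how this could be checked, reusing the example posted earlier in this thread. The variable names res_ols, xb_nw, and res_nw are made up for illustration. Because newey reproduces the OLS coefficients, residuals built by hand from its linear prediction should coincide with the residuals of the stored regress results, so the usual serial-correlation tests can be run on the latter.
    Code:
    webuse lutkepohl2, clear
    quietly ardl ln_consump ln_inc, exog(L(0/3)D.ln_inv) trend(qtr) aic regstore(ardlreg)
    quietly estimates restore ardlreg
    * residuals and Breusch-Godfrey test based on the underlying regress results
    predict double res_ols, residuals
    estat bgodfrey, lags(1/4)
    * refit the same regression with Newey-West standard errors
    local cmdline `"`e(cmdline)'"'
    gettoken cmd cmdline : cmdline
    quietly newey `cmdline' lag(4)
    * residuals after newey, built by hand from the linear prediction
    predict double xb_nw, xb
    generate double res_nw = `e(depvar)' - xb_nw if e(sample)
    * the two residual series should agree up to rounding
    assert reldif(res_ols, res_nw) < 1e-6 if e(sample)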


  • lorenabarberia
    replied
    Also, I am not sure how we could obtain estimates of the residuals after the newey command following ardl. The purpose is to verify whether we have white-noise residuals. I tried:

    estimates store newey
    estimates restore newey
    predict residualnewy, resid

    However, Stata reports "option r not allowed r(198);"



    Originally posted by Sebastian Kripfganz
    Code:
    . webuse lutkepohl2
    . quietly ardl ln_consump ln_inc, exog(L(0/3)D.ln_inv) trend(qtr) aic regstore(ardlreg)
    . quietly estimates restore ardlreg
    . local cmdline `"`e(cmdline)'"'
    . gettoken cmd cmdline : cmdline
    . newey `cmdline' lag(4)

    The local cmdline contains the corresponding command line for the regress command (excluding the command name), not the ardl command. With the ardl option regstore(), the results are stored using the regress command; these are then subsequently recovered with the estimates restore command. The newey command eventually fits the same regression.

    To see what is contained in the local cmdline, execute the above code and add the line
    Code:
    . display `"`cmdline'"'


  • lorenabarberia
    replied
    Your package and working paper, as well as this thread, are very helpful! Thank you. If one wants to use Newey-West standard errors, I understand how to use the results from the routine below to obtain the short-run and long-run effects of the model. However, I am not sure how to conduct the bounds test with Newey-West standard errors. One reason for the correction is that autocorrelation was found in the residuals, so the bounds test would presumably also need to take the new estimates into account. Any guidance you can offer would be much appreciated.

    Originally posted by Sebastian Kripfganz
    Code:
    . webuse lutkepohl2
    . quietly ardl ln_consump ln_inc, exog(L(0/3)D.ln_inv) trend(qtr) aic regstore(ardlreg)
    . quietly estimates restore ardlreg
    . local cmdline `"`e(cmdline)'"'
    . gettoken cmd cmdline : cmdline
    . newey `cmdline' lag(4)
    The local cmdline contains the corresponding command line for the regress command (excluding the command name), not the ardl command. With the ardl option regstore(), the results are stored using the regress command; these are then subsequently recovered with the estimates restore command. The newey command eventually fits the same regression.

    To see what is contained in the local cmdline, execute the above code and add the line
    Code:
    . display `"`cmdline'"'


  • Sebastian Kripfganz
    replied
    That is relatively small for a time series analysis. Based on this additional information, I would no longer recommend increasing the maximum lag order.

    You might simply want to report both conventional and Newey-West standard errors instead of choosing between them.


  • joseph Mgaya
    replied
    My sample size is 37 years of annual data.


  • Sebastian Kripfganz
    replied
    You shouldn't choose the standard errors based on whether this delivers "improved" statistical significance. In general, I would not expect the standard errors to become smaller - i.e., coefficient estimates to become more significant - when using Newey-West standard errors.

    You are right that a p-value of 0.066 is not very comfortable. This could be used as a justification to use Newey-West standard errors. Alternatively, you might want to address the serial correlation in the first place by allowing for higher-order lags in the ARDL specification. If your sample size allows, you could increase the maximum lag order with the ardl option maxlags(), and/or use the AIC instead of the BIC as the model selection criterion.
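
    A sketch of such a comparison on the lutkepohl2 example dataset used elsewhere in this thread. The maximum lag orders of 4 and 6 and the names reg_bic and reg_aic are arbitrary illustrations, not recommendations. Each block shows the selected lag orders and then runs a Breusch-Godfrey test on the stored regress results.
    Code:
    webuse lutkepohl2, clear
    * baseline: default BIC selection with at most 4 lags
    quietly ardl ln_consump ln_inc, maxlags(4) regstore(reg_bic)
    matrix list e(lags)
    quietly estimates restore reg_bic
    estat bgodfrey, lags(1/4)
    * alternative: AIC selection with at most 6 lags
    quietly ardl ln_consump ln_inc, maxlags(6) aic regstore(reg_aic)
    matrix list e(lags)
    quietly estimates restore reg_aic
    estat bgodfrey, lags(1/4)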
