  • How to save the residuals from a Dickey-Fuller test (dfuller)?

    I'm not sure how to find and save the residuals from a Dickey-Fuller test, because I cannot find any information on whether or where these residuals are stored.
    Am I wrong to assume that it works as in a linear regression, so that I can write:

    Code:
    dfuller RealGDP, lags(4)
    predict resid, residuals
    If my assumption is right, the residuals should now be stored in "resid". When I list "resid", I do indeed get a series that looks like residuals.

    But I want to be sure, since I cannot find anything in the Stata manual entry for "dfuller" that says whether dfuller stores the residuals.

    Thanks a lot for any clarification!

    John

  • #2
    The dfuller command is not an estimation command, so the predict postestimation command does not work after it. Normally, you would receive the error message
    Code:
    last estimates not found
    r(301);
    If that is not the case and predict actually generates a new variable, then these must be the residuals from an earlier estimation that you have run before the dfuller command.

    You can use the regress option of the dfuller command to display the underlying regression estimates. This does not store the estimation results in memory, so predict would still not work. It does, however, show you what the regression model looks like, and you can easily re-estimate the model with the regress command and then use predict.
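
    A minimal sketch of that re-estimation, assuming the lutkepohl2 example dataset (which is already tsset), a constant, and three lagged differences. The variable name adf_resid is purely illustrative:

    ```stata
    * Illustrative sketch: dataset, lag order, and variable name are assumptions
    webuse lutkepohl2, clear

    * Display the regression that dfuller runs internally
    dfuller ln_consump, lags(3) regress

    * Re-estimate the same model with regress so that e() results are stored
    regress D.ln_consump L.ln_consump DL(1/3).ln_consump

    * predict now works and stores the residuals of the ADF regression
    predict double adf_resid, residuals
    ```

    The coefficient on L.ln_consump divided by its standard error should match the Z(t) statistic that dfuller reports for the same specification.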
    https://www.kripfganz.de/stata/



    • #3
      Thanks a lot for the clarification, Sebastian!
      It is a pity that the "dfuller" residuals are not stored, since the ADF test requires serially uncorrelated residuals, so one should check the residuals somehow.
      Would a lag selection with "varsoc" make sense?
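
      One possible way to run the residual check mentioned above is to re-estimate the ADF regression with regress and apply a Breusch-Godfrey test, which remains valid when lagged dependent variables are among the regressors. This is only a sketch: the lutkepohl2 dataset, the three lagged differences, and the test order of 4 are assumptions chosen for illustration.

      ```stata
      * Illustrative sketch: dataset, lag order, and test order are assumptions
      webuse lutkepohl2, clear

      * ADF regression with a constant and three lagged differences
      regress D.ln_consump L.ln_consump DL(1/3).ln_consump

      * Breusch-Godfrey LM test, H0: no serial correlation up to the given order
      estat bgodfrey, lags(1/4)
      ```

      A rejection would suggest that more lagged differences are needed in the ADF regression.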



      • #4
        I have now created a kind of workaround: I used "viewsource" to look at the original "dfuller" code and then added a very small modification (marked in the code below) that saves the ADF regression residuals in a variable called "ADFresiduals", which is available after the new command "dfuller_modified" has run. The call is then "dfuller_modified YourTimeSeriesName". The code has to be saved under "C:\Program Files (x86)\Stata15\ado\base\d" (where Stata's own ado-files are typically stored) under the name "dfuller_modified.ado". The code is:


        Code:
        *! version 1.3.0  10oct2017
        *** Modified by John Whymer to store ADF residuals in "ADFresiduals" ****
        program define dfuller_modified, rclass
                version 6.0, missing
                syntax varname(ts) [if] [in] [, TRend noCONstant /*
                        */ DRift Lags(int -1) REGress /*
                        */ CERTIFY ]

                /* certify is an undocumented option that keeps the
                   dickey-fuller regression results lying around for
                   certification purposes
                */
                if "`drift'" != "" {
                        if "`constan'" != "" {
                                noi di as error "cannot specify drift if constant is excluded"
                                exit 198
                        }
                        if "`trend'" != "" {
                                noi di as error "cannot specify drift if time trend is included"
                                exit 198
                        }
                }

                if      "`constan'" != "" & "`trend'" == "" { local case 1 }
                else if "`constan'" == "" & "`trend'" == "" & "`drift'" == "" {
                        local case 2
                }
                else if "`constan'" == "" & "`trend'" == "" & "`drift'" != "" {
                        local case 3
                }
                else if "`constan'" == "" & "`trend'" != "" { local case 4 }
                else {
                        noi di in red "cannot choose trend if constant is excluded"
                        exit 198
                }

                marksample touse
                _ts tvar panelvar `if' `in', sort onepanel
                markout `touse' `tvar'
                local samp "if `touse'==1"

                tempname usrest
                // may not be e-class() stuff lying around, so capture this
                if "`certify'" == "" {
                        version 10: _estimates hold `usrest', copy restore nullok
                }
                quietly {
                        if `lags' < 0 { local lags 0 }
                        local mac
                        if `case' == 2 { local mac "c"  }
                        if `case' == 4 { local mac "ct" }
                        if "`trend'" != "" {
                                summ `tvar' `samp', meanonly
                                local min = r(min)
                                tempvar tt
                                gen long `tt' = `tvar'-r(min)
                        }
                        if `lags' == 0 {
                                reg D.`varlist' L.`varlist' `tt' `samp', `constan'
                        }
                        else {
                                reg D.`varlist' L.`varlist' DL(1/`lags').`varlist' /*
                                        */ `tt' `samp', `constan'
                                local aug "Augmented "
                        }
                        local T = e(N)
                        local n = e(N) - e(df_r)
                        local Zt = _b[L.`varlist'] / _se[L.`varlist']

                        if "`mac'" != "" {
                                MacP `mac' `Zt'
                                local ztp = `r(p)'
                        }
                        if `case' == 3 {
                                local ztp = 1 - ttail(e(df_r), `Zt')
                        }
                        
                        GetCrit `case' `T' `varname'
                }
        ***************************************************************
        * Modification to save residuals in ADFresiduals by John Whymer:
        ***************************************************************
         predict ADFresid, residuals
         generate ADFresiduals = ADFresid
        ***************************************************************
                noi di in gr _n "`aug'Dickey-Fuller test for unit root" /*
                        */ _col(52) "Number of obs   = " in ye %9.0g `T'
                if `case' == 3 {
                        di _n in smcl as text _col(32) /*
                                */ "{hline 11} Z(t) has t-distribution {hline 11}"
                }
                else {
                        di _n in smcl in gr _col(32) /*
                                */ "{hline 10} Interpolated Dickey-Fuller {hline 9}"
                }
                di in gr _col(19) "Test" /*
                        */ _col(32)  "1% Critical" /*
                        */ _col(50)  "5% Critical" /*
                        */ _col(67) "10% Critical"
                di in gr _col(16) "Statistic" /*
                        */ _col(36)  "Value" /*
                        */ _col(54)  "Value" /*
                        */ _col(72) "Value"
                di in gr in smcl "{hline 78}"

                di in gr " Z(t)" /*
                        */ _col(15) in ye %10.3f `Zt' /*
                        */ _col(33) %10.3f `r(Zt1)' /*
                        */ _col(51) %10.3f `r(Zt5)' /*
                        */ _col(69) %10.3f `r(Zt10)'
                ret scalar cv10 = `r(Zt10)'
                ret scalar cv5  = `r(Zt5)'
                ret scalar cv1  = `r(Zt1)'
                if `case' == 3 {
                        di as text in smcl "{hline 78}"
                        di as text "p-value for Z(t) = " as res %6.4f `ztp'
                        ret scalar p = `ztp'
                }
                else if "`ztp'" != "" {
                        di in gr in smcl "{hline 78}"
                        di in gr "MacKinnon approximate p-value for Z(t) = " /*
                                */ in ye %6.4f `ztp'
                        ret scalar p   = `ztp'
                }

                if "`regress'" != "" {
                        di
                        if "`tt'" != "" {
                                DispReg `tt' `lags' `varlist'
                        }
                        else {
                                regress, nohead
                        }
                }

                ret scalar Zt     = `Zt'
                ret scalar N      = `T'
                ret scalar lags   = `lags'
        end

        program define GetCrit, rclass

                args case N varlist

                /* Take care of case 3 first, since easiest */
                if `case' == 3 {
                        local zt1 = invttail(e(df_r), 0.99)
                        local zt5 = invttail(e(df_r), 0.95)
                        local zt10 = invttail(e(df_r), 0.90)
                        return scalar Zt1  = `zt1'
                        return scalar Zt5  = `zt5'
                        return scalar Zt10 = `zt10'
                        exit
                }                      
                        

                tempname zt
                
                if `case' == 1 {
                        mat `zt' = ( -2.66,-2.62,-2.60,-2.58,-2.58,-2.58\ /*
                                  */ -1.95,-1.95,-1.95,-1.95,-1.95,-1.95\ /*
                                  */ -1.60,-1.61,-1.61,-1.62,-1.62,-1.62)
                }
                else if `case' == 2 {
                        mat `zt' = ( -3.75,-3.58,-3.51,-3.46,-3.44,-3.43\ /*
                                  */ -3.00,-2.93,-2.89,-2.88,-2.87,-2.86\ /*
                                  */ -2.63,-2.60,-2.58,-2.57,-2.57,-2.57)
                }
                else {
                        mat `zt' = ( -4.38,-4.15,-4.04,-3.99,-3.98,-3.96\ /*
                                  */ -3.60,-3.50,-3.45,-3.43,-3.42,-3.41\ /*
                                  */ -3.24,-3.18,-3.15,-3.13,-3.13,-3.12)
                }

                if `N' <= 25 {
                        local zt1  = `zt'[1,1]
                        local zt5  = `zt'[2,1]
                        local zt10 = `zt'[3,1]
                }
                else if `N' <= 50 {
                        local zt1  = `zt'[1,1] + (`N'-25)/25 * (`zt'[1,2]-`zt'[1,1])
                        local zt5  = `zt'[2,1] + (`N'-25)/25 * (`zt'[2,2]-`zt'[2,1])
                        local zt10 = `zt'[3,1] + (`N'-25)/25 * (`zt'[3,2]-`zt'[3,1])
                }
                else if `N' <= 100 {
                        local zt1  = `zt'[1,2] + (`N'-50)/50 * (`zt'[1,3]-`zt'[1,2])
                        local zt5  = `zt'[2,2] + (`N'-50)/50 * (`zt'[2,3]-`zt'[2,2])
                        local zt10 = `zt'[3,2] + (`N'-50)/50 * (`zt'[3,3]-`zt'[3,2])
                }
                else if `N' <= 250 {
                        local zt1  = `zt'[1,3] + (`N'-100)/150 * (`zt'[1,4]-`zt'[1,3])
                        local zt5  = `zt'[2,3] + (`N'-100)/150 * (`zt'[2,4]-`zt'[2,3])
                        local zt10 = `zt'[3,3] + (`N'-100)/150 * (`zt'[3,4]-`zt'[3,3])
                }
                else if `N' <= 500 {
                        local zt1  = `zt'[1,4] + (`N'-250)/250 * (`zt'[1,5]-`zt'[1,4])
                        local zt5  = `zt'[2,4] + (`N'-250)/250 * (`zt'[2,5]-`zt'[2,4])
                        local zt10 = `zt'[3,4] + (`N'-250)/250 * (`zt'[3,5]-`zt'[3,4])
                }
                else {
                        local zt1  = `zt'[1,6]
                        local zt5  = `zt'[2,6]
                        local zt10 = `zt'[3,6]
                }
                return scalar Zt1  = `zt1'
                return scalar Zt5  = `zt5'
                return scalar Zt10 = `zt10'
        end


        program define MacP, rclass
                args type tau

                local stype = lower("`type'")
                if "`stype'"=="c" { local type 0 }
                else              { local type 1 }


                local g3=0
                local min=.
                local max=.
                if `type'==0 {  /* no trend but constant in ADF regression */
                        if `tau'>-1.61 {
                                local min = -9999
                                local max = 2.74
                                local g0 = 1.7339
                                local g1 = 0.93202
                                local g2 = -0.12745
                                local g3 = -0.010368
                        }
                        else {
                                local min = -18.83
                                local g0 = 2.1659
                                local g1 = 1.4412
                                local g2 = 0.038269
                                local g3 = 0
                        }
                } /* type==0 */
                else if `type'==1 {     /* linear trend and constant in ADF reg.*/
                        if `tau'>-2.89 {
                                local min = -9999
                                local max = 0.70
                                local g0 = 2.5261
                                local g1 = 0.61654
                                local g2 = -0.37956
                                local g3 = -0.060285
                        }
                        else {
                                local min = -16.18
                                local g0 = 3.2512
                                local g1 = 1.6047
                                local g2 = 0.049588
                                local g3 = 0
                        }
                } /* type==1 */

                local h = `g0' + `g1'*`tau' + `g2'*(`tau')^2 + `g3'*(`tau')^3
                local p = cond(`tau'<`min',0,cond(`tau'>`max',1,normprob(`h')))
                return scalar p = `p'

                local h = `g0' + `g1'*`tau' + `g2'*(`tau')^2 + `g3'*(`tau')^3
                local p = cond(`tau'<`min',0,cond(`tau'>`max',1,normprob(`h')))
                return scalar p = `p'
        end

        program define DispReg
                args tt lags dvar
                di in smcl in gr "{hline 13}{c TT}{hline 64}"
                di in smcl in gr abbrev("`e(depvar)'",12) _col(14) "{c |}" /*
                        */ _col(21) "Coef." _col(29) "Std. Err." _col(44) "t" /*
                        */ _col(49) "P>|t|" _col(59) "[95% Conf. Interval]"
                di in smcl in gr "{hline 13}{c +}{hline 64}"
                di in smcl in gr %12s abbrev("`dvar'",12) _col(14) "{c |}"
                local vv "L1.`dvar'"
                local bv "_b[`vv']"
                local sv "_se[`vv']"
                di in smcl in gr _col(10) "L1. {c |}" in ye /*
                        */ _col(17) %9.0g `bv' /*
                        */ _col(28) %9.0g `sv' /*
                        */ _col(38) %8.2f `bv'/`sv' /*
                        */ _col(48) %6.3f tprob(e(df_r),`bv'/`sv') /*
                        */ _col(58) %9.0g `bv' - invt(`e(df_r)',$S_level/100)*`sv' /*
                        */ _col(70) %9.0g `bv' + invt(`e(df_r)',$S_level/100)*`sv'
                local vv "LD.`dvar'"
                local bv "_b[`vv']"
                local sv "_se[`vv']"
                if `lags' >= 1 {
                        di in smcl in gr _col(10) "LD. {c |}" in ye /*
                                */ _col(17) %9.0g `bv' /*
                                */ _col(28) %9.0g `sv' /*
                                */ _col(38) %8.2f `bv'/`sv' /*
                                */ _col(48) %6.3f tprob(e(df_r),`bv'/`sv') /*
                                */ _col(58) %9.0g `bv' - invt(`e(df_r)',/*
                                */ $S_level/100)*`sv' /*
                                */ _col(70) %9.0g `bv' + invt(`e(df_r)',/*
                                */ $S_level/100)*`sv'
                }
                local i 2
                while `i' <= `lags' {
                        local vv "L`i'D.`dvar'"
                        local bv "_b[`vv']"
                        local sv "_se[`vv']"
                        di in smcl in gr %12s "L`i'D." " {c |}" in ye /*
                                */ _col(17) %9.0g `bv' /*
                                */ _col(28) %9.0g `sv' /*
                                */ _col(38) %8.2f `bv'/`sv' /*
                                */ _col(48) %6.3f tprob(e(df_r),`bv'/`sv') /*
                                */ _col(58) %9.0g `bv' - invt(`e(df_r)',/*
                                */ $S_level/100)*`sv' /*
                                */ _col(70) %9.0g `bv' + invt(`e(df_r)',/*
                                */ $S_level/100)*`sv'
                        local i = `i'+1
                }
                local vv "`tt'"
                local bv "_b[`vv']"
                local sv "_se[`vv']"
                di in smcl in gr %12s "_trend" _col(14) "{c |}" in ye /*
                        */ _col(17) %9.0g `bv' /*
                        */ _col(28) %9.0g `sv' /*
                        */ _col(38) %8.2f `bv'/`sv' /*
                        */ _col(48) %6.3f tprob(e(df_r),`bv'/`sv') /*
                        */ _col(58) %9.0g `bv' - invt(`e(df_r)',$S_level/100)*`sv' /*
                        */ _col(70) %9.0g `bv' + invt(`e(df_r)',$S_level/100)*`sv'
                local vv "_cons"
                local bv "_b[`vv']"
                local sv "_se[`vv']"
                di in smcl in gr %12s "_cons" _col(14) "{c |}" in ye /*
                        */ _col(17) %9.0g `bv' /*
                        */ _col(28) %9.0g `sv' /*
                        */ _col(38) %8.2f `bv'/`sv' /*
                        */ _col(48) %6.3f tprob(e(df_r),`bv'/`sv') /*
                        */ _col(58) %9.0g `bv' - invt(`e(df_r)',$S_level/100)*`sv' /*
                        */ _col(70) %9.0g `bv' + invt(`e(df_r)',$S_level/100)*`sv'
                di in smcl in gr "{hline 13}{c BT}{hline 64}"
        end
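
        An alternative that avoids keeping a modified copy of the official ado-file is a small wrapper around regress and predict. The sketch below is only an assumption about how such a helper could look: the command name adfresid and its generate() option are hypothetical, and it covers only the default dfuller specification with a constant (no trend, drift, or noconstant). Such a file could be saved in the PERSONAL directory listed by sysdir rather than in the base ado directory.

        ```stata
        * Hypothetical helper, not part of official Stata: runs the ADF regression
        * with a constant and stores its residuals in a new variable.
        capture program drop adfresid
        program define adfresid
                version 13
                syntax varname(ts) , GENerate(name) [Lags(integer 0)]
                * Stop with an error if the target variable already exists
                confirm new variable `generate'
                local dlags
                if `lags' > 0 {
                        local dlags "DL(1/`lags').`varlist'"
                }
                * Same regressors as dfuller's default case (constant, no trend)
                regress D.`varlist' L.`varlist' `dlags'
                * Residuals for the estimation sample only
                predict double `generate' if e(sample), residuals
        end

        * Example call on the lutkepohl2 data (dataset and lag order are assumptions)
        webuse lutkepohl2, clear
        adfresid ln_consump, generate(ADFresiduals) lags(3)
        ```

        Because the regression results stay in memory, postestimation commands can be run directly afterwards, while dfuller itself can still be used for the test statistic and critical values.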



        • #5
          Originally posted by John Whymer:
          Would a lag selection with "varsoc" make sense?
          Indeed, the varsoc command is often used in combination with the dfuller command.
          https://www.kripfganz.de/stata/



          • #6
            By the way, you could achieve the same goal by using the ardl command. Here is an example:
            Code:
            . webuse lutkepohl2
            (Quarterly SA West German macro data, Bil DM, from Lutkepohl 1993 Table E.1)
            
            . ardl ln_consump, ec
            
            ARDL(4) regression
            
            Sample: 1961q1 - 1982q4                         Number of obs     =         88
                                                            R-squared         =     0.1621
                                                            Adj R-squared     =     0.1218
            Log likelihood =  280.48555                     Root MSE          =     0.0103
            
            ------------------------------------------------------------------------------
            D.ln_consump |      Coef.   Std. Err.      t    P>|t|     [95% Conf. Interval]
            -------------+----------------------------------------------------------------
            ADJ          |
              ln_consump |
                     L1. |  -.0022516   .0022221    -1.01   0.314    -.0066713    .0021681
            -------------+----------------------------------------------------------------
            SR           |
              ln_consump |
                     LD. |  -.1211548   .1034523    -1.17   0.245    -.3269172    .0846076
                    L2D. |   .2429064   .1017705     2.39   0.019     .0404889    .4453239
                    L3D. |    .307846   .1048838     2.94   0.004     .0992364    .5164555
                         |
                   _cons |   .0258569   .0164955     1.57   0.121    -.0069519    .0586658
            ------------------------------------------------------------------------------
            
            . estat ectest
            
            Pesaran, Shin, and Smith (2001) bounds test
            
            H0: no level relationship                                        F =     1.027
            Case 3                                                           t =    -1.013
            
            Finite sample (0 variables, 88 observations, 3 short-run coefficients)
            
            Kripfganz and Schneider (2018) critical values and approximate p-values
            
               | 10%              | 5%               | 1%               | p-value        
               |    I(0)     I(1) |    I(0)     I(1) |    I(0)     I(1) |    I(0)     I(1)
            ---+------------------+------------------+------------------+-----------------
             F |   6.581    6.570 |   8.255    8.236 |  12.119   12.071 |   0.742    0.742
             t |  -2.565   -2.569 |  -2.868   -2.874 |  -3.460   -3.470 |   0.735    0.738
            
            do not reject H0 if
                both F and t are closer to zero than critical values for I(0) variables
                  (if p-values > desired level for I(0) variables)
            reject H0 if
                both F and t are more extreme than critical values for I(1) variables
                  (if p-values < desired level for I(1) variables)
            
            . predict resid, residuals
            (4 missing values generated)
            The regression is the same as the augmented Dickey-Fuller regression with an optimal lag selection automatically applied. The t-statistic reported by the postestimation command estat ectest is the Dickey-Fuller test statistic. (Finite-sample) critical values and approximate p-values are provided as well (choose from the columns labelled I(0)). Finally, predict works in the usual way after ardl.

            Compare with the dfuller command:
            Code:
            . varsoc ln_consump
            
               Selection-order criteria
               Sample:  1961q1 - 1982q4                     Number of obs      =        88
              +---------------------------------------------------------------------------+
              |lag |    LL      LR      df    p      FPE       AIC      HQIC      SBIC    |
              |----+----------------------------------------------------------------------|
              |  0 | -64.5106                        .2595   1.48888   1.50022   1.51703  |
              |  1 |  273.713  676.45    1  0.000  .000122   -6.1753  -6.15262    -6.119  |
              |  2 |  273.958  .48997    1  0.484  .000124  -6.15815  -6.12412  -6.07369  |
              |  3 |   276.14   4.364    1  0.037  .000121  -6.18501  -6.13964   -6.0724  |
              |  4 |  280.486  8.6903*   1  0.003  .000112* -6.26104* -6.20433* -6.12028* |
              +---------------------------------------------------------------------------+
               Endogenous:  ln_consump
                Exogenous:  _cons
            
            . dfuller ln_consump, lags(3) regress
            
            Augmented Dickey-Fuller test for unit root         Number of obs   =        88
            
                                           ---------- Interpolated Dickey-Fuller ---------
                              Test         1% Critical       5% Critical      10% Critical
                           Statistic           Value             Value             Value
            ------------------------------------------------------------------------------
             Z(t)             -1.013            -3.527            -2.900            -2.585
            ------------------------------------------------------------------------------
            MacKinnon approximate p-value for Z(t) = 0.7484
            
            ------------------------------------------------------------------------------
            D.ln_consump |      Coef.   Std. Err.      t    P>|t|     [95% Conf. Interval]
            -------------+----------------------------------------------------------------
              ln_consump |
                     L1. |  -.0022516   .0022221    -1.01   0.314    -.0066713    .0021681
                     LD. |  -.1211548   .1034523    -1.17   0.245    -.3269172    .0846076
                    L2D. |   .2429064   .1017705     2.39   0.019     .0404889    .4453239
                    L3D. |    .307846   .1048838     2.94   0.004     .0992364    .5164555
                         |
                   _cons |   .0258569   .0164955     1.57   0.121    -.0069519    .0586658
            ------------------------------------------------------------------------------
            Last edited by Sebastian Kripfganz; 27 Jul 2018, 10:47. Reason: dfuller comparison added
            https://www.kripfganz.de/stata/



            • #7
              OK! Cool! Thanks a lot Sebastian!
