  • Non-stationary dummy variable

    Dear Statalist users,

    I have to estimate a time-series regression using the Newey-West estimator. Two of the explanatory variables are dummies. As part of the diagnostic tests, I ran the Augmented Dickey-Fuller unit root test and found that both dummy variables are non-stationary in levels. I would like to know how to deal with this problem, as first-differencing seems more appropriate for continuous variables.

    Thanks a lot for your reply.

    Emna

  • #2
    Hi Emna, a dummy variable cannot be non-stationary. Its values are confined to 0 and 1.



    • #3
      Dear Joro,

      Thanks for your response. The test results indicate that the dummy variables are not stationary. I understand from your reply that dummy variables are not subject to unit root testing in time-series regressions.

      Best,
      Emna
      Last edited by Emna Trabelsi; 22 Jul 2020, 08:40.



      • #4
        Hi Emna,

        Can you use -dataex- to show an example with your real data of how the Augmented Dickey-Fuller unit root test indicates that your dummy variables are non-stationary?
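
        In case it helps, a minimal sketch of how -dataex- is typically run (the variable names below are placeholders, not your actual variables):

        Code:
         * -dataex- ships with recent Stata versions; otherwise install it from SSC:
         * ssc install dataex
         * Then post an excerpt of the relevant variables, e.g. the first 20 observations:
         dataex yourdummy1 yourdummy2 in 1/20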

        Originally posted by Emna Trabelsi:
        Dear Joro,

        Thanks for your response. The test results indicate that the dummy variables are not stationary. I understand from your reply that dummy variables are not subject to unit root testing in time-series regressions.

        Best,
        Emna



        • #5
          Hi Joro,

          Thank you for your post.

          1- I apologize for misspelling the name of the test: it is the Augmented Dickey-Fuller unit root test.

          2- Please find attached an excerpt of the dataset (available via the link below). There are three dummies (x11, x12, x13), x2 is a continuous variable, and the number of observations is 45.
          The results of the unit root tests are as follows:


          Code:
           . dfuller x11, lag(0)

           Dickey-Fuller test for unit root                   Number of obs   =        44

                                          ---------- Interpolated Dickey-Fuller ---------
                             Test         1% Critical       5% Critical      10% Critical
                          Statistic           Value             Value             Value
           ------------------------------------------------------------------------------
            Z(t)             -1.406            -3.621            -2.947            -2.607
           ------------------------------------------------------------------------------
           MacKinnon approximate p-value for Z(t) = 0.5793

           . dfuller x12, lag(0)

           Dickey-Fuller test for unit root                   Number of obs   =        44

                                          ---------- Interpolated Dickey-Fuller ---------
                             Test         1% Critical       5% Critical      10% Critical
                          Statistic           Value             Value             Value
           ------------------------------------------------------------------------------
            Z(t)             -1.496            -3.621            -2.947            -2.607
           ------------------------------------------------------------------------------
           MacKinnon approximate p-value for Z(t) = 0.5356

           . dfuller x13, lag(0)

           Dickey-Fuller test for unit root                   Number of obs   =        44

                                          ---------- Interpolated Dickey-Fuller ---------
                             Test         1% Critical       5% Critical      10% Critical
                          Statistic           Value             Value             Value
           ------------------------------------------------------------------------------
            Z(t)             -0.679            -3.621            -2.947            -2.607
           ------------------------------------------------------------------------------
           MacKinnon approximate p-value for Z(t) = 0.8522

           . dfuller x2, lag(0)

           Dickey-Fuller test for unit root                   Number of obs   =        43

                                          ---------- Interpolated Dickey-Fuller ---------
                             Test         1% Critical       5% Critical      10% Critical
                          Statistic           Value             Value             Value
           ------------------------------------------------------------------------------
            Z(t)             -2.430            -3.628            -2.950            -2.608
           ------------------------------------------------------------------------------
           MacKinnon approximate p-value for Z(t) = 0.1334

           .

          3- My real question is whether it makes sense to apply a unit root test to dummy variables or not.


          Best,
          Emna
          [Link to the example dataset shared via Google Sheets]



          • #6
            Dear Joro,

            Please find attached the data_example.dta file if the previous link did not work.

            Kind regards,
            Emna



            • #7
              Hi Emna, I will try again now, within the next 30 minutes.

              What happened yesterday was that I tried to access your file, and at first it did not give me permission. Some time later (I think hours later) I received an email saying I had been granted permission, but by then it was already late at night here, and I was doing something else.


              Originally posted by Emna Trabelsi:
              Dear Joro,

              Please find attached the data_example.dta file if the previous link did not work.

              Kind regards,
              Emna



              • #8
                Hi Emna, I opened your data, and I confirmed your results above: We fail to reject the null of a unit root in all of the dummy variables using the Dickey Fuller test.

                By looking at your data, I can see where the problem is coming from. (The problem that we are not able to reject the null of a unit root for your dummies.)

                All of your 3 dummies are very "stable". x11 starts at 1, then switches to 0 (only once) and stays at 0 thereafter. x13 starts at 0, then switches to 1 (only once) and stays at 1 thereafter. x12 starts at 0, switches to 1, and then to 0 (only two switches).

                So all three dummies are heavily autoregressive, and this is what the Dickey-Fuller test captures. As a side note, it is useful to add the -regress- option to your Dickey-Fuller test, so that you can see the test regression as well. The Dickey-Fuller test runs the following test regression:
                (1) Y_t - Y_{t-1} = a + (r - 1)*Y_{t-1} + e_t,
                which is obtained from the structural form
                (2) Y_t = a + r*Y_{t-1} + e_t = a + (r*Y_{t-1} - Y_{t-1}) + Y_{t-1} + e_t,
                and equation (2) gives rise to test equation (1) after we combine the terms in parentheses and move the +Y_{t-1} term to the left-hand side.

                The moral of the story is that if the coefficient on the lagged level in the Dickey-Fuller test regression is close to 0, then the autoregressive parameter of the series is close to 1.

                Code:
                . foreach var in x11 x12 x13 {
                  2. dfuller `var' , regress
                  3. }
                
                Dickey-Fuller test for unit root                   Number of obs   =        44
                
                                               ---------- Interpolated Dickey-Fuller ---------
                                  Test         1% Critical       5% Critical      10% Critical
                               Statistic           Value             Value             Value
                ------------------------------------------------------------------------------
                 Z(t)             -1.406            -3.621            -2.947            -2.607
                ------------------------------------------------------------------------------
                MacKinnon approximate p-value for Z(t) = 0.5793
                
                ------------------------------------------------------------------------------
                       D.x11 |      Coef.   Std. Err.      t    P>|t|     [95% Conf. Interval]
                -------------+----------------------------------------------------------------
                         x11 |
                         L1. |  -.0666667   .0474106    -1.41   0.167    -.1623451    .0290118
                             |
                       _cons |   6.94e-18   .0276818     0.00   1.000    -.0558642    .0558642
                ------------------------------------------------------------------------------
                
                Dickey-Fuller test for unit root                   Number of obs   =        44
                
                                               ---------- Interpolated Dickey-Fuller ---------
                                  Test         1% Critical       5% Critical      10% Critical
                               Statistic           Value             Value             Value
                ------------------------------------------------------------------------------
                 Z(t)             -1.496            -3.621            -2.947            -2.607
                ------------------------------------------------------------------------------
                MacKinnon approximate p-value for Z(t) = 0.5356
                
                ------------------------------------------------------------------------------
                       D.x12 |      Coef.   Std. Err.      t    P>|t|     [95% Conf. Interval]
                -------------+----------------------------------------------------------------
                         x12 |
                         L1. |  -.1011494   .0676242    -1.50   0.142    -.2376207    .0353218
                             |
                       _cons |   .0344828   .0394841     0.87   0.387    -.0451993    .1141648
                ------------------------------------------------------------------------------
                
                Dickey-Fuller test for unit root                   Number of obs   =        44
                
                                               ---------- Interpolated Dickey-Fuller ---------
                                  Test         1% Critical       5% Critical      10% Critical
                               Statistic           Value             Value             Value
                ------------------------------------------------------------------------------
                 Z(t)             -0.679            -3.621            -2.947            -2.607
                ------------------------------------------------------------------------------
                MacKinnon approximate p-value for Z(t) = 0.8522
                
                ------------------------------------------------------------------------------
                       D.x13 |      Coef.   Std. Err.      t    P>|t|     [95% Conf. Interval]
                -------------+----------------------------------------------------------------
                         x13 |
                         L1. |  -.0333333   .0491038    -0.68   0.501    -.1324289    .0657622
                             |
                       _cons |   .0333333   .0276983     1.20   0.236    -.0225641    .0892308
                ------------------------------------------------------------------------------
                
                .
                When I said that dummies are by definition stationary, I may have spoken too fast and at too high a level of generality.

                Apparently I am not the only one: with a bit of Google searching, I found that David Giles says the same here:
                https://davegiles.blogspot.com/2011/...r-dummies.html

                Dave Giles August 30, 2014 at 6:27 AM
                you don't need to test dummy variables for stationarity - all dummy variables are stationary by construction.

                Dave Giles November 10, 2016 at 4:29 AM
                The dummy variable can only take values of zero or one. It's bounded and can't follow a random walk.
                It might depend on the nature of your dummies.


                • #9
                  Dear Joro,

                  Many thanks for your assistance.
                  My problem is that if I write in a research paper that dummies are not subject to unit root testing, I should cite some reference (an article, working paper, book, etc.) to support the argument. I did not find the relevant sentence in Prof. Giles's blog post, unless I missed something. Furthermore, I do not know whether it is acceptable to cite a blog post in a research paper.

                  Best,
                  Emna



                  • #10
                    Hi Emna,

                    If you want to do everything in the way I think is appropriate, do not do Newey-West estimation with unit-root pre-testing; rather, go directly for generalised least squares with Prais-Winsten / Cochrane-Orcutt (Stata command -prais-, see [TS] prais). That this approach is appropriate has been shown in

                    McCallum, Bennett T. "Is the spurious regression problem spurious?" Economics Letters 107, no. 3 (2010): 321-323.
                    and
                    Kolev, Gueorgui I. "The 'spurious regression problem' in the classical regression model framework." Economics Bulletin 31, no. 1 (2011): 925-937.

                    I can help you carry this methodology to a successful end. I am familiar with this line of research.
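
                    For concreteness, a minimal sketch of what this might look like (the dependent variable name y, the time variable year, and the lag(3) choice are illustrative assumptions, not taken from your data):

                    Code:
                     * Illustrative sketch only: y, year and lag(3) are assumptions.
                     tsset year
                     newey y x11 x12 x13 x2, lag(3)    // OLS with Newey-West (HAC) standard errors
                     prais y x11 x12 x13 x2            // Prais-Winsten GLS with AR(1) errors
                     prais y x11 x12 x13 x2, corc      // Cochrane-Orcutt variant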

                    If you want to stick with the admittedly standard approach of pre-testing for unit roots, you are raising hard questions. What I can say in that case is the following:

                    1) Your dummies are non-stochastic terms, just like the constant, the drift and the time trend are non-stochastic terms in the standard Dickey-Fuller test. Nobody tests those non-stochastic terms for stationarity. However,
                    2) The presence of such non-stochastic terms generally changes the distribution of the Dickey-Fuller statistic: the distribution differs depending on whether you have no constant, a constant only, a drift, or a time trend in your regression (see the sketch below this list). The same holds if you have a dummy for a structural shift, and your dummies look to me like structural shifters.
                    3) David Giles does say, in the comments of the post I showed you, what I said (that nobody tests non-stochastic terms for a random walk with the DF test, because they are either 0 or 1 and cannot drift away); just search for the strings I pasted above, as I gave the exact times of David Giles's comments in my previous post. How you cite this, I do not know; it depends on what you are writing and who is going to read it.
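
                    As a rough illustration of point 2 (a sketch only; which specification is appropriate for your data is a separate question), the reported critical values change with the deterministic terms you include:

                    Code:
                     * Sketch: the Dickey-Fuller critical values depend on the deterministic terms.
                     dfuller x2, lags(0)               // constant only (the default)
                     dfuller x2, lags(0) trend         // constant and linear time trend
                     dfuller x2, lags(0) drift         // drift case
                     dfuller x2, lags(0) noconstant    // no deterministic terms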

                    Originally posted by Emna Trabelsi:
                    Dear Joro,

                    Many thanks for your assistance.
                    My problem is that if I write in a research paper that dummies are not subject to unit root testing, I should cite some reference (an article, working paper, book, etc.) to support the argument. I did not find the relevant sentence in Prof. Giles's blog post, unless I missed something. Furthermore, I do not know whether it is acceptable to cite a blog post in a research paper.

                    Best,
                    Emna



                    • #11
                      Dear Joro,

                      Many thanks for your kind reply.

                      I will have a look at the 'Prais' estimator and get back to you for a discussion.

                      Best,

                      Emna.



                      • #12
                        Dear Joro,

                        I had a quick look at the 'Prais' (Prais-Winsten and Cochrane-Orcutt) command in Stata. The estimator corrects for heteroscedasticity and serial correlation at the first order only. I still find the Newey-West estimator more advantageous because it takes serial correlation at higher orders into account. Is that not the case?


                        Best,
                        Emna



                        • #13
                          If a series is stationary except for a once-and-for-all change in its mean, the dfuller test will fail to reject the null of a unit root. The intuition is simple: when you take the first difference, the change in the mean disappears. But this does not mean the series should be treated as a unit-root series: it should be treated as a series with a structural break. This is exactly what the dfuller test is picking up for the dummy variables in question.
                          [On edit] The whole purpose of introducing the dummy variables is to capture a break or an outlier in the relationship.
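
                          A quick sketch with simulated data (illustrative only, not the data from this thread): white noise plus a single large mean shift, for which dfuller will typically fail to reject the unit-root null.

                          Code:
                           * Sketch with simulated data: stationary noise plus a one-time mean shift.
                           clear
                           set seed 12345
                           set obs 100
                           gen t = _n
                           tsset t
                           gen y = rnormal() + 10*(t > 50)   // large level shift at t = 51
                           dfuller y, lags(0)                // typically fails to reject the unit-root null
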
                          Last edited by Eric de Souza; 29 Jul 2020, 06:33.



                          • #14
                            Hi Emna,

                            1) Newey-West is a variance fix: it does not change your linear regression point estimates at all; it only changes your standard errors and the variance matrix of the estimates. It does not help you at all if you have unit roots in your data. If you want to use linear regression with the Newey-West variance estimator, you need to go through the standard procedure of determining whether your stochastic regressors and regressand have unit roots or not. As we discussed, this is complicated by the fact that, in the presence of the structural-shift dummies, the Dickey-Fuller distribution changes. (See the sketch after point 3 below.)

                            2) The Prais-Winsten / Cochrane-Orcutt estimator is generalised least squares: it models the autocorrelation, assuming that the autocorrelation in the error term is AR(1). But there are checks for this: once you apply Prais-Winsten / Cochrane-Orcutt, you can test whether the error in the new, transformed equation is still autocorrelated. If it is not, you are done. If it is, there is a problem, but typically accounting for AR(1) resolves it. Prais-Winsten / Cochrane-Orcutt produces estimates different from linear regression and automatically takes care of unit roots in your stochastic variables (this is the point of the references I cited in #10), so you do not need to pre-test for unit roots.

                            3) Finally, the usefulness of the Eicker-White (robust) and Newey-West (HAC) variances is grossly overestimated. These so-called non-parametric estimators end up being awfully parametric, because they estimate lots of parameters. If you have a small sample, you then have a big problem, because you are estimating plenty of parameters on very few data points. The manifestation of this problem is that tests based on the Eicker-White (robust) and Newey-West (HAC) variances have grossly inflated rejection rates in finite samples: e.g., you carry out the test at a stated 5% nominal significance level, but it rejects the correct null hypothesis 20% of the time.
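
                            To make points 1 and 2 concrete, here is a minimal sketch (the dependent variable name y and the lag(3) choice are illustrative assumptions; it presumes the data have been tsset):

                            Code:
                             * Point 1: -regress- and -newey- give identical coefficients;
                             * only the standard errors differ.
                             regress y x11 x12 x13 x2
                             newey   y x11 x12 x13 x2, lag(3)
                             * Point 2: -prais- reports the Durbin-Watson statistic for the
                             * original and for the transformed equation, which indicates
                             * whether the AR(1) correction has removed the serial correlation.
                             prais y x11 x12 x13 x2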

                            Originally posted by Emna Trabelsi:
                            Dear Joro,

                            I had a quick look at the 'Prais' (Prais-Winsten and Cochrane-Orcutt) command in Stata. The estimator corrects for heteroscedasticity and serial correlation at the first order only. I still find the Newey-West estimator more advantageous because it takes serial correlation at higher orders into account. Is that not the case?


                            Best,
                            Emna



                            • #15
                              Dear Joro,

                              1) I agree with you that the Newey-West estimator only changes the standard errors.

                              2- I see now that the 'Prais' estimator addresses the unit-root issue.

                              I have one more question: what about multicollinearity and omitted-variable bias? Should we test for them a priori, after an OLS regression, before running the final 'Prais' estimation?
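
                              A sketch of the kind of check I have in mind (the dependent variable name y is an assumption):

                              Code:
                               * Sketch: pairwise correlations and variance inflation factors
                               * from a preliminary OLS fit.
                               correlate x11 x12 x13 x2
                               regress y x11 x12 x13 x2
                               estat vif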

                              Thank you for your response.

                              Have a nice day.
                              Emna

