Frisch-Waugh-Lovell theorem for data series using areg

Lucie Letrouit

Join Date: Jul 2015

Posts: 4
#1

10 Jul 2015, 19:30

Hello everyone,

I am working with panel data declared as follows:
tsset CountyCode year

I have three other variables: var1, var2 and var3.

I want to compare the results of three regression methods:

1) First regression method:
xi:areg var1 var2 var3 i.year, absorb(CountyCode)

2) Second regression method: regress var1, var2 and var3 separately on year and county fixed effects, keep the residuals (as in the Frisch-Waugh-Lovell theorem), and run the final regression with reg:
xi:areg var1 i.year, absorb(CountyCode)
predict var1_res, residuals
xi:areg var2 i.year, absorb(CountyCode)
predict var2_res, residuals
xi:areg var3 i.year, absorb(CountyCode)
predict var3_res, residuals
reg var1_res var2_res var3_res

3) Third regression method: regress var1, var2 and var3 separately on year and county fixed effects, keep the residuals, and run the final regression with xtreg:
xi:areg var1 i.year, absorb(CountyCode)
predict var1_res, residuals
xi:areg var2 i.year, absorb(CountyCode)
predict var2_res, residuals
xi:areg var3 i.year, absorb(CountyCode)
predict var3_res, residuals
xtreg var1_res var2_res var3_res, re

I obtain the following coefficients:
1) First regression:
var2: 0.004***
var3: 0.07***

2) Second regression:
var2_res: 0.02***
var3_res: 0.06*

3) Third regression:
var2_res: 0.005***
var3_res: 0.08***

My questions are the following:
- According to the Frisch-Waugh-Lovell theorem, I expected regressions 1) and 2) to give the same results. Why is that not the case?
- Is it only by chance that regressions 1) and 3) give similar results?
- If not, why does it matter to treat the county-specific effects as random (i.e. to use xtreg ..., re) when these effects seem to have already been removed by the earlier regressions (of the type xi:areg var1 i.year, absorb(CountyCode))?
- If so, which regressions should I use in method 2) to obtain the same result as in regression 1)?

I thank you in advance for your answers.
Jeff Wooldridge

Join Date: Apr 2014

Posts: 2121
#2

11 Jul 2015, 10:43

By chance, do you have missing data on some of your variables? The three approaches are numerically equivalent provided you use the same observations. For example, if data are missing on var2 but not var3, then when you partial out you will be using extra observations for var3. You need to only use the complete cases, and then you will find all methods are identical.
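A minimal sketch of this advice, assuming the variable names from post #1. The marker variable touse is a hypothetical name, and mark/markout is only one of several ways to flag complete cases:

```stata
* Flag observations with no missing values in any variable used
* (touse is a hypothetical name, not part of the original do-file)
mark touse
markout touse var1 var2 var3 CountyCode year

* Method 1, restricted to the flagged sample
xi: areg var1 var2 var3 i.year if touse, absorb(CountyCode)

* Method 2: partial out year and county effects on exactly the same sample
foreach v of varlist var1 var2 var3 {
    xi: areg `v' i.year if touse, absorb(CountyCode)
    predict double `v'_res if touse, residuals
}
regress var1_res var2_res var3_res if touse
```

With the sample held fixed this way, the point estimates should agree. The standard errors reported by the final regress need not match those from areg, because regress does not count the absorbed county and year effects as estimated parameters.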

I hope this is not a homework assignment .... JW
Lucie Letrouit

Join Date: Jul 2015

Posts: 4
#3

11 Jul 2015, 12:32

Dear Mr. Wooldridge,

Thank you for your answer. This is not a homework assignment. I am interested in the question because method 2) gives me better results than method 1), and I would like to know whether these results are meaningful.

There are no missing data in my data file, which I attach to this message.

To simplify the problem, I dropped var3 (it would take too long to explain how I calculated it) and applied the same three methods:

1) First regression method:
xi:areg var1 var2 i.year, absorb(CountyCode)

2) Second regression method: regress var1 and var2 separately on year and county fixed effects, keep the residuals (as in the Frisch-Waugh-Lovell theorem), and run the final regression with reg:
xi:areg var1 i.year, absorb(CountyCode)
predict var1_res, residuals
xi:areg var2 i.year, absorb(CountyCode)
predict var2_res, residuals
reg var1_res var2_res

3) Third regression method: regress var1 and var2 separately on year and county fixed effects, keep the residuals, and run the final regression with xtreg:
xi:areg var1 i.year, absorb(CountyCode)
predict var1_res, residuals
xi:areg var2 i.year, absorb(CountyCode)
predict var2_res, residuals
xtreg var1_res var2_res, re

I still get three different results.

Do you see any other explanation for these results?

Attached Files

myData-Statalist.dta (254.4 KB, 1 view)
Joao Santos Silva

Join Date: Apr 2014

Posts: 3000
#4

11 Jul 2015, 17:21

Dear Lucie,

Are you getting different results with these data and these commands? I played with this and I got the same results with methods 1 and 2, as expected. If that is the case, this suggests that var3 is the cause of the problem. Are you sure you do not have missing observations in it, as Jeff suggested?
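One way to run this comparison explicitly is sketched below, assuming the two-variable setup from post #3. The names e1, e2, b_direct and b_fwl are hypothetical and chosen only for illustration:

```stata
* Method 1: direct estimate of the coefficient on var2
xi: areg var1 var2 i.year, absorb(CountyCode)
scalar b_direct = _b[var2]

* Method 2: residualize both variables, then regress residual on residual
xi: areg var1 i.year, absorb(CountyCode)
predict double e1, residuals
xi: areg var2 i.year, absorb(CountyCode)
predict double e2, residuals
regress e1 e2
scalar b_fwl = _b[e2]

* The two coefficients should agree up to rounding error
display "direct = " b_direct "   FWL = " b_fwl
display "relative difference = " reldif(b_direct, b_fwl)
```

Storing the residuals as double keeps rounding error small, so a relative difference far from zero would point to a problem in the partialling-out step rather than to numerical noise.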

All the best,

Joao
Lucie Letrouit

Join Date: Jul 2015

Posts: 4
#5

11 Jul 2015, 19:48

Dear Mr. Santos Silva,

Thank you for your answer. I checked my data file: for a few counties, some years were missing entirely. I removed these counties from the new data file attached to this message (which now also includes var3). I am also attaching the do-file I have been using (for both cases: with and without var3).
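A check of this kind could look roughly like the following sketch. It assumes the variable names from post #1, and nmiss is a hypothetical name:

```stata
* Declare the panel; the output reports whether it is balanced
tsset CountyCode year

* Show the pattern of years observed for each county
xtdescribe

* Count missing values per observation across the three variables
* (nmiss is a hypothetical name)
egen nmiss = rowmiss(var1 var2 var3)
tabulate nmiss
```

Note that rowmiss only counts missing values in rows that exist; county-years that are absent from the file altogether show up in the xtdescribe pattern instead. As pointed out in #2, what matters for the equivalence is that every step uses the same observations.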

Here are the results I get:

With var1, var2 and var3:
1) First regression:
var2: 0.0049***
var3: 0.074***

2) Second regression:
var2_res: 0.022***
var3_res: 0.079**

3) Third regression:
var2_res: 0.0057***
var3_res: 0.083***

With var1 and var2:
1) First regression:
var2: -8e-6

2) Second regression:
var2_res: 0.018***

3) Third regression:
var2_res: 0.00022

The methods still give different results. Do you see any problem in my do-file or data file?

Attached Files

myDoFile2-Statalist.do (976 Bytes, 1 view)

MyData2-Statalist.dta (287.5 KB, 1 view)
Joao Santos Silva

Join Date: Apr 2014

Posts: 3000
#6

12 Jul 2015, 01:47

Dear Lucie,

The problem is that you are using _predict in place of predict. If you look at the help file, you'll see that _predict uses the data currently in memory, rather than the transformed data used in the areg estimation. Therefore, after areg, the predictions you get with _predict are meaningless. If you use predict, all the results are the same.
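The difference can be inspected directly with a sketch like the one below, which assumes the variable names from the thread. The names r_pred, r_upred, m_pred and m_upred are hypothetical:

```stata
* Fit the partialling-out regression once
xi: areg var1 i.year, absorb(CountyCode)

* Residuals from each of the two commands
predict double r_pred, residuals
_predict double r_upred, residuals

* County-level means of each set of residuals
bysort CountyCode: egen double m_pred = mean(r_pred)
bysort CountyCode: egen double m_upred = mean(r_upred)
summarize m_pred m_upred
```

Residuals that properly account for the absorbed county effects average to zero within each county, so m_pred should be zero up to rounding. If m_upred varies across counties, that is a sign that the _predict residuals still contain the county effects and are not suitable for the second-stage regression.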

All the best,

Joao
Lucie Letrouit

Join Date: Jul 2015

Posts: 4
#7

12 Jul 2015, 10:57

Dear Mr. Santos Silva,

Thank you very much for your help. I had not noticed that there are two different predict commands. This solves my problem.

All the best,

Lucie