Covariance between residuals and fitted values not zero in SURE models

shem shen

Join Date: Mar 2016
Posts: 136


09 Apr 2021, 11:56

Dear list,

In the example below, I ran a SURE model that consists of two equations. Then I calculated the covariance between the residuals and the fitted values of each equation, as well as the covariance between the residuals of equation 1 and the fitted values of equation 2, and the covariance between the residuals of equation 2 and the fitted values of equation 1.

My first question is: why are these covariances not zero? I thought they should be zero by assumption. I am not sure whether I made a mistake in my code or whether I have misunderstood the statistical principles behind the SURE model. I searched on Google but have not found any discussion of this particular issue so far. My reasoning was that, in OLS, the covariance between residuals and fitted values is zero, so in a SURE model the cross-equation covariances between residuals and fitted values should also be zero by the assumptions of the model, no?

I then tried a different specification: I transformed the dependent variables into fractional ranks and ran the same SURE model. Now the covariances are close to zero, but still not exactly zero. So here is my second question: why are the covariances close to zero now? Is it because of a model assumption, or because the fractional rank transformation reduces the variance of the dependent variables (meaning that these covariances are never supposed to be zero by any assumption)?
Thank you!

Code:

sysuse auto,clear
sureg (price foreign headroom length) (weight mpg turn trunk)
predict pricehat,eq(price)
predict weighthat,eq(weight)
predict priceres,eq(price) res
predict weightres,eq(weight) res

corr pricehat priceres,cov
corr weighthat weightres,cov
corr pricehat weightres,cov
corr weighthat priceres,cov

. sysuse auto,clear
(1978 Automobile Data)

. sureg (price foreign headroom length) (weight mpg turn trunk)

Seemingly unrelated regression
--------------------------------------------------------------------------
Equation             Obs   Parms        RMSE    "R-sq"       chi2        P
--------------------------------------------------------------------------
price                 74       3    2411.322    0.3225      31.29   0.0000
weight                74       3    322.4527    0.8255     343.91   0.0000
--------------------------------------------------------------------------

------------------------------------------------------------------------------
             |      Coef.   Std. Err.      z    P>|z|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
price        |
     foreign |   2851.353   700.3963     4.07   0.000     1478.602    4224.105
    headroom |   -378.459   367.1648    -1.03   0.303    -1098.089    341.1709
      length |   86.51587   16.47117     5.25   0.000     54.23297    118.7988
       _cons |  -9808.761   2941.691    -3.33   0.001    -15574.37   -4043.153
-------------+----------------------------------------------------------------
weight       |
         mpg |  -40.26199   9.162906    -4.39   0.000    -58.22095   -22.30302
        turn |   93.34017   12.39739     7.53   0.000     69.04173    117.6386
       trunk |    30.4541   10.88417     2.80   0.005     9.121525    51.78667
       _cons |  -242.8304   614.8611    -0.39   0.693    -1447.936    962.2753
------------------------------------------------------------------------------

. predict pricehat,eq(price)
(option xb assumed; fitted values)

. predict weighthat,eq(weight)
(option xb assumed; fitted values)

. predict priceres,eq(price) res

. predict weightres,eq(weight) res

.
. corr pricehat priceres,cov
(obs=74)

             | pricehat priceres
-------------+------------------
    pricehat |  2.3e+06
    priceres |   271367  5.9e+06


. corr weighthat weightres,cov
(obs=74)

             | weight~t weight~s
-------------+------------------
   weighthat |   477031
   weightres |  10799.5   105400


. corr pricehat weightres,cov
(obs=74)

             | pricehat weight~s
-------------+------------------
    pricehat |  2.3e+06
   weightres |   104389   105400


. corr weighthat priceres,cov
(obs=74)

             | weight~t priceres
-------------+------------------
   weighthat |   477031
    priceres |   230632  5.9e+06


sysuse auto,clear
fracrank price, gen(pricerank)
fracrank weight, gen(weightrank)
sureg (pricerank foreign headroom length) (weightrank mpg turn trunk)
predict pricehat,eq(pricerank)
predict weighthat,eq(weightrank)
predict priceres,eq(pricerank) res
predict weightres,eq(weightrank) res

corr pricehat priceres,cov
corr weighthat weightres,cov
corr pricehat weightres,cov
corr weighthat priceres,cov


. sysuse auto,clear
(1978 Automobile Data)

. fracrank price, gen(pricerank)

. fracrank weight, gen(weightrank)

. sureg (pricerank foreign headroom length) (weightrank mpg turn trunk)

Seemingly unrelated regression
--------------------------------------------------------------------------
Equation             Obs   Parms        RMSE    "R-sq"       chi2        P
--------------------------------------------------------------------------
pricerank             74       3    .2059435    0.4910      68.92   0.0000
weightrank            74       3    .1208745    0.8246     346.33   0.0000
--------------------------------------------------------------------------

------------------------------------------------------------------------------
             |      Coef.   Std. Err.      z    P>|z|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
pricerank    |
     foreign |   .3701051   .0627873     5.89   0.000     .2470443    .4931659
    headroom |  -.0796659   .0328007    -2.43   0.015    -.1439542   -.0153777
      length |    .011638   .0014551     8.00   0.000     .0087861      .01449
       _cons |  -1.558737   .2586763    -6.03   0.000    -2.065733   -1.051741
-------------+----------------------------------------------------------------
weightrank   |
         mpg |  -.0178046   .0035965    -4.95   0.000    -.0248537   -.0107555
        turn |   .0346134   .0048271     7.17   0.000     .0251524    .0440744
       trunk |   .0085549   .0042401     2.02   0.044     .0002446    .0168653
       _cons |  -.6108735   .2405101    -2.54   0.011    -1.082265   -.1394824
------------------------------------------------------------------------------


. predict pricehat,eq(pricerank)
(option xb assumed; fitted values)

. predict weighthat,eq(weightrank)
(option xb assumed; fitted values)

. predict priceres,eq(pricerank) res

. predict weightres,eq(weightrank) res

.
. corr pricehat priceres,cov
(obs=74)

             | pricehat priceres
-------------+------------------
    pricehat |  .039082
    priceres |  .001192  .042994


. corr weighthat weightres,cov
(obs=74)

             | weight~t weight~s
-------------+------------------
   weighthat |  .068783
   weightres |  .000427  .014811


. corr pricehat weightres,cov
(obs=74)

             | pricehat weight~s
-------------+------------------
    pricehat |  .039082
   weightres |  .003934  .014811


. corr weighthat priceres,cov
(obs=74)

             | weight~t priceres
-------------+------------------
   weighthat |  .068783
    priceres |  .004089  .042994
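
Covariances carry the units of the variables, so the figures from the two specifications are not directly comparable: price is in dollars and weight in pounds, whereas fractional ranks lie between 0 and 1, which makes their variances and covariances small by construction. A unit-free comparison uses correlations instead. The following is a minimal sketch, not part of the original post: it reruns the first specification and drops the cov option. The variable names are the ones used above, and double is added here only to limit rounding in the predictions.

```stata
* Sketch: compare residuals and fitted values on a unit-free scale
sysuse auto, clear
sureg (price foreign headroom length) (weight mpg turn trunk)

* Store predictions in double precision to limit rounding error
predict double pricehat,  equation(price)  xb
predict double weighthat, equation(weight) xb
predict double priceres,  equation(price)  residuals
predict double weightres, equation(weight) residuals

* Correlations rather than covariances: these do not depend on units
correlate pricehat priceres
correlate weighthat weightres
correlate pricehat weightres
correlate weighthat priceres
```

The same four correlate commands can be run after the fractional-rank specification. If the correlations there are also clearly nonzero, the small covariances reflect the scale of the ranks rather than any orthogonality property.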

Last edited by shem shen; 09 Apr 2021, 12:15.


Joro Kolev

Join Date: Aug 2018

Posts: 3047
#2

09 Apr 2021, 13:46

We should not confuse assumptions with algebraic properties.

We do assume both in OLS and GLS that the unobservable error term is uncorrelated with the regressors.

However, only in OLS estimation is the residual vector orthogonal to the regressors by construction. Note that the residual vector is not the unobservable vector of error terms.
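
To make the distinction concrete, the following sketch writes out the first-order conditions of the two estimators for a two-equation system. The notation (X_j, e_j, \hat\Sigma, \sigma^{jk}) is introduced here for illustration and does not appear in the thread, and SURE is taken to be feasible GLS with an estimated error covariance matrix \hat\Sigma.

```latex
% OLS, equation j:  y_j = X_j \beta_j + u_j,  residual e_j = y_j - X_j \hat\beta_j
X_j' e_j = 0
\quad\Longrightarrow\quad
\hat y_j' e_j = \hat\beta_j' X_j' e_j = 0 .

% SURE (feasible GLS) on the stacked system, with \hat\Sigma^{-1} = [\sigma^{jk}]:
X_1' \left( \sigma^{11} e_1 + \sigma^{12} e_2 \right) = 0 ,
\qquad
X_2' \left( \sigma^{21} e_1 + \sigma^{22} e_2 \right) = 0 .
```

With a constant in X_j, the OLS condition also forces the residuals to have mean zero, so the sample covariance between fitted values and residuals is exactly zero. Under SURE only weighted combinations of the two residual vectors are orthogonal to each equation's regressors, so the individual products X_1'e_1, X_1'e_2, X_2'e_1 and X_2'e_2 are in general nonzero. Own-equation orthogonality is recovered when \sigma^{12} = 0, and full cross-equation orthogonality when both equations contain the same regressors (X_1 = X_2).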

As you have discovered yourself, in a GLS regression such as SURE the residual vector is in general not orthogonal to the regressors.
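
The contrast can be checked by estimating the two equations separately by OLS. This is a minimal sketch using the auto data and the specifications from the first post; the _ols variable names are made up for this illustration.

```stata
* Sketch: the same two equations, each estimated separately by OLS
sysuse auto, clear

* Equation 1
regress price foreign headroom length
predict double pricehat_ols, xb
predict double priceres_ols, residuals

* Equation 2
regress weight mpg turn trunk
predict double weighthat_ols, xb
predict double weightres_ols, residuals

* Own-equation covariances: zero up to rounding error, by construction
correlate pricehat_ols priceres_ols, covariance
correlate weighthat_ols weightres_ols, covariance

* Cross-equation covariances: OLS imposes nothing here
correlate pricehat_ols weightres_ols, covariance
correlate weighthat_ols priceres_ols, covariance
```

Only the first two covariance matrices should show an off-diagonal entry that is zero up to machine precision. The cross-equation entries are not restricted, because each OLS fit uses only its own equation's regressors.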
shem shen

#3

09 Apr 2021, 14:18


Thank you very much Joro!
This is probably a long shot, but do you happen to know if there is any Stata command (or statistical models other than SURE) that can guarantee that the residual vector in Eq. 1 is, by construction, orthogonal to the regressors in Eq. 2 (and the residual vector in Eq. 2 is also, by construction, orthogonal to the regressors in Eq. 1)?

Last edited by shem shen; 09 Apr 2021, 14:25.
Joro Kolev

#4

09 Apr 2021, 14:28

You can easily enforce this property, Shem. Just take equation 1, augment it with all the regressors specific to equation 2 (or include the predicted values from equation 2, which makes the residual orthogonal to those fitted values), and estimate the resulting equation by OLS.

I cannot recall seeing this procedure described or applied in published work, but I do not see much harm in doing this. Including irrelevant regressors results only in a loss of efficiency, so I do not see how we can break the computer by including the regressors specific to equation 2 in equation 1.
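
A minimal sketch of this augmented regression, using the auto data and variable lists from the first post (the name priceres_aug is made up for this illustration):

```stata
* Sketch: equation 1 augmented with the regressors specific to equation 2
sysuse auto, clear
regress price foreign headroom length mpg turn trunk
predict double priceres_aug, residuals

* Covariances between the residual and every regressor in the system
correlate priceres_aug foreign headroom length mpg turn trunk, covariance
```

The first column of the resulting matrix holds the covariances between the residual and each regressor, and these should be zero up to rounding error.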

shem shen

#5

09 Apr 2021, 15:00


Thank you Joro! You are absolutely right. I really appreciate your help!
Joro Kolev

#6

10 Apr 2021, 00:05

You are welcome!

On second thought, including all the regressors in each equation is absolutely standard and is called the multivariate regression model.

Because all the regressors are included in each equation and the estimation method is OLS, the multivariate regression model has the property you are looking for: the residual in each equation is by construction uncorrelated with all the regressors appearing anywhere in the system.
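
In Stata this model can be fitted with mvreg. The following is a minimal sketch on the auto data with the variables from the first post; the _mv variable names are made up for this illustration.

```stata
* Sketch: multivariate regression with the same six regressors in both equations
sysuse auto, clear
mvreg price weight = foreign headroom length mpg turn trunk

predict double priceres_mv,  equation(price)  residuals
predict double weightres_mv, equation(weight) residuals

* Each residual against every regressor in the system
correlate priceres_mv  foreign headroom length mpg turn trunk, covariance
correlate weightres_mv foreign headroom length mpg turn trunk, covariance
```

In both matrices the first column should be zero up to rounding error. The coefficients match equation-by-equation OLS, and SURE with identical regressors in every equation yields the same point estimates, so nothing is lost relative to sureg in that case.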

shem shen

#7

10 Apr 2021, 13:24


Thank you again, Joro! It is great to know that this is called multivariate regression; that is very helpful.
By the way, since I used a user-written command in this thread, I should have made clear that fracrank comes from Philippe Van Kerm's fracrank package (bundled with sgini):
Code:

net install sgini, from("http://medim.ceps.lu/stata") replace