
  • Testing coefficients across different quantile regression models

    Hi Stata users,
    I would like to test for equality of coefficients from two different quantile regression models. The variables in both models are the same, but the models are applied to two different groups of observations. Stata provides the useful suest command for this kind of test, but it does not work with quantile regression because of the non-standard computation of the variance-covariance matrix via the bootstrap. It is of course possible to run a simple linear Wald test by hand. I am not sure, however, whether and how to adjust the standard errors in that case (which is what suest would otherwise do for me). Running the two models separately yields different standard errors from pooling the observations and interacting the group indicators with all independent variables.

    Please find below a working example of the model Q0.5(y|x) = b0 + b1*x1 + b2*x2 + b3*x3.
    As can be seen, the coefficients in the two separate models (A) are the same as in the pooled model (B), but the standard errors differ.


    If anyone has an idea which of these standard errors are more appropriate for the Wald test (if either), I would be grateful for suggestions!

    Thank you very much!
    Jann


    Commands:
    Code:
    *A. both models separately
    sqreg y x1 x2 x3  if R1
    sqreg y x1 x2 x3  if R2
    
    *B. Pooled regression with interaction terms
    sqreg y (c.R2 c.R1)#(c.x1 c.x2 c.x3 cons)
    Output (shortened):

    Code:
    . *A. both models separately
    . sqreg y x1 x2 x3  if R1
    (fitting base model)
    
    Bootstrap replications (20)
    ----+--- 1 ---+--- 2 ---+--- 3 ---+--- 4 ---+--- 5 
    ....................
    
    Simultaneous quantile regression                     Number of obs =       302
      bootstrap(20) SEs                                  .50 Pseudo R2 =    0.4117
    
    ------------------------------------------------------------------------------
                 |              Bootstrap
               y |      Coef.   Std. Err.      t    P>|t|     [95% Conf. Interval]
    -------------+----------------------------------------------------------------
    q50          |
              x1 |   .1463415   .0576159     2.54   0.012     .0329559     .259727
              x2 |   .6239837   .0435789    14.32   0.000     .5382223    .7097451
              x3 |   .0609756   .0465972     1.31   0.192    -.0307257     .152677
           _cons |  -1.050813   .9817498    -1.07   0.285    -2.982854    .8812274
    ------------------------------------------------------------------------------
    
    . sqreg y x1 x2 x3  if R2
    (fitting base model)
    
    Bootstrap replications (20)
    ----+--- 1 ---+--- 2 ---+--- 3 ---+--- 4 ---+--- 5 
    ....................
    
    Simultaneous quantile regression                     Number of obs =       163
      bootstrap(20) SEs                                  .50 Pseudo R2 =    0.4651
    
    ------------------------------------------------------------------------------
                 |              Bootstrap
               y |      Coef.   Std. Err.      t    P>|t|     [95% Conf. Interval]
    -------------+----------------------------------------------------------------
    q50          |
              x1 |    .147429   .0571097     2.58   0.011     .0346376    .2602204
              x2 |    .676735   .0513315    13.18   0.000     .5753554    .7781146
              x3 |   .0871988   .0611658     1.43   0.156    -.0336033     .208001
           _cons |  -1.944444   1.435971    -1.35   0.178    -4.780482    .8915926
    ------------------------------------------------------------------------------
    
    . 
    . *B. Pooled regression with interaction terms
    . sqreg y (c.R2 c.R1)#(c.x1 c.x2 c.x3 cons)
    note: 1.cons#c.R1 omitted because of collinearity
    (fitting base model)
    
    Bootstrap replications (20)
    ----+--- 1 ---+--- 2 ---+--- 3 ---+--- 4 ---+--- 5 
    ....................
    
    Simultaneous quantile regression                     Number of obs =       465
      bootstrap(20) SEs                                  .50 Pseudo R2 =    0.4363
    
    ------------------------------------------------------------------------------
                 |              Bootstrap
               y |      Coef.   Std. Err.      t    P>|t|     [95% Conf. Interval]
    -------------+----------------------------------------------------------------
    q50          |
       c.R2#c.x1 |    .147429    .063335     2.33   0.020      .022965     .271893
                 |
       c.R2#c.x2 |    .676735   .0413437    16.37   0.000     .5954876    .7579824
                 |
       c.R2#c.x3 |   .0871988   .0438862     1.99   0.048     .0009551    .1734426
                 |
       cons#c.R2 |
              1  |   -.893631   1.463595    -0.61   0.542    -3.769843    1.982581
                 |
       c.R1#c.x1 |   .1463415   .0547102     2.67   0.008     .0388266    .2538563
                 |
       c.R1#c.x2 |   .6239837   .0509805    12.24   0.000     .5237985     .724169
                 |
       c.R1#c.x3 |   .0609756    .051976     1.17   0.241    -.0411659    .1631171
                 |
       cons#c.R1 |
              1  |          0  (omitted)
                 |
           _cons |  -1.050813   1.100029    -0.96   0.340    -3.212556    1.110929
    ------------------------------------------------------------------------------
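
    For concreteness, a minimal sketch of the by-hand Wald test after the pooled fit (B), assuming R1 and R2 are 0/1 group indicators:
    Code:
    * Joint test that the three slopes are equal across the two groups
    * (coefficient names as shown in the sqreg output above; if Stata
    *  complains, prefix them with the equation name, e.g. [q50]c.R2#c.x1)
    test (c.R2#c.x1 = c.R1#c.x1) (c.R2#c.x2 = c.R1#c.x2) (c.R2#c.x3 = c.R1#c.x3)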

  • #2
    Jann: I wonder if the differences are due to different seeds being used to initiate the bootstraps. For instance, if you
    Code:
    set seed 23
    before each of the three -sqreg- specifications you estimate, do the differences vanish? (23 is just an example seed, of course)
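
    That is, something along these lines, re-using the specifications from #1:
    Code:
    set seed 23
    sqreg y x1 x2 x3 if R1

    set seed 23
    sqreg y x1 x2 x3 if R2

    set seed 23
    sqreg y (c.R2 c.R1)#(c.x1 c.x2 c.x3 cons)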



    • #3
      Thanks, John, for your suggestion. I tried it and set the seed to the same value before each regression, but the standard errors are still different. You have a point, though, that the bootstrap's (pseudo-)randomness plays a role. Setting the seed can only help in reproducing the results of the exact same model: since the three specifications draw bootstrap samples from different sets of observations, the third (pooled) model will always produce different results from the first two, irrespective of the seed.

      Then again, if the differences in the standard errors were entirely random, the Wald statistics should be the same in expectation in both cases, so we could just pick one. What troubles me more are the assumptions we have to impose on the VCE in either case: suest restricts the off-block covariances (i.e., the covariances between coefficients from different models) to zero, while adjusting the variances (i.e., the values on the diagonal) accordingly. Neither option A nor option B above achieves both. So I am wondering whether either of the tests would be valid.
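
      For what it's worth, here is a sketch of what a by-hand version of option A could look like, stacking the two separate fits and imposing zero off-block covariances directly (the coefficient ordering is assumed to be identical in both fits, and no suest-style variance adjustment is applied):
      Code:
      * Fit the two separate median regressions and store their results
      sqreg y x1 x2 x3 if R1
      matrix b1 = e(b)
      matrix V1 = e(V)

      sqreg y x1 x2 x3 if R2
      matrix b2 = e(b)
      matrix V2 = e(V)

      * Wald statistic for b1 = b2, treating the cross-model covariances as zero
      matrix d = b1 - b2                   // 1 x k row vector of differences (incl. _cons)
      matrix W = d * inv(V1 + V2) * d'     // 1 x 1 Wald statistic
      scalar wald = W[1,1]
      scalar k = colsof(b1)
      display "Wald chi2(" k ") = " wald "   p = " chi2tail(k, wald)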



      • #4
        Jann: Perhaps I'm still misunderstanding, but if the subsamples defined by R1 and R2 are independent, wouldn't the off-block covariances be zero by definition?



        • #5
          John, thanks, you're right. Initially I thought that even though the theoretical off-block covariances should be zero, the empirical ones would not be. But now I see that they essentially are zero; it is just roundoff error that makes this a little less obvious at first sight.
          I assume, then, that a Wald test based on the pooled regression with interactions should do it. Many thanks again!

          Code:
          reg y (c.R2 c.R1)#(c.x1 c.x2 c.x3 c.cons)
          (output omitted)
          mat list e(V)
           
          symmetric e(V)[9,9]
                             c.R2#       c.R2#       c.R2#       c.R2#       c.R1#       c.R1#       c.R1#      co.R1#           
                             c.x1        c.x2        c.x3      c.cons        c.x1        c.x2        c.x3     co.cons       _cons
            c.R2#c.x1    .0031191
            c.R2#c.x2  -.00145725   .00268501
            c.R2#c.x3  -.00036844  -.00008964    .0019074
          c.R2#c.cons   .00107875  -.00364803  -.03927392   1.2427001
            c.R1#c.x1  -1.805e-17   9.027e-18   8.955e-17   .00013916   .00133398
            c.R1#c.x2   1.252e-17   1.676e-17   3.961e-17   .00120123  -.00072968   .00161241
            c.R1#c.x3  -1.598e-16   3.651e-16  -4.050e-17   .01714931  -.00009144  -.00011797   .00083522
                co.R1#
              co.cons           0           0           0           0           0           0           0           0
                _cons   3.474e-15  -8.008e-15   3.522e-16  -.37710524  -.00013916  -.00120123  -.01714931           0   .37710524
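
          And a quick way to look at just the cross-group slope covariances (row/column positions taken from the listing above, so adjust if your ordering differs):
          Code:
          * Extract cov(R1 slopes, R2 slopes) from e(V); should be zero up to roundoff
          matrix V = e(V)
          matrix offblock = V[5..7, 1..3]
          matrix list offblock, format(%10.2e)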

