Thanks to Kit Baum, a new package, qrprocess, is now available on SSC for Stata 9.2+. You can install it with
This package offers fast estimation and inference procedures for the linear quantile regression model. First, qrprocess implements new algorithms that are much faster than the built-in Stata commands, especially when a large number of quantile regressions or bootstrap replications must be estimated. Second, the command provides analytical estimates of the variance-covariance matrix of the coefficients for several quantile regressions, allowing for weights, clustering, and stratification. Third, in addition to traditional pointwise confidence intervals, the command also provides functional confidence bands and tests of functional hypotheses. Fourth, predict called after qrprocess can generate monotone estimates of the conditional quantile and distribution functions obtained by rearrangement. Fifth, the new command plotprocess conveniently plots the estimated coefficients with their confidence intervals and uniform bands.
Let's consider an example. We load a data set with 5634 observations:
The median regression of lwage on age, age2, education, and indicator variables for black and hispanic can be estimated with
When a single quantile regression is estimated, qrprocess is very similar to the official command qreg. However, qrprocess offers additional algorithms that are faster when the number of observations is very large, and it provides standard errors that allow for clustering and stratification.
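For reference, the estimand behind both commands can be written out. The block below is the textbook definition of the linear quantile regression estimator, given as a sketch rather than as the package's exact objective; the symbols $n$, $y_i$, $x_i$, $b$ and $\rho_\tau$ are notation introduced here, and weights are left out for simplicity.

```latex
% Textbook quantile regression estimator at quantile index \tau in (0,1).
% Notation (n, y_i, x_i, b, \rho_\tau) is assumed here, not taken from the post.
\[
  \hat\beta(\tau) \;=\; \arg\min_{b} \sum_{i=1}^{n} \rho_\tau\!\left(y_i - x_i' b\right),
  \qquad
  \rho_\tau(u) \;=\; u\left(\tau - \mathbf{1}\{u < 0\}\right).
\]
% For the median, \tau = 0.5 and \rho_{0.5}(u) = |u|/2,
% so the estimator minimizes the sum of absolute residuals.
```

In the example above, $y_i$ would be lwage and $x_i$ would collect age, age squared, educ, the two indicator variables, and a constant.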
The main advantages of qrprocess appear when many quantile regressions must be estimated to analyze the conditional distribution of the outcome. For instance, we may estimate 81 quantile regressions for the quantile indexes 0.1, 0.11, 0.12, ..., 0.9 with
We have activated the option noprint because the table of coefficients is huge. Instead, we can easily plot all the coefficients with the command
and obtain
[Figure: figure1.png]
Note that qrprocess is significantly faster than calling qreg 81 times. In addition, qrprocess also estimates the covariances between the coefficients estimated at different quantile indexes, which allows testing cross-restrictions.
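One way to check the speed difference on your own machine is to time both approaches. The following is a minimal sketch, assuming the cps91 data and the specification used in this post; the timer numbers 1 and 2 and the local macro names are arbitrary choices, and the timings will depend on your Stata version and hardware.

```stata
* Sketch: time 81 separate qreg calls against a single qrprocess call.
* Assumes qrprocess is installed (ssc install qrprocess).
use http://www.stata.com/data/jwooldridge/eacsap/cps91, clear

timer clear

* Approach 1: loop over the quantile indexes 0.10, 0.11, ..., 0.90 with qreg
timer on 1
forvalues i = 10/90 {
    local q = `i'/100
    quietly qreg lwage c.age##c.age educ i.black i.hispanic, quantile(`q')
}
timer off 1

* Approach 2: estimate the same 81 quantile regressions in one call
timer on 2
quietly qrprocess lwage c.age##c.age educ i.black i.hispanic, quantile(0.1(0.01)0.9) noprint
timer off 2

* Report elapsed seconds for both timers
timer list
```

Note that the loop only reproduces the point estimates quantile by quantile; it does not give the covariances across quantile indexes that qrprocess reports.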
If this algorithm is still too slow, qrprocess implements a new and even faster estimator, the one-step estimator. This estimator is not numerically identical to the traditional quantile regression estimator but it is asymptotically equivalent to it. We can select this algorithm with the option method(onestep)
Many of the hypotheses of interest to researchers involve the whole quantile regression process, e.g. (1) Does a variable have any effect at all? I.e., is the coefficient on this variable 0 at all quantile indexes? (2) Does a variable have a positive effect over the whole distribution (stochastic dominance)? (3) Is the effect of a variable homogeneous (constant at all quantile indexes)?
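Written formally, the three questions correspond to hypotheses about the whole coefficient function rather than about a single quantile index. The statement below is a sketch in notation introduced here ($\beta_j(\tau)$ for the coefficient on variable $j$ at quantile index $\tau$, and $\mathcal{T}$ for the range of indexes considered, e.g. $[0.1, 0.9]$ in the example); the exact form of the hypotheses tested by the command is documented in the help file.

```latex
% Sketch of the three hypotheses; \beta_j(\tau) and \mathcal{T} are assumed notation.
\begin{align*}
  \text{(1) no effect:}          \quad & H_0 : \beta_j(\tau) = 0      \quad \text{for all } \tau \in \mathcal{T}, \\
  \text{(2) positive effect:}    \quad & H_0 : \beta_j(\tau) \ge 0    \quad \text{for all } \tau \in \mathcal{T}, \\
  \text{(3) homogeneous effect:} \quad & H_0 : \beta_j(\tau) = \beta_j \quad \text{for all } \tau \in \mathcal{T}.
\end{align*}
```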
These are functional null hypotheses. A naive approach consisting of estimating many quantile regressions and using pointwise tests will suffer from the multiple testing problem. qrprocess offers tests for functional hypotheses as well as uniform confidence bands that cover the whole function with a prespecified probability. The option functional must be activated. Only the bootstrap can be used for functional inference. Here we use the multiplier bootstrap, which is faster:
At the end of the omitted output the p-values for many functional null hypotheses are provided. We can plot the coefficients, the pointwise confidence intervals as well as the uniform bands with plotprocess. Without any argument, we can see all the coefficients. If we are especially interested in the effect of education, we can type
and we obtain
[Figure: figure2.png]
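To see why the uniform bands are wider than the pointwise intervals, it may help to write both down. The block below is a generic sketch of a sup-t type band, not necessarily the exact construction used by qrprocess; the symbols $\hat\beta_j(\tau)$, $\widehat{se}_j(\tau)$, $\hat\beta_j^{*}(\tau)$, $\hat c_{1-\alpha}$ and $\mathcal{T}$ are notation introduced here. The arithmetic at the end assumes independent tests, which is unrealistic for neighboring quantile indexes and is only meant to illustrate the multiple testing problem.

```latex
% Pointwise 95% interval at a fixed quantile index \tau:
\[
  \hat\beta_j(\tau) \;\pm\; 1.96 \,\widehat{se}_j(\tau).
\]
% Uniform band: the critical value is the (1-\alpha) quantile of the bootstrap
% distribution of the largest absolute t-statistic over all indexes in \mathcal{T}.
\[
  \hat\beta_j(\tau) \;\pm\; \hat c_{1-\alpha}\,\widehat{se}_j(\tau),
  \qquad
  \hat c_{1-\alpha} \;=\; (1-\alpha)\text{-quantile of }
  \sup_{\tau \in \mathcal{T}}
  \frac{\bigl|\hat\beta_j^{*}(\tau) - \hat\beta_j(\tau)\bigr|}{\widehat{se}_j(\tau)}.
\]
% Illustration of the multiple testing problem with 81 pointwise tests at the
% 5% level, under the (unrealistic) assumption that the tests are independent:
\[
  \Pr(\text{at least one false rejection}) \;=\; 1 - 0.95^{81} \;\approx\; 0.98 .
\]
```

Because the supremum is at least as large as the statistic at any single index, the uniform critical value is at least as large as the pointwise one, so the uniform band contains the pointwise interval at every quantile index.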
qrprocess and plotprocess offer many additional options that you can discover by reading the help files. We have also written a paper that describes the algorithms, the inference procedures, and the code: "Quantile and distribution regression in Stata: algorithms, pointwise and functional inference". We are still working on it with the aim of submitting it to the Stata Journal. We have written another paper where we suggest the new algorithms that are implemented in the package: "Fast algorithms for the quantile regression process".
This code and these papers are the result of joint work by Victor Chernozhukov, Iván Fernández-Val, and myself.
Code:
ssc install qrprocess
Code:
use http://www.stata.com/data/jwooldridge/eacsap/cps91
Code:
. qrprocess lwage c.age##c.age educ i.black i.hispanic

Quantile regression
No. of obs.        3286
Algorithm:         qreg.
Variance:          kernel estimate of the sandwich as proposed by Powell(1990).
------------------------------------------------------------------------------
       lwage |      Coef.   Std. Err.      t    P>|t|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
Quant. 0.5   |
         age |     .04578   .0077878     5.88   0.000     .0305106    .0610495
 c.age#c.age |  -.0005031    .000099    -5.08   0.000    -.0006972    -.000309
        educ |   .1018382   .0041052    24.81   0.000     .0937893    .1098871
     1.black |   -.021541   .0414319    -0.52   0.603     -.102776    .0596939
  1.hispanic |   .0484709   .0432474     1.12   0.262    -.0363236    .1332655
       _cons |  -.1473537   .1501476    -0.98   0.326    -.4417461    .1470388
------------------------------------------------------------------------------
Code:
qrprocess lwage c.age##c.age educ i.black i.hispanic, quantile(0.1(0.01)0.9) noprint
Code:
plotprocess
Code:
qrprocess lwage c.age##c.age educ i.black i.hispanic, quantile(0.1(0.01)0.9) noprint method(onestep)
Code:
qrprocess lwage c.age##c.age i.black i.hispanic educ, quantile(0.1(0.01)0.9) functional vce(multiplier, reps(500))
Code:
plotprocess educ, ytitle("QR coefficient") title("Years of education")