  • Significance test (t-test) for coefficients across multiple unconditional quantile regressions (rifreg)

    Hello,

    I would like to know how I can do a t-test for the coefficients of multiple rifregs (Firpo rifreg.ado). We're researching the effect of several variables on wages, especially the gender wage gap, and consider special payments to be an important factor. We ran rifregs for several quantiles with the Firpo rifreg.ado. In one set of models the dependent variable was the base wage; in the other it was the base wage plus the special payment. Now I need to know how I can check whether the differences between the coefficients of the base wage models and the base wage + special payments models are significant.

    We already compared the confidence intervals, but that does not seem to be enough: even if they overlap, there can still be a significant difference. I thought about checking the significance with a t-test. For that, as I found out, the regression results need to be stored with "eststo". After that you apparently have to combine the stored estimates using the "suest" command, and afterwards you should be able to do the t-test. The problem is that the suest command doesn't work with the eststo'ed rifreg results. I get the error message "*name of the eststo* was estimated with a nonstandard vce (robust)". I have to admit that I don't fully understand the problem Stata is having there. I searched for ways to fix it but didn't find any answers.
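
To make the point about overlapping intervals concrete, here is a sketch of the usual large-sample test for the difference between two coefficients. The labels A and B for the two models are notation introduced only for this illustration.

```latex
z \;=\; \frac{\hat\beta_A - \hat\beta_B}
             {\sqrt{\widehat{\operatorname{Var}}(\hat\beta_A)
                    + \widehat{\operatorname{Var}}(\hat\beta_B)
                    - 2\,\widehat{\operatorname{Cov}}(\hat\beta_A,\hat\beta_B)}}
```

Two 95% intervals overlap whenever the absolute difference is below 1.96 (se_A + se_B), whereas the test rejects when it exceeds 1.96 times the square root in the denominator. With se_A = se_B = 1 and zero covariance, the intervals overlap for any difference below 3.92, yet the test already rejects at about 2.77. Both models here are fitted on the same sample, so the covariance is generally not zero, which is why a joint covariance matrix (what -suest- is meant to provide) is needed.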

    So how can I get the t-test done? Or if that is not possible how can I check for significance another way?
    If we can't show the significance of the different coefficients that would be really bad.

    Thanks for any help!

    Best Regards,
    Christoph

    Edit: I also posted the question on stackoverflow: http://stackoverflow.com/questions/3...tional-quantil
    Last edited by Christoph Jehle; 17 Feb 2016, 11:26.

  • #2
    I'm sorry I forgot about that. The post at stackoverflow was (apparently) also written by me. Thanks for linking it!

    Edit: Above this post there was a post rightly criticizing that I didn't link my cross-post. It has since been deleted.
    Last edited by Christoph Jehle; 17 Feb 2016, 11:29.


    • #3
      The question disappeared (temporarily) on CV, so I deleted my original post. I don't get the "(apparently)", but there you go.


      • #4
        I get the error message "*name of the eststo* was estimated with a nonstandard vce (robust)".
        The help file for -suest- is very clear on this:

        2. Estimation should take place without the vce(robust) or vce(cluster clustvar) options. suest always computes the robust estimator of the (co)variance, and suest has a vce(cluster clustvar) option.
        So you will have to go back and re-estimate the models in question using the standard vce (i.e. don't specify any vce() option). Then -suest- will gladly calculate your cross-model tests. Do read the entire help file for -suest-.
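
As a minimal sketch of the workflow described above, using official commands only: the example below uses -regress- on Stata's bundled auto dataset, not -rifreg-, and the model names m_price and m_length are made up for illustration. Comparing these two particular coefficients has no substantive meaning; the point is only the mechanics of storing, combining, and testing.

```stata
* Illustration with official Stata commands and the shipped auto data
sysuse auto, clear

* Fit each model with the default (non-robust) VCE and store it
regress price mpg foreign
estimates store m_price

regress length mpg foreign
estimates store m_length

* Combine the stored results; suest computes the robust joint VCE itself
suest m_price m_length

* Cross-model Wald test; after regress, suest names the equations <name>_mean
test [m_price_mean]mpg = [m_length_mean]mpg
```

Whether the same sequence works after a user-written command depends on that command supporting what -suest- needs, so treat this as a template for official estimators only.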


        • #5
          Thank you for your answer.

          The original code had a weight defined at the end, "(aweight=xxx)". To check whether that was the source of the problem, I left it out, but the error still occurred without the weight. The code was just "rifreg depvar indepvar indepvar indepvar, q(.10)". So I don't see where I specified any vce() option. I then tried to check whether rifreg uses a nonstandard vce by default but haven't found out anything about that yet. I will read the help file.


          • #6
            I'm not previously familiar with -rifreg-, which is not part of official Stata (and isn't even findable using Stata's -search- command). But I found it on Google and looked at the code.* Rather unconventionally, robust vce is the default in -rifreg-. To avoid the use of vce(robust) you have to specify -rifreg-'s -norobust- option.

            *Normally I would look for information in the help file, but there is something wrong with the help file that came in that package and I was unable to open it.
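
For illustration, a call with the -norobust- option might look like the sketch below. It assumes rifreg.ado is installed; the variable names are borrowed from the examples in the -rifreg- help file and the stored name q10_base is made up, so none of them refer to the original poster's data.

```stata
* Assumes the user-written rifreg.ado is installed; variable names are placeholders
* norobust switches off rifreg's default robust standard errors
rifreg lwage educ exp expsq union, quantile(0.1) norobust

* Store the results under a hypothetical name for later use
estimates store q10_base
```

This only addresses the "nonstandard vce (robust)" message; it does not by itself guarantee that -suest- accepts the stored results.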


            • #7
              http://faculty.arts.ubc.ca/nfortin/datahead.html is a source (you are asked to say where user-written programs come from).

              This is the help

              Code:
              help rifreg,                                                   dialogs:  rifreg
              --------------------------------------------------------------------------------
              
              Title
              
                  [R] rifreg --
                                Recentered Influence Function (RIF) regression
              
              Syntax
              
                  Recentered Influence Function (RIF) regression
              
                      rifreg depvar [indepvars] [if] [in] [weight] [, rifreg_options]
              
                  rifreg_options             Description
                  --------------------------------------------------------------------------
                  Model
                    quantile(#)               specifies the # quantile; default is
                                               quantile(.5), it can also be 50 (equivalent
                                               to 0.5), or 50.5 (equivalent to 0.505)
                    gini                      specifies the Gini
                    variance                  specifies the variance
              
                  Quantile Options
                    kernop(string)            kernel function used with kdensity, one of
                                               epanechnikov, epan2, biweight, cosine,
                                               gaussian, parzen, rectangle, and triangle;
                                               default is gaussian
                    width(#)                  halfwidth of kernel; default is width(0.0);
                                               which calculates the 'optimal' value
                    retain(string)           Retain RIF column: save the calculated values
                                               into variable string, but only for values
                                               conditioned by [if] and [in]
                    norobust                  Do not use robust standard error estimation
                    generate(string string)  Save the values from the kernel density
                                               estimation so they do not need to be
                                               recomputed.
              
                  Reporting
                    level(#)                  Set the confidence level to #
                    bootstrap                 Generate bootstrapped standard errors, with a
                                               default of 50 repetitions
                    reps                      Number of repetitions used for boostrapping
                                               must be less than the maximum matrix size
                                               matsize
              
                  --------------------------------------------------------------------------
                  fweights, aweights, and iweights are allowed; see weight.
              
              
              Description
              
                  rifreg fits a regression model of the re-centered influence function (RIF)
                  of a distributional statistic of interest (quantile, variance or gini) of
                  the marginal distribution of depvar on indepvars. In the case of
                  quantiles, RIF-regressions can be thought of as unconditional quantile
                  regressions. The influence function is a widely used tool in robust
                  estimation; here it is recentered so the mean of the recentered influence
                  function corresponds to the statistic of interest.  In RIF-regressions,
                  the depvar is replaced by the corresponding RIF of the statistic of
                  interest.
              
              Options for rifreg
              
                      +-------+
                  ----+ Model +-------------------------------------------------------------
              
                  quantile(#) specifies the quantile for which the RIF is computed; should
                      be a number between 0 and 1, exclusive.  The default value of 0.5
                      corresponds to the median.  Syntactically, 50 is equivalent to 0.5,
                      and 50.5 is equivalent to 0.505.  The number of steps is computed
                      based on the number of significant decimal places, with more taking
                      longer to compute, and requiring more memory.
              
                  gini specifies the Gini as the distributional statistic of interest
              
                  variance specifie variance as the distributional statistic of interest
              
                      +------------------+
                  ----+ Quantile Options +--------------------------------------------------
              
                  kernop(string) specifies the kernel function used to estimate the density
                      of the dependent variable , one of epanechnikov, epan2, biweight,
                      cosine, gaussian, parzen, rectangle, and triangle; default is
                      gaussian.  See kdensity for further information. The RIF for quantiles
                      may be sensitive to the choice of bandwidth.  It is advisable to graph
                      the density and explore alternative choices of bandwidth for
                      appropriate smoothness using the options in vkdensity, for example.
              
                  width(#) Halfwidth of kernel; default is width(0.0); which calculates the
                      optimal value.  See kdensity for further information.
              
                  norobust Do not use robust standard error estimation
              
                  generate(string string) Save the values from the kernel density estimation
                      so they do not need to be recomputed.  If the variables named by
                      'string string' do not have values, they will be computed and saved.
                      If they do contain values, those values are used, if there are
                      sufficient observations.
              
                  retain(string) Retain RIF column: save the calculated values into variable
                      string, but only for values conditioned by [if] and [in], with others
                      set to '.'.
              
              
                      +-----------+
                  ----+ Reporting +---------------------------------------------------------
              
                  level Set the confidence interval for output
              
                  bootstrap Generate bootstrapped standard errors, with a default of 50
                      repetitions
              
                  reps Number of repetitions used for boostrapping, must be less than the
                      maximum matrix size matsize
              
              Examples
              
                  Single RIF-regressions
              
                  . rifreg y x1 x2 x3, quantile(0.9)
              
                  . rifreg lwage educ exp expsq union, variance
              
                  . rifreg income x1 x2 x3 , gini
                  . display "Gini: `e(gini)'
              
              
                  Using bootstrapping
              
                  . rifreg y x1 x2 x3, quantile(0.9) bootstrap reps(100)
                  . rifreg y x1 x2 x3, gini bootstrap reps(100)
              
              
                  Multiple RIF-regressions
              
                  When one wants to estimate more than one quantile, the following loop can
                  be used
              
                  . local quartile 0.25 0.5 0.75
                  . foreach q of local quartile {
                  .  rifreg lwage educ exp expsq union, quantile(`q') w(0.06)
                  . }
              
              
                  Sweeping a range of quantiles / using generate
              
                  It is often useful to compute the regression for multiple quantiles, in a
                  certain range, to create a graph, or to see all the different values.
                  This can be calculated more quickly by using the generate option, to
                  reduce the number of times the influence function has to be recalculated.
              
                  . forvalues q = 0.1(0.1)0.9 {
                  .  rifreg lwage educ exp expsq union, quantile(`q') w(0.06) generate(eval
                      dens)
                  . }
              
              
                  RIF-regression decompositions
              
                  The quantile RIF-regressions can be used to perform an Oaxaca
                  decomposition using the oaxaca8 command, but the option retain(string)
                  must be used to store the estimates of the RIF:
              
                    Step 1: Estimate and store the models
              
                  . rifreg lwage age schooling wkswk_18 if female==0 [weight=wgt], q(0.5)
                      w(0.06) re(rif_50m)
                  . estimates store malemed
                  . rifreg lwage age schooling wkswk_18 if female==1 [weight=wgt], q(0.5)
                      w(0.06) re(rif_50f)
                  . estimates store femalemed
              
                    Step 2: Use oaxaca8 to compute the decomposition
              
                  . oaxaca8 malemed femalemed, weight(1 0 0.525) detail notf
              
              
              
              References
              
                  Firpo, Sergio, Nicole Fortin and Thomas Lemieux, Unconditional Quantiles
                      Regressions," NBER Technical Paper T339, July 2007. (forthcoming in
                      Econometrica)
                  Firpo, Sergio, Nicole Fortin and Thomas Lemieux, Decomposing Wage
                      Distributions using Recentered Influence Function Regressions". June
                      2007.
              
              
              Also see
              
                  Online:  regress


              • #8
                Thanks again! (and sorry for not adding the source of the ado in my starting post)

                I guess I should have read the help file more carefully. Then I would have seen the norobust option, which makes it clear that robust is used by default. Tomorrow, when I'm at the university again, I will use norobust. I hope the results are not too different from the original ones. I'm still baffled as to why robust is the default, since as far as I know the sandwich estimator is only used in the case of heteroscedasticity. I will have to discuss with my colleagues whether it's maybe even wiser to use norobust for all models. It also seems weird to check the significance with norobust but otherwise use robust.


                • #9
                  Again, if you read the help for -suest- you will see that it expects estimates with non-robust VCEs, and it will "robustify" them as desired in calculating its own results.

                  Edited in later: I think I have misunderstood your remark in #8 about checking help files and using/not using robust. I thought you were referring to -suest-, but I now see you're referring to -rifreg-. So, sorry for beating a dead horse on -suest-'s help file.
                  Last edited by Clyde Schechter; 17 Feb 2016, 13:49.


                  • #10
                    Nick Cox Thanks for posting that help file. Somehow my version got corrupted in the download and it's just a garble of random characters.


                    • #11
                      Stata .hlp files, which were always text, sometimes get misread as Windows help files, which aren't. That's the reason that StataCorp changed the extension to .sthlp.


                      • #12
                        Hello,

                        I have now tried to run the -suest- command after using the -rifreg, norobust- option. As expected, I don't get the error message about the robust vce anymore. But unfortunately suest still won't gladly calculate my tests. After using -suest- I now get another error message: "unable to generate scores for model xxx. suest requires that predict allow the score option". In the help of suest I can find that
                        "Different estimators are allowed, for example, a regress model and a probit model; the only requirement is that predict produce equation-level scores with the score option after an estimation command." But I don't understand that, since I used neither -predict- nor the score option. The rifreg help also says nothing about either score or predict, and the help of predict doesn't seem to help me much either. Does it maybe mean that I have to use -predict, scores- after each rifreg for each variable and then -eststo- the predictions, or something like that?

                        Edit: I just checked the code of the rifreg.ado, and -predict- is used there, for example here
                        Code:
                        //get uC
                           //depends on WC
                           qui reg `y' `rest' [aw=`WC'*`eweight'], robust
                           predict `uC', residual
                        
                           //get ephiC
                           //depends on uC
                           local `dflvarnames' : colnames e(b)
                           local `dflvarnames' : subinstr local `dflvarnames' "_cons" "", word
                           qui reg `uC' ``dflvarnames'' [aw=`eweight'] if `T1' == 0
                           predict `ephiC', xb
                        Maybe it is possible to just add the score option to every -predict- used?

                        I just tried exactly that. I added the score option after both -predict- calls, saved the ado, and then ran the regressions again. But I still get the same error. I don't even know if it's possible to change the ado code after the ado has been installed, because I don't know whether an ado gets installed once or whether Stata reads the ado code every time the ado is used. If the former were the case, a change afterwards wouldn't accomplish anything, of course.
                        Last edited by Christoph Jehle; 18 Feb 2016, 03:54.
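
On the question of whether an edited ado-file takes effect: Stata loads an ado program into memory the first time the command is called in a session and keeps using that copy, so edits saved to disk are not picked up until the program is dropped from memory. A minimal sketch, assuming rifreg.ado is on the ado-path and using placeholder variable names:

```stata
* Show which rifreg.ado Stata finds on the ado-path
* (confirms that the edited copy is the one in use)
which rifreg

* Drop automatically loaded programs from memory;
* the next call re-reads rifreg.ado from disk
discard

* Placeholder variable names, following the call quoted earlier in the thread
rifreg depvar indepvar1 indepvar2, quantile(0.1) norobust
```

Run -discard- before re-estimating rather than after, since it is meant for resetting the session state. Reloading the ado only makes the edits active; it does not mean that adding a score option to those internal -predict- calls gives -suest- what it needs.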


                        • #13
                          Hello Christoph,

                          I am also trying to test -rifreg- parameter estimates across quantiles using the code Nicole Fortin posted online. So far I have not been successful. Did you ever figure out how to do it? If so, could you please post your code? This is the first time I have used Statalist and I am not sure of the protocol. I tried to find you online without any luck, so I am trying the Statalist process. Thank you for your time.


                          • #14
                            It seems nobody followed up on Christoph's post #12. I don't recall ever seeing it. Anyway, in light of the new post on this thread, I will add my pessimistic view of this situation.

                            It is -suest- itself that attempts to call -predict, score- when it runs. If -predict- cannot calculate -score- for the particular model estimated, then -suest- exits with an error message saying that it was unable to calculate score. And without calculating score, it cannot do anything.

                            So fixing this would require a major hack, perhaps to -rifreg- but definitely to -predict-, to enable it to calculate scores after -rifreg-. I don't know if it is even possible: I am not familiar with -rifreg- and I don't know if scores are even definable for it. (It is possible only for likelihood-based estimators; I don't know if -rifreg- is likelihood-based or not.) Changes to -rifreg- itself might or might not be needed: if they are, they would likely be minimal, causing it to return some values in e() or r() that are needed for calculating scores but aren't currently there. Even assuming it can be done, hacking -predict- is highly inadvisable: you might gain this small capability, but you might in the process introduce errors into one of the key commands used for all manner of things following estimation. I doubt it's worth it. At the very least, if you go this route, you should make a back-up copy of the original predict.ado and of any associated files you modify.

                            A safer way to proceed, assuming that -rifreg- is likelihood based and that scores can be calculated in some reasonable way for it, would be to write a special purpose ado file that basically emulates what -suest- does, but is specialized only to do the calculations for -rifreg-, and instead of calling -predict- it calculates the scores internally. This is a major undertaking that would require a good knowledge of mathematical statistics, the workings of -rifreg-, the workings of -suest-, and meticulous programming skills.
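
One possible workaround, offered only as an untested sketch and not as something anyone in this thread confirmed: the -rifreg- help quoted above says the depvar is replaced by its RIF and documents a retain() option that saves the RIF as a variable. If that holds, the RIF could be regressed on the covariates with official -regress-, which -suest- does support. All variable and model names below are hypothetical, and rifreg.ado is assumed to be installed.

```stata
* Hypothetical variables: wage_base, wage_total, female, educ, exper
* Step 1: let rifreg compute and save the RIF of each dependent variable
rifreg wage_base  female educ exper, quantile(0.1) retain(rif_base10)
rifreg wage_total female educ exper, quantile(0.1) retain(rif_total10)

* Step 2: refit by OLS on the saved RIFs with the default VCE and store
regress rif_base10 female educ exper
estimates store base10

regress rif_total10 female educ exper
estimates store total10

* Step 3: joint robust VCE across both models, then the cross-model test
suest base10 total10
test [base10_mean]female = [total10_mean]female
```

Check first that the -regress- coefficients match those reported by -rifreg-. Note also that this treats the RIF as known data, so it ignores the sampling error in the estimated quantile and density; bootstrapping both models together over the same resamples would be a more conservative alternative. Weights are left out here because -suest- has its own rules about how weights must be specified.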
