  • Problem/bug with the new absorb() option in StataNow for regress: incorrect scores -> suest invalid

    I very much welcome the new absorb() option for the regress command introduced in Stata 18.5 (StataNow). However, this now creates problems further down the assembly line.

    The regress postestimation help file gives the following description of the score option of the predict command:
    score is equivalent to residuals in linear regression.
    However, this is no longer correct when variables have been absorbed. The scores produced here are incorrect. As a consequence, subsequent commands that require those scores will produce incorrect results as well. First and foremost, this is an issue for the suest command; see the following example:
    Code:
    . webuse psidextract
    
    . quietly regress lwage wks, absorb(id)
    
    . estimates store reg
    
    . suest reg, vce(cluster id)
    
    Cluster adjusted results for reg                         Number of obs = 4,165
    
                                       (Std. err. adjusted for 595 clusters in id)
    ------------------------------------------------------------------------------
                 |               Robust
                 | Coefficient  std. err.      z    P>|z|     [95% conf. interval]
    -------------+----------------------------------------------------------------
    mean         |
             wks |   .0010085   .0041811     0.24   0.809    -.0071864    .0092033
           _cons |   6.629139   .1973423    33.59   0.000     6.242355    7.015923
    -------------+----------------------------------------------------------------
    lnvar        |
           _cons |  -2.696966   .1758439   -15.34   0.000    -3.041613   -2.352318
    ------------------------------------------------------------------------------
    
    . regress lwage wks, absorb(id) vce(cluster id)
    
    Linear regression, absorbing indicators         Number of obs     =      4,165
                                                    F(0, 594)         =          .
                                                    Prob > F          =          .
                                                    R-squared         =     0.7287
                                                    Adj R-squared     =     0.6835
                                                    Root MSE          =     .25963
    
                                       (Std. err. adjusted for 595 clusters in id)
    ------------------------------------------------------------------------------
                 |               Robust
           lwage | Coefficient  std. err.      t    P>|t|     [95% conf. interval]
    -------------+----------------------------------------------------------------
             wks |   .0010085   .0014418     0.70   0.485    -.0018231      .00384
           _cons |   6.629139   .0674909    98.22   0.000     6.496589    6.761689
    ------------------------------------------------------------------------------
    The robust standard errors obtained from suest now differ substantially from the correct robust standard errors reported by regress, and they are wrong. Without the absorb() option, they are virtually identical (aside from different degrees-of-freedom corrections).

    Ideally, this should be fixed by computing the correct scores, which are the residuals after absorbing the respective variables. Currently, they are computed as y-xb, ignoring the absorbed variables.
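
    The difference between the two residual definitions can be sketched with areg, whose predict command documents both quantities. This is only an illustration under that assumption, not a fix for suest; the variable names e_within, e_naive, m_within, and m_naive are made up for the example.
    Code:
    webuse psidextract, clear
    quietly areg lwage wks, absorb(id)
    * residuals net of the absorbed id effects: y - xb - d_id
    predict double e_within, residuals
    * residuals that still contain the absorbed effects: y - xb
    predict double e_naive, dresiduals
    * e_within averages to zero within each id; e_naive generally does not
    bysort id: egen double m_within = mean(e_within)
    bysort id: egen double m_naive  = mean(e_naive)
    summarize m_within m_naive
    Scores built from e_naive carry the id-level effects with them, which is consistent with the inflated cluster-robust standard errors reported by suest above.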
    https://www.kripfganz.de/stata/

  • #2
    Originally posted by Sebastian Kripfganz
    I very much welcome the new absorb() option for the regress command introduced in Stata 18.5 (StataNow).
    I think you are confusing the -absorb()- option introduced in xtreg with the undocumented -absorb()- option of regress. I have StataNow 18.5 and there is no documented -absorb()- option for regress.

    [R] regress -- Linear regression
    (View complete PDF manual entry)


    Syntax

    regress depvar [indepvars] [if] [in] [weight] [, options]

    options              Description
    --------------------------------------------------------------------------------------------------------------------------------------------------------------------
    Model
      noconstant         suppress constant term
      hascons            has user-supplied constant
      tsscons            compute total sum of squares with constant; seldom used

    SE/Robust
      vce(vcetype)       vcetype may be ols, robust, cluster clustvarlist, bootstrap, jackknife, hc2 [clustvar], or hc3

    Reporting
      level(#)           set confidence level; default is level(95)
      beta               report standardized beta coefficients
      eform(string)      report exponentiated coefficients and label as string
      depname(varname)   substitute dependent variable name; programmer's option
      clustertable       display table of multiway cluster combinations
      display_options    control columns and column formats, row spacing, line width, display of omitted variables and base and empty cells, and factor-variable labeling

      noheader           suppress output header
      notable            suppress coefficient table
      plus               make table extendable
      mse1               force mean squared error to 1
      coeflegend         display legend instead of statistics
    --------------------------------------------------------------------------------------------------------------------------------------------------------------------
    indepvars may contain factor variables; see fvvarlist.
    depvar and indepvars may contain time-series operators; see tsvarlist.
    bayes, bootstrap, by, collect, fmm, fp, jackknife, mfp, mi estimate, nestreg, rolling, statsby, stepwise, and svy are allowed; see prefix. For more details, see
    [BAYES] bayes: regress and [FMM] fmm: regress.
    vce(bootstrap) and vce(jackknife) are not allowed with the mi estimate prefix.
    Weights are not allowed with the bootstrap prefix.
    aweights are not allowed with the jackknife prefix.
    hascons, tsscons, vce(), beta, noheader, notable, plus, depname(), mse1, and weights are not allowed with the svy prefix.
    aweights, fweights, iweights, and pweights are allowed; see weight.
    noheader, notable, plus, mse1, and coeflegend do not appear in the dialog box.
    See [R] regress postestimation for features available after estimation.
    The undocumented -absorb()- option predates StataNow 18.5, and it effectively turns regress into areg (see #5 of this thread from 2022, for example: https://www.statalist.org/forums/for...r-svy-xtreg-fe). There are also other undocumented options of regress, e.g., syntax that allows it to estimate instrumental variables 2SLS regression. Had you used areg, it would have informed you that suest does not support areg.

    Code:
    sysuse auto, clear
    areg mpg weight, absorb(rep78)
    suest .
    Res.:

    Code:
    . suest .
    areg is not supported by suest
    r(322);
    I agree that if regress is allowed to work as areg, then it should behave the same with other post-estimation commands. However, I think that the Stata developers would argue that using undocumented options may have unintended consequences and that they cannot guarantee support for such options.
    Last edited by Andrew Musau; 03 Jul 2024, 13:06.



    • #3
      You are absolutely right. I indeed confused xtreg, fe with regress. This happened while I was working on a new postestimation command that is supposed to work after both commands. Apparently, I got lost in the various help files I had open at the same time.

      Since absorb() is undocumented for regress, we should indeed not expect that all aspects of it work as intended.

      The reason why suest does not work after areg is that predict after areg does not have a scores option.

      It would be possible to provide correct scores, but this is a different grumble.
      https://www.kripfganz.de/stata/



      • #4
        Whether you absorb or not, reg and suest will provide different standard errors, even for a single model.

        Code:
        sysuse auto, clear
        eststo reg: reg price mpg weight foreign
        suest reg



        • #5
          Originally posted by George Ford
          Whether you absorb or not, reg and suest will provide different standard errors, even for a single model.

          Code:
          sysuse auto, clear
          eststo reg: reg price mpg weight foreign
          suest reg
          That's simply because the standard errors from regress are not robust, while those from suest are. Even if you run regress with vce(robust), the standard errors will differ numerically in small samples due to the degrees-of-freedom correction, but they are asymptotically equivalent. In my initial example, the standard errors computed by suest are just wrong.
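
          A side-by-side sketch of that comparison for the auto example quoted above. suest is run on the stored nonrobust fit, and regress is then rerun with vce(robust); the coefficients should agree, while the standard errors should be close but not identical.
          Code:
          sysuse auto, clear
          quietly regress price mpg weight foreign
          estimates store reg
          * robust (sandwich) standard errors via suest
          suest reg
          * robust standard errors from regress, with its own small-sample correction
          regress price mpg weight foreign, vce(robust)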
          https://www.kripfganz.de/stata/

