
  • Use estout to report postselection coefficients after lasso

    I would like to use esttab/estout to report postselection coefficients after running lasso.

    For example
    Code:
    sysuse auto, clear
    lasso linear price mpg headroom trunk
    esttab
    matrix list e(b_postselection)
    I would like esttab to display the results stored in the e(b_postselection) matrix. Any suggestions on how to do this? Thank you!

  • #2
    estout is from the Stata Journal/SSC (see FAQ Advice #12).

    Code:
    mat p= e(b_postselection)'
    esttab mat(p), collabel("postsel. coefs.") mlabel(none)
    Last edited by Andrew Musau; 01 Apr 2021, 16:46.



    • #3
      Thank you very much, Andrew. This achieves what I had asked for. However, I realized my initial question was incomplete. I would like esttab to display p-values, significance stars, etc. e(b_postselection) only contains point estimates, but r(table) has the associated statistics I need. Is there a simple way to get estout to reference r(table) and calculate significance stars etc. from that?

      My goal is to do something like this:
      Code:
      sysuse auto, clear
      eststo, title(OLS): reg price mpg headroom trunk
      eststo, title("LASSO (postselection)"): lasso linear price mpg headroom trunk
      esttab
      However, this code (1) doesn't display the postselection coefficients, and (2) doesn't display t-stats or significance stars. Thoughts on how to fix those two issues? Thanks again!!



      • #4
        Code:
        sysuse auto, clear
        eststo, title("LASSO (postselection)"): lasso linear price mpg headroom trunk
        mat se= r(table)["se", 1...]
        mat pval= r(table)["pvalue", 1...]
        estadd matrix serr = se
        estadd matrix pval = pval
        esttab, cells(b(star pvalue(pval) fmt(%9.3f)) serr(par fmt(%9.3f))) collab(none) stats(N, fmt(0))
        Res.:

        Code:
        . esttab, cells(b(star pvalue(pval) fmt(%9.3f)) serr(par fmt(%9.3f))) collab(none) stats(N, fmt(0))
        
        ----------------------------
                              (1)   
                            price   
        ----------------------------
        mpg              -194.376** 
                         (65.593)   
        trunk               8.677   
                         (88.719)   
        _cons           10185.570***
                       (2349.084)   
        ----------------------------
        N                      74   
        ----------------------------



        • #5
          Thanks Andrew. Unless I'm missing something, these are still not displaying the post-selection coefficients?



          • #6
            Ah... reference the post-selection coefficients matrix.

            Code:
            sysuse auto, clear
            eststo lasso: lasso linear price mpg headroom trunk
            mat se= r(table)["se", 1...]
            mat pval= r(table)["pvalue", 1...]
            estadd matrix serr = se
            estadd matrix pval = pval
            esttab, cells(b_postselection(star pvalue(pval) fmt(%9.3f)) serr(par fmt(%9.3f))) collab(none) stats(N, fmt(0))

            Res.:

            Code:
            . esttab, cells(b_postselection(star pvalue(pval) fmt(%9.3f)) serr(par fmt(%9.3f))) collab(none) stats(N, fmt(0))
            
            ----------------------------
                                  (1)   
                                price   
            ----------------------------
            mpg              -220.165** 
                             (65.593)   
            trunk              43.559   
                             (88.719)   
            _cons           10254.950***
                           (2349.084)   
            ----------------------------
            N                      74   
            ----------------------------



            • #7
              Got it, thank you! To get both the normal OLS and LASSO post-selection coefficients to show up on the same lines, I had to create new matrices with the same name after both the OLS and LASSO and then reference that. Probably a better way of doing it, but that seemed to work. Thanks again for your assistance, hugely appreciated!
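
              A minimal sketch of what that workaround might look like, reusing the estadd pattern from #6. The matrix name coef is hypothetical (chosen only for illustration), and this is a reconstruction under that assumption rather than the exact code used:

              Code:
              sysuse auto, clear
              * OLS: copy e(b) into a matrix stored under a common name
              eststo ols: regress price mpg headroom trunk
              matrix coef = e(b)
              estadd matrix coef = coef
              * lasso: store the postselection coefficients under the same name
              eststo lasso: lasso linear price mpg headroom trunk
              matrix coef = e(b_postselection)
              estadd matrix coef = coef
              * both models now expose e(coef), so one cells() spec covers both
              esttab ols lasso, cells(coef(fmt(%9.3f))) collab(none) stats(N, fmt(0))

              Because both stored results carry a matrix with the same name, esttab can line the OLS and postselection coefficients up on the same rows. Standard errors and stars would need the same treatment (for example, a shared serr and pval matrix as in #6).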



              • #8
                Depending on how often you do this, you can write a program that posts these estimates to e(b) and e(V). The program below calls erepost from SSC.

                Code:
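                *ssc install erepost, replace
                cap prog drop lassotoestout
                prog define lassotoestout, eclass
                * copy the postselection coefficients and their standard errors
                mat b= e(b_postselection)
                mat se= r(table)["se", 1...]
                * refit OLS on the variables listed in e(post_sel_vars); store it as "lasso"
                eststo lasso: qui regress `e(post_sel_vars)'
                * copy e(V) and overwrite its diagonal with the squared standard errors
                mat V=e(V)
                forval i = 1/`=colsof(se)'{
                    mat V[`i',`i']= (se[1, `i'])^2
                }
                * repost the coefficient vector and variance matrix to e(b) and e(V)
                erepost b=b, rename
                erepost V=V, rename
                end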
                
                *TEST
                sysuse auto, clear
                eststo ols: regress price mpg headroom trunk
                lasso linear price mpg headroom trunk
                *ALWAYS RUN PROGRAM AFTER LASSO COMMAND
                lassotoestout
                esttab ols lasso, scalars(N) mtitles(" OLS " "Lasso") nonumb
                Res.:

                Code:
                . esttab ols lasso, scalars(N) mtitles(" OLS " "Lasso") nonumb
                
                --------------------------------------------
                                     OLS            Lasso  
                --------------------------------------------
                mpg                -224.4***       -220.2**
                                  (-3.44)         (-3.36)  
                
                headroom           -659.5                  
                                  (-1.36)                  
                
                trunk               126.6           43.56  
                                   (1.18)          (0.49)  
                
                _cons             11175.8***      10254.9***
                                   (4.60)          (4.37)  
                --------------------------------------------
                N                      74              74  
                --------------------------------------------
                t statistics in parentheses
                * p<0.05, ** p<0.01, *** p<0.001
                Last edited by Andrew Musau; 05 Apr 2021, 12:19.



                • #9
                  It just occurred to me that the lasso estimates you want are estimates from the regression with the post-selection variables. Therefore, the easiest way is:

                  Code:
                  sysuse auto, clear
                  eststo ols: regress price mpg headroom trunk
                  lasso linear price mpg headroom trunk
                  eststo lasso: regress `e(post_sel_vars)'
                  esttab ols lasso, scalars(N) mtitles(" OLS " "Lasso") nonumb

