  • Heckman Model for Selection Bias with Panel Data

    Hello everyone,

    I am examining the relation between tax planning and firm value. My data are an unbalanced panel, and I estimated the regression with fixed effects. Because I suspect selection bias, I want to estimate a Heckman selection model.
    However, I am not sure which control variables to include and whether any should be excluded from the model.

    My regression looked like this:
    HTML Code:
    reghdfe tobin btd  TA sales ppe vol nolassets foreigninc std ltd rd, absorb(Year sic) vce(cluster isin Year)
    I tried the Heckman approach and the code looks like this:
    HTML Code:
    heckman tobin btd TA sales ppe vol nolassets foreigninc std ltd rd i.Year, select(taxavoid = TA sales vol ppe nolassets foreigninc std ltd rd i.Year) vce(robust)
    HTML Code:
    Heckman selection model                         Number of obs     =      2,829
    (regression model with sample selection)        Censored obs      =      1,435
                                                    Uncensored obs    =      1,394
    
                                                    Wald chi2(19)     =     237.86
    Log pseudolikelihood = -4144.806                Prob > chi2       =     0.0000
    
    ------------------------------------------------------------------------------
                 |               Robust
                 |      Coef.   Std. Err.      z    P>|z|     [95% Conf. Interval]
    -------------+----------------------------------------------------------------
    tobin        |
           w_btd |   6.904373   1.240759     5.56   0.000     4.472531    9.336215
            w_TA |  -4.898546   .6854721    -7.15   0.000    -6.242046   -3.555045
           sales |  -.0730264   .0401021    -1.82   0.069    -.1516251    .0055723
            ppe2 |  -1.255607   .1520859    -8.26   0.000     -1.55369   -.9575241
             vol |   .0260516   .0290061     0.90   0.369    -.0307993    .0829026
       nolassets |   121.1902   113.3512     1.07   0.285     -100.974    343.3544
      foreigninc |    .014931   .0835962     0.18   0.858    -.1489145    .1787765
             std |   .3343805   .8761062     0.38   0.703    -1.382756    2.051517
             ltd |   .4227263   .1981058     2.13   0.033     .0344461    .8110065
        changerd |   23.37676   4.954492     4.72   0.000     13.66613    33.08739
                 |
                 |
           _cons |    1.31306   .1498813     8.76   0.000     1.019298    1.606822
    -------------+----------------------------------------------------------------
    taxavoid     |
            w_TA |   .6723251   .3413934     1.97   0.049     .0032062    1.341444
           sales |  -.2075946   .0338522    -6.13   0.000    -.2739438   -.1412455
             vol |   .0559122   .0211032     2.65   0.008     .0145507    .0972737
            ppe2 |   .5709638   .1226611     4.65   0.000     .3305524    .8113752
       nolassets |   419.0152   123.7301     3.39   0.001     176.5086    661.5217
      foreigninc |   .1127357   .0660463     1.71   0.088    -.0167126     .242184
             std |   1.129645   .5186263     2.18   0.029     .1131565    2.146134
             ltd |   .1060875   .0973646     1.09   0.276    -.0847435    .2969186
        changerd |  -.0004607   1.700443    -0.00   1.000    -3.333269    3.332347
                 |
                 |
           _cons |   .0981113   .0917423     1.07   0.285    -.0817003    .2779229
    -------------+----------------------------------------------------------------
         /athrho |   -.009978   .0570314    -0.17   0.861    -.1217576    .1018015
        /lnsigma |   .2156416   .0404441     5.33   0.000     .1363726    .2949106
    -------------+----------------------------------------------------------------
             rho |  -.0099777   .0570258                     -.1211595    .1014513
           sigma |   1.240658   .0501773                      1.146109    1.343006
          lambda |  -.0123789   .0707588                     -.1510635    .1263057
    ------------------------------------------------------------------------------
    Wald test of indep. eqns. (rho = 0): chi2(1) =     0.03   Prob > chi2 = 0.8611
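
    As a reading aid for the table above: the rho, sigma and lambda rows are transformations of the estimated /athrho and /lnsigma parameters, and the Wald statistic at the bottom is the squared z statistic on /athrho. The following is a minimal sketch that redoes this arithmetic with the numbers copied from the output; nothing in it is re-estimated, and the results can be compared with the corresponding rows of the table.

    ```stata
    * Numbers below are copied by hand from the heckman output above.
    display tanh(-.009978)                        // rho    = tanh(athrho)
    display exp(.2156416)                         // sigma  = exp(lnsigma)
    display tanh(-.009978) * exp(.2156416)        // lambda = rho * sigma

    * Wald test of rho = 0: squared z statistic on /athrho and its p-value
    display (-.009978 / .0570314)^2
    display chi2tail(1, (-.009978 / .0570314)^2)
    ```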

    I also tried the xtheckmanfe command, but it keeps running and never returns any results.
    1.) Is the code for the model specified correctly?
    2.) My earlier fixed-effects model also showed a positive relation between tax avoidance (btd) and firm value (tobin). Am I right to read the Heckman coefficient of 6.9 as a positive relation as well, or does it have to be interpreted differently?
    3.) Since I cannot reject H0 in the Wald test shown above, is it possible that I do not have a selection problem?
    I am using Stata 14.

    Thank you in advance!

  • #2
    I see some potential problems here. First, the Heckman approach is really only suited to cases where data are missing on y. Do you have missing x variables, too?
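
    One way to check where the missing values sit is to tabulate missingness for the outcome and the regressors. This is a minimal sketch that uses the variable names from the reghdfe command in #1 and assumes they exist in the data under exactly those names.

    ```stata
    * Count missing values per variable (tobin is y, the rest are x's)
    misstable summarize tobin btd TA sales ppe vol nolassets foreigninc std ltd rd

    * Show which combinations of variables are missing together
    misstable patterns tobin btd TA sales ppe vol nolassets foreigninc std ltd rd, frequency
    ```

    If only tobin shows up with missing values, the data are missing on y alone; if btd or the controls appear as well, the x's are affected too.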

    Second, you really need a variable that affects selection — tax avoidance — but not y. You seem to have assumed the opposite.
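
    To illustrate what an exclusion restriction looks like in the heckman syntax, here is a hedged sketch. The variable z_excl is hypothetical and does not appear in the thread: it stands for a variable that predicts selection but can credibly be left out of the tobin equation. The numeric firm_id is also an assumption, built from isin. As in the original command, btd is kept out of select().

    ```stata
    * firm_id: numeric firm identifier built from isin (an assumption)
    egen long firm_id = group(isin)

    * z_excl is HYPOTHETICAL: it enters select() but not the tobin equation
    heckman tobin btd TA sales ppe vol nolassets foreigninc std ltd rd i.Year, ///
        select(taxavoid = z_excl TA sales ppe vol nolassets foreigninc std ltd rd i.Year) ///
        vce(cluster firm_id)
    ```

    This only shows where the excluded variable goes. It is still a pooled Heckman model, so the concerns raised in the third and fourth points apply to it as well.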

    Third, this way of implementing Heckman does not make it more robust. In fact, it makes the very strong assumption that the explanatory variables are independent of the heterogeneity in both equations. This is a very strong assumption.

    Fourth, this command rules out serial correlation in the outcome and selection equations. Again, this is way too strong.

    If data are only missing on y and you find an exclusion restriction, it’s much easier and more robust to use the pooled correlated random effects approach in my 1995 Journal of Econometrics paper.
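
    A pooled correlated random effects selection correction is often implemented in two steps: a probit for selection in each year that includes firm-level time averages of the covariates (the Mundlak device), followed by pooled OLS on the selected sample with the inverse Mills ratio interacted with year dummies. The sketch below is one reading of that idea, not a reproduction of the paper. The names s, z_excl and firm_id are assumptions that do not come from the thread, and the covariate list is shortened for readability.

    ```stata
    * ASSUMPTIONS: s = 1 when tobin is observed; z_excl is a hypothetical
    * exclusion restriction; firm_id is a numeric firm identifier.
    egen long firm_id = group(isin)
    gen byte s = !missing(tobin)

    * Mundlak device: firm-level time averages of the selection covariates
    local zvars z_excl btd TA sales ppe vol
    local zbars
    foreach v of local zvars {
        egen double mz_`v' = mean(`v'), by(firm_id)
        local zbars `zbars' mz_`v'
    }

    * Step 1: one probit per year, then the inverse Mills ratio (IMR)
    gen double imr = .
    levelsof Year, local(years)
    foreach t of local years {
        probit s `zvars' `zbars' if Year == `t'
        predict double xb_t if Year == `t', xb
        replace imr = normalden(xb_t) / normal(xb_t) if Year == `t'
        drop xb_t
    }

    * Step 2: pooled OLS on the selected sample with year-specific IMR slopes
    regress tobin btd TA sales ppe vol `zbars' i.Year i.Year#c.imr if s == 1, ///
        vce(cluster firm_id)

    * Joint significance of the IMR terms serves as a test for selection bias
    testparm i.Year#c.imr
    ```

    The clustered standard errors in step 2 do not account for the fact that imr is estimated in step 1, so a bootstrap that resamples whole firms and repeats both steps may be preferable when the IMR terms are jointly significant.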

    • #3
      Hello Jeff,

      thank you very much for your answer!

      First, yes, I have missing values on both y (firm value) and x (tax avoidance).

      Second, thank you, I see what you mean. I misunderstood and simply put all control variables into the select() option, but as you stated, I should include only those that directly affect tax avoidance. I corrected the code to this:

      HTML Code:
      heckman tobin btd TA sales ppe vol nolassets foreigninc std ltd rd i.Year, select(taxavoid = TA nolassets foreigninc std ltd i.Year) vce(robust)
      About the third and fourth points: are those statistical problems removed if select() contains only the variables that affect tax avoidance, or is the heckman command generally a problem here?

      Sorry for the further questions, but I have never worked with Heckman models and am just trying to figure it out.

      • #4
        Dear Jeff Wooldridge, regarding the paper you suggest in #2, I am wondering how using FE estimation in the second stage differs from using RE (with Mundlak) or pooled OLS (also with Mundlak). Without bothering you too much, could you clarify this a little?
        Thanks a lot in advance.

        • #5
          Originally posted by Jeff Wooldridge View Post
          I see some potential problems here. First, the Heckman approach is really only suited to cases where data are missing on y. Do you have missing x variables, too?

          Second, you really need a variable that affects selection — tax avoidance — but not y. You seem to have assumed the opposite.

          Third, this way of implementing Heckman does not make it more robust. In fact, it makes the very strong assumption that the explanatory variables are independent of the heterogeneity in both equations. This is a very strong assumption.

          Fourth, this command rules out serial correlation in the outcome and selection equations. Again, this is way too strong.

          If data are only missing on y and you find an exclusion restriction, it’s much easier and more robust to use the pooled correlated random effects approach in my 1995 Journal of Econometrics paper.

          Dear Prof. Wooldridge,

          I have data with missing y but no missing values for the x's. Could you please suggest how to implement a Heckman sample selection correction for a linear panel model with random effects manually [without using the Stata package, as that works only for versions 16 onwards]?

          Thanks,
          Nitin
