
  • Applying the Heckman selection model in panel data with fixed effects

    Hello all,

    I am running fixed-effects regressions (linear probability models for the binary outcomes) of health outcomes and behaviors on local employment change over three waves. One of these behaviors is the quantity of cigarettes consumed. It was suggested that an OLS model with a Heckman correction would first model the decision to smoke and then, conditional on smoking, the quantity smoked. I agree with this, but I am not sure how to apply a Heckman selection model to panel data with fixed effects.

    I model several outcomes and behaviors in Stata as shown below. I would like to keep this approach when applying the Heckman correction, both for comparability across the outcomes studied and because I need to apply weights to my analysis of cigarette consumption.

    I saw a suggestion on Stack Exchange to cluster the standard errors on the panel id (https://stats.stackexchange.com/ques...and-panel-data). Would that mean changing my current clustering from county to individual id? I xtset the data by id and year.

    Alternatively, I saw a comment by Phil Bromiley:
    "Fixed effects can be done with i.panel in heckman. You'll probably need to increase matsize and you'll end up with a pile of parameter estimates on the panels that are not of interest. xtreg y x with the panel called panel is identical to reg y x i.panel"
    (https://www.statalist.org/forums/for...for-panel-data), but I don't know what that would mean in an applied sense in Stata.
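
    To make the quoted equivalence concrete, here is a minimal sketch on Stata's shipped example data rather than the data in this thread. The dataset (nlswork), the variables (ln_wage, tenure, age, idcode) and the 50-person subsample are assumptions chosen only for illustration. It fits the same model with xtreg, fe and with one dummy per person, then checks that the slope on tenure matches.

    ```stata
    * Illustration on example data, not the thread's data
    webuse nlswork, clear
    keep if idcode <= 50              // small subsample keeps the dummy set short
    xtset idcode

    * Within (fixed-effects) estimator
    xtreg ln_wage tenure age, fe
    scalar b_fe = _b[tenure]

    * Same model with one dummy per individual (LSDV)
    regress ln_wage tenure age i.idcode
    display "xtreg, fe: " b_fe "   LSDV: " _b[tenure]

    * The two slopes should agree up to numerical precision
    assert reldif(_b[tenure], b_fe) < 1e-6
    ```

    The dummy coefficients from regress are the "pile of parameter estimates" mentioned in the quote; only the slopes are of interest.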

    I also found that a UNESCAP document suggests the following:

    heckman depvar indepvar1 indepvar2 … dum1 dum2 …, select(indepvar1 indepvar2 … dum1 dum2 … overidvar1…) options

    https://artnet.unescap.org/tid/artnet/mtg/cbtr7-s12.pdf

    But I'm not sure which dummies I'm supposed to add.
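
    One possible reading, offered as an assumption rather than as what the UNESCAP document intends: dum1 dum2 … are the fixed-effect dummies, which in a person-by-wave panel would be one dummy per individual (i.id) and one per wave (i.year). The sketch below uses a small simulated panel; every variable name and number in it is hypothetical. The individual dummies go in the outcome equation only, because individual dummies in a probit selection equation run into the incidental parameters problem when the panel is short.

    ```stata
    * Simulated panel: all names and values are hypothetical
    clear
    set seed 2718
    set obs 30                          // 30 individuals
    gen id = _n
    gen a  = rnormal()                  // individual fixed effect
    expand 8                            // 8 waves per individual
    bysort id: gen year = _n
    gen x = rnormal() + 0.5*a           // regressor correlated with the effect
    gen z = rnormal()                   // appears in the selection equation only
    gen u = rnormal()                   // selection error
    gen smoker = (0.5 + 0.5*x + z + u > 0)
    gen cigs = 10 + 2*x + 3*a + 0.5*u + rnormal() if smoker

    * Outcome equation carries the individual and wave dummies
    heckman cigs x i.id i.year, select(smoker = x z i.year) twostep
    ```

    The twostep option is used here only because it avoids convergence problems in a toy example. As far as I know it does not accept pweights or vce(cluster id), so those would need the default maximum likelihood estimator. With only three waves and many individuals, a full set of individual dummies may also be impractical.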


    I thought xtheckman might save me, but it fits a random-effects model with selection, and I need fixed effects (https://www.stata.com/new-in-stata/xtheckman/).

    I would really appreciate applied advice on what I should do to my analysis to apply a Heckman correction.

    Thanks for any help,

    John

    This is my core model:

    Code:
    . xtreg no_cigs_cons_deflated_y  psum_unemployed_total_cont_y i.yrlycurrent_county_y1 i.year age_y i.marita
    > lstatus_y if has_y0_questionnaire==1 & has_y5_questionnaire==1, cluster (current_county_y1) fe robust 
    note: 6.yrlycurrent_county_y1 omitted because of collinearity
    note: 15.yrlycurrent_county_y1 omitted because of collinearity
    note: 18.yrlycurrent_county_y1 omitted because of collinearity
    note: 23.yrlycurrent_county_y1 omitted because of collinearity
    note: 25.yrlycurrent_county_y1 omitted because of collinearity
    note: 26.yrlycurrent_county_y1 omitted because of collinearity
    note: 29.yrlycurrent_county_y1 omitted because of collinearity
    note: 5.year omitted because of collinearity
    
    Fixed-effects (within) regression               Number of obs      =      1152
    Group variable: id                              Number of groups   =       642
    
    R-sq:  within  = 0.0605                         Obs per group: min =         1
           between = 0.0179                                        avg =       1.8
           overall = 0.0145                                        max =         2
    
                                                    F(13,28)           =         .
    corr(u_i, Xb)  = -0.8476                        Prob > F           =         .
    
                                         (Std. Err. adjusted for 29 clusters in current_county_y1)
    ----------------------------------------------------------------------------------------------
                                 |               Robust
         no_cigs_cons_deflated_y |      Coef.   Std. Err.      t    P>|t|     [95% Conf. Interval]
    -----------------------------+----------------------------------------------------------------
    psum_unemployed_total_cont_y |  -.2387741   .1100417    -2.17   0.039    -.4641842    -.013364
                                 |
           yrlycurrent_county_y1 |
                          Clare  |    1.84201   2.511288     0.73   0.469    -3.302129     6.98615
                           Cork  |   .9439361   2.271351     0.42   0.681    -3.708716    5.596588
                        Donegal  |          0  (omitted)
                      Dublin 16  |   .0798436   2.427069     0.03   0.974    -4.891781    5.051468
                    Dublin City  |   1.268084   2.435825     0.52   0.607    -3.721478    6.257646
         Dún Laoghaire-Rathdown  |   .4580872   2.367576     0.19   0.848    -4.391673    5.307847
                         Fingal  |   .1145035   2.333406     0.05   0.961    -4.665262    4.894269
                         Galway  |  -16.52429   .3514215   -47.02   0.000    -17.24415   -15.80444
                    Galway City  |  -17.09233   .4548787   -37.58   0.000     -18.0241   -16.16055
                          Kerry  |   1.898583   2.566648     0.74   0.466    -3.358958    7.156123
                        Kildare  |   1.688322   2.394418     0.71   0.487     -3.21642    6.593064
                       Kilkenny  |          0  (omitted)
                          Laois  |   2.852193   1.208139     2.36   0.025      .377433    5.326952
                        Leitrim  |   2.076192   2.333259     0.89   0.381    -2.703273    6.855657
                       Limerick  |          0  (omitted)
                       Longford  |   .5373577   2.372396     0.23   0.822    -4.322276    5.396991
                          Louth  |   1.385586   2.386451     0.58   0.566    -3.502838     6.27401
                           Mayo  |  -17.88611   .3841588   -46.56   0.000    -18.67302    -17.0992
                          Meath  |   .1920723   2.276061     0.08   0.933    -4.470227    4.854372
                       Monaghan  |          0  (omitted)
                         Offaly  |   .9486299   2.335269     0.41   0.688    -3.834952    5.732212
                      Roscommon  |          0  (omitted)
                          Sligo  |          0  (omitted)
                   South Dublin  |   .0798436   2.427069     0.03   0.974    -4.891781    5.051468
                      Tipperary  |  -.0933459   .3734837    -0.25   0.804    -.8583927    .6717008
                Tipperary North  |          0  (omitted)
                      Waterford  |  -15.97167   .4552278   -35.09   0.000    -16.90416   -15.03918
                      Westmeath  |   1.313337   2.349551     0.56   0.581      -3.4995    6.126175
                        Wexford  |   -.604106   2.456075    -0.25   0.808    -5.635147    4.426935
                        Wicklow  |   3.927572    3.03076     1.30   0.206    -2.280659     10.1358
                                 |
                          5.year |          0  (omitted)
                           age_y |   .0837821   .0470026     1.78   0.086    -.0124983    .1800625
                                 |
                 maritalstatus_y |
                     Cohabiting  |   .5289705   .4076338     1.30   0.205    -.3060295    1.363971
                      Separated  |   -.547115   .1271997    -4.30   0.000    -.8076718   -.2865582
                       Divorced  |  -6.950598   1.454566    -4.78   0.000    -9.930142   -3.971054
                        Widowed  |    3.47176   1.996616     1.74   0.093    -.6181229    7.561643
           Single/Never married  |  -1.460055   1.615909    -0.90   0.374    -4.770094    1.849984
                                 |
                           _cons |   5.822622   2.518999     2.31   0.028     .6626857    10.98256
    -----------------------------+----------------------------------------------------------------
                         sigma_u |  9.0440127
                         sigma_e |  3.4804153
                             rho |  .87100821   (fraction of variance due to u_i)
    ----------------------------------------------------------------------------------------------
    And I want to model it as something like

    Code:
    heckman no_cigs_cons_y psum_unemployed_total_cont_y i.yrlycurrent_county_y1 i.year age_y i.maritalstatus_y [pw=ipw55] if has_y0_questionnaire==1 & has_y5_questionnaire==1, select(age_y medical_card_y i.year) vce (cluster id)

  • #2
    Thinking about this further, I understand that the reason to cluster on the individual id is to allow for within-individual correlation of the errors when applying a Heckman correction to panel data. However, I'm still not sure how or why to include dummy variables. Could anyone please advise?

    Thank you for your time,

    John
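
    As a small illustration of what clustering on the individual id changes, here is a sketch on Stata's example data (nlswork and its variables are assumptions, not the data in this thread). The point estimates are identical in both fits; only the standard errors differ, because the second fit allows errors to be correlated within a person across waves.

    ```stata
    * Illustration on example data, not the thread's data
    webuse nlswork, clear
    xtset idcode

    * Default (conventional) standard errors
    xtreg ln_wage tenure age, fe

    * Standard errors clustered on the panel id
    xtreg ln_wage tenure age, fe vce(cluster idcode)
    ```

    Note that xtreg, fe requires panels to be nested within clusters, so clustering on a coarser unit such as county only works when no individual changes county in the clustering variable.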



    • #3
      Hello, have you tried the xtheckmanfe command? You can install it from SSC. It is a little slow, but it has worked for me so far.
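
      Installing from SSC looks like this. The command's own syntax is not reproduced here because it is not confirmed in this thread; the help file is the place to check it.

      ```stata
      * Install the community-contributed command from SSC, then read its help file
      ssc install xtheckmanfe
      help xtheckmanfe
      ```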
