  • Later waves missing in xtreg output despite being present in panel data

    Hi everyone,
    I'm working with panel data that spans 14 waves (survey years 2008/09 to 2021/22). I have verified via a tab wave command that all waves are present in the data. However, when I run the following regression

    Code:
    xtreg ln_income i.wave##i.ho_experience_pre c.age##c.age c.working_hours##c.working_hours isco_group industry yeduc ft_empl marry kids career_orient, fe vce(cluster id)
    
    Fixed-effects (within) regression               Number of obs     =     14,582
    Group variable: id                              Number of groups  =      2,680
    
    R-squared:                                      Obs per group:
         Within  = 0.3281                                         min =          2
         Between = 0.5410                                         avg =        5.4
         Overall = 0.5172                                         max =         11
    
                                                    F(32, 2679)       =      76.17
    corr(u_i, Xb) = 0.2180                          Prob > F          =     0.0000
    
                                                        (Std. err. adjusted for 2,680 clusters in id)
    -------------------------------------------------------------------------------------------------
                                    |               Robust
                          ln_income | Coefficient  std. err.      t    P>|t|     [95% conf. interval]
    --------------------------------+----------------------------------------------------------------
                               wave |
                         2 2009/10  |   .0681846   .0128561     5.30   0.000     .0429758    .0933934
                         3 2010/11  |   .0604937   .0190624     3.17   0.002     .0231151    .0978723
                         4 2011/12  |   .0724554   .0275179     2.63   0.009      .018497    .1264139
                         5 2012/13  |   .0799369   .0338413     2.36   0.018     .0135793    .1462946
                         6 2013/14  |   .1038995   .0418849     2.48   0.013     .0217695    .1860295
                         7 2014/15  |   .1159107   .0491002     2.36   0.018     .0196325    .2121888
                         8 2015/16  |   .1617871   .0572526     2.83   0.005     .0495233    .2740509
                         9 2016/17  |   .1749853   .0651693     2.69   0.007     .0471981    .3027725
                        10 2017/18  |    .188722   .0730147     2.58   0.010     .0455512    .3318928
                        11 2018/19  |    .189117   .0818202     2.31   0.021     .0286799    .3495541
                                    |
                1.ho_experience_pre |  -.1041576   .0390347    -2.67   0.008    -.1806987   -.0276165
                                    |
             wave#ho_experience_pre |
                       2 2009/10#1  |   .0938141    .038596     2.43   0.015     .0181331    .1694951
                       3 2010/11#1  |   .1262978   .0401339     3.15   0.002     .0476014    .2049943
                       4 2011/12#1  |   .1249205   .0396449     3.15   0.002     .0471829    .2026582
                       5 2012/13#1  |   .1408421   .0395729     3.56   0.000     .0632455    .2184386
                       6 2013/14#1  |   .1555202   .0399638     3.89   0.000     .0771572    .2338831
                       7 2014/15#1  |   .1327628   .0405883     3.27   0.001     .0531752    .2123503
                       8 2015/16#1  |   .1445162   .0402785     3.59   0.000     .0655361    .2234962
                       9 2016/17#1  |    .101765   .0405026     2.51   0.012     .0223455    .1811844
                      10 2017/18#1  |   .1424053   .0402076     3.54   0.000     .0635642    .2212464
                      11 2018/19#1  |    .167149   .0417178     4.01   0.000     .0853465    .2489514
                                    |
                                age |   .0826679   .0103653     7.98   0.000     .0623431    .1029928
                                    |
                        c.age#c.age |  -.0009762   .0000923   -10.57   0.000    -.0011572   -.0007951
                                    |
                      working_hours |     .03485   .0027815    12.53   0.000     .0293959    .0403041
                                    |
    c.working_hours#c.working_hours |  -.0003158   .0000324    -9.76   0.000    -.0003792   -.0002523
                                    |
                         isco_group |  -.0286973   .0124213    -2.31   0.021    -.0530535    -.004341
                           industry |  -.0065527   .0058398    -1.12   0.262    -.0180037    .0048983
                              yeduc |   .0509694   .0108919     4.68   0.000     .0296121    .0723267
                            ft_empl |   .1602967   .0140782    11.39   0.000     .1326915    .1879018
                              marry |    .024089   .0102281     2.36   0.019     .0040333    .0441448
                               kids |   -.037875   .0144466    -2.62   0.009    -.0662027   -.0095473
                      career_orient |  -.0037158   .0019303    -1.93   0.054    -.0075008    .0000692
                              _cons |   4.127984   .3203886    12.88   0.000      3.49975    4.756218
    --------------------------------+----------------------------------------------------------------
                            sigma_u |  .32644559
                            sigma_e |  .20413771
                                rho |  .71888487   (fraction of variance due to u_i)
    -------------------------------------------------------------------------------------------------
    the output only displays coefficients for waves 2 through 11. The later waves (12–14) are missing from the regression output, even though I know they exist in the data.

    Here’s what I’ve already checked:
    • The tab wave output confirms that waves 12–14 are present and have sufficient data.
    • The sample filtering steps don’t seem to drop observations from the later waves.
    • The key variable ho_experience_pre is constructed using a “freeze” approach based on wave 11, but I intended it to apply to all waves (meaning that waves 12–14 carry the value from wave 11). This cannot be the issue, because I get the same result when I use the unfrozen version, ho_experience.
    Has anyone encountered a similar issue, or does anyone have suggestions on why waves 12–14 might not be included in the regression output? Any help or insights would be greatly appreciated.

    Thanks in advance!

  • #2
    Sophia:
    re-run as:
    Code:
     
     xtreg ln_income i.wave##i.ho_experience_pre c.age##c.age c.working_hours##c.working_hours isco_group industry yeduc ft_empl marry kids career_orient, fe vce(cluster id) allbase
    and see what Stata gives you back.
    Kind regards,
    Carlo
    (StataNow 18.5)


    • #3
      Thank you Carlo for your quick response. Unfortunately, it stays the same:
      Code:
      xtreg ln_income i.wave##i.ho_experience_pre c.age##c.age c.working_hours##c.working_hours isco_group industry yeduc ft_empl marry kids career_orient, fe vce(cluster id) allbase
      
      Fixed-effects (within) regression               Number of obs     =     14,582
      Group variable: id                              Number of groups  =      2,680
      
      R-squared:                                      Obs per group:
           Within  = 0.3281                                         min =          2
           Between = 0.5410                                         avg =        5.4
           Overall = 0.5172                                         max =         11
      
                                                      F(32, 2679)       =      76.17
      corr(u_i, Xb) = 0.2180                          Prob > F          =     0.0000
      
                                                          (Std. err. adjusted for 2,680 clusters in id)
      -------------------------------------------------------------------------------------------------
                                      |               Robust
                            ln_income | Coefficient  std. err.      t    P>|t|     [95% conf. interval]
      --------------------------------+----------------------------------------------------------------
                                 wave |
                           1 2008/09  |          0  (base)
                           2 2009/10  |   .0681846   .0128561     5.30   0.000     .0429758    .0933934
                           3 2010/11  |   .0604937   .0190624     3.17   0.002     .0231151    .0978723
                           4 2011/12  |   .0724554   .0275179     2.63   0.009      .018497    .1264139
                           5 2012/13  |   .0799369   .0338413     2.36   0.018     .0135793    .1462946
                           6 2013/14  |   .1038995   .0418849     2.48   0.013     .0217695    .1860295
                           7 2014/15  |   .1159107   .0491002     2.36   0.018     .0196325    .2121888
                           8 2015/16  |   .1617871   .0572526     2.83   0.005     .0495233    .2740509
                           9 2016/17  |   .1749853   .0651693     2.69   0.007     .0471981    .3027725
                          10 2017/18  |    .188722   .0730147     2.58   0.010     .0455512    .3318928
                          11 2018/19  |    .189117   .0818202     2.31   0.021     .0286799    .3495541
                                      |
                    ho_experience_pre |
                                   0  |          0  (base)
                                   1  |  -.1041576   .0390347    -2.67   0.008    -.1806987   -.0276165
                                      |
               wave#ho_experience_pre |
                         1 2008/09#0  |          0  (base)
                         1 2008/09#1  |          0  (base)
                         2 2009/10#0  |          0  (base)
                         2 2009/10#1  |   .0938141    .038596     2.43   0.015     .0181331    .1694951
                         3 2010/11#0  |          0  (base)
                         3 2010/11#1  |   .1262978   .0401339     3.15   0.002     .0476014    .2049943
                         4 2011/12#0  |          0  (base)
                         4 2011/12#1  |   .1249205   .0396449     3.15   0.002     .0471829    .2026582
                         5 2012/13#0  |          0  (base)
                         5 2012/13#1  |   .1408421   .0395729     3.56   0.000     .0632455    .2184386
                         6 2013/14#0  |          0  (base)
                         6 2013/14#1  |   .1555202   .0399638     3.89   0.000     .0771572    .2338831
                         7 2014/15#0  |          0  (base)
                         7 2014/15#1  |   .1327628   .0405883     3.27   0.001     .0531752    .2123503
                         8 2015/16#0  |          0  (base)
                         8 2015/16#1  |   .1445162   .0402785     3.59   0.000     .0655361    .2234962
                         9 2016/17#0  |          0  (base)
                         9 2016/17#1  |    .101765   .0405026     2.51   0.012     .0223455    .1811844
                        10 2017/18#0  |          0  (base)
                        10 2017/18#1  |   .1424053   .0402076     3.54   0.000     .0635642    .2212464
                        11 2018/19#0  |          0  (base)
                        11 2018/19#1  |    .167149   .0417178     4.01   0.000     .0853465    .2489514
                                      |
                                  age |   .0826679   .0103653     7.98   0.000     .0623431    .1029928
                                      |
                          c.age#c.age |  -.0009762   .0000923   -10.57   0.000    -.0011572   -.0007951
                                      |
                        working_hours |     .03485   .0027815    12.53   0.000     .0293959    .0403041
                                      |
      c.working_hours#c.working_hours |  -.0003158   .0000324    -9.76   0.000    -.0003792   -.0002523
                                      |
                           isco_group |  -.0286973   .0124213    -2.31   0.021    -.0530535    -.004341
                             industry |  -.0065527   .0058398    -1.12   0.262    -.0180037    .0048983
                                yeduc |   .0509694   .0108919     4.68   0.000     .0296121    .0723267
                              ft_empl |   .1602967   .0140782    11.39   0.000     .1326915    .1879018
                                marry |    .024089   .0102281     2.36   0.019     .0040333    .0441448
                                 kids |   -.037875   .0144466    -2.62   0.009    -.0662027   -.0095473
                        career_orient |  -.0037158   .0019303    -1.93   0.054    -.0075008    .0000692
                                _cons |   4.127984   .3203886    12.88   0.000      3.49975    4.756218
      --------------------------------+----------------------------------------------------------------
                              sigma_u |  .32644559
                              sigma_e |  .20413771
                                  rho |  .71888487   (fraction of variance due to u_i)
      -------------------------------------------------------------------------------------------------
      
      .


      • #4
        Well, you say that the usual sample filtering rules don't seem to drop observations from the later waves, but you don't say how you arrived at that conclusion.
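
        One direct way to see which waves actually enter the estimation sample is to tabulate wave against e(sample) immediately after the regression. This is only a minimal sketch: it assumes the xtreg command from #1 has just been run in the same session, and the variable name insample is made up for illustration.

        ```stata
        * e(sample) equals 1 for observations used by the most recent estimation command
        generate byte insample = e(sample)

        * Waves with a zero count in the insample == 1 column did not enter the regression
        tabulate wave insample
        ```

        If waves 12–14 show zeros in the insample == 1 column, they were dropped before estimation rather than omitted from the display.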

        The most likely culprit here is missing values in the model variables. Any observation that has a missing value in any model variable is necessarily excluded from the analysis. So let's check that:
        Code:
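        * Count missing values across the model variables in each observation
        egen excludable = rowmiss(ln_income wave ho_experience_pre age working_hours ///
            isco_group industry yeduc ft_empl marry kids career_orient id)
        * Collapse the count to a 0/1 indicator: 1 = at least one model variable is missing
        replace excludable = min(1, excludable)
        
        * Cross-tabulate the indicator against wave
        tab wave excludable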
        Perhaps the table will show that all observations in the late waves are excludable on this basis.

        If that doesn't turn out to be the answer, I think your chances of getting more helpful advice would be greatly improved by posting some example data that reproduces the problem. Be sure to use the -dataex- command to do that. If you are running version 18, 17, 16 or a fully updated version 15.1 or 14.2, -dataex- is already part of your official Stata installation. If not, run -ssc install dataex- to get it. Either way, run -help dataex- to read the simple instructions for using it. -dataex- will save you time; it is easier and quicker than typing out tables. It includes complete information about aspects of the data that are often critical to answering your question but cannot be seen from tabular displays or screenshots. It also makes it possible for those who want to help you to create a faithful representation of your example to try out their code, which in turn makes it more likely that their answer will actually work in your data. Also bear in mind that in order to be helpful, your example data needs to include all of the model variables and also result in the same problem you are currently having when run with the same code you are using.


        • #5
          Thank you Clyde,
          that was helpful. I now have an answer and need to check where I went wrong with the filtering:
          Code:
          . tab wave excludable
          
              Survey |      excludable
                year |         0          1 |     Total
          -----------+----------------------+----------
           1 2008/09 |     1,100        197 |     1,297
           2 2009/10 |     1,365        201 |     1,566
           3 2010/11 |     1,369        212 |     1,581
           4 2011/12 |     1,396        192 |     1,588
           5 2012/13 |     1,385        166 |     1,551
           6 2013/14 |     1,327        165 |     1,492
           7 2014/15 |     1,311        154 |     1,465
           8 2015/16 |     1,369        141 |     1,510
           9 2016/17 |     1,369        149 |     1,518
          10 2017/18 |     1,332        144 |     1,476
          11 2018/19 |     1,259        137 |     1,396
          12 2019/20 |         0      1,265 |     1,265
          13 2020/21 |         0      1,236 |     1,236
          14 2021/22 |         0      1,186 |     1,186
          -----------+----------------------+----------
               Total |    14,582      5,545 |    20,127
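
          The table shows that all 3,687 observations in waves 12–14 (1,265 + 1,236 + 1,186) are flagged, but not which variable is responsible. A per-variable count of missing values restricted to those waves might narrow it down. This is a hedged sketch that assumes the variable names from the model in #1 and that wave is coded 1 to 14, as in the table above.

          ```stata
          * For each model variable, count missing values within waves 12-14 only
          foreach v of varlist ln_income ho_experience_pre age working_hours ///
              isco_group industry yeduc ft_empl marry kids career_orient {
              quietly count if missing(`v') & inrange(wave, 12, 14)
              display as text "`v'" _col(25) as result r(N)
          }
          ```

          Any variable whose count equals 3,687 is missing for every observation in the later waves, which on its own would be enough to drop those waves from the regression.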
