  • Year fixed effects vs. dropping one dummy

    I am running regressions with year fixed effects, but one of my covariates is age, which is perfectly collinear with year. I’ve noticed that some papers include year dummies but drop one year to allow age to appear in the results. Could you explain the pros and cons of this approach? Is there a clear advantage to either approach, or is one generally considered better?

    Thanks!

  • #2
    Jun:
    one level (that is, one year) of the categorical variable -i.year- is dropped to protect your analysis from the so-called dummy variable trap (see "Dummy variable (statistics)" on Wikipedia), as you can see from the following toy example:
    Code:
    . use "https://www.stata-press.com/data/r18/nlswork.dta"
    (National Longitudinal Survey of Young Women, 14-24 years old in 1968)
    
    . xtset idcode year
    
    Panel variable: idcode (unbalanced)
     Time variable: year, 68 to 88, but with gaps
             Delta: 1 unit
    
    . xtreg ln_wage c.age##c.age i.year, fe vce(cluster idcode)
    
    Fixed-effects (within) regression               Number of obs     =     28,510
    Group variable: idcode                          Number of groups  =      4,710
    
    R-squared:                                      Obs per group:
         Within  = 0.1162                                         min =          1
         Between = 0.1078                                         avg =        6.1
         Overall = 0.0932                                         max =         15
    
                                                    F(16, 4709)       =      79.11
    corr(u_i, Xb) = 0.0613                          Prob > F          =     0.0000
    
                                 (Std. err. adjusted for 4,710 clusters in idcode)
    ------------------------------------------------------------------------------
                 |               Robust
         ln_wage | Coefficient  std. err.      t    P>|t|     [95% conf. interval]
    -------------+----------------------------------------------------------------
             age |   .0728746    .013687     5.32   0.000     .0460416    .0997075
                 |
     c.age#c.age |  -.0010113   .0001076    -9.40   0.000    -.0012224   -.0008003
                 |
            year |
             69  |   .0647054   .0155249     4.17   0.000     .0342693    .0951415
             70  |   .0284423   .0264639     1.07   0.283    -.0234395     .080324
             71  |   .0579959   .0384111     1.51   0.131    -.0173078    .1332996
             72  |   .0510671   .0502675     1.02   0.310    -.0474808     .149615
             73  |   .0424104   .0624924     0.68   0.497    -.0801038    .1649247
             75  |   .0151376    .086228     0.18   0.861    -.1539096    .1841848
             77  |   .0340933   .1106841     0.31   0.758    -.1828994     .251086
             78  |   .0537334   .1232232     0.44   0.663    -.1878417    .2953084
             80  |   .0369475   .1473725     0.25   0.802    -.2519716    .3258667
             82  |   .0391687   .1715621     0.23   0.819    -.2971733    .3755108
             83  |    .058766   .1836086     0.32   0.749    -.3011928    .4187249
             85  |   .1042758   .2080199     0.50   0.616    -.3035406    .5120922
             87  |   .1242272   .2327328     0.53   0.594    -.3320379    .5804922
             88  |   .1904977   .2486083     0.77   0.444    -.2968909    .6778863
                 |
           _cons |   .3937532   .2469015     1.59   0.111    -.0902893    .8777957
    -------------+----------------------------------------------------------------
         sigma_u |  .40275174
         sigma_e |  .30127563
             rho |  .64120306   (fraction of variance due to u_i)
    ------------------------------------------------------------------------------
    
    
    . tab year
    
      Interview |
           year |      Freq.     Percent        Cum.
    ------------+-----------------------------------
             68 |      1,375        4.82        4.82
             69 |      1,232        4.32        9.14
             70 |      1,686        5.91       15.05
             71 |      1,851        6.49       21.53
             72 |      1,693        5.93       27.47
             73 |      1,981        6.94       34.41
             75 |      2,141        7.50       41.91
             77 |      2,171        7.61       49.52
             78 |      1,964        6.88       56.40
             80 |      1,847        6.47       62.88
             82 |      2,085        7.31       70.18
             83 |      1,987        6.96       77.15
             85 |      2,085        7.31       84.45
             87 |      2,164        7.58       92.04
             88 |      2,272        7.96      100.00
    ------------+-----------------------------------
          Total |     28,534      100.00
    Kind regards,
    Carlo
    (StataNow 18.5)


    • #3
      Carlo Lazzaro Thanks, Carlo! My question was more about whether putting in dummies for years and using year fixed effects give essentially the same results. My results seem to vary slightly, but is that not supposed to happen? Thanks!


      • #4
        Jun:
        if you do not -xtset- your data with -timevar-, you will obtain the very same results:
        Code:
        . xtset idcode
        
        Panel variable: idcode (unbalanced)
        
        . xtreg ln_wage c.age##c.age i.year, fe vce(cluster idcode)
        
        Fixed-effects (within) regression               Number of obs     =     28,510
        Group variable: idcode                          Number of groups  =      4,710
        
        R-squared:                                      Obs per group:
             Within  = 0.1162                                         min =          1
             Between = 0.1078                                         avg =        6.1
             Overall = 0.0932                                         max =         15
        
                                                        F(16, 4709)       =      79.11
        corr(u_i, Xb) = 0.0613                          Prob > F          =     0.0000
        
                                     (Std. err. adjusted for 4,710 clusters in idcode)
        ------------------------------------------------------------------------------
                     |               Robust
             ln_wage | Coefficient  std. err.      t    P>|t|     [95% conf. interval]
        -------------+----------------------------------------------------------------
                 age |   .0728746    .013687     5.32   0.000     .0460416    .0997075
                     |
         c.age#c.age |  -.0010113   .0001076    -9.40   0.000    -.0012224   -.0008003
                     |
                year |
                 69  |   .0647054   .0155249     4.17   0.000     .0342693    .0951415
                 70  |   .0284423   .0264639     1.07   0.283    -.0234395     .080324
                 71  |   .0579959   .0384111     1.51   0.131    -.0173078    .1332996
                 72  |   .0510671   .0502675     1.02   0.310    -.0474808     .149615
                 73  |   .0424104   .0624924     0.68   0.497    -.0801038    .1649247
                 75  |   .0151376    .086228     0.18   0.861    -.1539096    .1841848
                 77  |   .0340933   .1106841     0.31   0.758    -.1828994     .251086
                 78  |   .0537334   .1232232     0.44   0.663    -.1878417    .2953084
                 80  |   .0369475   .1473725     0.25   0.802    -.2519716    .3258667
                 82  |   .0391687   .1715621     0.23   0.819    -.2971733    .3755108
                 83  |    .058766   .1836086     0.32   0.749    -.3011928    .4187249
                 85  |   .1042758   .2080199     0.50   0.616    -.3035406    .5120922
                 87  |   .1242272   .2327328     0.53   0.594    -.3320379    .5804922
                 88  |   .1904977   .2486083     0.77   0.444    -.2968909    .6778863
                     |
               _cons |   .3937532   .2469015     1.59   0.111    -.0902893    .8777957
        -------------+----------------------------------------------------------------
             sigma_u |  .40275174
             sigma_e |  .30127563
                 rho |  .64120306   (fraction of variance due to u_i)
        ------------------------------------------------------------------------------
        
        .
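        As a rough sketch of the comparison asked about in #3, the same model can be fitted once with -i.year- and once with hand-made year dummies. The dummy names yr1-yr15 and the stored-estimate names fv and manual are illustrative choices, not something from your data. With the same omitted year (68) and the same estimation sample, the two sets of estimates should coincide; small differences usually point to a different base year or a different sample.

        ```stata
        * Sketch only: factor-variable year effects vs hand-made year dummies
        use "https://www.stata-press.com/data/r18/nlswork.dta", clear
        xtset idcode

        * (a) factor-variable notation: Stata omits the base year (68) itself
        xtreg ln_wage c.age##c.age i.year, fe vce(cluster idcode)
        estimates store fv

        * (b) hand-made dummies yr1-yr15 (illustrative names), one per survey year
        tabulate year, generate(yr)

        * leave out yr1 (year 68) so the omitted category matches (a)
        xtreg ln_wage c.age##c.age yr2-yr15, fe vce(cluster idcode)
        estimates store manual

        * coefficients and standard errors side by side
        estimates table fv manual, b(%9.4f) se(%9.4f)
        ```

        Choosing a different base, for example -ib88.year-, changes the year coefficients and the constant, but not the coefficients on age and its square.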
        If the above does not answer your question, posting an example/excerpt of your data (via -dataex-) in support of it would help enormously.
        Kind regards,
        Carlo
        (StataNow 18.5)
