  • Collinearity Dropping Everything in Fixed Effects Regression


    I am running an individual-level fixed effects regression on panel data. The panel is perfectly balanced. My main outcome is a dummy variable. I have a gender dummy, a categorical year variable, and the interaction between the two.

    The code that I have tried:

    . xtreg emp_in_lockdown male time male##i.time, fe
    note: male omitted because of collinearity
    note: time omitted because of collinearity
    note: 1.male omitted because of collinearity
    note: 4.time omitted because of collinearity
    note: 1.male#4.time omitted because of collinearity
    Fixed-effects (within) regression               Number of obs     =     50,439
    Group variable: id                              Number of groups  =     50,439
    R-sq:                                           Obs per group:
         within  =      .                                         min =          1
         between =      .                                         avg =        1.0
         overall =      .                                         max =          1
                                                    F(0,0)            =       0.00
    corr(u_i, Xb)  =      .                         Prob > F          =          .
    emp_in_loc~n |      Coef.   Std. Err.      t    P>|t|     [95% Conf. Interval]
            male |          0  (omitted)
            time |          0  (omitted)
          1.male |          0  (omitted)
          4.time |          0  (omitted)
       male#time |
            1 4  |          0  (omitted)
           _cons |   .2222288          .        .       .            .           .
         sigma_u |  .41574825
         sigma_e |          .
             rho |          .   (fraction of variance due to u_i)
    F test that all u_i=0: F(50438, 0) = .                       Prob > F =      .
    The output from reghdfe:
    . reghdfe emp_in_lockdown male time male##i.time, absorb(id)
    (dropped 50439 singleton observations)
    insufficient observations
    I suspect that either the sample is too small to support any regression or that time is collinear with the fixed effects. Here is an example of the data, generated with dataex:

    * Example generated by -dataex-. To install: ssc install dataex
    input float id byte(emp_in_lockdown male) float time
     1 . 1 1
     1 . 1 2
     1 . 1 3
     1 0 . 4
     2 . 0 1
     2 . 0 2
     2 . 0 3
     2 0 . 4
     3 . 0 1
     3 . 0 2
     3 . 0 3
     3 0 . 4
     4 . 0 1
     4 . 0 2
     4 . 0 3
     4 0 . 4
     5 . 0 1
     5 . 0 2
     5 . 0 3
     5 0 . 4
     6 . 1 1
     6 . 1 2
     6 . 1 3
     6 0 1 4
     7 . 0 1
     7 . 0 2
     7 . 0 3
     7 1 0 4
     8 . 0 1
     8 . 0 2
     8 . 0 3
     8 0 0 4
     9 . . 1
     9 . . 2
     9 . . 3
     9 0 . 4
    10 . . 1
    10 . . 2
    10 . . 3
    10 0 . 4
    11 . 1 1
    11 . 1 2
    11 . 1 3
    11 0 . 4
    12 . 0 1
    12 . 0 2
    12 . 0 3
    12 0 . 4
    13 . 1 1
    13 . 1 2
    13 . 1 3
    13 0 . 4
    14 . 0 1
    14 . 0 2
    14 . 0 3
    14 0 . 4
    15 . 1 1
    15 . 1 2
    15 . 1 3
    15 0 . 4
    16 . 1 1
    16 . 1 2
    16 . 1 3
    16 0 . 4
    17 . 0 1
    17 . 0 2
    17 . 0 3
    17 0 . 4
    18 . 1 1
    18 . 1 2
    18 . 1 3
    18 0 . 4
    19 . 1 1
    19 . 1 2
    19 . 1 3
    19 0 . 4
    20 . 1 1
    20 . 1 2
    20 . 1 3
    20 0 1 4
    21 . 0 1
    21 . 0 2
    21 . 0 3
    21 0 0 4
    22 . 1 1
    22 . 1 2
    22 . 1 3
    22 0 1 4
    23 . 1 1
    23 . 1 2
    23 . 1 3
    23 0 1 4
    24 . 0 1
    24 . 0 2
    24 . 0 3
    24 0 0 4
    28 . 1 1
    28 . 1 2
    28 . 1 3
    28 0 . 4

  • #2
    1) Your regressand is a two-level categorical variable, so why use -xtreg- (that is, a linear probability model)?
    2) Your dataset is full of missing values, which leave at most one usable observation per panel; hence any panel data regression (which needs at least two waves of data) fails.
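
    A quick way to check point 2 is to count, for each id, how many rows have nonmissing values on every variable in the model. This is only a minimal sketch: the names complete, n_complete and tag_id are hypothetical helper variables, not part of the original dataset.

    ```stata
    * Flag rows that would survive casewise deletion
    * (outcome and both regressors nonmissing)
    generate byte complete = !missing(emp_in_lockdown, male, time)

    * Number of usable rows for each individual
    bysort id: egen n_complete = total(complete)

    * Tabulate usable rows per panel, counting each id once
    egen byte tag_id = tag(id)
    tabulate n_complete if tag_id
    ```

    If no id has n_complete of 2 or more, there is no within-panel variation left for a fixed-effects estimator to use, which matches the "max = 1" in the -xtreg- header and the singleton message from -reghdfe-.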
    Kind regards,
    (StataNow 18.5)


    • #3
      Originally posted by Carlo Lazzaro
      1) Your regressand is a two-level categorical variable, so why use -xtreg- (that is, a linear probability model)?
      2) Your dataset is full of missing values, which leave at most one usable observation per panel; hence any panel data regression (which needs at least two waves of data) fails.
      Dear Carlo Lazzaro ,

      I tried a simple -reg-. Again, everything other than the gender dummy is omitted. I also tried -areg-, but it did not work. I then tried household fixed effects and ran into the same problem.

      . reg emp_in_lockdown male time male##i.time
      note: time omitted because of collinearity
      note: 1.male omitted because of collinearity
      note: 4.time omitted because of collinearity
      note: 1.male#4.time omitted because of collinearity
            Source |       SS           df       MS      Number of obs   =    50,439
      -------------+----------------------------------   F(1, 50437)     =   9452.45
             Model |  1375.98189         1  1375.98189   Prob > F        =    0.0000
          Residual |  7342.05514    50,437  .145568831   R-squared       =    0.1578
      -------------+----------------------------------   Adj R-squared   =    0.1578
             Total |  8718.03703    50,438  .172846604   Root MSE        =    .38153
      emp_in_loc~n |      Coef.   Std. Err.      t    P>|t|     [95% Conf. Interval]
              male |   .3311417    .003406    97.22   0.000     .3244659    .3378175
              time |          0  (omitted)
            1.male |          0  (omitted)
            4.time |          0  (omitted)
         male#time |
              1 4  |          0  (omitted)
             _cons |      .0451    .002491    18.10   0.000     .0402175    .0499824
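
      One way to see why every time term is dropped is to look at which waves actually enter the estimation sample. The following is a minimal sketch to run on the same data. Note that male##i.time already expands to both main effects plus the interaction, so listing male and time separately only adds redundant columns that Stata then omits.

      ```stata
      * Same pooled model, without the redundant main effects
      regress emp_in_lockdown i.male##i.time

      * Waves that survive casewise deletion
      tabulate time if e(sample)

      * Waves where the outcome is missing
      tabulate time if missing(emp_in_lockdown)
      ```

      If the first table lists a single value of time, then time is constant in the estimation sample and is collinear with the constant, so neither its main effect nor its interaction with male can be estimated.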


      • #4
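        Missing values (which trigger casewise deletion) strike again. In your data excerpt, emp_in_lockdown is observed only at time 4, so the estimation sample contains a single wave and every term involving time is omitted.
        Unless you deal with this issue, you cannot move forward.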
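
        As a rough sketch of what dealing with the missing values could look like: in the data excerpt, male is often recorded at times 1 to 3 but missing at time 4, which is the only wave with an outcome. Assuming gender does not change within id (an assumption to check, not something the data guarantee), it can be spread across waves. The names male_filled and male_min are hypothetical, and -logit- is just one possible choice for a binary outcome.

        ```stata
        * Assumption: male is time-invariant within id
        bysort id: egen byte male_filled = max(male)

        * Sanity check: no id reports conflicting values of male
        bysort id: egen byte male_min = min(male)
        assert male_min == male_filled   // both are missing when male is never observed

        * The outcome exists in one wave only, so this is a cross-sectional model
        logit emp_in_lockdown i.male_filled if time == 4
        ```

        This recovers observations lost to the missing gender dummy, but it does not create a panel: with the outcome observed in a single wave, individual fixed effects and a male-by-time interaction still cannot be identified.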
        Kind regards,
        (StataNow 18.5)

