
  • Quadratic term omitted when using # operator

    Hello all,

    I am running the xtlogit command in Stata 14.2, and my main variable of interest enters with a quadratic term, which I included based on theory and on a utest result confirming the presence of a U-shape.

    I am working with an unbalanced panel with 15,165 observations (see example data below). The panel variable is id_ocad and the time variable is semester.

    My concern is that when I run the command using the # operator to generate the quadratic, the coefficient on the quadratic term is reported as 0 and the standard error is omitted in the output table.

    Code:
     xtlogit prob_project n_projects_cumlag ln_densidad_pob ln_poblacion l2.ln_indice_desempeno l2.ln_tasa_mort l2.ln_balance ln_regalias_efec_cap c.months_election##c.months_election i.semester, fe vce(oim)
    Code:
    . xtlogit prob_project n_projects_cumlag ln_densidad_pob ln_poblacion l2.ln_indice_desempeno l2.ln_tasa_mort l2.ln_balance ln_regalias_efec_cap c.months_election##c.months_election i.semester, fe vce(oim)
    note: c.months_election#c.months_election omitted because of collinearity
    note: 12.semester omitted because of collinearity
    note: multiple positive outcomes within groups encountered.
    note: 139 groups (622 obs) dropped because of all positive or
          all negative outcomes.
    
    Iteration 0:   log likelihood = -2597.9842  
    Iteration 1:   log likelihood = -2448.5756  
    Iteration 2:   log likelihood = -2437.0854  
    Iteration 3:   log likelihood = -2437.0664  
    Iteration 4:   log likelihood = -2437.0664  
    
    Conditional fixed-effects logistic regression   Number of obs     =      6,652
    Group variable: id_ocad                         Number of groups  =        796
    
                                                    Obs per group:
                                                                  min =          2
                                                                  avg =        8.4
                                                                  max =         10
    
                                                    LR chi2(16)       =    1162.32
    Log likelihood  = -2437.0664                    Prob > chi2       =     0.0000
    
    -----------------------------------------------------------------------------------------------------
                           prob_project |      Coef.   Std. Err.      z    P>|z|     [95% Conf. Interval]
    ------------------------------------+----------------------------------------------------------------
                      n_projects_cumlag |  -.2028613   .0168779   -12.02   0.000    -.2359414   -.1697811
                        ln_densidad_pob |   11.82582   58.98681     0.20   0.841    -103.7862    127.4378
                           ln_poblacion |  -7.168727   58.99419    -0.12   0.903    -122.7952    108.4578
                                        |
                    ln_indice_desempeno |
                                    L2. |   .3498155     .18455     1.90   0.058    -.0118958    .7115267
                                        |
                           ln_tasa_mort |
                                    L2. |  -.0167774   .0661643    -0.25   0.800     -.146457    .1129022
                                        |
                             ln_balance |
                                    L2. |  -3.038936   2.949626    -1.03   0.303    -8.820098    2.742225
                                        |
                   ln_regalias_efec_cap |   .0736869   .0083789     8.79   0.000     .0572645    .0901094
                        months_election |  -.3035517   .0288018   -10.54   0.000    -.3600022   -.2471011
                                        |
    c.months_election#c.months_election |          0  (omitted)
                                        |
                               semester |
                                     4  |  -.5233661   .1549722    -3.38   0.001     -.827106   -.2196262
                                     5  |   -3.47403    .295229   -11.77   0.000    -4.052668   -2.895392
                                     6  |  -4.443102   .4478829    -9.92   0.000    -5.320936   -3.565268
                                     7  |  -6.573392   .6058894   -10.85   0.000    -7.760914   -5.385871
                                     8  |  -7.220639   .7680139    -9.40   0.000    -8.725919    -5.71536
                                     9  |   3.111472    .491074     6.34   0.000     2.148984    4.073959
                                    10  |   2.046095   .3227467     6.34   0.000     1.413523    2.678667
                                    11  |   .6660429   .1767568     3.77   0.000     .3196059     1.01248
                                    12  |          0  (omitted)
    -----------------------------------------------------------------------------------------------------
    However, when I manually generate the quadratic term and include it in the (otherwise) identical regression, the coefficient is reported as statistically significant and non-zero, and a utest confirms the presence of a U-shape, as mentioned above.

    I imagine there is a reason for the different outputs, which may tell me something important about my data and the appropriateness of the model I am running.

    In addition, as I would like to use margins after estimation, I would need to use the # operator to generate the quadratic term if possible.

    Thank you in advance for any suggestions.

    Best regards,

    Theo

    Code:
    xtlogit prob_project n_projects_cumlag ln_densidad_pob ln_poblacion l2.ln_indice_desempeno l2.ln_tasa_mort l2.ln_balance ln_regalias_efec_cap months_election months_election_sq i.semester, fe vce(oim)
    utest months_election months_election_sq, prefix ( prob_project )
    Code:
    . xtlogit prob_project n_projects_cumlag ln_densidad_pob ln_poblacion l2.ln_indice_desempeno l2.ln_tasa_mort l2.ln_balance ln_regalias_efec_cap months_election months_election_sq i.semester, fe vce(oim)
    note: 10.semester omitted because of collinearity
    note: 12.semester omitted because of collinearity
    note: multiple positive outcomes within groups encountered.
    note: 139 groups (622 obs) dropped because of all positive or
          all negative outcomes.
    
    Iteration 0:   log likelihood = -2597.9842  
    Iteration 1:   log likelihood = -2448.5756  
    Iteration 2:   log likelihood = -2437.0854  
    Iteration 3:   log likelihood = -2437.0664  
    Iteration 4:   log likelihood = -2437.0664  
    
    Conditional fixed-effects logistic regression   Number of obs     =      6,652
    Group variable: id_ocad                         Number of groups  =        796
    
                                                    Obs per group:
                                                                  min =          2
                                                                  avg =        8.4
                                                                  max =         10
    
                                                    LR chi2(16)       =    1162.32
    Log likelihood  = -2437.0664                    Prob > chi2       =     0.0000
    
    --------------------------------------------------------------------------------------
            prob_project |      Coef.   Std. Err.      z    P>|z|     [95% Conf. Interval]
    ---------------------+----------------------------------------------------------------
       n_projects_cumlag |  -.2028613   .0168779   -12.02   0.000    -.2359414   -.1697811
         ln_densidad_pob |   11.82582   58.98681     0.20   0.841    -103.7862    127.4378
            ln_poblacion |  -7.168727   58.99419    -0.12   0.903    -122.7952    108.4578
                         |
     ln_indice_desempeno |
                     L2. |   .3498155     .18455     1.90   0.058    -.0118958    .7115267
                         |
            ln_tasa_mort |
                     L2. |  -.0167774   .0661643    -0.25   0.800     -.146457    .1129022
                         |
              ln_balance |
                     L2. |  -3.038936   2.949626    -1.03   0.303    -8.820098    2.742225
                         |
    ln_regalias_efec_cap |   .0736869   .0083789     8.79   0.000     .0572645    .0901094
         months_election |  -1.838123   .2689011    -6.84   0.000    -2.365159   -1.311087
      months_election_sq |    .028418   .0044826     6.34   0.000     .0196323    .0372037
                         |
                semester |
                      4  |  -.5233661   .1549722    -3.38   0.001     -.827106   -.2196262
                      5  |  -5.520125   .5898451    -9.36   0.000      -6.6762    -4.36405
                      6  |  -10.58139   1.377183    -7.68   0.000    -13.28062   -7.882158
                      7  |  -18.84996   2.491531    -7.57   0.000    -23.73327   -13.96665
                      8  |  -27.68159   3.932863    -7.04   0.000    -35.38986   -19.97332
                      9  |  -3.026814   .5554395    -5.45   0.000    -4.115455   -1.938172
                     10  |          0  (omitted)
                     11  |   .6660429   .1767568     3.77   0.000     .3196059     1.01248
                     12  |          0  (omitted)
    --------------------------------------------------------------------------------------
    
    . utest months_election months_election_sq, prefix (prob_project)
    (983 missing values generated)
    (1,996 missing values generated)
    
    Specification: f(x)=x^2
    Extreme point:  32.34084
    
    Test:
         H1: U shape
     vs. H0: Monotone or Inverse U shape
    
    -------------------------------------------------
                     |   Lower bound      Upper bound
    -----------------+-------------------------------
    Interval         |           0               42
    Slope            |   -1.838123          .548988
    t-value          |   -6.835684         5.063425
    P>|t|            |    4.44e-12         2.11e-07
    -------------------------------------------------
    
    Overall test of presence of a U shape:
         t-value =      5.06
         P>|t|   =  2.11e-07
    Code:
    input float(prob_project n_projects_cumlag ln_densidad_pob ln_poblacion ln_indice_desempeno ln_tasa_mort ln_balance ln_regalias_efec_cap months_election months_election_sq semester) long id_ocad
    0   0  3.945458  9.845434   4.21763  2.961141   13.6579   11.47631 42 1764  1     0
    0   0  3.945458  9.845434   4.21763  2.961141   13.6579   11.47631 36 1296  2     0
    0   0 3.9661324  9.865941 4.2298265  2.947067 13.654828  10.629907 30  900  3     0
    0   0 3.9661324  9.865941 4.2298265  2.947067 13.654828  10.629907 24  576  4     0
    1   0 3.9862025  9.886138 4.3641763  3.884652 13.655166   12.31328 18  324  5     0
    0   1 3.9862025  9.886138 4.3641763  3.884652 13.655166   12.31328 12  144  6     0
    0   1 4.0066056  9.906583 4.0745883  1.541159 13.651732  12.624626  6   36  7     0
    0   1 4.0066056  9.906583 4.0745883  1.541159 13.651732  12.624626  0    0  8     0
    0   1  4.027492  9.927351 4.1196294  3.016025 13.637353    11.8977 42 1764  9     0
    0   1  4.027492  9.927351 4.1196294  3.016025 13.637353    11.8977 36 1296 10     0
    0   1  4.047253  9.947169  4.064282         . 13.651488   11.55211 30  900 11     0
    0   1  4.047253  9.947169  4.064282         . 13.651488   11.55211 24  576 12     0
    0   1  4.067316   9.96726         .         .         .          . 18  324 13     0
    0   1  4.067316   9.96726         .         .         .          . 12  144 14     0
    0   0  3.902377 12.139313  3.984617  2.933325  13.65175  11.759857 42 1764  1 60092
    1   0  3.902377 12.139313  3.984617  2.933325  13.65175  11.759857 36 1296  2 60092
    1  17  3.924149  12.16117   4.34484  3.034472 13.619888  11.960607 30  900  3 60092
    1  20  3.924149  12.16117   4.34484  3.034472 13.619888  11.960607 24  576  4 60092
    1  42  3.946038   12.1829  3.397157 2.9343886  13.68308  11.899978 18  324  5 60092
    1  57  3.946038   12.1829  3.397157 2.9343886  13.68308  11.899978 12  144  6 60092
    0  89  3.967458 12.204366 4.2517734  2.933325 13.644894   7.416076  6   36  7 60092
    1  89  3.967458 12.204366 4.2517734  2.933325 13.644894   7.416076  0    0  8 60092
    0  94  3.988799 12.225733 4.3862324 2.8673306 13.640287  11.284286 42 1764  9 60092
    1  94  3.988799 12.225733 4.3862324 2.8673306 13.640287  11.284286 36 1296 10 60092
    0 104 4.0098753  12.24682  4.229876         . 13.642162  11.316903 30  900 11 60092
    1 104 4.0098753  12.24682  4.229876         . 13.642162  11.316903 24  576 12 60092
    1 105 4.0306945   12.2676         .         .         .          . 18  324 13 60092
    1 109 4.0306945   12.2676         .         .         .          . 12  144 14 60092
    0   0  4.325456  9.145802  3.890944 2.3702438  13.65227  12.265366 42 1764  1 60093
    1   0  4.325456  9.145802  3.890944 2.3702438  13.65227  12.265366 36 1296  2 60093
    1   6 4.3317857  9.152076  3.598994 3.3991954 13.653942   13.31222 30  900  3 60093
    1   7 4.3317857  9.152076  3.598994 3.3991954 13.653942   13.31222 24  576  4 60093
    0  11  4.338989  9.159258  4.244644  2.519308 13.654224   12.26798 18  324  5 60093
    1  11  4.338989  9.159258  4.244644  2.519308 13.654224   12.26798 12  144  6 60093
    1  14  4.344195  9.164506  4.115339         . 13.645218  12.139977  6   36  7 60093
    1  17  4.344195  9.164506  4.115339         . 13.645218  12.139977  0    0  8 60093
    0  18  4.351052 9.1713915 4.0765953 3.8811514  13.65453  11.402854 42 1764  9 60093
    1  18  4.351052 9.1713915 4.0765953 3.8811514  13.65453  11.402854 36 1296 10 60093
    0  21 4.3574777  9.177817 4.1624994         . 13.653556   11.16387 30  900 11 60093
    1  21 4.3574777  9.177817 4.1624994         . 13.653556   11.16387 24  576 12 60093
    1  23  4.364372  9.184612         .         .         .          . 18  324 13 60093
    1  24  4.364372  9.184612         .         .         .          . 12  144 14 60093
    0   0  4.148517 10.408164  4.179895  1.978239 13.654896  10.157875 42 1764  1 60094
    1   0  4.148517 10.408164  4.179895  1.978239 13.654896  10.157875 36 1296  2 60094
    0   2   4.14091  10.40053  4.229979 2.0399208 13.653278   11.41471 30  900  3 60094
    1   2   4.14091  10.40053  4.229979 2.0399208 13.653278   11.41471 24  576  4 60094
    0   5  4.133405 10.392926   4.19092  3.098289 13.652154   10.42813 18  324  5 60094
    1   5  4.133405 10.392926   4.19092  3.098289 13.652154   10.42813 12  144  6 60094
    1   6   4.12552  10.38508  4.190453  3.221672 13.651053  10.259785  6   36  7 60094
    1   8   4.12552  10.38508  4.190453  3.221672 13.651053  10.259785  0    0  8 60094
    0  10 4.1174097 10.377016  4.222297 3.3697066 13.654318 -1.7917595 42 1764  9 60094
    0  10 4.1174097 10.377016  4.222297 3.3697066 13.654318 -1.7917595 36 1296 10 60094
    0  10 4.1097255 10.369295  4.272544         .  13.65218   10.17851 30  900 11 60094
    1  10 4.1097255 10.369295  4.272544         .  13.65218   10.17851 24  576 12 60094
    0  11 4.1014857  10.36107         .         .         .          . 18  324 13 60094
    1  11 4.1014857  10.36107         .         .         .          . 12  144 14 60094
    end

  • #2
    Dear Theodore
    The problem you are facing is not that Stata is arbitrarily dropping the interaction. If you look at your output, when the squared term is included manually, you drop two dummies associated with "semester" instead of one.
    This tells me that "semester" and months of the election are perfectly collinear, and adding the squared term manually only masks this problem.
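    To spell out the collinearity, here is a rough sketch. It assumes every panel follows the same election calendar as the four ids in the posted data excerpt, so that months_election takes a single value in each semester. The symbol m_s (the value of months_election in semester s) is notation introduced here and does not appear in the output.

    ```latex
    % Assumption: months_election = m_s whenever semester = s (as in the data excerpt),
    % with semester 3 as the base category in both runs.
    \begin{align*}
    m   &= m_3 + \sum_{s=4}^{12} (m_s - m_3)\,\mathbf{1}[\text{semester}=s], \\
    m^2 &= m_3^2 + \sum_{s=4}^{12} (m_s^2 - m_3^2)\,\mathbf{1}[\text{semester}=s].
    \end{align*}
    % The constants are absorbed by the fixed effects, so m and m^2 add nothing
    % to the span of the nine semester dummies: two of the eleven terms must be dropped.
    % Check with the reported coefficients, semester 12 vs. semester 3 (m = 24 vs. 30):
    \begin{align*}
    \text{\# run:} \quad & -0.3035517\,(24-30) + 0 \approx 1.8213, \\
    \text{manual run:} \quad & -1.838123\,(24-30) + 0.028418\,(24^2-30^2) + 0 \approx 1.8213.
    \end{align*}
    ```

    Under that assumption both runs imply the same semester profile, which is consistent with the identical log likelihood of -2437.0664 in the two outputs. The coefficient on months_election_sq then depends on which dummies happened to be dropped, so it should not be read as evidence of a U-shape on its own.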
    I would suggest reconsidering using "semester" as an explanatory variable, if you are indeed interested in "months of election". Alternatively, try to see why those variables are not linearly independent.
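    One way to check whether the two variables are linearly dependent is sketched below. It uses only the variable names from your post and assumes the data are xtset on id_ocad and semester, as your lag operators imply. Treat it as a starting point, not a definitive diagnosis.

    ```stata
    * Sketch only: run right after the xtlogit command so that e(sample) is set.

    * 1. If each semester row has a single non-empty column, months_election
    *    is a function of semester in the estimation sample.
    tabulate semester months_election if e(sample)

    * 2. Same check on the full data: sort within semester and stop with an
    *    error if the smallest and largest values differ.
    bysort semester (months_election): assert months_election[1] == months_election[_N]

    * Restore the panel sort order so that lag operators keep working.
    sort id_ocad semester

    * 3. If the dependence is exact, this regression reports an R-squared of 1.
    regress months_election i.semester if e(sample)
    ```

    If the assert passes, the semester dummies already absorb any function of months_election, including its square, and the two cannot be separated in the same model.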
    Best
    Fernando
