  • Need help in diagnosing/resolving an issue: i.year##c.Var1 interaction is 'omitted because of collinearity'

    Greetings,

    I'm running Stata 15.1 on OSX. I'm conducting a basic content analysis on New York Times articles (1980-2019) that I pulled from LexisNexis. The variables in this dataset all store the percent of total NYT articles in a given year that mention a specific term or phrase. I'd like to examine whether, over time, the percent of annual articles mentioning term X increasingly predicts the percent of articles that mention term Z. I thus estimated the following OLS model and received the following output:

    Code:
    . regress WSPC_YEAR c.WHITEPC_YEAR##i.year
    note: 2019.year omitted because of collinearity
    note: 1981.year#c.WHITEPC_YEAR omitted because of collinearity
    note: 1982.year#c.WHITEPC_YEAR omitted because of collinearity
    note: 1983.year#c.WHITEPC_YEAR omitted because of collinearity
    note: 1984.year#c.WHITEPC_YEAR omitted because of collinearity
    note: 1985.year#c.WHITEPC_YEAR omitted because of collinearity
    note: 1986.year#c.WHITEPC_YEAR omitted because of collinearity
    note: 1987.year#c.WHITEPC_YEAR omitted because of collinearity
    note: 1988.year#c.WHITEPC_YEAR omitted because of collinearity
    note: 1989.year#c.WHITEPC_YEAR omitted because of collinearity
    note: 1990.year#c.WHITEPC_YEAR omitted because of collinearity
    note: 1991.year#c.WHITEPC_YEAR omitted because of collinearity
    note: 1992.year#c.WHITEPC_YEAR omitted because of collinearity
    note: 1993.year#c.WHITEPC_YEAR omitted because of collinearity
    note: 1994.year#c.WHITEPC_YEAR omitted because of collinearity
    note: 1995.year#c.WHITEPC_YEAR omitted because of collinearity
    note: 1996.year#c.WHITEPC_YEAR omitted because of collinearity
    note: 1997.year#c.WHITEPC_YEAR omitted because of collinearity
    note: 1998.year#c.WHITEPC_YEAR omitted because of collinearity
    note: 1999.year#c.WHITEPC_YEAR omitted because of collinearity
    note: 2000.year#c.WHITEPC_YEAR omitted because of collinearity
    note: 2001.year#c.WHITEPC_YEAR omitted because of collinearity
    note: 2002.year#c.WHITEPC_YEAR omitted because of collinearity
    note: 2003.year#c.WHITEPC_YEAR omitted because of collinearity
    note: 2004.year#c.WHITEPC_YEAR omitted because of collinearity
    note: 2005.year#c.WHITEPC_YEAR omitted because of collinearity
    note: 2006.year#c.WHITEPC_YEAR omitted because of collinearity
    note: 2007.year#c.WHITEPC_YEAR omitted because of collinearity
    note: 2008.year#c.WHITEPC_YEAR omitted because of collinearity
    note: 2009.year#c.WHITEPC_YEAR omitted because of collinearity
    note: 2010.year#c.WHITEPC_YEAR omitted because of collinearity
    note: 2011.year#c.WHITEPC_YEAR omitted because of collinearity
    note: 2012.year#c.WHITEPC_YEAR omitted because of collinearity
    note: 2013.year#c.WHITEPC_YEAR omitted because of collinearity
    note: 2014.year#c.WHITEPC_YEAR omitted because of collinearity
    note: 2015.year#c.WHITEPC_YEAR omitted because of collinearity
    note: 2016.year#c.WHITEPC_YEAR omitted because of collinearity
    note: 2017.year#c.WHITEPC_YEAR omitted because of collinearity
    note: 2018.year#c.WHITEPC_YEAR omitted because of collinearity
    note: 2019.year#c.WHITEPC_YEAR omitted because of collinearity
    
          Source |       SS           df       MS      Number of obs   =   142,231
    -------------+----------------------------------   F(39, 142191)   =         .
           Model |  55505.9797        39  1423.23025   Prob > F        =         .
        Residual |           0   142,191           0   R-squared       =    1.0000
    -------------+----------------------------------   Adj R-squared   =    1.0000
           Total |  55505.9797   142,230  .390255078   Root MSE        =         0
    
    -------------------------------------------------------------------------------------
              WSPC_YEAR |      Coef.   Std. Err.      t    P>|t|     [95% Conf. Interval]
    --------------------+----------------------------------------------------------------
           WHITEPC_YEAR |   3.321562          .        .       .            .           .
                        |
                   year |
                  1981  |   .2738851          .        .       .            .           .
                  1982  |   .5414511          .        .       .            .           .
                  1983  |   .3368037          .        .       .            .           .
                  1984  |   .3480026          .        .       .            .           .
                  1985  |  -.1165383          .        .       .            .           .
                  1986  |   .0544986          .        .       .            .           .
                  1987  |  -.1245103          .        .       .            .           .
                  1988  |   -.289388          .        .       .            .           .
                  1989  |  -.5340557          .        .       .            .           .
                  1990  |  -.8089314          .        .       .            .           .
                  1991  |  -.3684183          .        .       .            .           .
                  1992  |  -.5237221          .        .       .            .           .
                  1993  |   .0391256          .        .       .            .           .
                  1994  |  -.1221813          .        .       .            .           .
                  1995  |   .4550614          .        .       .            .           .
                  1996  |   .7422796          .        .       .            .           .
                  1997  |   1.033317          .        .       .            .           .
                  1998  |   .8154973          .        .       .            .           .
                  1999  |   1.009021          .        .       .            .           .
                  2000  |   .5087013          .        .       .            .           .
                  2001  |   .9019873          .        .       .            .           .
                  2002  |   1.017792          .        .       .            .           .
                  2003  |   1.065856          .        .       .            .           .
                  2004  |   1.180784          .        .       .            .           .
                  2005  |   1.084137          .        .       .            .           .
                  2006  |    1.09677          .        .       .            .           .
                  2007  |   1.118167          .        .       .            .           .
                  2008  |   .5535289          .        .       .            .           .
                  2009  |   1.267848          .        .       .            .           .
                  2010  |   1.119098          .        .       .            .           .
                  2011  |   1.154076          .        .       .            .           .
                  2012  |   1.070433          .        .       .            .           .
                  2013  |   1.076881          .        .       .            .           .
                  2014  |   .6158266          .        .       .            .           .
                  2015  |   .7174238          .        .       .            .           .
                  2016  |  -.8348896          .        .       .            .           .
                  2017  |   .4079323          .        .       .            .           .
                  2018  |  -.5526747          .        .       .            .           .
                  2019  |          0  (omitted)
                        |
    year#c.WHITEPC_YEAR |
                  1981  |          0  (omitted)
                  1982  |          0  (omitted)
                  1983  |          0  (omitted)
                  1984  |          0  (omitted)
                  1985  |          0  (omitted)
                  1986  |          0  (omitted)
                  1987  |          0  (omitted)
                  1988  |          0  (omitted)
                  1989  |          0  (omitted)
                  1990  |          0  (omitted)
                  1991  |          0  (omitted)
                  1992  |          0  (omitted)
                  1993  |          0  (omitted)
                  1994  |          0  (omitted)
                  1995  |          0  (omitted)
                  1996  |          0  (omitted)
                  1997  |          0  (omitted)
                  1998  |          0  (omitted)
                  1999  |          0  (omitted)
                  2000  |          0  (omitted)
                  2001  |          0  (omitted)
                  2002  |          0  (omitted)
                  2003  |          0  (omitted)
                  2004  |          0  (omitted)
                  2005  |          0  (omitted)
                  2006  |          0  (omitted)
                  2007  |          0  (omitted)
                  2008  |          0  (omitted)
                  2009  |          0  (omitted)
                  2010  |          0  (omitted)
                  2011  |          0  (omitted)
                  2012  |          0  (omitted)
                  2013  |          0  (omitted)
                  2014  |          0  (omitted)
                  2015  |          0  (omitted)
                  2016  |          0  (omitted)
                  2017  |          0  (omitted)
                  2018  |          0  (omitted)
                  2019  |          0  (omitted)
                        |
                  _cons |  -2.465472          .        .       .            .           .
    -------------------------------------------------------------------------------------
    As shown, all the estimates for the interaction term were omitted. What's going on here and how do I resolve it?

    Here is some sample data:

    Code:
    * Example generated by -dataex-. To install: ssc install dataex
    clear
    input float(WSPC_YEAR WHITEPC_YEAR year)
     .9925123 1.0393106 1980
     .8585935   .889102 1981
     .8551458  .7576681 1982
     .8682891  .8493891 1983
     .7799819  .8421696 1984
     .8046948 1.0830024 1985
     .9286741  .9659038 1986
     .7952349  1.059963 1987
      .893328 1.0757904 1988
     .9505889 1.1528882 1989
    1.0103576  1.247764 1990
     .8976109 1.1285552 1991
     .9410325  1.204068 1992
      .892553 1.0299592 1993
    1.0238991   1.04965 1994
     .9579538  .9249209 1995
     .9319496  .7980101 1996
     .9725213  .7164902 1997
    1.1973426  .7693729 1998
     .9383616  .7384797 1999
     .9658987  .8353717 2000
     .8934433  .7107584 2001
     .8972512   .664284 2002
     .9895188   .635982 2003
    1.1008787  .5841192 2004
    1.0499339  .6417411 2005
    1.0944157  .6180353 2006
    1.1638317   .620284 2007
    1.0603167  .8075929 2008
    1.0753922  .6335288 2009
    1.0665824  .6547699 2010
     1.241938   .639901 2011
     1.278908  .6912675 2012
    1.4445844  .7267002 2013
    1.4135656  .8899446 2014
    1.4643455  .9122679 2015
      1.49988  1.475882 2016
    2.0604258  1.404532 2017
    2.2694461 1.6083313 2018
     2.596925   1.61524 2019
            .         .    .
    end
    Thanks in advance for your help!


  • #2
    In your sample, you have only one value of WHITEPC_YEAR for each year. With a dummy for each year, the dummies fully absorb the values of WHITEPC_YEAR, so the interaction simply cannot be estimated. Also, since you have only one value per year and you are already putting in one parameter for each year, you are trying to estimate far more parameters than you have observations, which is impossible.
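
    A quick way to check this diagnosis is sketched below. It uses three yearly rows copied from the -dataex- listing in #1 and repeats each one four times with -expand-. The repetition is an assumption meant to mimic a file in which every row carries its year's value, and varies is a hypothetical helper variable.

    ```stata
    * Three yearly rows copied from the -dataex- example in #1
    clear
    input float(WSPC_YEAR WHITEPC_YEAR year)
    .9925123 1.0393106 1980
    .8585935  .889102  1981
    .8551458  .7576681 1982
    end
    * Assumption: each yearly value is repeated on several rows
    expand 4
    * Flag years in which the predictor takes more than one value
    bysort year (WHITEPC_YEAR): gen byte varies = WHITEPC_YEAR[1] != WHITEPC_YEAR[_N]
    tabulate varies
    * With no within-year variation, Stata has to drop the collinear terms
    regress WSPC_YEAR c.WHITEPC_YEAR##i.year
    ```

    If varies is 0 everywhere, the year dummies already reproduce WHITEPC_YEAR exactly, so its effect and its interaction with year cannot be separated from the year effects.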



    • #3
      Hey Phil,

      Thanks for your reply. I have monthly (i.e. percent of all monthly articles mentioning term X) variables as well. Would you suggest regressing a yearly frequency variable on an interaction between a monthly frequency variable and year (the time variable)? That way each 'year' mean would consist of 12 values....or would that not work either?

      -Zach



      • #4
        Zach:
        as an aside to Phil's helpful reply, I see two main issues here:
        - you do not have enough predictors to perform an OLS;
        - if you collected more predictors, since you have a long T dimension in your data (you do not tell us whether you also have a -panelid-) you should consider a panel data regression model first.
        Alternatively, if you have a T dimension only, perhaps you can challenge yourself and your data with a time-series analysis.
        Kind regards,
        Carlo
        (StataNow 18.5)



        • #5
          Hey Carlo,

          There's no panel-id, just the time variables (year, month, and year+month). Again, my goal is to assess whether temporal shifts in the percent of all (monthly or yearly) articles mentioning term A predict corresponding shifts in the percent mentioning term B. In my dataset, each term corresponds to two variables: one that stores its annual frequency and another that stores its monthly frequency (e.g. percent of all NYT articles in July that mention 'abortion').

          Here are some more sample data to illustrate (note that the stub 'YM' = the monthly frequency variables):

          Code:
          * Example generated by -dataex-. To install: ssc install dataex
          clear
          input float(INEQPC_YEAR INEQPC_YM WHITEPC_YEAR WHITEPC_YM year yearmonth)
          .29833865   .3821461 1.0393106  .9782941 1980 245
          .29833865  .56570226 1.0393106 1.0182642 1980 246
          .29833865  .28831562 1.0393106  1.350531 1980 247
          .29833865  .27386007 1.0393106  .9585102 1980 248
          .29833865  .24723333 1.0393106 1.0242524 1980 249
          .29833865  .24363503 1.0393106  .9623584 1980 250
          .29833865  .16335763 1.0393106 1.0178436 1980 251
           .2669485  .27829313   .889102  .7288629 1981 252
           .2669485  .29299736   .889102  .9668913 1981 253
           .2669485  .19664395   .889102  .7865758 1981 254
           .2669485  .28905714   .889102 1.0048176 1981 255
           .2669485   .3062991   .889102  .8389932 1981 256
           .2669485  .24402775   .889102  .8990495 1981 257
           .2669485   .2908514   .889102  .8857747 1981 258
           .2669485  .25697032   .889102  .6809713 1981 259
           .2669485   .2388535   .889102  1.220807 1981 260
           .2669485  .20603563   .889102  .8120227 1981 261
           .2669485  .30156815   .889102  .9650181 1981 262
           .2669485   .3079292   .889102  .8981268 1981 263
          .22375576   .2217036  .7576681   .630105 1982 264
          .22375576  .27061674  .7576681 1.0254949 1982 265
          .22375576   .2243632  .7576681   .725881 1982 266
          .22375576  .21307763  .7576681   .679185 1982 267
          .22375576   .2148635  .7576681  .8341759 1982 268
          .22375576   .3187251  .7576681  .7569721 1982 269
          .22375576   .1626898  .7576681  .7456616 1982 270
          .22375576    .184259  .7576681  .6449066 1982 271
          .22375576   .2225313  .7576681  .8484006 1982 272
          .22375576  .28528902  .7576681  .6946167 1982 273
          .22375576  .20120724  .7576681   .657277 1982 274
          .22375576  .13609146  .7576681  .8029396 1982 275
           .1978943   .3261756  .8493891 1.0057081 1983 276
           .1978943  .12879221  .8493891  .7870635 1983 277
           .1978943  .22164276  .8493891  .7692308 1983 278
           .1978943   .2111654  .8493891  .9502442 1983 279
           .1978943  .19118023  .8493891 1.0578638 1983 280
           .1978943  .25327143  .8493891  .6753905 1983 281
           .1978943   .1579571  .8493891   .868764 1983 282
           .1978943   .2443992  .8493891 1.0047523 1983 283
           .1978943  .19528526  .8493891   .781141 1983 284
           .1978943  .17352504  .8493891  .7436787 1983 285
           .1978943   .1408631  .8493891  .7555385 1983 286
           .1978943  .13526309  .8493891  .7845259 1983 287
          .20869784   .2079813  .8421696  .7799298 1984 288
          .20869784  .27361563  .8421696  .7166124 1984 289
          .20869784  .26737967  .8421696  .7002801 1984 290
          .20869784  .15084852  .8421696  .9302325 1984 291
          .20869784    .197409  .8421696  .8143122 1984 292
          .20869784   .2251954  .8421696  .9670155 1984 293
          .20869784    .186753  .8421696  .9088646 1984 294
          .20869784   .1227747  .8421696  .9330878 1984 295
          .20869784  .14199045  .8421696  .8777592 1984 296
          .20869784  .25912836  .8421696  .8951707 1984 297
          .20869784   .2233805  .8421696  .7942418 1984 298
          .20869784   .2509576  .8421696  .7792894 1984 299
           .2157666 .013227513 1.0830024  .9259259 1985 300
           .2157666   .1894452 1.0830024 1.0960758 1985 301
           .2157666  .12122682 1.0830024  .9455692 1985 302
           .2157666   .2638191 1.0830024 1.0427136 1985 303
           .2157666  .33141035 1.0830024  .8960353 1985 304
           .2157666  .25843132 1.0830024  .8269803 1985 305
           .2157666  .21069266 1.0830024 1.0534633 1985 306
           .2157666  .27830487 1.0830024  1.366224 1985 307
           .2157666   .2157634 1.0830024 1.3961163 1985 308
           .2157666  .17076503 1.0830024 1.1498178 1985 309
           .2157666   .2974774 1.0830024 1.1066159 1985 310
           .2157666   .2282008 1.0830024 1.1770358 1985 311
          .17994353   .1902105  .9659038  .9637331 1986 312
          .17994353  .23952097  .9659038    .85163 1986 313
          .17994353  .17163172  .9659038  .6987863 1986 314
          .17994353   .1736973  .9659038  .8312654 1986 315
          .17994353   .1955273  .9659038 1.0020775 1986 316
          .17994353   .1265983  .9659038 1.2153437 1986 317
          .17994353  .15384616  .9659038 1.1923077 1986 318
          .17994353   .1093826  .9659038  .9722898 1986 319
          .17994353  .15471894  .9659038  .9669933 1986 320
          .17994353  .12547052  .9659038  .9125128 1986 321
          .17994353   .3046994  .9659038  .7617486 1986 322
          .17994353  .21483634  .9659038 1.2511058 1986 323
          .18278848  .11169025  1.059963 1.5512534 1987 324
          .18278848  .14571466  1.059963  .9935091 1987 325
          .18278848   .1846381  1.059963 1.1324471 1987 326
          .18278848    .222855  1.059963  .9285626 1987 327
          .18278848  .21364984  1.059963 1.3175074 1987 328
          .18278848   .1626016  1.059963 1.0704607 1987 329
          .18278848  .23495626  1.059963 .57433754 1987 330
          .18278848   .1140829  1.059963  .8619597 1987 331
          .18278848  .15963815  1.059963 1.0642543 1987 332
          .18278848  .15424775  1.059963  .8780256 1987 333
          .18278848  .17797816  1.059963 1.0204082 1987 334
          .18278848   .3134796  1.059963  1.306165 1987 335
           .1887907  .21348737 1.0757904 1.3688308 1988 336
           .1887907  .12856776 1.0757904 1.0671124 1988 337
           .1887907  .12383901 1.0757904 1.3374612 1988 338
           .1887907  .19987507 1.0757904  1.236727 1988 339
           .1887907   .2537151 1.0757904  .7128186 1988 340
           .1887907    .243079 1.0757904  .9588116 1988 341
           .1887907  .15416238 1.0757904  .9763618 1988 342
           .1887907   .2166709 1.0757904 1.0196279 1988 343
           .1887907  .09070882 1.0757904 1.0366724 1988 344
          end
          format %tm yearmonth
          So what should or can I do to get at my research question (which is whether shifts in the frequency of one term predict shifts in the frequency of another)?

          Thanks again!
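
          One possible way to put this question into a model is sketched here under stated assumptions, not offered as the recommended analysis: use the monthly (YM) variables on both sides, so that the predictor varies within each year. The rows are the 1981 and 1982 months copied from the listing above. With only 12 points per year the year-specific slopes are very imprecise, and serial correlation is ignored.

          ```stata
          * 1981 and 1982 monthly rows copied from the -dataex- listing above
          clear
          input float(INEQPC_YM WHITEPC_YM year yearmonth)
          .27829313  .7288629 1981 252
          .29299736  .9668913 1981 253
          .19664395  .7865758 1981 254
          .28905714 1.0048176 1981 255
           .3062991  .8389932 1981 256
          .24402775  .8990495 1981 257
           .2908514  .8857747 1981 258
          .25697032  .6809713 1981 259
           .2388535  1.220807 1981 260
          .20603563  .8120227 1981 261
          .30156815  .9650181 1981 262
           .3079292  .8981268 1981 263
           .2217036   .630105 1982 264
          .27061674 1.0254949 1982 265
           .2243632   .725881 1982 266
          .21307763   .679185 1982 267
           .2148635  .8341759 1982 268
           .3187251  .7569721 1982 269
           .1626898  .7456616 1982 270
            .184259  .6449066 1982 271
           .2225313  .8484006 1982 272
          .28528902  .6946167 1982 273
          .20120724   .657277 1982 274
          .13609146  .8029396 1982 275
          end
          format %tm yearmonth
          * Monthly outcome on monthly predictor, with a separate slope for each year
          regress INEQPC_YM c.WHITEPC_YM##i.year
          * Slope of WHITEPC_YM within each year
          margins year, dydx(WHITEPC_YM)
          ```

          Note that a yearly outcome such as INEQPC_YEAR is constant within each year, so with i.year in the model the year dummies alone would fit it exactly, whatever the monthly predictor is.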



          • #6
            Zach:
            I would make things as simple(r) as:
            Code:
            . ktau INEQPC_YEAR WHITEPC_YEAR, stats(taua taub score se obs p)
            
              Number of obs =     100
            Kendall's tau-a =      -0.1291
            Kendall's tau-b =      -0.1442
            Kendall's score =    -639
                SE of score =     331.007   (corrected for ties)
            
            Test of Ho: INEQPC_YEAR and WHITEPC_YEAR are independent
                 Prob > |z| =       0.0539  (continuity corrected)
            
            .
            Kind regards,
            Carlo
            (StataNow 18.5)
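
            One caveat when running -ktau- on the _YEAR variables in the monthly layout: each yearly value is repeated on every monthly row, so the 100 observations reported above count months rather than distinct yearly pairs. A minimal sketch of reducing the data to one row per year first follows. The nine yearly pairs are copied from the listing in #5, and the -expand 12- step is an assumption used only to recreate the repeated layout.

            ```stata
            * Nine yearly pairs copied from the -dataex- listing in #5
            clear
            input float(INEQPC_YEAR WHITEPC_YEAR year)
            .29833865 1.0393106 1980
             .2669485   .889102 1981
            .22375576  .7576681 1982
             .1978943  .8493891 1983
            .20869784  .8421696 1984
             .2157666 1.0830024 1985
            .17994353  .9659038 1986
            .18278848  1.059963 1987
             .1887907 1.0757904 1988
            end
            * Assumption: recreate the monthly layout by repeating each yearly row
            expand 12
            * Heavily tied: the observation count reflects rows, not years
            ktau INEQPC_YEAR WHITEPC_YEAR
            * Keep a single row per year, then rerun
            bysort year: keep if _n == 1
            ktau INEQPC_YEAR WHITEPC_YEAR, stats(taua taub obs p)
            ```

            The second call works on one pair per year, which is the sample size the yearly variables actually support.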



            • #7
              Carlo:
              In what way does using ktau simplify the analysis (and allow me to get at my research question)? I'm not merely interested in observing overall correlations. I'd also like to examine variation over time in the strength of the correlations.

              -Zach
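
              One way the idea of over-time variation in the correlation could be sketched, assuming monthly series are available: compute the correlation separately within each year using -statsby-. The data below are simulated purely for illustration. The names x_ym and y_ym, the seed, and the 0.5 coefficient are all hypothetical and are not taken from the thread's dataset.

              ```stata
              * Simulated monthly data for three years (illustrative only)
              clear
              set seed 12345
              set obs 36
              gen year = 1980 + floor((_n - 1) / 12)
              gen yearmonth = ym(1980, 1) + _n - 1
              format %tm yearmonth
              gen double x_ym = rnormal()
              gen double y_ym = 0.5 * x_ym + rnormal()
              * One correlation per year; -clear- replaces the data with the results
              statsby rho=r(rho) n=r(N), by(year) clear: correlate y_ym x_ym
              list year rho n
              ```

              With 12 months per year each correlation rests on very few points, so a longer window (for example via -rolling- after -tsset yearmonth-) may give steadier estimates.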



              • #8
                Zach:
                I do not think that, given your data, regression will take you any further.
                That's why I proposed a correlation.
                Kind regards,
                Carlo
                (StataNow 18.5)
