Hello everyone,
I am running a regression to analyze the impact of election years and landslide elections on disclosure size, which is measured as the log of website size. My model includes an interaction term between electionyear and landslide, but this interaction term is omitted due to collinearity. Here is a data sample:
(Note on variables - ein is a unique identifier for each organization, logdisclsize is the log of website size, org is the type of organization, electionyear is a binary indicator for whether there was an election, landslide is a binary indicator for whether there was a landslide election, size and the winsorized variables are controls)
Here is one of the regressions I run, and the output:
I understand why the interaction is omitted due to collinearity: every landslide is also an electionyear, so the product of the two indicators is identical to landslide itself. Only 5.94% of observations are landslides (7% for this subset of REPR organizations). However, not all electionyears are landslides. How can I interpret these main effects, given that I ideally would have wanted the coefficient on the interaction term?
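To make the collinearity concrete, here is a minimal sketch of how the nesting could be checked in Stata. It assumes the variable names from my data sample; inter is a hypothetical helper variable that does not exist in my dataset.

```stata
* Cross-tabulate the two indicators for the REPR subsample;
* the cell electionyear == 0 & landslide == 1 should be empty
tabulate electionyear landslide if org == "REPR"

* Every landslide observation should fall in an election year
assert electionyear == 1 if landslide == 1 & org == "REPR"

* inter is a hypothetical helper: the product of the two dummies
generate byte inter = electionyear * landslide

* Under the nesting above, the product equals landslide itself,
* which is why 1.electionyear#1.landslide cannot be estimated separately
assert inter == landslide if org == "REPR"
```

If both assert commands pass silently, the interaction carries no information beyond the landslide indicator in this subsample.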
Moreover, does anybody have advice on presenting this table in a paper, or on testing this differently? I am told that it is best practice in academic work to run the regression with the interaction rather than just the main effects, but I have never seen a table presented with the interaction omitted, as it is here. Thank you so much!
Code:
* Example generated by -dataex-. For more info, type help dataex
clear
input float(logdisclsize electionyear landslide size wins_ExecComp wins_leverage wins_ContribReliance) long ein float year str16 state str4 org
10.84887 0 0 16.34981 .06247739 .01618405 .7696673 10248780 2011 "ME" "ENV"
10.694057 0 0 16.353073 .06970697 .011605377 .6577324 10248780 2013 "ME" "ENV"
10.795568 1 0 16.415674 .08384993 .06066541 .7097442 10248780 2014 "ME" "ENV"
10.870756 0 0 16.435535 .07682364 .05358697 .59489995 10248780 2015 "ME" "ENV"
11.00408 0 0 16.493631 .10753528 .03700265 .7087289 10248780 2016 "ME" "ENV"
10.983002 1 0 16.5836 .07915805 .037205808 .51900214 10248780 2018 "ME" "ENV"
11.102217 0 0 16.619827 .073426425 .031817265 .6083745 10248780 2019 "ME" "ENV"
10.12475 0 0 15.689817 .09311032 .02294062 .8391001 10270690 2013 "ME" "ENV"
11.652548 0 0 15.76943 .09187462 .025827337 .8749067 10270690 2015 "ME" "ENV"
11.8213 0 0 15.754642 .09122185 .035800748 .8071181 10270690 2016 "ME" "ENV"
12.106854 0 0 15.84974 .08144714 .03633184 .9579841 10270690 2017 "ME" "ENV"
12.061775 1 0 16.01647 .07850363 .031095315 .9270397 10270690 2018 "ME" "ENV"
12.056853 0 0 16.306654 .073099725 .02553698 .931366 10270690 2019 "ME" "ENV"
10.518646 0 0 15.52521 .031852446 .08306593 .7954195 10317679 2012 "ME" "REPR"
10.63246 0 0 15.641062 .03481711 .1558381 .7699555 10317679 2013 "ME" "REPR"
10.839287 1 0 15.705325 .024914693 .180689 .6935283 10317679 2014 "ME" "REPR"
10.85532 0 0 15.618464 .02366554 .1922071 .7194572 10317679 2015 "ME" "REPR"
10.849357 0 0 15.548765 .02367953 .19987574 .7202281 10317679 2016 "ME" "REPR"
10.888838 0 0 15.575302 .033151954 .18535903 .7151666 10317679 2017 "ME" "REPR"
11.097547 1 0 15.578058 .033256307 .1112484 .7204551 10317679 2018 "ME" "REPR"
end
Code:
reghdfe logdisclsize electionyear##landslide size wins_ExecComp wins_leverage wins_ContribReliance if org == "REPR", absorb (ein year) cluster(state)
(dropped 18 singleton observations)
(MWFE estimator converged in 6 iterations)
note: 1.electionyear#1.landslide omitted because of collinearity

HDFE Linear regression                            Number of obs   =      1,323
Absorbing 2 HDFE groups                           F(   6,     46) =       0.99
Statistics robust to heteroskedasticity           Prob > F        =     0.4457
                                                  R-squared       =     0.9897
                                                  Adj R-squared   =     0.9874
                                                  Within R-sq.    =     0.0048
Number of clusters (state)   =         47         Root MSE        =     0.5070

                                 (Std. err. adjusted for 47 clusters in state)
------------------------------------------------------------------------------------
                   |               Robust
      logdisclsize | Coefficient  std. err.      t    P>|t|     [95% conf. interval]
-------------------+----------------------------------------------------------------
    1.electionyear |   .0775608   .0513554     1.51   0.138    -.0258123    .1809339
       1.landslide |   .0218505   .0929655     0.24   0.815    -.1652793    .2089803
                   |
      electionyear#|
         landslide |
              0 1  |          0  (empty)
              1 1  |          0  (omitted)
                   |
              size |   .0347288   .0279295     1.24   0.220    -.0214904     .090948
     wins_ExecComp |  -.0763669   .2829463    -0.27   0.788    -.6459082    .4931745
     wins_leverage |  -.0495822   .1025383    -0.48   0.631     -.255981    .1568167
wins_ContribReli~e |  -.0879667   .1880076    -0.47   0.642    -.4664063    .2904729
             _cons |   7.392996   .3775152    19.58   0.000     6.633097    8.152895
------------------------------------------------------------------------------------

Absorbed degrees of freedom:
-----------------------------------------------------+
 Absorbed FE | Categories  - Redundant  = Num. Coefs |
-------------+---------------------------------------|
         ein |       234           0         234     |
        year |         9           1           8     |
-----------------------------------------------------+
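As one possible way of testing this differently, here is a sketch that assumes landslide really is nested within electionyear. It drops the redundant interaction, keeps both indicators, and uses lincom to compare a landslide election year with a non-election year. I have not run this version, and the interpretation in the comments is my assumption about how the coefficients should be read.

```stata
* Same specification as above, without the interaction that gets omitted anyway
reghdfe logdisclsize i.electionyear i.landslide size wins_ExecComp ///
    wins_leverage wins_ContribReliance if org == "REPR", ///
    absorb(ein year) cluster(state)

* Assumed reading, given that landslide == 1 implies electionyear == 1:
*   1.electionyear : non-landslide election year vs. non-election year
*   1.landslide    : landslide election vs. non-landslide election
* Their sum compares a landslide election year with a non-election year
lincom 1.electionyear + 1.landslide
```

Under this reading, the coefficient on 1.landslide would already play the role I wanted the interaction to play, and the lincom line would give the total difference with its own standard error.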