  • Interacting firm fixed effects for numerous firms with a continuous variable - Execution and interpretation

    Dear all,

    I am attempting to replicate a series of OLS regressions with respect to the effect of CEO overconfidence on net debt issues for my master's thesis and am encountering several issues while doing so. The full OLS regression in the original paper looks as follows:

    D = β1 + β2*FD + β3*OC + β4*CV + β5*(FD x OC) + β6*(FD x CV) + ε

    where D = net debt issued, FD = financing deficit, OC = binary overconfidence measure, and CV = set of control variables at both the CEO and firm level.

    As you can see, all control variables are included both as level effects and interacted with FD. The coefficient of interest is the interaction term between FD and OC, as I want to know whether overconfidence affects the relative amount of debt issued when a financing deficit occurs in a firm. The authors further add year fixed effects to control for hot equity issuance markets, firm fixed effects to separate effects attributed to overconfident CEOs from time-invariant firm effects and cluster standard errors at firm level. Lastly, they add the interactions of firm fixed effects with FD. I understand that this is done in order to estimate separate intercepts and slopes for each individual firm. The test is supposed to identify the impact of overconfidence on the proportion of the financing deficit covered with debt using only variation that is not confounded by firm-specific effects.

    I have replicated this regression in Stata with the following code:

    Code:
    xtset firmid year
    xtreg ndissn i.longholder##c.fidefn c.ceostock##c.fidefn c.ceovest##c.fidefn c.profch##c.fidefn c.tangch##c.fidefn c.qch##c.fidefn c.logsalech##c.fidefn c.booklev##c.fidefn i.year i.firmid##c.fidefn,fe vce(cluster firmid)
    where ndissn = net debt issued, fidefn = financing deficit, longholder = overconfidence measure, i.year = year fixed effects, i.firmid = firm fixed effects, and all other variables are control variables.


    This is where my issues start. Firstly, in the text the authors mention that when they add the interaction between firm fixed effects and FD, they drop the level effect of FD to avoid collinearity. In their table of results (Table V., attached), the level effect of FD is indeed not reported. However, as far as I understand, dropping the level effect of FD changes the interpretation of the interaction terms of FD and other variables. As FD is interacted with almost every variable in this regression, I do not quite understand what that implies for the interpretation of the results. In my current code, the level effect is included automatically due to the ## factor variable I use and I am somewhat reluctant to use the # factor variable to eliminate the level effect without understanding what it does to my results.

    Secondly, interacting firm fixed effects with FD appears to be computationally heavy for Stata. The authors have a dataset of 2385 observations with 263 firms, which is still somewhat feasible, but my dataset contains 7960 observations with 1418 firms. It therefore takes a long time (1 hour+) for the code to run, and I have to run it 5 times (adding a control variable/effect every time). As I want to experiment with various changes to my dataset, I would like to know whether there is a way to speed this process up. Perhaps some form of interacting a continuous variable with the xtreg, fe command? I read that for interactions in fixed effects regressions, demeaning the product term is a possibility, but that the standard errors calculated would not be accurate. Is this an option and, if so, how would I go about doing that in Stata?

    Excuse the lengthy post, but I figured that eliciting answers without adequate background information would prove difficult. I look forward to your insights.

    Kind regards,

    Marc


    Source paper

    Malmendier, U., Tate, G. & Yan, J. (2011). Overconfidence and early-life experiences: the effect of managerial traits on corporate finance policies. Journal of Finance, 66(5), 1687-1733.

    The analysis of concern spans from page 1711 to page 1714 and starts at the heading "Specification 2: Financing Deficit".

    The authors' regression results are summarized in Table V. Debt vs Equity (II): Financing Deficit (attached as .png).
    Last edited by Marc Dierick; 26 Apr 2019, 05:20.

  • #2
    First, some unsolicited advice. You can simplify your code by taking advantage of factor-variable notation's algebraic properties:
    Code:
    xtset firmid year
    xtreg ndissn (i.longholder c.ceostock c.ceovest c.profch c.tangch c.qch c.logsalech c.booklev i.firmid)##c.fidefn i.year,fe vce(cluster firmid)


    Firstly, in the text the authors mention that when they add the interaction between firm fixed effects and FD, they drop the level effect of FD to avoid collinearity. In their table of results (Table V., attached), the level effect of FD is indeed not reported. However, as far as I understand, dropping the level effect of FD changes the interpretation of the interaction terms of FD and other variables. As FD is interacted with almost every variable in this regression, I do not quite understand what that implies for the interpretation of the results. In my current code, the level effect is included automatically due to the ## factor variable I use and I am somewhat reluctant to use the # factor variable to eliminate the level effect without understanding what it does to my results.
    If FD is a time-invariant property of the firm, then there is no choice but to omit it from the model: it will be collinear with the firm fixed effects, and the level effects of FD are unidentifiable in a fixed effects model. Leave your ## notation as it is. If your analogous variable is, in fact, also time-invariant within firms, Stata will automatically omit it for you. By the way, the omission of this level effect does not change the interpretation of the other model parameters in this circumstance. In general, it does, but not when the variable is omitted due to collinearity. The omission due to collinearity implies that the information contained in the variable is already accounted for in the model by the other variable(s) with which it is collinear, so there is no problem.
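
    Whether FD is time-invariant within firms can be checked directly before deciding anything about the level effect. A minimal sketch, assuming the variable names from the original post (firmid, year, fidefn); fd_sd is a hypothetical helper name introduced only for this illustration:

    ```stata
    * Declare the panel as in the original post
    xtset firmid year

    * xtsum splits the variation in fidefn into between-firm and within-firm parts.
    * A within standard deviation of zero would mean fidefn never changes inside a firm.
    xtsum fidefn

    * Alternative check: standard deviation of fidefn computed firm by firm
    * (fd_sd is a hypothetical name; it is missing for firms with a single observation)
    bysort firmid: egen double fd_sd = sd(fidefn)
    summarize fd_sd
    ```

    If the within-firm standard deviation is positive for most firms, the variable does vary over time, and time-invariance cannot be the reason for any omission.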

    As to your second question, I don't know the answer. I can certainly see why this would be computationally intensive: you are hugely increasing the dimensionality of the matrix that needs to be inverted, and the time required to invert a matrix goes up by, if I remember correctly, the cube of the dimensionality. Have you tried using -reghdfe- (by Sergio Correia, available from SSC) instead of -xtreg, fe-? That might be faster; I don't know. An approach that would definitely be much faster, but involves a different kind of model, is to use random rather than fixed firm effects (and, correspondingly, random slopes) and run this in -mixed-.

    That said, my sympathy for having to endure 1 hour runs is limited. In my workflow, runs that take a couple of days are not unusual. My best counsel is patience, caffeine, and something else to occupy your time (a good book, Statalist, whatever).
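
    For reference, the -reghdfe- suggestion above could be tried along the following lines. This is a sketch under stated assumptions, not a tested replication: it assumes the variable names from the original post, that longholder is coded 0/1, and that the command is run from a do-file (the /// line continuation does not work interactively). It relies on -reghdfe- accepting heterogeneous slopes inside absorb(); because the firm-specific slopes on fidefn are absorbed, only the interactions (#) with fidefn are listed as regressors, not fidefn itself.

    ```stata
    * One-time installation from SSC; reghdfe relies on the ftools package
    ssc install ftools
    ssc install reghdfe

    * Firm intercepts, firm-specific slopes on fidefn, and year effects are absorbed
    * instead of being estimated as explicit dummy variables.
    * Level effects of the regressors come first, then their interactions with fidefn.
    reghdfe ndissn i.longholder c.ceostock c.ceovest c.profch c.tangch c.qch c.logsalech c.booklev ///
        1.longholder#c.fidefn ///
        (c.ceostock c.ceovest c.profch c.tangch c.qch c.logsalech c.booklev)#c.fidefn, ///
        absorb(i.firmid##c.fidefn i.year) vce(cluster firmid)
    ```

    It may be worth comparing the coefficients with an -xtreg, fe- run on a small subsample of firms to confirm that both commands estimate the same model.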


    • #3
      Dear Clyde,

      Thank you for the quick response. I did not know I could simplify my code in this manner - thank you for mentioning it.

      With regard to the first question, Stata actually does not automatically omit the level effects of FD in this regression, because it is not a time-invariant property of each firm. The financing deficit (FD) is different for each year for every firm. In this case, forcibly omitting the level effects would change the interpretation of the coefficients, as it is not omitted due to collinearity. This is exactly why it confuses me that the authors do appear to omit the level effects.

      As to my second question, I will try the -reghdfe- command and report back whether that decreased the computation time. I don't think I can use a random effects model here without affecting the economic meaning of what I'm trying to find, but thank you for the suggestion.

      If nothing works, I will simply accept the waiting time and keep in mind that some models do indeed take much longer to run than mine.



      • #4
        Update:

        The -reghdfe- command worked like a charm! I used the following code and had my results in mere seconds:

        Code:
        reghdfe ndissn (i.longholder c.ceostock c.ceovest c.profch c.tangch c.qch c.logsalech c.booklev i.firmid)##c.fidefn, absorb(i.year i.firmid##c.fidefn) vce(cluster firmid)
        Moreover, this command tells me that FD is indeed probably collinear with the fixed effects, as you hypothesized, Clyde. Intriguing, as it is economically not a time-invariant property of firms, but apparently it is still statistically collinear with time-invariant firm properties.
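
        A possible explanation, offered as a sketch and not as a reading of the actual -reghdfe- output (which is not shown here): the collinearity need not come from FD being time-invariant. Once every firm has its own slope on FD, the level of FD is an exact linear combination of those slope terms. In the notation of the original equation, with N firms, firm index i, year t, and γ_j as a hypothetical symbol for the firm-specific slopes:

        ```latex
        % 1[i = j] equals 1 when observation (i, t) belongs to firm j, and 0 otherwise.
        % The firm-by-FD interaction terms sum to FD itself:
        \[
          \sum_{j=1}^{N} \mathbf{1}[i = j] \, FD_{it} = FD_{it}
        \]
        % Hence the level effect and the firm-specific slopes cannot be separated:
        \[
          \beta_2 \, FD_{it} + \sum_{j=1}^{N} \gamma_j \, \mathbf{1}[i = j] \, FD_{it}
          = \sum_{j=1}^{N} (\beta_2 + \gamma_j) \, \mathbf{1}[i = j] \, FD_{it}
        \]
        ```

        Only the sums β2 + γ_j are identified, so dropping the level effect of FD in this setting is a normalization and not a restriction: the coefficients on the other regressors are unchanged, even though FD varies from year to year within each firm.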

        Anyways, thank you very much for bringing this command to my attention! This will save me countless hours.

        Kind regards,

        Marc
