
  • Logit/Probit Model with Two Fixed Effects

    Dear Statalist,

    I have a cross-sectional dataset of between-firm relationships. Each relationship features a firm and its partner. A firm could have multiple partners and a partner could be matched with multiple firms.

    I would like to run a logit or probit model with a dummy variable as the dependent variable. The dummy equals one whenever the firm owns the partner. On the right-hand side, I would like to include firm fixed effects and the partner's industry fixed effects. Since there are 35,214 firms and 156 partner industries, directly adding these two fixed effects into a logit or probit model takes forever to run. I cannot seem to find an equivalent of reghdfe for logit or probit.

    I also tried to xtset the two fixed effects, but Stata does not allow me to do that because the two fixed effects do not uniquely identify the observations in this dataset (a firm may have multiple partners within the same industry). Alternatively, I can xtset firm and partner fixed effects, as they uniquely identify the observations in the dataset. However, this does not seem like a good solution, as I have a cross-sectional dataset, not a panel.

    Is there any way to get around this problem? Thanks!

  • #2
    The words "fixed effects" are inappropriate here, in my view (even though they are widely used by economists). What you want to include as explanatory variables, it appears, is a set of binary indicator variables to allow your model intercept to vary with firm and partner industry. You do not have panel data; -xtset- is irrelevant. The indicators can be entered in the regression using factor variable notation.
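
    As a minimal sketch of the factor-variable approach, assuming hypothetical variable names (owns for the 0/1 outcome, x1 for a covariate, firm_name and partner_ind for the firm and partner-industry identifiers, none of which appear in the original post):

    ```stata
    * All variable names here are hypothetical placeholders.
    * i. needs numeric identifiers; if firm_name is a string,
    * build a numeric group variable first.
    egen firm_id = group(firm_name)

    * One indicator per firm and per partner industry, with a base
    * category omitted for each set.
    logit owns x1 i.firm_id i.partner_ind
    ```

    This fits the full set of indicators by brute force, so it is the slow specification discussed in this thread, not a fast alternative.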

    You haven't said how large your dataset is. In any case, "forever to run" is very vague. Minutes? Hours? Days?

    There are Stata modules for linear regression models with high-dimensional "fixed effects" that use the "Guimaraes and Portugal" algorithm and are relatively fast, e.g. -reghdfe- (and references therein) and -gpreg-, both on SSC. If you are prepared to fit a linear probability model rather than a logit or probit, these are probably your 'friends'. I am aware of Poisson regression models with high-dimensional fixed effects (see the recent SJ article), but not of analogous logit/probit models.
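
    A linear probability model with both sets of indicators absorbed might look like the sketch below. The variable names (owns, x1, firm_id, partner_ind) are hypothetical, and clustering on the firm is only an assumption for illustration, not a recommendation taken from this thread:

    ```stata
    * One-time install from SSC; recent reghdfe versions also need ftools.
    ssc install ftools
    ssc install reghdfe

    * Linear probability model: firm and partner-industry indicators are
    * absorbed rather than estimated as explicit coefficients.
    reghdfe owns x1, absorb(firm_id partner_ind) vce(cluster firm_id)
    ```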


    • #3
      Thank you, Stephen. I have 469,544 firm-partner relationships with 61,077 unique firms and 81,664 unique partners (156 partner industries). I have tried reghdfe with firm and industry indicators, and it gives good results. My reviewers have asked me to try probit and logit regressions as a robustness check. I have tried adding these indicators directly to logit and probit regressions, running the commands on my iMac (3.2 GHz quad-core Intel Core i5 with 32 GB of memory) and my PC (2.1 GHz, two 8-core processors, 64 GB of RAM). Neither computer produced any results within several hours. I use 6-core Stata 16 MP.


      • #4
        You have 2 computers. Why not leave one running for as long as it takes? "Several hours" doesn't seem all that long to me. Try "overnight" for a start, and look at the iteration log to get clues about convergence speed.

        Also consider using the maximize options to -logit- (or -probit-). For example, get Stata to print trace and gradient information at each iteration. Also, consider using the from() option -- can you feed the estimates from the model fitted by -reghdfe- as starting values to your -logit- regression?
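
        A rough sketch of these maximize options, again with hypothetical variable names (owns, x1, firm_id, partner_ind). Note one swap: instead of the -reghdfe- estimates, the starting values here come from a smaller -logit- without the firm indicators, because its coefficients are already on the logit scale and its parameter names match the larger model. Whether this actually speeds up convergence is untested:

        ```stata
        * Step 1: a cheaper logit without the firm indicators.
        logit owns x1 i.partner_ind
        matrix b0 = e(b)

        * Step 2: the full model, started from b0. skip ignores any
        * parameters in b0 that are not in this model. trace and gradient
        * print the coefficient and gradient vectors at each iteration;
        * iterate(100) caps the number of iterations.
        logit owns x1 i.firm_id i.partner_ind, from(b0, skip) trace gradient iterate(100)
        ```

        With tens of thousands of indicators the trace output is very long, so it may be worth opening a log file before running the second command.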


        • #5
          Thanks for the excellent suggestions. I will give it a try!
