Problem with 2SLS Regressions for Firm-Month and Firm-Year Data: High Standard Errors and Covariance Matrix Issues

Philipp Goedecke

20 Jan 2025, 01:30

Hi all,

I am attempting to replicate results from a paper so that I can extend its analysis in subsequent research. Currently, I’m running into challenges with my 2SLS regressions and would greatly appreciate your insights. I suspect that my code or dataset is at fault, because my OLS regressions give meaningful results.
As an example, I am analyzing the effect of climate patents being granted on cumulative abnormal returns (CARs) over the subsequent 18 months. For this, I split the data into terciles based on the Media Coverage of Climate Change (MCCC), represented by three dummy variables: MCCC_H, MCCC_M, and MCCC_L.

When conducting the 2SLS regressions, I am encountering two primary problems:

1. Very high standard errors in some cases.
2. No results at all because the covariance matrix is not of full rank (example regression output included below).

I suspect the issue could stem from either the dataset or my code (possibly both). Partialing out variables has somewhat alleviated the problem, but I’m not fully confident in this approach or in whether I chose the right variables to partial out.

Data:
The dataset consists of firm-month panel data with approximately 25,000 observations in total. However, each regression excludes the majority of observations and typically uses 7,500–8,000 observations per CAR regression.
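
A minimal sketch of how the drop from roughly 25,000 rows to 7,500–8,000 could be accounted for before estimation. It assumes the variable names from the regression code in this post; CAR_9_w stands in for any horizon, and n_missing and cell_n are hypothetical helper names.

```stata
* Sketch only: n_missing and cell_n are hypothetical helper variables.
local vars CAR_9_w ln_Num_Pats_Granted_w avg_leniency ///
    MCCC_H MCCC_M MCCC_L ///
    market_cap_y_lag12 tobins_q_lag12 cash_ratio_lag12 roa_lag12 ///
    rd_ratio_lag12 past_12m_ret past_12m_sd EnvSc_yearl1 ///
    IND_ID month art_unit_num dec_year Num_Pat_App

* Missing-value counts per variable
misstable summarize `vars'

* Rows with complete data on every variable used in the model
egen byte n_missing = rowmiss(`vars')
count if n_missing == 0

* First-pass count of complete rows that sit alone in an Industry x Month cell.
* reghdfe drops singletons iteratively, so the final number can be larger.
bysort IND_ID month: egen long cell_n = total(n_missing == 0)
count if n_missing == 0 & cell_n == 1
```

Comparing these counts with the "Number of obs" and "dropped ... singleton observations" lines in the output shows whether the sample loss comes mainly from missing values or from sparse fixed-effect cells.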

Model:
I am using a two-stage least squares (2SLS) regression with extensive fixed effects and firm-level controls. My primary Stata command is ivreghdfe. Here’s an outline of the regression setup:

Dependent variable: CAR_k_w, the cumulative abnormal return from time t to time t + k
Main independent variable: ln_Num_Pats_Granted_w, instrumented using avg_leniency
Firm controls: market_cap_y_lag12 tobins_q_lag12 cash_ratio_lag12 roa_lag12 rd_ratio_lag12 past_12m_ret past_12m_sd EnvSc_yearl1
Fixed effects: Industry × Month FE, Art Unit × Year FE, and Num_Pat_App FE
Standard errors: Double-clustered at the art-unit and industry-year level (the latter represented by egen cluster_var = group(IND_ID dec_year))
Partialing out: I am currently partialing out my firm controls. However, I have to admit I'm not too familiar with this type of procedure.

Equation for the second-stage regression:

[Image: Screenshot 2025-01-20 092138.png, second-stage equation]
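
Since the screenshot is not reproduced in text, here is a hedged reconstruction of the second stage, written out from the regression code in this post rather than taken from the image. The notation is an assumption: i indexes firms, t months, g the MCCC tercile, X the firm controls, and the three alpha terms the Industry × Month, Art Unit × Year, and Num_Pat_App fixed effects. Indexing MCCC by t alone assumes it is a pure time series.

```latex
\begin{equation*}
CAR_{i,\,t \to t+k}
  = \sum_{g \in \{H, M, L\}} \beta_g \,
      \widehat{\ln(\mathit{Pats})}_{i,t} \times MCCC^{g}_{t}
  + \gamma_H \, MCCC^{H}_{t} + \gamma_M \, MCCC^{M}_{t}
  + \delta' X_{i,t}
  + \alpha_{\mathit{ind} \times \mathit{month}}
  + \alpha_{\mathit{unit} \times \mathit{year}}
  + \alpha_{\mathit{apps}}
  + \varepsilon_{i,t}
\end{equation*}
```

Here the hatted term is the first-stage fitted value of ln_Num_Pats_Granted_w, with avg_leniency interacted with the same tercile dummies as the excluded instruments.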

Regression Code:

Code:

    * Run regressions
        forvalues k = 0/18 {
            * IV regression with interactions
            ivreghdfe CAR_`k'_w ///
                MCCC_H MCCC_M ///
                market_cap_y_lag12 tobins_q_lag12 cash_ratio_lag12 roa_lag12 ///
                rd_ratio_lag12 past_12m_ret past_12m_sd EnvSc_yearl1 ///
                (c.ln_Num_Pats_Granted_w#i.MCCC_H ///
                 c.ln_Num_Pats_Granted_w#i.MCCC_M ///
                 c.ln_Num_Pats_Granted_w#i.MCCC_L = ///
                 c.avg_leniency#i.MCCC_H ///
                 c.avg_leniency#i.MCCC_M ///
                 c.avg_leniency#i.MCCC_L) ///
                , absorb(IND_ID#month art_unit_num#dec_year Num_Pat_App) ///
                cluster(art_unit_num cluster_var) ///
                partial(market_cap_y_lag12 tobins_q_lag12 cash_ratio_lag12 roa_lag12 rd_ratio_lag12 past_12m_ret past_12m_sd EnvSc_yearl1)            
        }
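
A minimal sketch of one thing that could be checked, not a confirmed fix. The notation `c.x#i.MCCC_H` expands into both the 0 and the 1 level of each dummy, so the three interaction terms produce six columns for the regressor and six for the instrument, which would be consistent with the "Dropped collinear" list in the output. The sketch builds one interaction per tercile by hand instead. It also tests whether the tercile dummies vary within the absorbed Industry × Month cells, on the assumption that MCCC is a monthly series. The names ind_month, sd_*, pat_x_* and len_x_* are hypothetical.

```stata
* Sketch only: ind_month, sd_*, pat_x_* and len_x_* are hypothetical names.

* 1) Do the tercile dummies vary within the absorbed Industry x Month cells?
*    A standard deviation of 0 (or missing, for one-row cells) in every cell
*    means the dummy is absorbed by IND_ID#month.
egen long ind_month = group(IND_ID month)
foreach t in H M L {
    bysort ind_month: egen double sd_`t' = sd(MCCC_`t')
    summarize sd_`t'
}

* 2) One endogenous regressor and one instrument per tercile, built by hand
foreach t in H M L {
    generate double pat_x_`t' = ln_Num_Pats_Granted_w * MCCC_`t'
    generate double len_x_`t' = avg_leniency * MCCC_`t'
}

* 3) Same fixed effects and clustering as above, shown for a single horizon.
*    The MCCC_H and MCCC_M main effects are left out on the assumption that
*    step 1 shows them to be absorbed; add them back otherwise.
*    partial() is omitted here to see whether the rank warning persists.
ivreghdfe CAR_9_w ///
    market_cap_y_lag12 tobins_q_lag12 cash_ratio_lag12 roa_lag12 ///
    rd_ratio_lag12 past_12m_ret past_12m_sd EnvSc_yearl1 ///
    (pat_x_H pat_x_M pat_x_L = len_x_H len_x_M len_x_L), ///
    absorb(IND_ID#month art_unit_num#dec_year Num_Pat_App) ///
    cluster(art_unit_num cluster_var)
```

With three endogenous regressors and three instruments the model is exactly identified, so no overidentification statistic should be expected in any case.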

Example Output:

Code:

IV (2SLS) estimation
--------------------

Estimates efficient for homoskedasticity only
Statistics robust to heteroskedasticity and clustering on art_unit_num and cluster_var

Number of clusters (art_unit_num) =    378            Number of obs =     7787
Number of clusters (cluster_var) =    204             F(  3,   203) =     0.43
                                                      Prob > F      =   0.7292
Total (centered) SS     =  249.3154017                Centered R2   =  -0.0248
Total (uncentered) SS   =  249.3154017                Uncentered R2 =  -0.0248
Residual SS             =  255.4890154                Root MSE      =    .1821

------------------------------------------------------------------------------------------------
                               |               Robust
                       CAR_9_w | Coefficient  std. err.      t    P>|t|     [95% conf. interval]
-------------------------------+----------------------------------------------------------------
MCCC_H#c.ln_Num_Pats_Granted_w |
                            0  |   .0509095   .1007255     0.51   0.614    -.1476928    .2495118
                            1  |  -.0557951   .1398749    -0.40   0.690    -.3315891    .2199989
                               |
MCCC_M#c.ln_Num_Pats_Granted_w |
                            0  |   .0669551   .1072587     0.62   0.533     -.144529    .2784391
                               |
                        MCCC_H |          0  (omitted)
                        MCCC_M |          0  (omitted)
------------------------------------------------------------------------------------------------
Underidentification test (Kleibergen-Paap rk LM statistic):             17.629
                                                   Chi-sq(1) P-val =    0.0000
------------------------------------------------------------------------------
Weak identification test (Cragg-Donald Wald F statistic):               19.417
                         (Kleibergen-Paap rk Wald F statistic):         11.216
Stock-Yogo weak ID test critical values:                       <not available>
------------------------------------------------------------------------------
Warning: estimated covariance matrix of moment conditions not of full rank.
         overidentification statistic not reported, and standard errors and
         model tests should be interpreted with caution.
Possible causes:
         number of clusters insufficient to calculate robust covariance matrix
         singleton dummy variable (dummy with one 1 and N-1 0s or vice versa)
partial option may address problem.
------------------------------------------------------------------------------
Collinearities detected among instruments: 2 instrument(s) dropped
Instrumented:         0b.MCCC_H#c.ln_Num_Pats_Granted_w
                      1.MCCC_H#c.ln_Num_Pats_Granted_w
                      0b.MCCC_M#c.ln_Num_Pats_Granted_w
Included instruments: MCCC_H MCCC_M
Excluded instruments: 0b.MCCC_H#c.avg_leniency 1.MCCC_H#c.avg_leniency
                      0b.MCCC_M#c.avg_leniency
Partialled-out:       market_cap_y_lag12 tobins_q_lag12 cash_ratio_lag12
                      roa_lag12 rd_ratio_lag12 past_12m_ret past_12m_sd
                      EnvSc_yearl1 _cons
                      nb: total SS, model F and R2s are after partialling-out;
                          any small-sample adjustments include partialled-out
                          variables in regressor count K
Dropped collinear:    1.MCCC_M#c.ln_Num_Pats_Granted_w
                      0b.MCCC_L#c.ln_Num_Pats_Granted_w
                      1.MCCC_L#c.ln_Num_Pats_Granted_w 1.MCCC_M#c.avg_leniency
                      0b.MCCC_L#c.avg_leniency 1.MCCC_L#c.avg_leniency
------------------------------------------------------------------------------

Absorbed degrees of freedom:
-----------------------------------------------------------------+
             Absorbed FE | Categories  - Redundant  = Num. Coefs |
-------------------------+---------------------------------------|
            IND_ID#month |      1623        1623           0    *|
   art_unit_num#dec_year |      1644        1644           0    *|
             Num_Pat_App |        70           1          69     |
-----------------------------------------------------------------+
* = FE nested within cluster; treated as redundant for DoF computation
(dropped 2395 singleton observations)
Warning - collinearities detected
Vars dropped:       1.MCCC_M#c.ln_Num_Pats_Granted_w
                    0b.MCCC_L#c.ln_Num_Pats_Granted_w
                    1.MCCC_L#c.ln_Num_Pats_Granted_w 1.MCCC_M#c.avg_leniency
                    0b.MCCC_L#c.avg_leniency 1.MCCC_L#c.avg_leniency
(MWFE estimator converged in 32 iterations)

Any insights on potential mistakes in my code or structural issues in the regression setup would be highly appreciated.

Thanks in advance,
Philipp
