  • Problem with 2SLS Regressions for Firm-Month and Firm-Year Data: High Standard Errors and Covariance Matrix Issues

    Hi all,

    I am attempting to replicate results from a paper in order to extend its analysis in subsequent research. I am running into problems with my 2SLS regressions and would greatly appreciate your insights. I suspect my code or dataset is at fault, since my OLS regressions give meaningful results.
    As an example, I am analyzing the effect of climate patents being granted on cumulative abnormal returns (CARs) over the subsequent 18 months. For this, I split the data into terciles based on the Media Coverage of Climate Change (MCCC), represented by three dummy variables: MCCC_H, MCCC_M, and MCCC_L.
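
    A quick sanity check on this setup (a sketch, assuming the three dummies are meant to be mutually exclusive and exhaustive wherever MCCC is observed):
    Code:
        * Each observation should fall in exactly one MCCC tercile
        assert MCCC_H + MCCC_M + MCCC_L == 1 if !missing(MCCC_H, MCCC_M, MCCC_L)
        * Cross-tabulate two of the dummies to see the tercile cell counts
        tabulate MCCC_H MCCC_M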

    When conducting the 2SLS regressions, I am encountering two primary problems:
    1. Very high standard errors in some cases
    2. No results at all due to the covariance matrix not being of full rank (example regression output included below).
    I suspect the issue could stem from the dataset, my code, or both. Partialing out variables has somewhat alleviated the problem, but I’m not fully confident in this approach or in whether I chose the right variables.

    Data:
    The dataset consists of firm-month panel data with approximately 25,000 observations in total. However, each regression excludes the majority of observations and typically uses 7,500–8,000 observations per CAR regression.

    Model:
    I am using a two-stage least squares (2SLS) regression with extensive fixed effects and firm-level controls. My primary Stata command is ivreghdfe. Here’s an outline of the regression setup:

    Dependent variable: CAR_k_w, measured from time t to time t + k
    Main independent variable: ln_Num_Pats_Granted_w instrumented using avg_leniency
    Firm Controls: market_cap_y_lag12 tobins_q_lag12 cash_ratio_lag12 roa_lag12 rd_ratio_lag12 past_12m_ret past_12m_sd EnvSc_yearl1
    Fixed effects: Industry × Month F.E., Art Unit × Year F.E., and Num_Pat_App F.E.
    Standard errors: Double-clustered at the art-unit and industry-year level (the latter represented by egen cluster_var = group(IND_ID dec_year))
    Partialing out: I am currently partialing out my firm controls; however, I have to admit I’m not too familiar with this type of procedure.


    Equation for the second-stage regression:
    [Image: second-stage regression equation (Screenshot 2025-01-20 092138.png)]


    Regression Code:
    Code:
        * Run regressions
            forvalues k = 0/18 {
                * IV regression with interactions
                ivreghdfe CAR_`k'_w ///
                    MCCC_H MCCC_M ///
                    market_cap_y_lag12 tobins_q_lag12 cash_ratio_lag12 roa_lag12 ///
                    rd_ratio_lag12 past_12m_ret past_12m_sd EnvSc_yearl1 ///
                    (c.ln_Num_Pats_Granted_w#i.MCCC_H ///
                     c.ln_Num_Pats_Granted_w#i.MCCC_M ///
                     c.ln_Num_Pats_Granted_w#i.MCCC_L = ///
                     c.avg_leniency#i.MCCC_H ///
                     c.avg_leniency#i.MCCC_M ///
                     c.avg_leniency#i.MCCC_L) ///
                    , absorb(IND_ID#month art_unit_num#dec_year Num_Pat_App) ///
                    cluster(art_unit_num cluster_var) ///
                    partial(market_cap_y_lag12 tobins_q_lag12 cash_ratio_lag12 roa_lag12 rd_ratio_lag12 past_12m_ret past_12m_sd EnvSc_yearl1)            
            }

    Example Output:
    Code:
    IV (2SLS) estimation
    --------------------
    
    Estimates efficient for homoskedasticity only
    Statistics robust to heteroskedasticity and clustering on art_unit_num and cluster_var
    
    Number of clusters (art_unit_num) =    378            Number of obs =     7787
    Number of clusters (cluster_var) =    204             F(  3,   203) =     0.43
                                                          Prob > F      =   0.7292
    Total (centered) SS     =  249.3154017                Centered R2   =  -0.0248
    Total (uncentered) SS   =  249.3154017                Uncentered R2 =  -0.0248
    Residual SS             =  255.4890154                Root MSE      =    .1821
    
    ------------------------------------------------------------------------------------------------
                                   |               Robust
                           CAR_9_w | Coefficient  std. err.      t    P>|t|     [95% conf. interval]
    -------------------------------+----------------------------------------------------------------
    MCCC_H#c.ln_Num_Pats_Granted_w |
                                0  |   .0509095   .1007255     0.51   0.614    -.1476928    .2495118
                                1  |  -.0557951   .1398749    -0.40   0.690    -.3315891    .2199989
                                   |
    MCCC_M#c.ln_Num_Pats_Granted_w |
                                0  |   .0669551   .1072587     0.62   0.533     -.144529    .2784391
                                   |
                            MCCC_H |          0  (omitted)
                            MCCC_M |          0  (omitted)
    ------------------------------------------------------------------------------------------------
    Underidentification test (Kleibergen-Paap rk LM statistic):             17.629
                                                       Chi-sq(1) P-val =    0.0000
    ------------------------------------------------------------------------------
    Weak identification test (Cragg-Donald Wald F statistic):               19.417
                             (Kleibergen-Paap rk Wald F statistic):         11.216
    Stock-Yogo weak ID test critical values:                       <not available>
    ------------------------------------------------------------------------------
    Warning: estimated covariance matrix of moment conditions not of full rank.
             overidentification statistic not reported, and standard errors and
             model tests should be interpreted with caution.
    Possible causes:
             number of clusters insufficient to calculate robust covariance matrix
             singleton dummy variable (dummy with one 1 and N-1 0s or vice versa)
    partial option may address problem.
    ------------------------------------------------------------------------------
    Collinearities detected among instruments: 2 instrument(s) dropped
    Instrumented:         0b.MCCC_H#c.ln_Num_Pats_Granted_w
                          1.MCCC_H#c.ln_Num_Pats_Granted_w
                          0b.MCCC_M#c.ln_Num_Pats_Granted_w
    Included instruments: MCCC_H MCCC_M
    Excluded instruments: 0b.MCCC_H#c.avg_leniency 1.MCCC_H#c.avg_leniency
                          0b.MCCC_M#c.avg_leniency
    Partialled-out:       market_cap_y_lag12 tobins_q_lag12 cash_ratio_lag12
                          roa_lag12 rd_ratio_lag12 past_12m_ret past_12m_sd
                          EnvSc_yearl1 _cons
                          nb: total SS, model F and R2s are after partialling-out;
                              any small-sample adjustments include partialled-out
                              variables in regressor count K
    Dropped collinear:    1.MCCC_M#c.ln_Num_Pats_Granted_w
                          0b.MCCC_L#c.ln_Num_Pats_Granted_w
                          1.MCCC_L#c.ln_Num_Pats_Granted_w 1.MCCC_M#c.avg_leniency
                          0b.MCCC_L#c.avg_leniency 1.MCCC_L#c.avg_leniency
    ------------------------------------------------------------------------------
    
    Absorbed degrees of freedom:
    -----------------------------------------------------------------+
                 Absorbed FE | Categories  - Redundant  = Num. Coefs |
    -------------------------+---------------------------------------|
                IND_ID#month |      1623        1623           0    *|
       art_unit_num#dec_year |      1644        1644           0    *|
                 Num_Pat_App |        70           1          69     |
    -----------------------------------------------------------------+
    * = FE nested within cluster; treated as redundant for DoF computation
    (dropped 2395 singleton observations)
    Warning - collinearities detected
    Vars dropped:       1.MCCC_M#c.ln_Num_Pats_Granted_w
                        0b.MCCC_L#c.ln_Num_Pats_Granted_w
                        1.MCCC_L#c.ln_Num_Pats_Granted_w 1.MCCC_M#c.avg_leniency
                        0b.MCCC_L#c.avg_leniency 1.MCCC_L#c.avg_leniency
    (MWFE estimator converged in 32 iterations)
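
    One thing I notice in the output: because MCCC_H, MCCC_M, and MCCC_L are tercile dummies, each c.var#i.dummy term expands into both the 0 and the 1 level of the dummy. That gives six interaction terms for the endogenous regressor and six for the instrument, of which only three each are linearly independent, which would match the terms listed under "Dropped collinear". A variant I could try is to select only the 1 level of each dummy. This is an untested sketch that assumes the three dummies are mutually exclusive; I have not verified that it resolves the rank warning:
    Code:
        * Sketch: keep only the level-1 interactions so each tercile gets one slope
            forvalues k = 0/18 {
                ivreghdfe CAR_`k'_w ///
                    MCCC_H MCCC_M ///
                    market_cap_y_lag12 tobins_q_lag12 cash_ratio_lag12 roa_lag12 ///
                    rd_ratio_lag12 past_12m_ret past_12m_sd EnvSc_yearl1 ///
                    (1.MCCC_H#c.ln_Num_Pats_Granted_w ///
                     1.MCCC_M#c.ln_Num_Pats_Granted_w ///
                     1.MCCC_L#c.ln_Num_Pats_Granted_w = ///
                     1.MCCC_H#c.avg_leniency ///
                     1.MCCC_M#c.avg_leniency ///
                     1.MCCC_L#c.avg_leniency) ///
                    , absorb(IND_ID#month art_unit_num#dec_year Num_Pat_App) ///
                    cluster(art_unit_num cluster_var) ///
                    partial(market_cap_y_lag12 tobins_q_lag12 cash_ratio_lag12 roa_lag12 rd_ratio_lag12 past_12m_ret past_12m_sd EnvSc_yearl1)
            }
    With this parameterization each reported coefficient would be the slope on ln_Num_Pats_Granted_w within one MCCC tercile, rather than a mix of 0-level and 1-level terms.
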
    Any insights on potential mistakes in my code or structural issues in the regression setup would be highly appreciated.

    Thanks in advance,
    Philipp