Hello everyone,
I’m working on a panel data model similar to the approach in Gulen and Ion’s (2016) paper on policy uncertainty and corporate investment. I am analyzing the effect of Policy Uncertainty (PU) on firm-level capital investment, measured by CAPX/Total Assets (CAPX/TA). CAPX/TA is the dependent variable, and I lead it by 1 to 4 quarters to capture the future impact of PU on investment. I am using Stata 18.5 on Mac.
The independent variables are Policy Uncertainty (PU), Tobin’s Q, Cash Flow, Sales Growth, GDP Growth, and an Election Indicator (a dummy equal to 1 in U.S. presidential election years and 0 otherwise).
I am using reghdfe to absorb firm fixed effects while including quarter dummies to account for time-specific effects. However, I’m encountering problems where the explanatory power (R²) becomes disproportionately high when the quarter dummies are included, and the coefficients of key independent variables like Policy Uncertainty and Tobin’s Q deviate significantly from baseline expectations. I believe that the quarter dummies are absorbing too much of the variation, which leads to this inflated R² and the unexpected behavior of my key variables.
My question is: how can I include quarter dummies without inflating the explanatory power or distorting the coefficients of key variables like Policy Uncertainty? Is there a recommended approach?
Any guidance or examples of how to structure this would be greatly appreciated, as I am very new to Stata and coding in general.
Thanks in advance for your help!
Code:
// Sort the data by the numeric firm identifier and the time variable
sort gvkey_num qtrdate
// Set panel data
tsset gvkey_num qtrdate
// Standardize relevant variables (subtract the mean, divide by the standard deviation)
foreach var in log_PU_Policy_Uncertainty tobins_q_w cashflow_normalized_w yoy_growth_sales_w RGDP_Growth {
    egen mean_`var' = mean(`var')
    egen sd_`var' = sd(`var')
    gen std_`var' = (`var' - mean_`var') / sd_`var'
}
// Generate the election indicator (1 for election years, 0 otherwise)
gen election_indicator = 0
replace election_indicator = 1 if inlist(year, 2016, 2020, 2024) // U.S. election years
// Create dependent variable leads (CAPX/TA) for 1 to 4 quarters
gen lead1_capex_normalized_w = F1.capex_normalized_w // Lead by 1 quarter
gen lead2_capex_normalized_w = F2.capex_normalized_w // Lead by 2 quarters
gen lead3_capex_normalized_w = F3.capex_normalized_w // Lead by 3 quarters
gen lead4_capex_normalized_w = F4.capex_normalized_w // Lead by 4 quarters
// Generate quarter dummies
tabulate qtrdate, generate(qtrdate_dummy)
eststo clear // Clear previous stored results
// Regression: CAPX/TA lead by 1 quarter
reghdfe lead1_capex_normalized_w std_log_PU_Policy_Uncertainty std_tobins_q_w std_cashflow_normalized_w std_yoy_growth_sales_w std_RGDP_Growth election_indicator qtrdate_dummy*, absorb(gvkey_num) cluster(gvkey_num qtrdate)
eststo col1
// Regression: CAPX/TA lead by 2 quarters
reghdfe lead2_capex_normalized_w std_log_PU_Policy_Uncertainty std_tobins_q_w std_cashflow_normalized_w std_yoy_growth_sales_w std_RGDP_Growth election_indicator qtrdate_dummy*, absorb(gvkey_num) cluster(gvkey_num qtrdate)
eststo col2
// Regression: CAPX/TA lead by 3 quarters
reghdfe lead3_capex_normalized_w std_log_PU_Policy_Uncertainty std_tobins_q_w std_cashflow_normalized_w std_yoy_growth_sales_w std_RGDP_Growth election_indicator qtrdate_dummy*, absorb(gvkey_num) cluster(gvkey_num qtrdate)
eststo col3
// Regression: CAPX/TA lead by 4 quarters
reghdfe lead4_capex_normalized_w std_log_PU_Policy_Uncertainty std_tobins_q_w std_cashflow_normalized_w std_yoy_growth_sales_w std_RGDP_Growth election_indicator qtrdate_dummy*, absorb(gvkey_num) cluster(gvkey_num qtrdate)
eststo col4
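One way to check whether the quarter dummies are the source of the problem is sketched below. It first tests whether PU, GDP growth, and the election indicator vary across firms within a quarter. A regressor that takes a single value per quarter is perfectly collinear with a full set of quarter dummies, so its coefficient cannot be separately identified once they are included. The second part is only one possible alternative: it replaces the full set of quarter dummies with calendar-quarter (1 to 4) seasonal dummies. It assumes qtrdate is a Stata quarterly date (%tq); cal_qtr and the stored names alt1 to alt4 are hypothetical names introduced here, and whether this matches the specification in the paper should be checked against the paper itself.

```stata
// Part 1: does each macro-level regressor vary across firms within a quarter?
// A within-quarter standard deviation of zero in every quarter means the
// variable is constant within each quarter.
foreach var in std_log_PU_Policy_Uncertainty std_RGDP_Growth election_indicator {
    tempvar sdw
    // by() computes the statistic per quarter without re-sorting the data
    egen `sdw' = sd(`var'), by(qtrdate)
    display "Within-quarter standard deviation of `var':"
    summarize `sdw'
}

// Part 2 (assumption: qtrdate is a %tq quarterly date).
// Calendar-quarter indicator taking the values 1 to 4.
gen byte cal_qtr = quarter(dofq(qtrdate))

// Same four regressions as above, with seasonal dummies only
forvalues h = 1/4 {
    reghdfe lead`h'_capex_normalized_w                             ///
        std_log_PU_Policy_Uncertainty std_tobins_q_w               ///
        std_cashflow_normalized_w std_yoy_growth_sales_w           ///
        std_RGDP_Growth election_indicator i.cal_qtr,              ///
        absorb(gvkey_num) cluster(gvkey_num qtrdate)
    eststo alt`h'
}
```

If the summarize output in Part 1 shows a maximum of zero for a variable, that variable has no cross-firm variation within a quarter.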