split sample vs. interaction?

Sang Won Han

Join Date: Apr 2023

Posts: 5
#1

25 Mar 2024, 02:40

I have panel data on individuals observed over several years. Under what conditions is it appropriate to average the first variable over each individual's years, split the sample into two groups at a threshold on that average, and compare the coefficients of the second variable across separate fixed-effects regressions, instead of interacting the first and second variables? And how can the former approach be justified statistically?

For example, the first variable is the organization's culture strength (i.e., culture), which has mean 0, and the second variable is the number of employees in the organization (i.e., num_employees). The dependent variable is the organization's performance (i.e., performance). The first approach in Stata looks like this:

Code:

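* Note: this splits on each year's value of culture, not on a per-organization average
gen culture_mean = .
* Stata treats missing values as larger than any number, so exclude them explicitly
replace culture_mean = 1 if culture >= 0 & !missing(culture)
replace culture_mean = 0 if culture < 0
xtreg performance num_employees i.year if culture_mean == 1, fe
xtreg performance num_employees i.year if culture_mean == 0, fe

The second approach looks like this: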
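For reference, the split described in the question (a threshold on each organization's average culture over its years) could be sketched as follows. This is only an illustration under stated assumptions: org_id is a hypothetical name for the panel identifier, and culture_avg and high_culture are made-up variable names. Because the group indicator is constant within an organization, a pooled model that interacts it with every regressor should reproduce the split-sample slopes, and it gives a direct test of whether the two slopes differ. The standard errors need not match the split-sample ones, since the pooled model assumes a common error variance.

```stata
* Assumption: org_id is the panel identifier (hypothetical name)
xtset org_id year

* Average culture over each organization's years, then split at 0
bysort org_id: egen culture_avg = mean(culture)
gen byte high_culture = culture_avg >= 0 if !missing(culture_avg)

* Fully interacted pooled model; the main effect of high_culture is
* time-invariant, so it is absorbed by the fixed effects and omitted
xtreg performance i.high_culture##c.num_employees i.high_culture##i.year, fe

* Test whether the num_employees slope differs between the two groups
test 1.high_culture#c.num_employees
```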

Code:

xtreg performance c.num_employees##c.culture i.year, fe
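With the continuous interaction, the effect of num_employees is no longer a single number but depends on culture. One way to read it, as a sketch only, is to ask margins for the slope of num_employees at a few values of culture. The values -1, 0 and 1 below are purely illustrative and should be replaced with values that make sense for the actual distribution of culture.

```stata
* Same model as above, with the continuous-by-continuous interaction
xtreg performance c.num_employees##c.culture i.year, fe

* Slope of num_employees evaluated at illustrative values of culture
margins, dydx(num_employees) at(culture = (-1 0 1))
```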

Thank you in advance and please let me know if there is any part that I need to clarify more!
Tags: interaction term, split sample
Carlo Lazzaro

Join Date: Apr 2014

Posts: 17707
#2

25 Mar 2024, 08:48

Sang:
I would consider your last code (that is, the one with the interaction).
As an aside, if this is your real analysis, with only three predictors you should not expect to get very far.

Kind regards,
Carlo
(Stata 19.0)