I have panel data consisting of individuals and years. I am wondering under what conditions it is appropriate to average the first variable over each individual's years, split the sample into two groups at a threshold on that average, and compare the coefficients on the second variable across the two fixed-effects regressions, instead of interacting the first and second variables in a single regression. How can the split-sample approach be justified statistically?
For example, the first variable is the organization's culture strength (culture), which has mean 0, and the second variable is the number of employees in the organization (num_employees). The dependent variable is the organization's performance (performance). In Stata, the first approach looks like this:
Code:
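* Split indicator: 1 if culture is at or above its mean of 0, 0 if below
gen culture_mean = .
* Stata treats missing as larger than any number, so exclude it explicitly
replace culture_mean = 1 if culture >= 0 & !missing(culture)
replace culture_mean = 0 if culture < 0
* Same fixed-effects model estimated separately in each group
xtreg performance num_employees i.year if culture_mean == 1, fe
xtreg performance num_employees i.year if culture_mean == 0, fe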
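Since my description refers to each organization's average of culture over its years, a variant of the split that actually uses that average might look like the sketch below. Here org_id is a hypothetical name for my panel identifier (assumed to be already declared with xtset), and culture_avg and high_culture are names I made up for illustration:

```stata
* Assumption: org_id is a hypothetical panel identifier, already set via xtset
* Average culture within each organization across its years
bysort org_id: egen culture_avg = mean(culture)

* Time-invariant group indicator: 1 if the organization's average is >= 0
gen byte high_culture = culture_avg >= 0 if !missing(culture_avg)

* Fixed-effects regressions estimated separately by group
xtreg performance num_employees i.year if high_culture == 1, fe
xtreg performance num_employees i.year if high_culture == 0, fe
```

With this version, every organization stays in the same group in all years, whereas the split on the yearly value of culture can move an organization between groups over time.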
The second approach looks like this:
Code:
xtreg performance c.num_employees##c.culture i.year, fe
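In equation form, my understanding is that the two approaches correspond roughly to the models below. The notation is my own shorthand and not taken from any reference: i indexes organizations, t indexes years, g indexes the two groups, alpha_i is the organization fixed effect, and lambda_t is the year effect.

```latex
% Split-sample approach: one regression per group g, compare \beta^{(g)} across groups
\text{performance}_{it} = \alpha_i + \beta^{(g)}\,\text{num\_employees}_{it}
    + \lambda^{(g)}_t + \varepsilon_{it}, \qquad g \in \{0, 1\}

% Interaction approach: a single regression, with \beta_3 capturing the moderation
\text{performance}_{it} = \alpha_i + \beta_1\,\text{num\_employees}_{it}
    + \beta_2\,\text{culture}_{it}
    + \beta_3\,\bigl(\text{num\_employees}_{it} \times \text{culture}_{it}\bigr)
    + \lambda_t + \varepsilon_{it}
```

Written this way, the split-sample version lets the year effects differ by group, while the interaction version constrains them to be common and treats culture as continuous.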
Thank you in advance, and please let me know if there is any part I need to clarify further!