Hello everyone,
I have a project in which I am trying to understand how the probability of someone becoming an entrepreneur in an industry is related to a series of variables. To do so, I am using a fixed-effects regression and have built a few models to help interpret the results.
One of the variables I am interested in is the median age of the industries. Some of my models include the median age and a set of other variables; others include the median age, its squared term, and the same set of other variables.
The code itself is as follows:
Model1:
xtreg change_to_employer age_median log_nemp_median gender numb_firms higher_education i.year high_tech low_tech KIS Other, fe cluster(caem2)
and
Model2:
xtreg change_to_employer c.age_median##c.age_median log_nemp_median gender numb_firms higher_education i.year high_tech low_tech KIS Other , fe cluster(caem2)
Model 1:
Code:
xtreg change_to_empregador age_median log_nemp_median gender numb_firms_div1000 higher_education vn_per_employee_median i.year high_tech low_tech KIS Other, fe cluster(caem2)
Model 2:
Code:
xtreg change_to_empregador c.age_median##c.age_median log_nemp_median gender numb_firms_div1000 higher_education vn_per_employee_median i.year high_tech low_tech KIS Other, fe cluster(caem2)
What I am having trouble interpreting is that the coefficient on age_median is not significant in Model 1, whereas the coefficients on age_median and c.age_median#c.age_median are both significant in Model 2, as shown in:
Model1:
Code:
Fixed-effects (within) regression               Number of obs     =        889
Group variable: caem2                           Number of groups  =         77

R-sq:                                           Obs per group:
     within  = 0.1832                                         min =          3
     between = 0.0859                                         avg =       11.5
     overall = 0.0853                                         max =         12

                                                F(24,76)          =       5.26
corr(u_i, Xb)  = -0.2660                        Prob > F          =     0.0000

                                             (Std. Err. adjusted for 77 clusters in caem2)
------------------------------------------------------------------------------------------
                         |               Robust
Change_to_empregador_f~e |      Coef.   Std. Err.      t    P>|t|     [95% Conf. Interval]
-------------------------+----------------------------------------------------------------
              age_median |   -.001125   .0028954    -0.39   0.699    -.0068918    .0046417
         log_nemp_median |  -.0015375   .0168405    -0.09   0.927    -.0350782    .0320033
                  gender |   .0010117   .0016409     0.62   0.539    -.0022565    .0042798
      numb_firms_div1000 |  -.0086784   .0050413    -1.72   0.089     -.018719    .0013621
        higher_education |   .0009512   .0012174     0.78   0.437    -.0014735    .0033758
  vn_per_employee_median |  -.0005548    .000214    -2.59   0.011     -.000981   -.0001286
Model2:
Code:
Fixed-effects (within) regression               Number of obs     =        889
Group variable: caem2                           Number of groups  =         77

R-sq:                                           Obs per group:
     within  = 0.3109                                         min =          3
     between = 0.1165                                         avg =       11.5
     overall = 0.1408                                         max =         12

                                                F(27,76)          =       7.08
corr(u_i, Xb)  = -0.1859                        Prob > F          =     0.0000

                                                                          (Std. Err. adjusted for 77 clusters in caem2)
-----------------------------------------------------------------------------------------------------------------------
                                                      |               Robust
                                 change_to_empregador |      Coef.   Std. Err.      t    P>|t|     [95% Conf. Interval]
------------------------------------------------------+----------------------------------------------------------------
                                           age_median |    .136472   .0677635     2.01   0.048     .0015094    .2714347
                                                      |
                            c.age_median#c.age_median |   -.001687   .0008374    -2.01   0.047    -.0033548   -.0000191
                                                      |
                                      log_nemp_median |  -.0891614   .0408428    -2.18   0.032    -.1705069   -.0078159
                                               gender |  -.0019755   .0036136    -0.55   0.586    -.0091726    .0052216
                                   numb_firms_div1000 |  -.0235788   .0107627    -2.19   0.032    -.0450145    -.002143
                                     higher_education |   .0012232   .0029379     0.42   0.678    -.0046282    .0070745
                               vn_per_employee_median |  -.0013087   .0004205    -3.11   0.003    -.0021462   -.0004713
                                                      |
                                                 year |
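For reference, here is what the two age coefficients in the Model 2 output imply if taken at face value. This is only arithmetic on the reported point estimates, written with \hat\beta_1 for the coefficient on age_median and \hat\beta_2 for the coefficient on c.age_median#c.age_median (notation introduced here, not in the output). The ages 35 and 45 are illustrative values, not taken from the data, and the output does not show whether the implied turning point lies inside the observed range of age_median.

```latex
% Marginal effect of age_median implied by Model 2 (point estimates only)
\[
\frac{\partial \widehat{y}}{\partial\,\text{age\_median}}
  = \hat\beta_1 + 2\hat\beta_2\,\text{age\_median}
  = 0.136472 - 2(0.001687)\,\text{age\_median}
\]

% Turning point of the fitted quadratic (where the marginal effect is zero)
\[
\text{age\_median}^{*}
  = -\frac{\hat\beta_1}{2\hat\beta_2}
  = \frac{0.136472}{2 \times 0.001687}
  \approx 40.4
\]

% Illustrative evaluations (35 and 45 are assumed values, not from the data)
\[
\left.\frac{\partial \widehat{y}}{\partial\,\text{age\_median}}\right|_{35}
  \approx 0.0184,
\qquad
\left.\frac{\partial \widehat{y}}{\partial\,\text{age\_median}}\right|_{45}
  \approx -0.0154
\]
```

If the observed median ages fall on both sides of roughly 40, the positive and negative slopes could largely cancel in a purely linear specification, which would be consistent with the near-zero coefficient in Model 1. Whether that is what happens here depends on the distribution of age_median, which is not shown.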
How is it possible that a variable is not significant on its own but becomes significant once its quadratic term is included in the regression? Can I then say that the median age of the industries has a significant impact on the probability of transition into entrepreneurship?
Thank you very much,
Rui