omitted dummy variable fixed effect regression

amjad naimah

Join Date: Dec 2024
Posts: 5


16 Dec 2024, 00:18

Hi everyone,

I am currently working on my PhD dissertation titled "The Impact of Biological Assets on Firm Performance in Malaysian Plantation Companies." I have two independent variables (IVs): biological asset valuation (using Historical Cost and Fair Value) and Disclosure. The moderating variable is Audit Committee, measured using a binary score (1 = has an audit committee, 0 = no audit committee).

The issue I am facing is that when I perform a regression analysis including the moderating variable, the independent variable "biological asset valuation - Fair Value" is omitted due to collinearity issues. I have attached my results for your reference.

Could you kindly suggest ways to address this collinearity problem and obtain results without the omission issue?

Thank you for your help!

Code:

. asdoc xtreg tobinq_w c.bafv_w##i.audcom3 c.disclosure##i.audcom3 fsz_w eps_w lnetinc_w lnage year,fe
(File Myfile.doc already exists, option append was assumed)
note: 1.audcom3 omitted because of collinearity.
note: 1.audcom3#c.bafv_w omitted because of collinearity.
note: 1.audcom3#c.disclosure omitted because of collinearity.

Fixed-effects (within) regression               Number of obs     =        183
Group variable: id                              Number of groups  =         40

R-squared:                                      Obs per group:
     Within  = 0.0947                                         min =          1
     Between = 0.0531                                         avg =        4.6
     Overall = 0.0330                                         max =          7

                                                F(7,136)          =       2.03
corr(u_i, Xb) = -0.9259                         Prob > F          =     0.0554

--------------------------------------------------------------------------------------
            tobinq_w | Coefficient  Std. err.      t    P>|t|     [95% conf. interval]
---------------------+----------------------------------------------------------------
              bafv_w |  -1.638312   .7686049    -2.13   0.035    -3.158275   -.1183489
           1.audcom3 |          0  (omitted)
                     |
    audcom3#c.bafv_w |
                  1  |          0  (omitted)
                     |
          disclosure |    .088598   .4744341     0.19   0.852    -.8496244     1.02682
                     |
audcom3#c.disclosure |
                  1  |          0  (omitted)
                     |
               fsz_w |  -.4486973   .1974606    -2.27   0.025    -.8391876    -.058207
               eps_w |   .1371329   .2063137     0.66   0.507    -.2708651    .5451308
           lnetinc_w |   .2081806   .1329133     1.57   0.120    -.0546634    .4710247
               lnage |   1.251934   1.094251     1.14   0.255    -.9120146    3.415882
                year |  -.0376564    .045522    -0.83   0.410    -.1276789     .052366
               _cons |   79.11343   89.99552     0.88   0.381    -98.85818     257.085
---------------------+----------------------------------------------------------------
             sigma_u |  .89030996
             sigma_e |  .38161032
                 rho |  .84479397   (fraction of variance due to u_i)
--------------------------------------------------------------------------------------
F test that all u_i=0: F(39, 136) = 3.01                     Prob > F = 0.0000
Click to Open File:  Myfile.doc


Carlo Lazzaro

Join Date: Apr 2014
Posts: 17711

16 Dec 2024, 00:47

Amjad:
welcome to this forum.
Some comments about your post:
1) if your variable is time-invariant or collinear with the -panelid- fe, it will be wiped out. The former case is shown in the following toy-example:

Code:

. use "https://www.stata-press.com/data/r18/nlswork.dta"
(National Longitudinal Survey of Young Women, 14-24 years old in 1968)

. xtreg ln_wage i.birth_yr, fe vce(cluster idcode)
note: 42.birth_yr omitted because of collinearity.
note: 43.birth_yr omitted because of collinearity.
note: 44.birth_yr omitted because of collinearity.
note: 45.birth_yr omitted because of collinearity.
note: 46.birth_yr omitted because of collinearity.
note: 47.birth_yr omitted because of collinearity.
note: 48.birth_yr omitted because of collinearity.
note: 49.birth_yr omitted because of collinearity.
note: 50.birth_yr omitted because of collinearity.
note: 51.birth_yr omitted because of collinearity.
note: 52.birth_yr omitted because of collinearity.
note: 53.birth_yr omitted because of collinearity.
note: 54.birth_yr omitted because of collinearity.

Fixed-effects (within) regression               Number of obs     =     28,534
Group variable: idcode                          Number of groups  =      4,711

R-squared:                                      Obs per group:
     Within  = 0.0000                                         min =          1
     Between = 0.0050                                         avg =        6.1
                                                              max =         15

                                                F(0, 4710)        =          .
corr(u_i, Xb) =      .                          Prob > F          =          .

                             (Std. err. adjusted for 4,711 clusters in idcode)
------------------------------------------------------------------------------
             |               Robust
     ln_wage | Coefficient  std. err.      t    P>|t|     [95% conf. interval]
-------------+----------------------------------------------------------------
    birth_yr |
         42  |          0  (omitted)
         43  |          0  (omitted)
         44  |          0  (omitted)
         45  |          0  (omitted)
         46  |          0  (omitted)
         47  |          0  (omitted)
         48  |          0  (omitted)
         49  |          0  (omitted)
         50  |          0  (omitted)
         51  |          0  (omitted)
         52  |          0  (omitted)
         53  |          0  (omitted)
         54  |          0  (omitted)
             |
       _cons |   1.674907          .        .       .            .           .
-------------+----------------------------------------------------------------
     sigma_u |  .42456905
     sigma_e |  .32028665
         rho |  .63731204   (fraction of variance due to u_i)
------------------------------------------------------------------------------

.

2) your R-sq within is very low. I would check if the data generating process is fully reported in the right-hand side of your regression equation;
3) with 40 panels you may want to consider cluster robust standard errors (Cameron_Miller_JHR_2015_February.pdf).

Kind regards,
Carlo
(Stata 19.0)
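
One way to check for the time-invariant case is to see whether the regressor ever changes within a panel in the estimation sample. What follows is only a sketch on the same toy data: it uses birth_yr and idcode, and the helper variables insample, minv, maxv, varies and onepanel are names made up for this illustration. With the data from #1 one would presumably substitute audcom3 and id.

```stata
* Toy data from the example above; substitute your own panel id and dummy
use "https://www.stata-press.com/data/r18/nlswork.dta", clear
xtset idcode year

* Fit the model first so that e(sample) marks the observations actually used
xtreg ln_wage i.birth_yr, fe
generate byte insample = e(sample)

* Within each panel, compare the minimum and maximum of the regressor
bysort idcode: egen minv = min(cond(insample, birth_yr, .))
bysort idcode: egen maxv = max(cond(insample, birth_yr, .))
generate byte varies = (maxv > minv) if insample

* One row per panel: how many panels show any within-panel change?
egen byte onepanel = tag(idcode) if insample
tabulate varies if onepanel == 1

* Does the regressor take more than one value in the sample at all?
tabulate birth_yr if insample
```

If no panel shows any change, the fixed effects absorb the regressor. Note that a dummy that is constant within firms but differs across firms can still be interacted with a time-varying regressor, so when the interaction terms are dropped as well, it is worth checking with the last -tabulate- whether the dummy takes a single value across the whole estimation sample.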


amjad naimah

Join Date: Dec 2024

Posts: 5
#3

16 Dec 2024, 16:44

Thank you for your helpful comments on my dissertation analysis. I appreciate your insights.
Regarding the low within R-squared, I will re-examine the variables included in my regression model to ensure that the data generating process is fully captured. I will consider adding any additional relevant variables that might help explain the variation in firm performance within the panels.

I also understand your suggestion to use cluster-robust standard errors due to the 40 panels in my dataset. I will apply this adjustment to account for the potential correlation of errors within each firm over time, as recommended in the Cameron & Miller (2015) paper.

Once again, thank you for your feedback, and I will incorporate these changes into my analysis.

Best regards,
AM
amjad naimah

Join Date: Dec 2024

Posts: 5
#4

16 Dec 2024, 17:21

Hi Carlo,
Could you please provide the command for computing cluster-robust standard errors?
Thank you very much. I really appreciate it.
Carlo Lazzaro

Join Date: Apr 2014

Posts: 17711
#5

17 Dec 2024, 00:35

Amjad:

Code:

xtreg tobinq_w c.bafv_w##i.audcom3 c.disclosure##i.audcom3 fsz_w eps_w lnetinc_w lnage year,fe vce(cluster id)

Kind regards,
Carlo
(Stata 19.0)
amjad naimah

Join Date: Dec 2024

Posts: 5
#6

17 Dec 2024, 02:03

Hi Carlo,

Thank you for providing the command. I have another question regarding my analysis. When I used the fixed-effects regression (fe), the interaction terms audcom7#bioap_w and disclosure#audcom7 were both positive and statistically significant. However, after applying robust standard errors (fe, robust) or cluster-robust standard errors (fe, cluster(id)), the results changed. Specifically, the interaction term audcom7#bioap_w remained positive and significant, but disclosure#audcom7 lost its statistical significance. Is it possible for the results to change in this way after applying robust or cluster-robust standard errors? I have attached my results for your reference. Do you have any suggestions or insights on this issue? I would greatly appreciate your help.

. xtreg tobinq_w c.bioap_w##i.audcom7 c.disclosure##i.audcom7 fsz_w eps_w lnetinc_w lnage year,fe

Fixed-effects (within) regression               Number of obs     =        515
Group variable: id                              Number of groups  =         41

R-squared:                                      Obs per group:
     Within  = 0.1936                                         min =          8
     Between = 0.0897                                         avg =       12.6
     Overall = 0.0190                                         max =         13

                                                F(10,464)         =      11.14
corr(u_i, Xb) = -0.8094                         Prob > F          =     0.0000

--------------------------------------------------------------------------------------
            tobinq_w | Coefficient  Std. err.      t    P>|t|     [95% conf. interval]
---------------------+----------------------------------------------------------------
             bioap_w |  -1.367355   .5283424    -2.59   0.010    -2.405595   -.3291146
           1.audcom7 |  -.4658983   .2628195    -1.77   0.077    -.9823622    .0505656
                     |
   audcom7#c.bioap_w |
                  1  |   .9781371   .5327128     1.84   0.067    -.0686913    2.024966
                     |
          disclosure |  -.6139006   .4145281    -1.48   0.139    -1.428485    .2006844
                     |
audcom7#c.disclosure |
                  1  |   .3086545   .4104525     0.75   0.452    -.4979215     1.11523
                     |
               fsz_w |  -.3639656   .0643691    -5.65   0.000    -.4904567   -.2374745
               eps_w |   .0816783    .096535     0.85   0.398    -.1080216    .2713782
           lnetinc_w |   .0888825   .0302474     2.94   0.003     .0294436    .1483213
               lnage |   .5285752   .1415803     3.73   0.000     .2503573    .8067931
                year |  -.0150397   .0085291    -1.76   0.079    -.0318001    .0017207
               _cons |   36.54466   16.44207     2.22   0.027     4.234526     68.8548
---------------------+----------------------------------------------------------------
             sigma_u |  .68573341
             sigma_e |  .30294751
                 rho |  .83669756   (fraction of variance due to u_i)
--------------------------------------------------------------------------------------
F test that all u_i=0: F(40, 464) = 12.48                    Prob > F = 0.0000

. xtreg tobinq_w c.bioap_w##i.audcom7 c.disclosure##i.audcom7 fsz_w eps_w lnetinc_w lnage year,fe ro

Fixed-effects (within) regression               Number of obs     =        515
Group variable: id                              Number of groups  =         41

R-squared:                                      Obs per group:
     Within  = 0.1936                                         min =          8
     Between = 0.0897                                         avg =       12.6
     Overall = 0.0190                                         max =         13

                                                F(10,40)          =       6.46
corr(u_i, Xb) = -0.8094                         Prob > F          =     0.0000

                                            (Std. err. adjusted for 41 clusters in id)
--------------------------------------------------------------------------------------
                     |               Robust
            tobinq_w | Coefficient  std. err.      t    P>|t|     [95% conf. interval]
---------------------+----------------------------------------------------------------
             bioap_w |  -1.367355   .6406996    -2.13   0.039    -2.662257   -.0724527
           1.audcom7 |  -.4658983   .3170308    -1.47   0.150    -1.106642    .1748449
                     |
   audcom7#c.bioap_w |
                  1  |   .9781371   .6793087     1.44   0.158    -.3947969    2.351071
                     |
          disclosure |  -.6139006   .3099119    -1.98   0.055    -1.240256    .0124547
                     |
audcom7#c.disclosure |
                  1  |   .3086545   .3753058     0.82   0.416    -.4498669    1.067176
                     |
               fsz_w |  -.3639656   .0860589    -4.23   0.000    -.5378971   -.1900341
               eps_w |   .0816783   .0923019     0.88   0.381    -.1048708    .2682275
           lnetinc_w |   .0888825   .0336022     2.65   0.012     .0209698    .1567952
               lnage |   .5285752   .2225968     2.37   0.022     .0786903      .97846
                year |  -.0150397   .0124739    -1.21   0.235    -.0402504     .010171
               _cons |   36.54466   23.81027     1.53   0.133    -11.57769    84.66701
---------------------+----------------------------------------------------------------
             sigma_u |  .68573341
             sigma_e |  .30294751
                 rho |  .83669756   (fraction of variance due to u_i)
--------------------------------------------------------------------------------------

. xtreg tobinq_w c.bioap_w##i.audcom7 c.disclosure##i.audcom7 fsz_w eps_w lnetinc_w lnage year,fe vce(cluster id )

Fixed-effects (within) regression               Number of obs     =        515
Group variable: id                              Number of groups  =         41

R-squared:                                      Obs per group:
     Within  = 0.1936                                         min =          8
     Between = 0.0897                                         avg =       12.6
     Overall = 0.0190                                         max =         13

                                                F(10,40)          =       6.46
corr(u_i, Xb) = -0.8094                         Prob > F          =     0.0000

                                            (Std. err. adjusted for 41 clusters in id)
--------------------------------------------------------------------------------------
                     |               Robust
            tobinq_w | Coefficient  std. err.      t    P>|t|     [95% conf. interval]
---------------------+----------------------------------------------------------------
             bioap_w |  -1.367355   .6406996    -2.13   0.039    -2.662257   -.0724527
           1.audcom7 |  -.4658983   .3170308    -1.47   0.150    -1.106642    .1748449
                     |
   audcom7#c.bioap_w |
                  1  |   .9781371   .6793087     1.44   0.158    -.3947969    2.351071
                     |
          disclosure |  -.6139006   .3099119    -1.98   0.055    -1.240256    .0124547
                     |
audcom7#c.disclosure |
                  1  |   .3086545   .3753058     0.82   0.416    -.4498669    1.067176
                     |
               fsz_w |  -.3639656   .0860589    -4.23   0.000    -.5378971   -.1900341
               eps_w |   .0816783   .0923019     0.88   0.381    -.1048708    .2682275
           lnetinc_w |   .0888825   .0336022     2.65   0.012     .0209698    .1567952
               lnage |   .5285752   .2225968     2.37   0.022     .0786903      .97846
                year |  -.0150397   .0124739    -1.21   0.235    -.0402504     .010171
               _cons |   36.54466   23.81027     1.53   0.133    -11.57769    84.66701
---------------------+----------------------------------------------------------------
             sigma_u |  .68573341
             sigma_e |  .30294751
                 rho |  .83669756   (fraction of variance due to u_i)
--------------------------------------------------------------------------------------
Carlo Lazzaro

Join Date: Apr 2014

Posts: 17711
#7

17 Dec 2024, 07:18

Amjad:
yes, it is a possible (and frequent) result of switching from default to cluster-robust standard errors, which are more reliable in your case because you have 41 panels.
In addition, please use CODE delimiters (as per FAQ) to share what you typed and what Stata gave you back. Thanks.

Kind regards,
Carlo
(Stata 19.0)
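
Switching the variance estimator leaves the coefficients untouched and changes only the standard errors, t statistics, p-values and confidence intervals, which is what the outputs in #6 show. The sketch below illustrates this on the toy data used earlier in the thread. The specification (ln_wage on tenure, union and age) and the stored-estimate names fe_default, fe_robust and fe_cluster are assumptions chosen for illustration and are not the model from #6.

```stata
* Toy data from earlier in the thread; the variable choice is illustrative only
use "https://www.stata-press.com/data/r18/nlswork.dta", clear
xtset idcode year

* Same specification three times; only the variance estimator changes
xtreg ln_wage c.tenure##i.union age, fe
estimates store fe_default

xtreg ln_wage c.tenure##i.union age, fe vce(robust)
estimates store fe_robust

xtreg ln_wage c.tenure##i.union age, fe vce(cluster idcode)
estimates store fe_cluster

* Side-by-side coefficients, standard errors and p-values
estimates table fe_default fe_robust fe_cluster, b(%9.4f) se(%9.4f) p(%9.3f)
```

The second and third columns should coincide, because with -xtreg, fe- the vce(robust) option clusters on the panel variable. This is also why the `fe ro` and `fe vce(cluster id)` outputs in #6 are identical.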
amjad naimah

Join Date: Dec 2024

Posts: 5
#8

17 Dec 2024, 08:38

Thank you again for your comment. I just want to clarify: Is it acceptable to use these results and report them in my thesis? Is it inappropriate to observe such a difference in results?

Best regards,
AM
Carlo Lazzaro

Join Date: Apr 2014

Posts: 17711
#9

18 Dec 2024, 06:05

Amjad:
yes, it is acceptable to report these results.
In your case, the default standard errors gave you a false perception of statistical significance.
In fact, they were misleading, as the cluster-robust standard errors were the right ones to use.
In sum, go with cluster-robust (and ignore default) standard errors.

Kind regards,
Carlo
(Stata 19.0)