Dear forum community,
I am having some trouble carrying out the results part of my thesis. Since I am a beginner with Stata and new to this forum, I apologize in advance if something doesn't follow the guidelines; I tried to read through all of them and made an effort to respect them.
The research initially focused on a sample of 300 biotech/pharma firms; due to lack of data, the final sample has 78 firms over a period of 6 years (2013-2018).
Let me provide some more context on the model I am trying to run. My dependent variable is the number of shared patents filed by company X in year Y (shared_patents). The independent variable is Female Board Presence (FBP, variable bfp in the output below), calculated as 1 - gender ratio, where gender ratio = number of male directors / total number of directors on the board. I had to use this measure because the tools provided by my university did not give me access to the number of women on the board, so I used this workaround. I have two moderators: educational diversity (ded, calculated through Blau's index) and outside director presence (odp, calculated as the ratio of outside directors to total directors). The control variables are firm age, board size (total number of directors), and a dummy variable that indicates the sector (there are 2 sectors, coded 1 for sector 1 and 0 for sector 2).
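For reference, the variable definitions above could be sketched in Stata roughly as follows. This is only an illustration under stated assumptions: male_directors, outside_directors, and the education category counts edu_cat* are hypothetical names for the raw data, and the number of education categories is not specified in this post.

```stata
* Hypothetical raw variables (not from the original data description):
*   male_directors, outside_directors, boardsize, and edu_cat1, edu_cat2, ...
*   (number of directors in each education category).

* Female board presence: 1 minus the share of male directors
generate bfp = 1 - male_directors / boardsize

* Outside director presence: share of outside directors
generate odp = outside_directors / boardsize

* Blau's index: 1 minus the sum of squared category shares
egen edu_total = rowtotal(edu_cat*)
generate sumsq = 0
foreach v of varlist edu_cat* {
    replace sumsq = sumsq + (`v' / edu_total)^2
}
generate ded = 1 - sumsq

* Sanity check: a share of directors should lie between 0 and 1
count if odp > 1 & !missing(odp)
```

The final check may be worth running, since a share defined this way should not exceed 1.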
Here is a summary of the descriptive statistics:
    Variable |        Obs        Mean    Std. dev.       Min        Max
-------------+---------------------------------------------------------
shared_pat~s |        468    2.728632     8.42482          0        108
         bfp |        468    .1582265    .0977436          0        .42
         ded |        468     .653953    .1531496        .18        .81
         odp |        468    .8656197     .219407          0        1.4
     firmage |        468    33.20513    39.73419          0        118
-------------+---------------------------------------------------------
   boardsize |        468    9.138889    2.320181          5         16
Now, because my dependent variable is a count, I want to run a negative binomial regression with random effects.
The problem is that when I run the full model:
xtnbreg shared_patents firmage boardsize sector c.bfp##c.ded c.bfp##c.odp
I get the following note: bfp omitted because of collinearity.
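One possible reading of this note, offered as a sketch rather than a confirmed diagnosis: the ## operator expands to the main effects plus the interaction, so bfp is entered twice in the command above, and Stata drops the second copy. Listing each term once should specify the same model without the note. In the sketch below, firm_id and year are assumed names for the panel identifiers, which are not shown in this post.

```stata
* Assumption: firm_id and year are hypothetical names for the panel variables.
xtset firm_id year

* c.bfp##c.ded expands to bfp, ded, and c.bfp#c.ded
* c.bfp##c.odp expands to bfp, odp, and c.bfp#c.odp
* so bfp appears twice. Writing each term once avoids the duplicate:
xtnbreg shared_patents firmage boardsize sector ///
    c.bfp c.ded c.odp c.bfp#c.ded c.bfp#c.odp, re
```

If this reading is right, the omitted bfp is only the redundant second copy; the first bfp term still receives a coefficient.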
To approach this issue, I tried to check for multicollinearity and to identify what is causing the problem. I went ahead and checked the VIF values:
. regress shared_patents firmage boardsize sector sector c.bfp##c.ded c.bfp##c.odp
note: sector omitted because of collinearity.
note: bfp omitted because of collinearity.
      Source |       SS           df       MS      Number of obs   =       468
-------------+----------------------------------   F(8, 459)       =     12.72
       Model |   6015.1853         8  751.898163   Prob > F        =    0.0000
    Residual |   27131.351       459  59.1096972   R-squared       =    0.1815
-------------+----------------------------------   Adj R-squared   =    0.1672
       Total |  33146.5363       467  70.9775938   Root MSE        =    7.6883
------------------------------------------------------------------------------
shared_pat~s | Coefficient  Std. err.      t    P>|t|     [95% conf. interval]
-------------+----------------------------------------------------------------
     firmage |   .0377138   .0116805     3.23   0.001     .0147599    .0606677
   boardsize |    .739291   .2035207     3.63   0.000     .3393433    1.139239
      sector |   .6688978   .8866354     0.75   0.451     -1.07347    2.411266
      sector |          0  (omitted)
         bfp |   -4.11039   24.08792    -0.17   0.865    -51.44667    43.22589
         ded |  -2.111455   5.097674    -0.41   0.679    -12.12913    7.906217
             |
 c.bfp#c.ded |  -38.42171    27.7684    -1.38   0.167    -92.99067    16.14725
             |
         bfp |          0  (omitted)
         odp |  -5.502319   3.071279    -1.79   0.074    -11.53783    .5331923
             |
 c.bfp#c.odp |   30.55852   17.83892     1.71   0.087    -4.497558    65.61459
             |
       _cons |   .8158366   3.848849     0.21   0.832    -6.747713    8.379386
------------------------------------------------------------------------------
. vif
    Variable |       VIF       1/VIF
-------------+----------------------
     firmage |      1.70    0.587612
   boardsize |      1.76    0.567651
      sector |      1.51    0.664055
         bfp |     43.80    0.022833
         ded |      4.82    0.207666
 c.bfp#c.ded |     30.24    0.033071
         odp |      3.59    0.278742
 c.bfp#c.odp |     21.50    0.046515
-------------+----------------------
    Mean VIF |     13.61
.
I saw (I don't know whether this interpretation is correct) that concerning VIF values are displayed for bfp and for its interactions with ded and odp. Below is the correlation matrix:
             | shared~s      bfp      odp      ded  firmage boards~e   sector
-------------+---------------------------------------------------------------
shared_pat~s |   1.0000
         bfp |   0.1223   1.0000
         odp |  -0.0322   0.1396   1.0000
         ded |  -0.2820  -0.0048   0.1336   1.0000
     firmage |   0.3380   0.3779   0.0759  -0.3637   1.0000
   boardsize |   0.3112   0.5049   0.0053  -0.1704   0.5164   1.0000
      sector |   0.2544   0.2086  -0.0894  -0.4237   0.4187   0.4305   1.0000
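Regarding the high VIF values reported above for bfp and the interaction terms: a common check is to mean-center the continuous variables before forming the interactions and then recompute the VIFs. The sketch below is only an illustration; the _c suffix is a naming choice made here. Centering changes how the main-effect coefficients are read (each becomes the effect at the mean of the other variable) but leaves the interaction coefficients and the overall fit unchanged.

```stata
* Mean-center the variables that enter the interactions
foreach v of varlist bfp ded odp {
    summarize `v', meanonly
    generate `v'_c = `v' - r(mean)
}

* Re-run the auxiliary OLS regression with each term listed once
regress shared_patents firmage boardsize sector ///
    c.bfp_c c.ded_c c.odp_c c.bfp_c#c.ded_c c.bfp_c#c.odp_c

* Recompute the variance inflation factors
vif
```

If the VIFs drop substantially after centering, the collinearity most likely comes from the product terms rather than from the underlying variables themselves.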
I know that problems could arise because many of my variables are ratios that share the same components (the total number of directors, etc.). Given these challenges and my limited Stata expertise, I am looking for guidance on how to handle this issue without excluding key moderators. Any insights or suggested directions would be very valuable.
Thank you for your time and expertise.