Some country dummies omitted from cross-section due to collinearity

Beth Plunkett

Join Date: Feb 2022

Posts: 12
#1

23 Feb 2022, 04:52

Hi everyone,

I'm running a regression on the World Values Survey, which is individual-level survey data. The data files also contain country-level aggregated variables, and the explanatory variable I am focussing on is a policy variable (gender inequality). I am running OLS and ordered logit regressions with country dummies and standard errors clustered at the country level, which is what most of the literature I am following does. However, those papers tend to use repeated cross-sections, whereas I am using only one wave of the data (and I am not sure whether that makes a difference).

The issue is that when I run my regression with the full list of explanatory variables, Stata omits three or four countries due to collinearity. I'm not clear why it is these countries specifically, and any help would be appreciated. When I run the regression without the other country-level variables (that is, excluding GDP per capita, unemployment and the Gini coefficient), the problem seems to go away. It is common in my literature to include these other country-level variables. Does this look like an issue with the data (gender inequality is a composite measure), or am I misspecifying my regressions?

My code is:
reg Q46 genderinequality $X job_scare election_equality home_equality political_equality_perception GDPpercap2 unemploytotal giniWB i.ISO31661numericcode, vce(cluster ISO31661numericcode)

and my output is:

[Output attached as an image]

Jared Greathouse

Join Date: Sep 2021

Posts: 2170
#2

23 Feb 2022, 05:08

I can't read that table.

Please read the FAQ about how to ask a question.
Carlo Lazzaro

Join Date: Apr 2014

Posts: 17709
#3

23 Feb 2022, 05:35

Beth:
Perfect collinearity relates to regression specification. Non-technically speaking, if two predictors tell exactly the same thing, one of them will be omitted, because the goal of any regression model is to explain the contribution of each predictor (adjusted for the others) to the variation in the conditional mean of the regressand; if predictors are not informatively different, that goal cannot be reached.
In addition:
1) each research field has its own tribal habits, so I cannot say whether using one vs. many cross-sections makes any difference in yours;
2) as per the FAQ, please share what you typed and what Stata gave you back via CODE delimiters. Thanks.
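
To make the point about perfect collinearity concrete, here is a minimal sketch on toy data. All variable names (`country`, `gii`, `y`) are hypothetical and not taken from Beth's dataset; the only assumption carried over is that a country-level variable takes a single value per country, which makes it an exact linear combination of the country dummies and the constant.

```stata
* Toy data: 3 countries with 4 respondents each (hypothetical names throughout)
* Run from a do-file so the // comments are accepted
clear
set obs 12
set seed 12345
generate country = ceil(_n/4)      // 1,1,1,1,2,2,2,2,3,3,3,3
generate gii = 0.1*country^2       // country-level: constant within country
generate y = rnormal()             // arbitrary individual-level outcome

* Confirm that gii has zero variation within each country
tabulate country, summarize(gii)

* The constant plus two dummies already span every country-level variable,
* so adding gii forces Stata to drop one more term because of collinearity
regress y gii i.country
```

In this sketch the base country never appears as a coefficient row, and one further dummy should be flagged as omitted, because `gii` uses up one of the dimensions the dummies would otherwise occupy.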

Kind regards,
Carlo
(Stata 19.0)

Beth Plunkett

Join Date: Feb 2022
Posts: 12

23 Feb 2022, 07:28

Hi both - thank you very much for your patience! I hope the below is clearer to read!

Code:

reg Q46 genderinequality $X GDPpercap2 unemploytotal giniWB  i.ISO31661numericcode, vce(cluster ISO31661numericcode)

Code:

 reg Q46 genderinequality $X GDPpercap2 unemploytotal giniWB  i.ISO31661numericcode, vce(cluster ISO31661numericcode)
note: 792.ISO31661numericcode omitted because of collinearity.
note: 818.ISO31661numericcode omitted because of collinearity.
note: 840.ISO31661numericcode omitted because of collinearity.

Linear regression                               Number of obs     =     63,178
                                                F(0, 42)          =          .
                                                Prob > F          =          .
                                                R-squared         =     0.0844
                                                Root MSE          =     .74314

                          (Std. err. adjusted for 43 clusters in ISO31661numericcode)
-------------------------------------------------------------------------------------
                    |               Robust
                Q46 | Coefficient  std. err.      t    P>|t|     [95% conf. interval]
--------------------+----------------------------------------------------------------
   genderinequality |    6.36574   .0000873  7.3e+04   0.000     6.365563    6.365916
         GDPpercap2 |   .0000283   2.63e-10  1.1e+05   0.000     .0000283    .0000283
      unemploytotal |   .0396832   2.36e-08  1.7e+06   0.000     .0396832    .0396833
             giniWB |  -.0001084   7.92e-07  -136.78   0.000      -.00011   -.0001068
                    |
ISO31661numericcode |
         Australia  |   .8787511   .0000183  4.8e+04   0.000     .8787142    .8787881
        Bangladesh  |  -.5367844   6.71e-06 -8.0e+04   0.000    -.5367979   -.5367709
           Bolivia  |    .210422   5.05e-06  4.2e+04   0.000     .2104118    .2104322
            Brazil  |  -.3413265   .0000149 -2.3e+04   0.000    -.3413566   -.3412964
           Myanmar  |  -.0788117   3.74e-07 -2.1e+05   0.000    -.0788124   -.0788109
             Chile  |   .6382956   4.06e-06  1.6e+05   0.000     .6382875    .6383038
             China  |   1.500499   .0000177  8.5e+04   0.000     1.500463    1.500534
          Colombia  |  -.6313892   .0000139 -4.5e+04   0.000    -.6314174   -.6313611
            Cyprus  |   1.266009   .0000244  5.2e+04   0.000     1.265959    1.266058
           Ecuador  |  -.1326241   5.39e-06 -2.5e+04   0.000     -.132635   -.1326132
          Ethiopia  |  -.2716529   6.41e-06 -4.2e+04   0.000    -.2716658     -.27164
           Germany  |    .994612   .0000203  4.9e+04   0.000      .994571     .994653
            Greece  |   1.268662   .0000221  5.8e+04   0.000     1.268617    1.268706
         Guatemala  |  -.4519446   .0000153 -3.0e+04   0.000    -.4519754   -.4519137
         Indonesia  |  -.5994294   8.80e-06 -6.8e+04   0.000    -.5994472   -.5994117
              Iran  |    .189704   5.13e-06  3.7e+04   0.000     .1896936    .1897144
              Iraq  |  -.9976433   9.31e-06 -1.1e+05   0.000    -.9976621   -.9976245
             Japan  |   1.221662   .0000219  5.6e+04   0.000     1.221618    1.221706
        Kazakhstan  |   .8386698   .0000218  3.8e+04   0.000     .8386258    .8387138
            Jordan  |  -1.555885   .0079464  -195.80   0.000    -1.571922   -1.539848
       South Korea  |   1.636746   .0000253  6.5e+04   0.000     1.636695    1.636797
        Kyrgyzstan  |  -.0385738   .0000116 -3324.86   0.000    -.0385972   -.0385504
           Lebanon  |   .0372411   2.20e-06  1.7e+04   0.000     .0372366    .0372455
          Malaysia  |   .7754089   5.05e-06  1.5e+05   0.000     .7753987    .7754191
            Mexico  |   .1070053   2.19e-06  4.9e+04   0.000     .1070009    .1070097
       New Zealand  |  -.4183225   .0079661   -52.51   0.000    -.4343987   -.4022464
         Nicaragua  |  -.1334123   8.23e-06 -1.6e+04   0.000    -.1334289   -.1333957
          Pakistan  |  -.7359316   7.64e-06 -9.6e+04   0.000    -.7359471   -.7359162
              Peru  |   .1562213   4.70e-06  3.3e+04   0.000     .1562118    .1562307
       Philippines  |   -.144329   8.01e-06 -1.8e+04   0.000    -.1443452   -.1443129
           Romania  |   .5934907   6.61e-06  9.0e+04   0.000     .5934774    .5935041
            Russia  |   .9000838   .0000106  8.5e+04   0.000     .9000623    .9001053
            Serbia  |   1.519032   .0000223  6.8e+04   0.000     1.518987    1.519077
           Vietnam  |   .7086685   .0000108  6.6e+04   0.000     .7086467    .7086903
          Zimbabwe  |   .1781948   .0000147  1.2e+04   0.000     .1781651    .1782246
        Tajikistan  |   .2499721    .000012  2.1e+04   0.000     .2499478    .2499963
          Thailand  |    .328192   1.98e-06  1.7e+05   0.000      .328188     .328196
           Tunisia  |   .6314995   .0000127  5.0e+04   0.000     .6314738    .6315252
            Turkey  |          0  (omitted)
           Ukraine  |   1.087876   .0000227  4.8e+04   0.000      1.08783    1.087922
             Egypt  |          0  (omitted)
     United States  |          0  (omitted)
                    |
              _cons |  -1.343943    .000067 -2.0e+04   0.000    -1.344078   -1.343808
-------------------------------------------------------------------------------------


Carlo Lazzaro

Join Date: Apr 2014

Posts: 17709
#5

23 Feb 2022, 08:15

Beth:
thanks for using CODE delimiters.
That said:
1) your enormous sample size justifies the statistical significance of everything in your -regress- outcome table;
2) however, your R-sq is really low. Are you sure that your model is correctly specified?
3) a more substantive issue: how can you easily and meaningfully disseminate the results of your regression as everything seems to reach statistical significance?

Kind regards,
Carlo
(Stata 19.0)
Beth Plunkett

Join Date: Feb 2022

Posts: 12
#6

23 Feb 2022, 09:51

Hi Carlo,

Thank you very much for your comments! This is not my full model: as my dependent variable is individual-level data, the main set of controls are individual-level variables. Including them improves the R-squared!
The reason I posted just the output with the country-level variables is that this is when some of the countries are omitted (Turkey, Egypt and the US). If the issue is the aggregate variables, I don't see why it should affect only a few countries, unless this is common when including country dummies and clustered standard errors (which seems unlikely).

Any more thoughts you had would be very much appreciated!
Carlo Lazzaro

Join Date: Apr 2014

Posts: 17709
#7

23 Feb 2022, 09:54

Beth:
you may want to estimate the correlation between the omitted countries and the other four variables included in the right-hand side of your regression equation.
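
One possible way to carry out this check is sketched below. It uses the variable names and the country codes (792, 818, 840) from the output posted above, and it assumes that the four country-level variables are meant to be constant within each country; the indicator `omitted` is a hypothetical helper created only for this illustration.

```stata
* Check the assumption that each country-level variable is constant within
* country (the assert stops with an error for any variable where it is not)
foreach v of varlist genderinequality GDPpercap2 unemploytotal giniWB {
    bysort ISO31661numericcode (`v'): assert `v'[1] == `v'[_N]
}

* Reduce to one row per country, then compare the omitted countries
preserve
collapse (mean) genderinequality GDPpercap2 unemploytotal giniWB, ///
    by(ISO31661numericcode)

* Hypothetical helper: flags the three countries Stata reported as omitted
generate byte omitted = inlist(ISO31661numericcode, 792, 818, 840)

correlate omitted genderinequality GDPpercap2 unemploytotal giniWB
list ISO31661numericcode genderinequality GDPpercap2 unemploytotal giniWB if omitted
restore
```

If the `assert` passes for every variable, each country-level regressor is an exact linear combination of the country dummies and the constant, so some terms have to be dropped whichever countries happen to be involved.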

Kind regards,
Carlo
(Stata 19.0)

Beth Plunkett

Join Date: Feb 2022
Posts: 12

24 Feb 2022, 12:36

Originally posted by Carlo Lazzaro

Beth:
you may want to estimate the correlation between the omitted countries and the other four variables included in the right-hand side of your regression equation.

Hi Carlo,

Thank you very much for all your helpful advice. I did as you advised and on reflection, I am going to drop most of the country-level variables. However, even including only my country-level variable of interest (gender inequality), one of my country dummies is still being excluded (US).

Code:

 reg Q46 genderinequality $X i.ISO31661numericcode, vce(cluster ISO31661numericcode)
note: 840.ISO31661numericcode omitted because of collinearity.

Linear regression                               Number of obs     =     63,178
                                                F(15, 42)         =          .
                                                Prob > F          =          .
                                                R-squared         =     0.2124
                                                Root MSE          =     .68929

                          (Std. err. adjusted for 43 clusters in ISO31661numericcode)
-------------------------------------------------------------------------------------
                    |               Robust
                Q46 | Coefficient  std. err.      t    P>|t|     [95% conf. interval]
--------------------+----------------------------------------------------------------
   genderinequality |   -.991285   .0886889   -11.18   0.000    -1.170266   -.8123035
               Q262 |   .0106367   .0016494     6.45   0.000      .007308    .0139653
               age2 |  -.0001266   .0000154    -8.24   0.000    -.0001577   -.0000956
             gender |  -.0437722   .0072117    -6.07   0.000    -.0583261   -.0292184
               Q275 |   .0014934   .0039952     0.37   0.710    -.0065692    .0095561
      maritalstatus |  -.1088563   .0102648   -10.60   0.000    -.1295714   -.0881411
       labourforces |   .0129881   .0087059     1.49   0.143    -.0045812    .0305573
               Q288 |  -.0167985    .003236    -5.19   0.000     -.023329    -.010268
                Q47 |   .3063173   .0288775    10.61   0.000     .2480401    .3645945
         noreligion |  -.0442345   .0433878    -1.02   0.314    -.1317945    .0433256
           catholic |   .0041439    .035226     0.12   0.907     -.066945    .0752328
       othcrhistian |  -.0049942   .0342768    -0.15   0.885    -.0741675    .0641792
           orthodox |   .0775668   .0374266     2.07   0.044     .0020368    .1530967
                jew |   .0490626   .0676765     0.72   0.472    -.0875141    .1856393
             muslim |   .0215776   .0373257     0.58   0.566    -.0537487    .0969039
              hindu |   .0693523   .0547739     1.27   0.212    -.0411859    .1798906
           buddhist |   .0984303   .0261631     3.76   0.001     .0456309    .1512297
                    |
ISO31661numericcode |
         Australia  |  -.1522482   .0154076    -9.88   0.000    -.1833419   -.1211544
        Bangladesh  |   .2760279   .0341349     8.09   0.000     .2071408     .344915
           Bolivia  |   .1464712   .0171383     8.55   0.000     .1118847    .1810576
            Brazil  |   .1025043   .0087669    11.69   0.000     .0848121    .1201965
           Myanmar  |   .1958841   .0372335     5.26   0.000     .1207439    .2710243
             Chile  |  -.0090477   .0035332    -2.56   0.014    -.0161779   -.0019175
             China  |  -.1033636   .0100916   -10.24   0.000    -.1237293    -.082998
          Colombia  |  -.0979537   .0139845    -7.00   0.000    -.1261756   -.0697318
            Cyprus  |   -.026828   .0237165    -1.13   0.264    -.0746898    .0210338
           Ecuador  |  -.2601825   .0092046   -28.27   0.000    -.2787581   -.2416069
          Ethiopia  |   .3506961   .0263275    13.32   0.000      .297565    .4038272
           Germany  |  -.1795991   .0157254   -11.42   0.000    -.2113341    -.147864
            Greece  |   .2882757   .0275062    10.48   0.000      .232766    .3437854
         Guatemala  |   .0525576   .0209117     2.51   0.016     .0103562    .0947591
         Indonesia  |   .0303813   .0261791     1.16   0.252    -.0224504    .0832129
              Iran  |   .5735258   .0290082    19.77   0.000     .5149848    .6320667
              Iraq  |   .5601941   .0374295    14.97   0.000     .4846583    .6357299
             Japan  |  -.3304101   .0133383   -24.77   0.000    -.3573279   -.3034923
        Kazakhstan  |  -.3027259   .0176917   -17.11   0.000    -.3384291   -.2670227
            Jordan  |   .3649826   .0278587    13.10   0.000     .3087615    .4212037
       South Korea  |   .1080536    .015124     7.14   0.000     .0775321    .1385752
        Kyrgyzstan  |   -.290065    .022929   -12.65   0.000    -.3363376   -.2437923
           Lebanon  |   .3234899   .0211169    15.32   0.000     .2808742    .3661057
          Malaysia  |   .1212032   .0179881     6.74   0.000     .0849016    .1575047
            Mexico  |  -.2376103   .0029128   -81.58   0.000    -.2434885   -.2317321
       New Zealand  |  -.3199828   .0198921   -16.09   0.000    -.3601267    -.279839
         Nicaragua  |   .0261222   .0131615     1.98   0.054    -.0004387    .0526832
          Pakistan  |   .1366724   .0310633     4.40   0.000     .0739841    .1993608
              Peru  |   .0847401   .0130574     6.49   0.000     .0583891     .111091
       Philippines  |  -.1404712   .0128462   -10.93   0.000    -.1663957   -.1145466
           Romania  |   .1633439   .0158021    10.34   0.000     .1314539    .1952339
            Russia  |  -.0802811   .0137571    -5.84   0.000    -.1080441   -.0525182
            Serbia  |  -.0264661   .0165333    -1.60   0.117    -.0598316    .0068995
           Vietnam  |  -.2643021   .0120918   -21.86   0.000    -.2887044   -.2398998
          Zimbabwe  |   .7015063   .0268754    26.10   0.000     .6472694    .7557431
        Tajikistan  |  -.2753251   .0224417   -12.27   0.000    -.3206144   -.2300359
          Thailand  |   .1382681   .0322071     4.29   0.000     .0732716    .2032646
           Tunisia  |   .2007133   .0254148     7.90   0.000     .1494241    .2520024
            Turkey  |   .1505178   .0225929     6.66   0.000     .1049234    .1961122
           Ukraine  |  -.0565045   .0166858    -3.39   0.002    -.0901778   -.0228312
             Egypt  |   .5448296   .0290499    18.75   0.000     .4862044    .6034547
     United States  |          0  (omitted)
                    |
              _cons |   1.369191   .0835426    16.39   0.000     1.200596    1.537787
-------------------------------------------------------------------------------------

I have read that Stata creates dummy variables for all but one country; is that one the US? I am also concerned that I am not getting an F-statistic.


Carlo Lazzaro

Join Date: Apr 2014

Posts: 17709
#9

24 Feb 2022, 13:29

Beth:
no worries; trivial issues indeed.
1) Stata omits the reference category for -i.ISO31661numericcode- to avoid the so-called dummy variable trap (search wiki for a quick description);
2) -help j_robust_singular- will reply to your second question about the missing F-statistic.
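
Both points can be inspected directly with a short sketch like the one below. It assumes the global `$X` still holds the individual-level controls from the run posted above, and it uses the `baselevels` display option of -regress-, which adds a row labelled as the base for the factor variable. That makes it easier to tell the base country apart from a country that carries the "omitted because of collinearity" note, as code 840 does in the output above.

```stata
* Same model as posted above; -baselevels- marks the base country explicitly
* (the /// line continuation requires running this from a do-file)
regress Q46 genderinequality $X i.ISO31661numericcode, ///
    vce(cluster ISO31661numericcode) baselevels

* The overall F is missing, but a joint test on a small subset of
* coefficients (names taken from the output above) can still be computed
test Q262 age2 gender
```

As I read -help j_robust_singular-, the overall F is reported as missing when the cluster-robust variance matrix does not have enough rank to test all coefficients jointly, which can happen when the model has more coefficients than clusters; tests on fewer coefficients remain available.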

Kind regards,
Carlo
(Stata 19.0)
Beth Plunkett

Join Date: Feb 2022

Posts: 12
#10

28 Feb 2022, 04:23

Originally posted by Carlo Lazzaro

Beth:
no worries; trivial issues indeed.
1) Stata omits the reference category for -i.ISO31661numericcode- to avoid the so-called dummy variable trap (search wiki for a quick description);
2) -help j_robust_singular- will reply to your second question about the missing F-statistic.

Hi Carlo,

Thank you very much for the reassurance and for bringing that command to my attention. Your help has been much appreciated!

Thanks,
Beth