unexpected results on year dummies

Oliver Gatland

Join Date: Mar 2019
Posts: 10

25 Apr 2019, 03:50

Hi all,

I'm investigating the effect of corruption on literacy rates. I am getting some strange results on my year dummies: they indicate that literacy rates have been decreasing over time, whereas all the evidence online suggests that they have been increasing. I'm not sure why this is happening or how to explain it when reporting my results.

Code:

reg newliteracy CPI newmarriage depend edspend mortality newfemteach urban lnGDP i.year

      Source |       SS           df       MS      Number of obs   =     1,063
-------------+----------------------------------   F(25, 1037)     =    170.47
       Model |   410877.86        25  16435.1144   Prob > F        =    0.0000
    Residual |  99977.4977     1,037   96.410316   R-squared       =    0.8043
-------------+----------------------------------   Adj R-squared   =    0.7996
       Total |  510855.358     1,062  481.031411   Root MSE        =    9.8189

------------------------------------------------------------------------------
 newliteracy |      Coef.   Std. Err.      t    P>|t|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
         CPI |  -.1372762   .0336015    -4.09   0.000    -.2032109   -.0713415
 newmarriage |     .02767   .1030364     0.27   0.788    -.1745136    .2298536
      depend |  -.0887622   .0359464    -2.47   0.014    -.1592981   -.0182262
     edspend |   .6886944   .2494638     2.76   0.006     .1991831    1.178206
   mortality |  -.4913488   .0306147   -16.05   0.000    -.5514225    -.431275
 newfemteach |    .348379   .0238237    14.62   0.000     .3016309    .3951272
       urban |   .0750816   .0236254     3.18   0.002     .0287225    .1214407
       lnGDP |  -.9519875   .6635857    -1.43   0.152    -2.254111    .3501363
             |
        year |
       2001  |  -.6425476   1.808145    -0.36   0.722    -4.190589    2.905493
       2002  |  -1.515837   1.801596    -0.84   0.400    -5.051027    2.019352
       2003  |  -2.242362    1.81033    -1.24   0.216    -5.794689    1.309965
       2004  |  -2.686559   1.811094    -1.48   0.138    -6.240386    .8672682
       2005  |  -3.279528   1.812903    -1.81   0.071    -6.836905    .2778491
       2006  |  -3.853175    1.81347    -2.12   0.034    -7.411665   -.2946854
       2007  |  -4.300564   1.815933    -2.37   0.018    -7.863887   -.7372417
       2008  |  -4.726284   1.817127    -2.60   0.009    -8.291949   -1.160618
       2009  |  -5.506428   1.824631    -3.02   0.003    -9.086817   -1.926038
       2010  |  -5.612769     1.8244    -3.08   0.002    -9.192706   -2.032833
       2011  |  -5.727455   1.827278    -3.13   0.002    -9.313039   -2.141871
       2012  |  -5.049478   1.823489    -2.77   0.006    -8.627628   -1.471329
       2013  |  -5.019523   1.826761    -2.75   0.006    -8.604091   -1.434954
       2014  |  -4.768564   1.828538    -2.61   0.009    -8.356621   -1.180506
       2015  |  -4.529592   1.830729    -2.47   0.014    -8.121947   -.9372372
       2016  |  -4.134394   1.833842    -2.25   0.024    -7.732859    -.535929
       2017  |   -3.96663   1.836512    -2.16   0.031    -7.570333    -.362927
             |
       _cons |   92.32298   5.814951    15.88   0.000     80.91257    103.7334

Nick Cox

Join Date: Mar 2014

Posts: 35405
#2

25 Apr 2019, 04:09

You have a bunch of other predictors there

Code:

newmarriage depend edspend mortality newfemteach urban lnGDP

and any (indeed all) of those could (will) serve as a proxy for time insofar as they change in time. With so many other predictors, it's essentially impossible for your year indicators to capture the trend in time (and nothing else).

Your response variable is presumably bounded by zero and complete literacy. It's common to model change in time of literacy by a sigmoid (logistic) curve that respects those bounds. I would certainly expect literacy to change fairly smoothly in time, so you should be able to find a more parsimonious representation of any trend.

I wouldn't throw linear regression at such data without circumspection. A generalised linear model would seem more natural. However, if your data are all for areas with literacy around 50% you are near the straightest part of the curve and that may not bite much.
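
As a rough sketch of both points, something like the following could be tried. It reuses the variable names from the regression in #1. Everything else is an assumption made for illustration: that newliteracy is recorded in percent, that year runs from 2000 to 2017 with 2000 as the omitted base, and the new variable name litprop. The choice of a single linear trend on the logit scale is only one possible parsimonious representation.

```stata
* Sketch only; adjust to the actual data.
* Assumptions: newliteracy is in percent (0-100); year spans 2000-2017;
* litprop is a hypothetical name for the rescaled response.

* 1. The model from #1: year effects adjusted for all other predictors
regress newliteracy CPI newmarriage depend edspend mortality newfemteach urban lnGDP i.year

* 2. Year indicators alone, on the same observations, to see the
*    unadjusted trend relative to the base year
regress newliteracy i.year if e(sample)

* 3. Logit-link GLM on the proportion scale, with one smooth trend term
*    (c.year) in place of the 17 year indicators
generate double litprop = newliteracy / 100
glm litprop CPI newmarriage depend edspend mortality newfemteach urban lnGDP c.year, ///
    family(binomial) link(logit) vce(robust)

* Average predicted literacy (as a proportion) at selected years
margins, at(year = (2000(5)2015))
```

Comparing the year coefficients from steps 1 and 2 shows how much of the apparent decline is specific to the adjusted model. In step 3 the logit link keeps fitted values between 0 and 1, and vce(robust) is used because the response is a proportion rather than a count of successes.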