  • unexpected results on year dummies

    Hi all,

    I'm investigating the effect of corruption on literacy rates. I am getting some strange results on my year dummies: the coefficients indicate that literacy rates have been decreasing over time, whereas all the evidence I can find online suggests that they have been increasing. I'm not sure why this is happening or how to explain it when reporting my results.

    Code:
    reg newliteracy CPI newmarriage depend edspend mortality newfemteach urban lnGDP i.year
    
          Source |       SS           df       MS      Number of obs   =     1,063
    -------------+----------------------------------   F(25, 1037)     =    170.47
           Model |   410877.86        25  16435.1144   Prob > F        =    0.0000
        Residual |  99977.4977     1,037   96.410316   R-squared       =    0.8043
    -------------+----------------------------------   Adj R-squared   =    0.7996
           Total |  510855.358     1,062  481.031411   Root MSE        =    9.8189
    
    ------------------------------------------------------------------------------
     newliteracy |      Coef.   Std. Err.      t    P>|t|     [95% Conf. Interval]
    -------------+----------------------------------------------------------------
             CPI |  -.1372762   .0336015    -4.09   0.000    -.2032109   -.0713415
     newmarriage |     .02767   .1030364     0.27   0.788    -.1745136    .2298536
          depend |  -.0887622   .0359464    -2.47   0.014    -.1592981   -.0182262
         edspend |   .6886944   .2494638     2.76   0.006     .1991831    1.178206
       mortality |  -.4913488   .0306147   -16.05   0.000    -.5514225    -.431275
     newfemteach |    .348379   .0238237    14.62   0.000     .3016309    .3951272
           urban |   .0750816   .0236254     3.18   0.002     .0287225    .1214407
           lnGDP |  -.9519875   .6635857    -1.43   0.152    -2.254111    .3501363
                 |
            year |
           2001  |  -.6425476   1.808145    -0.36   0.722    -4.190589    2.905493
           2002  |  -1.515837   1.801596    -0.84   0.400    -5.051027    2.019352
           2003  |  -2.242362    1.81033    -1.24   0.216    -5.794689    1.309965
           2004  |  -2.686559   1.811094    -1.48   0.138    -6.240386    .8672682
           2005  |  -3.279528   1.812903    -1.81   0.071    -6.836905    .2778491
           2006  |  -3.853175    1.81347    -2.12   0.034    -7.411665   -.2946854
           2007  |  -4.300564   1.815933    -2.37   0.018    -7.863887   -.7372417
           2008  |  -4.726284   1.817127    -2.60   0.009    -8.291949   -1.160618
           2009  |  -5.506428   1.824631    -3.02   0.003    -9.086817   -1.926038
           2010  |  -5.612769     1.8244    -3.08   0.002    -9.192706   -2.032833
           2011  |  -5.727455   1.827278    -3.13   0.002    -9.313039   -2.141871
           2012  |  -5.049478   1.823489    -2.77   0.006    -8.627628   -1.471329
           2013  |  -5.019523   1.826761    -2.75   0.006    -8.604091   -1.434954
           2014  |  -4.768564   1.828538    -2.61   0.009    -8.356621   -1.180506
           2015  |  -4.529592   1.830729    -2.47   0.014    -8.121947   -.9372372
           2016  |  -4.134394   1.833842    -2.25   0.024    -7.732859    -.535929
           2017  |   -3.96663   1.836512    -2.16   0.031    -7.570333    -.362927
                 |
           _cons |   92.32298   5.814951    15.88   0.000     80.91257    103.7334

  • #2
    You have a bunch of other predictors there

    Code:
     
     newmarriage depend edspend mortality newfemteach urban lnGDP
    and any (indeed all) of those could (will) serve as a proxy for time insofar as they change in time. With so many other predictors, it's essentially impossible for your year indicators to capture the trend in time (and nothing else). Your response variable is presumably bounded by zero and complete literacy. It's common to model change in time of literacy by a sigmoid (logistic) curve that respects those bounds. I would certainly expect literacy to change fairly smoothly in time, so you should be able to find a more parsimonious representation of any trend. I wouldn't throw linear regression at such data without circumspection. A generalised linear model would seem more natural. However, if your data are all for areas with literacy around 50% you are near the straightest part of the curve and that may not bite much.
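
    A rough sketch of how that might look, not a definitive specification. It assumes newliteracy is recorded as a percentage from 0 to 100 (the output suggests so, but the post doesn't say), and litprop is a made-up variable name. The first command shows the time pattern with no other predictors to soak it up; the second replaces the year indicators with a single trend term, which on the logit scale implies a logistic curve in time.

    Code:
    * Year indicators only: the time pattern before other predictors are added
    regress newliteracy i.year

    * ASSUMPTION: newliteracy is a percentage (0-100); litprop is hypothetical
    generate litprop = newliteracy/100

    * Fractional logit: respects the 0-1 bounds, one smooth trend term for year
    glm litprop CPI newmarriage depend edspend mortality newfemteach urban lnGDP year, ///
        family(binomial) link(logit) vce(robust)

    In Stata 14 or later, fracreg logit with the same variable list should fit the same model. If the year coefficient is still negative here, that is a statement about literacy conditional on mortality, urbanisation and the rest, not about the raw trend.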
