  • Stata calculating biased coefficient?

    Hello everyone,

    I have a question regarding a Stata coefficient.
    Our professor told us to set the number of observations to 100, generate two normally distributed variables (x with mean 1 and standard deviation 2, u with mean 0 and standard deviation 1), define y = 2 - 0.5*x + u, and then regress y on x.

    Stata now gives us a coefficient that looks biased, and it is different every time we change the observations (so it does not seem consistent).

    We do not understand why it is not providing us with the correct coefficient. It must be somehow biased, but why??

    I would really appreciate any help!

    Thanks so much in advance,

    Maja

  • #2
    The coefficient estimates that regress produces from your sample data differ from the true values because y is created as a function of the independent variable x and the error term u, but the regression only includes data on y and x, not u. Datasets with different values of u will give different coefficient estimates. That is sampling variability in a single estimate rather than bias in the estimator. The Std. Err. column gives an estimate of the precision of each coefficient estimate.
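
    To make that reasoning concrete, here is a short derivation sketch for the slope. It assumes, as in your setup, that u is drawn independently of x with mean 0 and variance 1; the notation (\hat\beta_1, \bar x) is mine, not Stata's.

    ```latex
    % y_i = 2 - 0.5 x_i + u_i, so y_i - \bar y = -0.5 (x_i - \bar x) + (u_i - \bar u)
    \hat\beta_1
      = \frac{\sum_{i=1}^{n} (x_i - \bar x)(y_i - \bar y)}{\sum_{i=1}^{n} (x_i - \bar x)^2}
      = -0.5 + \frac{\sum_{i=1}^{n} (x_i - \bar x)\, u_i}{\sum_{i=1}^{n} (x_i - \bar x)^2}

    % Because E[u_i \mid x] = 0 and Var(u_i \mid x) = 1:
    E[\hat\beta_1 \mid x] = -0.5,
    \qquad
    \operatorname{Var}(\hat\beta_1 \mid x) = \frac{1}{\sum_{i=1}^{n} (x_i - \bar x)^2}
    ```

    The second term is zero on average but not zero in any particular sample, so each sample gives a slope somewhat above or below -0.5. Its variance shrinks as the number of observations grows, which is why the estimates settle closer to -0.5 in larger samples.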

    In the example below I first regress y on x, then I create z = 2 - 0.5*x (no error term u) and regress z on x, and the coefficient estimates are exactly equal to the true values.
    Code:
    . set obs 100
    number of observations (_N) was 0, now 100
    
    . set seed 666
    
    . generate x = rnormal(1,2)
    
    . generate u = rnormal(0,1)
    
    . generate y = 2 - 0.5*x + u
    
    . regress y x
    
          Source |       SS           df       MS      Number of obs   =       100
    -------------+----------------------------------   F(1, 98)        =     63.70
           Model |  57.7903868         1  57.7903868   Prob > F        =    0.0000
        Residual |  88.9131984        98  .907277534   R-squared       =    0.3939
    -------------+----------------------------------   Adj R-squared   =    0.3877
           Total |  146.703585        99   1.4818544   Root MSE        =    .95251
    
    ------------------------------------------------------------------------------
               y |      Coef.   Std. Err.      t    P>|t|     [95% Conf. Interval]
    -------------+----------------------------------------------------------------
               x |  -.4156218   .0520764    -7.98   0.000    -.5189657    -.312278
           _cons |   1.884016   .1086711    17.34   0.000     1.668362     2.09967
    ------------------------------------------------------------------------------
    
    . generate z = 2 - 0.5*x
    
    . regress z x
    
          Source |       SS           df       MS      Number of obs   =       100
    -------------+----------------------------------   F(1, 98)        =         .
           Model |  83.6370917         1  83.6370917   Prob > F        =         .
        Residual |           0        98           0   R-squared       =    1.0000
    -------------+----------------------------------   Adj R-squared   =    1.0000
           Total |  83.6370917        99  .844819108   Root MSE        =         0
    
    ------------------------------------------------------------------------------
               z |      Coef.   Std. Err.      t    P>|t|     [95% Conf. Interval]
    -------------+----------------------------------------------------------------
               x |        -.5          .        .       .            .           .
           _cons |          2          .        .       .            .           .
    ------------------------------------------------------------------------------
    
    .
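
    Note that in the first regression above the 95% confidence intervals contain the true values (-0.5 for x and 2 for _cons). If you want to see that the estimator is not biased, one option is to repeat the exercise many times and look at the average of the slope estimates. The following is only a sketch: the program name olsdraw, the seed 12345, and the choice of 1000 replications are my own illustrative assumptions, not part of your assignment.

    ```stata
    * Hypothetical helper program: draws one sample of 100 observations
    * and returns the estimated slope from regressing y on x
    capture program drop olsdraw
    program define olsdraw, rclass
        drop _all
        quietly set obs 100
        generate x = rnormal(1,2)
        generate u = rnormal(0,1)
        generate y = 2 - 0.5*x + u
        quietly regress y x
        return scalar b = _b[x]
    end

    * Repeat the draw 1000 times and collect the slope estimates in variable b
    set seed 12345
    simulate b = r(b), reps(1000) nodots: olsdraw

    * The mean of b should be close to -0.5; its standard deviation shows
    * how much a single estimate varies from sample to sample
    summarize b
    ```

    Individual estimates will scatter around -0.5, but their average should sit very close to it, and the standard deviation of b should be of roughly the same size as the Std. Err. that regress reported for x.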


    • #3
      Thank you so much!! Now this makes much more sense. I really appreciate your help.
