
  • Signs of coefficients change after standardising regressors in a two-way fixed effects model

    Dear all

    I am studying a model of individual productivity spillovers in football. I have a panel data set of individual player performance attributes and team attributes across seasons.
    I use a two-way fixed effects model, controlling for individual fixed effects and team-by-season fixed effects. The dependent variable is a player performance index for player i, and the independent variable of interest is a metric of the average productivity of his teammates (let's call it avgprod).

    In one specification of the model I run: reghdfe individualperformance avgprod c.avgprod#qualitydummy, absorb(id team#season) where qualitydummy is a categorical variable indicating whether a player falls in the 25th-50th percentile, the 51st-75th percentile, or above the 75th percentile of players (players below the 25th percentile are therefore the base category, to avoid the dummy variable trap).

    As the unit of avgprod is unknown, I chose to standardise it.
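    Roughly, the standardisation looks like this (just a sketch; std_avgprod is a placeholder name for the standardised variable):

    Code:
    * standardise avgprod to mean 0 and standard deviation 1
    sum avgprod
    gen std_avgprod = (avgprod - r(mean))/r(sd)

    * same specification with the standardised regressor
    reghdfe individualperformance std_avgprod c.std_avgprod#qualitydummy, absorb(id team#season)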

    However, the coefficients on all three of the c.avgprod#qualitydummy interaction terms changed sign and became negative when I did the standardisation. Previously, they were all positive.

    Is there any reason this might happen? Standardisation should naturally change the size of the coefficients, but I don't understand how it can change their signs. My best guess is that it has something to do with the fact that the "constant" in the two-way fixed effects model changes dramatically in scale, but I hope someone can offer a clearer explanation of this behaviour and of how to handle the interpretation in this case.

    Thank you.

    Best
    Sam

  • #2
    Hi Sam,

    Have you figured this problem out? I started using reghdfe recently and have been very happy with it, but this behavior would be really concerning, so I'd be very interested to learn more.

    Here is some small code in which reghdfe does exactly what you'd expect (the coefficient on mpg is multiplied by the standard deviation of mpg, and the t-statistic, p-value, and other coefficients are unchanged).

    Code:
    clear
    sysuse auto

    * drop missing rep78 so the absorbed factor is defined for every observation
    drop if rep78 == .

    reghdfe price mpg gear_ratio, absorb(foreign rep78)

    * standardise mpg to mean 0 and standard deviation 1
    sum mpg
    gen normmpg = (mpg - r(mean))/r(sd)

    reghdfe price normmpg gear_ratio, absorb(foreign rep78)

    * coefficient on normmpg equals the coefficient on mpg times sd(mpg)
    di 170.0455*5.866408
    Is it possible that you "reverse standardized," for instance by doing:
    Code:
    gen avgprod = (r(mean) - origavgprod)/r(sd)
    instead of:
    Code:
    gen avgprod = (origavgprod - r(mean))/r(sd)
    ?

    -Mitch
    Mitch Downey, Grad student, UCSD Economics



    • #3
      I don't have an answer to your question. However, you might try creating the interaction variables separately and running them instead of using factor-variable notation (a sketch follows below). It might show you something. It might also have something to do with reghdfe. You might try xtreg, or even reg with all the dummy variables, and see what happens.
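      For instance, something along these lines (just a sketch; q2, q3, and q4 are made-up names for the three quality-group dummies):

      Code:
      * build the interaction terms by hand instead of with factor-variable notation
      gen inter_q2 = avgprod * q2
      gen inter_q3 = avgprod * q3
      gen inter_q4 = avgprod * q4

      reghdfe individualperformance avgprod inter_q2 inter_q3 inter_q4, absorb(id team#season)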



      • #4
        Hi Mitch,

        What was going on in Sam's case is not related to reghdfe but is a more general point about linear regressions: if you standardize a variable (e.g. gen std_weight = (weight - r(mean))/r(sd)) and then interact it with # (e.g. foreign#c.std_weight), then you are not really standardizing the regressors of the model, so the t-stats will change (see the small illustration below).
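        To see the point in the auto data (a quick sketch; mix and std_raw are made-up names): the regressor that foreign#c.std_weight actually puts in the model is foreign*std_weight, which is not the standardized version of the regressor foreign*weight.

        Code:
        sysuse auto, clear

        sum weight
        gen std_weight = (weight - r(mean))/r(sd)

        * this is the regressor that foreign#c.std_weight creates
        gen mix = foreign * std_weight

        * this would be the genuinely standardized interaction regressor
        gen raw = foreign * weight
        sum raw
        gen std_raw = (raw - r(mean))/r(sd)

        * different variables, so the standardized model is not just a rescaling
        sum mix std_raw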

        Cheers,
        Sergio



        • #5
          Wow. To get an understanding of this, I had to work through Sergio's example, which I've posted below. Using the auto data, I regress mpg on foreign##c.weight, and then on foreign##c.std_weight. The two regressions have the same overall statistics, as you'd expect, and the t statistics for the continuous terms weight and foreign#c.weight are unchanged by standardization. But the t statistics for the constant and for foreign change substantially.

          I convinced myself this made sense by writing down the first regression equation in terms of weight, substituting std_weight*r(sd) + r(mean) for weight (inverting the formula for std_weight in terms of weight), and working through the details. While the slope coefficients (on weight and on foreign#c.weight) change only by a multiplicative factor, the constant and the foreign coefficient also change by additive factors. A small numerical check after the output below confirms these relationships.
          Code:
          . clear
          
          . sysuse auto
          (1978 Automobile Data)
          
          . 
          . drop if rep78 == .
          (5 observations deleted)
          
          . 
          . regress mpg foreign##c.weight
          
                Source |       SS       df       MS              Number of obs =      69
          -------------+------------------------------           F(  3,    65) =   45.90
                 Model |  1589.81532     3  529.938439           Prob > F      =  0.0000
              Residual |  750.387582    65  11.5444243           R-squared     =  0.6793
          -------------+------------------------------           Adj R-squared =  0.6646
                 Total |   2340.2029    68  34.4147485           Root MSE      =  3.3977
          
          ----------------------------------------------------------------------------------
                       mpg |      Coef.   Std. Err.      t    P>|t|     [95% Conf. Interval]
          -----------------+----------------------------------------------------------------
                   foreign |
                  Foreign  |   9.862811   5.376593     1.83   0.071    -.8749873    20.60061
                    weight |  -.0059986   .0007203    -8.33   0.000    -.0074373     -.00456
                           |
          foreign#c.weight |
                  Foreign  |  -.0047484   .0022042    -2.15   0.035    -.0091505   -.0003463
                           |
                     _cons |   39.74704   2.475435    16.06   0.000     34.80326    44.69083
          ----------------------------------------------------------------------------------
          
          . 
          . sum weight
          
              Variable |       Obs        Mean    Std. Dev.       Min        Max
          -------------+--------------------------------------------------------
                weight |        69    3032.029    792.8515       1760       4840
          
          . gen std_weight = (weight - r(mean))/r(sd)
          
          . 
          . regress mpg foreign##c.std_weight
          
                Source |       SS       df       MS              Number of obs =      69
          -------------+------------------------------           F(  3,    65) =   45.90
                 Model |   1589.8153     3  529.938434           Prob > F      =  0.0000
              Residual |  750.387595    65  11.5444245           R-squared     =  0.6793
          -------------+------------------------------           Adj R-squared =  0.6646
                 Total |   2340.2029    68  34.4147485           Root MSE      =  3.3977
          
          --------------------------------------------------------------------------------------
                           mpg |      Coef.   Std. Err.      t    P>|t|     [95% Conf. Interval]
          ---------------------+----------------------------------------------------------------
                       foreign |
                      Foreign  |  -4.534522   1.847473    -2.45   0.017     -8.22418   -.8448642
                    std_weight |  -4.756021   .5711285    -8.33   0.000    -5.896643   -3.615398
                               |
          foreign#c.std_weight |
                      Foreign  |  -3.764788   1.747599    -2.15   0.035    -7.254984   -.2745917
                               |
                         _cons |   21.55903   .5469888    39.41   0.000     20.46662    22.65144
          --------------------------------------------------------------------------------------
          
          .
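          Concretely, substituting weight = std_weight*r(sd) + r(mean) into the first equation implies that the std_weight and foreign#c.std_weight coefficients are just the original slopes times sd(weight), while the constant and the foreign coefficient each pick up an additive shift of (the corresponding slope times mean(weight)). A quick check with the numbers above (mean 3032.029 and SD 792.8515 of weight; small rounding differences aside):

          Code:
          * slope coefficients are simply rescaled by sd(weight)
          di -.0059986*792.8515
          di -.0047484*792.8515

          * the constant and the foreign coefficient also shift by slope*mean(weight)
          di 39.74704 - .0059986*3032.029
          di 9.862811 - .0047484*3032.029

          These reproduce -4.756, -3.765, 21.559, and -4.535 from the second regression, which is how the foreign coefficient can even change sign.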
