  • Modelling Fixed Effects for Sports Events

    I'm trying to control for both round and skater fixed effects in panel data.
    Data: I've attached a picture of the data I'm using below, where rnd is the specific round in a given competition in a given year; skater is the individual who took part in that round; ntn is the country they're from; hctry indicates whether the round took place in their home country; cmptrtj indicates whether a judge from the same country is on the judging panel; and tseg is the score they received in that round.

    I'm trying to see whether having a compatriot judge has an impact on tseg. The model that I'm basing this on used round fixed effects and competitor fixed effects.
    To do this I used:
    - reg tseg hctry cmptrtj i.nskater i.nrnd (I encoded both rnd and skater, as they're string variables)
    However, I think it might be better to use panel fixed effects with
    - xtset nrnd
    - xtreg tseg cmptrtj hctry i.nskater, fe
    Doing this gives the same results as the previous regression. However, I'm not sure whether I'm doing either correctly, so I would appreciate some guidance.
    [Attached image: Screenshot 2022-03-24 at 10.42.50.png]

  • #2
    Anna:
    I fear both of your specifications are wrong, for different reasons:
    1) pooled OLS (BTW: this should not be your first choice when dealing with panel datasets): you did not cluster the standard errors on skaters (your -panelid-); without clustering, you assume that observations belonging to the same panel are independent (which is not true, as you have panel data);
    2) you seem to have -xtset- your dataset with the -timevar- (nrnd) instead of the -panelid-.
    Kind regards,
    Carlo
    (StataNow 18.5)
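
    A minimal sketch of the two points above, run on simulated toy data rather than the poster's dataset. The data-generating lines and the seed are assumptions made only so that the example runs on its own; the variable names mirror those used in the thread. It declares the skater as the panel identifier, enters round fixed effects as dummies, clusters on skater, and then compares the result with pooled OLS that includes explicit skater dummies:

```stata
* Toy data (assumption): 20 skaters observed in 10 rounds each
clear
set seed 12345
set obs 200
generate nskater = ceil(_n/10)
bysort nskater: generate nrnd = _n
generate cmptrtj = runiform() < 0.3
generate hctry   = runiform() < 0.1
generate tseg = 50 + 2*cmptrtj + hctry + nskater/5 + nrnd/3 + rnormal()

* The panel identifier is the skater; the round is the second dimension
xtset nskater nrnd

* Skater fixed effects via -fe-, round fixed effects via dummies
xtreg tseg cmptrtj hctry i.nrnd, fe vce(cluster nskater)
scalar b_fe = _b[cmptrtj]

* Pooled OLS with explicit skater dummies: same point estimates
regress tseg cmptrtj hctry i.nskater i.nrnd, vce(cluster nskater)
scalar b_ols = _b[cmptrtj]
display "xtreg, fe: " b_fe "   regress with dummies: " b_ols
```

    The two point estimates for cmptrtj should agree up to rounding, while the clustered standard errors can differ slightly because the two commands apply different degrees-of-freedom adjustments.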



    • #3
      Carlo,

      Thank you for your response :D
      I've repeated the first method using: reg tseg cmptrtj hctry i.nrnd, vce(cl nskater)
      which has provided me with a coefficient for cmptrtj that is much more in line with what is to be expected.
      For the second, I realised that I had indeed used the wrong variable; however, I tried again with the correct timevar and the results are still not as expected. Furthermore, I think I might be going about controlling for round fixed effects the wrong way.
      I have attached a picture of my two attempts along with the outcomes below, and any comment would be much appreciated.

      If it's of interest, I'm trying to measure compatriot bias in figure skating scoring, both before and after a change in the judging system, using the regression S_cr = b*j_cr + a_c + n_r + e_cr
      S_cr is the score (tseg) of skater c in round r; b is the judging bias; j_cr is a dummy variable equal to 1 if a compatriot judge from skater c's country is present in round r; a_c is the skater fixed effect; n_r is the round fixed effect; and e_cr is the error term.
      So far I'm having trouble modelling both the round and skater fixed effects, as my data set contains several years of rounds both before and after the change. I'm not the best with Stata, so any input on how to model them would also be appreciated.
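
      One way the before/after comparison described above could be set up is sketched here on simulated toy data, not the poster's dataset. The indicator post (1 for rounds after the judging change) and the interaction cmp_post are hypothetical names that do not appear in the original data, and the data-generating lines are assumptions made only so that the example runs on its own:

```stata
* Toy data (assumption): 20 skaters x 10 rounds; rounds 6-10 are "after the change"
clear
set seed 2022
set obs 200
generate nskater = ceil(_n/10)
bysort nskater: generate nrnd = _n
generate post = nrnd > 5              // hypothetical indicator, not in the original data
generate cmptrtj = runiform() < 0.3
generate tseg = 50 + 3*cmptrtj - 2*cmptrtj*post + nskater/5 + nrnd/3 + rnormal()

xtset nskater nrnd

* post is constant within a round, so the round dummies absorb its main effect;
* only its interaction with the compatriot-judge dummy is added
generate cmp_post = cmptrtj*post
xtreg tseg cmptrtj cmp_post i.nrnd, fe vce(cluster nskater)
```

      In this sketch the coefficient on cmptrtj is the compatriot effect before the change, and the coefficient on cmp_post is how much that effect shifts afterwards.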


      • #4
        Anna:
        1) please do not post screenshots (they're impossible to elaborate on and terrible to read); use CODE delimiters instead (as per the FAQ). Thanks.
        2) the results of your regressions are inconsistent. You should have obtained the very same point estimates for the coefficients shared by the two regressions (please see the toy example below):
        Code:
        . use "https://www.stata-press.com/data/r17/nlswork.dta"
        (National Longitudinal Survey of Young Women, 14-24 years old in 1968)
        
        . reg ln_wage i.idcode i.year c.age##c.age if idcode<=3, vce(cluster idcode)
        
        Linear regression                               Number of obs     =         39
                                                        F(2, 2)           =          .
                                                        Prob > F          =          .
                                                        R-squared         =     0.8139
                                                        Root MSE          =     .21943
        
                                         (Std. err. adjusted for 3 clusters in idcode)
        ------------------------------------------------------------------------------
                     |               Robust
             ln_wage | Coefficient  std. err.      t    P>|t|     [95% conf. interval]
        -------------+----------------------------------------------------------------
              idcode |
                  2  |  -.4183815   .0165036   -25.35   0.002    -.4893909   -.3473721
                  3  |   .6579353   .7215294     0.91   0.458    -2.446555    3.762426
                     |
                year |
                 69  |   .3367906   .0914392     3.68   0.066    -.0566406    .7302218
                 70  |   .2089384   .2867011     0.73   0.542    -1.024637    1.442514
                 71  |   .3144116   .1619035     1.94   0.192     -.382203    1.011026
                 72  |   .5888124   .4958888     1.19   0.357    -1.544825     2.72245
                 73  |   .8912873   .5219448     1.71   0.230     -1.35446    3.137034
                 75  |   1.246958   .6073839     2.05   0.176    -1.366404     3.86032
                 77  |   1.560689   .8626802     1.81   0.212    -2.151125    5.272502
                 78  |   1.941522   1.278416     1.52   0.268    -3.559059    7.442103
                 80  |    2.34498   1.525965     1.54   0.264    -4.220718    8.910678
                 82  |   2.698954   1.663018     1.62   0.246    -4.456435    9.854344
                 83  |   2.994437    1.81452     1.65   0.241    -4.812813    10.80169
                 85  |   3.538578   2.210833     1.60   0.251    -5.973868    13.05102
                 87  |   3.965153   2.460506     1.61   0.248    -6.621548    14.55185
                 88  |    4.40786   2.688929     1.64   0.243    -7.161667    15.97739
                     |
                 age |   .0773019   .0106911     7.23   0.019     .0313017    .1233021
                     |
         c.age#c.age |  -.0045583    .002264    -2.01   0.182    -.0142995    .0051828
                     |
               _cons |   1.341224   .1489003     9.01   0.012     .7005575     1.98189
        ------------------------------------------------------------------------------
        
        . xtreg ln_wage i.year c.age##c.age if idcode<=3, fe vce(cluster idcode)
        
        Fixed-effects (within) regression               Number of obs     =         39
        Group variable: idcode                          Number of groups  =          3
        
        R-squared:                                      Obs per group:
             Within  = 0.7404                                         min =         12
             Between = 0.4068                                         avg =       13.0
             Overall = 0.4014                                         max =         15
        
                                                        F(4,2)            =          .
        corr(u_i, Xb) = -0.8560                         Prob > F          =          .
        
                                         (Std. err. adjusted for 3 clusters in idcode)
        ------------------------------------------------------------------------------
                     |               Robust
             ln_wage | Coefficient  std. err.      t    P>|t|     [95% conf. interval]
        -------------+----------------------------------------------------------------
                year |
                 69  |   .3367906   .0871839     3.86   0.061    -.0383313    .7119126
                 70  |   .2089384   .2733588     0.76   0.525    -.9672295    1.385106
                 71  |   .3144116   .1543689     2.04   0.179    -.3497843    .9786076
                 72  |   .5888124   .4728115     1.25   0.339    -1.445531    2.623156
                 73  |   .8912873   .4976548     1.79   0.215    -1.249948    3.032523
                 75  |   1.246958   .5791178     2.15   0.164    -1.244785    3.738701
                 77  |   1.560689   .8225333     1.90   0.198    -1.978387    5.099764
                 78  |   1.941522   1.218922     1.59   0.252    -3.303077    7.186121
                 80  |    2.34498   1.454951     1.61   0.248    -3.915167    8.605128
                 82  |   2.698954   1.585626     1.70   0.231    -4.123442     9.52135
                 83  |   2.994437   1.730077     1.73   0.226    -4.449484    10.43836
                 85  |   3.538578   2.107946     1.68   0.235    -5.531183    12.60834
                 87  |   3.965153      2.346     1.69   0.233     -6.12887    14.05918
                 88  |    4.40786   2.563793     1.72   0.228    -6.623251    15.43897
                     |
                 age |   .0773019   .0101936     7.58   0.017     .0334424    .1211613
                     |
         c.age#c.age |  -.0045583   .0021586    -2.11   0.169    -.0138461    .0047294
                     |
               _cons |   1.465543   .3990418     3.67   0.067    -.2513952    3.182481
        -------------+----------------------------------------------------------------
             sigma_u |  .54258328
             sigma_e |  .21942548
                 rho |  .85944136   (fraction of variance due to u_i)
        ------------------------------------------------------------------------------
        
        .
        3) with 160 panels, cluster-robust standard errors are mandatory for -xtreg, fe-, too;
        4) the within R-squared of your -xtreg- is dramatically low. I would check whether misspecification is actually an issue. Elaborating on my previous example (-xtreg, fe- code):
        Code:
        . predict fitted, xb
        (24 missing values generated)
        
        . g sq_fitted=fitted^2
        (24 missing values generated)
        
        . xtreg ln_wage   fitted sq_fitted if idcode<=3, fe vce(cluster idcode)
        
        Fixed-effects (within) regression               Number of obs     =         39
        Group variable: idcode                          Number of groups  =          3
        
        R-squared:                                      Obs per group:
             Within  = 0.7466                                         min =         12
             Between = 0.4010                                         avg =       13.0
             Overall = 0.4359                                         max =         15
        
                                                        F(2,2)            =       5.56
        corr(u_i, Xb) = -0.8142                         Prob > F          =     0.1524
        
                                         (Std. err. adjusted for 3 clusters in idcode)
        ------------------------------------------------------------------------------
                     |               Robust
             ln_wage | Coefficient  std. err.      t    P>|t|     [95% conf. interval]
        -------------+----------------------------------------------------------------
              fitted |   .3239846   .8507574     0.38   0.740    -3.336529    3.984498
           sq_fitted |   .1650838   .2584668     0.64   0.588     -.947009    1.277177
               _cons |   .6150055   .6950832     0.88   0.470    -2.375696    3.605707
        -------------+----------------------------------------------------------------
             sigma_u |  .46679608
             sigma_e |  .16629261
                 rho |  .88738332   (fraction of variance due to u_i)
        ------------------------------------------------------------------------------
        
        .
        As the coefficient on -sq_fitted- does not reach statistical significance, there is no evidence that the functional form of the regression is misspecified.
        Kind regards,
        Carlo
        (StataNow 18.5)
