
  • Accessing individual fixed effects with xtreg?

    Consider the panel fixed effects model
    $$ y_{it} = x_{it}b + v_{i} + e_{it} $$

    Using "xtreg, fe" the reported intercept is the average value of the fixed effects v_i under the constraint c1:
    $$ \sum_{i=1}^{N} \sum_{t=1}^{T_i} v_{i} = 0 \;\; (c1) $$
    See: https://www.stata.com/support/faqs/s...effects-model/


    However I would like to obtain each of the individual fixed effects v_i, not just the average.

    Is there a way to access the v_i's using xtreg?

    If not, how can I use a dummy variable fixed effects approach with the standard "regress" command so that I get the same v_i's that xtreg implicitly computes under the constraint c1?

    thanks for any advice

  • #2
    However I would like to obtain each of the individual fixed effects v_i, not just the average.
    So given the model

    $$ y_{it} = a + x_{it}b + \eta_{i} + e_{it} \;\; (i= 1, \cdots, N;\; t=1, \cdots, T)$$

    where \(\eta_{i}\) are the individual effects, Bill Gould has explained in the link that you provide that, in the absence of constraints, the parameters \(a\) and \(\eta_{i}\) do not have a unique solution. So taking into account the constraint that Stata places on the system (also illustrated in the link), you can predict the individual effects using

    Code:
    predict uhat, u
    Alternatively, you can predict the fixed effects residuals and average them within each individual in the panel

    Code:
    predict res, r
    bys id: egen uhat2= mean(res)
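    To see why the averaging works, here is a short sketch. It assumes that `predict res, r` returns the combined residual \(\hat{u}_{i} + \hat{e}_{it}\), which is what the matching columns in the listing further down suggest. The within residuals \(\hat{e}_{it}\) average to zero inside each panel by construction, so the panel mean leaves only the individual effect:

    $$ \frac{1}{T_i}\sum_{t=1}^{T_i} \left(\hat{u}_{i} + \hat{e}_{it}\right) = \hat{u}_{i} + \frac{1}{T_i}\sum_{t=1}^{T_i}\hat{e}_{it} = \hat{u}_{i} $$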
    If not, how can I use a dummy variable fixed effects approach with the standard "regress" command so that I get the same v_i's that xtreg implicitly computes under the constraint c1
    From the Frisch-Waugh theorem, we know that the within estimator is equivalent to including a set of \(N-1\) individual dummy variables in the regression (Least Squares Dummy Variables or LSDV). Provided that the panel is balanced (i.e., the same number of observations for each individual with no holes), the residuals from LSDV are exactly the xtreg residuals expressed as deviations from their individual means. See the example below

    Code:
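    webuse grunfeld, clear
    * Within (fixed-effects) estimator
    quietly xtreg invest mvalue kstock, fe
    predict uhat, u          // estimated individual effects
    predict res_fe, r        // residuals that still contain the individual effect
    bys company: egen uhat2= mean(res_fe)   // company means of res_fe reproduce uhat
    * LSDV: OLS with company dummies
    quietly reg invest mvalue kstock i.company
    predict res_lsdv, r
    *SHOW THAT res_lsdv =  res_fe - mean(res_fe)
    gen resfe_dev =  res_fe - uhat2
    list uhat uhat2 res_fe res_lsdv resfe_dev if time<4, sepby(company)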
    Code:
    . list uhat uhat2 res_fe res_lsdv resfe_dev if time<4, sepby(company)
    
         +-----------------------------------------------------------+
         |      uhat       uhat2      res_fe    res_lsdv   resfe_dev |
         |-----------------------------------------------------------|
      1. | -11.55275   -11.55275    36.45965     48.0124     48.0124 |
      2. | -11.55275   -11.55275   -79.12964   -67.57689   -67.57689 |
      3. | -11.55275   -11.55275   -172.5532   -161.0005   -161.0005 |
         |-----------------------------------------------------------|
     21. |  160.6498    160.6497    101.9297   -58.72001      -58.72 |
     22. |  160.6498    160.6497    199.3809    38.73115    38.73116 |
     23. |  160.6498    160.6497    197.3009    36.65113    36.65114 |
         |-----------------------------------------------------------|
     41. | -176.8279   -176.8279   -67.39137    109.4365    109.4365 |
     42. | -176.8279   -176.8279   -150.6144    26.21345    26.21346 |
     43. | -176.8279   -176.8279   -209.3538   -32.52593   -32.52592 |
         |-----------------------------------------------------------|
     61. |  30.93464    30.93464    49.80156    18.86692    18.86692 |
     62. |  30.93464    30.93464    36.07955    5.144909    5.144909 |
     63. |  30.93464    30.93464    16.90624    -14.0284    -14.0284 |
         |-----------------------------------------------------------|
     81. | -55.87288   -55.87288    24.25344    80.12632    80.12632 |
     82. | -55.87288   -55.87288    27.73082     83.6037     83.6037 |
     83. | -55.87288   -55.87288    38.56563    94.43851    94.43851 |
         |-----------------------------------------------------------|
    101. |  35.58264    35.58264    55.39412    19.81148    19.81148 |
    102. |  35.58264    35.58264    56.66586    21.08323    21.08323 |
    103. |  35.58264    35.58264     51.5265    15.94386    15.94386 |
         |-----------------------------------------------------------|
    121. | -7.809542   -7.809542     36.9083    44.71784    44.71784 |
    122. | -7.809542   -7.809542    21.15999    28.96953    28.96953 |
    123. | -7.809542   -7.809542    24.23362    32.04316    32.04316 |
         |-----------------------------------------------------------|
    141. |  1.198279    1.198279    50.02711    48.82883    48.82883 |
    142. |  1.198279    1.198279      27.572    26.37372    26.37372 |
    143. |  1.198279    1.198279     11.2192    10.02092    10.02092 |
         |-----------------------------------------------------------|
    161. | -28.47834   -28.47834    3.141369    31.61971    31.61971 |
    162. | -28.47834   -28.47834   -3.874476    24.60386    24.60386 |
    163. | -28.47834   -28.47834   -4.239498    24.23884    24.23884 |
         |-----------------------------------------------------------|
    181. |  52.17609    52.17609    52.07976   -.0963299   -.0963287 |
    182. |  52.17609    52.17609    49.59924   -2.576852   -2.576851 |
    183. |  52.17609    52.17609    50.46476   -1.711331    -1.71133 |
         +-----------------------------------------------------------+
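    To address the second question more directly, one possible sketch (not part of the original example) fits the LSDV regression with a full set of company dummies and no constant, so that each dummy coefficient estimates \(a + \eta_{i}\). Subtracting the xtreg intercept should then give back the effects that `predict, u` reports. The names a_fe, uhat_x, uhat_lsdv and diff are hypothetical and chosen only for this illustration.

    ```stata
    webuse grunfeld, clear
    * Within estimator: keep the intercept and the predicted effects
    quietly xtreg invest mvalue kstock, fe
    scalar a_fe = _b[_cons]
    predict double uhat_x, u
    * LSDV with all company dummies and no constant
    quietly regress invest mvalue kstock ibn.company, noconstant
    generate double uhat_lsdv = .
    levelsof company, local(ids)
    foreach i of local ids {
        * dummy coefficient minus the xtreg intercept
        quietly replace uhat_lsdv = _b[`i'.company] - a_fe if company == `i'
    }
    * largest gap between the two sets of effects (expected to be rounding error only)
    generate double diff = abs(uhat_x - uhat_lsdv)
    summarize diff
    ```

    If the maximum of diff is essentially zero, the two approaches agree; the same subtraction should also work in an unbalanced panel, since the xtreg intercept is built from the overall sample means.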
    Last edited by Andrew Musau; 10 Apr 2018, 03:47.



    • #3
      Cool, this is great, thanks!

      I have another question, though: what has happened to the constant a from the model that you posted?

      What is the expected value of a given the constraint?

      Is there a way to retrieve the actual value of a using postestimation?




      For example, to estimate the impact on fixed effects of a firm characteristic k, I am thinking of using two distinct regressions, a baseline (without characteristic k) and the baseline with characteristic k:

      y_it = (a + v_i) + x_it b + e_it // baseline without k

      y_itk = (a_k + v_ik) + x_itk b + k_itk b_k + e_itk // baseline plus k

      where a and a_k are constants, and v_i and v_ik are the fixed effects. I would then measure impact of characteristic k as the difference between the intercepts:

      Impact_k = (a + v_i) - (a_k + v_ik)

      Does this make sense? Can I assume the constants a and a_k cancel out? Or can I retrieve a and a_k using postestimation and use them in the Impact_k equation?



      • #4
        <repost second part of previous msg with better formatting >



        For example, to estimate the impact on fixed effects of a firm characteristic k, I am thinking of using two distinct regressions, a baseline (without characteristic k) and the baseline with characteristic k:

        y_it = (a + v_i) + x_it b + e_it // baseline without k

        y_itk = (a_k + v_ik) + x_itk b + k_itk b_k + e_itk // baseline plus k

        where a and a_k are constants, and v_i and v_ik are the fixed effects. I would then measure impact of characteristic k as the difference between the intercepts:

        Impact_k = (a + v_i) - (a_k + v_ik)


        Does this make sense? Can I assume the constants a and a_k cancel out? Or can I retrieve a and a_k using postestimation and use them in the Impact_k equation?



        • #5
          It should be added that in a typical large-N, small-T setup the individual fixed effects cannot be estimated consistently and any attempt to do so is essentially meaningless. Estimates of individual fixed effects are not reliable and should not be used for statistical comparisons.
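          A rough sketch of why, assuming the effects are computed as \(\hat{u}_{i} = \bar{y}_{i} - \hat{a} - \bar{x}_{i}\hat{b}\) and that the errors \(e_{it}\) are i.i.d. with variance \(\sigma_e^2\) (an assumption made only for this illustration):

          $$ \hat{u}_{i} - u_{i} = \bar{e}_{i} - (\hat{a} - a) - \bar{x}_{i}(\hat{b} - b), \qquad \mathrm{Var}(\bar{e}_{i}) = \frac{\sigma_e^2}{T} $$

          Increasing \(N\) can shrink the last two terms, but the first depends only on \(T\), so with small \(T\) the noise in each estimated effect does not vanish.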
          https://twitter.com/Kripfganz



          • #6
            Originally posted by Sebastian Kripfganz
            It should be added that in a typical large-N, small-T setup the individual fixed effects cannot be estimated consistently and any attempt to do so is essentially meaningless. Estimates of individual fixed effects are not reliable and should not be used for statistical comparisons.
            OK, what about comparing average fixed effects in a panel regression with, say, 1000 individuals over 250 months?

            I would like to say something like, "the average fixed effects for the baseline model are x points greater than the average fixed effects for the baseline model plus characteristic k, therefore characteristic k has an impact on average fixed effects."



            • #7
              Are you referring to adding a variable to your model? The constant and estimates of the individual fixed effects are meaningless. What you are interested in are the estimates of the time varying regressors and you should concentrate on how these change across different model specifications.



              • #8
                Thanks for the replies - this is all very interesting.

                I understand that individual fixed effects are meaningless, but the reason I am interested in how individual fixed effects are estimated is so I can correctly estimate the average fixed effects. Average fixed effects have been used comparatively in some top journals (eg Journal of Financial Economics).

                By the way, I notice that the intercept reported by xtreg does not seem to correspond with the average fixed effects estimated using the "predict" command - for example, if I execute this code:

                Code:
                webuse grunfeld, clear
                quietly xtreg invest mvalue kstock, fe
                gen intercept=_b[_cons] // average fixed effects from xtreg
                predict uhat, u
                predict res_fe, r
                bys company: egen uhat2= mean(res_fe)
                egen meanuhat = mean(uhat) // average fixed effects using predict uhat, u
                egen meanuhat2 = mean(res_fe) // average fixed effects using predict res_fe, r
                bys company: gen first=1 if _n==1
                egen meanuhat3 = mean(uhat) if first==1
                
                list uhat uhat2 res_fe intercept meanuhat meanuhat2 meanuhat3 if time<4, sepby(company)
                Then I get the output below where it is clear that the reported intercept is not the same as the mean of the predicted fixed effects.

                So maybe the intercept reported by xtreg is mean(a + v_i), while the predicted fixed effects are just the v_i (under the constraint used by xtreg described in the previously linked article)?


                Code:
                     +---------------------------------------------------------------------------------+
                     |      uhat       uhat2      res_fe   intercept   meanuhat   meanuhat2   meanuh~3 |
                     |---------------------------------------------------------------------------------|
                  1. | -11.55275   -11.55275    36.45965   -58.74393   6.32e-07   -7.42e-08   6.32e-07 |
                  2. | -11.55275   -11.55275   -79.12964   -58.74393   6.32e-07   -7.42e-08          . |
                  3. | -11.55275   -11.55275   -172.5532   -58.74393   6.32e-07   -7.42e-08          . |
                     |---------------------------------------------------------------------------------|
                 21. |  160.6498    160.6497    101.9297   -58.74393   6.32e-07   -7.42e-08   6.32e-07 |
                 22. |  160.6498    160.6497    199.3809   -58.74393   6.32e-07   -7.42e-08          . |
                 23. |  160.6498    160.6497    197.3009   -58.74393   6.32e-07   -7.42e-08          . |
                     |---------------------------------------------------------------------------------|
                 41. | -176.8279   -176.8279   -67.39137   -58.74393   6.32e-07   -7.42e-08   6.32e-07 |
                 42. | -176.8279   -176.8279   -150.6144   -58.74393   6.32e-07   -7.42e-08          . |
                 43. | -176.8279   -176.8279   -209.3538   -58.74393   6.32e-07   -7.42e-08          . |
                     |---------------------------------------------------------------------------------|
                 61. |  30.93464    30.93464    49.80156   -58.74393   6.32e-07   -7.42e-08   6.32e-07 |
                 62. |  30.93464    30.93464    36.07955   -58.74393   6.32e-07   -7.42e-08          . |
                 63. |  30.93464    30.93464    16.90624   -58.74393   6.32e-07   -7.42e-08          . |
                     |---------------------------------------------------------------------------------|
                 81. | -55.87288   -55.87288    24.25344   -58.74393   6.32e-07   -7.42e-08   6.32e-07 |
                 82. | -55.87288   -55.87288    27.73082   -58.74393   6.32e-07   -7.42e-08          . |
                 83. | -55.87288   -55.87288    38.56563   -58.74393   6.32e-07   -7.42e-08          . |
                     |---------------------------------------------------------------------------------|
                101. |  35.58264    35.58264    55.39412   -58.74393   6.32e-07   -7.42e-08   6.32e-07 |
                102. |  35.58264    35.58264    56.66586   -58.74393   6.32e-07   -7.42e-08          . |
                103. |  35.58264    35.58264     51.5265   -58.74393   6.32e-07   -7.42e-08          . |
                     |---------------------------------------------------------------------------------|
                121. | -7.809542   -7.809542     36.9083   -58.74393   6.32e-07   -7.42e-08   6.32e-07 |
                122. | -7.809542   -7.809542    21.15999   -58.74393   6.32e-07   -7.42e-08          . |
                123. | -7.809542   -7.809542    24.23362   -58.74393   6.32e-07   -7.42e-08          . |
                     |---------------------------------------------------------------------------------|
                141. |  1.198279    1.198279    50.02711   -58.74393   6.32e-07   -7.42e-08   6.32e-07 |
                142. |  1.198279    1.198279      27.572   -58.74393   6.32e-07   -7.42e-08          . |
                143. |  1.198279    1.198279     11.2192   -58.74393   6.32e-07   -7.42e-08          . |
                     |---------------------------------------------------------------------------------|
                161. | -28.47834   -28.47834    3.141369   -58.74393   6.32e-07   -7.42e-08   6.32e-07 |
                162. | -28.47834   -28.47834   -3.874476   -58.74393   6.32e-07   -7.42e-08          . |
                163. | -28.47834   -28.47834   -4.239498   -58.74393   6.32e-07   -7.42e-08          . |
                     |---------------------------------------------------------------------------------|
                181. |  52.17609    52.17609    52.07976   -58.74393   6.32e-07   -7.42e-08   6.32e-07 |
                182. |  52.17609    52.17609    49.59924   -58.74393   6.32e-07   -7.42e-08          . |
                183. |  52.17609    52.17609    50.46476   -58.74393   6.32e-07   -7.42e-08          . |
                     +---------------------------------------------------------------------------------+



                • #9
                  I understand that individual fixed effects are meaningless, but the reason I am interested in how individual fixed effects are estimated is so I can correctly estimate the average fixed effects. Average fixed effects have been used comparatively in some top journals (eg Journal of Financial Economics)
                  Fine, you should consult someone who is an expert in that field.

                  Then I get the output below where it is clear that the reported intercept is not the same as the mean of the predicted fixed effects.

                  So maybe the intercept reported by xtreg is mean(a + v_i), while the predicted fixed effects are just the v_i (under the constraint used by xtreg described in the previously linked article)?
                  Again, the FAQ linked in #1 explains that the constraint xtreg, fe places on the system is that the individual effects sum to 0 across all observations. The implication is that the average of the fitted values equals the average of the outcome. So from the estimates \(\hat{a}\) and \(\hat{b}\) in the equation in #2, the estimates of the individual effects are obtained as

                  $$u_{i} = \bar{y}_{i}- \hat{a} - \bar{x}_{i}\hat{b}$$

                  So once you have estimates of the individual effects and know that they average to zero over all observations, \(\bar{u} = 0\), you can back out the estimated intercept from the overall sample means

                  Code:
                  webuse grunfeld
                  xtreg invest mvalue kstock, fe

                  * overall sample means of the outcome and the regressors
                  local vars invest mvalue kstock
                  foreach var in `vars'{
                  egen m`var'= mean(`var')
                  }

                  * intercept = mean(y) - mean(x)*b
                  gen intercept= minvest- (_b[mvalue]*mmvalue + _b[kstock]*mkstock)
                  sum intercept

                  Code:
                    _cons |  -58.74393   12.45369    -4.72   0.000    -83.31086     -34.177
                  _________________________________________________________
                  
                  . sum intercept
                  
                      Variable |        Obs        Mean    Std. Dev.       Min        Max
                  -------------+---------------------------------------------------------
                     intercept |        200   -58.74393           0  -58.74393  -58.74393
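                  The algebra behind this check can be sketched as follows, writing \(n\) for the total number of observations (notation introduced here). Summing the expression for \(u_{i}\) over all observations and setting the sum to zero gives

                  $$ \sum_{i}\sum_{t} u_{i} = \sum_{i}\sum_{t} \left(\bar{y}_{i} - \hat{a} - \bar{x}_{i}\hat{b}\right) = n\bar{y} - n\hat{a} - n\bar{x}\hat{b} = 0 \;\;\Rightarrow\;\; \hat{a} = \bar{y} - \bar{x}\hat{b} $$

                  where \(\bar{y}\) and \(\bar{x}\) are the overall sample means, which is what the egen means in the code compute.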
                  Last edited by Andrew Musau; 11 Apr 2018, 12:38.



                  • #10
                    Very clear thanks!

                    There is just one thing I am still troubled by, which is the relationship between the intercept and the mean of the individual fixed effects, where individual fixed effects are estimated using predict as described in post #2. The FAQ says the reported intercept is the average value of the fixed effects, yet the mean of the individual fixed effects (estimated using predict) is not the same as the reported intercept (see post #8). I guess there is some relationship but I don't see it. Would it be possible to clarify this?



                    • #11
                      I will answer my own question: the average of the individual fixed effects (estimated using predict) is 0 by construction.

                      The reported intercept then includes the average "true" fixed effects.
                      Last edited by Maurice McCourt; 12 Apr 2018, 00:29.



                      • #12

                        There is just one thing I am still troubled by, which is the relationship between the intercept and the mean of the individual fixed effects, where individual fixed effects are estimated using predict as described in post #2. The FAQ says the reported intercept is the average value of the fixed effects, yet the mean of the individual fixed effects (estimated using predict) is not the same as the reported intercept (see post #8). I guess there is some relationship but I don't see it. Would it be possible to clarify this?
                        This implied separability of the "intercept" and the fixed effects is artificial: it results from the constraint that you place prior to estimation. It is very important that you understand this point, otherwise you might start assigning meaning to these estimates. The chosen constraint is arbitrary, and if I impose a different constraint, I will get a new set of values for the intercept and the fixed effects. So without separating these, predict with option xbu gives you the prediction including the individual effect, i.e., \(a +x_{it}b + u_{i}\). If you define \(\gamma_{i} = a + u_{i}\), then the average value of \(\gamma_{i}\) over all observations is what xtreg, fe reports as the intercept. See below


                        Code:
                        webuse grunfeld
                        xtreg invest mvalue kstock, fe
                        predict xbu, xbu   // a + x_it*b + u_i
                        gen xb= _b[mvalue]*mvalue + _b[kstock]*kstock   // x_it*b without the constant
                        gen gamma = xbu-xb   // a + u_i
                        sum gamma


                        Code:
                        . xtreg invest mvalue kstock, fe
                        
                        Fixed-effects (within) regression               Number of obs     =        200
                        Group variable: company                         Number of groups  =         10
                        
                        R-sq:                                           Obs per group:
                             within  = 0.7668                                         min =         20
                             between = 0.8194                                         avg =       20.0
                             overall = 0.8060                                         max =         20
                        
                                                                        F(2,188)          =     309.01
                        corr(u_i, Xb)  = -0.1517                        Prob > F          =     0.0000
                        
                        ------------------------------------------------------------------------------
                              invest |      Coef.   Std. Err.      t    P>|t|     [95% Conf. Interval]
                        -------------+----------------------------------------------------------------
                              mvalue |   .1101238   .0118567     9.29   0.000     .0867345    .1335131
                              kstock |   .3100653   .0173545    17.87   0.000     .2758308    .3442999
                               _cons |  -58.74393   12.45369    -4.72   0.000    -83.31086     -34.177
                        -------------+----------------------------------------------------------------
                             sigma_u |  85.732501
                             sigma_e |  52.767964
                                 rho |  .72525012   (fraction of variance due to u_i)
                        ------------------------------------------------------------------------------
                        F test that all u_i=0: F(9, 188) = 49.18                     Prob > F = 0.0000
                        
                        . 
                        . predict xbu, xbu
                        
                        . 
                        . gen xb= _b[mvalue]*mvalue + _b[kstock]*kstock
                        
                        . 
                        . gen gamma = xbu-xb
                        
                        . 
                        . sum gamma
                        
                            Variable |        Obs        Mean    Std. Dev.       Min        Max
                        -------------+---------------------------------------------------------
                               gamma |        200   -58.74393    81.53709  -235.5718   101.9059
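                        To illustrate that the split depends on the constraint, here is a sketch (the names u_fe, gamma_fe, xb_all, gamma_lsdv and gap are made up for this example) that re-estimates the model with regress and company dummies, where the effect of the base company is set to zero instead. The reported constant changes, but the combined quantity \(a + u_{i}\) should be the same for every observation.

                        ```stata
                        webuse grunfeld, clear
                        * xtreg constraint: effects sum to zero over all observations
                        quietly xtreg invest mvalue kstock, fe
                        predict double u_fe, u
                        generate double gamma_fe = _b[_cons] + u_fe
                        display "xtreg constant:   " _b[_cons]
                        * regress constraint: effect of the base company is zero
                        quietly regress invest mvalue kstock i.company
                        display "regress constant: " _b[_cons]
                        predict double xb_all, xb
                        * strip the slope part, leaving constant plus company dummy
                        generate double gamma_lsdv = xb_all - _b[mvalue]*mvalue - _b[kstock]*kstock
                        * the two versions should differ only by rounding error
                        generate double gap = abs(gamma_fe - gamma_lsdv)
                        summarize gap
                        ```

                        The two displayed constants differ, while gap should be essentially zero throughout, which is the sense in which only the sum of intercept and effect is pinned down by the data.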



                        • #13
                          Excellent explanations and examples, many thanks.

