Fixed-Effects (FE) Panel Regression Model with 'reg' 'reghdfe'

Jason Rhee

Join Date: Feb 2025

Posts: 6
#1

21 Feb 2025, 23:05

For my assignment, the sample dataset has the structure i (user: 1-500) - j (platform = 0/1) - t (time: 1-24), with 23,450 observations (so it is not perfectly balanced).
I wanted to estimate the impact using a fixed-effects (FE) panel regression model.

If I try to use 'xtreg', I get the error "repeated time values within panel" when I run 'xtset user time', because each of users 1-500 has rows for both platform = 0 and platform = 1. I could run 'egen panel_id = group(user platform)' and then 'xtset panel_id time', but I do not think that method is right either.
So I used 'reg dependent independent i.user i.time, robust' and 'reghdfe dependent independent, absorb(user time) vce(cluster user)'.
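A minimal sketch of how the xtset error described above could be diagnosed and worked around, assuming the variables are literally named user, platform, time, dependent, and independent as in this post. Note the assumption built into it: declaring user-platform pairs as panels absorbs user-by-platform fixed effects, which is a different model from absorbing user effects alone, so the coefficient need not match the 'reg' and 'reghdfe' results.

```stata
* Sketch only: variable names are taken from the post, not from a real dataset.

* Show how many (user, time) pairs are repeated; these repeats are what
* triggers "repeated time values within panel" after -xtset user time-.
duplicates report user time

* Check that user-platform-time uniquely identifies each row
* (this command exits with an error if it does not).
isid user platform time

* Treat each user-platform pair as its own panel.
egen panel_id = group(user platform)
xtset panel_id time

* Within estimator with time dummies; clustering on user is allowed here
* because every panel_id is nested within a single user.
xtreg dependent independent i.time, fe vce(cluster user)
```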

However, I received a comment that the panel data setup is wrong and that I should come up with a method that is properly estimated using panel data.
Even when I ask ChatGPT, it says: "reghdfe is indeed a fixed effects estimator, just implemented in a more flexible and efficient way."

Am I missing something here?

Last edited by Jason Rhee; 21 Feb 2025, 23:07.
Tags: fixed effects, panel data, reghdfe, regression, xtreg
Carlo Lazzaro

Join Date: Apr 2014

Posts: 17614
#2

22 Feb 2025, 02:07

Jason;
welcome to this forum.
As far as I can see, what you're missing is https://www.statalist.org/forums/help#adviceextras #4.
That said, just challenge yourself a bit more with -xtreg- and related stuff.
Then come back to the list with what you typed and what Stata gave you back.

Kind regards,
Carlo
(StataNow 18.5)
Jason Rhee

Join Date: Feb 2025

Posts: 6
#3

22 Feb 2025, 10:21

Carlo:
I assumed there would not be any issue since it was not part of the grading and was already completed—I was just personally curious. That said, thank you for your comment!
George Ford

Join Date: Aug 2014

Posts: 3081
#4

22 Feb 2025, 14:31

You can use reghdfe for pooled data. As far as the comment about the panel data setup being wrong, it's not a panel at all.
Jason Rhee

Join Date: Feb 2025

Posts: 6
#5

22 Feb 2025, 20:04

Originally posted by George Ford

You can use reghdfe for pooled data. As far as the comment about the panel data setup being wrong, it's not a panel at all.

When you say "it is not a panel at all", do you mean because the number of observations is not 500 x 24 x 2 = 24,000? Since almost all 500 users have a row at each of the 24 time points for both j = 0 and j = 1, I thought it was panel data, just unbalanced.
Carlo Lazzaro

Join Date: Apr 2014

Posts: 17614
#6

23 Feb 2025, 03:37

Jason:
the (hopefully) useful advice was to read the FAQ before posting.
I would still be interested in what you typed and what Stata gave you back.

Kind regards,
Carlo
(StataNow 18.5)
George Ford

Join Date: Aug 2014

Posts: 3081
#7

23 Feb 2025, 08:47

I saw it was unbalanced and xtset wouldn't work, so I concluded it was a pool.

You are stacking 2 separate panels of the same users, each with 24 periods?
Jason Rhee

Join Date: Feb 2025

Posts: 6
#8

24 Feb 2025, 14:53

Originally posted by George Ford

I saw it was unbalanced and xtset wouldn't work, so I concluded it was a pool.

You are stacking 2 separate panels of the same users, each with 24 periods?

That is correct. I tried another panel dataset at the i (user) - t (time) level. When I run the following Stata code:

reg dependent independent i.user i.time, robust
xtset user time
xtreg dependent independent i.time, fe
reghdfe dependent independent, absorb(user time) vce(cluster user)

I get the same result (coefficient) from all three estimators. So, going back to the original dataset with i (user: 1-500) - j (platform = 0/1) - t (time: 1-24), where I stacked 2 separate panels of the same users, each with 24 periods, isn't it okay to run the code:

reg dependent independent i.user i.time, robust
reghdfe dependent independent, absorb(user time) vce(cluster user)

and claim that I "estimate the impact using Fixed-Effects (FE) panel regression model"? I think that I did come up with a method that is properly estimated using panel data.
Jason Rhee

Join Date: Feb 2025

Posts: 6
#9

24 Feb 2025, 14:56

Originally posted by Carlo Lazzaro

Jason;
welcome to this forum.
As far as I can see, what you're missing is https://www.statalist.org/forums/help#adviceextras #4.
That said, just challenge yourself a bit more with -xtreg- and related stuff.
Then come back to the list with what you typed and what Stata gave you back.

I tried another panel dataset at the i (user) - t (time) level. When I run the following Stata code:

reg dependent independent i.user i.time, robust
xtset user time
xtreg dependent independent i.time, fe
reghdfe dependent independent, absorb(user time) vce(cluster user)

I get the same result (coefficient) from all three estimators. So, going back to the original dataset with i (user: 1-500) - j (platform = 0/1) - t (time: 1-24), where I stacked 2 separate panels of the same users, each with 24 periods, isn't it okay to run the code:

reg dependent independent i.user i.time, robust
reghdfe dependent independent, absorb(user time) vce(cluster user)

and claim that I "estimate the impact using Fixed-Effects (FE) panel regression model"? I think that I did come up with a method that is properly estimated using panel data. I do not think fixed-effects panel data regression can only be done with xtset and xtreg. Am I missing something here?
Andrew Musau

Join Date: Oct 2014

Posts: 9969
#10

24 Feb 2025, 15:35

Originally posted by Jason Rhee

I assumed there would not be any issue since it was not part of the grading and was already completed—I was just personally curious.

Doesn't the instructor provide guidance on completed assignments? It is your right to request it, given that he or she is paid to do so.
George Ford

Join Date: Aug 2014

Posts: 3081
#11

24 Feb 2025, 15:59

I'm confused about stacking two panels of the same users. What difference does platform imply? I suspect you may want to know what that difference is. In that case, you'd need to estimate the coefficients separately for each platform. If you think the coefficients are the same, then absorb platform too in reghdfe.
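A rough sketch of the two options in this post, assuming the variable names used earlier in the thread (dependent, independent, user, time, platform) and that the user-written reghdfe is installed (for example via ssc install reghdfe). It is an illustration of the idea, not a recommendation of a specification.

```stata
* Sketch only: variable names follow the thread.

* Option 1: separate coefficients, one regression per platform.
reghdfe dependent independent if platform == 0, absorb(user time) vce(cluster user)
reghdfe dependent independent if platform == 1, absorb(user time) vce(cluster user)

* Option 2: a common coefficient across platforms, with platform absorbed
* as an additional fixed effect alongside user and time.
reghdfe dependent independent, absorb(user time platform) vce(cluster user)
```

Comparing the two coefficients from Option 1 with the single one from Option 2 is one informal way to see whether pooling the platforms looks reasonable.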

Carlo Lazzaro

Join Date: Apr 2014
Posts: 17614

#12

25 Feb 2025, 02:33

Jason:
thanks for clarifying a bit more what you are after.
Some comments follow:
1) -reg dependent independent i.user i.time, robust- does not take within-panel autocorrelation of the epsilon into account. In fact, -robust- in -regress- accounts for heteroskedasticity only. You should impose -vce(cluster panelid)- standard errors instead. Conversely, in -xtreg- both -robust- and -vce(cluster panelid)- call the cluster-robust standard errors (put differently, they do the very same job).
That said, assuming you want to go -fe- with -regress-, your code should have been:

Code:

reg dependent independent i.user i.time, vce(cluster user)

You will get the same coefficients that you got with your code and the correct standard errors (assuming that you have at least 30 panels, and not 3 as in the following toy-example).

2) if you code the same with -xtreg,fe-, you get:

Code:

. xtreg ln_wage i.year if idcode<=3, fe vce(cluster idcode)

Fixed-effects (within) regression               Number of obs     =         39
Group variable: idcode                          Number of groups  =          3

R-squared:                                      Obs per group:
     Within  = 0.5446                                         min =         12
     Between = 0.2670                                         avg =       13.0
     Overall = 0.3678                                         max =         15

                                                F(3, 2)           =          .
corr(u_i, Xb) = -0.0356                         Prob > F          =          .

                                 (Std. err. adjusted for 3 clusters in idcode)
------------------------------------------------------------------------------
             |               Robust
     ln_wage | Coefficient  std. err.      t    P>|t|     [95% conf. interval]
-------------+----------------------------------------------------------------
        year |
         69  |    .208967   3.41e-08  6.1e+06   0.000     .2089668    .2089671
         70  |  -.2747772   .2552143    -1.08   0.394    -1.372876    .8233215
         71  |  -.3613911   .3640359    -0.99   0.425    -1.927711    1.204929
         72  |  -.2056973   .1967664    -1.05   0.406    -1.052315      .64092
         73  |  -.0310461   .0967648    -0.32   0.779    -.4473915    .3852993
         75  |   .0416271   .1575174     0.26   0.816    -.6361157      .71937
         77  |   .0358937   .1303686     0.28   0.809    -.5250371    .5968246
         78  |   .2433199   .1906609     1.28   0.330    -.5770276    1.063667
         80  |   .2726139   .2105344     1.29   0.325    -.6332423     1.17847
         82  |   .1747839   .0767088     2.28   0.150    -.1552673    .5048351
         83  |   .2924489    .129739     2.25   0.153    -.2657727    .8506706
         85  |   .3712589   .1848931     2.01   0.182    -.4242719     1.16679
         87  |   .2960361   .2044639     1.45   0.285    -.5837012    1.175773
         88  |   .3038639   .1462331     2.08   0.173    -.3253264    .9330542
             |
       _cons |   1.659677   .0055719   297.86   0.000     1.635703    1.683651
-------------+----------------------------------------------------------------
     sigma_u |  .24956596
     sigma_e |  .27711004
         rho |  .44784468   (fraction of variance due to u_i)
------------------------------------------------------------------------------

.

Therefore, if your statement is that you can get the same regression coefficient for the -fe- estimator using different Stata commands, you're right.
However:
1) the standard error calculations differ;
2) -xtreg,fe- gives you back more information than -regress- with -i.panelid- and -i.timevar- does.
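For anyone who wants to check the equivalence on a dataset with a single panel identifier, here is a minimal sketch. It assumes the toy example above comes from Stata's nlswork example data (the variable names ln_wage, idcode, and year match it), and it adds tenure as an illustrative regressor and a 100-person subsample purely to keep the dummy-variable regression quick. Both choices are mine, not from the posts above, and reghdfe is user-written and must be installed separately.

```stata
* Sketch only: dataset, regressor, and subsample are illustrative assumptions.
webuse nlswork, clear
keep if idcode <= 100            // arbitrary subsample so -regress- with dummies stays fast
xtset idcode year

* Least-squares dummy variables with cluster-robust standard errors.
regress ln_wage tenure i.idcode i.year, vce(cluster idcode)

* Within estimator; also reports sigma_u, sigma_e, rho and the three R-squareds.
xtreg ln_wage tenure i.year, fe vce(cluster idcode)

* Same two-way fixed effects absorbed by reghdfe.
reghdfe ln_wage tenure, absorb(idcode year) vce(cluster idcode)
```

The coefficient on tenure should agree across the three commands, while the clustered standard errors can differ slightly because the commands do not all apply the same degrees-of-freedom adjustment. reghdfe also drops singleton groups by default, so its reported number of observations may be smaller.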

Kind regards,
Carlo
(StataNow 18.5)


Jason Rhee

Join Date: Feb 2025
Posts: 6

#13

01 Mar 2025, 01:50

Originally posted by Carlo Lazzaro

Code:

reg dependent independent i.user i.time, vce(cluster user)

Code:

. xtreg ln_wage i.year if idcode<=3, fe vce(cluster idcode)

Fixed-effects (within) regression               Number of obs     =         39
Group variable: idcode                          Number of groups  =          3

R-squared:                                      Obs per group:
     Within  = 0.5446                                         min =         12
     Between = 0.2670                                         avg =       13.0
     Overall = 0.3678                                         max =         15

                                                F(3, 2)           =          .
corr(u_i, Xb) = -0.0356                         Prob > F          =          .

                                 (Std. err. adjusted for 3 clusters in idcode)
------------------------------------------------------------------------------
             |               Robust
     ln_wage | Coefficient  std. err.      t    P>|t|     [95% conf. interval]
-------------+----------------------------------------------------------------
        year |
         69  |    .208967   3.41e-08  6.1e+06   0.000     .2089668    .2089671
         70  |  -.2747772   .2552143    -1.08   0.394    -1.372876    .8233215
         71  |  -.3613911   .3640359    -0.99   0.425    -1.927711    1.204929
         72  |  -.2056973   .1967664    -1.05   0.406    -1.052315      .64092
         73  |  -.0310461   .0967648    -0.32   0.779    -.4473915    .3852993
         75  |   .0416271   .1575174     0.26   0.816    -.6361157      .71937
         77  |   .0358937   .1303686     0.28   0.809    -.5250371    .5968246
         78  |   .2433199   .1906609     1.28   0.330    -.5770276    1.063667
         80  |   .2726139   .2105344     1.29   0.325    -.6332423     1.17847
         82  |   .1747839   .0767088     2.28   0.150    -.1552673    .5048351
         83  |   .2924489    .129739     2.25   0.153    -.2657727    .8506706
         85  |   .3712589   .1848931     2.01   0.182    -.4242719     1.16679
         87  |   .2960361   .2044639     1.45   0.285    -.5837012    1.175773
         88  |   .3038639   .1462331     2.08   0.173    -.3253264    .9330542
             |
       _cons |   1.659677   .0055719   297.86   0.000     1.635703    1.683651
-------------+----------------------------------------------------------------
     sigma_u |  .24956596
     sigma_e |  .27711004
         rho |  .44784468   (fraction of variance due to u_i)
------------------------------------------------------------------------------

.

Thank you so much!!! This helped me understand it much better. Again, thank you!!!
