Suest after estimation with too many FE

Julia Cage

Join Date: Jan 2017

Posts: 8
#1

Suest after estimation with too many FE

10 Jul 2018, 08:33

Hi,
I need to estimate the following two models:

xi : reg ln_nb_facebook_share doc_rank doc_reactivity doc_originality doc_size i.id_news i.doc_date i.event_id if isolation==1, cluster(event_id)
estimates store iequalone

xi : reg ln_nb_facebook_share doc_rank doc_reactivity doc_originality doc_size i.id_news i.doc_date i.event_id if isolation==0, cluster(event_id)
estimates store iequalzero

suest iequalone iequalzero
test [iequalone_mean]doc_originality-[iequalzero_mean]doc_originality = 0

The issue is that I have 25,000 different values for event_id and so cannot perform the estimation because of the matsize limit (11,000).

Usually I use areg to estimate this model but suest does not work after areg.

Could you please let me know which alternative specifications I should use?
Thanks!
Clyde Schechter

Join Date: Apr 2014

Posts: 30100
#2

10 Jul 2018, 09:24

I don't think you can get around this. But it appears that your overall purpose here is to contrast the effect of doc_originality on ln_nb_facebook_share between the isolation = 0 and isolation = 1 groups. This can be done without -suest- by using interaction terms.

Code:

xtset event_id
xtreg ln_nb_facebook_share i.isolation##(doc_rank doc_reactivity doc_originality ///
    doc_size i.id_news i.doc_date), fe vce(cluster event_id)

The coefficient of 1.isolation#doc_originality and the associated statistical test will be the answer to your question.
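To see why the interaction coefficient answers the question, here is a minimal sketch on Stata's shipped auto dataset rather than the poster's data. It uses plain -regress- with one continuous predictor; the variable choices (price, mpg, foreign) are purely illustrative assumptions. In this simple fully interacted setting, the interaction coefficient equals the difference between the two group-specific slopes.

```stata
sysuse auto, clear

* Separate regressions by group, saving the mpg slope from each
regress price mpg if foreign == 0
scalar b0 = _b[mpg]
regress price mpg if foreign == 1
scalar b1 = _b[mpg]

* Fully interacted model on the pooled sample
* (c. marks mpg as continuous inside the interaction)
regress price i.foreign##c.mpg

* The two displayed numbers should agree
display b1 - b0
display _b[1.foreign#c.mpg]
```

The point estimates agree, but the standard errors need not match those from the separate regressions, because the pooled model estimates a single residual variance.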

Note, by the way, that -xi:- is largely obsolete at this point, having been replaced by factor variable notation. Read -help fvvarlist- for more information about this. There are a few situations where -xi:- is still needed, but they are mostly in connection with old commands that have been superseded by newer commands that subsume the older ones' functionality. So stick -xi- into a dusty corner of your mind and, more or less, forget you ever knew it.
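As a rough illustration of the difference, the sketch below fits the same model both ways on the shipped auto dataset (chosen only for illustration; it is not the poster's data). With -xi:-, Stata adds _I* indicator variables to the dataset; with factor-variable notation, nothing is added.

```stata
sysuse auto, clear

* Old style: xi expands i.rep78 into _Irep78_* dummies stored in the dataset
xi: regress price mpg i.rep78
describe _I*
drop _I*

* Factor-variable notation: same model, no new variables created
regress price mpg i.rep78
```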
Julia Cage

Join Date: Jan 2017

Posts: 8
#3

10 Jul 2018, 10:15

Thanks!
Julia Cage

Join Date: Jan 2017

Posts: 8
#4

11 Jul 2018, 02:41

I have an additional question: I followed your comment regarding the factor variable notation, but then got the following error message

doc_reactivity: factor variables may not contain noninteger values

The majority of my variables of interest do contain noninteger values. Does it imply that I cannot use the factor variable notation and need to keep using -xi-?

Thanks!

Carlo Lazzaro

Join Date: Apr 2014

Posts: 17709
#5

11 Jul 2018, 03:04

Julia:
the simple fix is to prefix the offending variable with -c.-, as the following toy example shows:

Code:

. use "http://www.stata-press.com/data/r15/nlswork.dta"
. xtreg ln_wage i.race##(ttl_exp), fe
ttl_exp:  factor variables may not contain noninteger values
r(452);


. xtreg ln_wage i.race##(c.ttl_exp), fe
note: 2.race omitted because of collinearity
note: 3.race omitted because of collinearity

Fixed-effects (within) regression               Number of obs     =     28,534
Group variable: idcode                          Number of groups  =      4,711

R-sq:                                           Obs per group:
     within  = 0.1359                                         min =          1
     between = 0.2629                                         avg =        6.1
     overall = 0.1798                                         max =         15

                                                F(3,23820)        =    1248.30
corr(u_i, Xb)  = 0.1708                         Prob > F          =     0.0000

--------------------------------------------------------------------------------
       ln_wage |      Coef.   Std. Err.      t    P>|t|     [95% Conf. Interval]
---------------+----------------------------------------------------------------
          race |
        black  |          0  (omitted)
        other  |          0  (omitted)
               |
       ttl_exp |   .0315865   .0006039    52.30   0.000     .0304028    .0327702
               |
race#c.ttl_exp |
        black  |  -.0019977   .0011202    -1.78   0.075    -.0041933    .0001979
        other  |  -.0066778   .0049487    -1.35   0.177    -.0163777     .003022
               |
         _cons |   1.482451   .0036062   411.08   0.000     1.475382    1.489519
---------------+----------------------------------------------------------------
       sigma_u |  .37852369
       sigma_e |  .29775494
           rho |   .6177516   (fraction of variance due to u_i)
--------------------------------------------------------------------------------
F test that all u_i=0: F(4710, 23820) = 7.61                 Prob > F = 0.0000

Last edited by Carlo Lazzaro; 11 Jul 2018, 03:07.

Kind regards,
Carlo
(Stata 19.0)


Julia Cage

Join Date: Jan 2017

Posts: 8
#6

11 Jul 2018, 07:19

Thanks Carlo.
Clyde Schechter

Join Date: Apr 2014

Posts: 30100
#7

11 Jul 2018, 14:54

Yes, sorry. That was my mistake. I mistakenly assumed that doc_originality was a categorical variable. But it is evidently a continuous variable, as, I now see on a more careful reading, are doc_rank and doc_reactivity. With factor-variable notation, in an interaction (and only in an interaction) Stata assumes by default that variables are categorical unless prefixed with a c. (for continuous). (Outside of interactions, the default is the opposite: they are continuous unless prefixed with i.)

The simplest fix is to put a c. in front of (doc_rank doc_reactivity doc_originality) as a whole: Stata will "distribute" the c. to each of them automatically.
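For reference, a sketch of what the corrected command might look like, with the c. prefix written out on each continuous variable rather than distributed. It assumes doc_size is also continuous, which the thread does not state; if it is categorical, it would take i. instead.

```stata
* Sketch only. Assumption: doc_size is continuous as well (not stated in the thread).
xtset event_id
xtreg ln_nb_facebook_share ///
    i.isolation##(c.doc_rank c.doc_reactivity c.doc_originality c.doc_size ///
    i.id_news i.doc_date), fe vce(cluster event_id)

* Test whether the doc_originality slope differs between isolation groups
test 1.isolation#c.doc_originality
```

The -test- line merely repeats the t test reported for 1.isolation#c.doc_originality in the regression table; it is included so that the quantity of interest is explicit.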