New on SSC: -aextlogit- Average elasticities for fixed effects logit

Chris McDonald

Join Date: Oct 2021

Posts: 10
#46

28 Oct 2021, 04:31

Thanks @Joao Santos Silva,

That's a great suggestion.

Thanks and kind regards,
Chris
Comment
Chris McDonald

Join Date: Oct 2021

Posts: 10
#47

31 Oct 2021, 22:11

Dear @Joao Santos Silva,

I ran aextlogit with firm FEs and included 400 dummies for funds and 30 dummies for years, as follows:

quietly tabulate (year), gen (year)
quietly tabulate (fund_id),gen (fund)

xtset firm_id

quietly aextlogit Binary_LHS control_variables year* fund*, nolog
esttab, drop (year* fund*)

Stata has been running this command for half a day without producing any results yet. Is there any way to speed up the process? In addition, could a robust standard errors option be included in aextlogit?

Many thanks,
Chris
Comment
Joao Santos Silva

Join Date: Apr 2014

Posts: 2996
#48

01 Nov 2021, 01:41

Dear Chris McDonald,

Estimation of a fixed effects logit with 30 time periods is always going to take a very long time; if you add over 400 dummies, it will take a very, very long time. If you have another computer, you may want to try a standard logit with dummies for firms, funds, and time, but it will also take a very long time.

The current version of the command supports clustered robust standard errors; please check the help file.
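
A minimal sketch of the two points above, using the union panel available through webuse as a stand-in because the fund data are not available; the variable list is illustrative only. Factor-variable notation (i.year) replaces hand-made dummies, and vce(cluster ...) follows the syntax that appears later in this thread; the aextlogit help file remains the reference for what an installed version supports. Note also that wildcards such as year* and fund* match the original year and fund_id variables as well as the generated dummies.

```stata
* One-time install from SSC (uncomment if aextlogit is not yet installed)
* ssc install aextlogit

* Stand-in data: union membership panel available through webuse
webuse union, clear
xtset idcode

* i.year lets Stata build the year dummies internally, so no
* tabulate, gen() step is needed and no wildcard can pick up
* unintended variables
aextlogit union grade not_smsa south i.year, nolog

* Same model with standard errors clustered on the panel identifier
* (assumes a version of aextlogit and Stata that allows this)
aextlogit union grade not_smsa south i.year, nolog vce(cluster idcode)
```

This does not remove the computational cost described above; it only keeps the specification tidy and shows where the clustering option goes.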

Best wishes,

Joao
Comment
Chris McDonald

Join Date: Oct 2021

Posts: 10
#49

01 Nov 2021, 04:03

Dear @Joao Santos Silva,

Thanks for your explanation.

Kind regards,
Chris
Comment
Chris McDonald

Join Date: Oct 2021

Posts: 10
#50

11 Nov 2021, 05:44

Dear @Joao Santos Silva,

I have pooled data with 2500 funds, 2200 firms, and 30 time periods. Each fund can hold shares in many of the 2200 firms in the sample, and in multiple time periods. Likewise, each firm can attract many funds from the pool of 2500 funds in multiple time periods. I would like to use reghdfe to run OLS regressions with fixed effects and clustered standard errors. An option I am looking at is as follows:

reghdfe LHS RHS, absorb(fund_id firm_id time) vce(cluster fund_id firm_id time)

This regression controls for fund, firm, and time fixed effects. Do you think it would be too extreme to use three-way clustering by fund, firm, and time after already including fixed effects for these three dimensions? Could you please suggest an appropriate level of clustering for the standard errors?

Thanks and kind regards,
Chris
Comment
Joao Santos Silva

Join Date: Apr 2014

Posts: 2996
#51

11 Nov 2021, 05:55

Dear Chris McDonald,

This is a totally different topic; please start a new thread so that all can contribute.

Best wishes,

Joao
Comment
Chris McDonald

Join Date: Oct 2021

Posts: 10
#52

11 Nov 2021, 18:22

Thanks @Joao Santos Silva,

If you have any suggestions for me, please find my post following the link below:

https://www.statalist.org/forums/for...tandard-errors

Many thanks,
Chris
Comment
Isaac Seo

Join Date: May 2022

Posts: 1
#53

17 May 2022, 20:31

Originally posted by Joao Santos Silva

I am afraid that is not the interpretation of a semi-elasticity. The right interpretation is that when age increases one year, on average the probability of being unionised goes up by 5.5%.

Best wishes,

Joao

Dear Joao, how should I interpret the coefficient for the interaction term? For example, coef = .0205552 below. Thanks!

south#c.year |
          1  |   .0205552   .0064763     3.17   0.002      .007862    .0332484
Comment
Joao Santos Silva

Join Date: Apr 2014

Posts: 2996
#54

17 May 2022, 23:36

It is the usual interpretation of a semi-elasticity.
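
As a worked reading of that answer (a sketch, not a statement from the help file): a semi-elasticity multiplied by 100 is an approximate percentage change in the probability for a one-unit increase in the regressor. Applied to the interaction entry quoted above, it would be the additional percentage change per year for observations with south equal to 1, over and above the entry for year itself. The scalar name s_int is made up for the illustration.

```stata
* Interaction entry copied from the output quoted in #53
scalar s_int = .0205552

* Approximate percentage change in Pr(y=1) per one-unit change
display %5.2f 100*scalar(s_int)
```

This prints 2.06, that is, roughly an extra 2.06% per year.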

Best wishes,

Joao
Comment

Ashi Choi

Join Date: Nov 2023
Posts: 1

#55

19 Nov 2023, 04:53

Dear Joao Santos Silva ,

Your aextlogit command is very helpful. However, I have two questions. Thanks in advance for your help!

1). How should I interpret a semi-elasticity greater than 1?
2). In my case, I have more than 2000 alternatives. The semi-elasticities are extremely similar to the original coefficients (betas). Why is that the case? Please see below for my Stata output.

Code:

. aextlogit chosen home school distance vehicle, b nolog

Conditional fixed-effects logistic regression   Number of obs      =   2197596
Group variable: sampno                          Number of groups   =       998
                                                Obs per group: min =      2202
                                                               avg =    2202.0
Log likelihood  = -4418.7312                                   max =      2202
------------------------------------------------------------------------------
      chosen |      Coef.   Std. Err.      z    P>|z|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
        home |   1.998844    .103529    19.31   0.000      1.79593    2.201757
      school |    4.66244    .126472    36.87   0.000      4.41456    4.910321
    distance |  -2.208651   .0699369   -31.58   0.000    -2.345724   -2.071577
     vehicle |    .171495   .0721387     2.38   0.017     .0301058    .3128842
------------------------------------------------------------------------------

                   Average (semi) elasticities of Pr(y=1|x,u)
------------------------------------------------------------------------------
      chosen |      Coef.   Std. Err.      z    P>|z|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
        home |   1.997936    .103482    19.31   0.000     1.795115    2.200757
      school |   4.660323   .1264146    36.87   0.000     4.412555    4.908091
    distance |  -2.207648   .0699052   -31.58   0.000    -2.344659   -2.070636
     vehicle |   .1714171   .0721059     2.38   0.017     .0300921    .3127421
------------------------------------------------------------------------------
Average of chosen = .00045413 (Number of obs = 2197596)

Last edited by Ashi Choi; 19 Nov 2023, 04:57.

Comment

Joao Santos Silva

Join Date: Apr 2014

Posts: 2996
#56

20 Nov 2023, 23:11

Dear Ashi Choi

As far as I understand you are estimating a conditional logit model, and the method implemented in this command only works for the binary case with fixed effects.
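
On the arithmetic behind the second question, the numbers in the output above are consistent with each average semi-elasticity being the coefficient multiplied by one minus the sample mean of the outcome, which is what the logit derivative d ln Λ(z)/dz = 1 - Λ(z) would give. That relation is inferred here from the posted figures rather than quoted from the help file, and whether it is meaningful for a choice model with 2,202 alternatives is a separate matter, as noted above. With a mean of .00045413 the scaling factor is almost exactly 1, so the two panels nearly coincide. The scalar names are made up for the illustration.

```stata
* Values copied from the output posted in #55
* ybar is the reported "Average of chosen"
scalar ybar     = .00045413
scalar b_home   = 1.998844
scalar b_school = 4.66244

* Assumed relation: semi-elasticity = coefficient * (1 - mean of outcome)
scalar semi_home   = scalar(b_home)*(1 - scalar(ybar))
scalar semi_school = scalar(b_school)*(1 - scalar(ybar))

display %9.6f scalar(semi_home)
display %9.6f scalar(semi_school)
```

The two lines print 1.997936 and 4.660323, matching the entries for home and school in the second panel of the output.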

Best wishes,

Joao
Comment

Giovanna Ortolani

Join Date: Jun 2023
Posts: 10

#57

30 May 2024, 03:25

Dear Joao Santos Silva,
I am running a series of logit models (pooled logit, panel logit FE, panel logit RE; as discussed in this post) to estimate the probability of transitioning to retirement conditional on one's probability of falling into poverty, among other things. Running -margins- after -xtlogit, fe- does not produce results for categorical variables, and Erik Ruzek proposed using -aextlogit- instead (see post). I understand that the command provides semi-elasticities (eydx) and not marginal effects (dydx), but I am surprised by how much the results changed.
More specifically, the semi-elasticity of risk of poverty is -0.57, while the marginal effect obtained from -xtlogit, fe vce(bootstrap)- is -0.018 and that from -xtlogit, re vce(cl mergeid)- is -0.013 (not reported below). Could you help me interpret the massive difference in these results, please? I am considering presenting all the results in the version of the paper to be submitted to a journal, and I had not expected such a divergence.
Thank you in advance!

Code:

. qui xtlogit trans $cov_pov_risk2 i.wave i.country if insample==1, fe vce(bootstrap) 

. margins, dydx($cov_pov_risk2) post

Average marginal effects                                 Number of obs = 9,474
Model VCE: Bootstrap

Expression: Pr(trans|fixed effect is 0), predict(pu0)
dy/dx wrt:  pov_risk_t_1 educ male0 2.age_grp 3.age_grp 4.age_grp 5.age_grp 6.age_grp hhsize_eqh_sr sphus_poor
            2.work_type 3.work_type 2.marital_status 3.marital_status hhmemb_work

--------------------------------------------------------------------------------------------
                           |            Delta-method
                           |      dy/dx   std. err.      z    P>|z|     [95% conf. interval]
---------------------------+----------------------------------------------------------------
              pov_risk_t_1 |  -.0184606   .0098897    -1.87   0.062     -.037844    .0009227
                      educ |          0  (omitted)
                     male0 |          0  (omitted)
                           |
                   age_grp |
                  55-59yo  |          .  (not estimable)
                  60-64yo  |          .  (not estimable)
                  65-69yo  |          .  (not estimable)
                  70-74yo  |          .  (not estimable)
                    75+yo  |          .  (not estimable)
                           |
             hhsize_eqh_sr |  -.0426886   .0329658    -1.29   0.195    -.1073004    .0219233
                sphus_poor |  -.0000665   .0056464    -0.01   0.991    -.0111333    .0110003
                           |
                 work_type |
2. Public sector employee  |          .  (not estimable)
         3. Self-employed  |          .  (not estimable)
                           |
            marital_status |
         2. Never married  |          .  (not estimable)
      3. Divorced/widowed  |          .  (not estimable)
                           |
               hhmemb_work |    .219421   .0747288     2.94   0.003     .0729554    .3658867
--------------------------------------------------------------------------------------------
Note: dy/dx for factor levels is the discrete change from the base level.

Code:

 
. aextlogit trans $cov_pov_risk2 i.wave i.country if insample==1, vce(cl mergeid)
note: multiple positive outcomes within groups encountered.
note: 17,997 groups (30,530 obs) omitted because of all positive or
      all negative outcomes.
note: educ omitted because of no within-group variance.
note: 1.male0 omitted because of no within-group variance.
note: 12.country omitted because of no within-group variance.
note: 13.country omitted because of no within-group variance.
note: 14.country omitted because of no within-group variance.
note: 15.country omitted because of no within-group variance.
note: 16.country omitted because of no within-group variance.
note: 17.country omitted because of no within-group variance.
note: 18.country omitted because of no within-group variance.
note: 19.country omitted because of no within-group variance.
note: 23.country omitted because of no within-group variance.
note: 28.country omitted because of no within-group variance.
note: 29.country omitted because of no within-group variance.
note: 31.country omitted because of no within-group variance.
note: 32.country omitted because of no within-group variance.
note: 33.country omitted because of no within-group variance.
note: 34.country omitted because of no within-group variance.
note: 35.country omitted because of no within-group variance.
note: 47.country omitted because of no within-group variance.
note: 48.country omitted because of no within-group variance.
note: 51.country omitted because of no within-group variance.
note: 53.country omitted because of no within-group variance.
note: 55.country omitted because of no within-group variance.
note: 57.country omitted because of no within-group variance.
note: 59.country omitted because of no within-group variance.
note: 61.country omitted because of no within-group variance.
note: 63.country omitted because of no within-group variance.

Iteration 0:  Log pseudolikelihood = -690.60966  
Iteration 1:  Log pseudolikelihood = -362.32573  
Iteration 2:  Log pseudolikelihood = -317.67908  
Iteration 3:  Log pseudolikelihood = -316.07311  
Iteration 4:  Log pseudolikelihood =  -316.0633  
Iteration 5:  Log pseudolikelihood =  -316.0633  

Conditional fixed-effects logistic regression   Number of obs      =      9474
Group variable: panel                           Number of groups   =      3396
                                                Obs per group: min =         2
                                                               avg =       2.8
Log likelihood  = -316.0633                                    max =         6

                   Average (semi) elasticities of Pr(y=1|x,u)
                                          (Std. err. adjusted for 3,396 clusters in mergeid)
--------------------------------------------------------------------------------------------
                           |               Robust
                     trans | Coefficient  std. err.      z    P>|z|     [95% conf. interval]
---------------------------+----------------------------------------------------------------
              pov_risk_t_1 |  -.5747288   .2290313    -2.51   0.012    -1.023622   -.1258358
                      educ |          0  (omitted)
                   1.male0 |          0  (omitted)
                           |
                   age_grp |
                  55-59yo  |  -.7974483   .5122496    -1.56   0.120    -1.801439    .2065426
                  60-64yo  |    .314952   .5661015     0.56   0.578    -.7945865     1.42449
                  65-69yo  |   1.547367   .6606971     2.34   0.019     .2524245    2.842309
                  70-74yo  |   -.001874   .7274934    -0.00   0.998    -1.427735    1.423987
                    75+yo  |  -1.910432   .8650688    -2.21   0.027    -3.605935   -.2149279
                           |
             hhsize_eqh_sr |  -1.329011   .6351388    -2.09   0.036     -2.57386   -.0841617
                sphus_poor |  -.0020709   .2089311    -0.01   0.992    -.4115682    .4074265
                           |
                 work_type |
2. Public sector employee  |  -.5654131   .3649396    -1.55   0.121    -1.280682    .1498554
         3. Self-employed  |  -.5671904   .7642716    -0.74   0.458    -2.065135    .9307543
                           |
            marital_status |
         2. Never married  |   .9879512   .6608452     1.49   0.135    -.3072815    2.283184
      3. Divorced/widowed  |   1.362225   .6392871     2.13   0.033     .1092452    2.615205
                           |
             1.hhmemb_work |   6.831168   .7453935     9.16   0.000     5.370224    8.292113
                           |
                      wave |
         Wave 4 (2011/12)  |   3.621432   .4230681     8.56   0.000     2.792234     4.45063
            Wave 5 (2013)  |    5.96335   .5746864    10.38   0.000     4.836986    7.089715
            Wave 6 (2015)  |   8.174807   .6702245    12.20   0.000     6.861191    9.488423
         Wave 7 (2017/18)  |   10.30982   .7512174    13.72   0.000     8.837458    11.78218
         Wave 8 (2019/20)  |      13.06   .8502479    15.36   0.000     11.39354    14.72645
                           |
                   country |
                  Germany  |          0  (omitted)
                   Sweden  |          0  (omitted)
              Netherlands  |          0  (omitted)
                    Spain  |          0  (omitted)
                    Italy  |          0  (omitted)
                   France  |          0  (omitted)
                  Denmark  |          0  (omitted)
                   Greece  |          0  (omitted)
                  Belgium  |          0  (omitted)
           Czech Republic  |          0  (omitted)
                   Poland  |          0  (omitted)
               Luxembourg  |          0  (omitted)
                  Hungary  |          0  (empty)
                 Portugal  |          0  (empty)
                 Slovenia  |          0  (omitted)
                  Estonia  |          0  (omitted)
                  Croatia  |          0  (omitted)
                Lithuania  |          0  (empty)
                 Bulgaria  |          0  (empty)
                   Cyprus  |          0  (empty)
                  Finland  |          0  (empty)
                   Latvia  |          0  (empty)
                    Malta  |          0  (empty)
                  Romania  |          0  (empty)
                 Slovakia  |          0  (empty)
--------------------------------------------------------------------------------------------
Average of trans = .15548445 (Number of obs = 40004)

Comment

Joao Santos Silva

Join Date: Apr 2014

Posts: 2996
#58

30 May 2024, 04:25

Dear Giovanna Ortolani,

The problem is that the results reported by margins after xtlogit fe are meaningless, as explained here. That explains the difference...
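
A rough numerical sketch of why the two sets of numbers are not on a comparable scale, using only figures from the output above. The margins header shows that dy/dx is computed from predict(pu0), that is, with the fixed effect set to zero, so its size depends on an arbitrary baseline probability. The implied coefficient below assumes the semi-elasticity equals the coefficient times one minus the average of trans; the baseline probabilities in the loop are hypothetical values chosen for illustration, and the scalar names are made up.

```stata
* Figures copied from the aextlogit output in #57
scalar semi_pov = -.5747288
scalar ybar     = .15548445

* Implied logit coefficient under the assumed relation semi = b*(1 - ybar)
scalar b_pov = scalar(semi_pov)/(1 - scalar(ybar))
display "implied coefficient: " %8.4f scalar(b_pov)

* dy/dx of a logit probability is b*p*(1-p); p is a hypothetical baseline
foreach p of numlist .5 .1 .02 {
    display "p = " %4.2f `p' "   dy/dx = " %8.4f scalar(b_pov)*`p'*(1-`p')
}
```

The implied coefficient is about -0.68, and that same coefficient gives a dy/dx of about -0.17 at p = .5 but only about -0.013 at p = .02, so a marginal effect evaluated at an arbitrary fixed effect can take almost any size.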

Best wishes,

Joao
Comment
Adriana Cardozo

Join Date: Jun 2021

Posts: 1
#59

07 Oct 2024, 08:26

Dear @Joao Santos Silva

I am currently working with a panel dataset and would greatly appreciate your advice on using the aextlogit model for my analysis.

Context:
The dataset consists of 26,500 observations (large N) (a longitudinal panel, unbalanced) observed between 1996 and 2017 (T = 22).

The dependent variable is binary, indicating whether an individual perceived discrimination or not in the past two years.

Given the large N and small T, I would like to understand the most appropriate modeling approach to account for unobserved individual heterogeneity.

My Questions:
Should I use aextlogit in this scenario, and why?
From my understanding, aextlogit is designed to reduce bias from the incidental parameter problem, particularly when dealing with large N and small T in fixed-effects logit models. Given that my time dimension is not extremely small but not large either, is aextlogit the best approach to use here?

If aextlogit is not the recommended approach and I use xtlogit fe instead, how should I interpret the resulting coefficients if margins cannot be calculated? Or would it be better to use xtreg, fe?
I understand that interpreting coefficients directly from a fixed-effects logit model can be challenging, especially if marginal effects are unavailable. Could you provide guidance on how best to interpret the coefficients in this case?

Many thanks in advance for your answer.
Comment
Joao Santos Silva

Join Date: Apr 2014

Posts: 2996
#60

08 Oct 2024, 07:53

Dear Adriana Cardozo,

First of all, please note that aextlogit is not an estimator; it is just a command that estimates the model using xtlogit FE and presents the results differently.

With T=22, it should be reasonably safe to estimate a logit including the dummies for each ID, but that is a logit with over 26,500 parameters and you may struggle to estimate that. So, I think that you would need the FE logit to facilitate the estimation; I expect the results of the two methods to be similar. If you use this approach, you can then use aextlogit to estimate the model by FE logit (instead of xtlogit FE), and obtain results that are easier to interpret.
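
To see the first point in practice, a minimal sketch on the union panel available through webuse (a stand-in, not the discrimination data; the regressors are illustrative): the two commands fit the same conditional fixed-effects logit, and aextlogit reports average (semi-)elasticities in place of the coefficients. The b option, used earlier in this thread, shows the coefficients as well.

```stata
* ssc install aextlogit   (one-time install, if needed)
webuse union, clear
xtset idcode

* Conditional fixed-effects logit: reports coefficients
xtlogit union age grade not_smsa south, fe nolog

* Same model via aextlogit: b adds the coefficient table above the
* average (semi-)elasticities
aextlogit union age grade not_smsa south, b nolog
```

Comparing the coefficient table from the second command with the first should confirm that only the presentation differs.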

Depending on what you want to do, you may also use a less common approach: estimate the model using xtlogit FE, and then use the results of that and the first order conditions of a logit to estimate the fixed effects. You can then combine the two sets of results to compute marginal effects, but it won't be straightforward to compute the standard errors and therefore you may not want to implement this.

Best wishes,

Joao
Comment
