  • #46
    Thanks @Joao Santos Silva,

    That's a great suggestion.

    Thanks and kind regards,
    Chris


    • #47
      Dear @Joao Santos Silva,

      I ran aextlogit with firm FEs, including 400 dummies for funds and 30 dummies for years, as follows:

      quietly tabulate year, generate(year)
      quietly tabulate fund_id, generate(fund)

      xtset firm_id

      quietly aextlogit Binary_LHS control_variables year* fund*, nolog
      esttab, drop(year* fund*)

      Stata has been running this command for half a day without producing any output. Is there any way to speed up the process? In addition, could a robust standard errors option be added to aextlogit?

      Many thanks,
      Chris


      • #48
        Dear Chris McDonald,

        Estimation of a fixed effects logit with 30 time periods is always going to take a very long time; if you add over 400 dummies, it will take a very, very long time. If you have another computer, you may want to try a standard logit with dummies for firms, funds, and time, but it will also take a very long time.

        The current version of the command supports clustered robust standard errors; please check the help file.

        Best wishes,

        Joao
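
        For reference, the two routes mentioned above might be set up roughly as follows. This is only a sketch: the variable names are the placeholders from #47, clustering on firm_id is an assumption rather than a recommendation, and the vce(cl ...) syntax is the form that appears later in this thread (#57). Factor-variable notation (i.year, i.fund_id) stands in for the hand-made dummies. Neither command is likely to be fast with this many dummies.

        ```stata
        * Sketch only: placeholder names from #47; the cluster variable is an assumption.
        xtset firm_id

        * Route 1: FE logit via aextlogit, with clustered robust standard errors.
        aextlogit Binary_LHS control_variables i.year i.fund_id, nolog vce(cl firm_id)

        * Route 2: standard logit with dummies for firms, funds, and time.
        logit Binary_LHS control_variables i.firm_id i.fund_id i.year, nolog vce(cluster firm_id)
        ```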


        • #49
          Dear @Joao Santos Silva,

          Thanks for your explanation.

          Kind regards,
          Chris


          • #50
            Dear @Joao Santos Silva,

            I have pooled data with 2500 funds, 2200 firms, and 30 time periods. Each fund can hold shares in many of the 2200 firms and in multiple time periods. Likewise, each firm can attract many of the 2500 funds in multiple time periods. I would like to use reghdfe to run OLS regressions with fixed effects and clustered standard errors. One option I am considering is the following:

            reghdfe LHS RHS, absorb(fund_id firm_id time) vce(cluster fund_id firm_id time)

            This regression controls for fund, firm, and time fixed effects. Do you think it would be too extreme to use three-way clustering by fund, firm, and time after having already included fixed effects for these three dimensions? Could you please suggest an appropriate level of clustering for the standard errors?

            Thanks and kind regards,
            Chris


            • #51
              Dear Chris McDonald,

              This is a totally different topic; please start a new thread so that all can contribute.

              Best wishes,

              Joao


              • #52
                Thanks @Joao Santos Silva,

                If you have any suggestions for me, please find my post following the link below:

                https://www.statalist.org/forums/for...tandard-errors

                Many thanks,
                Chris


                • #53
                  Originally posted by Joao Santos Silva
                  I am afraid that is not the interpretation of a semi-elasticity. The right interpretation is that when age increases one year, on average the probability of being unionised goes up by 5.5%.

                  Best wishes,

                  Joao

                  Dear Joao, how should I interpret the coefficient on the interaction term, for example coef = .0205552 below? Thanks!

                  south#c.year | 1 | .0205552 .0064763 3.17 0.002 .007862 .0332484


                  • #54
                    It is the usual interpretation of a semi-elasticity.

                    Best wishes,

                    Joao
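
                    To spell out what "the usual interpretation" amounts to, here is the standard logit algebra, written as a sketch. The symbols p_it, Λ, u_i, and ȳ are notation introduced here, not taken from the help file, and treating the reported figure as β(1 − ȳ) is an assumption that happens to be consistent with the output posted in #55.

                    ```latex
                    p_{it} = \Pr(y_{it}=1 \mid x_{it}, u_i) = \Lambda(x_{it}'\beta + u_i),
                    \qquad \Lambda(z) = \frac{e^{z}}{1+e^{z}}

                    \frac{\partial \ln p_{it}}{\partial x_{j,it}} = \beta_j\,(1 - p_{it})
                    \quad\Longrightarrow\quad
                    \text{average semi-elasticity} \approx \beta_j\,(1 - \bar{y})
                    ```

                    Read this way, the .0205552 on south#c.year suggests that the percentage change in the probability associated with one more year is about 2.1 points higher when south = 1 than when south = 0. That reading is an approximation, since 1 − p need not be the same in the two groups.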


                    • #55
                      Dear Joao Santos Silva ,

                      Your aextlogit command is very helpful. However, I have two questions. Thanks in advance for your help!

                      1). How should I interpret a semi-elasticity greater than 1?
                      2). In my case, I have more than 2000 alternatives. The semi-elasticities are extremely similar to the original coefficients (betas). Why is that the case? Please see below for my Stata output.

                      Code:
                      . aextlogit chosen home school distance vehicle, b nolog
                      
                      Conditional fixed-effects logistic regression   Number of obs      =   2197596
                      Group variable: sampno                          Number of groups   =       998
                                                                      Obs per group: min =      2202
                                                                                     avg =    2202.0
                      Log likelihood  = -4418.7312                                   max =      2202
                      ------------------------------------------------------------------------------
                            chosen |      Coef.   Std. Err.      z    P>|z|     [95% Conf. Interval]
                      -------------+----------------------------------------------------------------
                              home |   1.998844    .103529    19.31   0.000      1.79593    2.201757
                            school |    4.66244    .126472    36.87   0.000      4.41456    4.910321
                          distance |  -2.208651   .0699369   -31.58   0.000    -2.345724   -2.071577
                           vehicle |    .171495   .0721387     2.38   0.017     .0301058    .3128842
                      ------------------------------------------------------------------------------
                      
                                         Average (semi) elasticities of Pr(y=1|x,u)
                      ------------------------------------------------------------------------------
                            chosen |      Coef.   Std. Err.      z    P>|z|     [95% Conf. Interval]
                      -------------+----------------------------------------------------------------
                              home |   1.997936    .103482    19.31   0.000     1.795115    2.200757
                            school |   4.660323   .1264146    36.87   0.000     4.412555    4.908091
                          distance |  -2.207648   .0699052   -31.58   0.000    -2.344659   -2.070636
                           vehicle |   .1714171   .0721059     2.38   0.017     .0300921    .3127421
                      ------------------------------------------------------------------------------
                      Average of chosen = .00045413 (Number of obs = 2197596)
                      Last edited by Ashi Choi; 19 Nov 2023, 04:57.


                      • #56
                        Dear Ashi Choi

                        As far as I understand, you are estimating a conditional logit model, and the method implemented in this command only works for the binary case with fixed effects.

                        Best wishes,

                        Joao
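
                        On the second question in #55, a quick arithmetic check may help. The snippet below is a sketch that assumes the reported semi-elasticity equals the coefficient times (1 − average of the dependent variable); that relation is inferred from the posted numbers, not taken from the aextlogit documentation. With an average of chosen of .00045413, the factor is about 0.9995, which would explain why the two tables in #55 are nearly identical.

                        ```stata
                        * Assumption: semi-elasticity = beta * (1 - ybar), inferred from the output in #55.
                        * ybar is the "Average of chosen" printed under the second table.
                        scalar ybar = .00045413

                        * Coefficients are copied from the first table in #55; compare each
                        * result with the corresponding row of the second table.
                        display "home     " %9.6f  1.998844 * (1 - ybar)
                        display "school   " %9.6f  4.66244  * (1 - ybar)
                        display "distance " %9.6f -2.208651 * (1 - ybar)
                        display "vehicle  " %9.6f  .171495  * (1 - ybar)
                        ```

                        This only accounts for the similarity of the two tables; it does not change the point above that the method is meant for the binary case with fixed effects.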


                        • #57
                          Dear Joao Santos Silva,
                          I am running a series of logit models (pooled logit, panel logit FE, panel logit RE; as discussed in this post) to estimate the probability of transitioning to retirement conditional on, among other things, one's probability of falling into poverty. Running -margins- after -xtlogit, fe- does not produce results for the categorical variables, and Erik Ruzek proposed using -aextlogit- instead (see post). I understand that the command provides semi-elasticities (eydx) rather than marginal effects (dydx), but I am surprised by how much the results changed.
                          More specifically, the semi-elasticity of the risk of poverty is -0.57, while the marginal effect obtained from -xtlogit, fe vce(bootstrap)- is -0.018 and that from -xtlogit, re vce(cl mergeid)- is -0.013 (not reported below). Could you please help me interpret the massive difference between these results? I am considering presenting all the results in the version of the paper to be submitted to a journal, and I had not expected such a divergence.
                          Thank you in advance!

                          Code:
                          . qui xtlogit trans $cov_pov_risk2 i.wave i.country if insample==1, fe vce(bootstrap) 
                          
                          . margins, dydx($cov_pov_risk2) post
                          
                          Average marginal effects                                 Number of obs = 9,474
                          Model VCE: Bootstrap
                          
                          Expression: Pr(trans|fixed effect is 0), predict(pu0)
                          dy/dx wrt:  pov_risk_t_1 educ male0 2.age_grp 3.age_grp 4.age_grp 5.age_grp 6.age_grp hhsize_eqh_sr sphus_poor
                                      2.work_type 3.work_type 2.marital_status 3.marital_status hhmemb_work
                          
                          --------------------------------------------------------------------------------------------
                                                     |            Delta-method
                                                     |      dy/dx   std. err.      z    P>|z|     [95% conf. interval]
                          ---------------------------+----------------------------------------------------------------
                                        pov_risk_t_1 |  -.0184606   .0098897    -1.87   0.062     -.037844    .0009227
                                                educ |          0  (omitted)
                                               male0 |          0  (omitted)
                                                     |
                                             age_grp |
                                            55-59yo  |          .  (not estimable)
                                            60-64yo  |          .  (not estimable)
                                            65-69yo  |          .  (not estimable)
                                            70-74yo  |          .  (not estimable)
                                              75+yo  |          .  (not estimable)
                                                     |
                                       hhsize_eqh_sr |  -.0426886   .0329658    -1.29   0.195    -.1073004    .0219233
                                          sphus_poor |  -.0000665   .0056464    -0.01   0.991    -.0111333    .0110003
                                                     |
                                           work_type |
                          2. Public sector employee  |          .  (not estimable)
                                   3. Self-employed  |          .  (not estimable)
                                                     |
                                      marital_status |
                                   2. Never married  |          .  (not estimable)
                                3. Divorced/widowed  |          .  (not estimable)
                                                     |
                                         hhmemb_work |    .219421   .0747288     2.94   0.003     .0729554    .3658867
                          --------------------------------------------------------------------------------------------
                          Note: dy/dx for factor levels is the discrete change from the base level.
                          Code:
                           
                          . aextlogit trans $cov_pov_risk2 i.wave i.country if insample==1, vce(cl mergeid)
                          note: multiple positive outcomes within groups encountered.
                          note: 17,997 groups (30,530 obs) omitted because of all positive or
                                all negative outcomes.
                          note: educ omitted because of no within-group variance.
                          note: 1.male0 omitted because of no within-group variance.
                          note: 12.country omitted because of no within-group variance.
                          note: 13.country omitted because of no within-group variance.
                          note: 14.country omitted because of no within-group variance.
                          note: 15.country omitted because of no within-group variance.
                          note: 16.country omitted because of no within-group variance.
                          note: 17.country omitted because of no within-group variance.
                          note: 18.country omitted because of no within-group variance.
                          note: 19.country omitted because of no within-group variance.
                          note: 23.country omitted because of no within-group variance.
                          note: 28.country omitted because of no within-group variance.
                          note: 29.country omitted because of no within-group variance.
                          note: 31.country omitted because of no within-group variance.
                          note: 32.country omitted because of no within-group variance.
                          note: 33.country omitted because of no within-group variance.
                          note: 34.country omitted because of no within-group variance.
                          note: 35.country omitted because of no within-group variance.
                          note: 47.country omitted because of no within-group variance.
                          note: 48.country omitted because of no within-group variance.
                          note: 51.country omitted because of no within-group variance.
                          note: 53.country omitted because of no within-group variance.
                          note: 55.country omitted because of no within-group variance.
                          note: 57.country omitted because of no within-group variance.
                          note: 59.country omitted because of no within-group variance.
                          note: 61.country omitted because of no within-group variance.
                          note: 63.country omitted because of no within-group variance.
                          
                          Iteration 0:  Log pseudolikelihood = -690.60966  
                          Iteration 1:  Log pseudolikelihood = -362.32573  
                          Iteration 2:  Log pseudolikelihood = -317.67908  
                          Iteration 3:  Log pseudolikelihood = -316.07311  
                          Iteration 4:  Log pseudolikelihood =  -316.0633  
                          Iteration 5:  Log pseudolikelihood =  -316.0633  
                          
                          Conditional fixed-effects logistic regression   Number of obs      =      9474
                          Group variable: panel                           Number of groups   =      3396
                                                                          Obs per group: min =         2
                                                                                         avg =       2.8
                          Log likelihood  = -316.0633                                    max =         6
                          
                                             Average (semi) elasticities of Pr(y=1|x,u)
                                                                    (Std. err. adjusted for 3,396 clusters in mergeid)
                          --------------------------------------------------------------------------------------------
                                                     |               Robust
                                               trans | Coefficient  std. err.      z    P>|z|     [95% conf. interval]
                          ---------------------------+----------------------------------------------------------------
                                        pov_risk_t_1 |  -.5747288   .2290313    -2.51   0.012    -1.023622   -.1258358
                                                educ |          0  (omitted)
                                             1.male0 |          0  (omitted)
                                                     |
                                             age_grp |
                                            55-59yo  |  -.7974483   .5122496    -1.56   0.120    -1.801439    .2065426
                                            60-64yo  |    .314952   .5661015     0.56   0.578    -.7945865     1.42449
                                            65-69yo  |   1.547367   .6606971     2.34   0.019     .2524245    2.842309
                                            70-74yo  |   -.001874   .7274934    -0.00   0.998    -1.427735    1.423987
                                              75+yo  |  -1.910432   .8650688    -2.21   0.027    -3.605935   -.2149279
                                                     |
                                       hhsize_eqh_sr |  -1.329011   .6351388    -2.09   0.036     -2.57386   -.0841617
                                          sphus_poor |  -.0020709   .2089311    -0.01   0.992    -.4115682    .4074265
                                                     |
                                           work_type |
                          2. Public sector employee  |  -.5654131   .3649396    -1.55   0.121    -1.280682    .1498554
                                   3. Self-employed  |  -.5671904   .7642716    -0.74   0.458    -2.065135    .9307543
                                                     |
                                      marital_status |
                                   2. Never married  |   .9879512   .6608452     1.49   0.135    -.3072815    2.283184
                                3. Divorced/widowed  |   1.362225   .6392871     2.13   0.033     .1092452    2.615205
                                                     |
                                       1.hhmemb_work |   6.831168   .7453935     9.16   0.000     5.370224    8.292113
                                                     |
                                                wave |
                                   Wave 4 (2011/12)  |   3.621432   .4230681     8.56   0.000     2.792234     4.45063
                                      Wave 5 (2013)  |    5.96335   .5746864    10.38   0.000     4.836986    7.089715
                                      Wave 6 (2015)  |   8.174807   .6702245    12.20   0.000     6.861191    9.488423
                                   Wave 7 (2017/18)  |   10.30982   .7512174    13.72   0.000     8.837458    11.78218
                                   Wave 8 (2019/20)  |      13.06   .8502479    15.36   0.000     11.39354    14.72645
                                                     |
                                             country |
                                            Germany  |          0  (omitted)
                                             Sweden  |          0  (omitted)
                                        Netherlands  |          0  (omitted)
                                              Spain  |          0  (omitted)
                                              Italy  |          0  (omitted)
                                             France  |          0  (omitted)
                                            Denmark  |          0  (omitted)
                                             Greece  |          0  (omitted)
                                            Belgium  |          0  (omitted)
                                     Czech Republic  |          0  (omitted)
                                             Poland  |          0  (omitted)
                                         Luxembourg  |          0  (omitted)
                                            Hungary  |          0  (empty)
                                           Portugal  |          0  (empty)
                                           Slovenia  |          0  (omitted)
                                            Estonia  |          0  (omitted)
                                            Croatia  |          0  (omitted)
                                          Lithuania  |          0  (empty)
                                           Bulgaria  |          0  (empty)
                                             Cyprus  |          0  (empty)
                                            Finland  |          0  (empty)
                                             Latvia  |          0  (empty)
                                              Malta  |          0  (empty)
                                            Romania  |          0  (empty)
                                           Slovakia  |          0  (empty)
                          --------------------------------------------------------------------------------------------
                          Average of trans = .15548445 (Number of obs = 40004)


                          • #58
                            Dear Giovanna Ortolani,

                            The problem is that the results reported by margins after xtlogit fe are meaningless, as explained here. That explains the difference...

                            Best wishes,

                            Joao
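
                            A sketch of why the two numbers in #57 are not comparable, using standard logit algebra. The notation (p_it, Λ, u_i) is introduced here for illustration. The margins output in #57 states that its expression is Pr(trans | fixed effect is 0), so the dy/dx it reports is evaluated at u_i = 0, a value the FE logit never estimates:

                            ```latex
                            \text{margins after xtlogit, fe:}\quad
                            \left.\frac{\partial p_{it}}{\partial x_{j,it}}\right|_{u_i = 0}
                            = \beta_j\,\Lambda(x_{it}'\beta)\,\bigl(1 - \Lambda(x_{it}'\beta)\bigr)

                            \text{semi-elasticity:}\quad
                            \frac{\partial \ln p_{it}}{\partial x_{j,it}} = \beta_j\,(1 - p_{it}),
                            \qquad p_{it} = \Lambda(x_{it}'\beta + u_i)
                            ```

                            The first quantity is an absolute change in a probability computed at an arbitrary value of the fixed effect; the second is a proportional change. Even setting the u_i = 0 issue aside, the two are on different scales, so -0.018 and -0.57 should not be expected to agree.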


                            • #59
                              Dear @Joao Santos Silva

                              I am currently working with a panel dataset and would greatly appreciate your advice on using the aextlogit model for my analysis.

                              Context:
                              • The dataset consists of 26,500 observations (large N) from an unbalanced longitudinal panel observed between 1996 and 2017 (T = 22).
                              • The dependent variable is binary, indicating whether an individual perceived discrimination or not in the past two years.
                              • Given the large N and small T, I would like to understand the most appropriate modeling approach to account for unobserved individual heterogeneity.

                              My Questions:
                              1. Should I use aextlogit in this scenario, and why?
                                From my understanding, aextlogit is designed to reduce bias from the incidental parameter problem, particularly when dealing with large N and small T in fixed-effects logit models. Given that my time dimension is not extremely small but not large either, is aextlogit the best approach to use here?
                              2. If aextlogit is not the recommended approach and I use xtlogit fe instead, how should I interpret the resulting coefficients if margins cannot be calculated? Or would it be better to use xtreg, fe?
                                I understand that interpreting coefficients directly from a fixed-effects logit model can be challenging, especially if marginal effects are unavailable. Could you provide guidance on how best to interpret the coefficients in this case?

                              Many thanks in advance for your answer.


                              • #60
                                Dear Adriana Cardozo,

                                First of all, please note that aextlogit is not an estimator; it is just a command that estimates the model using xtlogit FE and presents the results differently.

                                With T=22, it should be reasonably safe to estimate a logit including the dummies for each ID, but that is a logit with over 26,500 parameters and you may struggle to estimate that. So, I think that you would need the FE logit to facilitate the estimation; I expect the results of the two methods to be similar. If you use this approach, you can then use aextlogit to estimate the model by FE logit (instead of xtlogit FE), and obtain results that are easier to interpret.

                                Depending on what you want to do, you may also use a less common approach: estimate the model using xtlogit FE, and then use the results of that and the first order conditions of a logit to estimate the fixed effects. You can then combine the two sets of results to compute marginal effects, but it won't be straightforward to compute the standard errors and therefore you may not want to implement this.

                                Best wishes,

                                Joao
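
                                For readers curious about the less common approach, the first order condition in question can be sketched as follows. The notation is introduced here for illustration: β̂ denotes the xtlogit FE estimate, û_i the implied fixed effect, and the first equation is the standard logit score for an individual intercept, solved separately for each i.

                                ```latex
                                \sum_{t=1}^{T_i} \Bigl[\, y_{it} - \Lambda\bigl(x_{it}'\hat\beta + \hat u_i\bigr) \Bigr] = 0
                                \qquad \text{for each } i

                                \widehat{\mathrm{ME}}_j = \frac{1}{\sum_i T_i} \sum_{i} \sum_{t}
                                \hat\beta_j\,\hat p_{it}\,(1 - \hat p_{it}),
                                \qquad \hat p_{it} = \Lambda\bigl(x_{it}'\hat\beta + \hat u_i\bigr)
                                ```

                                The condition has no finite solution for individuals whose outcome never changes, so those individuals cannot contribute a fixed effect estimate. As noted above, standard errors for the resulting marginal effects are not straightforward.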
