  • Error with robust errors in random effects xtprobit or xtlogit

    Hi everyone,

    I've been away for a while getting settled into a new position. I sincerely apologize. I have an issue with some estimations and would like to share it with you and see if I can understand what is driving it. To start, I am using Stata version 13.1 on a MacBook Pro running Mac OS 10.10.5, although I believe that has very little to do with the issue. The data are an unbalanced panel of household mortgage information and demographics, with a total of 7,183 observations, 994 groups (clusters), and from 2 to 8 observations per group (this is where I think the problem lies, but I want to see if you agree). I have tried running xtprobit and xtlogit with random effects and robust errors, and I get the following error:
    Code:
    calculation of robust standard errors failed
    r(198);
    My understanding is that the vce(robust) option produces the same variance-covariance matrix as vce(cluster panelvar), where panelvar is the variable that identifies the groups (clusters) in the panel dataset. This is from the Methods and formulas section for xtprobit in the [XT] reference manual:
    Specifying vce(robust) or vce(cluster clustvar) causes the Huber/White/sandwich VCE estimator to be calculated for the coefficients estimated in this regression....
    Clustering on the panel variable produces a consistent VCE estimator when the disturbances are not identically distributed over the panels or there is serial correlation in ε_it.
    The cluster–robust VCE estimator requires that there are many clusters and the disturbances are uncorrelated across the clusters. The panel variable must be nested within the cluster variable because of the within-panel correlation that is generally induced by the random-effects transform when there is heteroskedasticity or within-panel serial correlation in the idiosyncratic errors.
    The manual gives a similar explanation for the robust standard errors of the random-effects xtlogit estimator. My understanding, then, is that the robust standard error estimator tries to be robust to heteroskedasticity across panels and to within-panel serial correlation. Since there aren't many data points within the panels (min 2, max 8), I can see how it can be very difficult to account for any serial correlation within the panels. Is this the reason why the estimator has trouble computing the sandwich estimator in this case? Could there be another problem?
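
    For reference, a minimal sketch of the kind of setup described above. The variable names (hhid, year, y, x1, x2) are hypothetical placeholders, not the names in my data:

    ```stata
    * Hypothetical names: hhid = panel id, year = time, y = binary outcome
    xtset hhid year
    xtdescribe                                  // shows the panel sizes (2 to 8 observations per group here)

    * Random-effects probit with the two VCE requests that I understand to be equivalent
    capture noisily xtprobit y x1 x2, re vce(robust)
    display "return code: " _rc                 // 198 when the robust VCE calculation fails
    capture noisily xtprobit y x1 x2, re vce(cluster hhid)
    display "return code: " _rc
    ```

    The capture noisily prefix only keeps a do-file running after the error so that the return code can be inspected; it does not change the estimation.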

    Furthermore, cmp (a user-written command available on SSC) provides robust and clustered standard errors for a random-effects estimation. My doubt here is that cmp is not really set up to take into account the time-correlation component of a panel, so I wonder whether those estimates are really robust to serial correlation within the panels. That brings up the question of whether I should actually worry about standard errors that are robust to serial correlation when I don't have many observations in some panels (minimum 2), and hence whether inference based on cmp's estimates is still better than inference from an xtprobit random-effects estimation without robust errors.

    Thanks for any light you can throw on this, I appreciate it.
    Last edited by Alfonso Sánchez-Peñalver; 10 Sep 2015, 13:41.
    Alfonso Sanchez-Penalver

  • #2
    Dear Alfonso,

    Please e-mail [email protected] and I will take a closer look at your problem. It would be really helpful if you could send your do-file and data.

    Best,

    Enrique

    • #3
      Done!

      Gracias Enrique!
      Alfonso Sanchez-Penalver

      • #4
        Hello Alfonso and all of those interested,

        If you are experiencing a similar problem, a solution is to use meprobit and melogit with robust standard errors. These multilevel mixed-effects estimators are equivalent to the random effects panel data counterparts. To get exactly the same result as you would from the xt models you would type:

        meprobit y x || panelvar : , vce(robust) intpoints(12)
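
        As a rough check of the equivalence, the two estimators can be run side by side. This is only a sketch with hypothetical names (y, x1, x2, panelvar); it relies on xtprobit using 12 quadrature points by default, which is why intpoints(12) is passed to meprobit:

        ```stata
        * Hypothetical names: y, x1, x2, panelvar
        xtset panelvar
        xtprobit y x1 x2, re                          // 12 integration points by default
        estimates store xt
        meprobit y x1 x2 || panelvar:, intpoints(12)
        estimates store me
        estimates table xt me, b(%9.4f) se(%9.4f)     // coefficients should agree closely

        * Logit counterpart of the workaround, with robust standard errors
        melogit y x1 x2 || panelvar:, vce(robust) intpoints(12)
        ```

        Note that xtprobit reports the random-effect scale as lnsig2u and sigma_u, while meprobit reports the variance of the random intercept, so those rows are not directly comparable without transforming them.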

        In the meantime, we are looking into the problem pointed out by Alfonso (thanks again for pointing this out).

        Best,

        Enrique



        • #5
          Thank you again Enrique.
          Alfonso Sanchez-Penalver

          • #6
            Enrique Pinzon (StataCorp) how do I go about estimating average marginal effects of the different regressors on the probability of y = 1? I thought of
            Code:
            margins, dydx(*) predict(pr fixedonly)
            since my random effects specification only has a constant, and thus there are no random components on the coefficients. When I tried that, however, it took a very long time and I stopped it. When using xtprobit I simply used
            Code:
            margins, dydx(*)
            Thanks!
            Alfonso Sanchez-Penalver

            • #7
              Hola Alfonso,

              I was trying to use margins with Stata 13.1 and it did not allow me to do some of the cool stuff you can do with Stata 14 for the meprobit estimator. There is, however, a manual way to get your average marginal effects, although it means doing it separately for each regressor. Here is what I would type to get the average marginal effects:

              margins, expression(normalden(predict(xb))*_b[x1])
              margins, expression(normalden(predict(xb))*_b[x2])

              And so forth for each one of your regressors. This is for the continuous regressors; for the discrete ones, the expression is the difference between the normal c.d.f. evaluated at each level. With meprobit, this gives you what margins with pu0 gives after xtprobit.
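
              For a discrete regressor, one way to sketch that difference by hand after meprobit is shown below. The name d is a hypothetical 0/1 regressor in the fitted model, and this yields only the point estimate of the average effect, not its standard error:

              ```stata
              * Hypothetical: d is a 0/1 regressor in the meprobit model just fitted
              preserve
              replace d = 1
              predict double xb1, xb                    // fixed-portion linear prediction with d set to 1
              replace d = 0
              predict double xb0, xb                    // fixed-portion linear prediction with d set to 0
              generate double diff = normal(xb1) - normal(xb0)
              summarize diff if e(sample)               // the mean is the average discrete change
              restore
              ```

              The preserve and restore pair puts the original values of d back once the calculation is done.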
              Last edited by Enrique Pinzon (StataCorp); 14 Sep 2015, 10:21.

              • #8
                Hi Enrique,

                Thank you for your reply. I found that with my data margins after meprobit takes an excessive time to compute (after an hour it still hadn't finished), which is very odd because after xtprobit it doesn't take nearly as long. I ended up using cmp (available from SSC) because the computation of margins afterwards was pretty fast. Richard Williams was able to point out some other differences between margins after meprobit and after xtprobit, so I was able to reconcile some values I was getting with all three estimators (meprobit, xtprobit, and cmp). He also told me that margins had been very much improved in Stata 14. Unfortunately the department won't buy me an upgrade so I'll have to stick with version 13.1 (which I bought out of pocket), unless you guys at Stata would like to upgrade me for free hehehehehe.

                Enrique, thank you so much for your help on this. Let me know if you find out what is going on with the xtprobit estimation with robust or cluster errors. By the way, as a suggestion, it is very confusing that for certain estimators the default predict option is the fitted values, xb, for others the probability, pr, and for yet others the probability as the expected mean, mu. I know that many different estimators have many different types of prediction, but it would be useful if there were some kind of stricter convention as to what the default is for the different estimators, at least for those that are related. For example, the default prediction for xtprobit is xb, and for meprobit it is mu...
                Alfonso Sanchez-Penalver

                • #9
                  Hi Enrique,

                  I am experiencing an issue similar to Alfonso's but am afraid that the meprobit solution will not work as needed in my case.

                  For my analysis I have data on merger and acquisition (M&A) transactions. I am trying to use panel logistic regression to determine the likelihood of such a deal being completed (my binary DV). Of course, some firms make multiple transactions over the observed 20-year time window. Since I am not interested in looking at the change over time, I chose to set only a panel variable and no time variable (hence, only the ID number of the acquiring firms). In a next step, I run a fixed-effects logistic regression (xtlogit y x1 x2 x3 x4..., fe nolog iterate(100) noomitted noemptycells) and then plan on running the same model with random effects and robust standard errors (xtlogit y x1 x2 x3 x4... i.Acqu2SIC Years, re vce(cluster AcquIPCUSIP)). However, when trying to use random effects with clustered standard errors, I also get the following error message:
                  calculation of robust standard errors failed r(198)
                  The goal here is to conduct a Hausman test afterwards to choose between fixed and random effects. I have about 16 variables that need to be included in my model.
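
                  For context, a minimal sketch of the Hausman workflow I have in mind, with hypothetical names (y, x1 to x3, firmid). Both models are fitted without vce(cluster) here, on the assumption that the comparison uses conventional standard errors, since hausman cannot be used after vce(robust) or vce(cluster):

                  ```stata
                  * Hypothetical names: y = deal completed (0/1), firmid = acquirer id
                  xtset firmid
                  xtlogit y x1 x2 x3, fe nolog
                  estimates store fe
                  xtlogit y x1 x2 x3, re nolog
                  estimates store re
                  hausman fe re                 // consistent estimator first, efficient estimator second
                  ```

                  Regressors that do not vary within a firm drop out of the fixed-effects fit, so the test compares only the coefficients common to both models.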
                  When following your suggestion of using the Mixed Effects model (I replaced the meprobit with melogit to make it applicable to my data and kept everything else constant), I get strange results and the random effect or clustered means are never mentioned.
                  My question now is: am I even supposed to use the melogit model, or would it be more appropriate for me to stick to xtlogit (fe, re)? If so, how can I address the error message (or is the clustering even necessary in my sample)?
                  I hope I was able to explain my case somewhat and hope you can help me.
                  Thank you VERY much in advance!
                  Best,
                  Theresa

                  • #10
                    Hello Theresa,

                    In the next update we will address the issue you are having with xtlogit. In the meantime, if you could email [email protected] with your data and do-file I would like to look at your melogit and meprobit commands which should work appropriately with cluster robust standard errors.

                    Best,

                    Enrique

                    • #11
                      Hello Alfonso,

                      I am having exactly the same problem as you, which is
                      calculation of robust standard errors failed r(198);
                      Have you solved it? Would you please tell me how you dealt with it? I am using Stata 13, too.

                      Many thanks!

                      Best wishes,
                      Jing Du

                      • #12
                        As I mentioned in one of my posts here, I ended up using cmp, a user-written command that you can find on SSC.
                        Alfonso Sanchez-Penalver

                        • #13
                          Dear Enrique,
                          I am using Stata 14 and I had the same problem as reported above (calculation of robust standard errors failed) for two different datasets using xtlogit, re and xtprobit, re.
                          Could you tell us what the reasons are and whether Stata could fix them?

                          BTW: the problem only occurs when inserting time and industry dummies.

                          Best,
                          Florian
                          Last edited by Florian Seliger; 17 Dec 2015, 06:43.

                          • #14
                            Florian,

                            Is the industry the panel variable? Then I believe that would be the source of the problem. You're in fact estimating a random effects model with fixed effects. If you're going to include industry dummies use a population averaged estimation with xtgee. Estimation of panel binary models with fixed effects is problematic. I recommend you read about what is called the incidental parameters problem if you decide to go that way.
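
                            A minimal sketch of what I mean by the population-averaged alternative, with hypothetical names (y, x1, x2, industry, year, id):

                            ```stata
                            * Hypothetical names: id = panel variable, industry and year enter as dummies
                            xtset id
                            xtgee y x1 x2 i.industry i.year, family(binomial) link(probit) corr(exchangeable) vce(robust)

                            * Equivalent population-averaged probit through xtprobit
                            xtprobit y x1 x2 i.industry i.year, pa vce(robust)
                            ```

                            Keep in mind that the coefficients are population-averaged effects, so they are not directly comparable to the subject-specific coefficients from a random-effects fit.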
                            Alfonso Sanchez-Penalver

                            • #15
                              Dear Alfonso,
                              The panel variable is firm ID.

                              I tried "meprobit" as suggested above and with this command my model runs.

                              This makes me believe that my model is not necessarily wrong, but rather that there could be a "bug" or something else in the Stata commands.

                              It seems a bit strange to me that the "normal" commands do not work, and I would like to ask Stata if they could fix these problems.

