
  • predict after xtlogit with fixed effect

    I am running a two-stage regression to account for an endogeneity issue. In the first stage I use xtlogit with fixed effects. Here is the code.

    Code:
    xtlogit y x1 x2, fe
    predict yhat
    In this stage, some observations are dropped because in some ID groups the dependent variable is always 0 or always 1. But predict still gives predicted values for all observations. Is this expected? In the second stage, I will use yhat as an independent variable, so the number of observations will differ between the first and second stages. Should this be a concern?

  • #2
    Code:
    predict yhat if e(sample)
    as described in the output of help predict.
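
    A minimal sketch of how the restriction can be checked, reusing y, x1, and x2 from post #1. If the groups with no variation in y were dropped from estimation, the two counts should agree:

    Code:
    xtlogit y x1 x2, fe
    * keep predictions only for observations used in estimation
    predict yhat if e(sample)
    * number of observations in the estimation sample
    count if e(sample)
    * number of non-missing predictions
    count if !missing(yhat)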


    • #3
      Originally posted by William Lisowski
      Code:
      predict yhat if e(sample)
      as described in the output of help predict.
      Thanks.


      • #4
        Dear Tracy Yang

        In addition to William's helpful advice, please note that the prediction you get is the "probability of a positive outcome conditional on one positive outcome within group," which may not be what you want.

        Best wishes,

        Joao


        • #5
          Joao Santos Silva Thanks for the note. I am wondering what is the difference between pc1 (the default) and pu0?

          pc1 predicted probability of a positive outcome conditional on one positive outcome within group; the default
          pu0 probability of a positive outcome assuming that the fixed effect is zero
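
          For comparison, the three statistics can be computed side by side after the xtlogit, fe fit. This is a sketch only: the variable names p_pc1, p_pu0, xbhat, and p_check are illustrative, and the last two lines assume that pu0 is the inverse logit of the linear prediction, which is worth confirming in help xtlogit postestimation.

          Code:
          predict p_pc1 if e(sample), pc1
          predict p_pu0 if e(sample), pu0
          predict xbhat if e(sample), xb
          summarize p_pc1 p_pu0 xbhat
          * assumed relationship: pu0 = invlogit(xb)
          generate p_check = invlogit(xbhat)
          summarize p_pu0 p_check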


          • #6
            Joao Santos Silva Also, the linear prediction (xb) will give values outside the 0 to 1 range, I assume?


            • #7
              Dear Tracy Yang,

              pc1 and pu0 do exactly as described, but neither of them is interesting; you are right in saying that xb may give you values outside that range and there is no way to transform them into something useful.

              How long is your panel?

              Best wishes,

              Joao


              • #8
                Dear Joao Santos Silva, would you please explain what pc1 and pu0 mean? That is, what does it mean to assume a zero fixed effect in pu0, and to condition on one positive outcome within group in pc1? Why do you say neither of them is interesting?

                My dataset is cars running in different locations at different times. I would like to run a regression with time and location fixed effects. Since there are multiple cars within each location and time group, I am unable to set
                Code:
                xtset location time
                .

                Instead I create an id
                Code:
                egen id = group(location time)
                and then set that id as the panel id, which is
                Code:
                xtset id
                This id has 1.7 million unique values, and the full dataset has around 3 million observations.

                Thanks

                Tracy


                • #9
                  Dear Tracy Yang,

                  Please check the definitions of pc1 and pu0 in the documentation. I do not think your approach is suitable. What are the dimensions of your panel?

                  Best wishes,

                  Joao


                  • #10
                    Hi Joao Santos Silva, what do you mean by dimensions of the panel?

                    Thanks

                    Tracy


                    • #11
                      How many ID groups and how many time periods?


                      • #12
                        Hi Joao Santos Silva, we would like to control for time and location fixed effects. If we treat location as the ID, it has 46K groups. If we treat day as the time variable, it has 730 days (i.e., a two-year sample). But this does not work, as multiple cars can run in the same location on the same day, so Stata gives the error "repeated time values within panel".

                        That is why I combined time and location and set the combined id as the panel ID. In this case, there are 1.7 million groups and no time variable. The full dataset has around 3 million observations. Could you please let me know why this approach is not suitable? And do you have any suggestions for alternative approaches?

                        thanks

                        Tracy


                        • #13
                          There are a bunch of messy statistical issues with these estimates due to the incidental parameters problem, but if you just want to have predictions of the probabilities, it might work to just xtset the location. Then, put in dummies for each of the days.

                          Code:
                          xtset location
                          xtlogit y x1 ... xk i.day, fe
                          predict yhat
                          If you have all zeros or all ones then your prediction is a zero or a one, respectively. That's the deal when trying to predict a binary outcome using fixed effects logit.


                          Your approach, with not even two cars per ID on average, will have far messier statistical problems than usual.
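
                          One way to see how thin the groups are is to tabulate group sizes and count the observations in groups with no variation in y, which the FE logit cannot use. A rough sketch, assuming the id variable from post #8 already exists (n_in_group, ymin, and ymax are illustrative names):

                          Code:
                          * number of cars in each location-by-time group
                          bysort id: generate n_in_group = _N
                          tabulate n_in_group
                          * observations in groups where y is all 0 or all 1
                          bysort id: egen ymin = min(y)
                          bysort id: egen ymax = max(y)
                          count if ymin == ymax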


                          • #14
                            By the way, I bet a linear model estimated by FE would give similar results in the end. You'll probably get some negative fitted values and some above one, but I would just bring those into the unit interval -- a lot like you'd predict a zero probability or probability of one in the FE logit case.
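
                            A sketch of that linear alternative, under the assumption that location is the panel variable and day enters as dummies as in post #13 (x1 and x2 stand in for the regressors; yhat_lpm is an illustrative name). The xbu option requests the fitted value including the estimated fixed effect:

                            Code:
                            xtset location
                            xtreg y x1 x2 i.day, fe
                            * fitted values including the location fixed effect
                            predict yhat_lpm if e(sample), xbu
                            * bring fitted values into the unit interval; the guard keeps missing values missing
                            replace yhat_lpm = min(max(yhat_lpm, 0), 1) if !missing(yhat_lpm)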


                            • #15
                              Thanks Prof. Jeff Wooldridge. I will try that.
