
  • Pooled probit or Fixed Effects probit

    Dear professors,
    Before proceeding with the coding I have the following theoretical question:
    I have a longitudinal dataset in which each country i is observed at different points in time. My dependent variable is a dummy variable, and I want to estimate the probability of success, say yi = 1, conditional on a set of predictors. I also add a full set of time dummies to the estimating equation. The problem is that for the majority of high-income countries in the sample the dependent variable always takes the value 0. In this case the country dummy will perfectly predict the outcome variable. What happens if I use country fixed effects? Is it reasonable to use region dummies instead?




  • #2
    Unconditional fixed effects (FE) probit suffers from the incidental parameters problem (see https://gburtch.github.io/posts/2021/03/logit-ipp/ for a description). While there is no conditional FE probit estimator available, a conditional FE logit estimator exists where the fixed effects are conditioned out of the likelihood. See

    Code:
    help xtlogit

    FE estimation requires that your variables vary within units (countries). If there is sufficient variation in a number of countries, then these will provide informative data for the likelihood calculation. You can disregard countries that exhibit no variation as they are uninformative. However, if you find that you are losing a significant portion of the sample, you may want to consider correlated random effects (Mundlak) estimation. For more details, refer to https://conference.iza.org/conferenc...nonlin_iza.pdf.
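To see why countries with no variation in the outcome are uninformative under conditional FE logit, here is a minimal numeric sketch in Python (standard library only; it is an illustration, not Stata code and not how `xtlogit` is implemented). It assumes a single scalar covariate and a hypothetical covariate path `x`. The function name `conditional_logit_contribution` is made up for this example. It computes one unit's conditional likelihood given the number of successes, which is the quantity from which the fixed effect cancels.

```python
from itertools import combinations
from math import exp


def conditional_logit_contribution(y, x, beta):
    """Conditional likelihood of one unit's 0/1 sequence y, given sum(y).

    y: list of 0/1 outcomes, x: list of scalar covariate values, beta: slope.
    The unit's fixed effect cancels out of this ratio, so it does not appear.
    """
    T, k = len(y), sum(y)
    numerator = exp(beta * sum(xt for yt, xt in zip(y, x) if yt == 1))
    # Sum over every way of placing k successes among the T periods.
    denominator = sum(
        exp(beta * sum(x[t] for t in ones))
        for ones in combinations(range(T), k)
    )
    return numerator / denominator


x = [0.5, -1.0, 2.0, 0.3]       # hypothetical covariate path for one country
always_zero = [0, 0, 0, 0]      # outcome never occurs
switcher = [0, 1, 0, 1]         # outcome varies over time

for beta in (-2.0, 0.0, 1.5):
    print(
        beta,
        conditional_logit_contribution(always_zero, x, beta),
        conditional_logit_contribution(switcher, x, beta),
    )
```

For the always-zero country there is only one sequence with zero successes, so the contribution is the same constant for every value of beta and cannot help identify it. The switcher's contribution changes with beta, which is why only such countries carry information.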



    • #3
      Dear Professor Musau, thank you very much for your prompt response.
      My idea is to estimate the following pooled probit model:
      P(y_it = 1 | x_it, t_t, region_r) = Phi(x_it b + t_t + region_r)
      where x_it is a vector of predictors that vary across countries i and time t. Rather than controlling for time-invariant unobserved heterogeneity across countries i, I control for time-invariant heterogeneity across regions r (Americas, Europe, Africa...). I do this because for a substantial number of countries the dependent variable is always zero. This is the case for Germany and France, for instance.
      This idea comes from the following statement that I read in an article:
      <<An important disadvantage of a fixed effects model is the effect it has on the sample size. That is, countries for which the outcome variable is always zero are excluded from the sample, because the country dummy will perfectly predict the outcome variable in this case.>>
      It is not entirely clear to me why countries for which the outcome variable is always zero are excluded from the sample. Could you kindly provide some intuition, so that I can decide which estimator is more appropriate before coding in Stata?
      Thanks
      Last edited by Frank Giaquinto; 24 Mar 2024, 15:10.



      • #4
        Originally posted by Frank Giaquinto
        This idea comes from the following statement that I was reading on an article:
        <<An important disadvantage of a fixed effects model is the effect it has on the sample size. That is, countries for which the outcome variable is always zero are excluded from the sample, because the country dummy will perfectly predict the outcome variable in this case.>>
        To me, it's not entirely clear why countries for which the outcome variable is always zero are excluded from the sample. Could you kindly provide an intuition about it, so that I can decide which estimator is more appropriate before coding in STATA?
        Thanks
        The authors make a case defending their use of pooled probit with region dummies, but this also comes at the cost of ignoring country-level unobserved heterogeneity. You cannot blame them as they need to "sell" their model to the reviewers. If you still have a large enough sample, I would still favor conditional FE logit over the authors' approach. Additionally, I would recommend CRE probit over the authors' model.
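As a rough illustration of the data step behind the CRE (Mundlak) approach, the sketch below adds the within-country mean of each time-varying covariate as an extra regressor. It is written in Python with the standard library only and is not Stata code. The helper `add_mundlak_means`, the `_bar` suffix, and the toy panel are all hypothetical. The augmented regressors would then enter a pooled or random-effects probit, so countries whose outcome is always zero stay in the sample.

```python
from collections import defaultdict


def add_mundlak_means(rows, unit_key, covariates):
    """Return copies of rows with '<covariate>_bar' (within-unit mean) added."""
    totals = defaultdict(lambda: defaultdict(float))
    counts = defaultdict(int)
    for row in rows:
        counts[row[unit_key]] += 1
        for name in covariates:
            totals[row[unit_key]][name] += row[name]

    augmented_rows = []
    for row in rows:
        unit = row[unit_key]
        new_row = dict(row)  # leave the input rows untouched
        for name in covariates:
            new_row[name + "_bar"] = totals[unit][name] / counts[unit]
        augmented_rows.append(new_row)
    return augmented_rows


# Toy panel: two countries observed in two years (made-up numbers).
panel = [
    {"country": "A", "year": 2000, "x1": 1.0},
    {"country": "A", "year": 2001, "x1": 3.0},
    {"country": "B", "year": 2000, "x1": 10.0},
    {"country": "B", "year": 2001, "x1": 20.0},
]
augmented = add_mundlak_means(panel, "country", ["x1"])
for row in augmented:
    print(row)
```

The country means are constant within a country, so they absorb the part of the country effect that is correlated with the covariates. This is an assumption about the form of that correlation, not a full fixed-effects treatment.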



        • #5
          Thank you very much!



          • #6
            Dear Professor,
            I have one final question to clarify any lingering theoretical doubts.
            If I estimate:
            P(y_it = 1 | x_it, t_t, country_i) = Phi(x_it b + t_t + country_i)
            Then, countries for which the outcome variable is always zero are excluded from the sample because the regression is estimated by maximum likelihood and the maximum likelihood estimate of the coefficient of a perfect predictor (country dummy) is infinite. So such an estimation cannot converge. Is that correct?
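The perfect-prediction argument can be checked numerically with a small Python sketch (standard library only; an illustration under simplifying assumptions, not Stata's algorithm). It holds the rest of the index, x_it b + t_t, fixed at hypothetical values and varies only the coefficient `alpha` on the dummy of a country whose outcome is zero in every period. The names `normal_cdf` and `loglik_all_zero_country` are made up for this example.

```python
from math import erf, log, sqrt


def normal_cdf(z):
    """Standard normal CDF, Phi(z)."""
    return 0.5 * (1.0 + erf(z / sqrt(2.0)))


def loglik_all_zero_country(alpha, index_values):
    """Probit log-likelihood of a country with y = 0 in every period.

    index_values: x_it*b + time effect for each period (held fixed here).
    alpha: coefficient on this country's dummy.
    """
    return sum(log(1.0 - normal_cdf(v + alpha)) for v in index_values)


index_values = [0.2, -0.1, 0.4]            # hypothetical values of x_it*b + t_t
alphas = [0.0, -1.0, -2.0, -4.0, -6.0]     # push the country dummy downward
logliks = [loglik_all_zero_country(a, index_values) for a in alphas]

for a, ll in zip(alphas, logliks):
    print(a, ll)
```

The log-likelihood keeps rising toward its upper bound of zero as `alpha` falls, so no finite value of the dummy coefficient maximizes it. Those observations are fitted perfectly in the limit and contribute nothing to the estimation of b, which is why they are dropped.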

