  • Fractional response with censored zeros

    Dear Statalist community,

    I am working with insurance claim data where the dependent variable of interest is the loss-cost ratio, i.e., the indemnity amount divided by the total liability. It is therefore a fractional variable on [0,1]. However, it has an excess of zeros because of deductibles, and my understanding is that these zeros are effectively censored: an observed zero can correspond to a positive actual loss. Put simply, the dependent variable is a fractional response with censored zeros. I can think of several alternative modeling approaches, but each of them seems to miss something, if I understand them correctly:

    1. Fractional response model as in Papke and Wooldridge (1996): may not be the best choice when the share of zero observations is large, and it ignores the censored nature of the zeros.
    2. Two-limit Tobit: ignores the fractional nature of the variable and relies on strong distributional assumptions.
    3. Zero-inflated beta model as in Cook et al. (2011): does not account for the censored nature of the zeros.
    4. Two-part fractional response model as in Ramalho and Ramalho (2011): we want to analyze a balanced panel, but the second part of the two-part model is estimated on the subsample of observations in (0,1), which makes the estimation sample unbalanced. Hence we prefer not to use it.
    5. Augmenting the fractional response model by modeling heteroskedasticity, as in the Wooldridge slides (page 7): honestly, I don't understand why this works, and I'd appreciate it if anyone could explain. It also doesn't reflect the censored nature of the zeros.
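
    For reference, the specifications behind the five approaches listed above can be written in generic notation as follows. This is a sketch, not a transcription of the cited papers or slides: x and z are covariate vectors, G and F are link functions such as the logistic CDF (Λ) or the standard normal CDF (Φ), and the exponential variance function in the last line is one common parameterization, so the slides may use a different sign or scaling.

```latex
\begin{align*}
\text{(1) Fractional response:}\quad
  & E(y \mid x) = G(x\beta), \qquad G = \Lambda \text{ or } \Phi \\
\text{(2) Two-limit Tobit:}\quad
  & y = \min\{1, \max\{0, y^{*}\}\}, \qquad
    y^{*} = x\beta + u,\; u \mid x \sim N(0, \sigma^{2}) \\
\text{(3) Zero-inflated beta:}\quad
  & f(y \mid x) =
    \begin{cases}
      \pi(x), & y = 0 \\
      \bigl(1 - \pi(x)\bigr)\,
      \mathrm{Beta}\bigl(y;\ \mu(x)\phi,\ (1 - \mu(x))\phi\bigr), & 0 < y < 1
    \end{cases} \\
\text{(4) Two-part:}\quad
  & E(y \mid x) = P(y > 0 \mid x)\, E(y \mid x,\, y > 0)
    = F(x\alpha)\, G(x\beta) \\
\text{(5) Heteroskedastic probit mean:}\quad
  & E(y \mid x, z) = \Phi\!\left( \frac{x\beta}{\exp(z\gamma)} \right)
\end{align*}
```

    Only (2) treats an observed zero as a censored value of a latent variable; (1) and (5) model the conditional mean directly, and (3) and (4) treat zeros as a separate outcome.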

    So my questions are:
    (1) why is approach 5 above able to account for excessive zeros?
    (2) what would be the best approach to model my dependent variable described above, i.e., a fractional variable with excessive censored zeros, while estimating a balanced panel?

    Besides, if I misunderstood anything, please feel free to point it out, thanks!

    Much appreciated,
    Zhenni
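
    The censoring described in this post can be illustrated with a small simulation. It is a sketch under assumptions the thread does not state: a hypothetical contract in which the indemnity is the loss minus the deductible, floored at zero and capped at the liability. The function name `loss_cost_ratio`, the deductible of 100, the liability of 1000, and the exponential loss distribution are all made up for illustration.

```python
import random

def loss_cost_ratio(loss, deductible, liability):
    """Return (latent, observed) ratios under the assumed contract.

    latent   = (loss - deductible) / liability, which can be negative
    observed = latent censored to the unit interval [0, 1]
    """
    latent = (loss - deductible) / liability
    observed = min(1.0, max(0.0, latent))
    return latent, observed

random.seed(0)
DEDUCTIBLE, LIABILITY = 100.0, 1000.0   # hypothetical contract terms

# Hypothetical losses: exponential with mean 200
losses = [random.expovariate(1.0 / 200.0) for _ in range(10_000)]
pairs = [loss_cost_ratio(l, DEDUCTIBLE, LIABILITY) for l in losses]

# Collect the observations recorded as zero in the data
zeros = [(l, lat) for l, (lat, obs) in zip(losses, pairs) if obs == 0.0]
share_zero = len(zeros) / len(losses)

print(f"share of observed zeros: {share_zero:.3f}")
print("all zeros have a positive loss:", all(l > 0 for l, _ in zeros))
print("all zeros have latent ratio <= 0:", all(lat <= 0 for _, lat in zeros))
```

    Under this reading, an observed zero corresponds to a positive loss that falls below the deductible, which is the same thing as a non-positive latent ratio.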


  • #2
    I'm not sure I understand the nature of the censoring. You mean if y = 0 that can actually mean the underlying value is negative? The two-part model could work here (although it's not intended for true data censoring). The fact that one of the parts is estimated using an unbalanced panel is not an issue. Allowing for heteroskedasticity is not intended to handle any particular problem other than generalizing functional form.
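
    The two-part idea mentioned above can be sketched on simulated cross-sectional data as follows. This is a minimal illustration, not the estimator from any of the cited papers: the helper `fit_logit_qmle`, the data-generating parameters, and the beta distribution for the positive part are all assumptions. Part 1 is a logit for whether y > 0; part 2 is a fractional logit (Bernoulli quasi-likelihood) fitted only on the rows with y > 0, so it uses fewer observations than part 1.

```python
import math
import random

def sigmoid(v):
    """Numerically stable logistic function."""
    if v >= 0:
        return 1.0 / (1.0 + math.exp(-v))
    e = math.exp(v)
    return e / (1.0 + e)

def fit_logit_qmle(x, y, iters=100, tol=1e-10):
    """Newton-Raphson for the Bernoulli quasi-log-likelihood with a logit mean.

    Intercept plus one covariate; y may be binary or any value in [0, 1].
    """
    b0 = b1 = 0.0
    for _ in range(iters):
        g0 = g1 = h00 = h01 = h11 = 0.0
        for xi, yi in zip(x, y):
            p = sigmoid(b0 + b1 * xi)
            r, w = yi - p, p * (1.0 - p)
            g0 += r
            g1 += r * xi
            h00 += w
            h01 += w * xi
            h11 += w * xi * xi
        det = h00 * h11 - h01 * h01
        d0 = (h11 * g0 - h01 * g1) / det   # Newton step: inverse information times score
        d1 = (h00 * g1 - h01 * g0) / det
        b0, b1 = b0 + d0, b1 + d1
        if abs(d0) + abs(d1) < tol:
            break
    return b0, b1

# Simulated data (all parameter values are hypothetical)
random.seed(1)
n = 4000
TRUE_A = (0.5, 1.0)    # participation part: P(y > 0 | x)
TRUE_B = (-1.0, 0.5)   # magnitude part: E(y | x, y > 0)
PHI = 5.0              # beta precision for the positive outcomes
x = [random.gauss(0.0, 1.0) for _ in range(n)]
y = []
for xi in x:
    if random.random() < sigmoid(TRUE_A[0] + TRUE_A[1] * xi):
        m = sigmoid(TRUE_B[0] + TRUE_B[1] * xi)
        y.append(random.betavariate(m * PHI, (1.0 - m) * PHI))
    else:
        y.append(0.0)

# Part 1: logit for the indicator y > 0, using all observations
d = [1.0 if yi > 0 else 0.0 for yi in y]
a0, a1 = fit_logit_qmle(x, d)

# Part 2: fractional logit on the positive observations only
pos = [(xi, yi) for xi, yi in zip(x, y) if yi > 0]
b0, b1 = fit_logit_qmle([p[0] for p in pos], [p[1] for p in pos])

def predicted_mean(xi):
    """E(y | x) = P(y > 0 | x) * E(y | x, y > 0)."""
    return sigmoid(a0 + a1 * xi) * sigmoid(b0 + b1 * xi)

avg_pred = sum(predicted_mean(xi) for xi in x) / n
avg_y = sum(y) / n
print(f"part 1: a0={a0:.3f}, a1={a1:.3f}")
print(f"part 2: b0={b0:.3f}, b1={b1:.3f} (n={len(pos)})")
print(f"mean prediction {avg_pred:.3f} vs. sample mean {avg_y:.3f}")
```

    In Stata the analogous steps would presumably be `logit` on the indicator and `fracreg logit` on the positive observations; the sketch above only shows the mechanics.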



    • #3
      Originally posted by Jeff Wooldridge
      I'm not sure I understand the nature of the censoring. You mean if y = 0 that can actually mean the underlying value is negative? The two-part model could work here (although it's not intended for true data censoring). The fact that one of the parts is estimated using an unbalanced panel is not an issue. Allowing for heteroskedasticity is not intended to handle any particular problem other than generalizing functional form.
      I appreciate your reply, Jeff. To clarify:

      1. Yes, y = 0 can correspond to a negative underlying value.
      2. We want to use correlated random effects (CRE) to control for unobserved heterogeneity. I am aware of your paper introducing CRE for unbalanced panels, but I personally prefer working with balanced data. If we use the two-part model, we will have to apply the unbalanced CRE approach when estimating the second part, is that correct?
      3. Do you mean that allowing for heteroskedasticity only generalizes the functional form and is not really a solution for excessive zeros?
      4. I understand that the two-part model is not intended for data censoring, but I didn't find a better solution for our case. Do you perhaps have a better model in mind?

      Thank you very much,
      Zhenni
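
      Point 2 above concerns building the CRE (Mundlak-type) terms when the second part is estimated on the rows with y > 0. A minimal sketch of one way this could be done: each unit's covariate average is taken only over the rows that unit contributes to the estimation sample, and the number of such rows is kept so that it can enter the model as well. This is one reading of the unbalanced-panel CRE idea mentioned in the thread, not a verified implementation of the paper; the panel values, the tuple layout, and the helper `add_cre_terms` are hypothetical.

```python
from collections import defaultdict

# Hypothetical long-format panel: (unit id, period, covariate x, outcome y)
rows = [
    ("A", 1, 2.0, 0.10), ("A", 2, 4.0, 0.00), ("A", 3, 9.0, 0.30),
    ("B", 1, 1.0, 0.00), ("B", 2, 3.0, 0.00), ("B", 3, 5.0, 0.20),
    ("C", 1, 2.0, 0.40), ("C", 2, 2.0, 0.50), ("C", 3, 8.0, 0.60),
]

def add_cre_terms(panel):
    """Append each unit's mean of x and its row count, computed over `panel` only."""
    sums, counts = defaultdict(float), defaultdict(int)
    for unit, _, x, _ in panel:
        sums[unit] += x
        counts[unit] += 1
    return [
        (unit, t, x, y, sums[unit] / counts[unit], counts[unit])
        for unit, t, x, y in panel
    ]

# Part 1 sample: the full balanced panel (3 periods per unit)
full = add_cre_terms(rows)

# Part 2 sample: positive outcomes only, which is unbalanced
second_part = add_cre_terms([r for r in rows if r[3] > 0])

for unit, t, x, y, xbar, n_obs in second_part:
    print(unit, t, x, y, f"xbar={xbar:.2f}", f"T_i={n_obs}")
```

      In this toy panel the averages for units A and B differ between the full sample and the second-part sample, because the periods with y = 0 drop out of the latter.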
