  • cmp Tobit: Estimation problem

    Dear David Roodman,
    I wonder if you could help me figure out a problem I encountered using cmp to estimate a tobit model. It may even be a bug.
    To put the problem in context: I'm trying to estimate a multivariate tobit model, specifically to capture the variance-covariance matrix across the latent errors, and I need to repeat this process for various subsamples (half of the original sample).
    I was able to estimate the model using the full sample, but for some reason I can't do it for some of the smaller ones. Surprisingly, cmp has problems even estimating a single-equation tobit model.
    Specifically, if you use the attached dataset and try to estimate the following tobit model using cmp
    Code:
    cmp (time_core1 = age educ i.sex avghrswk works rooms age_ychild  nchild* adult* emp_adul* avg_yrs_educ hhsize i.region )      [pw=wgt] if smp5_==0, ///
                     ind("cond(time_core1>0, $cmp_cont, $cmp_left)"      ) 
    
    Fitting individual models as starting point for full model fit.
    Note: For programming reasons, these initial estimates may deviate from your specification.
          For exact fits of each equation alone, run cmp separately on each.
    (sum of wgt is 96,679,959)
    starting values not feasible
    numerical overflow
    numerical overflow
    it produces the error shown above. However, this doesn't happen when using the tobit command.

    I wonder whether this is in fact a bug within cmp, or something caused by the nature of the data.

    Thank you.

    Fernando

  • #2
    Thanks for sharing this example, Fernando. But when I run it I get "variable time_core1 not found".


    • #3
      Sorry! That was my typo. It should say

      Code:
      cmp (time_proc = age educ i.sex avghrswk works rooms age_ychild  nchild* adult* emp_adul* avg_yrs_educ hhsize i.region )      [pw=wgt] if smp5_==0, ///
                       ind("cond(time_proc>0, $cmp_cont, $cmp_left)"      )
      Thank you!


      • #4
        The error is actually coming from tobit, which cmp runs to get the starting point for the full model fit. Here's a simple example on that data set:

        Code:
        program test
          version 14
          tobit time_proc 4.region [aweight = wgt], ul
        end
        test
        For me this crashes with "starting values not feasible" in Stata 16 and 17, but not in earlier versions of Stata, and not if I drop the weights or prefix the tobit command with "version 14:".

        In calling tobit, I have cmp include both the ll and ul options, to allow for censoring from both sides. That's how the ul option appears. I guess I could make cmp check for whether you only have lower censoring and leave out ul in that case. Whether this would completely remove the bug, I don't know. We may not have found the only combination of circumstances that triggers it. I have written to Stata tech support about this.
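
        A minimal sketch of the version-prefix workaround mentioned above, assuming the same data set is in memory (it uses the time_proc, region, and wgt variables from the example; it has only been observed to help in this case, so it is not a guaranteed fix):
        Code:
        * Prefixing the call with "version 14:" avoided the
        * "starting values not feasible" error in this example.
        version 14: tobit time_proc 4.region [aweight = wgt], ul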
        Last edited by David Roodman; 05 Jan 2022, 11:54.


        • #5
          Thank you for checking!
          And this is good to know; I'll run the command under version control then.
          Best wishes
          Fernando


          • #6
            I have pushed out a work-around that I think works, at least in your case. It can be installed with:
            Code:
            net install cmp, replace from(https://raw.github.com/droodman/cmp/v8.6.7)
            I'll have this posted on SSC.
            This does slightly affect results from tobit-including cmp models.
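
            To confirm which version ended up installed, a quick check may help (which is a built-in Stata command; the exact version line it prints depends on your installation):
            Code:
            * Print the path and version header of the installed cmp.ado
            which cmp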
            Last edited by David Roodman; 05 Jan 2022, 14:59.


            • #7
              I've gained a fuller understanding of the problem from Stata Corp. and have posted a different fix in version 8.6.8. The issue is that ml-based commands such as probit, oprobit, tobit, and intreg take an iterate() option (abbreviated iter()); that the default for the option changed in Stata 16.1 from 16000 to 300; that this change is not under regular version control; and that this exemption is not documented. So even though cmp has "version 11" declarations to stabilize its results as Stata changes, the behavior of the call to tobit in this example was changing. Version 8.6.8 of cmp just imposes iter(16000) on all the calls to probit, oprobit, tobit, and intreg that form the starting point for the full model fit. I think this solves your problem.

              Stata's change to iter() is under user version control, a fact that I think will soon be documented.
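
              For anyone calling tobit directly rather than through cmp, here is a rough sketch of the same idea. It assumes the lower iteration cap is what triggers the failure in your case, and it reuses the variable names from the example above. c(maxiter) and set maxiter are Stata's setting for the default cap:
              Code:
              * Show the current default maximum number of iterations
              display c(maxiter)
              * Raise the cap for this one call only
              tobit time_proc 4.region [aweight = wgt], ul iterate(16000)
              * Or raise the default for the rest of the session
              set maxiter 16000
              The per-call option is the closer analogue of what cmp 8.6.8 is described as doing, since it leaves the session default untouched.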
              Last edited by David Roodman; 07 Jan 2022, 08:31.


              • #8
                Thank you very much. I really appreciate this.
