Announcement

Collapse
No announcement yet.
X
  • Filter
  • Time
  • Show
Clear All
new posts

  • Censored Regression for GLM

    Hi All,

    Is it possible to incorporate a command to allow censoring from below for GLM models?

    For Example:

    glm depvar indvar covariates, family(nbinomial ml) link(identity) [any command for censoring]

  • #2
    Not that I'm aware of. But you can program it yourself!
    Here's a simple example with a left-censored Poisson distribution.
    Observations are left-censored if y<=4 and uncensored otherwise.
    I've used the -mlexp- command (help mlexp) which in this simple case is quite handy.

    Code:
    cap program drop censPoi
    program censPoi, rclass
    clear
    set obs 1000
    gen y = rpoisson(5)
    
    gen yc = y
    replace yc = 4 if y<= 4
    gen d = y > 4
    
    glm y, family(poisson) // original uncensored y
    local m1 = exp(_b[_cons])
    
    glm yc, family(poisson) // censored yc - wrong!
    local m2 = exp(_b[_cons])
    
    mlexp (d*log(poissonp(exp({theta}), yc)) + (1-d)*log(poisson(exp({theta}), yc))) // censored y - OK
    local m3 = exp(_b[/theta])
    
    return scalar m1 = `m1'
    return scalar m2 = `m2'
    return scalar m3 = `m3'
    end
    
    
    simulate m1 = r(m1) m2 = r(m2) m3 = r(m3), reps(200) seed(987): censPoi
    su m*

    Code:
    . su m*
    
        Variable |        Obs        Mean    Std. Dev.       Min        Max
    -------------+---------------------------------------------------------
              m1 |        200    4.998685    .0701953       4.83      5.133
              m2 |        200    5.436375    .0547442      5.297      5.545
              m3 |        200    4.997207    .0742204   4.813146   5.140038
    Last edited by Andrea Discacciati; 19 Feb 2022, 04:55.

    Comment


    • #3
      thank you for your response. I am trying to conceptualize this as the censored models include those observations being censored but censors at the specified value. I am looking at the observations set at 1000 and ourput of 200. Is this a truncated command? Because truncated regressions omits those observations set a the specified ranges

      Comment


      • #4
        Code:
        mlexp (d*log(poissonp(exp({theta}), yc)) + (1-d)*log(poisson(exp({theta}), yc)))
        This command fits a simple Poisson model whose parameter lambda is assumed to be constant for all observations (no covariates) = exp(theta) (so that it's constrained to be >0)
        It does not exclude left-censored observations (see the output below).
        The uncensored observations (d==1) contribute to the log-likelihood with the (Poisson) log-probability distribution function evaluated at yc [d*log(poissonp(exp({theta}), yc)], while the left-censored observations (d==0) contribute to the likelihood with the (Poisson) log-cumulative density function evaluated at yc [(1-d)*log(poisson(exp({theta}), yc)].
        With minimal effort, you can make the parameter lambda to depend on covariates (like gender, age, etc) like in a "regular" Poisson regression (glm), see the Examples at the bottom of -help mlexp-.

        In #2, I ran a simple simulation to show you that, given that specific data generating mechanism, the model coded with mlexp estimated on average (over 200 simulations) the true value of the parameter lambda (5).

        Code:
        clear
        set seed 1
        set obs 1000
        gen y = rpoisson(5)
        
        gen yc = y
        replace yc = 4 if y<= 4
        gen d = y > 4
        
        mlexp (d*log(poissonp(exp({theta}), yc)) + (1-d)*log(poisson(exp({theta}), yc)))
        lincom /theta, eform
        Code:
        initial:       log likelihood = -4968.3986
        alternative:   log likelihood = -3502.8649
        rescale:       log likelihood = -2029.4556
        Iteration 0:   log likelihood = -2029.4556  
        Iteration 1:   log likelihood = -1636.3881  
        Iteration 2:   log likelihood =  -1633.891  
        Iteration 3:   log likelihood = -1633.8901  
        Iteration 4:   log likelihood = -1633.8901  
        
        Maximum likelihood estimation
        
        Log likelihood = -1633.8901                     Number of obs     =      1,000
        
        ------------------------------------------------------------------------------
                     |      Coef.   Std. Err.      z    P>|z|     [95% Conf. Interval]
        -------------+----------------------------------------------------------------
              /theta |   1.614441   .0148087   109.02   0.000     1.585416    1.643465
        ------------------------------------------------------------------------------
        
        . lincom /theta, eform
        
         ( 1)  [theta]_cons = 0
        
        ------------------------------------------------------------------------------
                     |     exp(b)   Std. Err.      z    P>|z|     [95% Conf. Interval]
        -------------+----------------------------------------------------------------
                 (1) |   5.025076   .0744147   109.02   0.000     4.881322    5.173064
        ------------------------------------------------------------------------------
        Last edited by Andrea Discacciati; 22 Feb 2022, 01:09.

        Comment


        • #5
          is this code applicable for identity function instead of on a log-based scale?

          Comment


          • #6
            With the necessary modifications.
            Moving away form this toy example, the usual caveats about using an identity link to model a parameter constrained to be >0 apply.

            Code:
            clear
            set seed 12
            set obs 1000
            gen x = rbinomial(1, 0.5)
            gen y = rpoisson(1 + 2*x) // true means: 1 and 3
            
            gen yc = y
            replace yc = 4 if y<= 4
            gen d = y > 4
            
            glm y i.x, family(poisson) link(identity) // completely observed y
            margins i.x
            
            glm yc i.x, family(poisson) link(identity) // left-censored y -- wrong model!
            margins i.x
            
            mlexp (d*log(poissonp({theta: i.x _cons}, yc)) + (1-d)*log(poisson({theta:}, yc))) // left-censored y -- correct model
            margins i.x

            Comment

            Working...
            X