  • Generalized Poisson regression not converging, Poisson and NBReg do

    Dear Stata sages,

    For a research project, I am in the process of predicting the number of correct multiple-choice responses (a count variable, range 0-13) from a set of demographic and personality variables. The data are underdispersed, so I am attempting Generalized Poisson Regression, using gpoisson. I am running Stata 16.0.

    The model is specified as follows:

    Code:
    gpoisson DV_SurveyQ_Ncorrect Age CRT_Ncorrect TrustSci_SC Punitive_SC if filter == 1
    When I run the model, it does not converge and keeps iterating indefinitely:

    Iteration 0: log likelihood = -8391.1075 (not concave)
    Iteration 1: log likelihood = -4627.1766 (not concave)
    Iteration 2: log likelihood = -3341.3464 (not concave)
    Iteration 3: log likelihood = -3077.9947 (not concave)
    Iteration 4: log likelihood = -3070.8619 (not concave)
    Iteration 5: log likelihood = -3070.6492 (not concave)
    Iteration 6: log likelihood = -3070.6489 (not concave)
    Iteration 7: log likelihood = -3070.6489 (not concave)
    (etc)
    The same model does converge with both regular poisson and nbreg:

    Code:
    . poisson DV_SurveyQ_Ncorrect Age CRT_Ncorrect TrustSci_SC Punitive_SC if filter == 1, vce(robust)
    
    Iteration 0:   log pseudolikelihood = -1879.6068  
    Iteration 1:   log pseudolikelihood = -1879.6068  
    
    
    Poisson regression                              Number of obs     =        998
                                                    Wald chi2(4)      =      49.81
                                                    Prob > chi2       =     0.0000
    Log pseudolikelihood = -1879.6068               Pseudo R2         =     0.0073
    
    -------------------------------------------------------------------------------------
                        |               Robust
    DV_SurveyQ_Ncorrect |      Coef.   Std. Err.      z    P>|z|     [95% Conf. Interval]
    --------------------+----------------------------------------------------------------
                    Age |  -.0022428   .0007199    -3.12   0.002    -.0036538   -.0008318
           CRT_Ncorrect |   .0180007   .0099078     1.82   0.069    -.0014182    .0374196
            TrustSci_SC |   .0319436    .015171     2.11   0.035      .002209    .0616782
            Punitive_SC |  -.0581381   .0135921    -4.28   0.000     -.084778   -.0314981
                  _cons |   1.509413   .0896253    16.84   0.000      1.33375    1.685075
    -------------------------------------------------------------------------------------
    
    
    . estat ic
    
    Akaike's information criterion and Bayesian information criterion
    
    -----------------------------------------------------------------------------
           Model |          N   ll(null)  ll(model)      df        AIC        BIC
    -------------+---------------------------------------------------------------
               . |        998  -1893.419  -1879.607       5   3769.214   3793.742
    -----------------------------------------------------------------------------
    Note: BIC uses N = number of observations. See [R] BIC note.
    
    .
    . estat gof
    
             Deviance goodness-of-fit =  575.1992
             Prob > chi2(993)         =    1.0000
    
             Pearson goodness-of-fit  =  541.9197
             Prob > chi2(993)         =    1.0000
    I have used ppml to identify possibly superfluous variables, but all are retained. gpoisson also fails to converge when each predictor is tested on its own in a single-predictor model, and when the predictors are centered to reduce differences in scale.

    Do you have any suggestions as to how to resolve the nonconvergence? Or alternative analyses that might work better for these (underdispersed) data?

    I thank you in advance for your trouble. Your advice would be deeply appreciated.
    Last edited by Chris Reinders Folmer; 25 Oct 2022, 05:40.

  • #2
    1. If the data are truly underdispersed relative to the conditional-on-x Poisson distribution, negative binomial regression will often have difficulty converging. Is it the marginal distribution of the outcome that is underdispersed, or the conditional-on-x distribution? These are different things, and the latter is what matters in your context.

    2. Since your dependent variable seems to be bounded, have you considered alternatives such as binomial regression (via -glm-), beta-binomial regression (-help betabin-), or fractional regression (-help fracreg-)?
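
    A minimal sketch of how points 1 and 2 might be checked in Stata, reusing the variable names from post #1. It assumes the test has exactly 13 items (inferred from the stated 0-13 range) and introduces a hypothetical variable name, prop_correct, for the proportion correct. betabin is community-contributed and is left out here.

```stata
* Conditional dispersion: Pearson chi2 / residual df after a Poisson fit
quietly glm DV_SurveyQ_Ncorrect Age CRT_Ncorrect TrustSci_SC Punitive_SC if filter == 1, family(poisson) link(log)
display "conditional Pearson dispersion = " e(dispers_p)

* Marginal dispersion: raw variance of the outcome relative to its mean,
* computed on the same estimation sample
quietly summarize DV_SurveyQ_Ncorrect if e(sample)
display "marginal variance/mean = " r(Var)/r(mean)

* Binomial regression: outcome treated as successes out of 13 trials
* (13 is an assumption based on the 0-13 range)
glm DV_SurveyQ_Ncorrect Age CRT_Ncorrect TrustSci_SC Punitive_SC if filter == 1, family(binomial 13) link(logit) vce(robust)

* Fractional regression on the proportion correct
* (prop_correct is a hypothetical variable name)
generate double prop_correct = DV_SurveyQ_Ncorrect/13 if filter == 1
fracreg logit prop_correct Age CRT_Ncorrect TrustSci_SC Punitive_SC if filter == 1
```

    The robust variance in the binomial fit is a hedge: if the counts are under- or overdispersed relative to the binomial, the model-based standard errors would be off even when the mean is specified correctly.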



    • #3
      Thank you for your response John, this is much appreciated!

      1. The conditional-on-x, I presume. This is the code (which I nicked from here) and the result:

      Code:
      glm DV_SurveyQ_Ncorrect Age CRT_Ncorrect TrustSci_SC Punitive_SC if filter == 1, family(poisson)
      
      Iteration 0:   log likelihood = -1880.2613  
      Iteration 1:   log likelihood = -1879.6068  
      Iteration 2:   log likelihood = -1879.6068  
      
      Generalized linear models                         Number of obs   =        998
      Optimization     : ML                             Residual df     =        993
                                                        Scale parameter =          1
      Deviance         =  575.1991834                   (1/df) Deviance =    .579254
      Pearson          =  541.9196794                   (1/df) Pearson  =   .5457399
      
      Variance function: V(u) = u                       [Poisson]
      Link function    : g(u) = ln(u)                   [Log]
      
                                                        AIC             =   3.776767
      Log likelihood   =  -1879.60677                   BIC             =  -6282.214
      
      -------------------------------------------------------------------------------------
                          |                 OIM
      DV_SurveyQ_Ncorrect |      Coef.   Std. Err.      z    P>|z|     [95% Conf. Interval]
      --------------------+----------------------------------------------------------------
                      Age |  -.0022428   .0009978    -2.25   0.025    -.0041984   -.0002871
             CRT_Ncorrect |   .0180007   .0131552     1.37   0.171    -.0077829    .0437843
              TrustSci_SC |   .0319436   .0199639     1.60   0.110     -.007185    .0710722
              Punitive_SC |  -.0581381   .0177046    -3.28   0.001    -.0928385   -.0234376
                    _cons |   1.509413   .1151504    13.11   0.000     1.283722    1.735103
      -------------------------------------------------------------------------------------
      . 
      estimates store poisson
      
      scalar phi = e(deviance_p)/e(df)
      
      di phi       
      .54573986
      My understanding is that values of this statistic above 1 indicate overdispersion and values below 1 indicate underdispersion. That is why I was looking at generalized Poisson models.
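
      For illustration, one way the dispersion estimate could be used directly, without gpoisson, is to keep the Poisson point estimates and rescale the standard errors by the Pearson dispersion via glm's scale(x2) option. This is a sketch of a quasi-Poisson-style adjustment, not something proposed in the thread, and whether it suits these data is an open question.

```stata
* Same Poisson GLM as above; scale(x2) multiplies the variance estimates
* by the Pearson chi2 dispersion (about 0.546 in the output above)
glm DV_SurveyQ_Ncorrect Age CRT_Ncorrect TrustSci_SC Punitive_SC if filter == 1, family(poisson) link(log) scale(x2)

* The hand-computed ratio from the post, for comparison with the header
display e(deviance_p)/e(df)
```

      With a dispersion below 1, the rescaled standard errors would be smaller than the OIM ones shown above, while the coefficients stay the same.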

      2. No, I have not yet considered any of these alternative approaches, but I will do so. This project is taking me into uncharted territory: I have not worked with bounded counts before, nor with multinomial logistic regression (which I am using for the next step, to analyze responses to the individual multiple-choice items). I think the questions we are looking at are quite simple (just main effects of the predictors on the DV), but finding a suitable model for the data has been hard. Your suggestions for alternatives are very welcome. Are there any of these that you would particularly recommend for underdispersed bounded count data?

      Thank you again!
