
  • order of independent variables in panel negative binomial regression

    Hello,

    I am using Stata 18 on Windows 10.

    I am running a negative binomial regression on a panel dataset, and the model does not always converge when I change the order of the independent variables.

    Does the order of the independent variables matter in a count data model?

    The data are from a travel-cost contingent behaviour survey. I have 422 observations from 211 individuals. The dependent variable is the number of trips to a recreational site, reported once for the current period and once for a future period. The variance of the dependent variable is greater than the mean, hence the negative binomial form.

    This is the code

    Code:
    xtset _index periods
    xtnbreg trips_cb tc_1 tcs periods dog_walking age female third_level if trips < 300 & trips > 0 & distance_1 < 31, nolog
    trips_cb - annual trips, integer
    tc_1 - travel cost to the recreational site
    tcs - travel cost to the substitute site
    periods - dummy variable indicating 1 for contingent/hypothetical future visits and 0 for current visits
    dog_walking - dummy = 1 if the person is walking their dog
    age - continuous
    female - binary
    third_level - dummy for 3rd level education.

    (I am limiting the observations used in the model to those that fit with the theory underpinning travel cost models). There are 24 ways of arranging the four variables dog_walking age female third_level. Although I think they are the same regression, it does not always converge.

    For example, some output:
    Code:
    . xtnbreg trips_cb tc_1 tcs periods dog_walking third_level female age if trips < 300 & trips > 0 &  distance_1 < 31, nolog
    convergence not achieved
    
    Random-effects negative binomial regression          Number of obs    =    422
    Group variable: _index                               Number of groups =    211
    
    Random effects u_i ~ Beta                            Obs per group:
                                                                      min =      2
                                                                      avg =    2.0
                                                                      max =      2
    
                                                         Wald chi2(7)     =  71.50
    Log likelihood = -1736.7896                          Prob > chi2      = 0.0000
    
    ------------------------------------------------------------------------------
        trips_cb | Coefficient  Std. err.      z    P>|z|     [95% conf. interval]
    -------------+----------------------------------------------------------------
            tc_1 |  -.2335243   .0337367    -6.92   0.000    -.2996469   -.1674016
             tcs |   .0499543   .0217355     2.30   0.022     .0073536    .0925551
         periods |  -.0281283   .0122978    -2.29   0.022    -.0522316    -.004025
     dog_walking |   .3941446   .1736135     2.27   0.023     .0538684    .7344208
     third_level |   -.227061   .1725194    -1.32   0.188    -.5651929    .1110708
          female |  -.0678351   .1517262    -0.45   0.655     -.365213    .2295428
             age |   .0117346   .0053414     2.20   0.028     .0012656    .0222036
           _cons |   17.67167   128.6094     0.14   0.891    -234.3981    269.7415
    -------------+----------------------------------------------------------------
           /ln_r |   13.47252   128.6087                      -238.596     265.541
           /ln_s |  -.1777657    .086382                     -.3470713     -.00846
    -------------+----------------------------------------------------------------
               r |   709641.7   9.13e+07                      2.4e-104    2.1e+115
               s |   .8371386   .0723137                      .7067549    .9915757
    ------------------------------------------------------------------------------
    LR test vs. pooled: chibar2(01) = 773.67               Prob >= chibar2 = 0.000
    convergence not achieved
    r(430);
    
    . xtnbreg trips_cb tc_1 tcs periods female age dog_walking third_level if trips < 300 & trips > 0 &  distance_1 < 31, nolog
    
    Random-effects negative binomial regression          Number of obs    =    422
    Group variable: _index                               Number of groups =    211
    
    Random effects u_i ~ Beta                            Obs per group:
                                                                      min =      2
                                                                      avg =    2.0
                                                                      max =      2
    
                                                         Wald chi2(7)     =  71.47
    Log likelihood = -1736.7895                          Prob > chi2      = 0.0000
    
    ------------------------------------------------------------------------------
        trips_cb | Coefficient  Std. err.      z    P>|z|     [95% conf. interval]
    -------------+----------------------------------------------------------------
            tc_1 |  -.2335191   .0337442    -6.92   0.000    -.2996565   -.1673817
             tcs |   .0499605   .0217411     2.30   0.022     .0073488    .0925722
         periods |  -.0281259   .0122978    -2.29   0.022    -.0522291   -.0040226
          female |  -.0677467   .1517559    -0.45   0.655    -.3651828    .2296895
             age |   .0117311   .0053425     2.20   0.028       .00126    .0222022
     dog_walking |   .3942596   .1736496     2.27   0.023     .0539127    .7346065
     third_level |  -.2272131   .1725566    -1.32   0.188    -.5654177    .1109916
           _cons |   18.59189   173.4156     0.11   0.915    -321.2964    358.4802
    -------------+----------------------------------------------------------------
           /ln_r |   14.39218   173.4152                     -325.4954    354.2797
           /ln_s |  -.1779298    .086387                     -.3472453   -.0086143
    -------------+----------------------------------------------------------------
               r |    1780095   3.09e+08                      4.4e-142    7.3e+153
               s |   .8370012   .0723061                       .706632    .9914227
    ------------------------------------------------------------------------------
    LR test vs. pooled: chibar2(01) = 773.67               Prob >= chibar2 = 0.000


  • #2
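    It should not matter how you order the RHS variables. But convergence problems are common in negative binomial regressions, and there is no strong reason to prefer this estimator to Poisson (more to follow). You can use the estimates from the converged model as starting values for the model with convergence problems. Because from() matches the values in a matrix to parameters by column name unless copy is specified, the different variable order is not a problem.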

    Code:
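    * fit the ordering that converges and save its coefficient vector
    xtnbreg trips_cb tc_1 tcs periods female age dog_walking third_level if trips < 300 & trips > 0 & distance_1 < 31, nolog
    matrix b = e(b)
    * refit the problematic ordering, using the saved estimates as starting values
    xtnbreg trips_cb tc_1 tcs periods dog_walking third_level female age if trips < 300 & trips > 0 & distance_1 < 31, nolog from(b, skip)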

    "The variance of the dependent variable is greater than the mean, hence the negative binomial form."
    While the usual Poisson MLE standard errors are wrong if the data are overdispersed, you can correct for this by clustering on the cluster unit. In this way, the estimator is fully robust.
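
    A minimal sketch of that suggestion, reusing the variable names and sample restriction from the original post. The choice of pooled poisson with vce(cluster _index) is an assumption about how to implement "clustering on the cluster unit" here, not something confirmed in the thread:

    ```stata
    * Pooled Poisson with standard errors clustered on the individual (_index).
    * Assumption: _index identifies the 211 respondents, as in the xtset call above.
    poisson trips_cb tc_1 tcs periods dog_walking age female third_level ///
        if trips < 300 & trips > 0 & distance_1 < 31, vce(cluster _index) nolog
    ```

    The point estimates do not depend on the vce() option; only the standard errors change.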
    Last edited by Andrew Musau; 03 Oct 2023, 10:49.



    • #3
      The coefficients and test statistics are the same. It's just the appendages that are different, and that's likely due to non-convergence.

      I agree with Andrew--use poisson with robust errors.



      • #4
        Thanks very much George and Andrew, using poisson with vce(robust) did solve the problem of non-convergence.

        Usually the standard in the travel cost modelling literature is to use a negative binomial when the trips count data is overdispersed. I'm wondering why that's the case when poisson with robust errors works as well.



        • #5
          A common confusion with negative binomial regression models is that "overdispersion relative to a Poisson probability model" means Var(y|x) > E(y|x), not Var(y) > E(y). The sample descriptive statistics will show the latter but not the former. (It may be helpful to recall the decomposition Var(y) = Var_x[E(y|x)] + E_x[Var(y|x)].)

          In my experience it is often the case that nonconvergence arises when Var(y|x)<E(y|x), which can happen even when Var(y)>E(y) but which cannot in general be accommodated by a negative binomial specification.
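
          The distinction can be illustrated with a small simulation. This is only a sketch on made-up data (the variables x, mu, y and the scalar ratio are hypothetical, not from the survey). Under a conditional Poisson model Var(y|x) = E(y|x) = mu(x), so the decomposition gives Var(y) = Var_x[mu(x)] + E_x[mu(x)], which exceeds E(y) whenever mu(x) varies with x:

          ```stata
          * Simulate counts that are exactly Poisson conditional on x
          clear
          set seed 12345
          set obs 5000
          generate double x  = rnormal()
          generate double mu = exp(1 + x)
          generate y = rpoisson(mu)

          * Marginal comparison: sample variance of y relative to its mean
          quietly summarize y
          scalar ratio = r(Var)/r(mean)
          display "marginal Var(y)/E(y) = " ratio

          * Conditional comparison: Pearson dispersion after a Poisson GLM
          glm y x, family(poisson) link(log) nolog
          display "Pearson dispersion   = " e(dispers_p)
          ```

          In a run of this kind the marginal ratio should come out well above 1 while the Pearson dispersion stays close to 1, so Var(y) > E(y) on its own says little about conditional overdispersion.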



          • #6
            Similar discussion here:
            https://www.statalist.org/forums/forum/general-stata-discussion/general/1587040-why-do-poisson-and-negative-binomial-regressions-yield-the-same-result
            I think people use nbreg mainly because they learned just enough about overdispersion to point them that way. A few papers get published, it becomes standard, and then you have to use it to get past journal referees.



            • #7
              Thanks everyone. I will probably end up having to use nbreg for journal submission, as you say, but knowing its issues relative to Poisson is really useful.
