  • Panel count data

    Hi everyone,

    I am new to the forum, so please let me know if posting the message here is not appropriate.

    I am currently analyzing panel count data (illness cases) depending on socioeconomic variables. I am naturally oriented towards a poisson regression.

    I have used the
    Code:
    xtpoisson chol std_pr gdp_cap pop_dens pol_stab, pa corr(exchangeable)
    which gives me the results I needed and fits my literature review. However, since I am dealing with panel data, I am unsure whether I should fit a fixed-effects or random-effects Poisson model rather than the population-averaged one, which, as I understand it, is a pooled regression. (Please note that both the FE and the RE models fail to give me results that I can adequately interpret, although the models fit well.)

    A second point is that my variance is significantly higher than my mean. I understand that my data is overdispersed and that I should consider a negative binomial model instead of a Poisson. In this case, I also wonder whether I should apply FE or RE. Would it be acceptable for me to limit my analysis to the population averaged model?
    Code:
    xtnbreg chol std_pr gdp_cap pop_dens pol_stab, pa corr(exchangeable)
    A final question is how to interpret the model's coefficients?

    Thank you for your assistance. I really appreciate it.

    Best,

  • #2
    The xt glossary defines population averaged as:
    population averaged model. A population-averaged model is used for panel data in which the
    parameters measure the effects of the regressors on the outcome for the average individual in the
    population. The panel-specific errors are treated as uncorrelated random variables drawn from a
    population with zero mean and constant variance, and the parameters measure the effects of the
    regressors on the dependent variable after integrating over the distribution of the random effects.

    See also: https://www.stata.com/support/faqs/s...tion-averaged/
    https://www.ncbi.nlm.nih.gov/pubmed/20220526
    http://www.biostat.jhsph.edu/~fdomin...A.FAQ.2005.doc

    I don't see a lot of population averaged models reported but they seem more common in epidemiology. FE and RE are the most common in my area. I suspect PA does assume the panel effects are uncorrelated with the x's which may be problematic but it is not exactly a pooled regression.

    If you look for postings on the Forum by Silva, you'll find some interesting comments on poisson versus negative binomial estimators.

    I'd look at how folks in your field tend to do this - a lot of discipline-based differences in preferences appear in these matters.
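To see how the population-averaged fit relates to a pooled regression, the two can be run side by side. This is only a sketch: it assumes the dataset from post #1 is in memory and that Code_c is the numeric panel identifier shown in the later output. With an independent working correlation the GEE point estimates coincide with pooled Poisson; with corr(exchangeable) the within-country correlation enters the estimating equations, so the coefficients can differ.

```stata
* Assumption: the poster's data are loaded and Code_c identifies the panels.
xtset Code_c

* Pooled Poisson: panel structure ignored for the coefficients,
* standard errors clustered by country.
poisson chol std_pr gdp_cap pop_dens pol_stab, vce(cluster Code_c)

* Population-averaged (GEE) Poisson, independent working correlation:
* same point estimates as the pooled fit above.
xtpoisson chol std_pr gdp_cap pop_dens pol_stab, pa corr(independent) vce(robust)

* GEE with the exchangeable working correlation used in post #1.
xtpoisson chol std_pr gdp_cap pop_dens pol_stab, pa corr(exchangeable) vce(robust)
```

Comparing the first two tables with the third shows how much the working correlation structure, rather than pooling itself, drives the estimates.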

    • #3
      Minerva:
      welcome to this forum.
      As an aside to Phil's helpful comments, please note:
      - because of the incidental parameters problem, -xtpoisson- and -xtnbreg- offer, in addition to the random-effects specification, a conditional fixed-effects specification, which differs from the fixed-effects specification under -xtreg-;
      - it's difficult to advise on which specification should be applied to your data; as Phil highlighted, much depends on the literature and/or the conventions in your research field;
      - a good first step to learn how to interpret the coefficients of any regression model is the Stata .pdf manual. Obviously, the Stata manual cannot replace a decent textbook on the econometrics of count data (see https://www.springer.com/la/book/9783540776482; https://www.stata.com/bookstore/regr...-count-data/);
      - finally, how can interested listers help you out in this respect if you do not share the output table of your regression model(s) (via CODE delimiters please; see the FAQ. Thanks)?
      Kind regards,
      Carlo
      (Stata 19.0)
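On reading the coefficients: these models use a log link, so each coefficient is the change in the log of the expected count for a one-unit change in the regressor, and exponentiating it gives an incidence-rate ratio. A minimal sketch, assuming the data from post #1 are in memory and already -xtset-:

```stata
* Report exponentiated coefficients (incidence-rate ratios) directly.
xtpoisson chol std_pr gdp_cap pop_dens pol_stab, pa corr(exchangeable) irr

* The same quantity by hand for one regressor: the stored coefficient
* is still on the log scale, so exponentiate it.
display exp(_b[std_pr])

* Percentage change in the expected count per one-unit increase in std_pr.
display 100 * (exp(_b[std_pr]) - 1)
```

An incidence-rate ratio above 1 means the expected count rises with the regressor; below 1 means it falls. In a population-averaged model the ratio describes the average over countries rather than the change within a given country.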

      • #4
        As Carlo and Phil pointed out, without the output it can be difficult to "interpret the model's coefficients". That being said, I wish to underline two points: 1) there may be a need to adjust for population size, and for this you may check the "exposure" option; 2) I wonder whether you are combining data on individuals with data on countries (gdp, perhaps?). If so, please beware of the ecological fallacy. If all the data are epidemiologic, I believe the PA models tend to fit perfectly.
        Best regards,

        Marcos
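The exposure adjustment mentioned in point 1 could look like the sketch below. The variable pop (population of each country-year) is hypothetical; it does not appear anywhere in the thread and stands in for whatever population measure is available.

```stata
* Assumption: pop holds the population at risk for each country-year.
* exposure(pop) adds ln(pop) to the linear predictor with its coefficient
* fixed at 1, so the model describes cases per person, not raw counts.
xtpoisson chol std_pr gdp_cap pop_dens pol_stab, pa corr(exchangeable) exposure(pop)

* Equivalent formulation with an explicit offset.
generate double ln_pop = ln(pop)
xtpoisson chol std_pr gdp_cap pop_dens pol_stab, pa corr(exchangeable) offset(ln_pop)
```

With exposure in the model, a large country no longer looks "sicker" merely because it has more people.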

        • #5
          Thank you Phil, Carlo and Marcos for your answers and for the useful links. I appreciate it.

          Marcos, my panel data comprises the number of deaths by country (y is a positive integer that could be equal to zero) and environmental variables (rainfall, gdp per capita, etc.).

          Only the population-averaged regression gives me coefficients with signs that fit the literature (particularly the negative binomial). Would the assumption that the errors are uncorrelated random variables with zero mean and constant variance introduce bias? I'm failing to find much literature on population-averaged models.

          Also, is there a test that helps decide which effect to select (pa, fe or re)? Would the Hausman test be applicable for a Poisson or negative binomial regression?

          Please see below the results of my poisson regressions and the negative binomial with population-averaged:
          • Poisson, pa:
          Code:
          xtpoisson chol std_pr gdp_cap pop_dens pol_stab, pa corr(exchangeable)
          
          Iteration 1: tolerance = .05816268
          Iteration 2: tolerance = .01638857
          Iteration 3: tolerance = .00080752
          Iteration 4: tolerance = .00020115
          Iteration 5: tolerance = .00007675
          Iteration 6: tolerance = .0000185
          Iteration 7: tolerance = 3.874e-06
          Iteration 8: tolerance = 7.622e-07
          
          GEE population-averaged model                   Number of obs     =      1,552
          Group variable:                     Code_c      Number of groups  =         97
          Link:                                  log      Obs per group:
          Family:                            Poisson                    min =         16
          Correlation:                  exchangeable                    avg =       16.0
                                                                        max =         16
                                                          Wald chi2(4)      =   12408.93
          Scale parameter:                         1      Prob > chi2       =     0.0000
          
          ------------------------------------------------------------------------------
                  chol |      Coef.   Std. Err.      z    P>|z|     [95% Conf. Interval]
          -------------+----------------------------------------------------------------
                std_pr |   .2361613   .0040234    58.70   0.000     .2282755    .2440471
               gdp_cap |  -.0002727   4.40e-06   -61.98   0.000    -.0002813    -.000264
              pop_dens |  -.0003182   .0000432    -7.36   0.000     -.000403   -.0002335
              pol_stab |  -.3975866   .0062278   -63.84   0.000    -.4097929   -.3853804
                 _cons |   3.809115   .0125154   304.35   0.000     3.784586    3.833645
          ------------------------------------------------------------------------------
          • Poisson, FE:
          Code:
          xtpoisson chol std_pr gdp_cap pop_dens pol_stab, fe
          note: 29 groups (464 obs) dropped because of all zero outcomes
          
          Iteration 0:   log likelihood = -70752.235  
          Iteration 1:   log likelihood = -68208.909  
          Iteration 2:   log likelihood =  -68199.31  
          Iteration 3:   log likelihood = -68199.309  
          
          Conditional fixed-effects Poisson regression    Number of obs     =      1,088
          Group variable: Code_c                          Number of groups  =         68
          
                                                          Obs per group:
                                                                        min =         16
                                                                        avg =       16.0
                                                                        max =         16
          
                                                          Wald chi2(4)      =    4913.19
          Log likelihood  = -68199.309                    Prob > chi2       =     0.0000
          
          ------------------------------------------------------------------------------
                  chol |      Coef.   Std. Err.      z    P>|z|     [95% Conf. Interval]
          -------------+----------------------------------------------------------------
                std_pr |   .2421363   .0041086    58.93   0.000     .2340835    .2501891
               gdp_cap |  -.0000815   7.59e-06   -10.74   0.000    -.0000964   -.0000666
              pop_dens |   .0149868    .000355    42.21   0.000     .0142909    .0156827
              pol_stab |   .0909319   .0125593     7.24   0.000      .066316    .1155478
          ------------------------------------------------------------------------------
          • Poisson, RE:
          Code:
          xtpoisson chol std_pr gdp_cap pop_dens pol_stab, re
          
          Fitting Poisson model:
          
          Iteration 0:   log likelihood = -141658.09  
          Iteration 1:   log likelihood = -137301.99  
          Iteration 2:   log likelihood = -137157.55  
          Iteration 3:   log likelihood = -137157.01  
          Iteration 4:   log likelihood = -137157.01  
          
          Fitting full model:
          
          Iteration 0:   log likelihood =  -74757.94  
          Iteration 1:   log likelihood = -69141.478  
          Iteration 2:   log likelihood = -68773.213  
          Iteration 3:   log likelihood = -68765.433  
          Iteration 4:   log likelihood = -68765.265  
          Iteration 5:   log likelihood = -68765.265  
          
          Random-effects Poisson regression               Number of obs     =      1,552
          Group variable: Code_c                          Number of groups  =         97
          
          Random effects u_i ~ Gamma                      Obs per group:
                                                                        min =         16
                                                                        avg =       16.0
                                                                        max =         16
          
                                                          Wald chi2(4)      =    4868.37
          Log likelihood  = -68765.265                    Prob > chi2       =     0.0000
          
          ------------------------------------------------------------------------------
                  chol |      Coef.   Std. Err.      z    P>|z|     [95% Conf. Interval]
          -------------+----------------------------------------------------------------
                std_pr |   .2418868   .0041084    58.88   0.000     .2338345    .2499392
               gdp_cap |  -.0000813   7.58e-06   -10.73   0.000    -.0000961   -.0000664
              pop_dens |   .0148124   .0003552    41.70   0.000     .0141162    .0155086
              pol_stab |   .0907163   .0125466     7.23   0.000     .0661255    .1153071
                 _cons |   3.100793   .3240657     9.57   0.000     2.465636     3.73595
          -------------+----------------------------------------------------------------
              /lnalpha |    2.31623   .1301783                      2.061085    2.571375
          -------------+----------------------------------------------------------------
                 alpha |   10.13739   1.319667                       7.85449     13.0838
          ------------------------------------------------------------------------------
          LR test of alpha=0: chibar2(01) = 1.4e+05              Prob >= chibar2 = 0.000
          • Negative binomial, pa:
          Code:
          xtnbreg chol std_pr gdp_cap pop_dens pol_stab, pa corr(exchangeable)
          
          Iteration 1: tolerance = .01271372
          Iteration 2: tolerance = .01200527
          Iteration 3: tolerance = .01479449
          Iteration 4: tolerance = .02042497
          Iteration 5: tolerance = .02270934
          Iteration 6: tolerance = .014262
          Iteration 7: tolerance = .00501106
          Iteration 8: tolerance = .00134181
          Iteration 9: tolerance = .00033693
          Iteration 10: tolerance = .00008735
          Iteration 11: tolerance = .00003572
          Iteration 12: tolerance = .00001644
          Iteration 13: tolerance = 7.324e-06
          Iteration 14: tolerance = 3.211e-06
          Iteration 15: tolerance = 1.397e-06
          Iteration 16: tolerance = 6.061e-07
          
          GEE population-averaged model                   Number of obs     =      1,552
          Group variable:                     Code_c      Number of groups  =         97
          Link:                                  log      Obs per group:
          Family:             negative binomial(k=1)                    min =         16
          Correlation:                  exchangeable                    avg =       16.0
                                                                        max =         16
                                                          Wald chi2(4)      =     732.86
          Scale parameter:                         1      Prob > chi2       =     0.0000
          
          ------------------------------------------------------------------------------
                  chol |      Coef.   Std. Err.      z    P>|z|     [95% Conf. Interval]
          -------------+----------------------------------------------------------------
                std_pr |   .2328292   .0261713     8.90   0.000     .1815344    .2841239
               gdp_cap |  -.0003345   .0000151   -22.21   0.000    -.0003641    -.000305
              pop_dens |  -.0000909   .0002443    -0.37   0.710    -.0005698    .0003881
              pol_stab |   -.446677   .0433751   -10.30   0.000    -.5316906   -.3616634
                 _cons |   3.728103   .0701301    53.16   0.000      3.59065    3.865555
          ------------------------------------------------------------------------------
          I apologize for posting such a long code. I'm really grateful for your help!
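On the Hausman question, the mechanics for the Poisson FE and RE fits shown above could be sketched as follows. This is illustrative only: the classical test treats the RE estimator as efficient under the null, which is doubtful when the counts are as overdispersed as described in post #1, so the statistic should be read with caution.

```stata
* Conditional fixed-effects Poisson (consistent under both hypotheses).
xtpoisson chol std_pr gdp_cap pop_dens pol_stab, fe
estimates store fe

* Random-effects Poisson (efficient only if the effects are
* uncorrelated with the regressors).
xtpoisson chol std_pr gdp_cap pop_dens pol_stab, re
estimates store re

* Hausman test on the coefficients common to both models;
* the consistent estimator is listed first.
hausman fe re
```

Because the FE and RE coefficients in the tables above are almost identical, a large test statistic would be surprising here; the sharper contrast is between those two and the population-averaged fit.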

          • #6
            Dear Minerva,

            You did not tell us what you want to do with your data, but assuming that you just want to see the relation between the dependent variable and the regressors, I would say that Poisson with FE and robust standard errors should be your benchmark because that is the most robust of all the estimators being considered (and it is very robust).

            Best wishes,

            Joao

            • #7
              Hi Joao,

              Thank you for your reply.

              I am indeed trying to analyse the relationship between my dependent and independent variables. I won't be doing any projections at this point. I have tried the Poisson with FE and the robust option:

              Code:
              xtpoisson chol std_pr gdp_cap pop_dens pol_stab, fe robust
              
              Iteration 0:   log pseudolikelihood = -70678.851  
              Iteration 1:   log pseudolikelihood = -68106.434  
              Iteration 2:   log pseudolikelihood = -68101.059  
              Iteration 3:   log pseudolikelihood = -68101.058  
              
              Conditional fixed-effects Poisson regression    Number of obs     =        928
              Group variable: Code_c                          Number of groups  =         58
              
                                                              Obs per group:
                                                                            min =         16
                                                                            avg =       16.0
                                                                            max =         16
              
                                                              Wald chi2(4)      =       3.61
              Log pseudolikelihood  = -68101.058              Prob > chi2       =     0.4613
              
                                               (Std. Err. adjusted for clustering on Code_c)
              ------------------------------------------------------------------------------
                           |               Robust
                      chol |      Coef.   Std. Err.      z    P>|z|     [95% Conf. Interval]
              -------------+----------------------------------------------------------------
                    std_pr |   .2426427   .1648578     1.47   0.141    -.0804728    .5657581
                   gdp_cap |  -.0000806   .0001443    -0.56   0.577    -.0003635    .0002023
                  pop_dens |    .015179   .0143058     1.06   0.289    -.0128599    .0432178
                  pol_stab |   .0896382   .3645985     0.25   0.806    -.6249617     .804238
              ------------------------------------------------------------------------------
              However, it shows that none of my variables is statistically significant! I am not sure what I'm doing wrong.

              Thank you so much for your help.

              Edit: I have reduced the number of countries, deleting those with no reported cases. However, this did not have a significant impact.
              Last edited by Minerva Evans; 06 Aug 2018, 14:01.

              • #8
                Dear Minerva,

                First of all, do not remove observations based on the value of the dependent variable; that generally leads to invalid results.

                About your model, you may not be doing anything wrong; it may just be that your regressors do not vary much over time (that is the case with gdp and population), so their effect is captured by the fixed effects. In view of this result, you may decide that it is not appropriate to condition on the fixed effects and instead run a standard Poisson regression model.

                Best wishes,

                Joao
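One way to check how much each regressor actually varies over time is to split its variation into between-country and within-country parts. A minimal sketch, assuming the data from this thread are in memory:

```stata
* Assumption: Code_c is the numeric country identifier from the output above.
xtset Code_c

* For each regressor, compare the "within" standard deviation (variation
* over time inside a country) with the "between" one (across countries).
xtsum std_pr gdp_cap pop_dens pol_stab

* If the within variation is small, a standard (pooled) Poisson regression
* with standard errors clustered by country is the fallback described above.
poisson chol std_pr gdp_cap pop_dens pol_stab, vce(cluster Code_c)
```

A regressor whose within standard deviation is tiny relative to its between standard deviation contributes little to the fixed-effects estimator, which is consistent with the wide confidence intervals in #7.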

                • #9
                  Thank you Joao for your answer.

                  Do you think I should check for stationarity in my panel data before doing the poisson regression?

                  • #10
                    That should not be needed.

                    Best wishes,

                    Joao

                    • #11
                      Hi everyone,

                      I am working with count data in a panel setting. My outcome variable has a lot of zeros so when I run this:

                      Code:
                      xtpoisson n sup_studies EU_Esp votea, fe vce(robust)
                      I get the message:

                      Code:
                      note: 31 groups (217 obs) dropped because of all zero outcomes
                      and the output is:

                      Code:
                      Conditional fixed-effects Poisson regression    Number of obs     =        294
                      Group variable: CODI_BARRI                      Number of groups  =         42
                      
                                                                      Obs per group:
                                                                                    min =          7
                                                                                    avg =        7.0
                                                                                    max =          7
                      
                                                                      Wald chi2(3)      =      53.40
                      Log pseudolikelihood  = -407.2029               Prob > chi2       =     0.0000
                      
                                                   (Std. Err. adjusted for clustering on CODI_BARRI)
                      ------------------------------------------------------------------------------
                                   |               Robust
                                 n |      Coef.   Std. Err.      z    P>|z|     [95% Conf. Interval]
                      -------------+----------------------------------------------------------------
                       sup_studies |   28.61781   4.682422     6.11   0.000     19.44043    37.79519
                            EU_Esp |   .0009798   .0003751     2.61   0.009     .0002445     .001715
                             votea |  -.0184421   .0108895    -1.69   0.090     -.039785    .0029008
                      ------------------------------------------------------------------------------
                      I was wondering how I could keep the observations with n=0, as in a zero-inflated Poisson, but for panel data. Thanks,
                      Marianna

                      • #12
                        Dear Marianna Sebo,

                        That note is somewhat misleading because you are effectively using all observations. When the outcome for a group is always equal to zero, the observations in that group are not informative about the slope parameters and can therefore be dropped to facilitate the estimation. So what you are doing is correct, and you should not worry about that note.

                        Best wishes,

                        Joao
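The reason all-zero groups carry no information can be seen from the conditional likelihood that the fixed-effects Poisson estimator maximizes. The notation below is assumed for illustration and does not come from the thread: y_it is the count for group i in period t, x_it the regressors, and beta the slope vector. Conditional on the group total, the counts are multinomial, so the contribution of group i is:

```latex
\begin{aligned}
p_{it}(\beta) &= \frac{\exp(x_{it}'\beta)}{\sum_{s=1}^{T}\exp(x_{is}'\beta)}, \\
\ell_i(\beta) &= \sum_{t=1}^{T} y_{it}\,\log p_{it}(\beta) + c_i ,
\end{aligned}
% c_i is the log multinomial coefficient and does not depend on \beta.
% If y_{it} = 0 for every t, each term of the sum is zero, so
% \ell_i(\beta) = c_i for all \beta: the group cannot move the estimate.
```

Dropping those groups therefore leaves the maximizer unchanged, which is why the estimates are the same whether or not the all-zero groups stay in the sample.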

                        • #13
                          Thanks for the clarification Joao Santos Silva!
                          Best,
                          Marianna
