
  • Multilevel negative binomial modelling via bayesmh

    Dear Statalist members,

    I am interested in fitting a three-level negative binomial model (random intercepts) via bayesmh and have made some progress with a working example using melanoma.dta as the data source. Currently I am struggling with the implementation of the random intercepts: in short, a model with one random intercept for region takes very long to converge, and I am not sure how to speed up the estimation process or how to implement a second random intercept, as I am new to Bayesian analysis. Is there anything I can do to improve the syntax and implement a three-level approach?

    I've used the negative binomial likelihood evaluator provided by Houssein Assaad in this forum and prepared the data by creating a new variable for multivariate analysis:
    Code:
    use http://www.stata-press.com/data/r14/melanoma, replace
    set seed 12345
    generate random = uv*uniform()
    A non-multilevel model works fine, but I can't figure out how to add a second random intercept for nation to my multilevel model (regions nested in nations):
    Code:
    * Frequentist model
    nbreg deaths uv random
    
    * Bayesian model
    bayesmh deaths uv random, lleval(negbin, param({lnalpha})) ///
    prior({deaths:uv}{deaths:random}{deaths:_cons}, flat) ///
    prior({lnalpha},flat)
    
    * Frequentist model (First Random Intercept: Region):
    menbreg deaths uv random ||region:, nolog noheader
    
    * Bayesian model (First Random Intercept: Region):    
    bayesmh deaths uv random i.region, lleval(negbin, param({lnalpha})) ///
    prior({deaths:uv}{deaths:random}{deaths:_cons}, flat) ///
    prior({deaths:i.region}, flat) ///
    prior({var_region}, igamma(0.001, 0.001))                    ///
    prior({lnalpha}, flat) ///
    block({deaths:_cons}{deaths:uv}{deaths:random}{lnalpha}) ///
    block({deaths:i.region}) exclude({deaths:i.region}) ///
    burnin(50000) mcmcsize(50000) dots
    Any comment would be highly appreciated.

  • #2
    Daniel,
    to achieve more efficient sampling of a multilevel negative binomial model, you need to use some of the random-effects facilities of bayesmh; see the reffects and redefine options. Note, however, that these options are not supported with likelihood evaluators, so I suggest you program the negative binomial likelihood directly using the generic likelihood specification llf(). For an outcome variable y, linear predictor {xb}, and model parameter {lnalpha}, the log likelihood can be calculated using the expression
    Code:
    ln(nbinomialp(exp(-{lnalpha}), y, 1/(1+exp({lnalpha}+{xb}))))
    For example, the Bayesian formulation of
    Code:
    nbreg deaths uv random
    using flat priors is
    Code:
    bayesmh deaths, nocons likelihood(llf(                ///
        ln(nbinomialp(exp(-{lnalpha}), deaths,            ///
        1/(1+exp({lnalpha}+{deaths:uv random}+{deaths:_cons}))))))  ///
        prior({deaths:uv random _cons}{lnalpha}, flat)
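    As a quick, untested sanity check of this parameterization, you can evaluate the same expression at the nbreg ML estimates and compare its sum with the log likelihood reported by nbreg; the sketch below assumes that e(alpha) holds nbreg's dispersion estimate.
    Code:
    nbreg deaths uv random
    predict double xbhat, xb
    * per-observation log likelihood from the llf() expression above
    generate double llobs = ln(nbinomialp(1/e(alpha), deaths, ///
        1/(1+e(alpha)*exp(xbhat))))
    quietly summarize llobs
    display "sum of llobs = " %12.4f r(sum) "   e(ll) = " %12.4f e(ll)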
    Now, you can add random effects to the model. Say, you want a Bayesian formulation of the two-level model
    Code:
    menbreg deaths uv random ||region:, nolog noheader
    You can use the option redefine(reg:i.region) to define the random intercepts and the shortcut {reg:} to refer to them in the likelihood expression. To have a proper two-level specification, I suggest that you suppress the base level of region, exclude the constant term from the linear predictor, and use the prior prior({reg:i.region}, normal({deaths:_cons}, {var})) for the random intercepts; the parameter {deaths:_cons} will be the mean random effect. The full specification can look something like this

    Code:
    fvset base none region
    bayesmh deaths, nocons likelihood(llf(                   ///
        ln(nbinomialp(exp(-{lnalpha}), deaths,               ///
        1/(1+exp({lnalpha}+{reg:}+{deaths:uv random}))))))   ///
        redefine(reg:i.region)                               ///
        prior({deaths:uv random _cons}, normal(0, 100))      ///
        prior({reg:i.region}, normal({deaths:_cons}, {var})) ///
        prior({var}, igamma(0.001, 0.001))                   ///
        prior({lnalpha}, flat)                               ///
        block({deaths:_cons uv random}{lnalpha})             ///
        exclude({reg:i.region})
    Once you understand this specification, you can extend it to a three-level specification.
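    For instance, an untested sketch of one possible three-level extension (regions nested within nations in melanoma.dta), with nation intercepts {nat:} centered at {deaths:_cons} and region intercepts {reg:} centered at zero, could look like this; the priors and blocking will likely need tuning:
    Code:
    fvset base none nation region
    bayesmh deaths, nocons likelihood(llf(                        ///
        ln(nbinomialp(exp(-{lnalpha}), deaths,                    ///
        1/(1+exp({lnalpha}+{nat:}+{reg:}+{deaths:uv random})))))) ///
        redefine(nat:i.nation) redefine(reg:i.region)             ///
        prior({deaths:uv random _cons}, normal(0, 100))           ///
        prior({nat:i.nation}, normal({deaths:_cons}, {var_nat}))  ///
        prior({reg:i.region}, normal(0, {var_reg}))               ///
        prior({var_nat} {var_reg}, igamma(0.001, 0.001))          ///
        prior({lnalpha}, flat)                                    ///
        block({deaths:_cons uv random}{lnalpha})                  ///
        exclude({nat:i.nation}{reg:i.region})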

    Nikolay

    • #3
      Thank you, Nikolay, your example was very helpful - especially your suggestion regarding the generic likelihood specification! All my models run much faster now.

      • #4
        Hey guys,

        I found this topic online and would like to pose an additional question to you - I hope you don't mind. Is it possible to extend Nikolay's example to more complex models, e.g. zero-inflated, zero-truncated, or hurdle models? So far I've found several examples of multilevel zero-inflated Poisson models, like this one here, but almost all of them focus on SAS. Do you think it is possible to build such a model in Stata using the Bayesian framework?

        Thank you in advance. Sandra

        • #5
          Hi Sandra,

          Yes, the negative binomial example I've shown above can be extended to more complex models as long as the likelihood can be formulated as a single substitutable expression, which I believe is true for the zero-inflated and zero-truncated negative binomial and the hurdle models. Not all models, however, can be formulated using the llf() likelihood option; for example, it is not possible for likelihoods that involve iterations across observations. If you are interested in a particular model, you may continue this discussion or contact Stata technical support.

          Nikolay

          • #6
            Hey Nikolay,

            I'm very interested in extending your example to account for zero inflation, as most of my data sets contain strong evidence of both overdispersion and excess zeros. Unfortunately, I haven't figured out how to implement both parts of a zero-inflated model, the logit part and the count part, in the llf() option - in the SAS example I mentioned earlier, an if-else combination is used to achieve this task.

            Sandra

            • #7
              Including zero-inflation sounds tempting... Sadly, I can't provide a solution to your question; I've just started to read up on likelihood specifications for bayesmh. Having said that, is there any advanced literature on likelihood specifications in Stata?

              • #8

                Sandra,

                you can specify the zero-inflated Poisson regression model in bayesmh using the following likelihood option
                Code:
                llf(cond(y==0,                                                     ///
                    ln(1/(1+exp(-{infl:})) + 1/(1+exp({infl:}))*exp(-exp({xb:}))), ///
                    y*{xb:} - ln(1+exp({infl:}))                                   ///
                    - exp({xb:}) - lngamma(y+1) ))
                where y is the dependent variable, {xb:} is the linear predictor form in the Poisson model and {infl:} is the linear form that models the inflation at zero.

                In the link you have provided we see examples using the following data
                Code:
                import delimited http://statistics.ats.ucla.edu/stat/data/hdp.csv, clear
                encode smokinghx, gen(smoking)
                encode cancerstage, gen(cancer)
                drop smokinghx cancerstage
                The considered zero-inflated Poisson model can be fitted in Stata using the zip command
                Code:
                zip nmorphine age lengthofstay bmi i.cancer, inflate(age i.smoking i.cancer)
                A Bayesian formulation of this model would be
                Code:
                gen cons = 1
                bayesmh nmorphine, noconstant likelihood(llf(cond(nmorphine==0,    ///
                    ln(1/(1+exp(-{infl:})) + 1/(1+exp({infl:}))*exp(-exp({xb:}))), ///
                    nmorphine*{xb:} - ln(1+exp({infl:}))                           ///
                    - exp({xb:}) - lngamma(nmorphine+1) )))                        ///
                    xbdefine(xb:   age lengthofstay bmi i.cancer cons)             ///
                    xbdefine(infl: age i.smoking i.cancer cons)                    ///
                    prior({xb:} {infl:}, normal(0, 10))                            ///
                    block({xb:i.cancer}) block({xb:age lengthofstay bmi cons})     ///
                    block({infl:age cons}) block({infl:i.cancer}) dots
                You can amend the linear form {xb:} with random effects, as in the negative binomial example above, to implement a mixed effects zero-inflated Poisson model.
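                For instance, assuming the data contain a doctor identifier did, as in the linked SAS example, an untested sketch of a doctor-level random intercept added to the count part could be
                Code:
                fvset base none did
                bayesmh nmorphine, noconstant likelihood(llf(cond(nmorphine==0,           ///
                    ln(1/(1+exp(-{infl:})) + 1/(1+exp({infl:}))*exp(-exp({xb:}+{doc:}))), ///
                    nmorphine*({xb:}+{doc:}) - ln(1+exp({infl:}))                         ///
                    - exp({xb:}+{doc:}) - lngamma(nmorphine+1) )))                        ///
                    xbdefine(xb:   age lengthofstay bmi i.cancer cons)                    ///
                    xbdefine(infl: age i.smoking i.cancer cons)                           ///
                    redefine(doc:i.did)                                                   ///
                    prior({doc:i.did}, normal(0, {var_doc}))                              ///
                    prior({var_doc}, igamma(0.01, 0.01))                                  ///
                    prior({xb:} {infl:}, normal(0, 10))                                   ///
                    block({xb:i.cancer}) block({xb:age lengthofstay bmi cons})            ///
                    block({infl:age cons}) block({infl:i.cancer})                         ///
                    exclude({doc:i.did}) dots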

                I hope you find this helpful,
                Nikolay
                Last edited by Nikolay Balov (StataCorp); 04 Apr 2016, 10:02.

                • #9
                  Thank you so much, Nikolay, it works like a charm! As soon as possible I will include some random effects and post feedback here. Again, thank you so much!

                  • #10
                    Hello again,

                    Following Nikolay's example, I managed to implement some random effects in zero-inflated Poisson models and even a simple hurdle model. Unfortunately, I've been struggling with zero-inflated negative binomial models for quite some time now.

                    Using Stata's default example for zinb models, I came up with the following syntax based on Joseph Hilbe's log-likelihood functions (Joseph Hilbe (2011): Negative Binomial Regression, p. 372):

                    Code:
                    webuse fish
                    zinb count persons livebait, inflate(child camper)
                    
                    bayesmh count, noconstant likelihood(llf(cond(count==0, ///
                    ln(1/(1+exp(-{xb1:})))+1/(1+exp({xb1:}))*(1/(1+ {alpha}*exp({xb:})))^(1/{alpha}), ///
                    ln(1/(1+exp(-{xb1:}))) + ln((1/{alpha})+count) - ln(count+1) - ln(1/{alpha}) +            ///
                    (1/{alpha}) * ln(1/(1+{alpha}*exp({xb:}))) + count*ln(1-(1/(1+ {alpha}*exp({xb:})))))) ///
                    xbdefine(xb: persons livebait cons)        ///
                    xbdefine(xb1: child camper cons)          ///
                    prior({xb:} {xb1:}, normal(0, 10)))           ///
                    prior({alpha}, normal(0, 10))                  ///
                    block({xb: persons livebait cons})         ///
                    block({xb1: child camper cons}) dots
                    These lines result in the following error message:
                    option xbdefine(xb: persons livebait cons) xbdefine(xb1: child camper cons) prior({xb:} {xb1:}, normal(0, 10)) is not supported by likelihood model logdensity


                    Has anyone had a similar experience, or can anyone provide some insight into this error? Thank you!
                    Last edited by Daniel Weller; 23 Apr 2016, 03:33.

                    • #11
                      Nikolay, could you possibly extend your first example, posted on March 29th (negative binomial with random intercept), to also include a single random coefficient for either "random" or "uv"? Any guidance would be immensely appreciated.

                      • #12
                        Hello again, Daniel. Is there a bracket missing in your likelihood function? I've added an additional closing bracket, but it still results in an error regarding xbdefine for xb1. Is it at all possible to use xbdefine in that way?

                        • #13
                          Justin,

                          I will extend the negative binomial example posted on March 29th to include a random slope for the uv variable. The resulting model will be a Bayesian formulation of the following multilevel model fitted with the menbreg command
                          Code:
                          . menbreg deaths uv random ||region: uv
                          In addition to the random intercept parameters {reg:} I introduce random coefficients {reg_uv:}, one for each level of region, again using the redefine() option: redefine(reg_uv:i.region). In the log-likelihood expression for the model I use {reg_uv:} simply as a coefficient to the uv variable
                          Code:
                          ln(nbinomialp(exp(-{lnalpha}), deaths, 1/(1+exp({lnalpha}+{reg:}+{reg_uv:}*uv+{deaths:random}))))
                          I assign the random coefficients {reg_uv:} a normal prior with unknown mean {uv:_cons} and variance {uv:var}.

                          Because of the increased complexity of the model, you may need to think more carefully about specifying priors, blocks of parameters, etc. Below I show a full specification of the model where I use a somewhat more informative igamma(2,1) prior for the variance parameters.
                          Code:
                          . fvset base none region
                          . set seed 12345
                          * Bayesian model (Random Intercept: Region, Random uv slope: Region):
                          . bayesmh deaths, nocons likelihood(llf(                            ///
                              ln(nbinomialp(exp(-{lnalpha}), deaths,                          ///
                              1/(1+exp({lnalpha}+{reg:}+{reg_uv:}*uv+{deaths:random}))))))    ///
                              redefine(reg:i.region)                                          ///
                              redefine(reg_uv:i.region)                                       ///
                              prior({reg:i.region},    normal({deaths:_cons}, {deaths:var}))  ///
                              prior({reg_uv:i.region}, normal({uv:_cons}, {uv:var}))          ///
                              prior({deaths:var},      igamma(2, 1))                          ///
                              prior({uv:var},          igamma(2, 1))                          ///
                              prior({lnalpha},         flat)                                  ///
                              prior({deaths:random _cons}{uv:_cons}, normal(0, 100))          ///
                              block({deaths:_cons random}{lnalpha}) block({deaths:var})       ///
                              block({uv:var}{uv:_cons})                                       ///
                              exclude({reg:i.region}{reg_uv:i.region}) dots
                          Please, let me know if you have problems running this example.
                          Nikolay

                          • #14
                            Hi Daniel,
                            I am sorry for the late response. You have made really good progress in formulating the zero-inflated negative binomial model.

                            I see a syntax error in your specification (a misplaced parenthesis) that is causing the error message. Yes, it is not easy to keep track of so many closing parentheses. You should have
                            Code:
                            bayesmh count, noconstant likelihood(llf(cond(count==0, ///
                                ln(1/(1+exp(-{xb1:}))) + 1/(1+exp({xb1:}))*(1/(1+ {alpha}*exp({xb:})))^(1/{alpha}), ///
                                ln(1/(1+exp(-{xb1:}))) + ln((1/{alpha})+count) - ln(count+1) - ln(1/{alpha}) +            ///
                                (1/{alpha}) * ln(1/(1+{alpha}*exp({xb:}))) + count*ln(1-(1/(1+ {alpha}*exp({xb:}))))))) ///
                                xbdefine(xb: persons livebait cons)        ///
                                xbdefine(xb1: child camper cons)          ///
                                prior({xb:} {xb1:}, normal(0, 10))           ///
                                prior({alpha}, normal(0, 10))                  ///
                                block({xb: persons livebait cons})         ///
                                block({xb1: child camper cons}) dots
                            However, there is another, algebraic problem in the actual likelihood function and you still would not be able to run the above command. Below, I rewrote the log-likelihood following the formulas from the documentation of the zinb command.

                            Note that I am using {xb:} for the main linear predictor and {xg:} for the linear predictor at zero (inflation). Because {alpha} is a positive parameter, I specify lognormal(0, 10) prior for it and initialize it with a positive value.

                            Code:
                            . bayesmh count, noconstant                                    ///
                                likelihood(llf(cond(count<=0, ln(invlogit({xg:}) +         ///
                                (1-invlogit({xg:}))/(1+{alpha}*exp({xb:}))^(1/{alpha})),   ///
                                (ln(1-invlogit({xg:})) + lngamma(count+1/{alpha}) -       ///
                                lngamma(count+1) - lngamma(1/{alpha}) -                    ///
                                (count+1/{alpha})*ln(1+{alpha}*exp({xb:})) +               ///
                                count*(ln({alpha})+{xb:})) )))                             ///
                                xbdefine(xb: persons livebait cons)                        ///
                                xbdefine(xg: child camper cons)                            ///
                                prior({xb:} {xg:}, normal(0, 10))                          ///
                                prior({alpha},     lognormal(0, 10))                       ///
                                block({xb:}) block({xg:}) init({alpha} 1) dots
                            Note that the likelihood can also be reparameterized in terms of {lnalpha}=ln({alpha}), which is somewhat more efficient.
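                            For instance, an untested sketch of that reparameterization, substituting exp({lnalpha}) for {alpha} so that the sampled parameter is unconstrained (and assuming the constant variable cons exists, e.g., from gen cons = 1), could be
                            Code:
                            . bayesmh count, noconstant                                          ///
                                likelihood(llf(cond(count<=0, ln(invlogit({xg:}) +               ///
                                (1-invlogit({xg:}))/(1+exp({lnalpha}+{xb:}))^exp(-{lnalpha})),   ///
                                ln(1-invlogit({xg:})) + lngamma(count+exp(-{lnalpha})) -         ///
                                lngamma(count+1) - lngamma(exp(-{lnalpha})) -                    ///
                                (count+exp(-{lnalpha}))*ln(1+exp({lnalpha}+{xb:})) +             ///
                                count*({lnalpha}+{xb:}) )))                                      ///
                                xbdefine(xb: persons livebait cons)                              ///
                                xbdefine(xg: child camper cons)                                  ///
                                prior({xb:} {xg:}, normal(0, 10))                                ///
                                prior({lnalpha}, normal(0, 10))                                  ///
                                block({xb:}) block({xg:}) block({lnalpha}) dots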

                            Nikolay

                            • #15
                              Hello everybody! Thank you so much for your comments, Sandra and Nikolay! I'm pretty sure Nikolay's approach provides a solution to many problems a few of my colleagues encountered over the last couple of weeks.
