  • How to build a mixed logit model where the dependent variable is on a rating scale

    Hi,

    I have an SP dataset in which each respondent's choice between two alternatives was recorded on a rating scale.

    E.g., the following is a sample row from the dataset:

    Rating_A  Rating_B  TT_A  TC_A  TT_B  TC_B
    0.7       0.3       10    5     15    3.5

    Can anyone please let me know how to build a mixed logit (panel data) model in this situation?

    Most of the mixed logit models that I found on the internet use a dichotomous dependent variable.

    As a novice, I'd be grateful if you could suggest a solution or send me a link to some relevant text.

    Thanks
    Neeraj

  • #2
    Neeraj

    It looks like your rating variables contain proportions which sum to one (i.e. Rating_A + Rating_B = 1), is that right? In that case you can use the approach suggested by Papke and Wooldridge (1996) to model fractional response variables. As demonstrated by Baum (2008), this can be implemented in Stata using the glm command, e.g.

    Code:
    use http://www.ats.ucla.edu/stat/stata/faq/proportion, clear
    glm meals yr_rnd parented api99, link(logit) family(binomial) vce(robust) nolog

    In your case the dependent variable would be Rating_A and the independent variables the difference between the characteristics of the alternatives (i.e. TT = TT_A - TT_B and TC = TC_A - TC_B).

    You can estimate a mixed version of this model by using meglm or gllamm (SJ, SSC).
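
    For the variables in the question, the glm approach might be sketched as follows. This is only an illustration: it assumes Rating_A lies between 0 and 1 and that a respondent identifier exists (called pid here, a hypothetical name that does not appear in the data description), so that standard errors can be clustered on respondents.

    Code:
    * differences in attributes between alternatives A and B
    gen TT = TT_A - TT_B
    gen TC = TC_A - TC_B
    * fractional logit for the share assigned to A
    * (pid is a hypothetical respondent identifier)
    glm Rating_A TT TC, link(logit) family(binomial) vce(cluster pid) nolog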

    Alternatively, you can trick logit into estimating the model in the example above by using weights:

    Code:
    use http://www.ats.ucla.edu/stat/stata/faq/proportion, clear
    gen id = _n
    expandcl 2, cluster(id) generate(newid)
    bysort id (newid): gen n = _n
    gen dep = 1 if n==1
    replace dep = 0 if n==2
    replace meals = 1-meals if n==2
    logit dep yr_rnd parented api99 [pw=meals], vce(cluster id) nolog

    Using this logic you can estimate the mixed logit model with mixlogit (SJ, SSC), but the data then needs to be in long form as described in my 2007 SJ paper (reference below).
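
    A rough sketch of that long-form preparation for the data in the question, assuming one row per choice situation and a respondent identifier pid (a hypothetical name); obs, copy, w, grp, alt, choice and asc_A are likewise illustrative names. The weighted clogit at the end is only a check that the layout reproduces the weighted-logit trick; mixlogit expects the same one-row-per-alternative layout, with its group() and rand() options.

    Code:
    gen obs = _n
    * two copies of each choice situation: copy 1 has A "chosen", copy 2 has B "chosen"
    expand 2
    bysort obs: gen copy = _n
    * weight each copy by the rating of its "chosen" alternative
    gen w = cond(copy==1, Rating_A, Rating_B)
    egen grp = group(obs copy)
    * one row per alternative; alt takes the values "A" and "B"
    reshape long TT_ TC_, i(grp) j(alt) string
    gen choice = (alt=="A" & copy==1) | (alt=="B" & copy==2)
    * alternative-specific constant for A
    gen asc_A = (alt=="A")
    clogit choice asc_A TT_ TC_ [pw=w], group(grp) vce(cluster pid) nolog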

    Arne

    References

    Baum, C.F. 2008. Modeling proportions. Stata Journal 8: 299–303.

    Hole, A.R. 2007. Fitting mixed logit models by using maximum simulated likelihood. Stata Journal 7: 388–401.

    Papke, L.E. and J. Wooldridge. 1996. Econometric methods for fractional response variables with an application to 401(k) plan participation rates. Journal of Applied Econometrics 11: 619–632.
    Last edited by Arne Risa Hole; 25 Jan 2015, 11:35.


    • #3
      Hi Arne,

      Thanks for your reply.

      Regarding your question on rating variables containing proportions, let me describe the experiment again, which I didn't do previously:

      1. An experiment showing alternatives A and B was given to a respondent
      2. The respondent was asked to rate the alternative based on the following scale:

      (1) Definitely A


      • #4
        Oops! Something went wrong..

        Hi Arne,

        Thanks for your reply.

        Regarding your question on rating variables containing proportions, let me describe the experiment again, which I didn't do previously:

        1. An experiment showing alternatives A and B was given to a respondent
        2. The respondent was asked to rate the alternative based on the following scale:

        (1) Definitely A (2) Probably A (3) Can't Say (4) Probably B (5) Definitely B

        3. A rule was then set: if the respondent gives rating (2), then RATING_A = 0.7 and RATING_B = 0.3. Similarly, for rating (5), RATING_A = 0.1 and RATING_B = 0.9.


        Effectively, I can restructure my dataset like this:

        Rating  TT_A  TC_A  TT_B  TC_B
        2       10    5     15    3.5


        I wish to build an ordered logit model initially, followed by a mixed logit (panel) model.

        So far I have only come across work on the rank-ordered logit. Can you let me know how to fit a rating-ordered logit?


        Thanks
        Neeraj


        • #5
          So your rating numbers are a monotone transform of the original ranking 1 to 5. How to use it, then, depends on how that transform was arrived at and what it means. At one extreme, it might be the result of a program of research, grounded in theory, that demonstrated that 70% of those who respond 2 exhibit a certain behavior, whereas that behavior is exhibited by only 10% of those who respond with 5. In that case, the ratings are a true proportion, and Arne Risa Hole's advice to look into -glm- and -meglm- is an excellent way to proceed.

          At the other extreme, the 0.7, 0.1 and the other numbers are just some arbitrary numbers that were assigned, with no particular theoretical or empirical justification. In that case, you have the illusion, but not the reality, of a proportion. The most sensible thing to do in this case is to acknowledge that these new numbers convey no more information than the original 1 through 5. Ignore them and just work with the original 1-5 as an ordinal response variable.

          Or there may be some partial theoretical or empirical basis for these numbers, but nothing really solid. In that case, you have to make a decision how seriously you want to take those numbers and then act accordingly. If you believe them, I would follow Arne Risa Hole's advice. If you think they are thin gruel, then ignore them and stick with rank-based analysis.

          Putting it more briefly, there are rank-based logistic models, and there are proportion-based logistic models. (And there are other models that treat the "ratings" as being interval-level measures but are not based on the logistic distribution, e.g. ordinary linear regression.) But to my knowledge there are no rating ordered logistic models.


          • #6
            Thank you Sir,

            I would like to keep only the original ratings from 1-5. No proportions.

            The objective of the study was to evaluate the probability of shift from alternative A to B or vice-versa.

            Can you please suggest which models I can build initially?

            One of the previous studies applied a binary logit to this dataset. I wish to compare different models.

            Thanks
            Neeraj


            • #7
              Neeraj

              As Clyde Schechter has very clearly explained, an alternative to the approach I outlined would be to work with the original 1-5 as an ordinal response variable. Given the additional information you have now provided about your data, that seems like a more sensible option to me, unless there is some good justification for assigning the proportions to the ranking. You can simply fit an ordered logit or probit model using the ranking as your dependent variable and the difference between the characteristics of the alternatives as independent variables. Mixed ordered models can be fit using meologit or meoprobit.
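
              A minimal sketch of both models, assuming the 1-5 response is stored in a variable called Rating and that a respondent identifier exists (pid is a hypothetical name):

              Code:
              gen TT = TT_A - TT_B
              gen TC = TC_A - TC_B
              * pooled ordered logit, standard errors clustered on respondents
              ologit Rating TT TC, vce(cluster pid) nolog
              * ordered logit with a respondent-level random intercept
              meologit Rating TT TC || pid:, nolog

              Because 1 means "Definitely A" and 5 means "Definitely B" on this scale, a positive coefficient on TT would indicate that larger values of TT_A relative to TT_B shift responses toward B.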

              Arne


              • #8
                Thanks a lot Arne!

                I'd just like to add that 1-5 denotes a "rating" scale (not a ranking), which a respondent provides for the two competing alternatives in the experiment.


                I'll certainly try the meologit command after preparing the dataset accordingly. I'd be glad if you could send me some relevant links explaining the meologit technique.


                Also, is it possible to specify an individual utility function for each alternative and then find the probabilities? E.g.
                Utility_A = b0 + b1 * TT_A + b2 * TC_A
                Utility_B = b1 * TT_B + b2 * TC_B


                Thanks once again for your help


                • #9
                  Thanks for the clarification. For background on meologit I would recommend the Stata manual (www.stata.com/manuals13/memeologit.pdf) which is a really excellent resource.

                  When you define the variables the way I suggested, you are estimating Utility_A - Utility_B = b0 + b1 * (TT_A - TT_B) + b2 * (TC_A - TC_B). In other words, the estimated coefficient for (TT_A - TT_B) is an estimate of b1, and similarly for b2.
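
                  To see this numerically, one can compute the two utilities from the fitted coefficients and check that they reproduce the model's predicted probability of A. The sketch below uses a fractional logit on Rating_A purely for illustration (the same algebra applies to a binary logit); V_A, V_B, P_A and P_A_glm are illustrative names.

                  Code:
                  gen double TT = TT_A - TT_B
                  gen double TC = TC_A - TC_B
                  glm Rating_A TT TC, link(logit) family(binomial) vce(robust) nolog
                  * alternative-specific utilities; the constant b0 enters A only
                  gen double V_A = _b[_cons] + _b[TT]*TT_A + _b[TC]*TC_A
                  gen double V_B = _b[TT]*TT_B + _b[TC]*TC_B
                  gen double P_A = exp(V_A) / (exp(V_A) + exp(V_B))
                  * the result should agree with the fitted mean from glm
                  predict double P_A_glm, mu
                  assert reldif(P_A, P_A_glm) < 1e-6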


                  • #10
                    I'm interested in developing a mixed logit model for crash injury severity. My response variable is "injury severity", which is categorical with 3 levels (i.e. minor injury, serious injury, fatality). There are various explanatory variables, but I will list a few of them: land use (urban, rural), lighting (day, evening, night), vehicle type (passenger car, light truck, heavy truck), speed limit, driver age, and vehicle age. "Land use", "lighting" and "vehicle type" are also categorical. There are more than 2000 observations.
                    I would be really grateful if someone could provide the code for a mixed logit model in Stata and explain how to create a dataset for a mixed logit severity model.
                    Thank you.


                    • #11
                      #10 is a duplicate post. See https://www.statalist.org/forums/for...d-logit-models.
