
  • Nested logit with panel data?

    Hi all,

    I have data from a discrete choice experiment in which respondents were shown 12 choice tasks. For each choice task, they were asked if they would like to make a purchase, and if so, which of two alternatives they would prefer. Each alternative was characterized by a set of attributes. The choice tasks included alternatives with different variations of levels of these attributes (but the same set of attributes).

    I would like to run a nested logit model in which the upper level is the decision to purchase or not, and the lower level is the choice of alternative (if they purchase either). I'm having some trouble finding examples of nested logit with multiple responses per individual (i.e., panel data). Is this approach appropriate, and is it possible to use nlogit here?

    Much appreciated.

    Thank you,
    Seema Kacker


  • #2
    (1) You can do much better than -nlogit-. You can fit what Stata calls panel-data mixed logit models, either by using Arne Risa Hole's -mixlogit- command or Stata 16's -cmxtmixlogit- command. If you'd like to work with a discrete mixing distribution instead of a continuous mixing distribution, you can use my -lclogit2- instead.

    When you specify your model, you can include an alternative-specific constant (ASC) that is equal to 0 for the opt-out option and 1 for both purchase options. Then you let the coefficient on this ASC be random. That will induce a positive correlation between the random utilities of the purchase options, thereby allowing you to mimic the defining feature of -nlogit-. Have a look at Walker et al. (2007) for more information.

    Walker, J.L., Ben‐Akiva, M. and Bolduc, D. (2007), Identification of parameters in normal error component logit‐mixture (NECLM) models. J. Appl. Econ., 22: 1095-1125. https://doi.org/10.1002/jae.971
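To illustrate the random-ASC idea, here is a minimal simulation sketch, not output from -mixlogit- or -cmxtmixlogit-. It assumes a normally distributed ASC with a hypothetical standard deviation `sigma_asc`, i.i.d. standard Gumbel idiosyncratic errors, and omits the systematic utility because it does not affect the correlation. Under those assumptions, the unobserved utilities of the two purchase options should have a correlation of roughly sigma^2 / (sigma^2 + pi^2/6), while the opt-out stays uncorrelated with them.

```python
import math
import random


def gumbel(rng):
    # Standard Gumbel (type I extreme value) draw via the inverse CDF.
    u = min(max(rng.random(), 1e-12), 1.0 - 1e-12)  # keep u strictly inside (0, 1)
    return -math.log(-math.log(u))


def pearson(x, y):
    # Plain Pearson correlation coefficient.
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


def simulate(sigma_asc=2.0, n_draws=50_000, seed=1):
    # sigma_asc, n_draws and seed are illustrative assumptions.
    rng = random.Random(seed)
    u_out, u_a, u_b = [], [], []
    for _ in range(n_draws):
        asc = rng.gauss(0.0, sigma_asc)  # random ASC shared by both purchase options
        u_out.append(gumbel(rng))        # opt-out: ASC dummy = 0
        u_a.append(asc + gumbel(rng))    # purchase option A: ASC dummy = 1
        u_b.append(asc + gumbel(rng))    # purchase option B: ASC dummy = 1
    return pearson(u_a, u_b), pearson(u_out, u_a)


corr_purchase, corr_optout = simulate()
theory = 2.0 ** 2 / (2.0 ** 2 + math.pi ** 2 / 6)
print(round(corr_purchase, 3), round(theory, 3), round(corr_optout, 3))
```

A larger `sigma_asc` pushes the correlation between the purchase options toward 1, which plays a role loosely analogous to a stronger nest in nested logit.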

    Of course, you don't have to stop there. You can specify random coefficients on your product attributes to capture unobserved preference heterogeneity across individuals. -nlogit- doesn't allow you to do this.

    (2) Now turning to a technical subtlety: What -mixlogit-, -cmxtmixlogit- and -lclogit2- allow you to estimate are panel-data mixed "multinomial logit" models. Strictly speaking, the positive correlation induced by the random ASC in these models is not a faithful mimicry of the positive correlation in the nested logit model: There is a subtle difference, which I explain in:

    Oviedo, J.L., Yoo, H.I. A Latent Class Nested Logit Model for Rank-Ordered Data with Application to Cork Oak Reforestation. Environ Resource Econ 68, 1021–1051 (2017). https://doi.org/10.1007/s10640-016-0058-7.

    In principle, you can estimate panel-data mixed "nested logit" models to address that subtle distinction. In practice, the benefit is small, or at least it was small in our application. What we observed is that once you induce a positive correlation via the ASC à la mixed logit, the other kind of correlation, in idiosyncratic errors à la nested logit, becomes practically nothing.



    • #3
      Dear Hong Il Yoo,

      I have a question related to the above topic to some extent.

      I have revealed-preference panel data on household purchases in grocery stores across a number of meat categories.
      Each purchase occasion consists of a set of alternatives (i.e., the "choice set") and a set of attributes of these alternatives (such as price, brand, nutritional content, etc.).
      At each purchase occasion, only one alternative was chosen by the household. The number of alternatives in the choice set can vary between purchase occasions and can be quite large (up to 50 alternatives).
      In addition, I observe purchase occasions where there was no purchase in the category (i.e., the household purchased the outside good). Although no purchase was made in the category, I do observe the choice set that was available in the category on this purchase occasion.

      I'd like to use mixed logit to evaluate:
      1. The distribution of household tastes for each of the product attributes.
      2. The level of attachment to the category (vs. the outside good)
      I expect to find, for example, that households with a higher level of attachment to the category will be less price-sensitive.

      To achieve both objectives, I thought to follow your advice above.
      That is, add a dummy variable "d" that equals 1 for each of the alternatives within the category and 0 for the outside alternative (which I will add manually to the choice set). As you suggested, this variable will get a random coefficient in the estimation procedure.
      Since the attributes' values of the outside alternative are unknown, each of the attributes will take the value of 0 for the outside alternative (it can be seen as an interaction between the dummy variable "d" and each of the product attributes). I guess this can also be seen as a normalization of the utility from the outside alternative to zero.
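A minimal sketch of the data layout described above, in plain Python rather than Stata. The function name `add_outside_option` and the column names (`alt_id`, `d`, `choice`, `price`) are hypothetical; the only idea taken from the post is that the outside alternative gets d = 0 and zeros for every attribute, and is marked as chosen when no purchase was made.

```python
def add_outside_option(alternatives, attribute_names, purchased_alt_id):
    """Build long-format rows for one purchase occasion.

    alternatives: list of dicts, each with 'alt_id' plus the attributes.
    purchased_alt_id: id of the chosen inside alternative, or None if the
    household bought nothing in the category (outside good chosen).
    """
    rows = []
    for alt in alternatives:
        row = {"alt_id": alt["alt_id"], "d": 1}  # inside good: d = 1
        for name in attribute_names:
            row[name] = alt[name]
        row["choice"] = int(alt["alt_id"] == purchased_alt_id)
        rows.append(row)

    # Outside alternative: d = 0 and every attribute set to 0.
    outside = {"alt_id": "outside", "d": 0}
    for name in attribute_names:
        outside[name] = 0.0
    outside["choice"] = int(purchased_alt_id is None)
    rows.append(outside)
    return rows


# Illustrative occasion with two inside goods and no purchase.
occasion = [
    {"alt_id": "A", "price": 4.5},
    {"alt_id": "B", "price": 6.0},
]
rows_no_purchase = add_outside_option(occasion, ["price"], None)
rows_bought_b = add_outside_option(occasion, ["price"], "B")
```

Each occasion ends up with exactly one chosen row, whether or not a purchase was made, which is the shape that conditional and mixed logit estimators generally expect.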

      I wonder if this strategy makes sense? And if not, is there another strategy you are familiar with to achieve the above goals?

      Many thanks,
      Adam Dvir




      • #4
        Adam Dvir: Yes, I think it's a sensible default strategy. Estimating 50 random category-specific constants is a daunting task though: I guess you'll have to be prepared for a long wait before seeing your first set of estimates. But I don't see how this strategy will allow you to test the hypothesis that "households with a higher level of attachment to the category will be less price-sensitive."



        • #5
          Dear Hong Il Yoo, thank you so much for responding. I really appreciate it.

          I actually had in mind only one dummy variable, "d", distinguishing the inside alternatives from the outside alternative, and not 50 ASCs, one for each product. Do you think it will still work?
          Also, would it be correct to add continuous explanatory variables, such as price, to the model even when the price of the outside alternative is unknown and therefore set to 0 in all the choice sets (I am afraid it may bias the price coefficient)?

          Regarding the hypothesis that:
          "households with a higher level of attachment to the category will be less price-sensitive."
          I planned to use the household-level estimates of the random coefficients (beta_n) on the variables "price" and "d" and test the correlation between them. I expect to find that households with a low coefficient on price in absolute value (i.e., households that are not price-sensitive) have a relatively high coefficient on the variable "d", which reflects the level of attachment to the category, and vice versa.
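The correlation check described above can be sketched as follows. The numbers are made up purely for illustration and are not estimates from any data; `beta_d` and `beta_price` stand for hypothetical household-level coefficients. Price sensitivity is measured as the absolute value of the price coefficient, so the hypothesis corresponds to a negative correlation with `beta_d`.

```python
import math


def pearson(x, y):
    # Plain Pearson correlation coefficient.
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


# Hypothetical household-level coefficients (one entry per household).
beta_d = [2.1, 0.4, 1.3, -0.2, 1.8]          # attachment to the category
beta_price = [-0.3, -1.2, -0.7, -1.5, -0.4]  # price coefficients (negative)

price_sensitivity = [abs(b) for b in beta_price]
r = pearson(beta_d, price_sensitivity)
print("correlation between attachment and price sensitivity:", round(r, 3))
```

A negative `r` would be in line with the hypothesis that more attached households are less price-sensitive.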

          Many thanks,
          Adam




          • #6
            Adam Dvir: Thanks a lot for your clarification. I see, what you're planning to do is to estimate the mixed logit version of a nested logit model with two nests, where one nest contains only the outside good and the other nest contains all the inside goods. Yes, it will work, and perhaps you can even think of adding some non-random alternative-specific constants. Setting the price and all other attributes to zero is OK and will not cause any bias as long as you include a dummy that distinguishes inside goods from the outside good. As a matter of fact, analytically, you're not actually assuming that the price of the outside good is zero; what you're doing is normalizing the systematic utility of the outside good to zero. Replacing price with zero is what you'll have to do mechanically in Stata to implement the desired normalization. Your use of a correlation coefficient to test the hypothesis in quotes is also a sensible approach.
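A small numerical sketch of why the normalization is harmless, using plain multinomial logit probabilities and made-up parameter values (`alpha`, `beta`, `prices` and `c` are all illustrative assumptions). If the outside good's systematic utility is some unknown constant `c`, shifting the inside-good dummy coefficient from `alpha` to `alpha + c` reproduces exactly the same choice probabilities as setting the outside utility to zero, so the dummy absorbs whatever the outside level is.

```python
import math


def logit_probs(utilities):
    # Multinomial logit choice probabilities (numerically stabilized).
    m = max(utilities)
    expu = [math.exp(u - m) for u in utilities]
    total = sum(expu)
    return [e / total for e in expu]


alpha, beta = 1.5, -0.8   # hypothetical dummy and price coefficients
prices = [2.0, 3.5, 5.0]  # hypothetical prices of three inside goods
c = 0.7                   # unknown systematic utility of the outside good

# Outside utility normalized to zero; inside utility = alpha * d + beta * price.
normalized = logit_probs([0.0] + [alpha + beta * p for p in prices])

# Outside utility equal to c; the inside dummy coefficient shifts to alpha + c.
shifted = logit_probs([c] + [(alpha + c) + beta * p for p in prices])

max_gap = max(abs(a - b) for a, b in zip(normalized, shifted))
print(max_gap)
```

Only utility differences enter the probabilities, so the price coefficient `beta` is the same in both parameterizations.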



            • #7
              Hong Il Yoo, thank you very much for the detailed response.
              I deeply appreciate it.
              Adam
