
  • Should I use seqlogit or cmp, or ... ? Please help


    Hi everyone (more specifically Maarten Buis and David Roodman),

    I am currently working on my master's thesis on online customer behaviour. It is unclear to me which routine/model I should use, since I am no statistician and have no experience with Stata. My data is clickstream data aggregated at the session, article and date level. The sequence that I want to test is viewing product -> adding product to cart -> purchase. At the moment I have two binary outcome variables, add_to_cart and purchased (both 0/1). So a visitor can have only viewed a product, viewed and added it to the cart, or viewed, added and purchased it. I have multiple independent variables of different data types, some of which should be included in both equations and others not (e.g. due to multicollinearity). I also want to test for interaction and quadratic effects, which I think is more feasible with cmp, but correct me if I am wrong. All variables after the last quadratic effect are control variables. Please help me decide which model (and additional syntax) to use.

    I coded cmp as follows (with add to cart model as some sort of selection model and purchase as second equation):

    cmp (purchased = i.availability_desired_size i.articles_per_style_cat collection_ind ///
    c.size_range_width_log#i.articles_per_style_cat ///
    c.size_range_width_log#collection_ind ///
    i.articles_per_style_cat#collection_ind ///
    c.size_range_width_log##c.size_range_width_log ///
    review_qty average_rating gender_code age_code selling_price ///
    sale_rate promo_price_rate day_of_week is_weekend week_of_year ///
    month) ///
    (add_to_cart = i.articles_per_style_cat collection_ind ///
    c.size_range_width_log#c.size_range_availability_log ///
    c.size_range_width_log#i.articles_per_style_cat ///
    c.size_range_width_log#collection_ind ///
    c.size_range_availability_log#i.articles_per_style_cat ///
    i.articles_per_style_cat#collection_ind ///
    c.size_range_availability_log##c.size_range_availability_log ///
    c.size_range_width_log##c.size_range_width_log ///
    review_qty average_rating gender_code age_code selling_price ///
    sale_rate promo_price_rate day_of_week is_weekend week_of_year ///
    month) ///
    , ind($cmp_probit $cmp_probit) qui

    I have no syntax for seqlogit because I do not fully understand the syntax. I have no idea how I should include interaction/quadratic effects, if possible at all. I have to create a new variable for seqlogit to indicate the transitions (tree). A variable with (for example) values 0, 1, 2, and with: tree(0:1, 1:2) right?

    Thanks in advance. Kind regards,
    Nick Verschut
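
A minimal sketch of the three-level outcome variable described above, for anyone following along. The variable name `stage` and the short list of regressors are hypothetical, and the `tree()` specification reflects one reading of the seqlogit help file (transitions separated by commas, choices within a transition separated by colons), so it should be checked against `help seqlogit` before use. It assumes seqlogit is installed from SSC.

```stata
* Assumption: a purchase always implies an earlier add-to-cart
assert add_to_cart == 1 if purchased == 1

* Hypothetical outcome: 0 = viewed only, 1 = added to cart, 2 = purchased
generate byte stage = cond(add_to_cart == 0, 0, 1 + purchased) if !missing(add_to_cart)
label define stage_lbl 0 "viewed only" 1 "added to cart" 2 "purchased"
label values stage stage_lbl
tabulate stage, missing

* Transition 1: viewed only (0) versus added (1 or 2)
* Transition 2: added but not purchased (1) versus purchased (2)
* The regressors below are an illustrative subset of the controls
seqlogit stage review_qty average_rating selling_price, tree(0 : 1 2, 1 : 2)
```

Under this reading, the second transition compares level 1 with level 2 only among observations that passed the first transition, which is why the first transition is written as `0 : 1 2` rather than `0 : 1`.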

  • #2
    I don't know anything about seqlogit, so I can't respond on that.

    You might want to restrict the sample of the purchased equation to observations for which add_to_cart is 1. In that case do something like "ind($cmp_probit*add_to_cart $cmp_probit)".
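
As a sketch of how that suggestion could look in a full command, the example below uses a shortened version of the equations from the first post (the interaction and quadratic terms are left out for brevity, and the local macro `controls` is introduced only for illustration). It assumes cmp is installed and that `cmp setup` has been run so that `$cmp_probit` is defined.

```stata
* cmp setup defines the global macros such as $cmp_probit
cmp setup

* Control variables shared by both equations (copied from the original command)
local controls review_qty average_rating gender_code age_code selling_price ///
    sale_rate promo_price_rate day_of_week is_weekend week_of_year month

* The first indicator belongs to the purchased equation: multiplying by
* add_to_cart switches that equation off (indicator 0) when add_to_cart is 0
cmp (purchased = i.availability_desired_size i.articles_per_style_cat collection_ind `controls') ///
    (add_to_cart = i.articles_per_style_cat collection_ind c.size_range_width_log `controls') ///
    , indicators($cmp_probit*add_to_cart $cmp_probit) quietly
```

The indicators are matched to the equations by position, so the expression with `*add_to_cart` has to sit in the same position as the purchased equation.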



    • #3
      Thank you for your response, David Roodman. I was indeed planning to add this to indicate that the purchased equation is restricted to the sample where add_to_cart is 1. But when I ran the command (without quietly), Stata already reported that it used only the observations where add_to_cart is 1 for that equation. So I figured that the *add_to_cart wasn't necessary. Do you suggest including it anyway?
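
One way to see why the sample might already be restricted is to cross-tabulate the two outcomes, including missing values. This is only a diagnostic sketch: if purchased turns out to be missing whenever add_to_cart is 0, that alone could explain the smaller sample reported for that equation.

```stata
* Show how purchased is coded for sessions without an add-to-cart
tabulate add_to_cart purchased, missing

* Number of observations that can enter the purchased equation
count if add_to_cart == 1 & !missing(purchased)
```

If purchased is instead coded 0 for sessions without an add-to-cart, the restriction would have to come from the indicator expression rather than from missing values.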

