  • Logistic regression with outcome probability based on month

    I am performing a logistic regression to estimate the probability that a given customer clicks on one of my organization's emails.

    The model looks like this:

    y = β0 + β1X1 + β2X2 + ... + βnXn

    where the outcome is the probability that a given observation/customer clicks on a link in the email. The outcome variable I am using is a dummy called 'clicked'. The independent variables include the customer's age, number of kids, and various other information about the client.

    I also have the time and date at which the emails were sent.

    I am now curious to find out if a client is more likely to click on an email in a given month.

    Am I correct to assume that in order to model this, I need to interact the 'clicked' variable with the dummy representing the month that I am interested in?

    e.g. 'clicked' * 'month_july'

    And set this interacted variable as the new outcome variable?

    Or how would you proceed?

  • #2
    There are several things wrong. First, your model does not describe a logistic regression model.

    You don't need to interact the month with the explained/dependent/left-hand-side/y-variable. That would be bad in many different ways. Instead just add the month variable without interaction. Why would a month variable be any different from all the other explanatory/independent/right-hand-side/x-variables you added to your model?
    ---------------------------------
    Maarten L. Buis
    University of Konstanz
    Department of history and sociology
    box 40
    78457 Konstanz
    Germany
    http://www.maartenbuis.nl
    ---------------------------------
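
For readers following along, here is a sketch of what #2 is pointing at, written with the symbols from the original post. In a logistic regression the linear combination of predictors models the log-odds of clicking, not the outcome itself. The month term (the coefficient γ and the dummy month_july) is added only for illustration and is an assumption about how the poster's dummy would enter the model:

```latex
\[
\Pr(\text{clicked} = 1 \mid X)
  = \frac{1}{1 + \exp\!\bigl(-(\beta_0 + \beta_1 X_1 + \beta_2 X_2 + \dots + \beta_n X_n)\bigr)}
\]
\[
\ln\frac{\Pr(\text{clicked} = 1 \mid X)}{1 - \Pr(\text{clicked} = 1 \mid X)}
  = \beta_0 + \beta_1 X_1 + \beta_2 X_2 + \dots + \beta_n X_n + \gamma\,\text{month\_july}
\]
```

In the second line the month indicator sits on the right-hand side like any other predictor, and exp(γ) is the odds ratio for July relative to the other months.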


    • #3
      Absolutely right. Sorry for the stupid question, and thank you for the smart reply.

      I thought that by interacting the two I would be left with the clicks in the given month.
      But you are correct, I can just look at the coefficients and see from there the effect.

      I am trying to figure out a way to run the model with the different months in a ML algo and have it return the probability per customer per month.

      Thank you for your corrections
      Last edited by Loris Di Stefano; 07 Apr 2022, 06:56.


      • #4
        Instead of having a dummy for each separate month, you should create a Stata monthly date variable and - let's pretend you give it the name mon - then you include i.mon in your independent variables. This will simultaneously include indicator (dummy) terms in your model for each separate month.

        Code:
        help datetime
        help factor variables
        Code:
        logistic y X1 X2 ... Xn i.mon
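
The steps described in #4 could look roughly like the sketch below. It runs on simulated data, and every variable name in it (sent, age, clicked, mon, p_click) is hypothetical rather than taken from the poster's dataset. It assumes the send time is stored as a Stata datetime (%tc); if it is stored as a daily date, mofd() alone would be enough.

```stata
* Minimal sketch on simulated data; all variable names are hypothetical
clear
set seed 12345
set obs 2000

* Simulated send datetimes spread over 2021 (milliseconds since 01jan1960)
generate double sent = clock("01jan2021 00:00:00", "DMYhms") + floor(runiform() * 365) * 86400000
format sent %tc

* Simulated covariate and click indicator
generate age = 20 + floor(runiform() * 50)
generate clicked = runiform() < invlogit(-2 + 0.02 * age)

* Datetime -> daily date -> Stata monthly date
generate mon = mofd(dofc(sent))
format mon %tm

* Month enters as a factor variable on the right-hand side
logistic clicked age i.mon

* Predicted click probability for each observation
predict p_click, pr

* Average predicted click probability for each month
margins mon
```

Here predict returns one probability per observation, which is the per-customer quantity asked about in #3, and margins summarizes those probabilities by month.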
