  • Only Post Treatment Data: Covariate Adjustment

    I am seeking to exploit a natural experiment in my paper.
    • Retrospective treatment: i.e., announcement of treatment came after the users have chosen a date to register (Jan 16 announcement that anyone who registered prior to Jan 1 gets free rewards)
    • So, I took users who joined the platform +-2 days around Jan 1 to get treatment and control groups.
    • For causal inference, though I have treatment and control groups, I do not have pre-treatment data.
    Can you please advise me on causal identification? I recently read a paper on covariate adjustment. My understanding is that covariate adjustment is desirable because, if a covariate predicts the outcome, accounting for its effect on the outcome will improve power to detect a treatment effect [24], unless none of the covariates in the model are prognostic [2]. Thus, I can include covariates that may affect y, whether or not they differ between the control and treatment groups.

    Am I right in thinking that matching estimators make sense because the treatment and control groups are divided purely on the basis of registration date? If so, there are no observed or unobserved factors that raise one user's propensity for treatment relative to another's.

    Best,
    gr
    Last edited by guneet robin; 21 Jun 2024, 11:47.

  • #2
    Interesting. The treatment, being unknown at the time, does not affect the time you sign up. Still, it seems to me you're basically trying to assess whether the people registering early are somehow different than those that register late, and if those differences may affect the treatment effect.

    What's the outcome the rewards are affecting?


    • #3
      The purchase of items from the firm offering the rewards. I study y = purchase count and purchase value (USD) over a period of three years. I have a set of covariates that may affect y but not treatment. Including those will improve the precision of the estimated treatment effect.


      • #4
        I think you need something that might predict early registration, which is the same as determining the treatment.

        Or, match on the X's you have if they have large standardized differences. My guess is that the sample would be fairly balanced absent matching.
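As a rough illustration of the balance check suggested above, here is a minimal Python sketch (not Stata) of a standardized mean difference for a single covariate. The function name and the income values are made up for illustration, and the pooled-SD form used here is one common definition among several.

```python
from math import sqrt
from statistics import mean, variance


def standardized_difference(treated, control):
    """Standardized mean difference: (mean_t - mean_c) / sqrt((var_t + var_c) / 2)."""
    pooled_sd = sqrt((variance(treated) + variance(control)) / 2)
    return (mean(treated) - mean(control)) / pooled_sd


# Hypothetical incomes (thousand USD) for users registering before/after the cutoff.
income_treated = [40, 50, 60, 70]
income_control = [45, 55, 65, 75]

smd = standardized_difference(income_treated, income_control)
print(round(smd, 4))
```

A common rule of thumb treats absolute values above roughly 0.1 as worth attention, though that threshold is a convention rather than a test.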


        • #5
          I don't work in marketing, so this is just the opinion of an informed layperson. But I'm skeptical that defining the treatment groups as those within two days before or after Jan 1 is sufficiently exogenous from subsequent purchasing behavior to serve for an experiment of nature. Jan 1 is a legal holiday. People who shop on that holiday may differ systematically from those who don't. I would pick a couple of days close to Jan 1, but excluding Jan 1 itself, and also such that the sets of days chosen for each group contain the same number of weekend days, and I would then also verify that the number of shoppers actually accrued on weekend days is the same in both groups. The bit about weekend days is for the same reason: weekend shoppers seem likely to differ in their overall shopping habits from weekday shoppers.
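The weekend-day check described above can be sketched in a few lines of Python (not Stata). The thread does not state the year, so the dates below are purely hypothetical, and `weekend_days` is an illustrative helper name.

```python
from datetime import date, timedelta


def weekend_days(start, end):
    """Count Saturdays and Sundays from start to end, inclusive."""
    n_days = (end - start).days + 1
    days = (start + timedelta(days=i) for i in range(n_days))
    return sum(1 for d in days if d.weekday() >= 5)  # 5 = Saturday, 6 = Sunday


# Hypothetical windows around Jan 1, excluding Jan 1 itself (year is an assumption).
before = weekend_days(date(2020, 12, 30), date(2020, 12, 31))
after = weekend_days(date(2021, 1, 2), date(2021, 1, 3))
print(before, after)
```

If the two counts differ, as they do for these example dates, the windows could be shifted or widened until they agree before comparing the groups.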


          • #6
            I'm betting that the means of the Xs are nearly identical between those before and those after, no matter the +- days distance. If so, matching and other methods don't get you anything.


            • #7
              Originally posted by Clyde Schechter:
              Jan 1 is a legal holiday. People who shop on that holiday may differ systematically from those who don't. I would pick a couple of days close to Jan 1, but excluding Jan 1 itself, and also such that the sets of days chosen for each group contain the same number of weekend days, and I would then also verify that the number of shoppers actually accrued on weekend days is the same in both groups. The bit about weekend days is for the same reason: weekend shoppers seem likely to differ in their overall shopping habits from weekday shoppers.
              Thank you for the reply, Clyde. Actually, the cutoff time is August 31, 12 midnight. I tried to simplify the case by describing the cutoff as Dec 31 midnight. But part of your argument may still hold, as shopping habits may differ between Aug 29, 30, and 31 and Sep 1, 2, and 3. I checked, and the X's of the users who joined the platform during these days are not identical. These X's are income status, shopping frequency, etc. Although these X's do not affect the firm's treatment assignment, they can predict y. So, I was hoping to do covariate adjustment in my treatment model for better precision.

              I am not sure how to account for the fact that people have different propensities to register on the platform at the end of the month vs. the beginning of the month. Income can be one aspect, which I am controlling for in my model.

              Can you share your perspective?


              • #8
                Originally posted by George Ford:
                I'm betting that the means of the Xs are nearly identical between those before and those after, no matter the +- days distance. If so, matching and other methods don't get you anything.
                Thank you for your reply, George. I want to see the causal impact of rewards on y. My set-up is: 1) the rewards were announced retrospectively; 2) treatment is fully observed and deterministic, i.e., based solely on the date of joining; and 3) users who joined within +/-3 days of the treatment cutoff date form the control and treatment groups. On a continuum of interest in or attraction towards the platform, these users are the 'same' because they all joined within 6 days. The platform was three years old when the treatment happened. In the early days, users registering over a period of 6 days might have differed from each other in their attraction towards the platform.

                Please share your perspective.


                • #9
                  I am not sure how to account for the fact that people have different propensities to register on the platform at the end of the month vs. the beginning of the month. Income can be one aspect, which I am controlling for in my model.

                  Can you share your perspective?
                  Well, with regard to what are the attributes of people that distinguish end of month from beginning of month shoppers, I have no perspective. I have no experience in marketing and no lay intuitions about that.

                  As for my statistical perspective, once you, perhaps after consultation with others in your field who do understand these issues, decide what those attributes are (or, at least, the ones for which you can get data), I would be inclined to do a "doubly robust" analysis that entails both including these as covariates in the model and using propensity score weighting or propensity score matching. The official Stata command -teffects ipwra- automates the calculations and most of the setup for this.
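To make the "doubly robust" idea concrete, here is a minimal Python sketch (not Stata, and not what -teffects ipwra- computes internally). It uses the augmented IPW (AIPW) form, a different doubly robust estimator, with a single discrete covariate so that both the propensity and outcome models reduce to within-stratum means. The data and the function name are invented for illustration.

```python
from statistics import mean


def aipw_ate(rows):
    """AIPW estimate of the average treatment effect.

    rows: list of (treated 0/1, stratum label, outcome).
    Assumes every stratum contains both treated and control units (overlap).
    """
    e, m1, m0 = {}, {}, {}
    for x in {stratum for _, stratum, _ in rows}:
        in_x = [(t, y) for t, s, y in rows if s == x]
        e[x] = mean([t for t, _ in in_x])              # propensity score
        m1[x] = mean([y for t, y in in_x if t == 1])   # outcome model, treated
        m0[x] = mean([y for t, y in in_x if t == 0])   # outcome model, control
    scores = [
        m1[x] - m0[x]
        + t * (y - m1[x]) / e[x]
        - (1 - t) * (y - m0[x]) / (1 - e[x])
        for t, x, y in rows
    ]
    return mean(scores)


# Hypothetical data: (treated, income stratum, purchase count).
rows = [
    (1, "low", 2), (1, "low", 4), (0, "low", 1), (0, "low", 1),
    (1, "high", 10), (0, "high", 6), (0, "high", 8), (0, "high", 7),
]
ate = aipw_ate(rows)
print(ate)
```

With continuous covariates the within-stratum means would be replaced by fitted regression and propensity models, which is the part the Stata command handles.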

                  Added: If, in the end, you decide that income is the only relevant attribute, or the only one for which data can be obtained, then I would be inclined not to include it as a covariate but, instead, to match on it. I would use the narrowest caliper match that didn't lose an unacceptably large amount of the data due to unavailability of a match. Of course this assumes that the distributions of income in the two groups overlap sufficiently that some reasonable number of matches can be obtained with a meaningfully narrow caliper.
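The caliper trade-off described above can be illustrated with a small Python sketch (not Stata). It does greedy 1:1 nearest-neighbor matching on one covariate without replacement, which is only one of several matching schemes; the function name and income values are hypothetical.

```python
def caliper_match(treated, control, caliper):
    """Greedy 1:1 nearest-neighbor match on one covariate, without replacement.

    Returns (matched pairs, treated values left unmatched).
    """
    available = list(control)
    pairs, unmatched = [], []
    for value in treated:
        best = min(available, key=lambda c: abs(c - value), default=None)
        if best is not None and abs(best - value) <= caliper:
            pairs.append((value, best))
            available.remove(best)  # each control is used at most once
        else:
            unmatched.append(value)
    return pairs, unmatched


# Hypothetical incomes (thousand USD).
treated_income = [30, 52, 90]
control_income = [31, 50, 58, 70]

for caliper in (1, 5, 25):
    pairs, unmatched = caliper_match(treated_income, control_income, caliper)
    print(caliper, len(pairs), len(unmatched))
```

Running the loop over a few caliper widths shows how many observations each width discards, which is the quantity to weigh against match quality.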
                  Last edited by Clyde Schechter; 23 Jun 2024, 13:08.


                  • #10
                    Thank you, dear Clyde Schechter. This is useful.


                    • #11
                      Post the means of the Xs for the +-6 day group? If the thing is up three years, I suspect there wouldn't be much difference. You're well past early adoption influences. And keep in mind the +- short period is intended to get equivalent groups, so it is itself a sort of matching (matching on date, sort of).

                      You could expand the range to see when differences start to emerge, if ever for plausible ranges (after three years, I suspect the means would be similar for +- months, giving you more data).
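One way to picture the window-widening check above is the following Python sketch (not Stata). It assumes each user record carries a signed day offset from the cutoff (negative for registering before it) and one covariate value; the records and the function name are invented for illustration.

```python
from statistics import mean


def mean_gap_by_window(records, half_widths):
    """Difference in covariate means (before minus after) for each window.

    records: list of (days_from_cutoff, x); negative days = before the cutoff.
    """
    gaps = {}
    for w in half_widths:
        before = [x for d, x in records if -w <= d < 0]
        after = [x for d, x in records if 0 < d <= w]
        gaps[w] = mean(before) - mean(after)
    return gaps


# Hypothetical records: (days from cutoff, income in thousand USD).
records = [
    (-1, 50), (-2, 52), (-3, 51), (1, 51), (2, 51), (3, 51),
    (-30, 70), (30, 40),
]
gaps = mean_gap_by_window(records, [3, 30])
print(gaps)
```

In this made-up example the groups are balanced at +-3 days and drift apart at +-30 days; with real data the same loop would show where, if anywhere, imbalance starts to appear.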

                      If there is a difference, then match on the Xs and proceed.


                      • #12
                        Thank you, dear George Ford. I agree with your viewpoint.