  • Significant difference in number of observations between treatment and control group

    I have a data set in which the control group is almost 80 times larger than the treatment group. Taken together, the treatment and control group observations make up almost the whole population. I have two questions in this regard:

    1) What are the potential problems of running regressions on a data set where the control group is 80 times larger than the treatment group?

    2) I used propensity score matching and retained only the matched observations, so that the treatment group and the control group have the same number of observations, and I ran the regressions on this matched sample. Is this approach correct?

  • #2
    First things first, was treatment assignment randomised? That is the most fundamental question for causality (which I presume you're after).

    • #3
      It is not an experimental study. The treatment group consists of firms belonging to a certain industry and the control group is its sector peers.

      • #4
        OK, was treatment implemented exogenously with respect to the outcome?

        • #5
          Yes, treatment was implemented exogenously with respect to the outcome.

          • #6
            Isha:
            Maxence highlighted two relevant points.
            As an aside, I would go PSM.
            Kind regards,
            Carlo
            (Stata 19.0)

            • #7
              Dear Carlo and Maxence, thank you for addressing my issue. I am uncertain as to why PSM would be an appropriate solution here. To articulate my question better: what problem with the sample as described (where the numbers of observations in the treatment and control groups differ greatly) would be solved by adopting PSM?

              • #8
                You may also want to check out the community-contributed command sdid.

                But bottom line is, as long as your treatment is plausibly exogenous, it all gets a lot easier.

                Next question: do you have panel data or cross sectional data?

                • #9
                  Isha:
                  the main issue that I see with your approach #1 is that, given its very large size, the control group could include firms that differ in many respects from their treated counterparts.
                  Kind regards,
                  Carlo
                  (Stata 19.0)
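
A small arithmetic sketch may help separate the two concerns raised in this thread: precision and comparability. It assumes a plain difference in means, independent observations, and the same outcome standard deviation in both groups. None of this is stated in the thread, and the firm counts are hypothetical. Under those assumptions the 80:1 ratio does not hurt precision by itself. Trimming the controls to 1:1 actually widens the standard error, so the case for matching has to rest on comparability, as #9 suggests.

```python
import math


def se_diff_in_means(sigma, n_treat, n_control):
    """Standard error of a difference in means.

    Assumes a common outcome standard deviation `sigma` in both groups
    and independent observations (simplifying assumptions).
    """
    return sigma * math.sqrt(1.0 / n_treat + 1.0 / n_control)


# Hypothetical sizes: 100 treated firms and 80 controls per treated firm.
n_treat = 100
se_all_controls = se_diff_in_means(1.0, n_treat, 80 * n_treat)
se_one_to_one = se_diff_in_means(1.0, n_treat, n_treat)

# Ratio of the two standard errors; it depends only on the 80:1 ratio,
# not on sigma or on the absolute number of treated firms.
se_ratio = se_one_to_one / se_all_controls
print(round(se_all_controls, 4), round(se_one_to_one, 4), round(se_ratio, 3))
```

With panel data and fixed effects the exact numbers will differ, but the qualitative point carries over: the smaller group drives the precision.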

                  • #10
                    Dear Maxence, I have panel data for an observation period of 15 years (2005-2019). The two approaches I mentioned are as follows:

                    1) In the first approach, I use the 15-year panel for the treatment group and the control group (100X treatment group observations) and run a fixed effects regression.
                    2) In the second approach, I use only the 2005 values of the covariates for the propensity score matching. I then retain only the matched observations, drop the rest from the sample, and run the fixed effects regression on these matched observations.

                    Dear Carlo, you are absolutely right. The treatment group characteristics are significantly different from the control group characteristics if I follow the first approach.

                    Thank you so much for your help.
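
The matching step in the second approach of #10 can be sketched roughly as follows. This is only an illustration: it assumes the propensity scores have already been estimated from the 2005 covariates (for example with a logit model), and it uses greedy 1:1 nearest-neighbour matching without replacement with an optional caliper. The thread does not say which matching variant was used. The firm ids, the scores, and the function name `match_one_to_one` are hypothetical.

```python
def match_one_to_one(treated, controls, caliper=None):
    """Greedy 1:1 nearest-neighbour matching without replacement.

    treated, controls: dicts mapping a firm id to its propensity score.
    caliper: optional maximum score distance allowed for a match.
    Returns a list of (treated_id, control_id) pairs.
    """
    available = dict(controls)  # controls not yet used in a match
    pairs = []
    # Process treated firms in a fixed order so the result is reproducible.
    for t_id in sorted(treated):
        if not available:
            break
        t_score = treated[t_id]
        # Closest remaining control; ties are broken by firm id.
        c_id = min(available, key=lambda c: (abs(available[c] - t_score), c))
        if caliper is not None and abs(available[c_id] - t_score) > caliper:
            continue  # no acceptable control: this treated firm stays unmatched
        pairs.append((t_id, c_id))
        del available[c_id]  # without replacement
    return pairs


# Hypothetical 2005 propensity scores for two treated firms and five controls.
treated = {"T1": 0.62, "T2": 0.35}
controls = {"C1": 0.10, "C2": 0.33, "C3": 0.60, "C4": 0.90, "C5": 0.36}

pairs = match_one_to_one(treated, controls, caliper=0.05)

# Firms to keep for all 15 panel years before running the fixed effects model.
matched_ids = {firm for pair in pairs for firm in pair}
print(pairs)
```

The fixed effects regression would then be run on the 2005-2019 observations of the firms in `matched_ids` only. In Stata itself, the official `teffects psmatch` command and the community-contributed `psmatch2` command cover this kind of matching. Whatever tool is used, covariate balance should be checked after matching.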
