Significant difference in number of observations between treatment and control group

Isha Mohanty

Join Date: Aug 2022

Posts: 11
#1

06 Sep 2022, 06:00

I have a data set in which the control group is almost 80 times larger than the treatment group. Together, the treatment and control observations cover almost the entire population. I have two questions in this regard:

1) What are the potential problems of running regressions on a data set where the control group is 80 times larger than the treatment group?

2) I chose to do propensity score matching (PSM) and retain only the matched observations, so that the treatment and control groups have the same number of observations. I then use this matched sample for running regressions. Is this approach correct?
Maxence Morlet

Join Date: Mar 2021

Posts: 650
#2

06 Sep 2022, 08:49

First things first: was treatment assignment randomised? That is the most fundamental question for causality (which I presume you're after).
Isha Mohanty

Join Date: Aug 2022

Posts: 11
#3

06 Sep 2022, 12:09

It is not an experimental study. The treatment group consists of firms belonging to a certain industry and the control group is its sector peers.
Maxence Morlet

Join Date: Mar 2021

Posts: 650
#4

06 Sep 2022, 12:50

OK, was treatment implemented exogenously with respect to the outcome?
Isha Mohanty

Join Date: Aug 2022

Posts: 11
#5

06 Sep 2022, 13:25

Yes, treatment was implemented exogenously with respect to the outcome.
Carlo Lazzaro

Join Date: Apr 2014

Posts: 17678
#6

06 Sep 2022, 13:32

Isha:
Maxence highlighted two relevant points.
As an aside, I would go with PSM.

Kind regards,
Carlo
(Stata 19.0)
Isha Mohanty

Join Date: Aug 2022

Posts: 11
#7

06 Sep 2022, 13:55

Dear Carlo and Maxence, thank you for addressing my issue. I am uncertain as to why PSM would be an appropriate solution here. To articulate my question better: what problem arises from using the sample as described (where the numbers of observations in the treatment and control groups differ substantially) that would be solved by adopting PSM?
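As a rough point of reference for this question, the size imbalance by itself mainly affects precision rather than bias. The sketch below makes the simplifying assumptions of a plain difference in means and the same outcome standard deviation in both groups; the group size of 100 treated firms and the function name `se_diff_in_means` are hypothetical.

```python
import math

def se_diff_in_means(sigma, n_treated, n_control):
    """Standard error of (treated mean - control mean), assuming a common
    outcome standard deviation sigma and independent observations."""
    return sigma * math.sqrt(1.0 / n_treated + 1.0 / n_control)

n_treated = 100  # hypothetical number of treated observations
sigma = 1.0      # hypothetical outcome standard deviation

# 1:1 sample, as obtained after discarding all unmatched controls
se_one_to_one = se_diff_in_means(sigma, n_treated, n_treated)

# Full sample with 80 controls per treated observation
se_eighty_to_one = se_diff_in_means(sigma, n_treated, 80 * n_treated)

# Lower bound reached with an unlimited number of controls
se_floor = sigma / math.sqrt(n_treated)

print(se_one_to_one, se_eighty_to_one, se_floor)
```

Under these assumptions the precision is bounded by the smaller group, so the very large control group is not harmful in itself. What matters is whether those controls are comparable to the treated firms, and that is the problem matching is meant to address.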
Maxence Morlet

Join Date: Mar 2021

Posts: 650
#8

07 Sep 2022, 00:32

You may also want to check out the community-contributed command sdid.

But the bottom line is, as long as your treatment is plausibly exogenous, it all gets a lot easier.

Next question: do you have panel data or cross sectional data?
Carlo Lazzaro

Join Date: Apr 2014

Posts: 17678
#9

07 Sep 2022, 00:40

Isha:
the main issue that I see with your approach #1 is that, given its very large size, the control group could include firms that differ in many respects from their treated counterparts.

Kind regards,
Carlo
(Stata 19.0)
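One common way to check how much the two groups differ on a covariate is the standardized mean difference (SMD). The following is a minimal sketch only: the covariate values are made up, the function name is hypothetical, and the threshold of about 0.1 often quoted for "acceptable" balance is a rule of thumb, not a formal test.

```python
import math
import statistics

def standardized_mean_difference(treated_values, control_values):
    """Difference in means divided by the pooled standard deviation
    (square root of the average of the two sample variances)."""
    mean_diff = statistics.mean(treated_values) - statistics.mean(control_values)
    pooled_sd = math.sqrt(
        (statistics.variance(treated_values) + statistics.variance(control_values)) / 2.0
    )
    return mean_diff / pooled_sd

# Hypothetical baseline covariate (e.g. log total assets) for a few firms
treated = [5.0, 5.5, 6.0]
all_controls = [2.0, 3.0, 4.0, 5.0, 6.0, 7.0, 3.5, 4.5]
matched_controls = [5.0, 6.0, 4.5]  # hypothetical subset retained after matching

smd_all = standardized_mean_difference(treated, all_controls)
smd_matched = standardized_mean_difference(treated, matched_controls)

print(round(smd_all, 3), round(smd_matched, 3))
```

Computing the SMD for each covariate before and after matching shows whether the retained controls actually resemble the treated firms more closely than the full control group does.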
Isha Mohanty

Join Date: Aug 2022

Posts: 11
#10

07 Sep 2022, 11:07

Dear Maxence, I have panel data for an observation period of 15 years (2005-2019). My two approaches are as follows:

1) In the first approach, I use the full 15-year panel for the treatment group and the control group (which has about 100 times as many observations as the treatment group) to run a fixed effects regression.
2) In the second approach, I use only the 2005 values of the covariates to do the propensity score matching. I then retain only the matched observations and drop the rest from the sample, and use these matched observations to run the fixed effects regression.

Dear Carlo, you are absolutely right. The treatment group characteristics are significantly different from the control group characteristics if I follow the first approach.

Thank you so much for your help.
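The second approach can be sketched as follows. This is a minimal illustration under stated assumptions, not the code actually used: it assumes the propensity scores have already been estimated from the 2005 covariates (for example with a logit model), it uses greedy 1:1 nearest-neighbour matching without replacement, and the firm identifiers, scores, and the `match_one_to_one` helper are all hypothetical.

```python
def match_one_to_one(scores, treated_ids, caliper=None):
    """Greedy 1:1 nearest-neighbour matching on the propensity score,
    without replacement. Returns {treated_id: control_id}."""
    available = {f: s for f, s in scores.items() if f not in treated_ids}
    pairs = {}
    # Process treated firms from highest to lowest score (a design choice,
    # since high-score treated firms usually have the fewest close controls).
    for t in sorted(treated_ids, key=lambda f: scores[f], reverse=True):
        if not available:
            break
        best = min(available, key=lambda c: abs(available[c] - scores[t]))
        if caliper is not None and abs(available[best] - scores[t]) > caliper:
            continue  # no control close enough: leave this treated firm unmatched
        pairs[t] = best
        del available[best]  # each control is used at most once
    return pairs

# Hypothetical propensity scores estimated from baseline-year (2005) covariates
scores = {"T1": 0.60, "T2": 0.30,
          "C1": 0.58, "C2": 0.31, "C3": 0.05, "C4": 0.10, "C5": 0.27}
treated_ids = {"T1", "T2"}

pairs = match_one_to_one(scores, treated_ids)

# Hypothetical firm-year panel: (firm, year, outcome) rows for every firm
panel = [(firm, year, 0.0) for firm in scores for year in (2005, 2006)]

# Keep all years of the matched firms; unmatched controls are dropped
keep = set(pairs) | set(pairs.values())
matched_panel = [row for row in panel if row[0] in keep]

print(pairs, len(matched_panel))
```

Because matching is done once at the firm level on baseline values, every year of a matched firm stays in the sample, and the fixed effects regression is then run on `matched_panel` only.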