creating treatment groups for difference in differences analysis using bysort

Phillip Charhill

Join Date: Dec 2021

Posts: 6
#1

26 Dec 2021, 13:16

I need to perform a difference-in-differences analysis on some panel data, so I need to create a control group and a treatment group. The panel data consists of two waves (2019 and 2020). The treatment group consists of those whose second-wave interview occurred on or after March 2nd, 2020.

I use the command:
gen treatment=0
replace treatment = 1 if date >= 21976

date is the variable showing the date of each interview, and 21976 is the Stata numeric date for March 2nd, 2020. treatment is to be a dummy: 1 for the treatment group and 0 for the control group.
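
As a quick sanity check on that numeric date, here is a minimal sketch using Stata's td() pseudofunction and the %td display format. It only confirms that 21976 and March 2nd, 2020 refer to the same day; it does not touch the dataset.

```stata
* td() converts a date literal to Stata's numeric daily date
* (days since 01jan1960)
display td(02mar2020)

* %td shows a numeric daily date in readable form
display %td 21976
```

Writing td(02mar2020) directly in the condition, instead of the raw number, may make the code easier to read.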

This makes the observations in wave 2 equal zero for the control group and 1 for the treatment group.

What I need to do is tell Stata that, for the individuals with treatment = 1 in wave 2, treatment should also equal 1 in wave 1.

I've been told that the bysort command will be useful for this task, but after searching online for hours I haven't been able to find anything that helps me do it.

Any help is much appreciated, because I'm at the end of my rope trying to solve this.
William Lisowski

Join Date: Dec 2014

Posts: 10150
#2

26 Dec 2021, 13:31

I am going to assume that you have the following variables:

id - an identifier for each individual
date - a Stata daily date of the interview; you have two interviews for each individual

Then the following untested code may start you in a useful direction.

Code:

generate treatment = .
// create the treatment indicator in the second wave for each individual
bysort id (date): replace treatment = date>=td(2-3-2020) if _n==2
// copy the treatment from the second wave to the first wave for each individual
bysort id (date): replace treatment = treatment[_2] if _n==1
Phillip Charhill

Join Date: Dec 2021

Posts: 6
#3

26 Dec 2021, 14:04

Thank you, this seems like a good starting point. Yes, I do have an id variable (pidp) that uniquely identifies each person in the dataset. I also have a wave identifier, 9 for the first wave and 10 for the second, and a date variable showing the date of each person's interview in each wave.
The first part of the code runs fine, both the generate and the first bysort. But when I run the second bysort, Stata returns the error "_2 not found" with return code r(111).

Can you please explain what the [_2] in your code means, so I can try to adapt it to run correctly?
William Lisowski

Join Date: Dec 2014

Posts: 10150
#4

26 Dec 2021, 14:54

I am sorry, that should not have had the underscore in front of the 2.

Code:

bysort id (date): replace treatment = treatment[2] if _n==1

If you are unfamiliar with Stata's subscripting notation, like the "[2]" in the corrected code, see the output of

Code:

help subscripting

for an introduction.
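
To illustrate how the subscript behaves under bysort, here is a minimal sketch on a made-up four-row dataset (the id and date values are hypothetical, not taken from the original data). Within each by-group, _n counts observations from 1 and treatment[2] refers to the second observation of that group, which is the second-wave interview once the data are sorted by date.

```stata
clear
* hypothetical toy data: two individuals, two interviews each
input id date
1 21800
1 22000
2 21810
2 21950
end
format date %td

generate treatment = .
* second interview per individual: 1 if on or after 2 March 2020, else 0
* (note: a missing date would also evaluate to 1 here)
bysort id (date): replace treatment = date >= td(02mar2020) if _n == 2
* copy the second-wave value back to the first-wave observation
bysort id (date): replace treatment = treatment[2] if _n == 1

list, sepby(id)
```

In this toy data, individual 1 ends up with treatment equal to 1 in both rows and individual 2 with 0 in both rows. With the variables described in post #3, the same pattern would presumably be written with pidp in place of id, and possibly wave in place of date as the sort variable; that substitution is an assumption and has not been tested on the actual data.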
Phillip Charhill

Join Date: Dec 2021

Posts: 6
#5

26 Dec 2021, 14:59

That code has solved my problem, and now I understand what it is doing. I am very grateful for your help with this problem.

Thanks again!