Using xtdidregress command with 2 cohorts from different birth years

Dilini Jayasinghe

Join Date: Nov 2024


09 Dec 2024, 23:17

I am using the xtdidregress command in Stata 18.0 to identify the causal effect of an educational program on a treated cohort relative to a control cohort. The treated cohort was born 4 years later than the control cohort.

This is the command I used:

Code:

xtset hicid age_group
xtdidregress (learning i.treat i.post) (did), nogteffects group(hicid) time(age_group)

I am using age_group as my time variable because the two cohorts can only be compared at the same age, not in the same calendar year (there is a 4-year gap between them). The policy was implemented for the treated cohort when they were 6-7 years old. Although I have data for all observations in the first wave (i.e., the 4-5 age group), running the above command gives the following output:

Code:

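. xtdidregress (learning i.treat i.post) (did), nogteffects group(hicid) time(age_group)
note: 1.treat omitted because of collinearity.

Treatment and time information

Time variable: age_group
Control:       did = 0
Treatment:     did = 1
-----------------------------------
             |   Control  Treatment
-------------+---------------------
Group        |
       hicid |       980        906
-------------+---------------------
Time         |
     Minimum |         4          6
     Maximum |         8          8
-----------------------------------

Difference-in-differences regression                     Number of obs = 8,036
Data type: Longitudinal

                               (Std. err. adjusted for 1,886 clusters in hicid)
------------------------------------------------------------------------------
             |               Robust
    learning | Coefficient  std. err.      t    P>|t|     [95% conf. interval]
-------------+----------------------------------------------------------------
ATET         |
         did |
   (1 vs 0)  |   .0979579   .0461282     2.12   0.034     .0074902    .1884255
------------------------------------------------------------------------------
Note: ATET estimate adjusted for covariates and panel effects.
Note: Treatment occurs at different times and estimation sample contains
      units that switch in and out of treatment.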

treat = 1 for all observations in the treated (i.e., younger) cohort.
control = all observations in the untreated (i.e., earlier) cohort.

post = 1 for all observations in either cohort aged 6 and above, and 0 otherwise.

did = post*treat
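
In Stata terms, the definitions above could be written roughly as follows. This is a sketch, not the exact code used: it assumes cohort is a string variable with "B" marking the treated cohort (as in the tabulate calls further down) and that age_group is numeric with values such as 4, 6, 8. The _chk names are hypothetical, chosen so the existing variables are not overwritten.

Code:

* Assumption: cohort == "B" identifies the treated (younger) cohort
generate byte treat_chk = (cohort == "B")
* Assumption: age_group is numeric; 6 and above counts as the post period
generate byte post_chk = (age_group >= 6) if !missing(age_group)
generate byte did_chk = post_chk * treat_chk
* Compare with the existing variable; off-diagonal cells would signal a coding difference
tabulate did did_chk, missing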

These are my problems:

1) The following note appears:

Code:

note: 1.treat omitted because of collinearity.

When I checked the number of observations for treat and did (i.e., post*treat), they vary and are not the same, so I would not expect collinearity between them.

Code:

. tabulate treat if cohort == "B"

      treat |      Freq.     Percent        Cum.
------------+-----------------------------------
          1 |      4,953      100.00      100.00
------------+-----------------------------------
      Total |      4,953      100.00

. tabulate did if cohort == "B"

        did |      Freq.     Percent        Cum.
------------+-----------------------------------
          0 |        906       18.29       18.29
          1 |      4,047       81.71      100.00
------------+-----------------------------------
      Total |      4,953      100.00

So I cannot figure out why Stata omits the treat variable and reports a collinearity problem.
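
A possible check, offered as a sketch rather than a diagnosis: the output notes that the ATET is adjusted for panel effects, so an indicator that is constant within each hicid would be absorbed by those effects regardless of how it relates to did. The lines below look at whether treat varies within panels. treat_min and treat_max are hypothetical helper names.

Code:

* Within-panel variation of treat; a within std. dev. of 0 means treat is constant for every hicid
xtset hicid age_group
xtsum treat

* Count observations in panels where treat takes more than one value
bysort hicid: egen byte treat_min = min(treat)
bysort hicid: egen byte treat_max = max(treat)
count if treat_min != treat_max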

2) In the first output, under Time, the maximum for both groups is 8, which seems to indicate that units within each group first appear at different ages. But this is not what I see when I check the data: all control observations first appear at age 4 (none start at age 8), and all treated observations first appear at age 6.

Is there a way to ask Stata for a list of the unique IDs for which the pattern described in (2) does not hold?
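
One approach I can think of, sketched under the assumptions that age_group is numeric and that the expected first ages are 4 for the control cohort and 6 for the treated cohort (as described above). first_age, bad_start and id_tag are hypothetical names.

Code:

* First age_group at which each hicid is observed
bysort hicid (age_group): generate first_age = age_group[1]
* Flag units whose first observed age differs from the expected one for their cohort
generate byte bad_start = (treat == 0 & first_age != 4) | (treat == 1 & first_age != 6)
* Keep one row per hicid so each ID is listed once
egen byte id_tag = tag(hicid)
list hicid treat first_age if id_tag & bad_start, noobs
* Distribution of first observed ages by cohort, one row per hicid
tabulate first_age treat if id_tag, missing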
