  • Using xtdidregress command with 2 cohorts from different birth years

    I am using the xtdidregress command in Stata 18.0. My aim is to identify the causal effect of an educational program by comparing a treated cohort with a control cohort. The treated cohort was born 4 years later than the control cohort.

    This is the command I used:
    Code:
    xtset hicid age_group
    xtdidregress (learning i.treat i.post) (did), nogteffects group(hicid) time(age_group)
    I am using age_group as my time variable because I can compare the two cohorts only at the same age, not in the same calendar year (because of the 4-year gap between them). The policy was implemented for the treated cohort when they were 6-7 years old. Although I have data for all observations in the first wave (i.e., the 4-5 age group), when I run the above command, I receive the following output:

    Code:
    . xtdidregress (learning i.treat i.post) (did), nogteffects group(hicid) time(age_group) 
    note: 1.treat omitted because of collinearity.
    
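    Treatment and time information
    
    Time variable: age_group
    Control:       did = 0
    Treatment:     did = 1
    -----------------------------------
                 |   Control  Treatment
    -------------+---------------------
    Group        |
           hicid |       980        906
    -------------+---------------------
    Time         |
         Minimum |         4          6
         Maximum |         8          8
    -----------------------------------
    
    Difference-in-differences regression                     Number of obs = 8,036
    Data type: Longitudinal
    
                                   (Std. err. adjusted for 1,886 clusters in hicid)
    ------------------------------------------------------------------------------
                 |               Robust
        learning | Coefficient  std. err.      t    P>|t|     [95% conf. interval]
    -------------+----------------------------------------------------------------
    ATET         |
             did |
       (1 vs 0)  |   .0979579   .0461282     2.12   0.034     .0074902    .1884255
    ------------------------------------------------------------------------------
    Note: ATET estimate adjusted for covariates and panel effects.
    Note: Treatment occurs at different times and estimation sample contains units that switch in and out of treatment.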

    The variables are defined as follows:

    treat = 1 for all observations in the treated cohort (that is, the younger cohort)
    control = all observations in the untreated (i.e., earlier-born) cohort, for which treat = 0
    post = 1 for all observations in either cohort aged 6 and above, and 0 otherwise
    did = post*treat
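
    The definitions above can be checked against the data with a few assertions. This is only a minimal sketch: it assumes that cohort is a string variable in which "B" marks the treated cohort (as in the tabulations in this post) and that age_group stores the age in years.

    ```stata
    * Assumption: cohort == "B" identifies the treated (younger) cohort
    assert treat == (cohort == "B")

    * Assumption: age_group is numeric and post switches on at age 6
    assert post == (age_group >= 6) if !missing(age_group)

    * did should be the plain interaction of the two indicators
    assert did == post * treat
    ```

    Each assert stops with an error and reports the number of contradictions if the variable was built differently from the description.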


    These are my problems:

    1) The following note appears:
    Code:
    note: 1.treat omitted because of collinearity.
    When I tabulated treat and did (i.e., post*treat) for the treated cohort, their distributions differ, so the two variables are not identical. I would therefore not expect them to be collinear.

    Code:
    . tabulate treat if cohort == "B"
    
          treat |      Freq.     Percent        Cum.
    ------------+-----------------------------------
              1 |      4,953      100.00      100.00
    ------------+-----------------------------------
          Total |      4,953      100.00
    
    . tabulate did if cohort == "B"
    
            did |      Freq.     Percent        Cum.
    ------------+-----------------------------------
              0 |        906       18.29       18.29
              1 |      4,047       81.71      100.00
    ------------+-----------------------------------
          Total |      4,953      100.00
    So I can't figure out why Stata omits the treat variable because of collinearity.
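
    One possible check, offered as a sketch rather than a diagnosis: the output notes that the estimate is "adjusted for covariates and panel effects", so the relevant question may be whether treat varies within an hicid panel, not whether it differs from did. A variable that is constant within every panel cannot be separated from the panel effects. The names treat_varies and id_tag are hypothetical helper variables.

    ```stata
    * treat_varies and id_tag are hypothetical helper variables
    * Sort each panel by treat; first and last values differ only if treat changes
    bysort hicid (treat): generate byte treat_varies = (treat[1] != treat[_N])

    * Tag one row per ID so each hicid is counted once
    egen byte id_tag = tag(hicid)

    * If treat_varies is 0 for every ID, treat is constant within hicid
    tabulate treat_varies if id_tag
    ```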

    2) In the first output, under Time, the maximum for both groups is 8, which suggests that within each group some observations first appear at different ages. But this is not what I see when I check the data. All observations in the control group first appear at age 4, and none start at age 8. Similarly, all observations in the treated group first appear at age 6.

    Is there a way to ask Stata to give me a list of the unique IDs for which the pattern described in (2) does not hold?
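
    A minimal sketch of such a listing follows. It assumes that the Time rows of the output refer to each ID's first observed age (control) and first age with did == 1 (treated), and it takes the expected values 4 and 6 from the description in (2). The variables first_age, first_row, first_did_age, and odd_start are hypothetical helper names.

    ```stata
    * Hypothetical helper variables: first_age, first_row, first_did_age, odd_start
    * First age at which each ID is observed, plus a marker for that first row
    bysort hicid (age_group): generate first_age = age_group[1]
    by hicid: generate byte first_row = (_n == 1)

    * First age at which each ID has did == 1 (missing if never treated)
    egen first_did_age = min(cond(did == 1, age_group, .)), by(hicid)

    * Flag control IDs not starting at 4 and treated IDs not first treated at 6
    generate byte odd_start = (treat == 0 & first_age != 4) | ///
        (treat == 1 & first_did_age != 6)

    * One line per flagged ID
    list hicid treat first_age first_did_age if first_row & odd_start, noobs
    ```

    The /// line continuation works in a do-file; when typing interactively, the generate command would need to go on a single line. Treated IDs that never have did == 1 are also flagged, because their first_did_age is missing.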