Correct specification for Drdid

Sandra Macedo

Join Date: Sep 2020

Posts: 35
#1

12 Apr 2022, 05:06

Hello all,

I wish to apply a doubly robust difference-in-differences estimator using the drdid package. My data are repeated cross-sections, so covariates are unbalanced across the two groups, and I need reweighting in order to estimate the ATET.
policy is my "post" variable (=1 after the policy) and treat = 1 if the individual is treated.
My code:

1) using only pre-treatment variables
Code:

drdid scores age c.age#c.age i.sex i.maeduc ymid i.location i.popsize munic_gdp i.state_school if policy==0 , time(year) tr(treat) all

Doubly robust difference-in-differences estimator summary
------------------------------------------------------------------------------
             | Coefficient  Std. err.      z    P>|z|     [95% conf. interval]
-------------+----------------------------------------------------------------
ATET         |
       dripw |   .0253289   .0062734     4.04   0.000     .0130333    .0376245
   dripw_rc1 |   .0073257   .0062618     1.17   0.242    -.0049473    .0195986
       drimp |  -.0192829    .005635    -3.42   0.001    -.0303273   -.0082384
   drimp_rc1 |  -.0000959   .0056453    -0.02   0.986    -.0111605    .0109687
         reg |  -.0003035   .0047153    -0.06   0.949    -.0095454    .0089383
         ipw |  -.0219987   .0068335    -3.22   0.001    -.0353921   -.0086053
      stdipw |   .0034359   .0076242     0.45   0.652    -.0115072     .018379
------------------------------------------------------------------------------
Note: This table is provided for comparison across estimations only. You cannot use it to compare estimates across different estimators
dripw :Doubly Robust IPW
drimp :Doubly Robust Improved estimator
reg   :Outcome regression or Regression augmented estimator
ipw   :Abadie(2005) IPW estimator
stdipw:Standardized IPW estimator
sipwra:IPW and Regression adjustment estimator.

2) using variables from both groups (before and after)

Code:

drdid scores age c.age#c.age i.sex i.maeduc ymid i.location i.popsize munic_gdp i.state_school , time(policy) tr(treat) all

Doubly robust difference-in-differences estimator summary
------------------------------------------------------------------------------
             | Coefficient  Std. err.      z    P>|z|     [95% conf. interval]
-------------+----------------------------------------------------------------
ATET         |
       dripw |    .027256   .0036539     7.46   0.000     .0200944    .0344175
   dripw_rc1 |   .0102055   .0037384     2.73   0.006     .0028783    .0175327
       drimp |   .0118586   .0030361     3.91   0.000      .005908    .0178093
   drimp_rc1 |   .0249633   .0030417     8.21   0.000     .0190017    .0309249
         reg |   .0180872   .0024844     7.28   0.000     .0132179    .0229566
         ipw |  -.0164215   .0037662    -4.36   0.000    -.0238031   -.0090399
      stdipw |  -.0088338   .0042984    -2.06   0.040    -.0172585    -.000409
------------------------------------------------------------------------------

Given that this is repeated cross-section (RCS) data, I am not sure which specification to use, or how I can assume that PT holds after reweighting. The help file states that "the estimator is appropriate if either the propensity of treatment, or the outcome regression is correctly specified". How can I know whether that is the case?

I asked it to generate the RIF variable, but that covers only the "before" part of the data. Am I missing something here?

Code:

drdid scores age c.age#c.age i.sex i.maeduc ymid i.location i.popsize munic_gdp i.state_school if policy==0 , time(year) tr(treat) stub(rif2) dripw

Doubly robust difference-in-differences              Number of obs = 2,032,024
Outcome model  : least squares
Treatment model: inverse probability
------------------------------------------------------------------------------
             | Coefficient  Std. err.      z    P>|z|     [95% conf. interval]
-------------+----------------------------------------------------------------
ATET         |
       treat |
   (1 vs 0)  |   .0253289   .0062734     4.04   0.000     .0130333    .0376245
------------------------------------------------------------------------------

. sum rif2att , d

                           rif2att
-------------------------------------------------------------
      Percentiles      Smallest
 1%    -16.02436      -953.9732
 5%    -4.304641      -905.3324
10%    -2.991486      -877.8985       Obs           2,032,024
25%    -1.426803      -812.6595       Sum of wgt.   2,032,024

50%     .0161535                      Mean           .0253289
                        Largest       Std. dev.      8.942679
75%     1.484337       1007.817
90%     3.095296       1056.131       Variance        79.9715
95%     4.399576       1302.176       Skewness       4.728444
99%     15.83226       1311.288       Kurtosis        1675.18

Can anyone help?
Is there a way I could replicate this by hand or generate the weights? (sorry for so many questions)

Last edited by Sandra Macedo; 12 Apr 2022, 06:01.
Jared Greathouse

Join Date: Sep 2021

Posts: 2170
#2

12 Apr 2022, 05:21

Yeah, I don't really understand the questions. So let's start with the first problem: could you please rephrase your original question? Also, please put your data and code in the delimiters so we can clearly see what Stata gives you.

The most I get so far is that you want to reweight your analyses due to imbalances between your treated and untreated units. But it looks like your command (not package) already does this with IPW weights. So let's start there, please: what is the issue you're seeing here?
FernandoRios

Join Date: Apr 2014

Posts: 2469
#3

12 Apr 2022, 06:47

Hi Sandra (I still owe you an answer to your other post/email), but:
1) Your specifications seem odd.
Since you are using drdid, the idea is that you have a standard 2x2 DID design, meaning two time periods (before and after) and two groups (treatment and control). It seems that your treatment variable is treat, but you are trying to use both "policy" and "year" as your time variables. I suspect that using policy isn't correct.

2) When using repeated cross-sections, you will automatically be using data from before and after treatment, basically because you cannot observe the "before" characteristics of the same units across time.

3) The RIF is correct. If you look at its mean, it is exactly what drdid is reporting. Keep in mind that the RIF is simply used to obtain the standard errors of the estimate, nothing else.

4) With only 2 periods (before and after) you cannot assess parallel trends. In fact, the PT assumption can never be tested, because it is based on unobserved counterfactuals.
The alternative is to look at trends in the past, to see whether they were parallel before treatment. But with 2 periods, there is no past to look at.
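
Point 3 can be checked against the numbers already posted in #1. The following is a minimal sketch, assuming the observations are treated as independent (no clustering): the mean of rif2att is the ATET, and its standard deviation divided by the square root of N should come out very close to the reported standard error of .0062734. The scalar names are illustrative only.

```stata
* Values copied from the -sum rif2att, d- output posted in #1
scalar rif_mean = .0253289
scalar rif_sd   = 8.942679
scalar rif_n    = 2032024

* Standard error of a sample mean: sd / sqrt(N)
scalar rif_se = rif_sd / sqrt(rif_n)

display "ATET (mean of RIF): " %9.7f rif_mean
display "SE (sd / sqrt(N)) : " %9.7f rif_se
```

With the data in memory, the same three quantities are available as r(mean), r(sd) and r(N) after summarize rif2att.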

HTH
F

Sandra Macedo

Join Date: Sep 2020
Posts: 35
#4

12 Apr 2022, 06:59

Hi Jared,

Thanks! Sorry about the confusion.
Yes, I want to use the estimator from Sant’Anna and Zhao (2020), “Doubly Robust Difference-in-Differences Estimators”, Journal of Econometrics, Vol. 219 (1), pp. 101-122, which combines IPW weights and outcome regression to correct imbalances. Because this is the first time I am using it, I am confused about the coding: should I use Method I or Method II for my DID estimator, considering my data are repeated cross-sections?
I also asked how I can know whether PT still holds after reweighting.
Then I mentioned that I generated the RIF variable for the pre-treatment groups, and I wonder if I can use it for common support (rif2att >0 & rif2att<=1 ), because I want to compare the balanced versus unbalanced pre-treatment characteristics.
The help file doesn't give many details about the "stub" option.
These are basically my questions. Thanks!!! ( I hope I managed to use the delimiters properly.)

Method I - using pre treatment variables

Code:

 drdid scores age c.age#c.age i.sex i.maeduc ymid i.location i.popsize munic_gdp i.state_school if policy==0 , time(year) tr(treat) all

Doubly robust difference-in-differences estimator summary
------------------------------------------------------------------------------
             | Coefficient  Std. err.      z    P>|z|     [95% conf. interval]
-------------+----------------------------------------------------------------
ATET         |
       dripw |   .0253289   .0062734     4.04   0.000     .0130333    .0376245
   dripw_rc1 |   .0073257   .0062618     1.17   0.242    -.0049473    .0195986
       drimp |  -.0192829    .005635    -3.42   0.001    -.0303273   -.0082384
   drimp_rc1 |  -.0000959   .0056453    -0.02   0.986    -.0111605    .0109687
         reg |  -.0003035   .0047153    -0.06   0.949    -.0095454    .0089383
         ipw |  -.0219987   .0068335    -3.22   0.001    -.0353921   -.0086053
      stdipw |   .0034359   .0076242     0.45   0.652    -.0115072     .018379
------------------------------------------------------------------------------
Note: This table is provided for comparison across estimations only. You cannot use it to compare estimates across different estimators
dripw :Doubly Robust IPW
drimp :Doubly Robust Improved estimator
reg   :Outcome regression or Regression augmented estimator
ipw   :Abadie(2005) IPW estimator
stdipw:Standardized IPW estimator
sipwra:IPW and Regression adjustment estimator.

Method II - using pretreatment and posttreatment variables

Code:

drdid scores  age c.age#c.age i.sex i.maeduc ymid i.location i.popsize munic_gdp i.state_school if policy==0 ,  time(year) tr(treat) all

Doubly robust difference-in-differences estimator summary
------------------------------------------------------------------------------
             | Coefficient  Std. err.      z    P>|z|     [95% conf. interval]
-------------+----------------------------------------------------------------
ATET         |
       dripw |   .0253289   .0062734     4.04   0.000     .0130333    .0376245
   dripw_rc1 |   .0073257   .0062618     1.17   0.242    -.0049473    .0195986
       drimp |  -.0192829    .005635    -3.42   0.001    -.0303273   -.0082384
   drimp_rc1 |  -.0000959   .0056453    -0.02   0.986    -.0111605    .0109687
         reg |  -.0003035   .0047153    -0.06   0.949    -.0095454    .0089383
         ipw |  -.0219987   .0068335    -3.22   0.001    -.0353921   -.0086053
      stdipw |   .0034359   .0076242     0.45   0.652    -.0115072     .018379
------------------------------------------------------------------------------
Note: This table is provided for comparison across estimations only. You cannot use it to compare estimates across different estimators
dripw :Doubly Robust IPW
drimp :Doubly Robust Improved estimator
reg   :Outcome regression or Regression augmented estimator
ipw   :Abadie(2005) IPW estimator
stdipw:Standardized IPW estimator
sipwra:IPW and Regression adjustment estimator.

I then ask for the rif variable to be generated using option stub

Code:

 drdid z_enemglobal  age c.age#c.age i.sex i.maeduc ymid i.location i.popsize munic_gdp i.state_school if policy==0 ,  time(year) tr(treat) stub(rif2) dripw

Doubly robust difference-in-differences              Number of obs = 2,032,024
Outcome model  : least squares
Treatment model: inverse probability
------------------------------------------------------------------------------
             | Coefficient  Std. err.      z    P>|z|     [95% conf. interval]
-------------+----------------------------------------------------------------
ATET         |
       treat |
   (1 vs 0)  |   .0253289   .0062734     4.04   0.000     .0130333    .0376245
------------------------------------------------------------------------------

. sum rif2att , d

                           rif2att
-------------------------------------------------------------
      Percentiles      Smallest
 1%    -16.02436      -953.9732
 5%    -4.304641      -905.3324
10%    -2.991486      -877.8985       Obs           2,032,024
25%    -1.426803      -812.6595       Sum of wgt.   2,032,024

50%     .0161535                      Mean           .0253289
                        Largest       Std. dev.      8.942679
75%     1.484337       1007.817
90%     3.095296       1056.131       Variance        79.9715
95%     4.399576       1302.176       Skewness       4.728444
99%     15.83226       1311.288       Kurtosis        1675.18

my data:

Code:

* Example generated by -dataex-. For more info, type help dataex
clear
input float scores byte(age sex maeduc location popsize) int year byte(treat policy2) double rif2att
  2.1880713 17 2 4 1 2 2010 0 0  -2.108615013114744
   .6898354 17 1 4 1 3 2010 0 0   -.887499983940696
   .6624395 17 2 4 1 3 2010 0 0   .4969700701448334
  1.8560517 19 2 4 1 3 2010 0 0   11.93838102427453
  .29095936 17 2 6 1 3 2010 0 0  -3.003514648669609
   .7247713 19 1 5 1 2 2010 0 0  1.1126663743395977
 -1.5315053 18 2 3 1 1 2010 0 0  -41.21447857064167
   1.521267 17 2 3 1 3 2010 0 0   2.262782829250008
   .8320931 17 2 4 1 3 2010 0 0 -.05592393662715369
  .03484334 17 2 4 1 3 2010 0 0  -2.347317806588566
  -.5158417 18 1 4 1 3 2010 0 0 -14.651958523524792
   .6161923 18 1 6 1 3 2010 0 0  -.9896130082723955
  -.3024542 17 1 6 1 2 2010 0 0  2.1652444464276845
  -.6761962 18 1 5 1 3 2010 0 0  -.7347835219172532
   .4083346 22 1 4 1 2 2010 0 0  26.956594693099152
   .4068266 17 1 4 1 3 2010 0 0  -3.025821214328673
   .4153721 17 2 3 1 1 2010 0 0  12.152451447508527
  -.9906225 18 2 2 1 2 2010 0 0  -69.48923468380823
   1.921651 17 2 4 1 3 2010 0 0   2.401802539775441
  1.3958486 18 2 4 1 3 2010 0 0  .02717134189433132
  .31760055 17 2 3 1 1 2010 0 0   .8337790369346683
  1.7230928 17 2 6 1 3 2010 0 0  1.9642725429392918
   .8760782 19 2 4 1 2 2010 0 0   5.886918192477614
    .547577 18 2 5 1 3 2010 0 0  -.9057154068523958
  2.4869146 18 2 5 1 2 2010 0 0  -5.102196571965775
   .7800663 17 1 6 1 3 2010 0 0 -.22930403253520598
  .06600999 17 2 4 1 3 2010 0 0  -3.016041166893883
  1.4994005 17 2 5 1 3 2010 0 0  .04163480892675789
  1.8012598 17 1 5 1 1 2010 0 0 -.03847635408722949
  .44176245 17 2 4 1 3 2010 0 0  -.5159549749371428
   .8175157 17 1 4 1 2 2010 0 0  1.0476292803327028
   .6822948 17 2 6 1 2 2010 0 0   .1112225563241952
 -1.0949284 17 2 4 1 3 2010 0 0  -1.979147834144271
    -.45552 17 1 5 1 2 2010 0 0  1.2333970989871856
  1.2764623 17 2 4 1 2 2010 0 0  .22711194537256793
 -.55203474 17 2 4 1 3 2010 0 0  -9.001959707899845
  .04012203 17 2 4 1 3 2010 0 0 -2.0502740897711567
 -.07122129 20 1 3 1 1 2010 0 0  38.974895779339576
   .9700791 18 2 4 1 3 2010 0 0  1.3117001042990897
    1.76205 17 1 4 1 1 2010 0 0    9.01030298902866
   2.905395 18 1 5 1 3 2010 0 0  -.6241078718779147
  -.3627759 17 2 4 1 2 2010 0 0  -12.66066570293923
   .9444419 17 1 6 1 2 2010 0 0 -.19416693111371802
   1.688911 18 1 4 1 3 2010 0 0   5.402123703010223
   .2557712 16 2 5 1 3 2010 0 0   .4373887140405821
 -1.1029714 17 1 6 1 2 2010 0 0 -3.5819662085760258
   .4741855 17 2 5 1 3 2010 0 0 -.21488108696177288
  -.3072297 17 1 4 1 3 2010 0 0 -10.974035211482592
  -.7030898 17 1 4 1 3 2010 0 0  -8.114072097907837
  -.3197967 17 2 2 1 1 2010 0 0 -62.175483891858896
  -.7385288 18 2 3 1 3 2010 0 0 -15.082248746049483
    1.81081 17 2 6 1 2 2010 0 0  5.9029232390533295
   .6071444 18 2 3 1 2 2010 0 0  1.8937796156609987
   2.723174 17 2 6 1 2 2010 0 0 -1.6061977756752117
  1.5267965 18 1 4 1 3 2010 0 0  3.8227343265849107
  .09742746 17 2 4 1 3 2010 0 0  -2.835893567890441
  1.6207973 18 2 4 1 2 2010 0 0   7.042922349243504
   .6878242 17 2 5 1 3 2010 0 0  -.5173319136348004
   .9472071 17 1 5 1 2 2010 0 0  .11209566168814106
  .53501004 17 2 6 1 1 2010 0 0 -1.6453831478745786
  -.6163777 17 2 3 1 2 2010 0 0 -17.365001836275773
   .3490188 17 2 4 1 3 2010 0 0  -.5816104925022387
   2.735489 18 1 5 1 3 2010 0 0   2.327547697939817
  .55310655 17 1 4 1 3 2010 0 0  -1.469782262546989
 -1.2577964 17 2 5 1 1 2010 0 0  1.7168181942486889
  1.8952606 17 1 4 1 2 2010 0 0   8.490545944447199
   .7599587 18 1 4 1 2 2010 0 0  3.3456687801756613
    1.76205 17 1 5 1 2 2010 0 0    .243274332813464
  1.8399656 17 1 5 1 1 2010 0 0  4.8516981981496485
 -.05061202 18 2 5 1 3 2010 0 0 -3.8392308842269522
   .5792453 18 2 5 1 1 2010 0 0   .2883367236798838
   .4623725 18 1 5 1 3 2010 0 0   .7579834162104133
  1.0731286 19 1 3 1 3 2010 0 0   15.37880698152951
   .7551839 17 2 4 1 2 2010 0 0  .49856416379892976
  2.3172603 18 2 5 1 3 2010 0 0  3.2253940941345705
   .6543964 17 2 4 1 3 2010 0 0  .12845303593599675
  1.5971713 17 1 4 1 2 2010 0 0   .7098854091651324
  1.2048303 17 2 5 1 3 2010 0 0  .02871438799080352
  1.3960994 18 1 5 1 2 2010 0 0  .06690414077634024
   .2391827 17 1 4 1 2 2010 0 0 -3.5617560126159415
   .8755751 17 1 4 1 3 2010 0 0 -.13416128985113435
   .2321452 17 2 4 1 2 2010 0 0 -1.4574450700406196
   1.467983 17 1 4 1 2 2010 0 0   4.466971144051167
  -.5414785 17 2 5 1 2 2010 0 0   4.382081800980223
   .3037772 17 2 4 1 1 2010 0 0 -1.4173456898879264
   1.470245 17 2 6 1 1 2010 0 0    -.51888569876383
 -1.0511951 18 2 4 1 1 2010 0 0  -7.499841017783167
-.036285467 18 1 3 1 1 2010 0 0 -3.2270328883416837
    1.81081 17 2 5 1 2 2010 0 0   .3737696880227126
  .50635695 17 2 4 1 2 2010 0 0 -1.0900491636147591
   1.419977 18 2 2 1 2 2010 0 0  20.719351816552866
   2.066423 17 1 4 1 3 2010 0 0    5.73817905887188
   .6285084 17 1 5 1 3 2010 0 0  -.6173922217845376
   .9165436 18 2 3 1 3 2010 0 0  28.992029021300972
   1.501411 18 2 4 1 2 2010 0 0   8.840326645120548
   .2773868 17 1 4 1 2 2010 0 0 -2.0573360034786523
  .01071494 18 2 4 1 1 2010 0 0 .020211940924463687
-.006878382 18 2 2 1 1 2010 0 0 -10.305251439003298
  .12884493 17 2 4 1 3 2010 0 0 -1.2092669870332968
  .41210455 17 1 4 1 3 2010 0 0  -.6470534020410958
end
label values sex sex
label def sex 1 "M", modify
label def sex 2 "F", modify
label values maeduc maeduc
label def maeduc 2 "Primary", modify
label def maeduc 3 "Middle school", modify
label def maeduc 4 "High school", modify
label def maeduc 5 "University", modify
label def maeduc 6 "Postgrad", modify
label values location location
label def location 1 "Urban", modify
label values popsize pop
label def pop 1 "small", modify
label def pop 2 "medium", modify
label def pop 3 "large", modify


Jared Greathouse

Join Date: Sep 2021

Posts: 2170
#5

12 Apr 2022, 07:10

I was going to suggest asking Professor Rios, but he did so while I went back to sleep, apparently.

The only thing I have to add is something that seems strange to me: you describe Method I as using pre-treatment variables and Method II as using pre- and post-treatment variables.

But these yield the same results. I don't see any differences in the coefficients or anything else; you appear to be estimating the same model, no?

If you're really working with 2 time periods (one period before and one after), then, as Fernando says, you can't really validate parallel trends. So my advice, if possible, would be to gather more data, specifically more pre-intervention data.
Sandra Macedo

Join Date: Sep 2020

Posts: 35
#6

12 Apr 2022, 07:25

Hi FernandoRios, thanks for your help. No need to rush with an answer. In fact, my data have 2 periods before and 6 periods after, and I am aggregating them into "before" and "after" (I couldn't work with csdid, although I tried). Anyway, I know I can't test PT, but since I have two pre-treatment periods I could plot the trends or do an event study, and it looks like PT doesn't hold when I add covariates. So I am trying alternatives to correct for this by testing different DiD estimators, including quantile regressions. I thought weighting could be a good strategy.
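
A minimal sketch of such a pre-period check, assuming the variable names used earlier in this thread and that 2010 and 2011 are the two pre-policy years. It runs the same 2x2 estimator on the pre-treatment years only, so the "effect" it reports is a placebo: no policy was in place yet, and an estimate far from zero would suggest that the conditional trends already differed before 2012. A placebo close to zero is supportive but cannot prove that parallel trends hold afterwards. This appears to be what the if policy==0 specification in #1 is estimating.

```stata
* Placebo 2x2 DID on pre-policy years only (2010 as "before", 2011 as "after").
* Assumes scores, treat, year and the covariates below exist as in this thread.
drdid scores age c.age#c.age i.sex i.maeduc ymid i.location i.popsize ///
    munic_gdp i.state_school if inlist(year, 2010, 2011), ///
    time(year) tr(treat) dripw
```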
Sandra Macedo

Join Date: Sep 2020

Posts: 35
#7

12 Apr 2022, 07:30

Hi Jared Greathouse, you are right, I posted the wrong output, but the right one is in the first post. I wonder if I can edit it. Thanks!! Yes, I am glad he showed up. By the way, yes, I do have more pre-treatment data.
Jared Greathouse

Join Date: Sep 2021

Posts: 2170
#8

12 Apr 2022, 09:13

More than two periods of pre intervention data?
Sandra Macedo

Join Date: Sep 2020

Posts: 35
#9

12 Apr 2022, 12:15

Originally posted by Jared Greathouse

More than two periods of pre intervention data?

No, unfortunately.

Sandra Macedo

Join Date: Sep 2020
Posts: 35

#10

15 Apr 2022, 04:43

This is the right code, correcting the one I posted by mistake.

Code:

 drdid scores age c.age#c.age i.sex i.maeduc ymid i.location i.popsize munic_gdp i.state_school , time(policy) tr(treat) all


Doubly robust difference-in-differences estimator summary
------------------------------------------------------------------------------
             | Coefficient  Std. err.      z    P>|z|     [95% conf. interval]
-------------+----------------------------------------------------------------
ATET         |
       dripw |    .027256   .0036539     7.46   0.000     .0200944    .0344175
   dripw_rc1 |   .0102055   .0037384     2.73   0.006     .0028783    .0175327
       drimp |   .0118586   .0030361     3.91   0.000      .005908    .0178093
   drimp_rc1 |   .0249633   .0030417     8.21   0.000     .0190017    .0309249
         reg |   .0180872   .0024844     7.28   0.000     .0132179    .0229566
         ipw |  -.0164215   .0037662    -4.36   0.000    -.0238031   -.0090399
      stdipw |  -.0088338   .0042984    -2.06   0.040    -.0172585    -.000409
------------------------------------------------------------------------------
Note: This table is provided for comparison across estimations only. You cannot use it to compare estimates across different estimators
dripw :Doubly Robust IPW
drimp :Doubly Robust Improved estimator
reg   :Outcome regression or Regression augmented estimator
ipw   :Abadie(2005) IPW estimator
stdipw:Standardized IPW estimator
sipwra:IPW and Regression adjustment estimator.

Last edited by Sandra Macedo; 15 Apr 2022, 04:47.


FernandoRios

Join Date: Apr 2014

Posts: 2469
#11

15 Apr 2022, 06:51

Something seems odd in this code.
Policy doesn't strike me as a "time" variable.
Also, it seems that you are trying to control for state and school fixed effects, but you have repeated cross-sections?
If the treatment varies across students within schools, then it's fine.
But if the treatment happens for some schools and not others, then you cannot use school fixed effects as you are doing.

HTH
Sandra Macedo

Join Date: Sep 2020

Posts: 35
#12

15 Apr 2022, 07:29

Hi FernandoRios, thanks again for helping out. I really appreciate your help. I am not using school fixed effects; state_school is a variable that indicates the state where the student's school of origin is located (maybe I should just rename it). I think my problem is in generating the time and treat variables. The treat variable = 1 if the individual is treated and 0 if not targeted by the policy. There are treated/untreated individuals before and after the policy. What I did was to aggregate all years after the policy (the year of the policy is 2012) into a variable equal to 1 if year>=2012 and 0 if before 2012. (I have only 2 years before; I could add 2009, but I dropped it because other policies were implemented that year.)
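
The aggregation described above can be sketched as follows, using a tiny made-up data set (the eight rows are illustrative only, not taken from the real data). The cross-tabulation at the end is a quick check that all four cells of the 2x2 design are populated.

```stata
clear
input int year byte treat
2010 0
2010 1
2011 0
2011 1
2012 0
2012 1
2018 0
2018 1
end

* Post-policy indicator: 1 from 2012 onward, 0 before, missing if year is missing
generate byte policy = (year >= 2012) if !missing(year)

* All four treat x policy cells should be non-empty for a 2x2 design
tabulate treat policy
```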

Should I use treat_policy=treat*policy and then rewrite the code as below?
drdid scores age c.age#c.age i.sex i.maeduc ymid i.location i.popsize munic_gdp i.state_school , time(year) tr(treat_policy) dripw

Since there are different years and everyone is in fact treated only once, should I try to use csdid instead? I couldn't generate the gvar, though, which I think is not applicable to my data. I would appreciate it if you could help with that.
Thank you!
Sandra

Last edited by Sandra Macedo; 15 Apr 2022, 07:31.
Sandra Macedo

Join Date: Sep 2020

Posts: 35
#13

15 Apr 2022, 07:37

Of course, if I do that I would have zero untreated subject in my post period, so I get the error "You do not have 2x2 design"
FernandoRios

Join Date: Apr 2014

Posts: 2469
#14

15 Apr 2022, 07:44

Let me ask you this:
At what level is the treatment implemented (individual, school, state)?
Say it's school.
You could create the gvar as follows:
bysort school: egen minyear=min(year) if treated==1
This will give you the earliest treated year.
Next:
bysort school: egen gvar=max(minyear)
so that gvar is assigned even for periods before treatment.

Finally:
replace gvar=0 if gvar==.
to set gvar to 0 for never-treated units.

After you do that, do:
tab year gvar

and show me what you get.
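
The steps above can be tried on a small made-up example first. This is only a sketch: the two schools, the years and the treated indicator are hypothetical, with school 1 treated from 2012 onward and school 2 never treated.

```stata
clear
input byte school int year byte treated
1 2010 0
1 2011 0
1 2012 1
1 2013 1
2 2010 0
2 2011 0
2 2012 0
2 2013 0
end

* Earliest year in which each school is observed as treated
bysort school: egen minyear = min(year) if treated == 1

* Spread that year to every row of the school, including pre-treatment rows
bysort school: egen gvar = max(minyear)

* Schools that are never treated get gvar = 0
replace gvar = 0 if missing(gvar)

tabulate year gvar
```

In this toy data, every row of school 1 ends up with gvar equal to 2012 and every row of school 2 with gvar equal to 0.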

Sandra Macedo

Join Date: Sep 2020
Posts: 35

#15

15 Apr 2022, 08:16

Fernando, the treatment was implemented nationally at the individual level in 2012; before 2012 nobody was treated (the variable treat includes all those eligible, before and after the policy). So when I tried your code, I didn't sort by school.
I guess I am not getting the logic here.

what I did was:

Code:

 egen minyear=min(year) if treat==1
(2,620,026 missing values generated)

. egen gvar=max(minyear)

 replace gvar2=0 if gvar2==.
(0 real changes made)


. tab gvar year

           |                                             Enem year
      gvar |      2010       2011       2012       2013       2014       2015       2016       2017       2018 |     Total
-----------+---------------------------------------------------------------------------------------------------+----------
      2010 |   957,340  1,074,684  1,112,430  1,221,800  1,231,342  1,292,287  1,367,401  1,209,447  1,026,272 |10,493,003 
-----------+---------------------------------------------------------------------------------------------------+----------
     Total |   957,340  1,074,684  1,112,430  1,221,800  1,231,342  1,292,287  1,367,401  1,209,447  1,026,272 |10,493,003 


*including only effectively treated subjects (treat_policy = treat*policy)

 egen minyear2=min(year) if treat_policy==1
(4,143,550 missing values generated)

. egen gvar2=max(minyear2)

 replace gvar2=0 if gvar2==.

(0 real changes made)

. tab gvar2 year
           |                                             Enem year
     gvar2 |      2010       2011       2012       2013       2014       2015       2016       2017       2018 |     Total
-----------+---------------------------------------------------------------------------------------------------+----------
      2012 |   957,340  1,074,684  1,112,430  1,221,800  1,231,342  1,292,287  1,367,401  1,209,447  1,026,272 |10,493,003 
-----------+---------------------------------------------------------------------------------------------------+----------
     Total |   957,340  1,074,684  1,112,430  1,221,800  1,231,342  1,292,287  1,367,401  1,209,447  1,026,272 |10,493,003
