  • Correct specification for Drdid

    Hello all,

    I wish to apply a doubly robust difference-in-difference estimator using the drdid package. My data is repeated cross-section therefore there are unbalanced characteristics across the two groups, thus in order to estimate atet, I need reweighting.
    policy is my "post" variable (=1 after policy) and treat = 1 if individual is treated.
    My codes:

    1) using only pre-treatment variables

    Code:
    drdid scores age c.age#c.age i.sex i.maeduc ymid i.location i.popsize munic_gdp i.state_school if policy==0 , time(year) tr(treat) all

    Doubly robust difference-in-differences estimator summary
    ------------------------------------------------------------------------------
                 | Coefficient  Std. err.      z    P>|z|     [95% conf. interval]
    -------------+----------------------------------------------------------------
    ATET         |
           dripw |   .0253289   .0062734     4.04   0.000     .0130333    .0376245
       dripw_rc1 |   .0073257   .0062618     1.17   0.242    -.0049473    .0195986
           drimp |  -.0192829    .005635    -3.42   0.001    -.0303273   -.0082384
       drimp_rc1 |  -.0000959   .0056453    -0.02   0.986    -.0111605    .0109687
             reg |  -.0003035   .0047153    -0.06   0.949    -.0095454    .0089383
             ipw |  -.0219987   .0068335    -3.22   0.001    -.0353921   -.0086053
          stdipw |   .0034359   .0076242     0.45   0.652    -.0115072     .018379
    ------------------------------------------------------------------------------
    Note: This table is provided for comparison across estimations only. You cannot use it to compare estimates across different estimators
    dripw :Doubly Robust IPW
    drimp :Doubly Robust Improved estimator
    reg   :Outcome regression or Regression augmented estimator
    ipw   :Abadie(2005) IPW estimator
    stdipw:Standardized IPW estimator
    sipwra:IPW and Regression adjustment estimator.

    2) using variables from both groups (before and after)

    Code:
    drdid scores age c.age#c.age i.sex i.maeduc ymid i.location i.popsize munic_gdp i.state_school , time(policy) tr(treat) all

    Doubly robust difference-in-differences estimator summary
    ------------------------------------------------------------------------------
                 | Coefficient  Std. err.      z    P>|z|     [95% conf. interval]
    -------------+----------------------------------------------------------------
    ATET         |
           dripw |    .027256   .0036539     7.46   0.000     .0200944    .0344175
       dripw_rc1 |   .0102055   .0037384     2.73   0.006     .0028783    .0175327
           drimp |   .0118586   .0030361     3.91   0.000      .005908    .0178093
       drimp_rc1 |   .0249633   .0030417     8.21   0.000     .0190017    .0309249
             reg |   .0180872   .0024844     7.28   0.000     .0132179    .0229566
             ipw |  -.0164215   .0037662    -4.36   0.000    -.0238031   -.0090399
          stdipw |  -.0088338   .0042984    -2.06   0.040    -.0172585    -.000409
    ------------------------------------------------------------------------------


    Considering these are repeated cross-section (RCS) data, I am not sure which one to use. How can I assume PT holds after reweighting? The help file states that "the estimator is appropriate if either the propensity of treatment, or the outcome regression is correctly specified"; how can I know?

    I asked it to generate the RIF variable but that is only for the "before" part of the data. Am I missing something here?

    Code:
    drdid scores age c.age#c.age i.sex i.maeduc ymid i.location i.popsize munic_gdp i.state_school if policy==0 , time(year) tr(treat) stub(rif2) dripw

    Doubly robust difference-in-differences              Number of obs = 2,032,024
    Outcome model  : least squares
    Treatment model: inverse probability
    ------------------------------------------------------------------------------
                 | Coefficient  Std. err.      z    P>|z|     [95% conf. interval]
    -------------+----------------------------------------------------------------
    ATET         |
           treat |
       (1 vs 0)  |   .0253289   .0062734     4.04   0.000     .0130333    .0376245
    ------------------------------------------------------------------------------

    . sum rif2att , d

                               rif2att
    -------------------------------------------------------------
          Percentiles      Smallest
     1%    -16.02436      -953.9732
     5%    -4.304641      -905.3324
    10%    -2.991486      -877.8985       Obs           2,032,024
    25%    -1.426803      -812.6595       Sum of wgt.   2,032,024

    50%     .0161535                      Mean           .0253289
                            Largest       Std. dev.      8.942679
    75%     1.484337       1007.817
    90%     3.095296       1056.131       Variance        79.9715
    95%     4.399576       1302.176       Skewness       4.728444
    99%     15.83226       1311.288       Kurtosis        1675.18

    Can anyone help?
    Is there a way I could replicate this by hand or generate the weights? (sorry for so many questions)
    Last edited by Sandra Macedo; 12 Apr 2022, 06:01.
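On generating weights by hand: a minimal sketch, assuming the goal is ATT-type inverse-probability weights from a logit of treat on the same covariates. The names ps and w_att are hypothetical, and this is not claimed to be what drdid computes internally; it only illustrates the usual odds weighting and a simple before/after balance comparison.

```stata
* Sketch only: ATT-type weights from a logit on the covariates used in the post
local xvars age c.age#c.age i.sex i.maeduc ymid i.location i.popsize munic_gdp i.state_school

* Propensity of belonging to the treated group given covariates
logit treat `xvars'
predict double ps, pr

* Treated units keep weight 1; comparison units get the odds ps/(1-ps)
generate double w_att = cond(treat == 1, 1, ps/(1 - ps)) if !missing(ps)

* Overlap check: compare the propensity-score distribution in each group
summarize ps if treat == 1, detail
summarize ps if treat == 0, detail

* Covariate means by group, unweighted and then weighted
tabstat age ymid munic_gdp, by(treat) statistics(mean)
tabstat age ymid munic_gdp [aweight = w_att], by(treat) statistics(mean)
```

If the weighted means of the comparison group move toward those of the treated group, the reweighting is doing its job on those covariates. Common support is judged from ps, not from the RIF variable.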

  • #2
    Yeah, I don't really understand the questions. So let's start from the first problem: could you please rephrase your original question? Also, please put your data and code in the delimiters so we can clearly see what Stata gives you.

    The most I get so far is that you wanna reweight your analyses due to imbalances between your treated and untreated units. But it looks like your command (not package) does this with IPW weights. So let's start there, please: what is the issue you're seeing here?

    • #3
      Hi Sandra (I still owe you an answer to your other post/email)
      but
      1) Your specifications seem odd.
      Since you are using drdid, the idea is that you have a standard 2x2 DID design, meaning two time periods (before and after) and two groups (treatment and control). It seems that your treatment is treat, but you are trying to use "policy" and "year" as your time variables. I suspect that using policy isn't correct.

      2) When using repeated cross-sections, you will automatically be using data from before and after treatment, basically because you cannot observe "before" characteristics for the same units across time.

      3) The RIF is correct. If you look at its mean, it is exactly what drdid reports. Keep in mind that the RIF is simply used to obtain the standard errors of the estimate, nothing else.

      4) With only 2 periods (before and after) you cannot assess parallel trends. In fact, the PT assumption can never be tested, because it is based on unobserved counterfactuals.
      The alternative is to look at trends in the past, to see whether the trends were parallel before treatment. But with 2 periods, there is no past to look at.

      HTH
      F
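Point 3 can be checked against the numbers in the first post. A small sketch, assuming the reported standard error is the standard deviation of the RIF divided by the square root of N (possibly up to a degrees-of-freedom adjustment): the summarize output gives a standard deviation of 8.942679 on 2,032,024 observations, which works out to about .0062734, the standard error drdid printed.

```stata
* Standard error implied by the posted summarize output: sd(RIF) / sqrt(N)
display 8.942679 / sqrt(2032024)

* Same check from the data, assuming rif2att is still in memory
quietly summarize rif2att
display "ATT = " r(mean) "   SE = " r(sd) / sqrt(r(N))
```

This is also why the RIF values themselves can be very large in absolute value: they are influence-function contributions, not weights or probabilities.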

      • #4
        Hi Jared,

        Thanks! Sorry about the confusion.
        Yes, I want to use the estimator from Sant’Anna and Zhao (2020), “Doubly Robust Difference-in-Differences Estimators”, Journal of Econometrics, Vol. 219 (1), pp. 101-122, which combines IPW weights and outcome regression to correct imbalances. Because this is the first time I am using it, I am confused about the coding: should I use Method I or Method II for my DID estimator, given that my data are repeated cross-sections?
        I also asked how I can know whether PT still holds after reweighting.
        Then I mentioned that I generated the rif variable for the pre-treatment groups, and I wonder if I can use it for common support (rif2att >0 & rif2att<=1 ), because I want to compare the balanced vs. unbalanced pre-treatment characteristics.
        The help file doesn't give many details about the "stub" option.
        These are basically my questions. Thanks!!! ( I hope I managed to use the delimiters properly.)


        Method I - using pre treatment variables

        Code:
         drdid scores age c.age#c.age i.sex i.maeduc ymid i.location i.popsize munic_gdp i.state_school if policy==0 , time(year) tr(treat) all
        
        Doubly robust difference-in-differences estimator summary
        ------------------------------------------------------------------------------
                     | Coefficient  Std. err.      z    P>|z|     [95% conf. interval]
        -------------+----------------------------------------------------------------
        ATET         |
               dripw |   .0253289   .0062734     4.04   0.000     .0130333    .0376245
           dripw_rc1 |   .0073257   .0062618     1.17   0.242    -.0049473    .0195986
               drimp |  -.0192829    .005635    -3.42   0.001    -.0303273   -.0082384
           drimp_rc1 |  -.0000959   .0056453    -0.02   0.986    -.0111605    .0109687
                 reg |  -.0003035   .0047153    -0.06   0.949    -.0095454    .0089383
                 ipw |  -.0219987   .0068335    -3.22   0.001    -.0353921   -.0086053
              stdipw |   .0034359   .0076242     0.45   0.652    -.0115072     .018379
        ------------------------------------------------------------------------------
        Note: This table is provided for comparison across estimations only. You cannot use it to compare estimates across different estimators
        dripw :Doubly Robust IPW
        drimp :Doubly Robust Improved estimator
        reg   :Outcome regression or Regression augmented estimator
        ipw   :Abadie(2005) IPW estimator
        stdipw:Standardized IPW estimator
        sipwra:IPW and Regression adjustment estimator.

        Method II - using pretreatment and posttreatment variables

        Code:
        drdid scores  age c.age#c.age i.sex i.maeduc ymid i.location i.popsize munic_gdp i.state_school if policy==0 ,  time(year) tr(treat) all
        
        Doubly robust difference-in-differences estimator summary
        ------------------------------------------------------------------------------
                     | Coefficient  Std. err.      z    P>|z|     [95% conf. interval]
        -------------+----------------------------------------------------------------
        ATET         |
               dripw |   .0253289   .0062734     4.04   0.000     .0130333    .0376245
           dripw_rc1 |   .0073257   .0062618     1.17   0.242    -.0049473    .0195986
               drimp |  -.0192829    .005635    -3.42   0.001    -.0303273   -.0082384
           drimp_rc1 |  -.0000959   .0056453    -0.02   0.986    -.0111605    .0109687
                 reg |  -.0003035   .0047153    -0.06   0.949    -.0095454    .0089383
                 ipw |  -.0219987   .0068335    -3.22   0.001    -.0353921   -.0086053
              stdipw |   .0034359   .0076242     0.45   0.652    -.0115072     .018379
        ------------------------------------------------------------------------------
        Note: This table is provided for comparison across estimations only. You cannot use it to compare estimates across different estimators
        dripw :Doubly Robust IPW
        drimp :Doubly Robust Improved estimator
        reg   :Outcome regression or Regression augmented estimator
        ipw   :Abadie(2005) IPW estimator
        stdipw:Standardized IPW estimator
        sipwra:IPW and Regression adjustment estimator.

        I then ask for the rif variable to be generated using option stub

        Code:
         drdid z_enemglobal  age c.age#c.age i.sex i.maeduc ymid i.location i.popsize munic_gdp i.state_school if policy==0 ,  time(year) tr(treat) stub(rif2) dripw
        
        Doubly robust difference-in-differences              Number of obs = 2,032,024
        Outcome model  : least squares
        Treatment model: inverse probability
        ------------------------------------------------------------------------------
                     | Coefficient  Std. err.      z    P>|z|     [95% conf. interval]
        -------------+----------------------------------------------------------------
        ATET         |
               treat |
           (1 vs 0)  |   .0253289   .0062734     4.04   0.000     .0130333    .0376245
        ------------------------------------------------------------------------------
        
        . sum rif2att , d
        
                                   rif2att
        -------------------------------------------------------------
              Percentiles      Smallest
         1%    -16.02436      -953.9732
         5%    -4.304641      -905.3324
        10%    -2.991486      -877.8985       Obs           2,032,024
        25%    -1.426803      -812.6595       Sum of wgt.   2,032,024
        
        50%     .0161535                      Mean           .0253289
                                Largest       Std. dev.      8.942679
        75%     1.484337       1007.817
        90%     3.095296       1056.131       Variance        79.9715
        95%     4.399576       1302.176       Skewness       4.728444
        99%     15.83226       1311.288       Kurtosis        1675.18


        my data:

        Code:
        * Example generated by -dataex-. For more info, type help dataex
        clear
        input float scores byte(age sex maeduc location popsize) int year byte(treat policy2) double rif2att
          2.1880713 17 2 4 1 2 2010 0 0  -2.108615013114744
           .6898354 17 1 4 1 3 2010 0 0   -.887499983940696
           .6624395 17 2 4 1 3 2010 0 0   .4969700701448334
          1.8560517 19 2 4 1 3 2010 0 0   11.93838102427453
          .29095936 17 2 6 1 3 2010 0 0  -3.003514648669609
           .7247713 19 1 5 1 2 2010 0 0  1.1126663743395977
         -1.5315053 18 2 3 1 1 2010 0 0  -41.21447857064167
           1.521267 17 2 3 1 3 2010 0 0   2.262782829250008
           .8320931 17 2 4 1 3 2010 0 0 -.05592393662715369
          .03484334 17 2 4 1 3 2010 0 0  -2.347317806588566
          -.5158417 18 1 4 1 3 2010 0 0 -14.651958523524792
           .6161923 18 1 6 1 3 2010 0 0  -.9896130082723955
          -.3024542 17 1 6 1 2 2010 0 0  2.1652444464276845
          -.6761962 18 1 5 1 3 2010 0 0  -.7347835219172532
           .4083346 22 1 4 1 2 2010 0 0  26.956594693099152
           .4068266 17 1 4 1 3 2010 0 0  -3.025821214328673
           .4153721 17 2 3 1 1 2010 0 0  12.152451447508527
          -.9906225 18 2 2 1 2 2010 0 0  -69.48923468380823
           1.921651 17 2 4 1 3 2010 0 0   2.401802539775441
          1.3958486 18 2 4 1 3 2010 0 0  .02717134189433132
          .31760055 17 2 3 1 1 2010 0 0   .8337790369346683
          1.7230928 17 2 6 1 3 2010 0 0  1.9642725429392918
           .8760782 19 2 4 1 2 2010 0 0   5.886918192477614
            .547577 18 2 5 1 3 2010 0 0  -.9057154068523958
          2.4869146 18 2 5 1 2 2010 0 0  -5.102196571965775
           .7800663 17 1 6 1 3 2010 0 0 -.22930403253520598
          .06600999 17 2 4 1 3 2010 0 0  -3.016041166893883
          1.4994005 17 2 5 1 3 2010 0 0  .04163480892675789
          1.8012598 17 1 5 1 1 2010 0 0 -.03847635408722949
          .44176245 17 2 4 1 3 2010 0 0  -.5159549749371428
           .8175157 17 1 4 1 2 2010 0 0  1.0476292803327028
           .6822948 17 2 6 1 2 2010 0 0   .1112225563241952
         -1.0949284 17 2 4 1 3 2010 0 0  -1.979147834144271
            -.45552 17 1 5 1 2 2010 0 0  1.2333970989871856
          1.2764623 17 2 4 1 2 2010 0 0  .22711194537256793
         -.55203474 17 2 4 1 3 2010 0 0  -9.001959707899845
          .04012203 17 2 4 1 3 2010 0 0 -2.0502740897711567
         -.07122129 20 1 3 1 1 2010 0 0  38.974895779339576
           .9700791 18 2 4 1 3 2010 0 0  1.3117001042990897
            1.76205 17 1 4 1 1 2010 0 0    9.01030298902866
           2.905395 18 1 5 1 3 2010 0 0  -.6241078718779147
          -.3627759 17 2 4 1 2 2010 0 0  -12.66066570293923
           .9444419 17 1 6 1 2 2010 0 0 -.19416693111371802
           1.688911 18 1 4 1 3 2010 0 0   5.402123703010223
           .2557712 16 2 5 1 3 2010 0 0   .4373887140405821
         -1.1029714 17 1 6 1 2 2010 0 0 -3.5819662085760258
           .4741855 17 2 5 1 3 2010 0 0 -.21488108696177288
          -.3072297 17 1 4 1 3 2010 0 0 -10.974035211482592
          -.7030898 17 1 4 1 3 2010 0 0  -8.114072097907837
          -.3197967 17 2 2 1 1 2010 0 0 -62.175483891858896
          -.7385288 18 2 3 1 3 2010 0 0 -15.082248746049483
            1.81081 17 2 6 1 2 2010 0 0  5.9029232390533295
           .6071444 18 2 3 1 2 2010 0 0  1.8937796156609987
           2.723174 17 2 6 1 2 2010 0 0 -1.6061977756752117
          1.5267965 18 1 4 1 3 2010 0 0  3.8227343265849107
          .09742746 17 2 4 1 3 2010 0 0  -2.835893567890441
          1.6207973 18 2 4 1 2 2010 0 0   7.042922349243504
           .6878242 17 2 5 1 3 2010 0 0  -.5173319136348004
           .9472071 17 1 5 1 2 2010 0 0  .11209566168814106
          .53501004 17 2 6 1 1 2010 0 0 -1.6453831478745786
          -.6163777 17 2 3 1 2 2010 0 0 -17.365001836275773
           .3490188 17 2 4 1 3 2010 0 0  -.5816104925022387
           2.735489 18 1 5 1 3 2010 0 0   2.327547697939817
          .55310655 17 1 4 1 3 2010 0 0  -1.469782262546989
         -1.2577964 17 2 5 1 1 2010 0 0  1.7168181942486889
          1.8952606 17 1 4 1 2 2010 0 0   8.490545944447199
           .7599587 18 1 4 1 2 2010 0 0  3.3456687801756613
            1.76205 17 1 5 1 2 2010 0 0    .243274332813464
          1.8399656 17 1 5 1 1 2010 0 0  4.8516981981496485
         -.05061202 18 2 5 1 3 2010 0 0 -3.8392308842269522
           .5792453 18 2 5 1 1 2010 0 0   .2883367236798838
           .4623725 18 1 5 1 3 2010 0 0   .7579834162104133
          1.0731286 19 1 3 1 3 2010 0 0   15.37880698152951
           .7551839 17 2 4 1 2 2010 0 0  .49856416379892976
          2.3172603 18 2 5 1 3 2010 0 0  3.2253940941345705
           .6543964 17 2 4 1 3 2010 0 0  .12845303593599675
          1.5971713 17 1 4 1 2 2010 0 0   .7098854091651324
          1.2048303 17 2 5 1 3 2010 0 0  .02871438799080352
          1.3960994 18 1 5 1 2 2010 0 0  .06690414077634024
           .2391827 17 1 4 1 2 2010 0 0 -3.5617560126159415
           .8755751 17 1 4 1 3 2010 0 0 -.13416128985113435
           .2321452 17 2 4 1 2 2010 0 0 -1.4574450700406196
           1.467983 17 1 4 1 2 2010 0 0   4.466971144051167
          -.5414785 17 2 5 1 2 2010 0 0   4.382081800980223
           .3037772 17 2 4 1 1 2010 0 0 -1.4173456898879264
           1.470245 17 2 6 1 1 2010 0 0    -.51888569876383
         -1.0511951 18 2 4 1 1 2010 0 0  -7.499841017783167
        -.036285467 18 1 3 1 1 2010 0 0 -3.2270328883416837
            1.81081 17 2 5 1 2 2010 0 0   .3737696880227126
          .50635695 17 2 4 1 2 2010 0 0 -1.0900491636147591
           1.419977 18 2 2 1 2 2010 0 0  20.719351816552866
           2.066423 17 1 4 1 3 2010 0 0    5.73817905887188
           .6285084 17 1 5 1 3 2010 0 0  -.6173922217845376
           .9165436 18 2 3 1 3 2010 0 0  28.992029021300972
           1.501411 18 2 4 1 2 2010 0 0   8.840326645120548
           .2773868 17 1 4 1 2 2010 0 0 -2.0573360034786523
          .01071494 18 2 4 1 1 2010 0 0 .020211940924463687
        -.006878382 18 2 2 1 1 2010 0 0 -10.305251439003298
          .12884493 17 2 4 1 3 2010 0 0 -1.2092669870332968
          .41210455 17 1 4 1 3 2010 0 0  -.6470534020410958
        end
        label values sex sex
        label def sex 1 "M", modify
        label def sex 2 "F", modify
        label values maeduc maeduc
        label def maeduc 2 "Primary", modify
        label def maeduc 3 "Middle school", modify
        label def maeduc 4 "High school", modify
        label def maeduc 5 "University", modify
        label def maeduc 6 "Postgrad", modify
        label values location location
        label def location 1 "Urban", modify
        label values popsize pop
        label def pop 1 "small", modify
        label def pop 2 "medium", modify
        label def pop 3 "large", modify

        • #5
          I was going to suggest asking Professor Rios, but he did so while I went back to sleep, apparently.

          The only additional thing I have to add is something that seems strange to me: you describe Method I as using pre-treatment variables and Method II as using pre- and post-treatment variables...

          But, these yield the same results. I don't see any differences in the coefficients or anything, you appear to be estimating the same model, no?

          If you're really working with 2 time periods (one unit before and one unit of time after), as Fernando says, you can't really validate parallel trends. So, my advice to you, if possible, would be to gather more data on the matter. More pre-intervention data, that is.

          • #6
            Hi FernandoRios, thanks for your help. No need to rush an answer. In fact my data have 2 periods before and 6 periods after, and I am aggregating them into "before" and "after" (I couldn't work with csdid, although I tried). Anyway, I know I can't test PT, but since I have two pre-treatment periods I could plot them or do an event study, and it looks like PT doesn't hold when I add covariates. So I am trying alternatives to correct for it by testing different DiD estimators, including quantile regressions. I thought weighting could be a good strategy.
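With two pre-policy years, one way to look at pre-trends is a placebo comparison that treats 2011 as a fake "post" period. A minimal sketch, assuming 2010 and 2011 are the only pre-policy years in the data and reusing the covariate list from the posts above:

```stata
* Placebo sketch: "effect" between 2010 and 2011, before anyone was treated
drdid scores age c.age#c.age i.sex i.maeduc ymid i.location i.popsize ///
    munic_gdp i.state_school if inlist(year, 2010, 2011), ///
    time(year) tr(treat) dripw
```

An estimate clearly different from zero here would point to diverging trends before the policy. Note that this sample appears to be the same as the one selected by if policy==0 in Method I, so under this reading the Method I table would be a placebo estimate rather than the policy effect. That is an interpretation, not something confirmed in the thread.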

            • #7
              Hi Jared Greathouse, you are right, I posted the wrong output, but the right one is there in the first post. I am wondering if I can edit. Thanks!! Yes, I am glad he showed up. And yes, I do have more pre-treatment data.

              • #8
                More than two periods of pre intervention data?

                • #9
                  Originally posted by Jared Greathouse:
                  More than two periods of pre intervention data?
                  No, unfortunately.

                  • #10
                    This is the right code, correcting the one I posted by mistake.

                    Code:
                     drdid scores age c.age#c.age i.sex i.maeduc ymid i.location i.popsize munic_gdp i.state_school , time(policy) tr(treat) all
                    
                    
                    Doubly robust difference-in-differences estimator summary
                    ------------------------------------------------------------------------------
                                 | Coefficient  Std. err.      z    P>|z|     [95% conf. interval]
                    -------------+----------------------------------------------------------------
                    ATET         |
                           dripw |    .027256   .0036539     7.46   0.000     .0200944    .0344175
                       dripw_rc1 |   .0102055   .0037384     2.73   0.006     .0028783    .0175327
                           drimp |   .0118586   .0030361     3.91   0.000      .005908    .0178093
                       drimp_rc1 |   .0249633   .0030417     8.21   0.000     .0190017    .0309249
                             reg |   .0180872   .0024844     7.28   0.000     .0132179    .0229566
                             ipw |  -.0164215   .0037662    -4.36   0.000    -.0238031   -.0090399
                          stdipw |  -.0088338   .0042984    -2.06   0.040    -.0172585    -.000409
                    ------------------------------------------------------------------------------
                    Note: This table is provided for comparison across estimations only. You cannot use it to compare estimates across different estimators
                    dripw :Doubly Robust IPW
                    drimp :Doubly Robust Improved estimator
                    reg   :Outcome regression or Regression augmented estimator
                    ipw   :Abadie(2005) IPW estimator
                    stdipw:Standardized IPW estimator
                    sipwra:IPW and Regression adjustment estimator.
                    Last edited by Sandra Macedo; 15 Apr 2022, 04:47.

                    • #11
                      Something seems odd in this code.
                      Policy doesn't strike me as a "time" variable.
                      Also, it seems that you are trying to control for state and school fixed effects, but you have repeated cross-sections?
                      If the treatment varies across students within schools, then it's fine.
                      But if the treatment happens for some schools and not others, then you cannot use school fixed effects as you are doing.

                      HTH

                      • #12
                        Hi FernandoRios, thanks again for helping out. I really appreciate your help. I am not using school fixed effects; state_school is a variable that indicates the state where the student's school of origin is located (maybe I should just rename it). I think my problem is in generating the time and treat variables. The treat variable = 1 if the individual is treated and 0 if not targeted by the policy. There are treated/untreated before and after the policy. What I did was aggregate the years (the policy year is 2012) into policy = 1 if year>=2012 and 0 if before 2012. (I have only 2 years before; I could add 2009, but I dropped it because other policies were implemented that year.)

                        Should I use treat_policy=treat*policy and then rewrite the code as below?
                        drdid scores age c.age#c.age i.sex i.maeduc ymid i.location i.popsize munic_gdp i.state_school , time(year) tr(treat_policy) dripw

                        Since there are different years and everyone is in fact treated only once, should I try to use csdid instead? I couldn't generate the gvar, though, which I think is not applicable to my data. I would appreciate it if you could help with that.
                        Thank you!
                        Sandra
                        Last edited by Sandra Macedo; 15 Apr 2022, 07:31.

                        • #13
                          Of course, if I do that I would have zero untreated subject in my post period, so I get the error "You do not have 2x2 design"
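A quick cross-tabulation shows which cell of the 2x2 design goes missing. This is a sketch under the definitions given in the thread (policy year 2012, treat marking eligibility in every year); post and treat_post are hypothetical names chosen so that the existing policy and treat_policy variables are not overwritten.

```stata
* post = 1 from 2012 onward, 0 before; missing years stay missing
generate byte post = (year >= 2012) if !missing(year)

* treat marks the targeted group in every year, so this should show four cells
tabulate treat post

* treat*post is 0 whenever post is 0, so no unit is "treated" before 2012
generate byte treat_post = treat * post
tabulate treat_post post
```

The second table has an empty cell by construction, which is consistent with the "You do not have 2x2 design" error. The group indicator passed to tr() needs to identify the targeted group in both periods, which is what treat already does.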


                          • #14
                            Let me ask you this:
                            At what level is the treatment implemented (individual, school, state)?
                            Say it's school;
                            you could create the GVAR as follows:
                            bysort school :egen minyear=min(year) if treated==1
                            This will give you the earliest treated year
                            Next
                            bysort school:egen gvar=max(minyear)
                            So it reassigns gvar even for periods before treatment

                            finally:
                            replace gvar=0 if gvar==.
                            To make Nevertreated=0

                            After you do that, do:
                            tab year gvar

                            and show me what you get.

                            • #15
                              Fernando, the treatment was implemented nationally at the individual level in 2012; before 2012 nobody was treated (the variable treat includes all those eligible before and after the policy). So when I tried your code, I didn't sort by school.
                              I guess I am not getting the logic here.

                              What I did was:

                              Code:
                               egen minyear=min(year) if treat==1
                              (2,620,026 missing values generated)
                              
                              . egen gvar=max(minyear)
                              
                               replace gvar2=0 if gvar2==.
                              (0 real changes made)
                              
                              
                              . tab gvar year
                              
                                         |                                             Enem year
                                    gvar |      2010       2011       2012       2013       2014       2015       2016       2017       2018 |     Total
                              -----------+---------------------------------------------------------------------------------------------------+----------
                                    2010 |   957,340  1,074,684  1,112,430  1,221,800  1,231,342  1,292,287  1,367,401  1,209,447  1,026,272 |10,493,003 
                              -----------+---------------------------------------------------------------------------------------------------+----------
                                   Total |   957,340  1,074,684  1,112,430  1,221,800  1,231,342  1,292,287  1,367,401  1,209,447  1,026,272 |10,493,003 
                              
                              
                              *including only effectively treated subjects (treat_policy = treat*policy)
                              
                               egen minyear2=min(year) if treat_policy==1
                              (4,143,550 missing values generated)
                              
                              . egen gvar2=max(minyear2)
                              
                               replace gvar2=0 if gvar2==.
                              
                              (0 real changes made)
                              
                              . tab gvar2 year
                                      |                                             Enem year
                                   gvar2 |      2010       2011       2012       2013       2014       2015       2016       2017       2018 |     Total
                              -----------+---------------------------------------------------------------------------------------------------+----------
                                    2012 |   957,340  1,074,684  1,112,430  1,221,800  1,231,342  1,292,287  1,367,401  1,209,447  1,026,272 |10,493,003 
                              -----------+---------------------------------------------------------------------------------------------------+----------
                                   Total |   957,340  1,074,684  1,112,430  1,221,800  1,231,342  1,292,287  1,367,401  1,209,447  1,026,272 |10,493,003
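The tables above show why gvar comes out constant: without a by-group, egen min(year) returns the earliest year among all flagged rows, and egen max() then copies that single value to every observation, so the "replace ... if gvar==." step has nothing to change. A minimal sketch of an alternative, assuming that everyone with treat == 1 is first exposed in 2012 and that treat == 0 is never treated (gvar_nat is a hypothetical name, and it is an assumption that csdid accepts the same covariate list as drdid):

```stata
* Assumption: eligible units are first treated in 2012; the rest are never treated
generate int gvar_nat = cond(treat == 1, 2012, 0) if !missing(treat)
tabulate year gvar_nat

* Repeated cross-sections: ivar() is left out
csdid scores age c.age#c.age i.sex i.maeduc ymid i.location i.popsize ///
    munic_gdp i.state_school, time(year) gvar(gvar_nat) method(dripw)

* Event-study aggregation, which also reports pre-2012 periods
estat event
```

The tabulation should now show two columns, 0 and 2012, with observations in every year. Whether this single-cohort setup matches the policy's actual eligibility rules is something only the data owner can confirm.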
