Difference-in-differences

Clyde Schechter

Join Date: Apr 2014

Posts: 29955
#46

27 Apr 2021, 10:44

OK, so it looks like the model is working now.

Moreover, in the original paper they used [pweight= wtfinl]. I was wondering if I should do the same...

Are you using the same data the original paper used? Either way, if it is survey data and wtfinl is the sampling weight, then, yes, you must use it.

Cairone Federica

Join Date: Apr 2021
Posts: 31

#47

27 Apr 2021, 10:54

Yes, I'm using the same data. After adding the weights, I got the following results:

Code:

  regress register i.ps#i.event_time_var i.statefip i.year [pweight= wtfinl], cluster(statefip) 
(sum of wgt is 2,592,730,808.105)
note: 0b.ps#1.event_time_var identifies no observations in the sample
note: 0b.ps#3.event_time_var identifies no observations in the sample
note: 0b.ps#5.event_time_var identifies no observations in the sample
note: 0b.ps#7.event_time_var identifies no observations in the sample
note: 0b.ps#8.event_time_var identifies no observations in the sample
note: 1.ps#8.event_time_var omitted because of collinearity

Linear regression                               Number of obs     =  1,350,537
                                                F(20, 50)         =          .
                                                Prob > F          =          .
                                                R-squared         =     0.0163
                                                Root MSE          =     .42117

                                       (Std. Err. adjusted for 51 clusters in statefip)
---------------------------------------------------------------------------------------
                      |               Robust
             register |      Coef.   Std. Err.      t    P>|t|     [95% Conf. Interval]
----------------------+----------------------------------------------------------------
    ps#event_time_var |
                0#-4  |          0  (empty)
                0#-2  |          0  (empty)
                 0#0  |          0  (empty)
                 0#2  |          0  (empty)
                 0#3  |          0  (empty)
                1#-5  |  -.0059036   .0279784    -0.21   0.834    -.0620999    .0502928
                1#-4  |  -.0034529   .0136204    -0.25   0.801    -.0308103    .0239045
                1#-2  |   .0019458   .0172818     0.11   0.911    -.0327656    .0366573
                 1#0  |  -.0046085   .0109479    -0.42   0.676     -.026598     .017381
                 1#2  |   .0026346    .008235     0.32   0.750    -.0139059    .0191751
                 1#3  |          0  (omitted)
                      |
             statefip |
              alaska  |   .0330066   .0002879   114.66   0.000     .0324284    .0335848
             arizona  |  -.0889817   .0005167  -172.20   0.000    -.0900196   -.0879439
            arkansas  |  -.0902335   .0000807 -1118.29   0.000    -.0903955   -.0900714
          california  |  -.0343968   .0210691    -1.63   0.109    -.0767153    .0079217
            colorado  |  -.0079093   .0245543    -0.32   0.749    -.0572281    .0414094
         connecticut  |   .0043897   .0002463    17.82   0.000      .003895    .0048844
            delaware  |  -.0352314   .0209989    -1.68   0.100    -.0774089    .0069461
district of columbia  |   .0291851   .0217047     1.34   0.185    -.0144099    .0727802
             florida  |  -.0312628    .019135    -1.63   0.109    -.0696965    .0071709
             georgia  |  -.0645631   .0003268  -197.54   0.000    -.0652195   -.0639066
              hawaii  |  -.1235896   .0085545   -14.45   0.000    -.1407719   -.1064073
               idaho  |  -.0754017   .0004099  -183.95   0.000     -.076225   -.0745784
            illinois  |   .0048417   .0001745    27.75   0.000     .0044913    .0051921
             indiana  |  -.0655986   .0000737  -889.94   0.000    -.0657467   -.0654506
                iowa  |   .0024194   .0001175    20.59   0.000     .0021833    .0026554
              kansas  |  -.0555536   .0000425 -1306.93   0.000     -.055639   -.0554683
            kentucky  |  -.0420218   .0001975  -212.77   0.000    -.0424184   -.0416251
           louisiana  |   .0195673    .024919     0.79   0.436     -.030484    .0696185
               maine  |   .0779861   .0232108     3.36   0.001     .0313657    .1246064
            maryland  |  -.0122415   .0214198    -0.57   0.570    -.0552646    .0307815
       massachusetts  |   .0237636   .0248643     0.96   0.344    -.0261778     .073705
            michigan  |   .0447685   .0001247   359.01   0.000      .044518    .0450189
           minnesota  |   .0817299   .0000901   906.93   0.000     .0815489    .0819109
         mississippi  |   .0253358   .0001167   217.10   0.000     .0251014    .0255702
            missouri  |   .0091647   .0000967    94.78   0.000     .0089705    .0093589
             montana  |  -.0180892   .0001464  -123.60   0.000    -.0183832   -.0177953
            nebraska  |  -.0277487   .0000713  -389.15   0.000     -.027892   -.0276055
              nevada  |  -.1345669   .0008654  -155.50   0.000    -.1363051   -.1328287
       new hampshire  |  -.0446832   .0002048  -218.18   0.000    -.0450946   -.0442719
          new jersey  |  -.0090549   .0001472   -61.52   0.000    -.0093506   -.0087593
          new mexico  |  -.0723982   .0001701  -425.64   0.000    -.0727398   -.0720566
            new york  |  -.0334974   .0002308  -145.15   0.000    -.0339609   -.0330339
      north carolina  |  -.0390187    .020645    -1.89   0.065    -.0804855    .0024481
        north dakota  |   .1185948   .0000626  1895.60   0.000     .1184692    .1187205
                ohio  |   -.034322   .0000898  -382.35   0.000    -.0345023   -.0341417
            oklahoma  |  -.0614503   .0000757  -811.51   0.000    -.0616024   -.0612982
              oregon  |   .0249407   .0192274     1.30   0.201    -.0136787    .0635601
        pennsylvania  |  -.0733064   .0001974  -371.43   0.000    -.0737028     -.07291
        rhode island  |  -.0008808   .0218405    -0.04   0.968    -.0447487    .0429871
      south carolina  |  -.0750477   .0003137  -239.21   0.000    -.0756779   -.0744176
        south dakota  |   .0038458    .000114    33.73   0.000     .0036168    .0040749
           tennessee  |  -.0687819   .0001918  -358.61   0.000    -.0691671   -.0683966
               texas  |  -.0574359   .0002763  -207.85   0.000     -.057991   -.0568809
                utah  |   -.069823   .0004109  -169.91   0.000    -.0706484   -.0689975
             vermont  |   .0075074    .000066   113.73   0.000     .0073749      .00764
            virginia  |  -.0461722   .0001268  -364.18   0.000    -.0464269   -.0459176
          washington  |  -.0176031   .0002531   -69.54   0.000    -.0181115   -.0170946
       west virginia  |  -.0857582   .0002803  -306.00   0.000    -.0863211   -.0851953
           wisconsin  |   .0491407    .000063   780.10   0.000     .0490142    .0492672
             wyoming  |  -.0763217   .0000806  -947.16   0.000    -.0764836   -.0761599
                      |
                 year |
                1984  |   .0506262   .0052681     9.61   0.000      .040045    .0612074
                1986  |   .0106707   .0051308     2.08   0.043     .0003652    .0209762
                1988  |   .0367775   .0068351     5.38   0.000     .0230489    .0505062
                1990  |    .001941   .0066751     0.29   0.772    -.0114663    .0153483
                1992  |   .0659198   .0067676     9.74   0.000     .0523267     .079513
                1994  |   .0069078   .0079755     0.87   0.391    -.0091115    .0229271
                1996  |   .0581916   .0078551     7.41   0.000     .0424141    .0739691
                1998  |   .0265773    .009038     2.94   0.005     .0084239    .0447307
                2000  |   .0727341   .0087036     8.36   0.000     .0552525    .0902157
                2002  |   .0391415   .0108393     3.61   0.001     .0173701    .0609129
                2004  |   .1061117   .0082461    12.87   0.000     .0895488    .1226745
                2006  |   .0676702   .0095717     7.07   0.000     .0484449    .0868955
                2008  |   .1207354   .0090675    13.32   0.000     .1025228     .138948
                2010  |   .0732629   .0089508     8.19   0.000     .0552847     .091241
                2012  |   .1132496   .0092579    12.23   0.000     .0946545    .1318448
                2014  |   .0679406   .0091487     7.43   0.000     .0495649    .0863163
                      |
                _cons |   .7366532   .0063659   115.72   0.000      .723867    .7494395
---------------------------------------------------------------------------------------

Just a few more questions:
1. What about t=1? It does not appear. And how should I interpret the empty levels of the time variable?
2. The model I have just shown you is the DD, which therefore identifies the difference between young individuals in the states that have introduced the law and those in the states that have not.
3. To implement the DDD instead, using the elderly as a placebo group, do I just need to add the age variable?
In this regard, is the following regression correct?

regress register i.ps#i.event_time_var#age18_24 i.age i.statefip i.year [pweight= wtfinl], cluster(statefip)

4. To capture a possibly different effect among young people by sex and ethnicity, what regression should I run instead? I've been thinking about this, but I'm not sure: regress register i.ps#i.event_time_var#age18_24#i.sex#i.black/i.hispanic i.sex i.black/i.hispanic i.age i.statefip i.year [pweight= wtfinl], cluster(statefip)

Last edited by Cairone Federica; 27 Apr 2021, 10:58.


Cairone Federica

Join Date: Apr 2021

Posts: 31
#48

27 Apr 2021, 11:07

In principle, I should get results very similar to these using the simple model, but I don't think my results are correct...
Clyde Schechter

Join Date: Apr 2014

Posts: 29955
#49

27 Apr 2021, 11:11

1. What about t=1? It does not appear. And how should I interpret the empty levels of the time variable?

It does not appear because, according to the table you show in #45, there are no observations with t = 1 in your estimation sample. Either there aren't any in the data set at all, or they were excluded from the estimation sample because of missing data on some other variable. You need to check into that. There may be a gap in your data. Or maybe we still don't have the calculation of the event_time variable correct.
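
A minimal sketch of how that check could be done, assuming the variable names used in the commands in this thread (event_time_var, ps, year) and that the second tabulation is run immediately after the regress command, so that e(sample) still refers to it:

Code:

* Which event-time levels exist in the data at all, by treatment group?
tabulate event_time_var ps, missing

* Which of them survive into the estimation sample of the last regress?
tabulate event_time_var ps if e(sample), missing

* Which survey years are present in the treated states?
tabulate year if ps == 1

If a level shows up in the first table but not in the second, observations are being dropped for missing data on another variable; if it is absent from both, the gap is in the data or in the construction of the event-time variable.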

2. The model I have just shown you is the DD, which therefore identifies the difference between young individuals in the states that have introduced the law and those in the states that have not.

The coefficients of the interaction terms are the estimates of the effect of pre-registration, at various times both after implementation and in anticipation of it, on the outcome (register, whatever that represents), compared to the effect in the omitted reference category of event_time_var.

3. To implement the DDD instead, using the elderly as a placebo group, do I just need to add the age variable?
In this regard, is the following regression correct?

Not quite. You also need to include i.age18_24 as a separate variable, not just mention it in the interaction.
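
As a rough sketch only, the proposed command with that one change might look like this. It assumes age18_24 is a 0/1 indicator and keeps everything else as posted; the /// line continuation works in a do-file, not when typed interactively:

Code:

* age18_24 now enters on its own as well as inside the interaction
regress register i.ps#i.event_time_var#i.age18_24 i.age18_24 i.age ///
    i.statefip i.year [pweight = wtfinl], vce(cluster statefip)

If age18_24 is computed directly from age, it is collinear with i.age, and Stata would be expected to drop a redundant term and say so in a note.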

4. To capture a possibly different effect among young people by sex and ethnicity, what regression should I run instead? I've been thinking about this, but I'm not sure: regress register i.ps#i.event_time_var#age18_24#i.sex#i.black/i.hispanic i.sex i.black/i.hispanic i.age i.statefip i.year [pweight= wtfinl], cluster(statefip)

Well, again, sex and the other variables need to appear separately as well as in the interaction. And if you decide to do race/ethnicity-sex specific analyses, then the sex#black interaction also needs to be separately mentioned in the regression. It is going to be quite challenging to interpret the results of these complicated models with high-order interaction terms.

Added: As I think about it, apart from the difficulty of interpreting models with high-order interaction terms, I don't know if it is reasonable to assume that the state effects and year effects are the same across sexes and race/ethnic groups. So it might be better to do those as subset analyses instead. Your sample is quite large, so that shouldn't be a problem.
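
A subset analysis along those lines could be sketched as follows, taking the DD specification from #47 as the starting point and assuming sex is a numeric variable. The same loop could be repeated over a race/ethnicity variable:

Code:

* One regression per level of sex, each with its own state and year effects
levelsof sex, local(sex_levels)
foreach s of local sex_levels {
    display as text _newline "sex == `s'"
    regress register i.ps#i.event_time_var i.statefip i.year ///
        if sex == `s' [pweight = wtfinl], vce(cluster statefip)
}

Each pass estimates separate state and year effects for that subgroup, which is the point of splitting the sample rather than adding further interaction terms.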

Added: Crossed with #48.

Last edited by Clyde Schechter; 27 Apr 2021, 11:18.
Cairone Federica

Join Date: Apr 2021

Posts: 31
#50

27 Apr 2021, 11:21

Can you show me an example of what you mean by saying "You need to also include i.age18_24 as a separate variable, not just mentioning it in the interaction"?

Thank you, Clyde (for every single reply)!
Clyde Schechter

Join Date: Apr 2014

Posts: 29955
#51

27 Apr 2021, 11:37

Re #48. None of the models in that table are comparable to what you have run. For one thing, the table shows the inclusion of state-by-year fixed effects. But you don't have those: you have state effects and year effects, but not state by year. Also all of them involve age group, which is not yet in your model. Perhaps even more important, they show only values of time >= 0. It is unclear whether they are simply not reporting on values of time < 0, or whether they didn't even include negative time values in their model. If the latter, then that is a huge difference between the models.

I am also not clear whether the event time variable we constructed here is the same as what they used. In the passage from the article you showed all the way back in #1, their definition for it simply doesn't say what to do for the ps = 0 group. I suggested coding it as -5 throughout in that group, which is, I think, a sensible approach. But it may not be what they did. So if you are going to compare your results to theirs, you need to search the article to find out how they handled that. And if it isn't in the article at all, you need to contact them and ask.

If your purpose, at least at this stage, is to replicate their study, then you need to use exactly the same model. Even minor additions or omissions from a model can have a large impact on the results.
Cairone Federica

Join Date: Apr 2021

Posts: 31
#52

27 Apr 2021, 11:50

They combine the DD models for the two age groups (young and old) of voters and develop a triple-difference (hereafter DDD) regression design, using old voters as placebo.

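Formally, the empirical model to be tested is as follows:

$$Y_{i,a,s,t} = \delta_{s,t} + \delta_{a,t} + \delta_{a,s} + \pi \cdot X_{i,a,s,t} + \mathbf{1}(18 \le a \le 24) \cdot \sum_{\tau=-5}^{3} \beta_\tau \cdot P_s \cdot \mathbf{1}(t - T_s = \tau) + \varepsilon_{i,a,s,t}$$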

The treatment variable is constructed here by interacting $P_s \cdot \mathbf{1}(t - T_s = \tau)$ with the age-group dummy $\mathbf{1}(18 \le a \le 24)$, which is set to 1 if the respondent belongs to the young group. The identification assumption for consistency of the estimates now relies on the absence of shocks that differentially affect the political participation of the young only in the preregistration states during the sample period.
Table 1, which I showed you, summarizes the magnitude and the statistical significance of the DDD event study estimates for both voter registration and turnout. For the sake of brevity, even though the underlying model includes the pre-event interaction terms, they display only the $\beta_\tau$ for $\tau \ge 0$.
The fact that the effect lasts up to three elections is partly explained by the presence in the sample of a few treated states with such a long post-treatment exposure. In columns 3 and 6, they finally estimate the average changes in the outcomes following the event, controlling again for respondents' characteristics. To identify the post-treatment time, they estimate a specification of regression (2) that replaces $\mathbf{1}(t - T_s = \tau)$ with $\mathbf{1}(t \ge T_s)$, an indicator variable set to 1 if individual $i$ is resident in a state $s$ that implements preregistration at some point and responds in any election year $t$ after (and including) $T_s$. Hence, the treatment effect is captured here by the coefficient of the triple interaction term $\mathbf{1}(18 \le a \le 24) \cdot P_s \cdot \mathbf{1}(t \ge T_s)$.
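
To make that last specification concrete, here is a sketch of how it might be set up in Stata. The variable treat_year is hypothetical (the first election year with preregistration in the state, missing for never-treated states), age18_24 is assumed to be 0/1, and the respondent-level controls are left out because their names are not given in this thread:

Code:

* Post-treatment indicator: treated state, election year at or after adoption
generate byte post = ps == 1 & year >= treat_year & !missing(treat_year)

* Triple-interaction term: young respondent in a treated state after adoption
generate byte ddd = post * age18_24

regress register ddd i.statefip#i.year i.year#i.age18_24 ///
    i.statefip#i.age18_24 [pweight = wtfinl], vce(cluster statefip)

The coefficient on ddd would then play the role of the average post-treatment effect described above.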

Cairone Federica

Join Date: Apr 2021
Posts: 31

#53

27 Apr 2021, 14:27

Hi Clyde! I tried to compare our model with that of the authors. It is very similar (I tried to adapt ours to see if we could get the same results). This is the model implemented by the authors (I show only some lines from the beginning and the end of the output):

Code:

eststo: reg register F5_last F4_pre F3_pre F2_pre uno F0_pre L1_pre L2_pre L3_last i.year#i.age18_24 i.statefip#i.age18_24 i.statefip
> #i.year [pweight= wtfinl] , cluster(statefip)


Linear regression                               Number of obs     =  1,350,537
                                                F(26, 50)         =          .
                                                Prob > F          =          .
                                                R-squared         =     0.0518
                                                Root MSE          =     .41363

                                            (Std. Err. adjusted for 51 clusters in statefip)
--------------------------------------------------------------------------------------------
                           |               Robust
                  register |      Coef.   Std. Err.      t    P>|t|     [95% Conf. Interval]
---------------------------+----------------------------------------------------------------
                   F5_last |   .0022194   .0167672     0.13   0.895    -.0314585    .0358972
                    F4_pre |  -.0030589   .0188419    -0.16   0.872     -.040904    .0347862
                    F3_pre |    .002034   .0171228     0.12   0.906    -.0323582    .0364263
                    F2_pre |  -.0167254   .0222961    -0.75   0.457    -.0615084    .0280576
                       uno |          0  (omitted)
                    F0_pre |   .0271859   .0195184     1.39   0.170    -.0120179    .0663896
                    L1_pre |   .0360784   .0184065     1.96   0.056    -.0008922    .0730491
                    L2_pre |   .0274993   .0145574     1.89   0.065    -.0017402    .0567387
                   L3_last |  -.0157818   .0249377    -0.63   0.530    -.0658707     .034307



             wyoming#2008  |          0  (omitted)
             wyoming#2010  |          0  (omitted)
             wyoming#2012  |          0  (omitted)
             wyoming#2014  |          0  (omitted)
                           |
                     _cons |   .7438226    .001019   729.93   0.000     .7417759    .7458694
--------------------------------------------------------------------------------------------

While, this is our model:

Code:

 regress register i.ps##i.event_time_var i.year#i.age18_24 i.statefip#i.age18_24 i.statefip#i.year i.ps i.event_time_var [pweight= wtf
> inl] , cluster(statefip)

Linear regression                               Number of obs     =  1,350,537
                                                F(20, 50)         =          .
                                                Prob > F          =          .
                                                R-squared         =     0.0518
                                                Root MSE          =     .41364

                                            (Std. Err. adjusted for 51 clusters in statefip)
--------------------------------------------------------------------------------------------
                           |               Robust
                  register |      Coef.   Std. Err.      t    P>|t|     [95% Conf. Interval]
---------------------------+----------------------------------------------------------------
                      1.ps |   .0527458   .0003385   155.82   0.000     .0520659    .0534257
                           |
            event_time_var |
                       -4  |   .0794331   .0002351   337.81   0.000     .0789608    .0799054
                       -2  |   .0500294   .0002504   199.80   0.000     .0495264    .0505323
                        0  |   .0735789   .0002162   340.30   0.000     .0731446    .0740131
                        2  |   .0867931   .0001782   486.92   0.000     .0864351    .0871511
                        3  |   .1024212   .0001811   565.45   0.000     .1020574     .102785
                           |
         ps#event_time_var |
                     0#-4  |          0  (empty)
                     0#-2  |          0  (empty)
                      0#0  |          0  (empty)
                      0#2  |          0  (empty)
                      0#3  |          0  (empty)
                     1#-4  |          0  (omitted)
                     1#-2  |          0  (omitted)
                      1#0  |          0  (omitted)
                      1#2  |          0  (omitted)
                      1#3  |          0  (omitted)




             wyoming#2010  |          0  (omitted)
             wyoming#2012  |          0  (omitted)
             wyoming#2014  |          0  (omitted)
                           |
                     _cons |   .7439386   .0009832   756.67   0.000     .7419639    .7459134
--------------------------------------------------------------------------------------------

Well, apart from the major differences in the event dummies, the remaining coefficients for -statefip#year-, -statefip#age18_24-, and -year#age18_24- are pretty much the same (the differences are very small).
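
One thing that may be worth checking, offered as a possibility rather than a diagnosis: ps and event_time_var vary only by state and year, so once i.statefip#i.year is in the model their interaction cannot be identified separately, which would be consistent with the (omitted) rows above. In the formula quoted in #52, the event terms are interacted with the young-group dummy. A sketch of that version, assuming age18_24 is 0/1:

Code:

* Event-time terms interacted with the young-group indicator
regress register i.ps#i.event_time_var#i.age18_24 ///
    i.year#i.age18_24 i.statefip#i.age18_24 i.statefip#i.year ///
    [pweight = wtfinl], vce(cluster statefip)

Stata would still be expected to omit the cells that are collinear with the fixed effects, and the reference level of event_time_var may need to be set by hand before the coefficients can be compared with the authors' output.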
