  • Difference-in-differences

    Hello everyone! I'm working on my thesis, and I have a big problem with how to apply the difference-in-differences methodology to the panel data I have.
    Our goal is to ascertain how the enactment of preregistration laws affects the political participation of young individuals and the distribution of public resources. We begin the analysis by empirically examining the effect of preregistration on young voter registration and turnout. To this end, we take advantage of the fact that preregistration reduces the cost of registering, and in turn the cost of voting, for the young relative to other age groups. Since the age of an individual is a dimension along which the treatment varies, along with time and space, we first split the set of individuals into two age groups: the young and the old. For each of them, we then use a difference-in-differences (hereafter DD) regression design, which compares electoral outcomes for individuals in states with preregistration and states without, before and after the voting reform is introduced.
    We operationalize the empirical strategy employing the following event study model based on a DD estimator:


    [Image: Model.png — the event study DD model equation]



    So, I created my dummy variables, as you can see below:
    [Image: Immagine 2021-04-12 202854.png — screenshot of the generated dummy variables]


    My problem now is figuring out how to implement the difference-in-differences I mentioned above, taking into consideration the two respective groups (young and old). Should I proceed through regression, or is it better to use other commands (for example the community-contributed -diff- command)? Moreover, in this context, with so much data, it is not clear to me how to identify the control and treatment groups.
    I would be grateful if any of you could help me; unfortunately I have only studied the simplest case of diff-in-diff, and I am also not very proficient with Stata.

  • #2
    To follow the equation you show, you need to set up some other variables. You need to create the variable called Ps in that equation--it is this one which will distinguish the treatment group from the control group.
    Code:
    by statefip, sort: egen ps = max(pre_reg)
    You also need a variable that gives the number of years since the first election following implementation in the state:
    Code:
    by statefip, sort: egen year_first_post_implement = min(cond(pre_reg, year, .))
    gen years_since_implement = year - year_first_post_implement
    replace years_since_implement = max(min(years_since_implement, 3), -5)
    Now, setting aside for the moment the distinction between the young and the older, your regression would then go like this:
    Code:
    xtset statefip
    xtlogit Y i.ps##i.years_since_implement i.year, fe
    where Y should be replaced by your actual 0/1 outcome variable (you don't show what it is called).

    To incorporate the possibility of separate impacts on young and old, you need to expand the interaction to a three-way one:
    Code:
    xtlogit Y i.age18_24##i.ps##i.years_since_implement i.year, fe
    Notes:
    1. I am giving the "bare bones" model here. There may be other variables that need to be taken into account. You may (or may not) need to use cluster-robust standard errors.
    2. None of this code is tested as usable example data was not provided. Therefore, beware of typos or other errors. In the future, when seeking help with coding, always show example data. And always use the -dataex- command to do that. If you are running version 16 or a fully updated version 15.1 or 14.2, -dataex- is already part of your official Stata installation. If not, run -ssc install dataex- to get it. Either way, run -help dataex- to read the simple instructions for using it. -dataex- will save you time; it is easier and quicker than typing out tables. It includes complete information about aspects of the data that are often critical to answering your question but cannot be seen from tabular displays or screenshots. It also makes it possible for those who want to help you to create a faithful representation of your example to try out their code, which in turn makes it more likely that their answer will actually work in your data.
    3. Interpreting the results of the model including young vs old is going to be complicated because the years_since_implement variable has 9 levels, and the effects may well be heterogeneous across those 9 time periods to start with, and the moderating effect of age on those may also be heterogeneous across those 9 time periods. Good luck! It's going to be a series of -lincom- commands for all of the three-way interaction terms, and perhaps for an omnibus test, a joint -test- of all 9.

    If you have enough other variables to include in the model so that, conditional on all of those, the delta_s and epsilon_i_s_t are reasonably considered independent, you can simplify your life by using a random effects model instead of fixed effects (-re- instead of -fe-). Because then you can use the -margins- command to get the marginal effects. But you can't really do that after a fixed-effects logistic regression.

    Comment


    • #3
      Hello Clyde! I was hoping for your answer; I have seen that you are very good and familiar with Stata, especially with this methodology. Is it possible to send the do-file via Statalist? I have all the necessary variables and the various do-files containing the results to be produced in the first instance; what I should do is create new interactions and see the difference in effects using other variables such as gender and race. My biggest problem with the files I have is understanding the "syntax" that is used, above all because the study I'm referring to uses the so-called event study approach...

      Comment


      • #4
        The way to show the contents of a do-file here on Statalist is to click on the # button in the toolbar at the top of the message window. (If there is no toolbar at the top of the message window, click on the A button: that will make the toolbar appear.) After clicking on the # button, "code delimiters" will appear in the message window. Copy the contents of the do-file (or the parts of it that you want to show), and paste it between the code delimiters. When you hit "Post Reply", the code will appear in a Code box in a fixed-width font, nicely aligned (well, as nicely aligned as the original code!) When you want to show Results that Stata gave you, you can do the same thing: just copy/paste from the Results window (or your log file) between code delimiters.

        Happy to help with specific questions about Stata syntax.

        Comment


        • #5
          Code:
          **** GENERATING VARIABLE ****
          *****************************
          
          /*GEN PREREGISTRATION*/
          gen pre_reg=1 if statefip==6 & year>= 2009 /*California*/
          replace pre_reg=1 if statefip==8 & year>=2013 /*Colorado*/
          replace pre_reg=1 if statefip==10 & year>=2010 /*Delaware*/ 
          replace pre_reg=1 if statefip==11 & year>=2009 /*DC*/
          replace pre_reg=1 if statefip==12 & year>=2007 /*Florida*/
          replace pre_reg=1 if statefip==22 & year>=2014 /*Louisiana*/ 
          replace pre_reg=1 if statefip==23 & year>=2011 /*Maine*/    
          replace pre_reg=1 if statefip==24 & year>=2010 /*Maryland*/
          replace pre_reg=1 if statefip==25 & year>=2014 /*Massachusetts*/
          replace pre_reg=1 if statefip==37 & year>=2009 & year<2013 /*North Carolina*/
          replace pre_reg=1 if statefip==41 & year>=2007 /*Oregon*/
          replace pre_reg=1 if statefip==44 & year>=2010 /*Rhode Island*/
          replace pre_reg=1 if statefip==49 & year>=2015 /*Utah*/
          replace pre_reg=1 if statefip==15 & year>=1993 /*Hawaii*/
          replace pre_reg=1 if statefip==34 & year>=2016 /*New Jersey*/
          replace pre_reg=0 if pre_reg==.
          
          /*GEN VOTED AND REGISTERED*/
          gen register=1 if voreg==2 
          replace register=0 if voreg==1
          replace register=1 if voted==2 & register==.
          gen vote=1 if voted==2
          replace vote=0 if voted==1
          
          /*GEN AGE DUMMIES*/
          gen age18_24 =1 if age>=18 & age<25   
          replace age18_24=0 if age>24 & age!=.
          
          gen pre18=pre_reg*age18_24
          gen online18=online*age18_24
          gen edr18=edr*age18_24
          
          /*CLEAN DATA*/
          replace sex=. if sex==9
          gen black=1 if race==200 
          replace black=0 if black==. & race!=.
          gen hispanic=1 if hispan>0 & hispan<900
          replace hispanic=0 if hispan==0
          replace labforce=. if labforce==0
          replace voteresp=. if voteresp==9
          replace faminc=. if faminc>843
          replace faminc=. if faminc==800
          recode faminc (100=0) (110=0) (120=0) (130=0) (140=0) (150=0) (210=1) (220=1) (231=1) (300=1) (430=2) (440=2) (460=2) (470=2) (500=3) (540=3) (550=3) (600=4) (700=5) (710=5) (720=5) (730=5) (740=5) (810=6) (820=6) (830=6) (840=7) (841=7) (842=7) (843=7),  g(faminc1)
          tab faminc faminc1
          replace educ=. if educ==1 | educ==999
          recode educ (2=0) (10=0) (11=0) (12=0) (13=0) (14=0) (20=0) (21=0) (22=0) (30=0) (31=0) (32=0) (40=0) (50=0) (60=0) (71=1) (72=1) (73=1) (80=2) (81=2) (90=2) (91=2) (92=2) (100=2) (110=3) (111=3) (121=3) (122=3) (123=3) (124=3) (125=3), g(educ1)
          tab educ educ1
          
          global controls i.sex i.black i.hispanic i.educ1 i.faminc1 i.labforce i.metro i.voteresp
          
          drop if hispanic==.
          drop if labforce==. 
          drop if voteresp==.
          drop if faminc1==.
          drop if age18_24==.
          drop if age>90
          
          /*GEN EVENTS ON AGE 18-24 PREREG*/
          
          so statefip
          
          *Generate Cohorts*
          gen treated_states = 0
          by statefip: egen max_pre=max(pre_reg)
          by statefip: replace treated_states=max_pre
          
          by statefip: egen pre_reg_y=min(year) if pre_reg==1
          by statefip: egen target=min(pre_reg_y)
          egen treated_year=group(pre_reg_y)
          by statefip: egen cohort=max(treated_year)
          replace cohort=0 if cohort==.
          
          *Generate Leads and Lags*
          forvalues kk = 0(1)5 {
          by statefip: gen F`kk'=target-2*`kk'
          by statefip: gen F`kk'_pre=0
          by statefip: replace F`kk'_pre=1 if age18_24==1 & year==F`kk'
          by statefip: gen Fold`kk'_pre=1 if age18_24==0 & year==F`kk'
          }
          
          forvalues kk = 1(1)3 {
          by statefip: gen L`kk'=target+2*`kk'
          by statefip: gen L`kk'_pre=0
          by statefip: replace L`kk'_pre=1 if age18_24==1 & year==L`kk'
          by statefip: gen Lold`kk'_pre=1 if age18_24==0 & year==L`kk'
          }
          
          by statefip: gen F5_last=0
          by statefip: replace F5_last=1 if age18_24==1 & year<=target-10 & target!=.
          by statefip: gen L3_last=0
          by statefip: replace L3_last=1 if age18_24==1 & year>=target+6 & target!=.
          
          *Generate event window*
          gen eventwindow = 0
          forvalues kk = 0(1)5 {
              replace eventwindow = 1 if F`kk'_pre == 1 | Fold`kk'_pre==1
              }
          forvalues kk = 1(1)2 {
              replace eventwindow = 1 if L`kk'_pre == 1 | Lold`kk'_pre==1
              }
          
          * Generate mean of omitted time
          gen year_omitted=.
          replace year_omitted=year if F1_pre==1
          by statefip: egen max_year_omitted=max(year_omitted)
          by statefip: egen register_young_m=mean(register) if (age18_24==1 & year==max_year_omitted)
          by statefip: egen register_old_m=mean(register) if (age18_24==0 & year==max_year_omitted)
          by statefip: egen max_register_young_m=max(register_young_m)
          by statefip: egen max_register_old_m=max(register_old_m)
          gen register_gap_m=max_register_old_m-max_register_young_m
          
          by statefip: egen vote_young_m=mean(vote) if (age18_24==1 & year==max_year_omitted)
          by statefip: egen vote_old_m=mean(vote) if (age18_24==0 & year==max_year_omitted)
          by statefip: egen max_vote_young_m=max(vote_young_m)
          by statefip: egen max_vote_old_m=max(vote_old_m)
          gen vote_gap_m=max_vote_old_m-max_vote_young_m
          
          by statefip: egen register_young_D=mean(register) if (age18_24==1 & pre_reg==0 & target!=.)
          by statefip: egen register_old_D=mean(register) if (age18_24==0 & pre_reg==0 & target!=.)
          by statefip: egen max_register_young_D=max(register_young_D)
          by statefip: egen max_register_old_D=max(register_old_D)
          gen register_gap_D=max_register_old_D-max_register_young_D
          
          by statefip: egen vote_young_D=mean(vote) if (age18_24==1 & pre_reg==0 & target!=.)
          by statefip: egen vote_old_D=mean(vote) if (age18_24==0 & pre_reg==0 & target!=.)
          by statefip: egen max_vote_young_D=max(vote_young_D)
          by statefip: egen max_vote_old_D=max(vote_old_D)
          gen vote_gap_D=max_vote_old_D-max_vote_young_D


          Well, this is my code, and, honestly, I am not confident with the leads and lags, nor with the commands used in the "Generate mean of omitted time" section.
          Below, you can see the part of the file that I should use to produce the tables; from here I was trying to get references to generate the DD model (a further step is a DDD model).

          Code:
          /*DEF GLOBAL VARIABLES*/
          
          global controls i.sex i.black i.hispanic i.educ1 i.faminc1 i.labforce i.metro i.voteresp
          
          *****************
          **** TABLE 1 ****
          *****************
          
          *baseline: Model 1*
          gen uno=0
           
          eststo: reg register F5_last F4_pre F3_pre F2_pre uno F0_pre L1_pre L2_pre L3_last i.year#i.age18_24 i.statefip#i.age18_24 i.statefip#i.year [pweight= wtfinl] , cluster(statefip)
          eststo reg_baseline
          sum register_gap_m, meanonly
          estadd scalar ymean_int = r(mean)
          
          *controls: Model 2*
          eststo: reg register F5_last F4_pre F3_pre F2_pre uno F0_pre L1_pre L2_pre L3_last $controls i.year#i.age18_24 i.statefip#i.age18_24 i.statefip#i.year [pweight= wtfinl] , cluster(statefip)
          eststo reg_controls
          sum register_gap_m, meanonly
          estadd scalar ymean_int = r(mean)
           
          *average: Model 3*
          eststo: reg register pre18 $controls i.year#i.age18_24 i.statefip#i.age18_24 i.statefip#i.year [pweight= wtfinl] , cluster(statefip)
          eststo reg_DDD_controls
          sum register_gap_D, meanonly
          estadd scalar ymean_int = r(mean)
          Do the variables F5_last, F4_pre, etc. correspond to those previously generated with the leads and lags?

          Comment


          • #6
            Well, you certainly put a lot of effort into writing all that code. Much of it, I'm afraid, is not necessary. There is no reason to have separate indicator variables for F5_last, F4_pre, etc. All of those are handled automatically by Stata when you take the approach I used in #2. In general, there is seldom any reason to create indicator ("dummy") variables or separate variables for lags and leads of anything in Stata: read -help fvvarlist- and -help tsvarlist- respectively. Using them saves you a lot of time, reduces the risk of errors, and makes the code much shorter and easier to read/understand.
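            A minimal, untested sketch of that advice (Y is a placeholder outcome; it assumes a state-by-year panel so that -xtset- with a time variable is legal):

```stata
* Hedged sketch, not run on real data. Assumes one observation per
* state-year; with biennial election data, delta(2) makes L1. mean
* "the previous election".
xtset statefip year, delta(2)

* i.varname expands to indicator ("dummy") variables automatically:
regress Y i.age18_24 i.year

* L1.pre_reg and F1.pre_reg are the lag and lead of pre_reg -- no
* hand-made lag/lead variables needed:
regress Y L1.pre_reg pre_reg F1.pre_reg i.year
```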

            Also, I see you frequently follow the pattern (I see this often here on Statalist and wonder where it comes from):
            Code:
            gen new_var = 0
            replace new_var = 1 if some_logical_condition
            That construction can be compressed to:
            Code:
            gen new_var = some_logical_condition
            which again saves time, reduces typing and errors, compactifies the code, and makes the code easier to read and understand.

            Also, you have some very long -recode- commands that can be compressed. The scheme you are using:
            Code:
            recode some_variable (10 = 0) (12 = 0) (15 = 0) (19 = 0) (22 = 1) (31 = 1) (46 = 1) (49 = 1) ...
            can be shortened to:
            Code:
            recode some_variable (10/19 = 0) (20/49 = 1) ...
            In addition to the benefits of time, typing, etc., this also does not require you to know every actual value of the variable being -recode-d. You just specify the ranges involved.

            It is your choice whether to use -xtreg, fe- after -xtset statefip-, or to use -regress ... i.statefip ...-. You will get equivalent results. But using -regress- with i.statefip will give you a list of coefficients for the state indicator variables which, in most settings, are not of interest and just clutter up the output. -xtreg, fe- will absorb those and not bother you with them.
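            A hedged sketch of that equivalence (untested; Y stands in for the outcome, and the coefficients on everything except the state indicators will match):

```stata
* Two equivalent fits of the same fixed-effects model (sketch only).
xtset statefip
xtreg Y i.ps##i.years_since_implement i.year, fe            // state effects absorbed

regress Y i.ps##i.years_since_implement i.year i.statefip   // state effects listed
```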

            All of the above are just different ways to do what you have done in your code. They don't constitute errors on your part, just doing things the hard way instead of the easy way. What I mention below constitute what I believe to be actual errors that need to be fixed in order to correctly fit the model you propose in #1. You can see in #2 how most of the code you have shown can be reduced to about a half-dozen lines.

            I see a modeling error in all of your -regress- commands. First, the term i.statefip#i.year does not correspond to anything in the equation you showed in #1. It is legal to have terms like this, but they make it a different model from the one you claim to be trying to follow. Moreover, in the absence of i.statefip and i.year by themselves, the model is just a mis-specification: whenever you specify an interaction, the constituent effects must also be included. That can be accomplished automatically by using ## rather than # to specify the interaction, or you can use #, but then you have to explicitly list the constituents. In this specific case, however, the model you mention in #1 does not include any term corresponding to i.statefip#i.year, so you should eliminate that and replace it with just i.statefip and i.year separately: no interaction between them.

            I also notice that you are using -pweight-s. If this is survey data, the use of pweights is important to get unbiased coefficient estimates. But, if the survey design included stratification or primary and higher-level sampling units, then the standard errors will not be correct unless you also account for those. You would have to refer to the documentation provided by the source of the data itself to learn both how the sampling was carried out and which variables in the data give you the strata and PSUs (and higher-level sampling units, if any). You would then have to incorporate that information, along with the pweight variable, into the -svyset- command, and you would need to use the -svy:- prefix on your -regress- command (and take out the [pweight = ...] from the -regress- command).
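            A hypothetical sketch only -- strata_var and psu_var below are invented placeholder names; the real ones must be taken from the CPS Voting and Registration Supplement documentation:

```stata
* HYPOTHETICAL: strata_var and psu_var are placeholders, not real
* variable names from the CPS extract.
svyset psu_var [pweight = wtfinl], strata(strata_var)
svy: regress register i.ps##i.years_since_implement i.statefip i.year
```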



            Comment


            • #7
              Hello Clyde! Data on voting and registration at the individual level are obtained from the Voting and Registration Supplement of the Current Population Survey (CPS), carried out biennially after each November election by the US Census Bureau; that's why my supervisor suggested that I use pweights. Another problem is that my supervisor is a theoretical, not an applied, economist; in fact, she doesn't help much in generating the correct model in Stata. Concerning commands such as i.statefip#i.year, I suppose they were introduced to denote year fixed effects (to control for time shocks) and state fixed effects (to account for unobserved state characteristics). Moreover, as you can see from the equation in #1, X_i,s,t is a vector of time-varying individual characteristics that I wrote as
              Code:
               
               global controls i.sex i.black i.hispanic i.educ1 i.faminc1 i.labforce i.metro i.voteresp
              that I need to introduce in my regression model (to do that, can I simply include them in my regression by using $controls?).
              Another aim of our research is to verify whether there are differences related to gender and race (as well as age), creating interaction variables that can capture the effects. Maybe, even in this case, I made the syntax more complicated than necessary; I'll show you how I did it:
              Code:
              /*GEN SEX AND RACE*/
              gen male= sex==1
              gen female= sex==2
              gen black= race==200
              gen hispanic= hispan>0 & hispan<900
              
              /*GEN INTERACTIONS*/
              gen pre18=pre_reg*age18_24
              gen pre18black=pre_reg*black
              gen pre18hispanic=pre_reg*hispanic
              gen pre18female=pre_reg*female
              gen pre18male=pre_reg*male
              gen pre18blackmale=pre_reg*black*male
              gen pre18blackfemale=pre_reg*black*female
              gen pre18hispanicmale=pre_reg*hispanic*male
              gen pre18hispanicfemale=pre_reg*hispanic*female
              Don't mind the variable names; I still have to tidy them up and create shorter ones!

              Comment


              • #8
                Concerning the commands - statefip#i.year and similar- I suppose they were introduced to denote year fixed effects ( to control for time shocks) and state fixed effects ( to account for unobserved state characteristics);
                Yes, I imagined that is what you were thinking, but that is not what it does. statefip#i.year introduces a separate fixed effect for each combination of state and year. That is many more fixed effects than just having fixed effects for states and fixed effects for years. The equation that you cited in #1 contains only fixed effects for states and fixed effects for years, not for their combinations. So you need i.statefip and i.year in the model, but not their interaction.
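                The contrast can be sketched like this (untested; weights and clustering as in your earlier commands):

```stata
* What the equation in #1 calls for: additive state and year fixed effects.
regress register i.ps##i.years_since_implement i.statefip i.year ///
    [pweight = wtfinl], cluster(statefip)

* By contrast, i.statefip#i.year would add one fixed effect per state-year
* cell -- a different model from the one in #1.
```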

                To introduce your covariates, yes you can just include $controls in the list of predictor variables in your regression command.

                The code you show for those interaction terms looks mostly correct, though you should not include both male and female, as these are mutually exclusive and exhaustive categories. Pick one. Also, this can be done more simply: there is no need to create these interaction variables at all. Instead, in your regression command you can just include i.pre18##i.(age18_24 black hispanic)##i.female and Stata will automatically include all of those variables and all of their two-way and three-way interactions in your model.

                Don't consider the variable name, I still have to fix it well and create shorter names!
                One of the problems with creating your own interaction variables is that you can end up with names that are too long to type (or even too long for legal Stata syntax), or names that are so abbreviated as to be unreadable. The best solution is not to create your own interaction variables.* Let factor-variable notation handle it and you completely avoid this problem.

                *There are occasional situations where an interaction really needs to be a variable in its own right, but these are relatively uncommon. For garden variety regression models with interaction terms, factor-variable notation is the better approach.
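                A hedged sketch of that factor-variable version (untested; controls, weights, and clustering as defined earlier in the thread):

```stata
* Sketch only: one ## expression replaces the hand-made pre18black,
* pre18hispanicfemale, etc. variables and adds every two- and three-way
* interaction automatically.
regress register i.pre18##i.(age18_24 black hispanic)##i.female ///
    $controls i.statefip i.year [pweight = wtfinl], cluster(statefip)
```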
                Last edited by Clyde Schechter; 13 Apr 2021, 12:08.

                Comment


                • #9
                  Thanks a lot, Clyde! You gave me a great starting point, more than great I would say. So, using the -diff- command and using the -xtreg- command are pretty much the same? Of course, later I will ask you for some other help, because the second step in this work is a triple-difference (DDD) model, and I'm sure you have already understood my less-than-basic level in Stata!

                  Comment


                  • #10
                    Most diff-in-diff analyses are carried out using the -xtreg- command. But sometimes they are done with other commands. When the data is serial cross-sections rather than panels, sometimes simple -regress- is good. Also -xtreg, fe- can always be emulated with -regress ... i.panelvar ...-. And some people prefer -areg- or -reghdfe- to -xtreg- for their fixed effects linear regressions. So the way I recommend you think about them is this: -xtreg- is one Stata command that carries out fixed-effects linear regression. There are other Stata commands that do that, too. Fixed-effects linear regression is applicable to many kinds of problems, of which diff-in-diff estimation of intervention effects is only one. There are many other applications of fixed-effects linear regression. Finally, don't think of diff-in-diff as a command. It's a study design, a strategy for identifying causal effects--it can involve statistical models other than fixed-effects linear regression, too.
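                    Those alternatives can be sketched side by side (untested; Y, x1, and x2 are placeholders, and -reghdfe- is a community-contributed command):

```stata
* Three commands that fit the same fixed-effects linear regression (sketch).
xtset statefip
xtreg   Y x1 x2 i.year, fe
areg    Y x1 x2 i.year, absorb(statefip)
reghdfe Y x1 x2 i.year, absorb(statefip)   // ssc install reghdfe
```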

                    Comment


                    • #11
                      Thank you, Clyde!
                      I tried to run the following regression
                      Code:
                      xtset statefip
                      xtlogit register i.ps##i.years_since_implement i.year, fe
                      and this is the outcome:
                      Code:
                      . xtlogit register i.ps##i.years_since_implement i.year, fe 
                      note: 3.years_since_implement omitted because of collinearity
                      note: 1.ps#3.years_since_implement omitted because of collinearity
                      note: multiple positive outcomes within groups encountered.
                      note: 1.ps omitted because of no within-group variance.
                      18,722 (group size) take 14,827 (# positives) combinations results in numeric overflow; computations cannot proceed
                      r(1400);
                      How to fix the problem?
                      Also, when I need to introduce the interactions from #8 in my regression, should I write each of them separately, or can I simply write them all together like:
                      Code:
                      xtset statefip
                      xtlogit register i.ps##i.years_since_implement i.year i.pre18##i.(age18_24 black hispanic)##i.female, fe

                      Comment


                      • #12
                        The lines that begin with -note:- are just informational: they are not problems and don't require you to do anything.

                        The final line about numeric overflow, however, is a fatal error. You have some panel (statefip) that has 18,722 observations, of which 14,827 have register = 1. That problem is simply too large for Stata to manage. It arises because calculating the likelihood for that panel requires calculating the number of ways you can select 14,827 observations out of 18,722. That is some hugely astronomical number that Stata simply cannot accommodate. You will either have to find a computer and software that can manage this kind of calculation, or select a substantially smaller random subset of your full data to do your analysis on. Another alternative might be using a linear probability model instead of logistic regression.
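                        A hedged sketch of that linear probability alternative (untested; note that factor variables may not take negative values, so if years_since_implement runs from -5 to 3 it must first be shifted to be nonnegative):

```stata
* Linear probability model: -xtreg, fe- on the 0/1 outcome avoids the
* combinatorial likelihood of -xtlogit, fe-. Sketch only.
gen ysi = years_since_implement + 5   // shift so factor levels are >= 0
xtset statefip
xtreg register i.ps##i.ysi i.year, fe vce(cluster statefip)
```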

                        Comment


                        • #13
                          Thank you Clyde!
                          I tried to run both a logit model and a simple regression; here are the results:
                          Code:
                           reg register i.ps##i.years_since_implement i.statefip i.year , cluster(statefip)
                          note: 3.years_since_implement omitted because of collinearity
                          note: 1.ps#3.years_since_implement omitted because of collinearity
                          note: 44.statefip omitted because of collinearity
                          
                          Linear regression                               Number of obs     =  1,350,537
                                                                          F(15, 50)         =          .
                                                                          Prob > F          =          .
                                                                          R-squared         =     0.0188
                                                                          Root MSE          =     .41783
                          
                                                                    (Std. Err. adjusted for 51 clusters in statefip)
                          ------------------------------------------------------------------------------------------
                                                   |               Robust
                                          register |      Coef.   Std. Err.      t    P>|t|     [95% Conf. Interval]
                          -------------------------+----------------------------------------------------------------
                                              1.ps |  -.0064737   .0008297    -7.80   0.000    -.0081402   -.0048072
                           3.years_since_implement |          0  (omitted)
                                                   |
                          ps#years_since_implement |
                                              1 3  |          0  (omitted)
                                                   |
                                          statefip |
                                           alaska  |   .0308412   .0001853   166.46   0.000      .030469    .0312133
                                          arizona  |  -.0928223    .000166  -559.19   0.000    -.0931557   -.0924889
                                         arkansas  |   -.086923   .0001292  -672.86   0.000    -.0871824   -.0866635
                                       california  |  -.0283233   .0007059   -40.13   0.000     -.029741   -.0269055
                                         colorado  |  -.0109156   .0002137   -51.08   0.000    -.0113447   -.0104864
                                      connecticut  |   .0005448   .0008953     0.61   0.546    -.0012535    .0023431
                                         delaware  |  -.0287721   .0001844  -156.04   0.000    -.0291424   -.0284017
                             district of columbia  |    .034705   .0001774   195.66   0.000     .0343487    .0350613
                                          florida  |  -.0329829   .0010115   -32.61   0.000    -.0350144   -.0309513
                                          georgia  |  -.0683078    .000455  -150.12   0.000    -.0692217   -.0673939
                                           hawaii  |  -.1165353   .0002912  -400.22   0.000    -.1171201   -.1159504
                                            idaho  |  -.0603099   .0001217  -495.41   0.000    -.0605544   -.0600654
                                         illinois  |   .0048569    .000483    10.06   0.000     .0038868     .005827
                                          indiana  |   -.061877   .0002409  -256.81   0.000     -.062361   -.0613931
                                             iowa  |   .0000373    .000417     0.09   0.929    -.0008003    .0008749
                                           kansas  |  -.0532566   .0002944  -180.89   0.000    -.0538479   -.0526653
                                         kentucky  |   -.044231   .0004095  -108.02   0.000    -.0450535   -.0434085
                                        louisiana  |   .0213509   .0008043    26.55   0.000     .0197355    .0229663
                                            maine  |   .0762839   .0000896   851.11   0.000     .0761039     .076464
                                         maryland  |  -.0090021   .0002205   -40.82   0.000    -.0094451   -.0085592
                                    massachusetts  |   .0277593   .0019392    14.31   0.000     .0238642    .0316543
                                         michigan  |   .0440295   .0007776    56.62   0.000     .0424676    .0455914
                                        minnesota  |   .0784553   .0007606   103.15   0.000     .0769276    .0799829
                                      mississippi  |   .0312339   .0003148    99.22   0.000     .0306016    .0318662
                                         missouri  |   .0057357   .0003441    16.67   0.000     .0050445    .0064269
                                          montana  |  -.0110675    .000201   -55.07   0.000    -.0114712   -.0106639
                                         nebraska  |  -.0271213   .0002542  -106.67   0.000    -.0276319   -.0266106
                                           nevada  |  -.1383628   .0006148  -225.06   0.000    -.1395976    -.137128
                                    new hampshire  |  -.0393803   .0013241   -29.74   0.000    -.0420398   -.0367208
                                       new jersey  |   -.008576   .0009686    -8.85   0.000    -.0105214   -.0066305
                                       new mexico  |  -.0721512   .0002759  -261.53   0.000    -.0727053    -.071597
                                         new york  |  -.0349675   .0006587   -53.08   0.000    -.0362905   -.0336444
                                   north carolina  |  -.0470388   .0015972   -29.45   0.000     -.050247   -.0438307
                                     north dakota  |   .1223859   .0001326   923.07   0.000     .1221196    .1226522
                                             ohio  |  -.0353636   .0005926   -59.68   0.000    -.0365538   -.0341733
                                         oklahoma  |  -.0589539   .0000995  -592.60   0.000    -.0591538   -.0587541
                                           oregon  |   .0281229   .0004335    64.88   0.000     .0272523    .0289936
                                     pennsylvania  |  -.0778033    .000534  -145.71   0.000    -.0788758   -.0767307
                                     rhode island  |          0  (omitted)
                                   south carolina  |  -.0806422    .000304  -265.28   0.000    -.0812528   -.0800316
                                     south dakota  |    .009464   .0001519    62.30   0.000     .0091589    .0097692
                                        tennessee  |  -.0660559   .0002094  -315.40   0.000    -.0664766   -.0656353
                                            texas  |  -.0567234   .0001418  -400.00   0.000    -.0570082   -.0564386
                                             utah  |  -.0566612   .0000786  -720.98   0.000    -.0568191   -.0565034
                                          vermont  |   .0080213   .0007878    10.18   0.000     .0064389    .0096037
                                         virginia  |  -.0467071   .0003025  -154.39   0.000    -.0473148   -.0460994
                                       washington  |  -.0177072   .0005339   -33.17   0.000    -.0187795   -.0166349
                                    west virginia  |  -.0853103   .0000926  -921.41   0.000    -.0854962   -.0851243
                                        wisconsin  |   .0490889   .0003743   131.14   0.000     .0483371    .0498407
                                          wyoming  |  -.0744315   .0005945  -125.21   0.000    -.0756256   -.0732375
                                                   |
                                              year |
                                             1984  |   .0468675   .0046819    10.01   0.000     .0374636    .0562713
                                             1986  |   .0060849   .0045582     1.33   0.188    -.0030705    .0152402
                                             1988  |   .0314823   .0059296     5.31   0.000     .0195724    .0433921
                                             1990  |   .0024185   .0058917     0.41   0.683    -.0094154    .0142524
                                             1992  |   .0636051   .0061186    10.40   0.000     .0513156    .0758946
                                             1994  |   .0049109   .0066092     0.74   0.461     -.008364    .0181858
                                             1996  |   .0525019   .0072986     7.19   0.000     .0378422    .0671616
                                             1998  |    .024599   .0078077     3.15   0.003     .0089167    .0402812
                                             2000  |   .0676506   .0077914     8.68   0.000     .0520011    .0833001
                                             2002  |   .0355685   .0087095     4.08   0.000     .0180749    .0530621
                                             2004  |   .0985044   .0079113    12.45   0.000      .082614    .1143947
                                             2006  |   .0616212   .0089254     6.90   0.000     .0436939    .0795484
                                             2008  |   .1101179   .0089903    12.25   0.000     .0920603    .1281754
                                             2010  |   .0660821   .0087477     7.55   0.000     .0485119    .0836524
                                             2012  |   .1066888   .0095837    11.13   0.000     .0874394    .1259382
                                             2014  |   .0623274   .0099264     6.28   0.000     .0423896    .0822651
                                                   |
                                             _cons |   .7456745   .0057499   129.68   0.000     .7341255    .7572235
                          ------------------------------------------------------------------------------------------
                          Honestly, I don't know whether the results are right or how to interpret the coefficients. Also, I wanted to ask if you could explain how you built both the variable -ps- and the event-time dummy. I am also puzzled about how their interaction tells me something about the treatment effect: how does it tell me whether the electoral participation of young people increased in the states that introduced the preregistration law compared to those that did not?



                          • #14
                            This model is not a good implementation of what is shown in the screenshot in #1. In fact, the problem arose with the code I wrote in #2, and you just copied it, so my apologies. The command should be:
                            Code:
                            reg register i.ps#i.years_since_implement i.statefip i.year , cluster(statefip)
                            Note the use of #, not ##. This will eliminate the extra collinearities, and it will leave you with an interpretable interaction coefficient.

                            When you have indicators for i.statefip and i.year, you do not want to also have i.ps and i.years_since_implement as separate terms in your model: that introduces collinear relationships, and something gets dropped. In your case, one of the things that got dropped was your most important term: the interaction. By replacing ## with #, those additional terms will not be generated, and the output will be more comprehensible.
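                            The collinearity at issue here can be seen directly with a toy design matrix. Below is a minimal sketch in Python with NumPy (the panel, the three-state/two-year layout, and all variable names are made up purely for illustration): a treatment-group dummy that is constant within state is an exact linear combination of the state dummies, so adding it as a separate main effect adds a column but no rank, and the software must drop something.

```python
import numpy as np

# Hypothetical toy panel: 3 states x 2 years; state 2 is the "treated" group.
states = np.repeat([0, 1, 2], 2)       # state id for each observation
years = np.tile([0, 1], 3)             # year id for each observation
group = (states == 2).astype(float)    # time-invariant treatment-group dummy

# Dummy columns for states and years (base levels dropped), plus a constant.
X_state = (states[:, None] == np.arange(3)).astype(float)
X_year = (years[:, None] == np.arange(2)).astype(float)
const = np.ones((6, 1))

# Design with state FE + year FE: 4 columns, full column rank.
X_fe = np.hstack([const, X_state[:, 1:], X_year[:, 1:]])
print(np.linalg.matrix_rank(X_fe))     # 4

# Adding the group dummy as a separate main effect adds a column but NO rank:
# group equals the state-2 dummy exactly, so the regression must omit a term.
X_bad = np.hstack([X_fe, group[:, None]])
print(np.linalg.matrix_rank(X_bad))    # still 4
```

The same logic applies to the event-time dummies and the year fixed effects, which is why only the interaction (#), and not the separate main effects (##), belongs in the model alongside the two sets of fixed effects.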



                            • #15
                              Ciao Clyde! Again, thank you!!!
                              I modified the code as you said above, and this is the result:
                              Code:
                              . reg register i.ps#i.years_since_implement i.statefip i.year, cluster(statefip)
                              note: 0b.ps#3.years_since_implement omitted because of collinearity
                              note: 1.ps#3.years_since_implement omitted because of collinearity
                              
                              Linear regression                               Number of obs     =  1,350,537
                                                                              F(15, 50)         =          .
                                                                              Prob > F          =          .
                                                                              R-squared         =     0.0188
                                                                              Root MSE          =     .41783
                              
                                                                        (Std. Err. adjusted for 51 clusters in statefip)
                              ------------------------------------------------------------------------------------------
                                                       |               Robust
                                              register |      Coef.   Std. Err.      t    P>|t|     [95% Conf. Interval]
                              -------------------------+----------------------------------------------------------------
                              ps#years_since_implement |
                                                  0 3  |          0  (omitted)
                                                  1 3  |          0  (omitted)
                                                       |
                                              statefip |
                                               alaska  |   .0308412   .0001853   166.46   0.000      .030469    .0312133
                                              arizona  |  -.0928223    .000166  -559.19   0.000    -.0931557   -.0924889
                                             arkansas  |   -.086923   .0001292  -672.86   0.000    -.0871824   -.0866635
                                           california  |   -.034797   .0001841  -188.96   0.000    -.0351668   -.0344271
                                             colorado  |  -.0173893   .0006333   -27.46   0.000    -.0186614   -.0161172
                                          connecticut  |   .0005448   .0008953     0.61   0.546    -.0012535    .0023431
                                             delaware  |  -.0352458   .0006805   -51.80   0.000    -.0366126    -.033879
                                 district of columbia  |   .0282313   .0007183    39.30   0.000     .0267886     .029674
                                              florida  |  -.0394566   .0002665  -148.08   0.000    -.0399918   -.0389214
                                              georgia  |  -.0683078    .000455  -150.12   0.000    -.0692217   -.0673939
                                               hawaii  |   -.123009    .000585  -210.28   0.000     -.124184    -.121834
                                                idaho  |  -.0603099   .0001217  -495.41   0.000    -.0605544   -.0600654
                                             illinois  |   .0048569    .000483    10.06   0.000     .0038868     .005827
                                              indiana  |   -.061877   .0002409  -256.81   0.000     -.062361   -.0613931
                                                 iowa  |   .0000373    .000417     0.09   0.929    -.0008003    .0008749
                                               kansas  |  -.0532566   .0002944  -180.89   0.000    -.0538479   -.0526653
                                             kentucky  |   -.044231   .0004095  -108.02   0.000    -.0450535   -.0434085
                                            louisiana  |   .0148772   .0001567    94.93   0.000     .0145624    .0151919
                                                maine  |   .0698102   .0007664    91.09   0.000     .0682708    .0713496
                                             maryland  |  -.0154759   .0008111   -19.08   0.000     -.017105   -.0138467
                                        massachusetts  |   .0212855   .0011464    18.57   0.000     .0189828    .0235882
                                             michigan  |   .0440295   .0007776    56.62   0.000     .0424676    .0455914
                                            minnesota  |   .0784553   .0007606   103.15   0.000     .0769276    .0799829
                                          mississippi  |   .0312339   .0003148    99.22   0.000     .0306016    .0318662
                                             missouri  |   .0057357   .0003441    16.67   0.000     .0050445    .0064269
                                              montana  |  -.0110675    .000201   -55.07   0.000    -.0114712   -.0106639
                                             nebraska  |  -.0271213   .0002542  -106.67   0.000    -.0276319   -.0266106
                                               nevada  |  -.1383628   .0006148  -225.06   0.000    -.1395976    -.137128
                                        new hampshire  |  -.0393803   .0013241   -29.74   0.000    -.0420398   -.0367208
                                           new jersey  |   -.008576   .0009686    -8.85   0.000    -.0105214   -.0066305
                                           new mexico  |  -.0721512   .0002759  -261.53   0.000    -.0727053    -.071597
                                             new york  |  -.0349675   .0006587   -53.08   0.000    -.0362905   -.0336444
                                       north carolina  |  -.0535125   .0008327   -64.27   0.000     -.055185   -.0518401
                                         north dakota  |   .1223859   .0001326   923.07   0.000     .1221196    .1226522
                                                 ohio  |  -.0353636   .0005926   -59.68   0.000    -.0365538   -.0341733
                                             oklahoma  |  -.0589539   .0000995  -592.60   0.000    -.0591538   -.0587541
                                               oregon  |   .0216492   .0003991    54.24   0.000     .0208475    .0224509
                                         pennsylvania  |  -.0778033    .000534  -145.71   0.000    -.0788758   -.0767307
                                         rhode island  |  -.0064737   .0008297    -7.80   0.000    -.0081402   -.0048072
                                       south carolina  |  -.0806422    .000304  -265.28   0.000    -.0812528   -.0800316
                                         south dakota  |    .009464   .0001519    62.30   0.000     .0091589    .0097692
                                            tennessee  |  -.0660559   .0002094  -315.40   0.000    -.0664766   -.0656353
                                                texas  |  -.0567234   .0001418  -400.00   0.000    -.0570082   -.0564386
                                                 utah  |  -.0566612   .0000786  -720.98   0.000    -.0568191   -.0565034
                                              vermont  |   .0080213   .0007878    10.18   0.000     .0064389    .0096037
                                             virginia  |  -.0467071   .0003025  -154.39   0.000    -.0473148   -.0460994
                                           washington  |  -.0177072   .0005339   -33.17   0.000    -.0187795   -.0166349
                                        west virginia  |  -.0853103   .0000926  -921.41   0.000    -.0854962   -.0851243
                                            wisconsin  |   .0490889   .0003743   131.14   0.000     .0483371    .0498407
                                              wyoming  |  -.0744315   .0005945  -125.21   0.000    -.0756256   -.0732375
                                                       |
                                                  year |
                                                 1984  |   .0468675   .0046819    10.01   0.000     .0374636    .0562713
                                                 1986  |   .0060849   .0045582     1.33   0.188    -.0030705    .0152402
                                                 1988  |   .0314823   .0059296     5.31   0.000     .0195724    .0433921
                                                 1990  |   .0024185   .0058917     0.41   0.683    -.0094154    .0142524
                                                 1992  |   .0636051   .0061186    10.40   0.000     .0513156    .0758946
                                                 1994  |   .0049109   .0066092     0.74   0.461     -.008364    .0181858
                                                 1996  |   .0525019   .0072986     7.19   0.000     .0378422    .0671616
                                                 1998  |    .024599   .0078077     3.15   0.003     .0089167    .0402812
                                                 2000  |   .0676506   .0077914     8.68   0.000     .0520011    .0833001
                                                 2002  |   .0355685   .0087095     4.08   0.000     .0180749    .0530621
                                                 2004  |   .0985044   .0079113    12.45   0.000      .082614    .1143947
                                                 2006  |   .0616212   .0089254     6.90   0.000     .0436939    .0795484
                                                 2008  |   .1101179   .0089903    12.25   0.000     .0920603    .1281754
                                                 2010  |   .0660821   .0087477     7.55   0.000     .0485119    .0836524
                                                 2012  |   .1066888   .0095837    11.13   0.000     .0874394    .1259382
                                                 2014  |   .0623274   .0099264     6.28   0.000     .0423896    .0822651
                                                       |
                                                 _cons |   .7456745   .0057499   129.68   0.000     .7341255    .7572235
                              ------------------------------------------------------------------------------------------
                              Is something still wrong? Do you think it is okay for the first few lines to come out as "omitted"? By removing one #, I no longer have the interaction, which is actually the term that allows me to grasp the treatment effect.
