  • #16
    Thanks Jared Greathouse. Actually, my objective here is to replicate an sdid result (with no covariates) in synth.

    However, based on both help files:

    Code:
     synth depvar predictorvars , trunit(#) trperiod(#) [ counit(numlist) xperiod(numlist) mspeperiod() resultsperiod() nested allopt unitnames(varname) figure keep(file) customV(numlist) optsettings ]
    and

    Code:
    sdid depvar groupvar timevar treatment [if] [in], vce(vcetype) [covariates(varlist, [type]) ... ]
    in synth's case, predictorvars are not optional (not in brackets).

    In your example #15,
    Code:
    synth fert /// classic SCM- Justin Wiltshire's command
        fert(1960) ///
        fert(1955) ///
        fert(1965) ///
        fert(1955/1965), ///
    Aren't these covariates (predictorvars)?

    Thanks



    • #17
      Hi Luis,

      The idea behind the sdid command (and, of course, the theory in the Arkhangelsky et al. paper on which it is based) is to match exclusively on (full) pre-treatment trends. That is, in sdid, the equivalent of synth's predictorvars is just the dependent variable in each pre-treatment period. For this reason, it is not necessary to request this information explicitly as an argument in sdid; we take it as given as the objective in generating the synthetic control. For reference, this is Arkhangelsky et al.'s Algorithm 1. In synth, one can match on arbitrary predictorvars, including covariates or only particular pre-treatment periods. While sdid absolutely allows for the inclusion of covariates, these are not used in generating the synthetic control per se; rather, they are concentrated out of the treated and control units' dependent variable before the optimal synthetic control is found.
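
      As a rough sketch of what "the dependent variable in each pre-treatment period" would look like on the synth side, one could pass every pre-treatment value of the outcome as its own predictor. The example assumes Cunningham's smoking.dta (variables cigsale, state, year), that the treated unit is state 3, that the first treated year is 1989, and that synth is installed from SSC; all of these are assumptions made for illustration. It should not be expected to reproduce the sdid estimate, because sdid also uses time weights.

      ```stata
      * Sketch only: synth matched on the outcome in every pre-treatment year
      * Assumes: ssc install synth; smoking.dta with cigsale, state, year;
      * treated unit state==3 and first treated year 1989 (both assumptions)
      use "https://github.com/scunning1975/mixtape/blob/master/smoking.dta?raw=true", clear
      tsset state year

      * Build the predictor list cigsale(1970) cigsale(1971) ... cigsale(1988)
      local lags
      forvalues y = 1970/1988 {
          local lags `lags' cigsale(`y')
      }

      synth cigsale `lags', trunit(3) trperiod(1989)
      ```

      Building the list in a loop avoids typing 19 predictor terms by hand and makes it easy to change the pre-treatment window.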

      Best wishes,
      Damian



      • #18
        Originally posted by Muhammad Ibrahim Shah
        Hi Professor Daniel PV, I am trying to practise sdid in Stata using the original data, but this gives me an error:

        . sdid packspercapita state year treated, vce(placebo) seed(1213) graph g1_opt(xtitle("")) g2_opt(ylabel(0(50)150, axis(2)))
        type mismatch: exp.exp: transmorphic found where struct expected
        r(3000);
        Hi Muhammad, just to let you know, this bug is now fixed. It turns out that there was an issue in sdid when using Mata in Stata <= 14.0. We have now corrected this issue, and all should work exactly as documented in the help file of the command. If you'd like to install the most recent version of the command with this bug fix incorporated, you can do so with:
        Code:
         
         net install sdid, from("https://raw.githubusercontent.com/daniel-pailanir/sdid/master") replace
        This will be updated on the SSC in the near future.
        Best wishes,
        Damian



        • #19
          Luis Pecht, in my mind (and I suspect others may disagree with me), lags of the outcome aren't "really" covariates. They're outcomes we're interested in, sure, but they're not other variables outside of the outcome. I suppose some of it does boil down to semantics.

          SDID, as I understand it, doesn't just generate unit weights as normal SCM does; it also generates time weights for the pre-treatment periods. So, if you use SDID, it isn't necessary that the trends "match" directly on each other, so long as they're parallel. So, if you COULD make synth replicate the results of SDID by re-engineering it, then my next advice to you would be to write a paper on it and send it to the Stata Journal.

          Speaking of which, it may make sense at some point for me to write a review paper on the SCM capabilities in Stata. SCM is developing so fast that having a go-to paper that surveys and discusses this method in one place would be nice.



          • #20
            I have a bit of a question on the graphical side. Damian Clarke, Daniel PV, please consider the following code:
            Code:
            u "https://github.com/scunning1975/mixtape/blob/master/smoking.dta?raw=true", clear
            
            cls
            
            qui g treated = cond(state==3 & year >=1988,1,0)
            
            
            sdid cigsale state year treated, vce(placebo) graph
            Here we estimate the causal effect of Prop 99 using Cunningham's data.

            One graph we get is the treated vs control averages, which under the hood produces the following dataset

            Code:
            * Example generated by -dataex-. For more info, type help dataex
            clear
            input float year double __00004EControl float(__00004ETreated lambda)
            1970 141.99410930275917   123         0
            1971 145.15848191082478   121         0
            1972  149.7487874776125 123.5         0
            1973  149.1462676078081 124.4         0
            1974  150.3044495433569 126.7         0
            1975 150.56657886505127 127.1         0
            1976  154.6930844038725   128         0
            1977 152.55365639925003 126.4         0
            1978 150.81208044290543 126.1         0
            1979  146.9357661753893 121.9         0
            1980  145.4907839745283 120.2         0
            1981 144.74189530313015 118.6         0
            1982 141.83772917091846 115.4         0
            1983  137.1082665771246 110.8         0
            1984  129.2928065508604 104.8         0
            1985 127.52268949151039 102.8         0
            1986 124.41831301152706  99.7 .58608836
            1987 123.21651093661785  97.5  .4139117
            1988 117.52167113125324  90.1         .
            1989 113.48608428239822  82.4         .
            1990 108.28887414932251  77.8         .
            1991 103.45071151852608  68.7         .
            1992 101.98116411268711  67.5         .
            1993 101.88585385680199  63.4         .
            1994 100.52123700082302  58.6         .
            1995 101.26544179022312  56.4         .
            1996  99.80574382841587  54.5         .
            1997 100.54352866113186  53.8         .
            1998 100.98628042638302  52.3         .
            1999  99.10571823269129  47.2         .
            2000  92.17977494001389  41.6         .
            end
            The estimated ATT is -15.38, but when I do
            Code:
            g diff = __00004ETreated - __00004EControl
            
            mean diff if year > 1988
            I get -41.608. Now I'm no econometrician or statistician, but -41.608 is pretty far from -15.38.


            My question is: how do I generate the predicted counterfactual such that the difference between the counterfactual and the treated unit is exactly the causal effect estimated by SDID?

            Perhaps reshaping or multiplying by the lambda variable are involved? I want to compare the counterfactual by SDID to my estimator and normal SCM, so I wanted to show in a line graph the counterfactual generated by SDID.



            • #21
              Hi Jared Greathouse. First, when we calculate tau (-15.38560) in the sdid command, we use a matrix operation:

              Code:
              u "https://github.com/scunning1975/mixtape/blob/master/smoking.dta?raw=true", clear
              qui g treated = cond(state==3 & year >=1988,1,0)
              sdid cigsale state year treated, vce(placebo) graph
              keep state year cigsale    
              reshape wide cigsale, i(state) j(year)
              replace state=40 if state==3   //put treated unit at the end of matrix
              sort state
              mkmat cigsale*, matrix(Y)      //outcome matrix
              matrix O=e(omega)
              matrix O=O[1..38,1]            //weight omega
              matrix L=e(lambda)
              matrix L=L[1..18,1]            //weight lambda
              matlist (-O', J(1,1,1/1))*Y*(-L',J(1,13,1/13))' //1 treatment unit and 13 post period
              Second: to calculate tau using treatment and control data, you need to add a few things:

              Code:
              clear
              input float year double Control float(Treated lambda)
              1970 141.99410930275917   123         0
              1971 145.15848191082478   121         0
              1972  149.7487874776125 123.5         0
              1973  149.1462676078081 124.4         0
              1974  150.3044495433569 126.7         0
              1975 150.56657886505127 127.1         0
              1976  154.6930844038725   128         0
              1977 152.55365639925003 126.4         0
              1978 150.81208044290543 126.1         0
              1979  146.9357661753893 121.9         0
              1980  145.4907839745283 120.2         0
              1981 144.74189530313015 118.6         0
              1982 141.83772917091846 115.4         0
              1983  137.1082665771246 110.8         0
              1984  129.2928065508604 104.8         0
              1985 127.52268949151039 102.8         0
              1986 124.41831301152706  99.7 .58608836
              1987 123.21651093661785  97.5  .4139117
              1988 117.52167113125324  90.1         .
              1989 113.48608428239822  82.4         .
              1990 108.28887414932251  77.8         .
              1991 103.45071151852608  68.7         .
              1992 101.98116411268711  67.5         .
              1993 101.88585385680199  63.4         .
              1994 100.52123700082302  58.6         .
              1995 101.26544179022312  56.4         .
              1996  99.80574382841587  54.5         .
              1997 100.54352866113186  53.8         .
              1998 100.98628042638302  52.3         .
              1999  99.10571823269129  47.2         .
              2000  92.17977494001389  41.6         .
              end
              
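              replace lambda=-lambda              //pre-period weights enter with a negative sign
              replace lambda=1/13 if lambda==.    //constant weight for each of the 13 post periods
              gen d=Treated-Control               //treated-control gap in each year
              gen dw=d*lambda
              egen tau=total(dw)                  //weighted sum of the gaps = tau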
              Basically, you add a constant weight of 1/13 to each post-treatment period.
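
              The same weighting can be rearranged into a counterfactual series for plotting. The sketch below is only one reading of the formula above, not an output of sdid: it shifts the control series by the lambda-weighted pre-treatment gap, so that the average post-treatment difference between Treated and the shifted series equals tau. The variable names Counterfactual, pregap and effect are made up for this illustration.

              ```stata
              * Sketch only: counterfactual implied by the weighting above
              clear
              input float year double Control float(Treated lambda)
              1970 141.99410930275917   123         0
              1971 145.15848191082478   121         0
              1972  149.7487874776125 123.5         0
              1973  149.1462676078081 124.4         0
              1974  150.3044495433569 126.7         0
              1975 150.56657886505127 127.1         0
              1976  154.6930844038725   128         0
              1977 152.55365639925003 126.4         0
              1978 150.81208044290543 126.1         0
              1979  146.9357661753893 121.9         0
              1980  145.4907839745283 120.2         0
              1981 144.74189530313015 118.6         0
              1982 141.83772917091846 115.4         0
              1983  137.1082665771246 110.8         0
              1984  129.2928065508604 104.8         0
              1985 127.52268949151039 102.8         0
              1986 124.41831301152706  99.7 .58608836
              1987 123.21651093661785  97.5  .4139117
              1988 117.52167113125324  90.1         .
              1989 113.48608428239822  82.4         .
              1990 108.28887414932251  77.8         .
              1991 103.45071151852608  68.7         .
              1992 101.98116411268711  67.5         .
              1993 101.88585385680199  63.4         .
              1994 100.52123700082302  58.6         .
              1995 101.26544179022312  56.4         .
              1996  99.80574382841587  54.5         .
              1997 100.54352866113186  53.8         .
              1998 100.98628042638302  52.3         .
              1999  99.10571823269129  47.2         .
              2000  92.17977494001389  41.6         .
              end

              * Treated-control gap in every year
              gen double d = Treated - Control

              * lambda-weighted pre-treatment gap (lambda is missing after treatment)
              gen double wd = lambda*d
              quietly summarize wd if !missing(lambda)
              scalar pregap = r(sum)

              * Shift the control series by that gap and take the difference
              gen double Counterfactual = Control + pregap
              gen double effect = Treated - Counterfactual

              * Average post-treatment effect; should be close to the tau reported above
              quietly summarize effect if missing(lambda)
              display %9.5f r(mean)

              twoway (line Treated year) (line Counterfactual year, lpattern(dash)), ///
                  xline(1987.5, lpattern(dot)) legend(order(1 "Treated" 2 "Shifted control"))
              ```

              The shift is a single constant, so the plotted counterfactual is parallel to the Control series; only its level changes.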
              Let me know if this clears up your doubts!
              Last edited by Daniel PV; 18 May 2022, 11:27.



              • #22
                Prof. Daniel PV, or indeed anyone on this forum, would you kindly help with how to plot the treatment effect graph from sdid? I came across a paper using SDID, recently published in EER, that presents its graph in an appealing way (see the attached figure). How can I get such a graph? Please help.

                [Attached image: Screenshot 2022-07-30 at 19.03.59.png]



                • #23
                  Hi Joe Zonda, I think I have an answer to that. You need to run the sdid command once for each post-treatment period. For example, in the Proposition 99 case, you would run sdid 12 times (1989 to 2000), each time using all pre-treatment periods (1970 to 1988) and only the single post-treatment year t. I recommend using a loop.
                  Code:
                  webuse set www.damianclarke.net/stata/
                  webuse prop99_example.dta, clear
                  matrix tau_prop99=J(12,3,.) //create a matrix to hold the results
                  
                  local j=1
                  forval t=1989(1)2000 {
                      sdid packspercapita state year treated if year<=1988 | year==`t', vce(placebo) seed(1213) reps(100)
                      
                      *save tau and lower and upper bound
                      local tau=e(tau)[1,1]
                      local se=e(se)
                      local lci=`tau'+invnormal(0.025)*`se'
                      local uci=`tau'+invnormal(0.975)*`se'
                      matrix tau_prop99[`j',1]=`tau'
                      matrix tau_prop99[`j',2]=`lci'
                      matrix tau_prop99[`j',3]=`uci'
                      local ++j
                  }
                  
                  matlist tau_prop99
                  matrix coln tau_prop99=tau lower upper
                  clear
                  svmat tau_prop99, n(col)
                  gen year=_n+1988 //define the time variable for the graph
                  
                  #delimit ;
                  tw line tau year || rcap lower upper year || scatter tau year, mc(black) ||
                     , yline(0, lc(black%50) lp(dash)) legend(off) ytitle("Treatment effect by year") 
                       xtitle("") xlabel(1989(1)2000) ylabel(-50(10)20) scheme(gg_tableau);
                  graph export prop99.eps, replace;
                  #delimit cr
                  The result is something like this:

                  [Attached image: prop99.png]



                  • #24
                    Daniel PV thank you so much. This is amazing. No way I could figure this one out on my own. Thank you a million times.



                    • #25
                      Hi all,

                      Quick question concerning the graph that the community-contributed command sdid produces, namely the one with the name of all donor pool units with a bubble representing their weight.

                      Is there any way to increase the horizontal spacing between the names of the units (and the ensuing bars and bubbles on the actual graph) in the code, and the font size of the names of these units? The reason I ask is that it is possible to have a large number of donor pool units, making the graph illegible.

                      Sure, after the estimation, the font size can be modified, however not very flexibly and only in categories (e.g. tiny, minuscule, etc.).



                      • #26
                        You'll likely need to get into the guts of the ado code to do this. To be fair, that isn't very hard in this case if you know how to program ado-files, though in my experience it isn't something most users can readily do. I could be quite wrong, since I didn't write this command.



                        • #27
                          Hi Maxence Morlet, my recommendation on bubble charts is to use them only when you have a manageable number of control units to plot. You can change the font size using xlabel(,labsize()) in g1_opt(). As for the x-axis spacing, I'm not really familiar with that issue, but my impression is that Stata sets it automatically, so the only thing I'd try is modifying it with xscale. By the way, if you want to create your own graph, you can do so using the omega and lambda matrices.
                          Code:
                          webuse set www.damianclarke.net/stata/
                          webuse prop99_example.dta, clear
                          
                          #delimit ;
                          sdid packspercapita state year treated, vce(placebo) reps(5) seed(1213) 
                               graph g1_opt(xtitle("") xscale(range(-12 50)) xlabel(,labsize(10pt))) 
                               g2_opt(ylabel(0(50)150)) graph_export(sdid_, .eps);
                          #delimit cr
                          [Attached image: sdid_weights1989.png]
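
                          For the "create your own graph" route, here is a minimal sketch of a horizontal bar chart of the unit weights, which leaves more room for many donor units. It assumes, as in #21, that rows 1 to 38 of the first column of e(omega) hold the control-unit weights; it is worth inspecting the matrix with matlist e(omega) first. The variable names w1 and row are made up for this illustration, and row is only the row position in e(omega), not a unit name.

                          ```stata
                          * Sketch only: custom weight plot built from e(omega)
                          * Assumes rows 1..38 of column 1 of e(omega) are the control-unit weights
                          webuse set www.damianclarke.net/stata/
                          webuse prop99_example.dta, clear
                          sdid packspercapita state year treated, vce(placebo) reps(5) seed(1213)

                          matrix O = e(omega)
                          matrix O = O[1..38, 1]

                          clear
                          svmat double O, names(w)      // creates variable w1
                          gen row = _n                   // row position in e(omega), not a unit name

                          graph hbar (asis) w1, over(row, sort(w1) descending label(labsize(vsmall))) ///
                              ytitle("Unit weight (omega)") ysize(8)
                          ```

                          Because this is an ordinary graph hbar call, label size and overall height can be adjusted freely through label(labsize()) and ysize().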



                          • #28
                            Thanks Daniel! By the way I really appreciate the very useful command.



                            • #29
                              Thank you very much Daniel for making this package available!

                              I have an issue with matsize when using sdid. My understanding is that we can circumvent the maximum matsize (11000 in Stata MP) by using Mata. It appears to me that everything in sdid.ado is done in Mata, so I am puzzled as to why I get a "matsize too small" error. Do you have any idea which portion of sdid.ado might cause this issue?

                              I would very much appreciate your response.
                              Kindest,
                              Hideto



                              • #30
                                Hi Hideto Koizumi, sorry I got here so late. Does the problem arise in the estimation or with the graph option? I think it may be a problem with a matrix in the graphics section because, as you said, all of the estimation in sdid is done in Mata, but part of the graphical output uses a couple of regular Stata matrices. By the way, I'm assuming you're using the latest version of sdid.
