
  • parallel trend assumption

    Dear Statalists,
    I have a quick question, please.
    I have an unbalanced panel dataset covering 2000 to 2010. A standard was issued and firms started to adopt it, but adoption is not compulsory, so there is no single cut-off point at which all firms adopted simultaneously. To clarify, one group of firms adopted in 2005, others adopted in 2006, and so on.
    I am using the generalized DID. I have read about the parallel trend assumption and, as I understand it, it can be investigated when adoption happens at a specific cut-off point, but I think I can't check it in my case because there is no specific cut-off point. Do you think I am correct?
    Thanks.

  • #2
    Hi Omar,
    This is not a Stata question.
    What you are asking is basic shock-based causal inference practice. There are numerous papers and textbooks explaining the steps required to ensure reliable results. See, for example:
    Imbens, G., & Rubin, D. B. (2015). Causal inference: For statistics, social and biomedical sciences : An Introduction. https://doi-org.ezproxy.is.ed.ac.uk/...O9781139025751
    On your case of gradual adoption, see the section discussing "encouragement" designs in randomised trials in:
    Atanasov, V., & Black, B. (2016). Shock-Based Causal Inference in Corporate Finance and Accounting Research. Critical Finance Review, 5(2), 207–304. http://dx.doi.org/10.1561/104.00000036



    • #3
      Dear Maria,
      Thanks very much for the answer. Much appreciated.
      I apologize if this isn't a Stata question, but the thing is, about a year ago, I explained my case to decide which test I should use, and Clyde Schechter suggested the following paper:
      https://www.annualreviews.org/doi/pd...-040617-013507

      Clyde Schechter helped me so much in this regard and I am so grateful to him for the rest of my life.

      Accordingly, I have applied the generalized DID with a two-way fixed effect. The code was as below:
      Code:
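      * create a row identifier
      generate id = _n
      * convert the string company name into a numeric panel identifier
      encode Companyname, gen(COMPANY)
      * declare the panel structure (firm by year)
      xtset COMPANY Year, yearly
      * firm fixed effects via -fe-, year effects via i.Year, standard errors clustered by firm
      xtreg EM i.Event##(c.Size c.Leverage c.growth) i.Year, fe vce(cluster COMPANY)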
      Where:
      Event: is a binary variable coded 1 for the firm-year observations after the adoption of the standards and zero for firm-year observations before the adoption of the standards.

      Kindly note that all firms by the end of the period have complied with the standards, so there is no control group, and I had to consider firm-year observation before the adoption of the standards as the control group.

      I have collected the data, run the analysis, written the paper and submitted it to a journal. One of the reviewers suggested checking the parallel trend assumption, also called the common trends assumption.

      I just saw on page 457 of the paper above that both the simple and the generalized DID rely on the common trends assumption. From what I have read, there is no statistical test for this assumption, but visual inspection is useful when you have observations over many time points.
      So, I am wondering: what is the code to graph that?

      Many thanks in advance for your help.



      • #4
        Hi Omar,
        Jumping to run a diff-in-diff before checking covariate balance, common support and the parallel trends assumption is like regressing heart disease on coffee drinking without controlling for smoking and concluding that coffee kills. The sources I gave you in #2 are the go-to guides on proper academic practice in causal inference.
        To construct a graph for checking the parallel trends assumption, plot a timeline of the yearly averages of the outcome and of each covariate, separately for treated and control firms. The code below produces such timeline graphs, saves them as Stata graphs and exports them as .png. It assumes that your time variable is called year and that all variables have labels, which it uses as titles.
        Code:
        * put the name of your treated indicator below:
        local tr_var treated
        * put the list of your covariates and the outcome below in place of v1 v2 v3:
        local vars v1 v2 v3
        local v_max: word count `vars'
        * put the start and end years you want to graph below:
        local yr_st 2010
        local yr_end 2019
        local cond year >= `yr_st' & year <= `yr_end'
        
        forvalues i= 1/`v_max' {
        local v: word `i' of `vars'
        local lab: var label `v'
        egen mean_`v' = mean(`v'), by(`tr_var' year)
        line mean_`v' year if `tr_var' == 1 & `cond', c(L) ///
            || line mean_`v' year if `tr_var' == 0 & `cond', c(L) ///
            legend(order(1 "Treated" 2 "Controls") size(small) ) scheme(sj) ///
            title("`lab'", size(small)) ylabel(, labsize(vsmall) ) xlabel(`yr_st'(1)`yr_end') ///
            xscale(r(`yr_st' `yr_end')) xtitle("") ytitle("") ///
            saving(par_tr_`v', replace)
        graph export par_tr_`v'.png, replace
        }
        *
        Last edited by Maria Boutchkova; 11 Aug 2021, 03:14.



        • #5
          Dear Maria Boutchkova , thanks a bunch for providing me with the code. Greatly appreciated.

          Actually, I have used it as mentioned in #4, and as below:

          Code:
          ge id= _n
          encode Companyname, gen(COMPANY)
          xtset COMPANY Year, yearly
          
          local Event treated
          local vars EM Leverage SizeAssets GrowthinTurnover
          local v_max: word count `vars'
          local yr_st 2010
          local yr_end 2019
          local cond Year >= `yr_st' & Year <= `yr_end'
          forvalues i= 1/`v_max' {
          local v: word `i' of `vars'
          local lab: var label `v'
          egen mean_`v' = mean(`v'), by(`tr_var' Year )
          line mean_`v' Year if `tr_var' == 1 & `cond', c(L) ///
              || line mean_`v' Year if `tr_var' == 0 & `cond', c(L) ///
              legend(order(1 "Treated" 2 "Controls") size(small) ) scheme(sj) ///
              title("`lab'", size(small)) ylabel(, labsize(vsmall) ) xlabel(`yr_st'(1)`yr_end') ///
              xscale(r(`yr_st' `yr_end')) xtitle("") ytitle("") ///
              saving(par_tr_`v', replace)
          graph export par_tr_`v'.png, replace}*
          But Stata showed:
          Code:
          ==1 invalid name
          r(198);
          I don't know what's wrong, could you please help.

          Many thanks in advance.



          • #6
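            The code in #5 never defines the local macro "tr_var": the line "local Event treated" defines a local called Event instead. The loop then uses tr_var, and because an undefined local expands to nothing, the condition after "if" is left with no variable name in front of "== 1", which gives the "invalid name" error. You need to define that local (see the example in #4, which does define it).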
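
            A minimal sketch of this behaviour, to be run from a do-file in one go. The variable name Event is an assumption taken from #3. Note also that locals only exist within the run in which they are defined, so the "local" lines and the loop that uses them have to be executed together.

            ```stata
            * An undefined local macro expands to nothing, so nothing appears between the brackets
            display "tr_var is: [`tr_var']"

            * Define the local (Event is the assumed name of the indicator, see #3)
            local tr_var Event

            * Now the macro expands to the variable name
            display "tr_var is: [`tr_var']"
            ```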



            • #7
              Hi Rich Goldstein, Thanks very much for the response. I truly appreciate it.

              Okay, I have used the below code and I think I defined it as below:
              Code:
              local tr_var Event 
              local vars AP2 Leverage SizeAssets FCP Purchases GrowthinTurnover
              local v_max: word count `vars'
              local yr_st 2009
              local yr_end 2019
              local cond Year >= `yr_st' & Year <= `yr_end'
              forvalues i= 1/`v_max' {
              local v: word `i' of `vars'
              local lab: var label `v'
              egen mean_`v' = mean(`v'), by(`tr_var' Year )
              line mean_`v' Year if `tr_var' == 1 & `cond', c(L) ///
                  || line mean_`v' Year if `tr_var' == 0 & `cond', c(L) ///
                  legend(order(1 "Treated" 2 "Controls") size(small) ) scheme(sj) ///
                  title("`lab'", size(small)) ylabel(, labsize(vsmall) ) xlabel(`yr_st'(1)`yr_end') ///
                  xscale(r(`yr_st' `yr_end')) xtitle("") ytitle("") ///
                  saving(par_tr_`v', replace)
              graph export par_tr_`v'.png, replace}*
              Stata showed me the following message:

              Code:
              option / not allowed
              r(198);
              What do you think?




              • #8
                Stata does not like a / somewhere. Get rid of the triple slashes /// like so:
                Code:
                forvalues i= 1/`v_max' {
                local v: word `i' of `vars'
                local lab: var label `v'
                egen mean_`v' = mean(`v'), by(`tr_var' Year )
                line mean_`v' Year if `tr_var' == 1 & `cond', c(L) || line mean_`v' Year if `tr_var' == 0 & `cond', c(L) legend(order(1 "Treated" 2 "Controls") size(small) ) scheme(sj) title("`lab'", size(small)) ylabel(, labsize(vsmall) ) xlabel(`yr_st'(1)`yr_end') xscale(r(`yr_st' `yr_end')) xtitle("") ytitle("") saving(par_tr_`v', replace)
                graph export par_tr_`v'.png, replace
                }
                And note that the closing curly brace of the loop sits alone on a new line.
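
                One possible reason for the "option / not allowed" message, offered as an assumption because the thread does not say how the code was run: the /// continuation marker is recognised only in do-files and ado-files, not in the Command window. A small illustration using the auto dataset that ships with Stata (not the data from this thread):

                ```stata
                * Save these lines in a do-file and run them with -do-;
                * typed into the Command window, the /// lines would fail
                sysuse auto, clear
                twoway (scatter mpg weight if foreign == 0) ///
                       (scatter mpg weight if foreign == 1), ///
                       legend(order(1 "Domestic" 2 "Foreign"))
                ```

                Run from a do-file, the multi-line version of the graph command in #4 should therefore work as well as the single-line version above.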



                • #9
                  Returning to #7, the way to find which line is causing a problem is to switch on tracing with -set trace on-; see
                  Code:
                  help trace
                  In general, you also want to set -tracedepth- to a small number (I usually start with 1 and see if that gives me enough information).
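
                  A short sketch of how this might look in a do-file, wrapped around the loop from #7:

                  ```stata
                  * Trace only the top level, not the internals of the commands being called
                  set tracedepth 1
                  * Echo each line as it is executed, with macros expanded
                  set trace on

                  * ... run the loop from #7 here ...

                  * Switch tracing off again
                  set trace off
                  ```

                  The last line echoed before the error message is the one to inspect; with macros expanded it is easy to see an empty local or a stray character.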



                  • #10
                    Dear Maria Boutchkova,

                    Thanks a million for the codes and for the answers.

                    Actually, I have followed your recommendations and I have used the first code as below:

                    Code:
                    ge id= _n
                    encode Companyname, gen(COMPANY)
                    xtset COMPANY Year, yearly
                    
                    local tr_var Event
                    local vars AP2 Leverage SizeAssets FCP Purchases GrowthinTurnover
                    local v_max: word count `vars'
                    local yr_st 2009
                    local yr_end 2019
                    local cond Year >= `yr_st' & Year <= `yr_end'
                    forvalues i= 1/`v_max' {
                    local v: word `i' of `vars'
                    local lab: var label `v'
                    egen mean_`v' = mean(`v'), by(`tr_var' Year )
                    line mean_`v' Year if `tr_var' == 1 & `cond', c(L) 
                        || line mean_`v' Year if `tr_var' == 0 & `cond', c(L) 
                        legend(order(1 "Treated" 2 "Controls") size(small) ) scheme(sj) 
                        title("`lab'", size(small)) ylabel(, labsize(vsmall) ) xlabel(`yr_st'(1)`yr_end') 
                        xscale(r(`yr_st' `yr_end')) xtitle("") ytitle("") 
                        saving(par_tr_`v', replace)
                    graph export par_tr_`v'.png, replace
                    }
                    *
                    Stata showed me just one graph, of AP2 against Year.

                    I will continue in a separate post to show you the results.



                    • #11
                      Maria Boutchkova ,

                      Then, I have used the following code:

                      Code:
                      forvalues i= 1/`v_max' {
                      local v: word `i' of `vars'
                      local lab: var label `v'
                      egen mean_`v' = mean(`v'), by(`tr_var' Year )
                      line mean_`v' Year if `tr_var' == 1 & `cond', c(L) || line mean_`v' Year if `tr_var' == 0 & `cond', c(L) legend(order(1 "Treated" 2 "Controls") size(small) ) scheme(sj) title("`lab'", size(small)) ylabel(, labsize(vsmall) ) xlabel(`yr_st'(1)`yr_end') xscale(r(`yr_st' `yr_end')) xtitle("") ytitle("") saving(par_tr_`v', replace)
                      graph export par_tr_`v'.png, replace
                      }
                      Stata then produced a graph for each variable. I don't know whether I should do that for all variables or whether there should be just one graph. Here is an example:


                      Based on the graph, I think the assumption is not met, or there might be something wrong, I don't know. What do you think?




                      • #12
                        You should show all graphs to your reviewer. You can combine them to save space. But first make sure your data is properly organised and that the graphs show what you intend. From the 2nd graph it seems to me that there is no data for the controls after 2010. How is your event variable defined? Perhaps it is not the treated indicator we expect, but a pre-post indicator?
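
                        A hedged sketch of both checks, assuming the variable names Year and Event from #10 and the file names written by the saving() option in the loop (par_tr_ followed by the variable name):

                        ```stata
                        * Cross-tabulate year against the indicator: with a pre/post indicator,
                        * the 0 column empties out once every firm has adopted
                        tabulate Year Event

                        * Combine graphs already saved to disk by the loop into one figure
                        graph combine par_tr_AP2.gph par_tr_Leverage.gph par_tr_SizeAssets.gph, ///
                            cols(2)
                        ```

                        If the tabulation shows no observations with Event equal to 0 in the later years, the "Controls" line in the graphs necessarily stops there.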



                        • #13
                          Thanks very much for the answer, Maria Boutchkova.
                          The time frame of my sample is 2009-2019. Firms started to adopt the standards in 2013, other groups adopted in 2014, others in 2015, and so on, and by 2018 all firms in the study sample had adopted the standards.
                          So, I have defined Event as a binary variable coded 1 for all firm-year observations after the adoption of the standards and 0 for all firm-year observations before the adoption. For instance, firm A has data for 2009-2019 and adopted in 2014, so its firm-year observations for 2009-2013 are coded 0, and those for 2014-2019 are coded 1.
                          But I am thinking about it from another perspective:
                          I have found that around 52 firms adopted in 2013, 41 adopted in 2014, and the rest of the firms adopted in 2015. So, I am planning to divide the sample into three subgroups, i.e., those that adopted in 2013, in 2014 and in 2015. Each firm in each group has data for 2009-2019.
                          For instance, firm X adopted in 2015 and has data from 2009 until 2019. In this case, Event will be coded 0 for all observations in 2009-2014 and 1 for 2015-2019. In that way, I can say that there is a cut-off point for this group, which is 2015; it is like saying that adoption was mandatory for this group in 2015. Then I can perform the analysis to see the trend before and after 2015. But I am just wondering: what is the usual code for checking the parallel trend assumption when there is just one cut-off point?
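
                          One way the cohort idea could be sketched, offered as an illustration rather than as the accepted procedure: derive each firm's adoption year from Event, then plot the mean outcome by adoption cohort. The names COMPANY, Year, Event and EM come from earlier posts; adopt_year, mean_EM_cohort and pick are hypothetical helper names, and the cohort years 2013 to 2015 are those mentioned above.

                          ```stata
                          * First year in which Event equals 1 for each firm (time-invariant cohort)
                          bysort COMPANY: egen adopt_year = min(cond(Event == 1, Year, .))

                          * Mean outcome for each cohort in each year
                          egen mean_EM_cohort = mean(EM), by(adopt_year Year)

                          * Keep one observation per cohort-year so each line has one point per year
                          egen byte pick = tag(adopt_year Year)

                          twoway (line mean_EM_cohort Year if adopt_year == 2013 & pick, sort) ///
                                 (line mean_EM_cohort Year if adopt_year == 2014 & pick, sort) ///
                                 (line mean_EM_cohort Year if adopt_year == 2015 & pick, sort), ///
                                 xline(2013 2014 2015, lpattern(dash)) ///
                                 legend(order(1 "Adopted 2013" 2 "Adopted 2014" 3 "Adopted 2015")) ///
                                 xtitle("") ytitle("Mean EM")
                          ```

                          Because the cohort variable does not change over time, each line runs across the whole sample period, so the paths of the cohorts can be compared in the years before the first adoption. Whether this comparison satisfies the reviewer is a judgment the thread does not settle.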

