  • Testing for parallel trends with conditional means

    Dear colleagues,
    I ran a difference-in-differences model and I would like to visually validate the parallel-trends assumption using conditional means rather than ordinary means. If I am using ordinary means, I would run the commands below. But how can I test for parallel trends if I would like to condition my (mean) outcome variable on the two variables "control1" and "control2"? I attach a sample of the data I am using. Thanks in advance.
    Code:
        xtset id year
        collapse (mean) outcome, by(treated year)
        xtset treated year
        xtline outcome, overlay title(outcome_variable)
    id year outcome control1 control2 treated
    1 2015 30 14.47 5.94 0
    1 2016 45 14.47 5.94 0
    1 2017 20 14.47 5.94 0
    1 2018 15 14.47 5.94 0
    1 2019 40 14.47 5.94 0
    1 2020 65 14.47 5.94 0
    2 2015 40 42.89 3.64 1
    2 2016 55 42.89 3.64 1
    2 2017 30 42.89 3.64 1
    2 2018 25 42.89 3.64 1
    2 2019 50 42.89 3.64 1
    2 2020 100 42.89 3.64 1
    3 2015 20 48.52 7.93 0
    3 2016 35 48.52 7.93 0
    3 2017 10 48.52 7.93 0
    3 2018 5 48.52 7.93 0
    3 2019 30 48.52 7.93 0
    3 2020 55 48.52 7.93 0

  • #2
    I am assuming you are talking about parallel time trends. Without using xtset, you can run
    reg outcome year control1 control2 i.id i.treated

    The coefficient on year is the slope of the time trend.

    To test whether the slope of the time trend differs between the treated and untreated groups, run
    reg outcome year control1 control2 i.id i.treated i.treated#c.year
    If the coefficient on i.treated#c.year is significant, the slope of the time trend differs between the two groups.
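
    A compact way to write the same comparison is the ## operator, which adds treated, year, and their interaction in one term. This is only a sketch under assumptions: the variable names come from the sample data in post #1, the cutoff 2019 is a hypothetical first treatment year (the thread does not give one), and i.id is left out because treated and both controls do not vary within id in the sample, which makes them collinear with the unit dummies.

    ```stata
    * Hypothetical first treatment year; replace with the actual one
    local first_post 2019

    * treated##c.year expands to i.treated, c.year, and i.treated#c.year
    regress outcome i.treated##c.year control1 control2 ///
        if year < `first_post', vce(cluster id)

    * H0: same linear pre-treatment slope in both groups
    test 1.treated#c.year
    ```

    With year entered as a main effect, the coefficient on 1.treated#c.year is the difference between the treated and untreated slopes, so a small p-value is evidence against parallel linear trends.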



    • #3
      Oscar Ozfidan Many thanks for your prompt response. This is one way to test for parallel trends. However, I am trying to plot the conditional mean (see Figure below). That is, I am testing for parallel trends but based on observables.
      [Figure: Screen Shot 2021-05-21 at 1.25.27 PM.png]

      Last edited by amira elshal; 21 May 2021, 05:31.


      • #4
        You need to explain what you are trying to do in a little more detail. Are you saying that what I have shown addresses the test, but now you want to plot the trends? Or are you saying that my assumption, that you want to test trends for treated vs. untreated, is not what you were looking for? If the latter, you need to clarify which trends you want to compare, if the desired comparison is not between the categories of treated. If what remains is strictly a plotting issue, I don't think I can be of help, since I have very poor knowledge of Stata graphs.


        • #5
          Perhaps it would be less confusing if the authors used the terms "Predicted values" or "Fitted values". In short, linear regression estimates the conditional mean of the outcome variable. So run a regression and use the fitted values.


          • #6
            Oscar Ozfidan Thanks for your message. I am sorry if I have been unclear. I am trying to test for parallel trends between treated and untreated units. But, instead of using the mean outcomes, I would like to use the conditional mean outcomes (i.e., the mean outcomes conditioned on my two control variables). I include these two control variables, or observables, in the original difference-in-differences regression.


            • #7
              Andrew Musau Yes, I think that is what I am trying to do. Could you please advise on how I can use the predicted/fitted outcome values to test for parallel trends between treated and untreated units? Could you please provide the Stata code? I am afraid I am a little confused here.


              • #8
                @Andrew Musau In the plot she shared, the dotted and solid lines are not predicted vs. actual values, despite the general convention of showing them like that. They are actually the data for two groups, i.e., <100 miles and >100 miles. I think she is interested in testing the trends that go through the dotted line and the solid line. If that is the case, she needs to drop year from the regression and keep the year-treated interaction:
                reg outcome control1 control2 i.id i.treated i.treated#c.year

                If she wants to choose a particular treated group as the base trend, say group 0, she can use

                reg outcome control1 control2 i.id i.treated ib0.treated#c.year

                After running that, the coefficient on treated==1#c.year would indicate whether the trend for treated==1 is significantly different from the trend for treated==0.


                • #9
                  It appears that you have panel data. Are your controls time varying?


                  Code:
                  xtset id year
                  xtreg outcome controls, fe
                  predict outcomehat, xbu
                  Here, the regression includes your specified controls and individual fixed effects, which requires the controls to be time-varying. With time-invariant controls, just run simple OLS:

                  Code:
                  regress outcome controls
                  predict outcomehat, xb
                  where variable "outcomehat" holds your predicted outcome.
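
                  To get from the predicted outcome to the plot asked for in post #1, one option is to average outcomehat by group and year and reuse the original plotting commands. A minimal sketch, assuming the variable names from the sample data (control1, control2, treated) and the time-invariant-controls case:

                  ```stata
                  * Sketch: control1 and control2 are the names from the sample data in post #1
                  regress outcome control1 control2
                  predict outcomehat, xb

                  * Average the fitted values by group and year, then overlay the two series
                  preserve
                  collapse (mean) outcomehat, by(treated year)
                  xtset treated year
                  xtline outcomehat, overlay title("Mean fitted outcome")
                  restore
                  ```

                  With time-invariant controls and no time terms in the regression, the fitted values do not change over time within a unit, so the plotted group means will be flat unless the composition of the groups changes across years.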
                  Last edited by Andrew Musau; 21 May 2021, 06:41.


                  • #10
                    Originally posted by Oscar Ozfidan
                    @Andrew Musau In the plot she shared, the dotted and solid lines are not predicted vs. actual values, despite the general convention of showing them like that. They are actually the data for two groups, i.e., <100 miles and >100 miles. I think she is interested in testing the trends that go through the dotted line and the solid line. If that is the case, she needs to drop year from the regression and keep the year-treated interaction:
                    reg outcome control1 control2 i.id i.treated i.treated#c.year

                    If she wants to choose a particular treated group as the base trend, say group 0, she can use

                    reg outcome control1 control2 i.id i.treated ib0.treated#c.year

                    After running that, the coefficient on treated==1#c.year would indicate whether the trend for treated==1 is significantly different from the trend for treated==0.

                    Yes, I did not focus on the specific details. My point was only that by "conditional mean of real income" the authors mean "predicted values of real income".


                    • #11
                      Andrew Musau Many thanks, Andrew. Yes, that is what I meant. But, I think, instead of "xb," it is "res" as follows:
                      Code:
                       predict outcomehat, res
                      I think those residuals are the conditional mean, as if we are obtaining the mean after taking away the variation explained by the controls. The rationale is that we account for these controls in the difference-in-differences specification. I hope that I am not mistaken.


                      • #12
                        I think those residuals are the conditional mean
                        I disagree. Residuals are not the conditional means of the outcome. As you state, they are defined as \(e_{i} = y_{i} - \widehat{y}_{i}\) for \(i = 1, \cdots, N\), where \(y_{i}\) is the \(i\)-th outcome and \(\widehat{y}_{i}\) is its fitted value, i.e., the estimated conditional mean.
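
                        The distinction can be checked directly in Stata. The sketch below assumes the variable names from the sample data in post #1 (yhat and ehat are names chosen here for illustration) and stores the fitted values and the residuals side by side:

                        ```stata
                        regress outcome control1 control2
                        predict double yhat, xb           // fitted values: estimated conditional mean
                        predict double ehat, residuals    // residuals: outcome minus fitted values

                        * The two pieces add back up to the outcome
                        assert abs(outcome - (yhat + ehat)) < 1e-6 if e(sample)

                        * With a constant in the model, OLS residuals average to zero
                        summarize ehat
                        ```

                        Because the residuals average to zero over the estimation sample, any difference between groups or years in their means reflects variation in the outcome that the controls do not explain.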


                        • #13
                          amira elshal: you are providing information in bits and pieces, which does not make life easier for the people wanting to help you. On the basis of the header of the figure you posted, I found the paper.
                          The paper describes in detail what the authors do. Moreover, if you go to the website of the paper, https://www.aeaweb.org/articles?id=10.1257/pol.5.2.1, there is a link to the dataset and to the do-files.

                          Andrew Musau: I agree with what you say in post #12, but if you look at the do-file for Figure 3, it reads:

                          Code:
                           * Figure 3 - Conditional mean of real income ;
                          xi3: reg y_rel schooling age  isfemale electricity water  if  ocu500==1  & codpers==1 [pweight=factor] ;
                          capture drop detrend ;
                          predict detrend, resid ;
                          table year d2 [pw=factor] , c(mean detrend) ;
                          Sorry, I can't get the lines above to wrap properly, so I have inserted semicolons to separate the lines of code.
                          I don't know what the authors want really because in a footnote to the paper they say "The mean is conditional on schooling, age and gender of the household head, and access to piped water and electricity." I won't be coming back here soon.

                          On Edit, it wrapped !!!
                          Last edited by Eric de Souza; 21 May 2021, 08:12.


                          • #14
                            Eric de Souza Thanks for your help, much appreciated. I have downloaded the data and do files and will go thoroughly through them.


                            • #15
                              Thank you, Eric de Souza, for the additional information. So what the authors are doing here is detrending the variable "y_rel" and referring to the result as the conditional mean of real income. You are correct, amira elshal, if you are following exactly what the authors do.
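
                              Applied to the variables in post #1, the authors' approach might look like the sketch below. It assumes the sample-data names (outcome, control1, control2, treated), leaves out the survey weights and sample restrictions used in the paper's do-file, and borrows the name detrend from that file:

                              ```stata
                              * Residualise the outcome on the controls
                              regress outcome control1 control2
                              capture drop detrend
                              predict detrend, residuals

                              * Mean residual by group and year, then overlay the two series
                              preserve
                              collapse (mean) detrend, by(treated year)
                              list treated year detrend, sepby(treated)
                              xtset treated year
                              xtline detrend, overlay title("Outcome net of controls")
                              restore
                              ```

                              Roughly parallel lines before treatment would be the visual evidence for parallel trends conditional on the two controls.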
