  • Difference in differences including controls

    I am currently working on the analysis for my master's thesis, which includes difference-in-differences (DD) estimation using panel data from the US.
    I am doing this to provide estimates comparable to those in a paper from Denmark, hence using the same model. However, I am struggling to understand how they write the model, and I also want to add controls for state and income in Stata.

    I apologize for not being able to paste equations in here, hopefully snipped pictures will do.

    The reference paper writes its model as follows:
    "...based on the following regression specification, which we run separately for the treatment and control groups:"
    [Image: eq1.PNG]

    "To be precise, the estimates in the richest specification in column (3) are based on the following DD regression:"
    [Image: eq2.PNG]

    However, I cannot figure out why the summation symbol is included or how I should implement it in Stata. I am also unsure how to add controls, and whether they should be placed inside a summation as well.

    I truly hope someone can help me out.


    Best,



  • #2
    My interpretation is that they are treating age as a discrete variable. For each possible value of age, a, I(age_it = a) is an indicator ("dummy") variable for that age, and the summation runs over the possible values of age. If there were only two age groups, say young and old, you would write this as b_young*young + b_old*old; the sigma notation just generalizes this to a large number of age categories. The first equation estimates Y as a function of age and time t. The second equation represents a model that adds an age#treatment interaction.
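
    To make the sigma notation concrete, here is a minimal Stata sketch on simulated data. All variable names (y, age, year, treat, id) are hypothetical, and the paper's exact specification from the images is not reproduced; the sketch only illustrates how factor-variable notation builds one indicator per age value so that nothing has to be summed by hand.

```stata
* Simulated panel purely for illustration; every name here is hypothetical
clear
set seed 12345
set obs 500
generate id = _n
expand 6                                          // 6 yearly rows per unit
bysort id: generate year = 2013 + _n              // years 2014-2019
generate treat = mod(id, 2)                       // arbitrary group assignment
generate age = 30 + mod(id, 10) + (year - 2014)   // integer-valued age
generate y = rnormal()                            // placeholder outcome

* i.age creates one indicator per distinct age value, i.e. the terms
* b_a * I(age_it = a) in the sum; one age level is omitted as the base.
* Run separately by group, as the quoted text describes:
regress y i.age i.year if treat == 0
regress y i.age i.year if treat == 1

* Pooled version with an age-by-treatment interaction (age#treatment)
regress y i.age##i.treat i.year, vce(cluster id)
```

    Clustering by id is a choice made for this sketch, not something stated in the thread.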

    • #3
      What is the intervention and what units (people, cities, states) are receiving said intervention?

      • #4
        Thank you very much, that helps me out a lot!

        In my model I would like to include income as a control, either as a continuous variable or divided into brackets. Would it be included the same way, with a sigma summing over all income values or income brackets?

        @Jared, to answer your question: the intervention is the TCJA reform introduced under Trump in 2017, and the units are American households. Whether a household is in the treatment group depends on its net wealth.

        • #5
          Also, time is included in the first equation but not in the second. Is that because time is automatically accounted for when comparing before and after the reform? And should income be included in the second equation the same way as age?

          • #6
            Sorry for the numerous questions appearing as I continue: could age and income also be included in X, the vector of controls? Or might this be problematic if they are outcomes of the treatment, which would make them bad controls?

            • #7
              Indeed they can. I'd imagine that age causes, to some degree, income to rise, though.

              Just so you know, you don't need to worry about the sexy Greek notation: regression does the summing, multiplying, and subtracting by itself. Equation 1 is unclear to me; it isn't how I would write it. Whether to include income as categorical or continuous may or may not be a matter of taste...
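
              As a rough illustration of the two ways of entering income, here is a minimal Stata sketch on simulated data. The names income, income_bracket, and y are hypothetical, and using quintiles as brackets is only an assumption for the example, not a recommendation from the thread.

```stata
* Simulated cross-section purely for illustration; names are hypothetical
clear
set seed 2022
set obs 1000
generate income = exp(rnormal(11, 0.5))   // hypothetical gross income
generate y = rnormal()                    // placeholder outcome

* Option 1: income enters linearly as a continuous control
regress y c.income

* Option 2: income in brackets (quintiles are an assumption of this sketch)
xtile income_bracket = income, nquantiles(5)
* i.income_bracket adds one indicator per bracket (base bracket omitted),
* which is the sigma-over-brackets form written out by Stata
regress y i.income_bracket
```

              With brackets, each coefficient is the difference in the outcome relative to the omitted bracket, so no functional form is imposed across brackets.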

              What outcome are you studying? The fact that you've described your question makes this so much easier now. The reason I ask is that you may want, or be interested in, a slightly more advanced research design than what the authors are doing here. And, depending on your dependent variable, you may just be able to make this work...


              EDIT: Age wouldn't be an outcome of the treatment or the effects of the treatment (imagine if we got older because taxes were cut😂😂😂😂😂😂😂😂🤣🤣🤣🤣🤣🤣🤣), but income certainly could be affected by the intervention to a certain degree.
              Last edited by Jared Greathouse; 18 Mar 2022, 06:42.

              • #8
                That is really helpful, Jared!
                And good to know. I think I would prefer to work with the notation I am used to. However, I have not worked with DD estimation before, which makes me question the way the model is built.

                I have two outcomes, so I'll be running the regression for each of them separately. The first is how the extensive margin of housing decisions (whether households buy more or fewer houses) is affected by reductions in the mortgage deduction under the TCJA reform. The second is the intensive margin of housing decisions (whether households that already own buy bigger/smaller or more expensive/cheaper housing).

                To be specific, the treatment group consists of households with wealth above 6% of $750,000 (that is, $45,000), since 6% is the average down payment rate required for a home loan and $750,000 is (after the reform) the maximum mortgage amount on which interest can be deducted.
                I hope that makes sense.

                It makes sense that age is not an outcome of treatment :D But I would say that disposable income would be, though not gross income?
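
                The treatment definition above can be sketched in Stata as follows, on simulated data. All variable names are hypothetical, wealth is held fixed within household for simplicity, and treating 2018 as the first post-reform year is an assumption of this sketch rather than something established in the thread.

```stata
* Simulated household panel purely for illustration; names are hypothetical
clear
set seed 42
set obs 400
generate id = _n
generate wealth = exp(rnormal(10.5, 1))   // hypothetical net wealth in USD
expand 6                                  // 6 yearly rows per household
bysort id: generate year = 2013 + _n      // years 2014-2019
generate y = rnormal()                    // placeholder housing outcome

* Treatment group: wealth above 6% of $750,000, i.e. above $45,000
* (in real data, handle missing wealth first: missing counts as large in Stata)
generate byte treat = wealth > 0.06 * 750000

* Post-reform indicator; the 2018 cutoff is an assumption of this sketch
generate byte post = year >= 2018

* i.treat##i.post expands to treat, post, and treat x post;
* the coefficient on 1.treat#1.post is the DD estimate
regress y i.treat##i.post, vce(cluster id)
```

                Controls such as i.age or income brackets could be appended to the regressor list in the same factor-variable style.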
