  • Difference in Differences Marginsplot


    Code:
    * Example generated by -dataex-. For more info, type help dataex
    clear
    input double(expm hhid) int intyear double(distance treatment1 post) long regionE
     1599413.375 1010140020171 2008 .71 0 0 3
       2120390.5 1010140020171 2010 .71 0 1 3
         1864100 1010140020171 2012 .71 0 1 3
    419666.65625 1010140020284 2008 .71 0 0 3
     1179057.125 1010140020284 2010 .71 0 1 3
    end
    label values regionE regionE
    label def regionE 3 "Dodoma", modify

    Hi All,

    expm is a continuous variable measured in currency units, intyear is the year, treatment1 is a binary indicator for distance at a certain threshold, and post marks observations after the intervention, which takes place in 2010. The regression I'm using is:

    reg expmr treatment1##post post2 i.intyear i.regionE

    Two questions:
    1. Does this regression look correct to control for time and region?

    2. If I wanted to make a margins plot for this how would I do it?

    Kind regards,

  • #2
    Your -reg- command includes a variable, post2, that does not exist in your example data, and is not mentioned in #1. Nor is there any obvious interpretation for what it might be. It makes no sense as a (failed) try at a quadratic term for post, because post is 0/1, so post2 = post, and it will just get dropped. If you eliminate post2, or if there is really such a variable and it makes sense to include it in the model, then this looks plausible. Note that with observational data you cannot "control" anything, you adjust for the influence of time and region. (Yes, I know people often say "control" in this context and I'm being pedantic.)
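
    The point about a squared 0/1 variable can be checked directly. This is a minimal sketch that assumes post2 was meant to be the square of post; post_sq is a hypothetical name used only for the check, and expm is taken as the outcome because expmr does not appear in the example data.
    Code:
    * Hypothetical check: squaring a 0/1 indicator reproduces the indicator
    generate post_sq = post^2
    assert post_sq == post
    * The model from #1 without post2 (the outcome name is an assumption)
    regress expm i.treatment1##i.post i.intyear i.regionE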

    You should think about whether you really want to represent intyear as a discrete variable, or whether treating it as a continuous time trend might be better. I'm not saying either way is right or wrong--that's a substantive question. I'm just saying don't include it discretely just out of habit or custom--think about what the real world data generating process is and model it as closely as you feasibly can.
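
    One way to weigh the two specifications is to fit both and look at them side by side. This is only a sketch: the stored-estimate names are made up for illustration, and expm is assumed to be the outcome.
    Code:
    * Year as a set of indicators, as in #1
    regress expm i.treatment1##i.post i.intyear i.regionE
    estimates store year_discrete
    * Year as a linear time trend
    regress expm i.treatment1##i.post c.intyear i.regionE
    estimates store year_linear
    * Coefficients and standard errors from both fits
    estimates table year_discrete year_linear, se
    If post is nothing more than an indicator for intyear >= 2010, it is collinear with the year indicators in the first model, and Stata will omit one of those terms.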

    It's not clear what kind of plot you hope to derive from this. Since treatment1 and post are both dichotomous variables, there isn't a lot going on for graphical display. You could do something like this:
    Code:
    margins treatment1#post
    marginsplot, xdimension(post)
    which will give you a line graph showing the change in expected value of expm from before to after the treatment was implemented in both groups. But it's not a very interesting graph and I don't see how it adds much to understanding.
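
    The numbers behind that graph can also be read off directly: the pre-to-post change in each group, and the difference between those two changes, which in a linear model is the interaction coefficient. The sketch below assumes expm is the outcome. It leaves out i.intyear on the assumption that post is just an indicator for intyear >= 2010, in which case the two are collinear and -margins- may report the results as not estimable.
    Code:
    * Simplified model (year indicators left out; see the assumption above)
    regress expm i.treatment1##i.post i.regionE
    * Pre-to-post change in expected expm within each treatment group
    margins treatment1, dydx(post)
    * Difference between those two changes (the difference-in-differences estimate)
    lincom 1.treatment1#1.post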

    Other unsolicited advice: why are you dichotomizing the distance variable? Dichotomizing a continuous variable is almost always a bad idea: at best it discards information. At worst it introduces noise, and the cutpoint for dichotomization can easily be manipulated to bias the results, too. Why not keep distance as a continuous variable? That would even lead to an interesting graph at the end, where you could plot the expected value of expm as a function of distance separately in the pre- and post-intervention eras.
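
    A rough sketch of that continuous-distance version follows. The grid of distance values in at() is purely hypothetical and should be replaced with values spanning the range actually observed; expm is assumed to be the outcome, and the covariates are kept minimal for brevity.
    Code:
    * Distance enters as a continuous variable interacted with post
    regress expm c.distance##i.post i.regionE
    * Expected expm over a (hypothetical) grid of distances, by pre/post era
    margins post, at(distance=(0(0.25)2))
    * One line per era, with distance on the horizontal axis
    marginsplot, xdimension(distance)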
