
  • Why do I get different results for reg y a i.b a#i.b than for reg y a##i.b?

    Hi there,

    I have set up a difference-in-differences analysis of housing transactions and would like to test the parallel trends assumption. Amongst other things, I would like to plot the year coefficients, so I need a coefficient for each year in the treated and untreated areas. Ideally, the year*treatment coefficients would be non-significant in the pre-treatment years and significant in the post-treatment years.

    I thought I could obtain this by regressing:

    Code:
    reg lnprice lnsize treated b2017.trans_year treated#b2017.trans_year
    From this, I get some nice coefficients:

    [Image: #.png]


    However, when I later regressed it with a double ##, I got different results; why?

    Code:
    reg lnprice lnsize treated##b2017.trans_year

    [Image: ##.png]


    I was under the impression that these two syntaxes were synonymous. Are they not?

    Is there someone that could shed some light on this? And help me out of my confusion? Does it have something to do with the reference category?

    Many thanks!




  • #2
    They are the same, it's just that the reference category has changed.



    • #3
      Jip:
      you can add the -allbaselevels- option to better understand what's going on.
      In addition, you may want to compare the fitted values (and the residuals) obtained from the different codes, which are the same, actually.
      Kind regards,
      Carlo
      (StataNow 18.5)
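A minimal sketch of the check suggested above, not run on the original housing data. It uses the nlsw88 dataset that ships with Stata as a stand-in: wage, tenure, union, and race are assumptions playing the roles of lnprice, lnsize, treated, and trans_year, and base level 2 for race is an arbitrary choice for illustration.

```stata
sysuse nlsw88, clear

* Specification 1: union enters once as continuous, once as a factor in the interaction
regress wage tenure union ib2.race union#ib2.race, allbaselevels
predict double yhat_single

* Specification 2: full factorial with explicit factor notation
regress wage tenure i.union##ib2.race, allbaselevels
predict double yhat_double

* If both specifications describe the same model, the fitted values should agree
* up to rounding, even though individual coefficients are labelled differently
generate double diff = abs(yhat_single - yhat_double)
summarize diff
```

If the maximum of diff is essentially zero, the two tables are different parameterizations of one model, and only the meaning of each reported coefficient has changed.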



      • #4
        The question in the title is different from the body of the post. The syntax is explained in -help fvvarlist-

        One hash mark (#) specifies only the multiplicative interaction term. Two hash marks (##) expand to the so-called main effects plus the interaction term.
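As a small illustration of that expansion, here is a sketch on the nlsw88 dataset that ships with Stata (the dataset and the variables wage, tenure, union, and race are stand-ins, not the poster's data). When both variables carry the i. prefix everywhere, the ## form and the written-out form should produce the same table.

```stata
sysuse nlsw88, clear

* Shorthand: ## requests both main effects and the interaction
regress wage tenure i.union##i.race
estimates store full

* Written out term by term with the same factor notation throughout
regress wage tenure i.union i.race i.union#i.race
estimates store manual

* Side-by-side coefficients for the two stored results
estimates table full manual
```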



        • #5
          Thank you all for the swift replies!

          The -allbaselevels- option is very useful, I hadn't stumbled upon that one yet.

          However, I cannot wrap my head around this yet. How can all the untreated years be a reference? I was under the impression that they were omitted due to collinearity, but that does not seem to be the case.

          [Image: # allbaselevels.png]





          • #6
            Jip:
            If that were the case, you should have noticed some warning messages just above the outcome tables about collinearity or lack of observations.
            In addition, please familiarize yourself with CODE delimiters, which are the best way to share what you typed and what Stata gave you back (as per the FAQ). Thanks.
            Kind regards,
            Carlo
            (StataNow 18.5)



            • #7
              You need to carefully consider how your variables are being set up when you specify them in your regression.

              1) You told Stata to consider -treated- as a continuous variable. This is the default when you don't specify a prefix, as in -treated- (2nd coefficient in your table). You also told Stata to consider -treated- as a factor variable in the interaction term. This is the default in interactions, though many users prefer to make that notation explicit using -i.treated-. Mixing the two is legal and valid syntax, but it is recommended to treat each variable in the same way wherever it's used in the same model, especially if you later wish to use -margins- or similar postestimation commands.

              2) It's become the standard now to use GLM-style factor variable coding. This means that one level is always a reference level, and that level is excluded from the model because all other levels are constructed to be comparisons back to this reference. Stata uses the lowest level by default to be the reference. As such, the -treated==0- level is the reference, and because of this, its coefficient is fixed at 0 and therefore anything interacted with it will also be zero (because zero times anything is zero). This is why you see the first block of interaction coefficients labelled as "0 (base)". The interaction base level applies to all factor variables involved in interactions (that is, each one gets a reference, and all coefficients at any reference level are omitted).

              3) Related to both remarks above, the reference level for -trans_year- as its own main effect differs from what it should be in the interaction.

              You probably want to specify in your model syntax

              Code:
              ... i.treated##ib2017.trans_year ... // or equivalently
              ... i.treated ib2017.trans_year i.treated#ib2017.trans_year
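A hedged sketch of why consistent factor notation helps with postestimation, again on the shipped nlsw88 dataset rather than the housing data (union stands in for treated, race for trans_year, and base level 2 is arbitrary). With i.union used throughout, a joint test of the interaction terms and the per-level effect of union can be requested directly.

```stata
sysuse nlsw88, clear

* Same factor notation for union everywhere, explicit base level for race
regress wage tenure i.union##ib2.race

* Joint test that all union-by-race interaction coefficients are zero
testparm i.union#i.race

* Estimated effect of union (1 vs 0) within each level of race
margins race, dydx(union)
```

In a difference-in-differences setting, the analogous margins call would report the treated-versus-untreated gap for each year, which is one possible input for a parallel-trends plot.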
