Why do I get different results for reg y a i.b a#i.b than for reg y a##i.b?

Jip Claassens

Join Date: May 2016

Posts: 7
#1

24 Oct 2024, 09:01

Hi there,

I have set this up for a difference-in-differences analysis of housing transactions, and I would like to test the parallel trends assumption. Among other things, I would like to plot the year coefficients, so I need a coefficient for each year in both the treated and the untreated areas. Ideally, the year*treatment coefficients would be non-significant in the pre-treatment years and significant in the post-treatment years.

I thought I could obtain this by regressing:

Code:

reg lnprice lnsize treated b2017.trans_year treated#b2017.trans_year

From this, I get some nice coefficients:

However, when I later regressed it with a double ##, I got different results; why?

Code:

reg lnprice lnsize treated##b2017.trans_year

I was under the impression that these two syntaxes were equivalent. Are they not?

Could someone shed some light on this and help me out of my confusion? Does it have something to do with the reference category?

Many thanks!
Fernando Furquim

Join Date: May 2014

Posts: 20
#2

24 Oct 2024, 10:13

They are the same model; only the reference category has changed.
Carlo Lazzaro

Join Date: Apr 2014

Posts: 17708
#3

24 Oct 2024, 10:33

Jip:
You can add the -allbaselevels- option to better understand what's going on.
In addition, you may want to compare the fitted values (and the residuals) obtained from the two specifications, which are in fact the same.
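
A minimal sketch of that comparison, meant to be run as a do-file. The data are simulated stand-ins (the variable names follow post #1, but every value and coefficient in the data-generating step is made up), so only the mechanics carry over to the real data set:

```stata
* Simulated stand-in data: names follow the thread, values are invented
clear
set seed 12345
set obs 2000
generate byte treated   = runiform() < 0.5
generate int trans_year = 2015 + floor(5 * runiform())    // years 2015-2019
generate double lnsize  = rnormal(4.5, 0.3)
generate double lnprice = 8 + 0.7*lnsize + 0.1*treated              ///
    + 0.02*(trans_year - 2015) + 0.15*treated*(trans_year > 2017)   ///
    + rnormal(0, 0.2)

* Specification from post #1, with all base levels displayed
regress lnprice lnsize treated b2017.trans_year treated#b2017.trans_year, allbaselevels
predict double xb_single, xb

* Specification with the ## operator
regress lnprice lnsize treated##b2017.trans_year, allbaselevels
predict double xb_double, xb

* If both commands fit the same model, this difference is zero up to rounding
generate double diff = abs(xb_single - xb_double)
summarize diff
```

If the maximum of diff is essentially zero, the two commands describe the same fit and differ only in how the coefficients are parameterized.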

Kind regards,
Carlo
(Stata 19.0)
Leonardo Guizzetti

Join Date: Jul 2016

Posts: 2402
#4

24 Oct 2024, 10:46

The question in the title is different from the body of the post. The syntax is explained in -help fvvarlist-

One hash mark (#) specifies only the multiplicative interaction term. Two hash marks (##) expand to the so-called main effects plus the interaction term.
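
As a rough illustration of the expansion, -fvexpand- can list the terms each operator generates. The variables y, a and b below are simulated placeholders named after the thread title, not real data:

```stata
* Placeholder data: a is binary, b has three levels, y is pure noise
clear
set seed 2024
set obs 200
generate byte a = runiform() < 0.5
generate byte b = 1 + floor(3 * runiform())
generate double y = rnormal()

* Single hash: interaction terms only
fvexpand i.a#i.b
display "`r(varlist)'"

* Double hash: main effects plus interaction terms
fvexpand i.a##i.b
display "`r(varlist)'"

* These two commands should therefore produce the same output
regress y i.a##i.b
regress y i.a i.b i.a#i.b
```

Note that both variables carry an explicit i. prefix here, so each is treated as a factor variable everywhere it appears.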
Jip Claassens

Join Date: May 2016

Posts: 7
#5

24 Oct 2024, 11:37

Thank you all for the swift replies!

The -allbaselevels- option is very useful; I hadn't stumbled upon that one yet.

However, I cannot wrap my head around this yet. How can all the untreated years be a reference? I was under the impression that they were omitted due to collinearity, but that does not seem to be the case.
Carlo Lazzaro

Join Date: Apr 2014

Posts: 17708
#6

24 Oct 2024, 14:32

Jip:
If that were the case, you would have seen warning messages about collinearity or lack of observations just above the output tables.
In addition, please familiarize yourself with CODE delimiters, which are the best way to share what you typed and what Stata gave you back (as per the FAQ). Thanks.

Kind regards,
Carlo
(Stata 19.0)
Leonardo Guizzetti

Join Date: Jul 2016

Posts: 2402
#7

24 Oct 2024, 17:52

You need to carefully consider how your variables are being set up when you specify them in your regression.

1) You told Stata to consider -treated- as a continuous variable. This is the default when you don't specify a prefix, as in the bare -treated- term (the 2nd coefficient in your table). In the interaction, however, you told Stata to consider -treated- as a factor variable. This is the default in interactions, though many users prefer to make that notation explicit using -i.treated-. Mixing the two is legal and valid syntax, but it is recommended to treat each variable the same way wherever it is used in a model, especially if you later wish to use -margins- or similar postestimation commands.

2) GLM-style factor variable coding is now standard. One level is always the reference level, and that level is excluded from the model because all other levels are constructed as comparisons with it. By default, Stata uses the lowest level as the reference. The -treated==0- level is therefore the reference: its coefficient is fixed at 0, so anything interacted with it will also be zero (because zero times anything is zero). This is why you see the first block of interaction coefficients labelled as "0 (base)". The interaction base level applies to all factor variables involved in interactions (that is, each one gets a reference, and all coefficients at any reference level are omitted).

3) Related to both remarks above, you have a different reference level for -trans_year- when it enters on its own versus what it should be in the interaction.

You probably want to specify in your model syntax

Code:

... i.treated##ib2017.trans_year ...
// or equivalently
... i.treated ib2017.trans_year i.treated#ib2017.trans_year ...
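
A sketch of how that specification might be used for the pre-trend check described in post #1, with the base year 2017 taken from that post. The data are simulated placeholders (names from the thread, values invented), and treating 2015 and 2016 as the pre-treatment years is purely an assumption of this example:

```stata
* Simulated stand-in data: names follow the thread, values are invented
clear
set seed 12345
set obs 2000
generate byte treated   = runiform() < 0.5
generate int trans_year = 2015 + floor(5 * runiform())    // years 2015-2019
generate double lnsize  = rnormal(4.5, 0.3)
generate double lnprice = 8 + 0.7*lnsize + 0.1*treated              ///
    + 0.02*(trans_year - 2015) + 0.15*treated*(trans_year > 2017)   ///
    + rnormal(0, 0.2)

* -treated- is a factor variable throughout; 2017 is the reference year
regress lnprice lnsize i.treated##ib2017.trans_year, allbaselevels

* Joint test that the (assumed) pre-treatment interactions are zero
test 1.treated#2015.trans_year 1.treated#2016.trans_year
```

Each 1.treated#year coefficient is then the treated-versus-untreated gap in that year relative to the gap in 2017, which is the quantity usually plotted in an event-study graph.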