I have two regressions that use the same experimental panel data, collected at the subject-period level for three periods. The first specification is a restricted version of the second. Subjects are randomly assigned at two levels. The first level is one of four "thresholds", call them 100, 200, 300, and 400. The second level is one of three "discounts", call them 0, 50, and 75. The outcome of interest is measured in gallons.
The first regression includes dummy variables for each treatment group. The second one includes interactions:
The number of subjects is in the hundreds of thousands and clustering occurs at the subject level.
Code:
g absorb_constant = 1

* Model 1: simple model
reghdfe gallons thresh_100_D thresh_200_D thresh_300_D thresh_400_D disc_0_D disc_50_D disc_75_D, ///
    absorb(absorb_constant) vce(cluster subject)

* Model 2: interaction model
reghdfe gallons h_100_0D h_100_50D h_100_75D h_200_0D h_200_50D h_200_75D h_300_0D h_300_50D h_300_75D h_400_0D h_400_50D h_400_75D, ///
    absorb(absorb_constant) vce(cluster subject)
SS_simple = sum of squared residuals for simple model
SS_full = sum of squared residuals for interaction model
df_simple = degrees of freedom from the simple model
df_full = degrees of freedom from the full model
s = df_simple - df_full
F = ((SS_simple - SS_full) / s) / (SS_full / df_full)
My understanding is that I can conclude the two models differ if F exceeds the (1 - a) quantile of the F(s, df_full) distribution, where a is the significance level.
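A minimal sketch of the calculation described above, written with Stata scalars. It is not a tested solution. It assumes the estimation command stores e(rss), e(N) and e(df_m), which is documented for regress; whether reghdfe stores the same names can be checked with ereturn list after each fit. The residual degrees of freedom are computed as N minus the number of estimated coefficients, because with clustered standard errors e(df_r) reflects the number of clusters and would be the same for both models. The scalar names and the 0.05 level are illustrative choices, and this is the classical F-test as defined above, with no adjustment for clustering.

```stata
* Sketch only. Assumes e(rss), e(N) and e(df_m) are stored by the
* estimation command (run -ereturn list- after each fit to confirm).

* ... fit the simple model here, then immediately save:
scalar SS_simple = e(rss)
* residual df = N - (model df) - 1 for the constant
scalar df_simple = e(N) - e(df_m) - 1

* ... fit the interaction model here, then immediately save:
scalar SS_full = e(rss)
scalar df_full = e(N) - e(df_m) - 1

* F statistic exactly as defined above
scalar s     = scalar(df_simple) - scalar(df_full)
scalar Fstat = ((scalar(SS_simple) - scalar(SS_full)) / scalar(s)) / (scalar(SS_full) / scalar(df_full))

* Upper-tail p-value, and the critical value for a = 0.05 (illustrative)
scalar pval  = Ftail(scalar(s), scalar(df_full), scalar(Fstat))
scalar Fcrit = invFtail(scalar(s), scalar(df_full), 0.05)

display "F = " scalar(Fstat) "   critical value = " scalar(Fcrit) "   p = " scalar(pval)
```

Ftail() returns the upper-tail probability of the F distribution and invFtail() returns the value with a given upper-tail probability, so the models would be judged different when Fstat exceeds Fcrit, or equivalently when pval is below 0.05.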
--
I would like to save these SS, df, and the relevant critical values when running the regressions, then compute the F-tests and p-values afterward. I have two questions:
1) How can I get the SS, df, and critical values from the regression output?
2) Is it possible to extract the p-value from the F-test specified above in Stata?
Please let me know if I can clarify anything at all and I would be happy to do so.