I recently switched from Stata to Python for one project.
However, I found that Stata's areg command was much faster than simply adding dummies to a regression in Python. After a bit of research, I found that areg does not actually create dummies; it subtracts group means instead (which is why it stays fast even with many groups). You can test this with the following code:
sysuse auto, clear
drop if missing(rep78)
foreach var of varlist price weight length foreign {
    bys rep78: egen group_mean = mean(`var')
    qui sum `var'
    gen double `var'_star = `var' - group_mean + r(mean)
    drop group_mean
}
reg price_star weight_star length_star foreign_star
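It gives the same coefficient estimates (the standard errors differ, because reg does not adjust the degrees of freedom for the absorbed groups) as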
areg price weight length foreign , absorb(rep78)
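For the Python side, here is a minimal sketch of the same check. It assumes NumPy and uses simulated data rather than the auto dataset; the names `within`, `x`, `d` and `group` are made up for illustration. It compares a regression with one dummy per group against a regression on the group-demeaned variables, and also looks at how many distinct values the transformed binary variable takes.

```python
import numpy as np

# Simulated data (hypothetical, not the auto dataset)
rng = np.random.default_rng(0)
n, n_groups = 200, 5
group = rng.integers(0, n_groups, size=n)
x = rng.normal(size=n) + 0.5 * group                     # continuous regressor
d = (rng.random(n) < 0.3 + 0.1 * group).astype(float)    # binary regressor
y = 1.0 + 2.0 * x - 3.0 * d + 0.7 * group + rng.normal(size=n)


def within(v, g):
    """Subtract each group's mean from v, then add back the overall mean."""
    group_means = np.bincount(g, weights=v) / np.bincount(g)
    return v - group_means[g] + v.mean()


# Approach 1: one dummy per group (no separate constant, to avoid collinearity)
dummies = (group[:, None] == np.arange(n_groups)).astype(float)
X_dummy = np.column_stack([x, d, dummies])
beta_dummy = np.linalg.lstsq(X_dummy, y, rcond=None)[0][:2]

# Approach 2: transform y and the regressors, then regress with a constant
x_star, d_star, y_star = within(x, group), within(d, group), within(y, group)
X_within = np.column_stack([np.ones(n), x_star, d_star])
beta_within = np.linalg.lstsq(X_within, y_star, rcond=None)[0][1:]

print("dummy regression :", beta_dummy)
print("within regression:", beta_within)
print("same slopes      :", np.allclose(beta_dummy, beta_within))
print("distinct values of d_star:", np.unique(d_star).size)
```

In this sketch the two slope vectors agree up to floating-point error, even though `d_star` is no longer a 0/1 variable. Only the point estimates are compared here; standard errors are not computed.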
My question is:
When I manually transform dummy variables, they are no longer binary (compare foreign with foreign_star).
Can I then still interpret their coefficients as I am used to in a "normal" regression without transformation?
Thank you