Difference: Country/year FE incorporated in one categorical variable using areg/reghdfe VS. country/year FE separately using reghdfe?

Caro Gunesch

Join Date: Jun 2018
Posts: 1


15 Jun 2018, 07:41

Hey Stata-community

I am currently analyzing the effect of moisture change (ADsm0_2moistu) on the rate of change of urbanization (ADurbfrac) at the district (afruid) level in India, following Henderson, Storeygard and Deichmann (2017), "Has climate change driven urbanization in Africa?" --> https://sites.google.com/site/adamstoreygard/ (the website provides the data set they used and their .do file for the regressions). I basically replicated the data set and code for India; the difference is that the FE in my case are at the state level instead of the country level.

The regression specification is as follows:
$u_{ijt} = \beta_1 w_{ijt} + \beta_2 X^{\prime}_{ij} + \beta_3 X^{\prime}_{ij} w_{ijt} + \alpha_{jt} +\epsilon_{ijt}$

i=district, j=country, t=time
u=annualized urbanization growth
w=annualized moisture growth
X=vector of time-invariant characteristics (e.g. lndiscst --> distance to coast)

In the paper, the authors use a single categorical variable, countryyear, that combines the country and year fixed effects, and they estimate the regression with areg.

Code:

areg ADurbfrac ADsm0_2moistu firsturbfrac lndiscst if abspctileADsm0_2moistu>6 & abspctileADurbfrac>6, absorb(countryyear) vce(cluster afruid)

Linear regression, absorbing indicators         Number of obs     =        717
                                                F(   3,    358)   =      48.46
                                                Prob > F          =     0.0000
                                                R-squared         =     0.3872
                                                Adj R-squared     =     0.3302
                                                Root MSE          =     0.0342

                                (Std. Err. adjusted for 359 clusters in afruid)
-------------------------------------------------------------------------------
              |               Robust
    ADurbfrac |      Coef.   Std. Err.      t    P>|t|     [95% Conf. Interval]
--------------+----------------------------------------------------------------
ADsm0_2moistu |    -.07611   .1801611    -0.42   0.673    -.4304171    .2781971
 firsturbfrac |  -.0488972   .0055254    -8.85   0.000    -.0597635   -.0380309
     lndiscst |   .0014311   .0018852     0.76   0.448    -.0022764    .0051386
        _cons |    .028879   .0120392     2.40   0.017     .0052024    .0525555
--------------+----------------------------------------------------------------
  countryyear |   absorbed                                     (59 categories)

Using reghdfe instead gives the same results (it just suppresses _cons):

Code:

reghdfe ADurbfrac ADsm0_2moistu firsturbfrac lndiscst if abspctileADsm0_2moistu>6 & abspctileADurbfrac>6, absorb(countryyear) vce(cluster afruid)

However, the results are different when I include country FE and year FE separately using reghdfe.

Code:

egen countryvar = group(iso3v10)
reghdfe ADurbfrac ADsm0_2moistu firsturbfrac lndiscst if abspctileADsm0_2moistu>6 & abspctileADurbfrac>6, absorb(countryvar year) vce(cluster afruid)

. reghdfe ADurbfrac ADsm0_2moistu firsturbfrac lndiscst if abspctileADsm0_2moistu>6 & abspctileADurbfrac>6, absorb(countryvar year ) vce(cluster afruid)
(converged in 18 iterations)

HDFE Linear regression                            Number of obs   =        717
Absorbing 2 HDFE groups                           F(   3,    358) =      47.51
Statistics robust to heteroskedasticity           Prob > F        =     0.0000
                                                  R-squared       =     0.3315
                                                  Adj R-squared   =     0.2748
                                                  Within R-sq.    =     0.0795
Number of clusters (afruid)  =        359         Root MSE        =     0.0356

                                (Std. Err. adjusted for 359 clusters in afruid)
-------------------------------------------------------------------------------
              |               Robust
    ADurbfrac |      Coef.   Std. Err.      t    P>|t|     [95% Conf. Interval]
--------------+----------------------------------------------------------------
ADsm0_2moistu |   .2010312   .1738053     1.16   0.248    -.1407765    .5428389
 firsturbfrac |  -.0497331   .0055398    -8.98   0.000    -.0606277   -.0388384
     lndiscst |   .0008949   .0018817     0.48   0.635    -.0028057    .0045955
-------------------------------------------------------------------------------

Absorbed degrees of freedom:
----------------------------------------------------------------+
  Absorbed FE | Num. Coefs.  =   Categories  -   Redundant      |
--------------+-------------------------------------------------|
   countryvar |          30              30              0      |
         year |          24              33              9      |
----------------------------------------------------------------+

So the difference is that here the fixed effects enter as two separate categorical variables (countryvar and year), whereas countryyear (as above) combines both into one.

My questions would be:

a) why are there different results although countryyear basically absorbs both country and year effects in one value?
b) which approach is more appropriate, when I want to incorporate fixed effects on state level and for the years?

c) and a bit off the topic: How can I interpret a coefficient of ADsm0_2moistu (which is the annualized change of moisture) ? One unit more of ADsm0_2moistu means what exactly in terms of the annualized change rate of urbanization?

I hope my issue is illustrated clearly. I'll be super happy to get some answers!

Best, Carolin

Last edited by Caro Gunesch; 15 Jun 2018, 07:45.


Andrew Musau

Join Date: Oct 2014

Posts: 10062
#2

17 Jun 2018, 11:39

a) why are there different results although countryyear basically absorbs both country and year effects in one value?

$$u_{ijt} = \beta_1 w_{ijt} + \beta_2 X^{\prime}_{ij} + \beta_3 X^{\prime}_{ij} w_{ijt} + \alpha_{jt} +\epsilon_{ijt}$$

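So in your specification above, $\alpha_{jt}$ is a country-year fixed effect: a separate intercept for every country-year cell, which absorbs anything that varies only at the country-by-year level. It corresponds to interacting the country and year indicators (i.countryvar#i.year in factor-variable notation), not to multiplying two separate effects. Denote the country and year effects as $\eta_{j}$ and $\mu_{t}$ and you can write the country-year effect as

$$\alpha_{jt} = \eta_{j} + \mu_{t} + \gamma_{jt}$$

where $\gamma_{jt}$ is an unrestricted country-specific deviation in year $t$.

In your second specification, you are specifying the model

$$u_{ijt} = \beta_1 w_{ijt} + \beta_2 X^{\prime}_{ij} + \beta_3 X^{\prime}_{ij} w_{ijt} + \eta_{j} + \mu_{t} +\epsilon_{ijt}$$

Here, you are controlling for both time-invariant country effects and country-invariant time effects, which imposes $\gamma_{jt} = 0$: every country is assumed to experience the same year shocks. So country and year is not the same as country-year. The additive model is nested in the interacted one, and the coefficients differ whenever country-specific year shocks are correlated with the regressors. In areg, which absorbs only one categorical variable, the only way to include a second set of fixed effects is to use dummy variables, e.g.,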

Code:

areg ADurbfrac ADsm0_2moistu firsturbfrac lndiscst i.year if abspctileADsm0_2moistu>6 ///
    & abspctileADurbfrac>6, absorb(countryvar) vce(cluster afruid)
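
To see why the two sets of fixed effects can give different slopes, here is a minimal simulated sketch. It is not the original data: the variables country, year, cy, shock, x and y are all made up for illustration, and the data-generating process is an assumption chosen so that a country-specific year shock is correlated with the regressor.

```stata
* Minimal simulation; every variable name here is hypothetical.
* 30 countries x 10 years x 20 districts per country-year cell.
clear
set seed 12345
set obs 6000
gen country = ceil(_n/200)
gen year    = mod(ceil(_n/20) - 1, 10) + 1
egen cy     = group(country year)      // one id per country-year cell

* A country-specific year shock that is correlated with the regressor
bysort cy: gen shock = rnormal() if _n == 1
bysort cy (shock): replace shock = shock[1]
gen x = shock + rnormal()
gen y = 0.5*x + 2*shock + rnormal()    // true slope on x is 0.5

* (1) Interacted FE: one intercept per country-year cell absorbs the shock
areg y x, absorb(cy) vce(cluster country)

* (2) Additive FE: country intercepts plus year intercepts common to all countries
areg y x i.year, absorb(country) vce(cluster country)

* With reghdfe installed (ssc install reghdfe), (1) and (2) could be written as
*   reghdfe y x, absorb(country#year) vce(cluster country)
*   reghdfe y x, absorb(country year) vce(cluster country)
```

In (1) the slope on x should come out close to 0.5. In (2) most of the shock stays in the error term, so the slope should be pushed upward, away from 0.5. How large the gap is in real data depends on how strongly the country-specific year shocks are related to the regressors.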

b) which approach is more appropriate, when I want to incorporate fixed effects on state level and for the years?

This depends on your research question and identification strategy. I am sure the authors have justified their use of a country-year effect and if you are exactly replicating their research, probably their specification is correct unless you have reasons to doubt it.

c) and a bit off the topic: How can I interpret a coefficient of ADsm0_2moistu (which is the annualized change of moisture) ? One unit more of ADsm0_2moistu means what exactly in terms of the annualized change rate of urbanization?

You interpret fixed effects coefficients exactly as OLS coefficients. A one-unit increase in the independent variable increases (decreases) the dependent variable by XX units, holding all other variables constant and after controlling for country-year fixed effects (note country-year, not country and year). You need to know what units your independent variable and dependent variable are measured in. It could be, for example, that a 1% increase in moisture levels increases the rate of urbanization by 0.2% (if both are in percentages).
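
As a rough worked example, take the areg estimate from post #1 ($\hat{\beta}_1 = -0.07611$ with country-year FE) and assume, purely for illustration, that both variables are annual growth rates in decimal form (this is an assumption; check the variable definitions in the .do file). A 0.01 increase in annualized moisture growth then gives

$$\Delta \hat{u} = \hat{\beta}_1 \, \Delta w = -0.07611 \times 0.01 \approx -0.00076$$

That is, annualized urbanization growth would be about 0.00076 lower, although with p = 0.673 this estimate cannot be distinguished from zero. If the interaction term $\beta_3 X^{\prime}_{ij} w_{ijt}$ from the model is included in the regression, the effect of $w$ is $\beta_1 + \beta_3 X^{\prime}_{ij}$ and so depends on the district's characteristics.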