Firm fixed effects and Robust Standard Errors Clustered at the Country-Year Level

Daniela Fuji

Join Date: Jul 2017

Posts: 15
#16

15 Aug 2017, 11:17

I do indeed have firms located in different countries.
However, your #15 perfectly answers my questions and clears up my doubts. I can't thank you enough. Your help meant a great deal to me!
Comment
Verena Johnsson

Join Date: Aug 2017

Posts: 1
#17

17 Aug 2017, 01:13

Hi Daniela, Sergio, Andrew and Stata community,

I have a similar question for which I have not found a definite answer, despite reading up on it. I have a panel of firms that sell several products, so products are nested in firms. I want to run an FE regression at the product level, including product and year FE. I am struggling to decide on the appropriate way to cluster the standard errors. It makes a lot of sense to cluster SEs at the firm level (as products are nested in firms). Papers discussing this issue for cross-sectional data argue that clustering at the highest level is sufficient for a nested data structure, but I wonder whether this is true for a panel, too.
The errors "of products" can be correlated both within a firm and within a product over time. Have you already faced this problem and found a good answer?
Thank you!
Comment
lal mohan kumar

Join Date: May 2019

Posts: 265
#18

04 Aug 2020, 05:30

Dear Stata members,
I am learning about the clustering option on my own, and I am using the Stata forum for learning purposes. I have a doubt.

@Andrew Musau #9

Clustering at the country level is fine if you believe that the interdependence exists within countries. However, by clustering at the country-year level, you are constraining this interdependence to particular years: Observations of firms in China in 2015 are not independent but these observations are independent of those of China in 2016. This is a very strong and precarious assumption since the observations mostly belong to the same firms, i.e., you are ruling out temporal interdependence. My suggestion is to just cluster at the country level.

Sorry for reopening this relatively old thread. I was not sure whether any additional information is available elsewhere, so I thought I would ask my doubt here.
In the above quote it is written that by clustering at the country-year/firm level, you are constraining this interdependence to particular years. What does that mean? For instance, if I cluster by industry, does that mean that firms within an industry are interdependent (correlated), while there is no such interdependence across industries? Also, does this interdependence among firms assume that there is no relation between firms in an industry in different years? I am not sure whether I have put it correctly.
Also, in that case, is it not always required to cluster by industry/firm along with year?
Comment
Andrew Musau

Join Date: Oct 2014

Posts: 10015
#19

04 Aug 2020, 06:08

Clustering at the country level is fine if you believe that the interdependence exists within countries. However, by clustering at the country-year level, you are constraining this interdependence to particular years: Observations of firms in China in 2015 are not independent but these observations are independent of those of China in 2016.

So the data here consists of firms in different countries observed over some time period. A country-year defines observations of firms in a country and year, e.g., firms in China in 2005. If we cluster by country-year, we allow errors belonging to these firms in a year to be correlated, but not errors belonging to the same firms in different years. Abadie et al. (2017) have some new insights on when clustering is advised which I highly recommend.

https://economics.mit.edu/files/13927
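
To make the difference concrete, here is a minimal sketch on simulated data. Everything in it is an assumption for illustration: the variable names (firm, country, year, x, y, cy), the panel dimensions, and the data-generating process are made up and are not the data discussed in this thread. It assumes reghdfe and its dependency ftools are installed from SSC.

Code:

```stata
* Assumes: ssc install ftools, then ssc install reghdfe
clear
set seed 12345
set obs 200                          // 200 hypothetical firms
gen firm = _n
gen country = ceil(firm/20)          // 10 hypothetical countries, 20 firms each
expand 10                            // 10 yearly observations per firm
bysort firm: gen year = 2004 + _n    // years 2005-2014
gen x = rnormal()
gen y = 0.5*x + rnormal()

* One identifier per country-year pair: this is what a "country-year cluster" is
egen long cy = group(country year)

* Errors may be correlated within a country-year, but not across years
reghdfe y x, absorb(firm year) vce(cluster cy)

* Errors may be correlated within a country, including across years
reghdfe y x, absorb(firm year) vce(cluster country)
```

The point estimate on x is the same in both regressions; only the standard errors change, because the two commands differ only in which observations are allowed to have correlated errors.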

Last edited by Andrew Musau; 04 Aug 2020, 06:18.
Comment
lal mohan kumar

Join Date: May 2019

Posts: 265
#20

04 Aug 2020, 06:39

@Andrew Musau #19
Thank you, Andrew. Before I wrap up, I have a few doubts, and if you can help me here, it would augment my learning.

if we cluster by country-year, we allow errors belonging to these firms in a year to be correlated, but not errors belonging to the same firms in different years

For this, we must do as per your advice:

Code:

* To install, type: ssc install reghdfe
reghdfe depvar indepvar, absorb(company year) vce(cluster country#year)

Am I right?

In a post, I have read, "If you wanted to cluster by industry and year, you would need to create a variable which had a unique value for each industry-year pair. These standard errors would allow observations in the same industry/year to be correlated (i.e. different firms), but would assume that observations in the same industry, but different years, are assumed to be uncorrelated. To allow observations which share an industry or share a year to be correlated, you need to cluster by two dimensions (industry and year)".

The first part is exactly what you said, right?

Code:

* To install, type: ssc install reghdfe
reghdfe depvar indepvar, absorb(company year) vce(cluster industry#year)

But how do I account for the second part, allowing observations that share an industry or share a year to be correlated?
Extremely sorry to trouble you, but I am mired in this.
Comment
Andrew Musau

Join Date: Oct 2014

Posts: 10015
#21

04 Aug 2020, 06:50

Am I right?

In a post, I have read, "If you wanted to cluster by industry and year, you would need to create a variable which had a unique value for each industry-year pair. These standard errors would allow observations in the same industry/year to be correlated (i.e. different firms), but would assume that observations in the same industry, but different years, are assumed to be uncorrelated. To allow observations which share an industry or share a year to be correlated, you need to cluster by two dimensions (industry and year)".

Yes, I agree with the statement. reghdfe (SSC) now supports multi-way clustering (this was not the case at the time of the initial post in this thread).

#1: industry-year clusters

Code:

reghdfe depvar indepvar, absorb(absorbvars) vce(cluster industry#year)

#2: industry and year clusters

Code:

reghdfe depvar indepvar, absorb(absorbvars) vce(cluster industry year)
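
For anyone who wants to run the two variants side by side, here is a minimal sketch using the nlswork example dataset. The dataset and the variable choices are my assumptions for illustration only: ind_code stands in for industry and idcode (a person identifier) stands in for the firm. It assumes reghdfe and ftools are installed from SSC.

Code:

```stata
* Assumes: ssc install ftools, then ssc install reghdfe
webuse nlswork, clear

* #1: one cluster per industry-year pair
reghdfe ln_wage tenure, absorb(idcode year) vce(cluster ind_code#year)
estimates store industry_year

* #2: two-way clustering, by industry and by year
reghdfe ln_wage tenure, absorb(idcode year) vce(cluster ind_code year)
estimates store twoway

* Same coefficients, different standard errors
estimates table industry_year twoway, b(%9.4f) se(%9.4f) stats(N)
```

Note that this dataset has few years and few industries, so the number of clusters in #2 is small; the sketch is meant to show the syntax, not to recommend a clustering level.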
Comment
lal mohan kumar

Join Date: May 2019

Posts: 265
#22

04 Aug 2020, 06:57

Dear Andrew
In the case of code 1 it implies, errors belonging to these firms in an industry as well as in a year will be correlated, but not errors belonging to the same firms in different years. In the second code, errors will be correlated amongst firms in an industry as well as in a year (two correlations).
Am I right?
If yes, then I assume I have understood a bit.

Last edited by lal mohan kumar; 04 Aug 2020, 07:10.
Comment
Andrew Musau

Join Date: Oct 2014

Posts: 10015
#23

04 Aug 2020, 07:11

A simple way to think about it: a cluster is a group. Rule: you allow errors to be correlated for observations belonging to the same group, but not for observations belonging to different groups. So in summary:
1. Identify what constitutes a cluster (group)
2. Apply the rule

In the case of code 1 it implies, errors belonging to these firms in an industry as well as in a year will be correlated, but not errors belonging to the same firms in different years. In the second code, errors will be correlated amongst firms in an industry as well as in a year (two correlations).
Am I right?

Yes. Maybe improve the wording of the first part by using "firms in the same industry and year" and "firms in the same industry but different years". In the second part, you are simultaneously allowing errors to be correlated for firms in the same industry and for firms in the same year.
Comment
lal mohan kumar

Join Date: May 2019

Posts: 265
#24

04 Aug 2020, 07:26

Thanks Andrew, Thanks a lot
Comment
Ihab Man

Join Date: Jul 2020

Posts: 56
#25

23 Sep 2020, 14:28

Dear Andrew Musau,
I have seen your posts related to clustering, and I hope you can help me with my questions.
I have a panel data set of 500 companies from 11 countries over a regular period (2000-2010), with 9 explanatory variables (not dummies). My dependent variable is a dummy equal to 1 if the company issues securities and 0 otherwise, and I want to see which of the 9 explanatory variables motivate a company to issue. For example, if company 1 issues only in 2007, I set the dummy to 1 for that company in 2007 and to 0 in all other years; for companies that never issue, the dummy is 0 in all years. Is this procedure correct?
I used xtlogit, fe but it did not work and many observations were dropped. I then tried xtlogit, re, but I have seen that previous studies cluster the standard errors at the company level. Can you explain why they did that, and would it work in my case? If yes, how can I do that in the command, but without vce? And in this case, can I cluster at the year level?
Another question: what is the correct command in my case to winsorize all variables in the model at the 1st and 99th percentiles? And how can I separate the summary statistics into two groups (companies that issued and companies that did not)? Thank you so much in advance.
Best regards

Last edited by Ihab Man; 23 Sep 2020, 14:31.
Comment
Andrew Musau

Join Date: Oct 2014

Posts: 10015
#26

23 Sep 2020, 16:57

If interested, follow here:

https://www.statalist.org/forums/for...573997-cluster
Comment
