
  • #16
    I do indeed have firms located in different countries.
    However, your #15 answers my questions perfectly and clears up my doubts. I can't thank you enough. Your help meant a great deal to me!



    • #17
      Hi Daniela, Sergio, Andrew and stata-community,

      I have a similar question for which I have not found a definite answer despite reading up on it. I have a panel of firms that sell several products, so products are nested in firms. I want to run a fixed-effects regression at the product level with product and year fixed effects, and I am struggling to decide on the appropriate way to cluster the standard errors. It makes a lot of sense to cluster at the firm level, since products are nested in firms. Papers discussing this issue for cross-sectional data argue that clustering at the highest level is sufficient for a nested data structure, but I wonder whether this also holds for a panel.
      The errors of products can be correlated both within a firm and within a product over time. Have you already faced this problem and found a good answer?
      Thank you!



      • #18
        Dear Stata members
        I am learning about the clustering option myself and am using the Stata forum for learning purposes. I have a question.

        @Andrew Musau #9
        Clustering at the country level is fine if you believe that the interdependence exists within countries. However, by clustering at the country-year level, you are constraining this interdependence to particular years: Observations of firms in China in 2015 are not independent but these observations are independent to those of China in 2016. This is a very strong and precarious assumption since the observations mostly belong to the same firms i.e., you are ruling out temporal interdependence. My suggestion is to just cluster at the country level
        Sorry for reopening this relatively old thread. I am not sure whether any additional information is available, so I thought I would ask my question here.
        The quote above says that by clustering at the country-year/firm level, you are constraining this interdependence to particular years. What does that mean? For instance, if I cluster by industry, does that mean that firms within an industry are interdependent (correlated), while there is no such interdependence across industries? Also, does this interdependence among firms assume that there is no such relation between firms in an industry in different years? I am not sure whether I have put it correctly.
        Also, in that case, is it not always required to cluster by industry/firm along with year?



        • #19
          Clustering at the country level is fine if you believe that the interdependence exists within countries. However, by clustering at the country-year level, you are constraining this interdependence to particular years: Observations of firms in China in 2015 are not independent but these observations are independent to those of China in 2016
          So the data here consists of firms in different countries observed over some time period. A country-year defines observations of firms in a country and year, e.g., firms in China in 2005. If we cluster by country-year, we allow errors belonging to these firms in a year to be correlated, but not errors belonging to the same firms in different years. Abadie et al. (2017) have some new insights on when clustering is advised which I highly recommend.

          https://economics.mit.edu/files/13927
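
          To make the difference concrete, here is a minimal sketch on simulated data. The variable names (firm, country, year, country_year, x, y) and the panel dimensions are made up for illustration; they are not from any dataset discussed in this thread. It builds a country-year identifier with egen group() and compares the number of clusters under each choice.

          ```stata
          * Hypothetical panel: 200 firms, 20 countries, 5 years (all simulated)
          clear
          set seed 12345
          set obs 200
          gen firm = _n
          gen country = ceil(firm/10)          // 10 firms per country
          expand 5                             // 5 yearly observations per firm
          bysort firm: gen year = 2014 + _n    // years 2015-2019
          gen x = rnormal()
          gen y = 0.5*x + rnormal()

          * One identifier per country-year cell
          egen country_year = group(country year)

          * Count the clusters each choice implies
          quietly tabulate country
          assert r(r) == 20                    // 20 country clusters
          quietly tabulate country_year
          assert r(r) == 100                   // 20 countries x 5 years

          * Country clusters: errors may be correlated within a country across years
          regress y x, vce(cluster country)

          * Country-year clusters: errors for China 2015 and China 2016 are treated as independent
          regress y x, vce(cluster country_year)
          ```

          With country-year clusters, the five observations of one firm fall into five different clusters, which is why correlation over time within the firm is ruled out.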
          Last edited by Andrew Musau; 04 Aug 2020, 06:18.



          • #20
            @Andrew Musau #19
            Thank you, Andrew. Before I wrap up, I have a few questions, and your help here would add to my learning.

            if we cluster by country-year, we allow errors belonging to these firms in a year to be correlated, but not errors belonging to the same firms in different years
            For this, we must do the following, as per your advice:
            Code:
             
             * To install, type: ssc install reghdfe
             reghdfe depvar indepvar, absorb(company year) vce(cluster country#year)
            Am I right?

            In a post, I have read, "If you wanted to cluster by industry and year, you would need to create a variable which had a unique value for each industry-year pair. These standard errors would allow observations in the same industry/year to be correlated (i.e. different firms), but would assume that observations in the same industry, but different years, are assumed to be uncorrelated. To allow observations which share an industry or share a year to be correlated, you need to cluster by two dimensions (industry and year)".

            The first part is exactly what you said, right?

            Code:
             
             * To install, type: ssc install reghdfe
             reghdfe depvar indepvar, absorb(company year) vce(cluster industry#year)
            But how do I account for the second part, allowing observations that share an industry or share a year to be correlated?
            I am extremely sorry to trouble you, but I am stuck on this.



            • #21
              Am I right?

              In a post, I have read, "If you wanted to cluster by industry and year, you would need to create a variable which had a unique value for each industry-year pair. These standard errors would allow observations in the same industry/year to be correlated (i.e. different firms), but would assume that observations in the same industry, but different years, are assumed to be uncorrelated. To allow observations which share an industry or share a year to be correlated, you need to cluster by two dimensions (industry and year)".
              Yes, I agree with the statement. reghdfe (SSC) now supports multi-way clustering (this was not the case at the time of the initial post in this thread).

              #1: industry-year clusters

              Code:
              reghdfe depvar indepvar, absorb(absorbvars) vce(cluster industry#year)
              #2: industry and year clusters

              Code:
              reghdfe depvar indepvar, absorb(absorbvars) vce(cluster industry year)
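
              If it helps to see both variants run, here is a sketch on the nlswork example dataset that ships with Stata. The choice of ln_wage, tenure, idcode and ind_code is purely for illustration and stands in for depvar, indepvar, the panel identifier and industry. It assumes reghdfe and its dependency ftools are installed from SSC. The example data have only a handful of industries and years, so the two-way standard errors here are not to be trusted; the point is only the syntax.

              ```stata
              * Requires: ssc install ftools
              *           ssc install reghdfe
              webuse nlswork, clear

              * #1: one-way clustering on industry-year cells
              reghdfe ln_wage tenure, absorb(idcode year) vce(cluster ind_code#year)

              * #2: two-way clustering on industry and on year
              reghdfe ln_wage tenure, absorb(idcode year) vce(cluster ind_code year)
              ```

              The coefficient on tenure is the same in both runs; only the standard errors change, because the two commands differ in which observations are allowed to have correlated errors.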



              • #22
                Dear Andrew
                In the case of code 1 it implies, errors belonging to these firms in an industry as well as in a year will be correlated, but not errors belonging to the same firms in different years. In the second code, errors will be correlated amongst firms in an industry as well as in year(two correlation)
                Am I right?
                If yes, then I assume I have understood a bit
                Last edited by lal mohan kumar; 04 Aug 2020, 07:10.



                • #23
                  Simple way to think about it: a cluster is a group. Rule: You allow errors to be correlated for observations belonging to the same group but not observations belonging to different groups. So in summary:
                  1. Identify what constitutes a cluster (group)
                  2. Apply the rule

                  In the case of code 1 it implies, errors belonging to these firms in an industry as well as in a year will be correlated, but not errors belonging to the same firms in different years. In the second code, errors will be correlated amongst firms in an industry as well as in year(two correlation)
                  Am I right
                  Yes. Maybe improve the wording by using "firms in the same industry and year" and "firms in the same industry but different years" for the first part. In the second part, you are simultaneously allowing errors to be correlated for firms in the same industry and for firms in the same year.



                  • #24
                    Thanks Andrew, Thanks a lot



                    • #25
                      Dear Andrew Musau
                      I have seen your posts related to clustering, and I hope you can help me with my questions.
                      I have a panel data set of 500 companies from 11 countries over a regular period (2000-2010), with 9 explanatory variables (none are dummies). My dependent variable is a dummy equal to 1 if the company issues securities and 0 otherwise, and I want to see which of the 9 explanatory variables motivate a company to issue. For example, if company 1 issued only in 2007, I set the dummy to 1 for that company in 2007 and to 0 in all other years; for companies that never issued, all years are 0. Is this procedure correct?
                      I used xtlogit, fe, but it did not work and many observations were dropped. I then tried xtlogit, re, but I have seen that previous studies cluster the standard errors at the company level. Can you explain why they did that, and would it work in my case? If yes, how can I do that in the command but without vce? And in this case, can I cluster at the year level?
                      Another question: what is the correct command in my case to winsorize all variables in the model at the 1st and 99th percentiles? And how can I split the summary statistics into two groups (companies that issued and companies that did not)? Thank you so much in advance.
                      Best regards
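
                      For the last two points (winsorizing and grouped summary statistics), one possible sketch using only built-in commands is below. It runs on the auto example dataset, where price, mpg and weight stand in for the explanatory variables and foreign stands in for the issuer dummy; these stand-ins are assumptions for illustration, not the data described above.

                      ```stata
                      sysuse auto, clear

                      * Winsorize each variable at its 1st and 99th percentiles
                      foreach v of varlist price mpg weight {
                          quietly summarize `v', detail
                          local lo = r(p1)
                          local hi = r(p99)
                          replace `v' = `lo' if `v' < `lo' & !missing(`v')
                          replace `v' = `hi' if `v' > `hi' & !missing(`v')
                      }

                      * Summary statistics split by the 0/1 grouping variable
                      tabstat price mpg weight, by(foreign) statistics(n mean sd min max)
                      ```

                      The percentiles here are computed over the pooled sample; whether to winsorize by year or by country instead is a separate modelling choice.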

                      Last edited by Ihab Man; 23 Sep 2020, 14:31.



                      • #26
                        If interested, follow here:

                        https://www.statalist.org/forums/for...573997-cluster
