  • Three dimensional panel data regression

    Good Morning everybody,

    I'm pretty new to Stata.

    As the title says, I have a dataset with country, industry sector and year dimensions (an i, j, t structure).

    The dependent variable is foreign direct investment (FDI) flows into 5 countries and 23 sectors, observed for 5 years each.

    Some of my independent variables, for example GDP growth, vary by country and year, while others, like the value added in each sector, vary by country, sector and year.

    There must be some way to analyse such data with a fixed effects regression, but I am unaware of it. It seems that you can specify just one id variable, but I have two.

    Can anyone help me?

    Thank you in advance for your time and availability.

    Alberto

  • #2
    Hello, Alberto!

    I am pretty new to Stata too, so I can't help you with your question.

    But I would like to give you a couple of suggestions for structuring your question better, so that more experienced members can help you. Please consider:
    - posting a sample of your data
    - running -describe- and posting the results
    - reading the forum FAQ - http://www.statalist.org/forums/help - it has lots of tips.

    Take care,
    Clarice

    • #3
      Thank you very much, Clarice.
      I'll try to create a simple sample of the problem so that the question is clearer.
      Cheers
      Alberto

      • #4
        Hi Alberto,
        There are many options for modeling such data with panel fixed effects.
        Unless your analysis places strict restrictions on how the industry and country fixed effects are combined, one option is to "combine" the information in both and set up your panel as follows:
        Code:
        egen cn_ind = group(country industry)
        sort cn_ind year
        xtset cn_ind year
        and then estimate your model with standard commands (here, a model with time fixed effects and country-industry fixed effects):
        Code:
        xtreg y x i.year, fe
        If, on the contrary, you don't want to combine the country and industry fixed effects, you can use the following:
        Code:
        xtset country
        xtreg y x i.year i.industry, fe

        or use a user-written command like -gpreg- (use findit gpreg):
        Code:
        gpreg y x i.year, i(country) j(industry)
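
        A minimal sketch of the first two setups on simulated data may help to see what each one does. Everything here is an assumption made for illustration: the variable names fdi, gdpg and va are hypothetical, and the data are random draws shaped like Alberto's panel (5 countries, 23 sectors, 5 years), not real FDI figures.

        ```stata
        * Build a balanced 5 x 23 x 5 panel with made-up data
        clear
        set seed 12345
        set obs 5
        gen country = _n
        expand 23
        bysort country: gen industry = _n
        expand 5
        bysort country industry: gen year = 2000 + _n

        * gdpg varies by country and year only; va varies in all three dimensions
        gen gdpg = rnormal()
        bysort country year (industry): replace gdpg = gdpg[1]
        gen va = rnormal()
        gen fdi = 1 + 0.5*gdpg + 0.3*va + rnormal()

        * Option 1: one fixed effect per country-industry pair, plus year effects
        egen cn_ind = group(country industry)
        xtset cn_ind year
        xtreg fdi gdpg va i.year, fe

        * Option 2: country fixed effects absorbed by xtreg, industry and year as dummies
        xtset country
        xtreg fdi gdpg va i.year i.industry, fe
        ```

        Option 1 uses up one fixed effect per country-industry cell (115 here), while option 2 uses separate, additive country and industry effects, so the two are different models rather than two ways of fitting the same one.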

        HTH
        Fernando







        • #5
          Thank you very much, Fernando! They seem to work properly!

          Best Regards,

          Alberto

          • #6
            This is useful information. However, I would like to know whether there is any data mining tool (such as stepwise regression) for fixed and random effects modeling in Stata?
            Regards

            • #7
              Help..
              When I used
              xtreg Dep Ind i.Time i.Sector, fe
              I find that all the industry dummies are omitted due to collinearity. Please can you explain what this means?
              Regards

              • #8
                Hi Umar,
                There is not enough information in your post, so I'll have to guess at the details here.
                If sector (I'm guessing that is your industry variable) is being dropped, my guess is that your panel identifier perfectly nests the industry categories, so sector never varies within a panel. For example, the panel ID is the industry code at 4 digits, but the variable sector is the industry code at 1 digit.
                Regarding the other question, I'm not aware of general data mining tools for panel data. You would do better to start from a specific idea of what you want to do.
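
                The situation guessed at above can be reproduced with a short sketch on simulated data. The names fdi, va and cn_ind are hypothetical and the numbers are random; the only point is that a dummy set which is constant within each panel cannot be estimated alongside the panel fixed effects.

                ```stata
                * Balanced 5 x 23 x 5 panel with made-up data
                clear
                set seed 12345
                set obs 5
                gen country = _n
                expand 23
                bysort country: gen industry = _n
                expand 5
                bysort country industry: gen year = 2000 + _n
                gen va = rnormal()
                gen fdi = 1 + 0.3*va + rnormal()

                * The panel id is the country-industry pair, so industry never changes within a panel
                egen cn_ind = group(country industry)
                xtset cn_ind year

                * The industry dummies are flagged as omitted because of collinearity
                xtreg fdi va i.year i.industry, fe
                ```

                If the industry dummies need to be estimated, the panel identifier has to be something that does not nest them, for example country alone.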
                Best regards,
                Fernando

                • #9
                  Hola Alberto, hola Fernando,

                  I have a question about how the grouping works. It's my understanding that by using egen, group, you are effectively creating a panel for each combination of the two variables. Notice, then, that the actual effect that is being accounted for is the effect that the industry had in a given country. I say this because in doing that you will not be able to separate the effects of the country from the effects of the industry, and you are assuming that the effects of an industry are country specific, something that is not necessarily true. In reality you could have country specific effects, industry specific effects, and country-industry specific effects. In your estimation, if my understanding is correct, you would only be taking into account country-industry effects. You can simply use OLS including interaction of the two categorical variables, i.e.

                  Code:
                  regress respvar expvars i.country##i.industry
                  The only drawback of this estimation is the loss of degrees of freedom from including all the binary variables, but with a large dataset that shouldn't be a problem. It allows for testing whether the country-specific and industry-specific effects are jointly insignificant, which is what you seem to be assuming to start with.
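
                  As a rough sketch of how a joint test could be run after this regression, the lines below fit the interacted model on simulated data and then use testparm. The variable names fdi and va are hypothetical, the data are random, and the test shown is only one related choice: it checks whether the country-industry interaction terms are jointly zero, that is, whether additive country and industry effects would be enough.

                  ```stata
                  * Balanced 5 x 23 x 5 panel with made-up data
                  clear
                  set seed 12345
                  set obs 5
                  gen country = _n
                  expand 23
                  bysort country: gen industry = _n
                  expand 5
                  bysort country industry: gen year = 2000 + _n
                  gen va = rnormal()
                  gen fdi = 1 + 0.3*va + rnormal()

                  * OLS with country, industry and country-by-industry dummies
                  regress fdi va i.country##i.industry

                  * Joint F test that all interaction coefficients are zero
                  testparm i.country#i.industry
                  ```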

                  Best,
                  Alfonso Sanchez-Penalver

                  • #10
                    Hi Alfonso,
                    You are absolutely right. That is why I suggested an option for when one doesn't want to "combine" the country and industry effects, and referred to the -gpreg- command.
                    In the end, it really depends on what Alberto is trying to infer from his results, and on the capacity of the computer he is using.
                    I wonder, however, whether one could test the same thing you suggest with a test not on the fixed effects themselves but on the parameters of the other variables. The reason is that there are cases (linked employer-employee data, for example) where estimating the fixed effects directly is infeasible.
                    In those cases, my suggestion is to "absorb" the fixed effects from all the other variables, run OLS on the transformed data, and then correct the sigmas for degrees of freedom.
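
                    The absorb-then-correct idea can be sketched as follows, under some loud assumptions: the data are simulated, the names fdi, gdpg, va and cn_ind are hypothetical, and the panel is balanced, which is why the year dummies can be left undemeaned here. The built-in areg serves as the benchmark, and the manual version demeans by group, runs OLS, and rescales the standard error for the absorbed group means.

                    ```stata
                    * Balanced 5 x 23 x 5 panel with made-up data
                    clear
                    set seed 12345
                    set obs 5
                    gen country = _n
                    expand 23
                    bysort country: gen industry = _n
                    expand 5
                    bysort country industry: gen year = 2000 + _n
                    gen gdpg = rnormal()
                    bysort country year (industry): replace gdpg = gdpg[1]
                    gen va = rnormal()
                    gen fdi = 1 + 0.5*gdpg + 0.3*va + rnormal()
                    egen cn_ind = group(country industry)

                    * Benchmark: areg absorbs the country-industry effects directly
                    areg fdi gdpg va i.year, absorb(cn_ind)
                    scalar b_areg  = _b[va]
                    scalar se_areg = _se[va]

                    * Manual version: subtract group means, then run OLS on the transformed data
                    foreach v of varlist fdi gdpg va {
                        bysort cn_ind: egen double m_`v' = mean(`v')
                        gen double w_`v' = `v' - m_`v'
                    }
                    regress w_fdi w_gdpg w_va i.year

                    * OLS ignores the absorbed group means, so rescale the standard error
                    summarize cn_ind, meanonly
                    scalar G = r(max)
                    scalar se_fix = _se[w_va] * sqrt(e(df_r) / (e(N) - G - e(df_m)))

                    display "areg:   " b_areg " (" se_areg ")"
                    display "manual: " _b[w_va] " (" se_fix ")"
                    ```

                    In an unbalanced panel the year dummies would need to be demeaned as well, so this sketch should not be carried over unchanged.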
                    Fernando

                    • #11
                      Hi Alfonso,

                      Why don't you use the following code:
                      xtset country
                      xtreg respvar expvars i.year i.country##i.industry, fe r

                      I look forward to your reply!

                      Buyou
                      Last edited by ershibuyou; 16 Aug 2014, 12:32.

                      • #12
                        Hi Alfonso,

                        Why don't you use the following code:
                        xtset country
                        xtreg respvar expvars i.year i.country##i.industry, fe

                        Many thanks!

                        Buyou

                        • #13
                          The fixed effects estimator is equivalent to an OLS estimation that includes dummy variables for the categories of the group variable you set in xtset.

                          Thus your code
                          Code:
                          xtset country
                          xtreg respvar expvars i.year i.country##i.industry, fe
                          will drop the country dummies, because they are collinear with the country fixed effects that xtreg already removes. It gives the same results as my regression, except that you include year fixed effects and I did not. Likewise,

                          Code:
                          reg respvar expvars i.year i.country##i.industry
                          returns the same slopes for all the other variables, and it also reports the coefficients on the country dummies, which xtreg does not.
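
                          A quick way to check this equivalence is to run both commands on the same data and compare one slope. The sketch below does that on simulated data; the variable names fdi, gdpg and va are hypothetical stand-ins for respvar and expvars, and the numbers are random draws.

                          ```stata
                          * Balanced 5 x 23 x 5 panel with made-up data
                          clear
                          set seed 12345
                          set obs 5
                          gen country = _n
                          expand 23
                          bysort country: gen industry = _n
                          expand 5
                          bysort country industry: gen year = 2000 + _n
                          gen gdpg = rnormal()
                          bysort country year (industry): replace gdpg = gdpg[1]
                          gen va = rnormal()
                          gen fdi = 1 + 0.5*gdpg + 0.3*va + rnormal()

                          * Fixed effects version: the country dummies are omitted as collinear
                          xtset country
                          xtreg fdi gdpg va i.year i.country##i.industry, fe
                          scalar b_fe = _b[va]

                          * Dummy-variable OLS version: the country dummies are reported
                          regress fdi gdpg va i.year i.country##i.industry

                          * The two slopes on va should agree up to rounding
                          display "xtreg, fe: " b_fe "   regress: " _b[va]
                          ```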

                          Best,

                          Alfonso
                          Alfonso Sanchez-Penalver
