  • xtset command for panel data - what to do with several observations of the same firm ID and year

    Dear Stata community,

    I am currently analyzing a dataset which contains M&A deals and acquiror and target specific variables.
    Now I would like to test whether fixed effects, random effects, or pooled OLS is the preferred model.

    When I want to use the xtset command: xtset ID_acq Year, I receive the error message that there are "repeated time values within panel". This is because there are acquiring firms that have several acquisitions in one year.
    However, I need to keep all of those observations, as the target companies and the deals are different.

    As a result, I tried it with the following code:
    xtset Year
    xtreg Bid_premium E_Pillar_acq S_Pillar_acq G_Pillar_acq $CONTROLS_ACQ E_Pillar_tar S_Pillar_tar G_Pillar_tar $CONTROLS_TAR $DEALCONTROLS, fe
    estimates store fixed
    xtreg Bid_premium E_Pillar_acq S_Pillar_acq G_Pillar_acq $CONTROLS_ACQ E_Pillar_tar S_Pillar_tar G_Pillar_tar $CONTROLS_TAR $DEALCONTROLS, re
    estimates store random
    hausman fixed random

    In this example, the hausman test returns a p-value of 0.8326, which suggests that random effects is the preferred model.

    When I set xtset ID_acq instead of Year:
    xtset ID_acq
    and then test again, the hausman test returns a p-value of 0.0000, indicating that the fixed effects model is the preferred one.

    Thus my question:
    What do I have to do to find out the correct model?
    Do I have to specify both ID and Year in the xtset command, or is it also correct to specify just xtset Year or just xtset ID?

    If I have to use xtset ID Year together, how can I solve the problem of repeated time values within my panel? Checking the firms for which this is the case, I found only one firm that has 2 acquisitions in the same year. However, I don't want to delete the one observation that creates the problem, as my sample is already quite limited.
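
    A minimal sketch of how the repeated firm-year pairs could be located, assuming the variable names ID_acq and Year used above; dup_count is a hypothetical helper variable introduced only for illustration:

    ```stata
    * Report how many acquirer-year combinations occur more than once
    duplicates report ID_acq Year

    * Store, for each observation, the number of additional rows sharing its ID_acq and Year
    duplicates tag ID_acq Year, generate(dup_count)

    * List only the acquirer-years that are repeated
    list ID_acq Year if dup_count > 0, sepby(ID_acq)
    ```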

    If I don't need to combine ID and Year in xtset, which xtset command should I use and why? Only xtset Year or only xtset ID, given that the two lead to different models?

    Thank you for your helpful answers in advance.

    Best regards,
    Nils



  • #2
    https://www.statalist.org/forums/for...s-within-panel



    • #3
      Thank you for your fast response, Noah.
      However, this link doesn't really help me with my specific problem, as I do not have missing panel IDs.

      My first question is whether it is okay to only use xtset ID or xtset Year, and if yes, which of those two options.
      Secondly, I would like to know if there is any way to set xtset ID Year while keeping the two acquisitions of the same firm within the same year (as the deal specific data and the data for the target firms are different in these two observations).

      Furthermore, I dropped one of those acquisitions and was then able to set xtset ID Year. The hausman test in this case gave the same result as with xtset ID alone, namely a p-value of 0.0000, which indicates a fixed effects model. But I guess there is a smoother way than deleting an observation, testing random vs. fixed effects, and then continuing with the full dataset again...



      • #4
        Nils:
        if you have repeated time values..., you can -xtset- your dataset with -panelvar- only, provided that you do not plan to use time-series related commands, such as lags and leads.
        This fix still allows you to plug -i.year- as a predictor in the right-hand side of your regression equation.
        Conversely:
        Code:
        xtset Year
        is not correct.

        As an aside, if you have >30-50 panels, a cluster-robust standard error may be necessary.
        If that were the case, please note that -hausman- does not support it and you should switch to the community-contributed command -xtoverid- (which, in turn, does not support -fvvarlist- notation; see -xi:- as a workaround).
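
        As a rough sketch only, the two suggestions above might look as follows. It assumes the variable names and global macros from post #1, that -xtoverid- has already been installed (for example from SSC), and that clustering on ID_acq is the appropriate choice:

        ```stata
        * Declare only the panel variable; no time variable is needed
        * as long as no lags, leads, or other time-series operators are used
        xtset ID_acq

        * Year effects can still enter as indicators on the right-hand side
        xtreg Bid_premium E_Pillar_acq S_Pillar_acq G_Pillar_acq $CONTROLS_ACQ E_Pillar_tar S_Pillar_tar G_Pillar_tar $CONTROLS_TAR $DEALCONTROLS i.Year, fe

        * Cluster-robust alternative to -hausman-: fit -re- with clustered
        * standard errors, then run -xtoverid-; -xi:- replaces the i.Year
        * factor-variable notation that -xtoverid- does not accept
        * ssc install xtoverid    // assumed one-time installation
        xi: xtreg Bid_premium E_Pillar_acq S_Pillar_acq G_Pillar_acq $CONTROLS_ACQ E_Pillar_tar S_Pillar_tar G_Pillar_tar $CONTROLS_TAR $DEALCONTROLS i.Year, re vce(cluster ID_acq)
        xtoverid
        ```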
        Kind regards,
        Carlo
        (Stata 19.0)



        • #5
          Dear Carlo,

          thank you for your answer, but I am not sure I fully understood it.
          I have 165 observations in total in my dataset (from 2009-2019).
          These 165 observations cover 137 different acquiring firms, meaning that the same acquiring firm sometimes appears more than once in my dataset because it made another acquisition.
          Using the xtset ID_Year command, the hausman test indicates that the fixed effects model should be used (p-value = 0.0000).
          As I understand it, using xtset ID_year would then lead to 137 panels (one for every firm). Unfortunately, most panels would then consist of only one observation (sometimes maybe 2).

          Building on the results of the fixed effects model, I tried to test for time fixed effects using the command -testparm- and the code:
          xi: xtreg Bid_premium E_Pillar_acq S_Pillar_acq G_Pillar_acq $CONTROLS_ACQ E_Pillar_tar S_Pillar_tar G_Pillar_tar $CONTROLS_TAR $DEALCONTROLS i.Year, fe
          testparm _IYear*

          However, this produces the following output, which reports no standard errors or test statistics, and I am not sure how to proceed:

          xi: xtreg Bid_premium E_Pillar_acq S_Pillar_acq G_Pillar_acq $CONTROLS_ACQ E_Pillar_tar S_Pillar_tar G_Pillar_tar $CONTROLS_TAR $DEALCONTROLS i.Year, fe
          i.Year _IYear_2009-2019 (naturally coded; _IYear_2009 omitted)
          note: _IYear_2017 omitted because of collinearity.
          note: _IYear_2018 omitted because of collinearity.
          note: _IYear_2019 omitted because of collinearity.

          Fixed-effects (within) regression               Number of obs     =        165
          Group variable: ID_acq                          Number of groups  =        137

          R-squared:                                      Obs per group:
               Within  = 1.0000                                         min =          1
               Between = 0.0002                                         avg =        1.2
               Overall = 0.0003                                         max =          3

                                                          F(28,0)           =          .
          corr(u_i, Xb) = -0.9994                         Prob > F          =          .

          -------------------------------------------------------------------------------
          Bid_premium | Coefficient Std. err. t P>|t| [95% conf. interval]
          --------------+----------------------------------------------------------------
          E_Pillar_acq | -1.05556 . . . . .
          S_Pillar_acq | 7.91958 . . . . .
          G_Pillar_acq | -0.91781 . . . . .
          size_acq | -5.44e+02 . . . . .
          roe_acq | 0.14812 . . . . .
          mtb_acq | -41.74974 . . . . .
          lev_acq | 14.82379 . . . . .
          liquid_acq | 32.80790 . . . . .
          E_Pillar_tar | 6.34667 . . . . .
          S_Pillar_tar | -0.54939 . . . . .
          G_Pillar_tar | -5.65155 . . . . .
          size_tar | 104.39573 . . . . .
          roe_tar | -0.85024 . . . . .
          mtb_tar | -2.41580 . . . . .
          lev_tar | -7.36230 . . . . .
          liquid_tar | -7.07357 . . . . .
          Relative_Size | -2.94e+03 . . . . .
          Same_Industry | 93.34213 . . . . .
          Cross_Border | -2.81e+02 . . . . .
          Cash_financed | 50.67471 . . . . .
          Competition | 168.73893 . . . . .
          _IYear_2010 | -21.44162 . . . . .
          _IYear_2011 | 238.09637 . . . . .
          _IYear_2012 | -1.71e+02 . . . . .
          _IYear_2013 | -1.01e+03 . . . . .
          _IYear_2014 | -1.63e+02 . . . . .
          _IYear_2015 | -34.53379 . . . . .
          _IYear_2016 | -49.27261 . . . . .
          _IYear_2017 | 0.00000 (omitted)
          _IYear_2018 | 0.00000 (omitted)
          _IYear_2019 | 0.00000 (omitted)
          _cons | 3.08e+03 . . . . .
          --------------+----------------------------------------------------------------
          sigma_u | 729.35513
          sigma_e | .
          rho | . (fraction of variance due to u_i)
          -------------------------------------------------------------------------------
          F test that all u_i=0: F(136, 0) = . Prob > F = .

          . testparm _IYear*

          ( 1) _IYear_2010 = 0
          ( 2) _IYear_2011 = 0
          ( 3) _IYear_2012 = 0
          ( 4) _IYear_2013 = 0
          ( 5) _IYear_2014 = 0
          ( 6) _IYear_2015 = 0
          ( 7) _IYear_2016 = 0

          F( 7, 0) = .
          Prob > F = .

          If you could tell me what I did wrong, I would be very grateful.

          Best regards,
          Nils



          • #6
            Nils;
            what does
            Code:
             ID_Year
            refer to? Years? -panelid-? A result of the -group- function from -egen-?:
            Code:
            egen ID_Year=group(ID year)
            Kind regards,
            Carlo
            (Stata 19.0)



            • #7
              Oh sorry, this was a typing error. It's just xtset ID_acq (the acquiring firm's ID).
              I created this ID by using the code "egen ID_acq = group(ISIN_acq)"



              • #8
                Nils:
                I'm under the impression that your dataset has too little within-panel variation (some years are perfectly collinear with the firm fixed effect), which kills the -fe- estimator.
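
                The numbers in the output of post #5 point the same way: 165 observations, 137 groups, and 28 estimated slope coefficients leave no residual degrees of freedom, which matches the reported F(28,0). A small sketch of that arithmetic, plus one way to inspect how many deals each acquirer contributes (n_deals and first_obs are hypothetical helper variables):

                ```stata
                * Residual degrees of freedom of the within estimator:
                * observations - groups - estimated slope coefficients
                display 165 - 137 - 28

                * Number of deals per acquirer, counted once per firm
                bysort ID_acq: generate n_deals = _N
                egen first_obs = tag(ID_acq)
                tabulate n_deals if first_obs
                ```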
                Kind regards,
                Carlo
                (Stata 19.0)



                • #9
                  Hi Carlo,

                  so I understand that even though the hausman test points to the fe model, random effects might be the better model for me?
                  As I understand it, I use the fe model when I believe that my entities are influenced over time.
                  However, as most firms appear just once in my dataset, I have no within-panel variation, because my panel (the firm ID of the acquirer) consists of just one observation, or in very few cases two. Do I understand that correctly?

                  As a result, the random effects model (or OLS) might fit better, although the hausman test points to the fe model.
                  As I am mostly observing firms and their characteristics and how they influence my dependent variable, I also believe that using the random effects model makes more sense.

                  Would this be a valid argument for using the random effects model instead of the fe model?



                  • #10
                    Nils;
                    1) your interpretation of the reason why the -fe- estimator failed is correct;
                    2) however, it does not allow an automatic switch to the -re- estimator, which is not consistent (read: its coefficients are unreliable) if -fe- is the way to go;
                    3) I'd consider a pooled OLS, with -i.year- among the set of predictors.
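
                    A minimal sketch of point 3, assuming the variable names and global macros from post #1. Clustering the standard errors on ID_acq is an added assumption here, motivated by the acquirers that appear more than once:

                    ```stata
                    * Pooled OLS with year indicators; standard errors clustered by acquirer
                    regress Bid_premium E_Pillar_acq S_Pillar_acq G_Pillar_acq $CONTROLS_ACQ E_Pillar_tar S_Pillar_tar G_Pillar_tar $CONTROLS_TAR $DEALCONTROLS i.Year, vce(cluster ID_acq)

                    * Joint test that all year indicators are zero
                    testparm i.Year
                    ```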
                    Kind regards,
                    Carlo
                    (Stata 19.0)



                    • #11
                      Dear Carlo,
                      thank you a lot for your helpful advice and recommendations. I know much better how to proceed now.
                      Thank you and best regards,
                      Nils
