  • Panel data with shock

    Hi,

    I am using panel data consisting of individuals' employment records from 1998 to 2017. I am trying to exploit a shock that occurred in 2013.

    DV: labor_outcome (1 if a labor market movement occurs and 0 otherwise)
    Independent variable: years_beyond_2013 (1 for the years 2013 through 2017 and 0 for earlier years)
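
    For reference, a minimal sketch of how such an indicator could be generated, assuming the calendar-year variable is named year (as in the i.year term of the command in this post). The if qualifier leaves the indicator missing wherever year is missing.

    ```stata
    * Period indicator: 1 for 2013-2017, 0 for 1998-2012, missing if year is missing
    generate byte years_beyond_2013 = inrange(year, 2013, 2017) if !missing(year)

    * Cross-tabulate to confirm the coding year by year
    tabulate year years_beyond_2013, missing
    ```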

    When I run the following command, year 2017 gets dropped in addition to year 1998. I understand why 1998 gets dropped: with i.year, it becomes the base year. I am trying to figure out why 2017 gets dropped as well.

    xtreg labor_outcome years_beyond_2013 i.year, fe

    Is this the right specification for me? Also, if 2017 gets dropped because of collinearity between years_beyond_2013 and the year fixed effects, how can I choose which year gets dropped instead?

    Any help would be appreciated.

    Best,
    Sukhun

  • #2
    I am trying to figure out why the year 2017 gets dropped.
    So, think about it. One of the years from 2013 through 2017 must get dropped because if all of them are retained in the model you have the linear equation years_beyond_2013 = 2013.year + 2014.year + 2015.year + 2016.year + 2017.year. Stata will, by default, make it the final year that gets omitted. If you don't like that choice, you can override it. Let's say you want to omit 2015 instead:

    Code:
    xtreg labor_outcome i.years_beyond_2013 i(1999/2014).year 2016.year 2017.year, fe
    will do that.

    Or, perhaps somewhat simpler:
    Code:
    fvset base 2015 year
    xtreg labor_outcome years_beyond_2013 i.year, fe
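
    A quick way to check both variants, sketched under the assumption that years_beyond_2013 is coded 1 for 2013-2017 and 0 otherwise. The assert confirms that the period indicator is an exact function of year, which is the source of the collinearity. Note that making 2015 the base level brings the 1998 indicator back into the model, so the period indicator is still a linear combination of the constant and the 1998-2012 indicators; the output may therefore still flag one further term as omitted, and it is worth checking which one.

    ```stata
    * The period indicator is fully determined by year
    assert years_beyond_2013 == inrange(year, 2013, 2017) if !missing(year)

    * Variant 1: leave out 1998 and 2015 explicitly, keep every other year
    xtreg labor_outcome i.years_beyond_2013 i(1999/2014).year 2016.year 2017.year, fe

    * Variant 2: make 2015 the base level, then inspect which terms are marked (omitted)
    fvset base 2015 year
    xtreg labor_outcome i.years_beyond_2013 i.year, fe

    * Restore the default base level (lowest value) afterwards
    fvset base default year
    ```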





    • #3
      Sukhun:
      as an aside to Clyde's excellent advice, if the shock you mention affects all the panels, error might be correlated across panels, too (and this should be investigated).
      Kind regards,
      Carlo
      (StataNow 18.5)



      • #4
        Dear Clyde & Carlo, thanks for your helpful comments!

        On your comment about correlated errors, I am using the vce(cluster individual) option to take care of that, but do let me know if this doesn't resolve the issue.

        I wrote down the entire formula and see why year 2017 gets dropped and, yes, it is obvious why one year has to get dropped. That was very dumb of me.

        So the problem is that my years_beyond_2013 fails to capture the intended effect which is the effect of period 2013-2017 on labor market outcome in comparison to period 1998-2012.

        Model #1 Without years_beyond_2013
        xtreg labor_outcome i.year, fe vce(cluster individual)
        When I run it without years_beyond_2013, the year 2017 gets coefficient X.

        Model #2 With years_beyond_2013
        xtreg labor_outcome years_beyond_2013 i.year, fe vce(cluster individual)
        When I run it with years_beyond_2013, years_beyond_2013 gets exactly the coefficient X (the one on year 2017 in Model #1 above).

        A. Basically, Model #2 is showing the effect of year 2017, while my objective is to capture the overall effect of the period 2013-2017.
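
        The equality in A follows from the algebra: once 2017.year is omitted, years_beyond_2013 is the only regressor that distinguishes 2017 from the base year 1998, so it takes over the 2017 coefficient from Model #1, and the 2013-2016 year coefficients become differences from 2017. A rough sketch to check this numerically, using the variable names from the models above (the scalar names g2014 and g2017 are illustrative):

        ```stata
        * Model #1: year indicators only
        xtreg labor_outcome i.year, fe vce(cluster individual)
        scalar g2014 = _b[2014.year]
        scalar g2017 = _b[2017.year]

        * Model #2: period indicator added; 2017.year is omitted, as reported above
        xtreg labor_outcome years_beyond_2013 i.year, fe vce(cluster individual)

        * Both differences should be zero up to rounding error
        display _b[years_beyond_2013] - g2017
        display _b[2014.year] - (g2014 - g2017)
        ```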

        My overall goal is to examine the heterogeneous effect, by gender, of the shock that occurred in 2013.

        Model #3 Full model
        xtreg labor_outcome i.years_beyond_2013##i.female i.year, fe vce(cluster individual)

        This will drop female because it doesn't vary within a person. So the coefficient of interest is the interaction term.

        I think the model is not doing what I really want it to do. Is a dummy for a period incompatible with year fixed effects? Point A above may be showing why it isn't working, but I am not clear why that is the case. Is there a model that can capture the effect of the period 2013-2017 on labor market outcomes, compared with the period 1998-2012, while still including year fixed effects?

        Any suggestion would be greatly appreciated.



        • #5
          Is a dummy for a period incompatible with year fixed effects?
          Incompatible might be too strong a word, but, basically, yes. You can't have it both ways. The linear relationship among the period indicator and the individual year indicators implies that any model that incorporates all of these is unidentifiable: it is always possible to re-allocate outcome effects in many different ways among these variables. It also follows that whatever constraint you then apply to identify the model (and omitting a variable is just a simple way of constraining its coefficient to 0) has implications for the results you will get for the remaining variables. It is simply a mathematical impossibility to separate in a unique and meaningful way the contribution of the period from the contribution of the individual years. Or rather, there are infinitely many different, equally valid, ways to do so. Consequently, the results you get from any way of doing that are more a reflection of the particular methodological choice you made than they are a reflection of something in the real world.

          In some situations, considerations external to your data may justify a particular identifying constraint as being "true of the real world," in which case you could put greater faith in the results that arise from that particular analysis. But those situations are not common, and in the absence of such a consideration, you basically are forced to choose between year effects and period effects: you cannot have both.

          In a situation like yours, where you are estimating the effect of a single shock that begins simultaneously in all your units of analysis, and where the effect of that shock over the period is the focus of your research goals, the sensible and usual approach is to use the period indicator and omit the year variables. That said, if there are a small number of individual years in which something particularly salient happened, you could add those to the model. (For example, in situations like this it is common to add a 2008.year variable to the model to represent the effects of the global financial crisis.)
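
          As a sketch of what that could look like with the variable names used earlier in this thread (whether to include 2008.year is a judgment call, female is assumed to be coded 0/1, and the lincom line is just one way to read off the post-2013 effect for women):

          ```stata
          * Period indicator with individual fixed effects, no year indicators
          xtreg labor_outcome i.years_beyond_2013, fe vce(cluster individual)

          * Optionally add an indicator for one salient year
          xtreg labor_outcome i.years_beyond_2013 2008.year, fe vce(cluster individual)

          * Heterogeneity by gender: the female main effect is absorbed by the fixed effects
          xtreg labor_outcome i.years_beyond_2013##i.female 2008.year, fe vce(cluster individual)

          * Post-2013 effect for women = main effect + interaction
          lincom 1.years_beyond_2013 + 1.years_beyond_2013#1.female
          ```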



          • #6
            Thanks, Clyde. Your clarification helps. I was too naive in thinking that my specification was correct in the first place.

            It is not ideal, since these are panel data and I would love to use the fixed effects, but I guess that is not possible in this case.

            Thanks so much!



            • #7
              Well, you can still use the individual-level fixed effects. Only the year indicators are a problem.



              • #8
                Yes, of course! Thanks.
