  • Fixed effect regression with time dummies, choosing reference year and years to drop

    Hi, I'm trying to run a fixed-effects regression with time dummies. I want to measure the impact of the Covid-19 pandemic, so I made a dummy for the years 2020 to 2022. To avoid collinearity I can't also include the dummies for those years, so I tried dropping them (along with 2019, which I want as the reference year). But dropping the pandemic years also makes it impossible to include the pandemic dummy. It's a weighted least squares regression with population weights, but that shouldn't matter here.

    This is the code:

    Code:
    * generate a dummy for each year
    tabulate year, generate(year_dummy)
    * drop 2019 to 2022
    drop if year >= 2019 & year <= 2022
    * dummy variable for the pandemic years, from 2020 to 2022
    gen pandemic = (year >= 2020 & year <= 2022)

    Code:
    xtreg Commuting Commutingdistance Population_growth Migration Real_income Real_house unemployment pandemic i.year [pweight=sqrt_pop_avg], fe vce(cluster numeric_id)
    When I run this, it appears to drop the years 2019 to 2022 and the pandemic dummy, as well as the first year, 2008. What I want is to leave out the year dummies for 2020 to 2022, use 2019 instead of 2008 as the reference year, and still include the pandemic dummy.

  • #2
    Code:
    ... pandemic ib2019.year o(2020/2022).year ...


    • #3
      Andrew Musau gives the solution in #2, but does not explain to O.P. why his original approach failed.

      -drop if year >= 2019 & year <= 2022- removes all the pandemic years' data from the data set. Consequently, there were no observations left to which the pandemic indicator ("dummy") could apply, so it ended up being omitted. Indeed, with this -drop- command you were left with a data set that is as uninformative as possible about pandemic effects, because all pandemic-related information had been purged.

      What you needed to do is block those pandemic years from having separate indicators. That is what the code in #2 does.
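
      Putting #1 and #2 together, a minimal sketch might look like the following. It assumes the rest of the command is exactly as posted in #1 and that the panel is declared with numeric_id as the panel variable (an assumption; no xtset line appears in the original post). Because the 2020 to 2022 indicators are omitted and 2019 is the base, the pandemic coefficient would be the average shift in those three years relative to 2019, conditional on the other regressors.

      Code:
      * keep all years in the data; do NOT drop 2019 to 2022
      * assumption: numeric_id identifies the panel unit
      xtset numeric_id year
      * pandemic indicator for 2020 to 2022
      gen pandemic = (year >= 2020 & year <= 2022)
      * 2019 as base year; no separate indicators for 2020 to 2022
      xtreg Commuting Commutingdistance Population_growth Migration Real_income Real_house unemployment pandemic ib2019.year o(2020/2022).year [pweight=sqrt_pop_avg], fe vce(cluster numeric_id)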


      • #4
        Originally posted by Andrew Musau
        Code:
        ... pandemic ib2019.year o(2020/2022).year ...

        Thank you, it worked.


        • #5
          Originally posted by Clyde Schechter
          Andrew Musau gives the solution in #2, but does not explain to O.P. why his original approach failed.

          -drop if year >= 2019 & year <= 2022- removes all the pandemic years' data from the data set. Consequently, there were no observations left to which the pandemic indicator ("dummy") could apply, so it ended up being omitted. Indeed, with this -drop- command you were left with a data set that is as uninformative as possible about pandemic effects, because all pandemic-related information had been purged.

          What you needed to do is block those pandemic years from having separate indicators. That is what the code in #2 does.

          Hi, am I right in assuming that if all entities in the panel experience an upward trend, time dummies can capture this trend, ensuring that any remaining variation is stationary? According to a Levin-Lin test, my dependent variable is not stationary, and neither are some of the explanatory variables.


          • #6
            That is correct with respect to the dependent variable. If independent variables are trending over time and time indicators are included, those variables and the indicators will be correlated with each other on that basis. But they may also be correlated with each other due to other shared variance, or even negatively correlated. So there is really no simple rule to say what the impact will be. All you can say is that the presence of the time indicators is likely to change the estimated effects of the other independent variables on the outcome, and the magnitude or direction of those changes could be pretty much anything.
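
            One way to check this in Stata, as a sketch only: -xtunitroot llc- has a demean option that subtracts the cross-sectional mean in each period before testing, which removes much the same common time effects that year indicators would absorb. This assumes the panel is xtset on numeric_id and year (numeric_id is taken from the cluster variable in #1) and is strongly balanced, which the Levin-Lin-Chu test requires.

            Code:
            * assumption: numeric_id identifies the panel unit
            xtset numeric_id year
            * test on the raw series
            xtunitroot llc Commuting
            * same test after removing period-specific cross-sectional means
            xtunitroot llc Commuting, demean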


            • #7
              Clyde Schechter, if there were trends in the independent variables, would you want to model those by interacting the independent variables with the time dummies?


              • #8
                If there were trends in the independent variables, would you want to model those by interacting the independent variables with the time dummies?
                Possibly. In general, I would. But it would depend on the size of the data set (both the number of observations and the number of variables affected by trend) and its ability to support the required number of interaction degrees of freedom. If it can't, I might look for a more "economical" way to express the time variation than indicators: say a simple linear trend, or perhaps linear and quadratic terms, or a spline, or a coarser-grained set of time indicators.
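
                As an illustration only, some of these options could be written as follows for a single regressor. Commuting, Real_income and numeric_id come from #1; t and period are hypothetical helper variables, the period cut points are arbitrary, and the weights and remaining covariates are left out for brevity. The panel is assumed to be xtset already, with years running from 2008 to 2022.

                Code:
                * one Real_income slope per year (uses the most degrees of freedom)
                xtreg Commuting c.Real_income#i.year i.year, fe vce(cluster numeric_id)
                * linear trend in the Real_income slope
                gen t = year - 2008
                xtreg Commuting c.Real_income c.Real_income#c.t i.year, fe vce(cluster numeric_id)
                * linear and quadratic trend in the slope
                xtreg Commuting c.Real_income c.Real_income#c.t c.Real_income#c.t#c.t i.year, fe vce(cluster numeric_id)
                * coarser periods: one slope per multi-year block
                egen period = cut(year), at(2008 2012 2016 2020 2023)
                xtreg Commuting c.Real_income#i.period i.year, fe vce(cluster numeric_id)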
