  • Correct Regression Methodology for Panel Data Set

    Hi all,

    I currently have a panel data set of home-sharing listings in the Seattle area, observed monthly from 2014-2021 (84 total observations per property listing). I am looking at the effect of a tax on home-sharing that took effect in January 2018. The tax affected every listing in my data set at exactly the same time, so I am struggling with how to run the correct regression using only the pre-tax observations as a control group. Is it possible to use the data to look at the effect of the tax in this way, and if so, what Stata commands do you recommend I use?

  • #2
    Your issue is a design issue, not a statistical one.

    In my opinion, you want a control group that was never exposed to the intervention (i.e., houses/properties in other cities). Otherwise, this is a situation where there are no counterfactuals, only factuals. It would be different if houses/areas were treated at different times, but this isn't even that; literally all units are treated at the same time.

    To use a silly example, suppose we wanted to see how giving out ice cream affects the test performance of 8th graders. If everyone gets ice cream before the test, how can we see what the effect of ice cream is when every student was given ice cream at the same time? We'd have no control group, no comparison group, nothing to judge the students who got ice cream against. Your issue isn't a statistical one or even a Stata one, it's a conceptual one.

    Oh, and hey Millis. Welcome to Statalist!



    • #3
      Jared Greathouse is right; you have a design problem. I would put it in a slightly different perspective, however. What you have is a pre-post comparison design. You can contrast house-sharing rates before January 2018 with house-sharing rates after Jan 2018. To use Jared's example, you could give everybody ice cream before today's test and compare their scores to how they performed on a similar test last month with no ice cream.

      So this is a very weak design. It's better than nothing, but just barely. The problem is that many things will have changed as of Jan 2018, not just the tax you are interested in. It is possible that the rate of house sharing was going to change in 2018 just because of changes in culture, or the job market, or the weather, or something else (this is not my area, so you can probably think of more things than I can). With only the pre-post data you cannot distinguish these possibilities from the effect of the tax. To do that you need another group of houses that was unaffected by the tax but that experienced all those other things going on during 2014-2021. That's a control group. You might be able to find one by looking for a jurisdiction that did not adopt that tax in 2018 but is in other respects similar to the one with the tax. And you need to verify that it is, in fact, "in other respects similar" by showing that its house-sharing trends before 2018 match those of your taxed sample. This would give you a difference-in-differences design, which is a much stronger approach.
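To make the difference-in-differences logic concrete, here is a minimal sketch in Python (not Stata) on made-up numbers. Everything in it is an assumption for illustration: the counts, the existence of a comparable untaxed jurisdiction, and the function names. It only shows how subtracting the control group's change removes whatever shifted for everyone in 2018.

```python
from statistics import mean

# Hypothetical monthly listing counts; none of these numbers come from real data.
treated_pre = [100, 104, 108]   # taxed jurisdiction, before the tax
treated_post = [96, 98, 100]    # taxed jurisdiction, after the tax
control_pre = [80, 84, 88]      # untaxed comparison jurisdiction, same months
control_post = [90, 92, 94]


def pre_post_change(pre, post):
    """Simple pre-post contrast: mean after minus mean before."""
    return mean(post) - mean(pre)


def did_estimate(t_pre, t_post, c_pre, c_post):
    """2x2 difference-in-differences: treated change minus control change."""
    return pre_post_change(t_pre, t_post) - pre_post_change(c_pre, c_post)


naive = pre_post_change(treated_pre, treated_post)
did = did_estimate(treated_pre, treated_post, control_pre, control_post)

print("pre-post change in taxed group:", naive)
print("difference-in-differences:", did)
```

In these invented numbers the pre-post contrast alone understates the drop, because the comparison group was growing over the same period. In Stata, the same 2x2 comparison is commonly estimated as a regression with an interaction, for example `regress y i.treated##i.post, vce(cluster listing_id)`, where all variable names are hypothetical.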



      • #4
        The only design that I COULD offer from the top of my head (short of adding more data) is a series of single-group interrupted time series designs. Again, better than nothing, but just barely.
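For readers unfamiliar with a single-group interrupted time series, one common specification is a segmented regression with a level shift and a slope change at the intervention date. The sketch below is in Python (not Stata), fits that model by ordinary least squares on an invented, noise-free series, and is only an illustration: the series, the break point `t0`, and the function names are all hypothetical, and it does nothing about autocorrelation or the confounding discussed earlier in the thread.

```python
def solve(a, b):
    """Solve the linear system a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            f = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= f * m[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):
        s = sum(m[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (m[i][n] - s) / m[i][i]
    return x


def fit_its(y, t0):
    """OLS fit of y_t = b0 + b1*t + b2*post_t + b3*post_t*(t - t0).

    Returns [b0, b1, b2, b3]: baseline level, pre-period slope,
    level shift at t0, and change in slope after t0.
    """
    rows = []
    for t in range(len(y)):
        post = 1.0 if t >= t0 else 0.0
        rows.append([1.0, float(t), post, post * (t - t0)])
    k = 4
    # Normal equations: (X'X) b = X'y
    xtx = [[sum(r[i] * r[j] for r in rows) for j in range(k)] for i in range(k)]
    xty = [sum(r[i] * yi for r, yi in zip(rows, y)) for i in range(k)]
    return solve(xtx, xty)


# Hypothetical series: 24 months, intervention at month 12,
# built with a level drop of 5 and a slope change of -0.2.
t0 = 12
y = [50 + 0.5 * t + ((-5 - 0.2 * (t - t0)) if t >= t0 else 0.0) for t in range(24)]

b0, b1, level_shift, slope_change = fit_its(y, t0)
print(round(level_shift, 6), round(slope_change, 6))
```

With one such regression per listing, `level_shift` and `slope_change` are the coefficients one would carry forward. In Stata the analogous fit would be something like `regress y t post t_since`, with `t_since` generated as `post*(t - t0)`; those variable names are hypothetical.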

        Honestly, if this were my problem and getting other data simply weren't possible, I'd likely calculate partial correlations or some similar effect size from the relevant regression coefficients of each single-group interrupted time series. Here, we'd be treating each regression as if it were a separate "study". I'd likely throw in a little meta-analysis to impress the reviewers... but even this would have its own issues, and I'd need to write complex code, and I suspect you don't want to do that.

        So, with that aside (and I imagine there are other approaches, this is just what I thought of in 5 seconds), you'll just need more data.



        • #5
          This is what I assumed would be the issue, but some of my peers convinced me otherwise and had me confused. Thank you for your help and quick responses!
