  • Trouble replicating paper—sample size and estimates don't match

    Hello everyone,

    I’m currently attempting to replicate a paper, but I’m struggling to match the authors' estimates. Despite following their methodology closely, I haven’t been able to get near their reported sample size, let alone the coefficients.

    Here’s what I’ve tried so far:
    1. I’ve run the regressions step by step, adding the authors' conditions one at a time (restricting the sample, dropping certain states or ages) as they describe in their paper.
    2. I’ve also looked into missing variables and followed the authors' exact description of how they constructed their dataset.
    Despite these efforts, my results still don’t match, and my sample size is consistently off by around 2,000 observations.

    I’d appreciate any advice on how to systematically diagnose where I might be going wrong. What strategies can I use to pinpoint the discrepancy in sample size first, and then move on to the estimates?

  • #2
    The systematic way to do this is to form a hypothesis about why your results might differ, then test it. For example, is it possible that some of the variables are spatially or temporally lagged? If so, the observations lost to lagging might explain the difference in sample size. If you run out of ideas, you could always reach out to the corresponding author and ask whether they know what might account for the gap.
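
    One way to make each hypothesis testable is to log how many observations survive every restriction and compare that log against the sample sizes the paper reports. The following is a minimal sketch in plain Python, not the authors' procedure: the field names (`state`, `age`, `wage`), the toy rows, and the three restrictions are hypothetical placeholders.

```python
def attrition_table(rows, steps):
    """Apply each restriction in order and record how many rows survive.

    rows  : list of dicts, one per observation
    steps : list of (label, keep) pairs, where keep(row) returns True
            if the row stays in the sample
    Returns a list of (label, n_remaining, n_dropped) tuples.
    """
    table = [("full sample", len(rows), 0)]
    current = list(rows)
    for label, keep in steps:
        kept = [r for r in current if keep(r)]
        table.append((label, len(kept), len(current) - len(kept)))
        current = kept
    return table


# Toy data: field names and values are hypothetical placeholders.
rows = [
    {"state": "CA", "age": 34, "wage": 21.5},
    {"state": "TX", "age": 17, "wage": 9.0},
    {"state": "AK", "age": 45, "wage": 30.0},
    {"state": "NY", "age": 52, "wage": None},
    {"state": "CA", "age": 29, "wage": 18.0},
]

# Hypothetical restrictions; replace them with the ones the paper describes.
steps = [
    ("drop excluded states", lambda r: r["state"] not in {"AK", "HI"}),
    ("keep ages 18-64", lambda r: 18 <= r["age"] <= 64),
    # Regression software usually drops rows with any missing model variable.
    ("non-missing model variables",
     lambda r: all(v is not None for v in r.values())),
]

table = attrition_table(rows, steps)
for label, n_kept, n_dropped in table:
    print(f"{label:<30} N = {n_kept:>3}  (dropped {n_dropped})")
```

    Reordering the steps, or toggling one at a time, shows which restriction moves the count closest to the reported N. The last step is worth checking separately, because rows with a missing value in any model variable are often dropped silently at estimation time.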

    Even if you can't get exactly the same results, you might check whether the regression coefficients lead to similar interpretations. If the directions of the coefficients tend to be the same, the p-values on the coefficients are similar, and the sizes of the coefficients are close to the originals, then you are probably at least on the right track. If only one or two coefficients differ substantively in interpretation, that might provide a clue as to what is going on. I would be really surprised if you were getting completely different results using the same variables and dataset.
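
    The comparison described above can be sketched as a small check on sign, significance, and size. This is an illustration only: the variable names and numbers are made up, and the thresholds (`alpha=0.05`, `rel_tol=0.25`) are arbitrary assumptions rather than an accepted standard.

```python
def compare_estimates(original, replication, alpha=0.05, rel_tol=0.25):
    """Compare two sets of estimates, variable by variable.

    original, replication : dicts mapping variable name -> (coefficient, p_value)
    alpha   : significance cutoff used to compare p-values (assumed)
    rel_tol : allowed relative difference in coefficient size (assumed)
    Returns a dict mapping variable name -> dict of boolean checks,
    or None if the variable is absent from the replication.
    """
    report = {}
    for name, (b_orig, p_orig) in original.items():
        if name not in replication:
            report[name] = None
            continue
        b_rep, p_rep = replication[name]
        report[name] = {
            "same_sign": (b_orig > 0) == (b_rep > 0),
            "same_significance": (p_orig < alpha) == (p_rep < alpha),
            "similar_size": abs(b_rep - b_orig) <= rel_tol * abs(b_orig),
        }
    return report


# Made-up numbers purely for illustration.
original = {
    "education": (0.082, 0.001),
    "experience": (0.031, 0.020),
    "union": (0.110, 0.040),
}
replication = {
    "education": (0.079, 0.002),
    "experience": (0.029, 0.030),
    "union": (-0.015, 0.600),
}

report = compare_estimates(original, replication)
# Variables that fail at least one check deserve a closer look.
flagged = [name for name, checks in report.items()
           if checks is None or not all(checks.values())]
print("variables to investigate:", flagged)
```

    A short list of flagged variables points at where the constructed dataset probably diverges from the authors', for example a recoded or differently lagged variable.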
