  • Jackknife coefficient estimates dropping one dummy at a time

    Dear All,

    my question here is probably methodological rather than Stata-technical.

    I want to estimate a quite simple regression model that boils down to this:

    Y = X + i.country

    where X is my variable of interest and country1... country20 are country dummies.

    To check robustness of my results and see whether any specific country is "driving" my estimates for X, my idea was to obtain jackknife estimates (first time for me!) for X, dropping one country at a time. That is, in Stata I write:

    jackknife coef_x = _b[x], eclass cluster(country): reg y x i.country

    which replicates the regression 20 times, dropping one cluster (country) at a time.

    Stata won't let me do this because by dropping one country at a time, in each replication the set of coefficients to be estimated is different. The precise error message I get is:

    "collinearity in replicate sample is not the same as the full sample, posting missing values"
    (after each replication)

    and

    "insufficient observations to compute jackknife standard errors
    no results will be saved
    r(2000);"
    (at the end of the command execution)

    So I understand that the "real" jackknifing does not really support this type of application, because it requires homogeneous sets of coefficients to be estimated across replication subsamples.

    What I could do is run my model by hand, dropping one country at a time, and present the 20 different estimates for X. But is this the fairest / most useful / concise thing to do here? There WILL be differences in the estimates of X across the 20 subsamples - but is this telling me something useful? Is there no way to deliver a "summary" measure of stability of my coefficient?

    I like the jackknife approach because it delivers ONE "robust" coefficient estimate - obtained (as far as I understand) by averaging the different replication estimates.

    Should I then average the 20 different estimates for X that I get by manually dropping one country at a time, and present that result as a "jackknife" estimate? And how do I obtain standard errors for it?

    Thank you in advance, anyone who has thoughts on this!!

    Zelda
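
    For reference, a sketch of the standard delete-one-group jackknife formulas that the averaging and standard-error questions above point to. It assumes G = 20 countries, \hat\theta for the full-sample coefficient on X, and \hat\theta_{(g)} for the coefficient estimated with country g left out; this notation is introduced here for illustration and is not from the post.

    ```latex
    \bar\theta = \frac{1}{G}\sum_{g=1}^{G}\hat\theta_{(g)}, \qquad
    \widehat{\mathrm{se}}_{\mathrm{jack}}
      = \sqrt{\frac{G-1}{G}\sum_{g=1}^{G}\bigl(\hat\theta_{(g)}-\bar\theta\bigr)^{2}}, \qquad
    \hat\theta_{\mathrm{bc}} = G\,\hat\theta-(G-1)\,\bar\theta
    ```

    Here \bar\theta is the plain average of the 20 leave-one-out estimates, the second expression is the jackknife standard error built from their spread, and \hat\theta_{\mathrm{bc}} is the usual jackknife bias-corrected estimate.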





  • #2
    You didn't get a quick answer. You'll increase your chances of a helpful answer by following the FAQ on asking questions - provide Stata code in code delimiters, readable Stata output, and sample data using dataex.

    I don't do jackknife, but as I understand it, it does standard errors by repeatedly dropping observations or clusters. What you are doing is quite different.

    You're worried that the observations on x for one country might be overly influencing the results. There are a couple of ways to look at this. You could look at influence diagnostics: you should see a cluster of influential observations if one country is dominating things. Alternatively, give up on the jackknife for now and set up a loop that runs the model dropping one country at a time, giving you 20 parameter and standard error estimates for x. I suspect you can then test for parameter equality across the multiple runs. Ideally, the parameters will be so similar that the test is not needed.

    I also wonder if you might be better off with xtreg instead of the country dummies. xtreg will automatically adjust the panel controls when you change the sample, which gets you out of any problems created by redundant dummies.
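
    The leave-one-country-out loop described above might look roughly like the following. This is a minimal sketch on simulated stand-in data: the data-generating values, the results file name loo_results, and the scalar names are all made up for illustration, and only y, x and country come from the original post. The last lines compute a delete-one-country jackknife standard error by hand from the 20 estimates.

    ```stata
    * Simulated stand-in data: 20 countries, 30 observations each (hypothetical values)
    clear
    set seed 12345
    set obs 600
    generate country = ceil(_n/30)
    generate x = rnormal() + country/10
    generate y = 1 + 0.5*x + country/5 + rnormal()

    * Full-sample estimate for reference
    regress y x i.country
    scalar b_full = _b[x]

    * Re-estimate once per country, leaving that country out each time
    tempname results
    postfile `results' dropped b se using loo_results, replace
    levelsof country, local(countries)
    foreach c of local countries {
        quietly regress y x i.country if country != `c'
        post `results' (`c') (_b[x]) (_se[x])
    }
    postclose `results'

    * Inspect the 20 leave-one-out estimates
    use loo_results, clear
    list dropped b se, clean noobs
    summarize b

    * Jackknife SE by hand: sqrt((G-1)/G * sum of squared deviations from the mean),
    * where the sum of squared deviations equals (G-1) * Var(b)
    scalar jk_se = sqrt((r(N) - 1)/r(N) * (r(N) - 1) * r(Var))
    display "full-sample b = " b_full "   mean of replicates = " r(mean) "   jackknife se = " jk_se
    ```

    Run it as a do-file so the local macros survive from line to line. A country whose omission moves b well outside the range of the others would stand out in the listing.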



    • #3
      Thank you for your answer, Phil!

      I know my question was not among the most "attractive" and "quick-to-reply" ones - as I said, I'm having more of a methodological doubt!

      I'll look into influence diagnostics - thanks for the heads up on that.

      So far, I am doing exactly what you suggest - a 20-times loop, getting 20 different (mostly similar) estimates. I was just wondering whether there was a better, more concise way to approach this - and I thought the jackknife philosophy would be suitable.

      Thanks again!

      Zelda
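
      On the "more concise way": one possibility, offered as a sketch rather than a tested recommendation, is to absorb the country effects instead of entering them as dummies. The coefficient vector then has the same columns in every replicate, so the collinearity complaint should not arise. The example below uses simulated stand-in data with hypothetical values, and the choice of areg is an assumption of this sketch, not something proposed in the thread.

      ```stata
      * Simulated stand-in data: 20 countries, 30 observations each (hypothetical values)
      clear
      set seed 12345
      set obs 600
      generate country = ceil(_n/30)
      generate x = rnormal() + country/10
      generate y = 1 + 0.5*x + country/5 + rnormal()

      * Country effects are absorbed, so e(b) holds only x and _cons in each replicate;
      * cluster(country) makes each replicate drop one whole country
      jackknife coef_x = _b[x], cluster(country): areg y x, absorb(country)
      ```

      As far as I know, the coefficient reported by the jackknife prefix is the full-sample estimate, and the 20 replicates feed only the standard error, so the output is not an average of the replicate estimates.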



