  • Should I Use Weights?

    Dear Statalist,

    I have a question about the design of my empirical work; there is one thing I am confused about.

    I have individual-level panel data from a household survey. To deal with some issues, I convert it into community-level panel data by averaging individuals' outcomes within each community.

    However, each observation (community) combines a different number of individuals (numppl), ranging from roughly 50 to 150.

    To account for this difference, I weight the regression in Stata by adding [aweight=numppl].

    However, the weighted OLS estimates are smaller in magnitude and less statistically significant than the unweighted ones.

    Given this type of data and research design, I am not sure whether weighting is necessary.

    I have read some articles and papers about weighting, but it was not clear to me how to apply them to my research.

    Or should I instead include numppl as a control variable in the community (village) level analysis?

    I would appreciate it if you could give me some advice on that. Thank you in advance.

    Last edited by Shisho Jakas; 20 Apr 2023, 08:57.

  • #2
    Weighting is a confusing topic in econometrics. I think it is generally discouraged, though it is done a lot.

    One way to think about the difference: unweighted, the coefficients measure effects across communities, with each community counted equally. Weighted by the number of people, the result is closer to a per-person effect.
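    To make the community-versus-person distinction concrete, here is a minimal sketch in Python (not Stata, chosen only so the arithmetic is easy to follow). The three communities, their outcomes, and the helper wls_slope are all made up for illustration. It assumes the regressor x varies only at the community level. Under that assumption, the size-weighted regression on community means gives the same slope as a regression on the pooled individual data, while the unweighted regression counts each community equally.

```python
# Hypothetical toy data: x is a community-level regressor, y holds the
# individual outcomes observed in that community.
communities = [
    {"x": 0.0, "y": [1.0, 2.0, 3.0]},
    {"x": 1.0, "y": [4.0, 6.0]},
    {"x": 2.0, "y": [2.0, 3.0, 4.0, 3.0, 3.0, 3.0]},  # largest community
]


def wls_slope(x, y, w):
    """Slope from a weighted least-squares fit of y on x with an intercept."""
    sw = sum(w)
    xbar = sum(wi * xi for wi, xi in zip(w, x)) / sw
    ybar = sum(wi * yi for wi, yi in zip(w, y)) / sw
    sxy = sum(wi * (xi - xbar) * (yi - ybar) for wi, xi, yi in zip(w, x, y))
    sxx = sum(wi * (xi - xbar) ** 2 for wi, xi in zip(w, x))
    return sxy / sxx


# Collapse to one row per community: mean outcome and head count.
x_c = [c["x"] for c in communities]
ybar_c = [sum(c["y"]) / len(c["y"]) for c in communities]
n_c = [len(c["y"]) for c in communities]

# Community means, every community counted equally.
slope_unweighted = wls_slope(x_c, ybar_c, [1.0] * len(communities))
# Community means, weighted by the number of people averaged.
slope_weighted = wls_slope(x_c, ybar_c, n_c)

# Pooled individual-level data, one row per person, no weights.
x_i = [c["x"] for c in communities for _ in c["y"]]
y_i = [yi for c in communities for yi in c["y"]]
slope_individual = wls_slope(x_i, y_i, [1.0] * len(y_i))

print(slope_unweighted, slope_weighted, slope_individual)
```

    In this toy example the weighted and individual-level slopes agree (about 0.33), while the unweighted slope is 0.5, because the largest community shows the weakest effect. Only the point estimates coincide; the standard errors from the two approaches would generally differ.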

    What's happening, I suspect, is that the effect is weak in the larger communities, which is why the results get weaker once those communities receive more weight.

    When you collapsed the data, did you use survey weights?



    • #3
      The idea behind aweights is that you weight each observation in inverse proportion to its error variance. When a value is created by averaging N observations of the same case, the error variance is 1/N times the error variance of a value based on a single observation of that unit. That's why the number of observations used to create the average serves as an aweight.
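      A sketch of that argument in symbols, under an assumption that is mine and not stated in the thread: the individual errors $\varepsilon_{ig}$ within community $g$ are independent with a common variance $\sigma^2$, and $n_g$ stands for numppl.

$$
\operatorname{Var}(\bar{\varepsilon}_g)
= \operatorname{Var}\!\left(\frac{1}{n_g}\sum_{i=1}^{n_g}\varepsilon_{ig}\right)
= \frac{\sigma^2}{n_g},
\qquad
w_g \propto \frac{1}{\operatorname{Var}(\bar{\varepsilon}_g)} \propto n_g .
$$

      If the errors also contain a community-level component with variance $\sigma_c^2$ shared by everyone in the community, the variance of the averaged error becomes $\sigma_c^2 + \sigma^2/n_g$, and weights proportional to $n_g$ are no longer exactly inverse-variance weights.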

      There is a theorem that says that if you have a series of estimates of some parameter, x1, x2, x3, ..., xn, each with different error variance, then the "best" (in some sense I no longer remember, under conditions I no longer remember, but probably mild ones) estimate of the parameter based on those estimates is the weighted average of the x's, weighted in inverse proportion to their error variance.
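      As far as I know, the usual statement of that result is for independent, unbiased estimates with known variances, in which case inverse-variance weights give the smallest variance among unbiased weighted averages. A minimal numerical sketch in Python follows; the three estimates and their variances are invented for illustration, and inverse_variance_mean is a hypothetical helper, not a library function.

```python
def inverse_variance_mean(estimates, variances):
    """Weighted average with weights 1/variance, and the variance of that average.

    Assumes the estimates are independent and unbiased for the same parameter.
    """
    weights = [1.0 / v for v in variances]
    total = sum(weights)
    mean = sum(w * x for w, x in zip(weights, estimates)) / total
    return mean, 1.0 / total


# Made-up estimates of the same parameter with different error variances.
estimates = [10.0, 12.0, 11.0]
variances = [1.0, 4.0, 2.0]

ivw_mean, ivw_var = inverse_variance_mean(estimates, variances)

# Variance of the plain unweighted mean of the same independent estimates.
simple_var = sum(variances) / len(variances) ** 2

print(ivw_mean, ivw_var, simple_var)
```

      With these numbers the inverse-variance weighted average has a smaller variance than the plain average, which is the sense in which the more precise estimates deserve more weight.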

      I agree with George Ford that the explanation for the difference in results is that most of the effect you observed in the unweighted analysis is coming from the smaller communities, with less in the larger communities.



      • #4
        This paper by Jeff Wooldridge and co-authors is my go-to paper on the topic. (It also boasts possibly the best title ever to appear in the economics literature.) https://jhr.uwpress.org/content/50/2/301



        • #5
          Also see
          David Radwin
          Senior Researcher, California Competes
          californiacompetes.org
          Pronouns: He/Him
