  • #16
    I would do scatter plots of your outcome variable versus income itself and against each of the alternative versions of income (including charities/income) of interest and see which comes out looking most linear.
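A rough numeric version of that check (this is an invented Python sketch, not anything posted in the thread; the variable names and simulated data are made up, and here the true relationship is built to be logarithmic, so the log version should come out looking most linear):

```python
import numpy as np

rng = np.random.default_rng(42)
n = 2_000

# Invented data: the outcome depends on log(income) by construction.
income = rng.lognormal(mean=10, sigma=0.8, size=n)
charity = income * rng.uniform(0.001, 0.05, size=n)
outcome = 2.0 * np.log(income) + rng.normal(0, 0.5, size=n)

candidates = {
    "income": income,
    "log(income)": np.log(income),
    "charity/income": charity / income,
}

# Pearson r is a crude numeric stand-in for eyeballing linearity;
# in practice, plot outcome against each candidate as well.
for name, var in candidates.items():
    r = np.corrcoef(var, outcome)[0, 1]
    print(f"{name:15s} r = {r: .3f}")
```

With real data you would of course look at the scatter plots themselves, not just a correlation, since r can miss curvature that is obvious to the eye.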



    • #17
      Thank you, Clyde.



      • #18
        Hi Clyde,

        You mentioned something in your second comment that really caught my attention: "Unless you are doing an experiment, or using a matched design, you do not and cannot actually "control" for their effects. In observational studies all you can do is include them in the model to adjust for them."

        I used a matched sample in one of my recent papers, but I am being challenged on why I created a matched sample rather than just "controlling" (adjusting, I guess) for the matching variables in the regression models. What is the best defense of matching? I ask you because you say that you do not actually control for covariate effects unless you use a matched design (or an experiment). So is using a matched sample, with the matching variables also included as covariates, superior to using an unmatched sample with those same covariates?

        Thanks for the help, and perhaps you could point me toward some helpful references if you know any offhand.



        • #19
          In an ideal world, matching would be superior to an unmatched design with the same variables included as covariates. Adjusting for other variables by including them as covariates is, frankly, just a kludge. It is only effective to the extent that the actual relationships between the covariates and the outcome and main predictors are correctly specified in the analytic model. If the relationship, for example, is logarithmic, but you use the raw covariate, then the adjustment does not fully correct for confounding bias, and in fact may contribute additional bias. Matching overcomes this: it zeroes out the effect of the matching variable(s) in the data so that those effects are completely controlled, and this works without regard to any assumptions about the form of the relationships. That is, the matching works without parametric assumptions. It completely eliminates confounding bias by the matched variables.
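The point about misspecified functional form versus exact matching can be illustrated with a small simulation. Everything below is invented for illustration: the confounder x affects both treatment uptake and the outcome through log(x), the naive model adjusts for raw x, and exact matching within levels of x recovers the true effect (set to 1) with no assumption about the functional form:

```python
import numpy as np

rng = np.random.default_rng(0)
n = 20_000

# Invented setup: confounder x drives both treatment uptake and the
# outcome through log(x); the naive model adjusts for raw x instead.
x = rng.integers(1, 11, size=n).astype(float)
p_treat = 0.1 + 0.25 * np.log(x)                 # nonlinear in x
treat = (rng.random(n) < p_treat).astype(float)
y = 1.0 * treat + 5.0 * np.log(x) + rng.normal(0, 1, n)  # true effect = 1

# (a) Covariate adjustment with the wrong functional form (raw x):
# residual confounding biases the treatment coefficient.
X = np.column_stack([np.ones(n), treat, x])
beta_raw = np.linalg.lstsq(X, y, rcond=None)[0]

# (b) Exact matching: compare treated vs. control within each level
# of x, then average across strata weighted by the treated counts.
diffs, weights = [], []
for v in np.unique(x):
    t = y[(x == v) & (treat == 1)]
    c = y[(x == v) & (treat == 0)]
    if len(t) and len(c):
        diffs.append(t.mean() - c.mean())
        weights.append(len(t))
effect_matched = np.average(diffs, weights=weights)

print(f"raw-x adjustment estimate: {beta_raw[1]:.3f}")
print(f"exact-matching estimate:   {effect_matched:.3f}")
```

If the model in (a) had used log(x) instead of x, the adjustment would have been correct too; the bias comes entirely from the wrong functional form, which is exactly what matching sidesteps.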

          In the real world, however, there are trade-offs. If you try to match two samples on several variables, you are likely to find that there are some cases for whom no matching control can be found, or vice versa. Even with only one matching variable this can happen if the distribution of the matching variable has long tails. So now we have a problem. The unmatched entities are necessarily excluded from the analysis. But this may well introduce a new bias: selection bias, because the unmatched entities are likely to be drawn from the tails of the matching-variable distribution. Also, sometimes the number of unmatchable cases is so large that there is an appreciable reduction in sample size, leading to loss of power. These difficulties can sometimes be overcome by "relaxing" the match criteria--instead of exact matching, we accept matches that are "close." But now, by virtue of how we operationalize "close," we have suddenly introduced tacit assumptions about the nature of the relationships between the match variables and the outcome and key predictors. So now, we are perhaps no better off than we would have been just doing an adjusted analysis instead.
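The trade-off described above -- "close" (caliper) matching without replacement still leaving some cases unmatched when the matching variable is long-tailed -- can be sketched as follows. Everything here (the distributions, the caliper width, the greedy one-pass heuristic) is invented for illustration:

```python
import numpy as np

rng = np.random.default_rng(1)

# Invented example: a heavy-tailed matching variable, so treated
# cases in the tail tend to find no control within the caliper.
treated_x = rng.lognormal(1.0, 1.0, size=200)
control_x = rng.lognormal(0.5, 1.0, size=400)

caliper = 0.1          # maximum allowed |difference| for a "close" match
available = np.ones(len(control_x), dtype=bool)
matches = {}

# One simple greedy pass; without replacement, the matching order
# matters, and better algorithms exist (e.g. optimal matching).
for i in np.argsort(treated_x):
    d = np.abs(control_x - treated_x[i])
    d[~available] = np.inf                 # each control used at most once
    j = int(np.argmin(d))
    if d[j] <= caliper:
        matches[i] = j
        available[j] = False

print(f"matched {len(matches)} of {len(treated_x)} treated cases")
```

The unmatched cases are exactly the selection-bias worry: they cluster in the tail, so the analyzed sample is no longer representative of the original treated group, and widening the caliper to rescue them reintroduces the functional-form assumptions matching was meant to avoid.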

          So you really have to look at the details of your particular situation. If you had no difficulty finding matches for every entity in your analysis, then matching has worked perfectly to control for confounding bias. But if you had unmatched entities, then some part of the advantage of matching has been lost, perhaps a small part, perhaps a great deal or all, depending on how severe the problem was. Conversely, if the matching variables have simple linear relationships with the outcome and key predictors, then just including them in a regression model would have been a correct specification of the relationships and led to correct adjustments.

          In brief, and in simple terms, matching, when it can be completely accomplished, is superior to covariate adjustment. But in reality complete matching is often not feasible, and then matching may be just as problematic as covariate adjustment.
