
  • Variable Significance Changes After Adding Regional Dummies

    Dear All,

    I have two simple yet complicated questions.

    I run my regressions on survey data using OLS, with productivity as my dependent variable and gender and age as my independent (explanatory) variables.

    In the first run I do not add regional dummies (provinces) and in the second regression I add the regional dummies.

    I find that my independent variable is NOT significant when I exclude the regional dummies, but it IS SIGNIFICANT when I add them. What does this mean, and how do I interpret it?


    In another case, I find that my other explanatory variable is significant at the 90% confidence level in the "no regional dummy model" and at the 95% confidence level in the "regional dummy model".

    Thanks.

  • #2
    Based on your use of capital letters, I infer that you attach a great deal of importance to statistical significance here. The American Statistical Association begs to differ, and, in fact, recommends that the concept of statistical significance be abandoned altogether, in part because it is widely misused and almost hopelessly confusing. See https://www.tandfonline.com/doi/full...5.2019.1583913.

    Along that line of reasoning, redact the p-values from your output and focus your attention where it is most useful: on the coefficients themselves. Are the coefficients really very different from a practical perspective? They may or may not be. If they aren't, then this is just a good illustration of how noisy the concept of statistical significance is and why you shouldn't use it. If they are, read on.
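
    To illustrate how noisy the significant/not-significant label can be, here is a minimal simulation sketch in Python with NumPy. Everything in it is an assumption made for illustration (the sample size, the true slope of 0.2, the helper name `slope_and_t`, and the approximate cutoff of 1.96); none of it comes from the data in this thread. The true effect is identical in every simulated sample, yet only a fraction of the samples cross the conventional threshold.

```python
import numpy as np

rng = np.random.default_rng(1)
n, true_slope, n_sims = 100, 0.2, 2000  # hypothetical settings


def slope_and_t(x, y):
    """Simple-regression slope and its t statistic (intercept included)."""
    xc = x - x.mean()
    b = (xc @ y) / (xc @ xc)
    resid = y - y.mean() - b * xc
    se = np.sqrt(resid @ resid / (len(x) - 2) / (xc @ xc))
    return b, b / se


t_stats = []
for _ in range(n_sims):
    x = rng.normal(size=n)
    y = true_slope * x + rng.normal(size=n)  # same true effect every time
    t_stats.append(slope_and_t(x, y)[1])
t_stats = np.array(t_stats)

# 1.96 is the approximate two-sided 5% cutoff (normal approximation).
share_significant = np.mean(np.abs(t_stats) > 1.96)
print(f"Share of samples labelled 'significant': {share_significant:.2f}")
```

    Under these assumed settings the label flips from sample to sample even though nothing about the underlying relationship changes, which is the sense in which it is noisy.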

    Suppose the coefficients are, in fact, materially different. This happens commonly when variables are added to or removed from a model, and it is not a problem. The coefficients of variables in a model mean different things depending on what else is in the model. The phenomenon is known as Simpson's paradox, or, in the context of regression, it is sometimes called Lord's paradox. Coefficients can change dramatically--even change sign. It's a real phenomenon and it means you have to think carefully about the meaning of the coefficient in each model and understand which meaning (if either) is the one that is relevant to your research question.

    The Wikipedia page on Simpson's paradox is quite good, and I recommend you read it for a full explanation. There it is explained in terms of contingency tables, but the concepts work exactly the same way with a regression analysis.
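
    The same idea can be sketched for a regression with and without regional dummies. This is a hypothetical simulation in Python with NumPy, not an analysis of the data in this thread: the number of regions, the within-region slope of 1, the regional baselines of -3 per region step, and the helper name `ols` are all assumptions chosen so that the sign flip is easy to see.

```python
import numpy as np

rng = np.random.default_rng(0)
n_regions, n_per_region = 5, 200  # hypothetical design
region = np.repeat(np.arange(n_regions), n_per_region)

# Assumed data-generating process: x is higher in higher-numbered regions,
# while each region's baseline for y is lower. Within a region, the slope is +1.
x = region + rng.normal(size=region.size)
y = 1.0 * x - 3.0 * region + rng.normal(size=region.size)


def ols(X, y):
    """Least-squares coefficients for design matrix X."""
    coef, *_ = np.linalg.lstsq(X, y, rcond=None)
    return coef


const = np.ones_like(x)

# Model 1: no regional dummies.
X_pooled = np.column_stack([const, x])

# Model 2: add dummies for regions 1..4 (region 0 is the reference category).
dummies = (region[:, None] == np.arange(1, n_regions)).astype(float)
X_dummies = np.column_stack([const, x, dummies])

b_pooled = ols(X_pooled, y)[1]
b_within = ols(X_dummies, y)[1]
print(f"Slope on x without regional dummies: {b_pooled:.2f}")
print(f"Slope on x with regional dummies:    {b_within:.2f}")
```

    In this sketch the coefficient without dummies mixes within-region and between-region differences, while the coefficient with dummies compares observations inside the same region. Which of the two answers the research question is a substantive choice, not a statistical one.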



    • #3
      Thanks Clyde, I skimmed through the paper and it looks like an interesting read. I will read it in detail. Thanks also for the info on Simpson's paradox.
