
  • Change signs when sample size changes

    Dear Statalisters,

    I read here that a coefficient may change sign if the sample size changes. May I ask why that is? Which aspects of the regression or of the variable in question should I pay attention to when this happens?


    Second, for the same sample size, I have a variable that I proxy with math scores in one regression and with language scores in another. One variable X is statistically significant and positive in the regression that uses math scores, but statistically significant and negative in the regression that uses language scores. I also wanted to ask how I should interpret this. What should I check for?

    Thank you!
    Best

  • #2
    If the sample changes, anything is possible. Different data, different results. If you think the sign should always be negative (e.g., a demand curve) and it's positive, then you likely have an omitted variable (or endogeneity, which is an OV problem). In a well-specified model, you shouldn't get big changes across samples (though poorly estimated coefficients can switch signs).

    Look this over for the second question. Pay attention to the sign on length.

    Code:
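    * Load the auto dataset that ships with Stata
    sysuse auto, clear

    * Same outcome, same length regressor, a different second regressor each time;
    * compare the sign on length across the four fits
    reg price length weight
    reg price length mpg
    reg price length displacement
    reg price length gear_ratio

    * Pairwise correlations among the regressors
    correl length weight mpg displacement gear_ratio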
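
    The omitted-variable point can also be sketched with simulated data. This is only an illustration under made-up assumptions: the variable names x, z, and y, the seed, and all coefficients are hypothetical and not from the original post. The true effect of x is set to -1, yet once the correlated variable z is left out, the coefficient on x should come out positive (its probability limit here is about -1 + 3*0.8/1.64, roughly 0.46).

    ```stata
    * Hypothetical simulation: every name and number here is an assumption
    clear
    set seed 2024
    set obs 5000

    * z will be the omitted variable; x is positively correlated with z
    generate z = rnormal()
    generate x = 0.8*z + rnormal()

    * True effect of x on y is -1; z has a large positive effect of +3
    generate y = -1*x + 3*z + rnormal()

    * Well-specified model: the coefficient on x should be close to -1
    regress y x z

    * z omitted: x absorbs part of z's effect and its sign flips
    regress y x
    ```

    Comparing the two regress outputs shows the same mechanism as the auto example: the sign on a regressor depends on what else it is correlated with and whether that is in the model.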




    • #3
      Expanding slightly on George Ford's answer, I don't think that sample size as such is what is important here.

      The question is how the smaller sample(s) and the larger sample(s) differ.

      Some possibilities:

      1. It's just part of sampling variation that a coefficient is sometimes slightly positive, sometimes slightly negative.

      2. A larger sample may differ systematically from a smaller sample even if the smaller sample is included in the larger sample. Example: in many ways my city differs systematically from the region it's in, the region differs from the country it's in, and the country differs from the continent it's in.

      3. A small sample is more prone to sampling variation anyway.

      4. That can't be a complete list.
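
      Possibilities 1 and 3 can be sketched with a small simulation. This is a minimal sketch under assumed values, not an analysis of anyone's data: the program name drawslope, the true slope of 0.1, the seed, and the two sample sizes are all hypothetical choices. With a weak true effect, a substantial share of small-sample estimates (roughly a third under these assumptions) should have the wrong sign, while almost none should at the larger sample size.

      ```stata
      clear all
      set seed 12345

      * Hypothetical helper: draw n observations with a small true slope (0.1)
      * and return the estimated slope in r(b)
      program define drawslope, rclass
          args n
          drop _all
          set obs `n'
          generate x = rnormal()
          generate y = 0.1*x + rnormal()
          regress y x
          return scalar b = _b[x]
      end

      * 1,000 replications with n = 20, then the share of negative slopes
      simulate b = r(b), reps(1000) nodots: drawslope 20
      count if b < 0
      display "share of negative slopes, n = 20: " r(N)/_N

      * Same exercise with n = 2000
      simulate b = r(b), reps(1000) nodots: drawslope 2000
      count if b < 0
      display "share of negative slopes, n = 2000: " r(N)/_N
      ```

      Possibility 2 is a different matter: no simulation of pure sampling variation covers it, because there the larger sample comes from a systematically different population.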
