  • Is this a correct way to standardize variables in panel data?

    Dear Statalists,

    I have learnt that many people prefer not to standardize variables before running regressions. Leaving that aside, I have a question about how to standardize variables in a panel dataset.

    I simply use -egen std(var)- to generate a standardized version of each variable before running the panel regression, but I am not sure whether this is correct.

    Moreover, I find that a regression with standardized variables and a regression with log-transformed variables produce very different results in terms of the statistical significance of individual variables. I do not know which one I should use.

    Many thanks!


  • #2
    You'll increase your chances of a helpful answer by following the FAQ on asking questions - provide Stata code in code delimiters, readable Stata output, and sample data using dataex.

    You can standardize the entire data set, but all you're doing is changing the variances and the zero points (which matters for interactions). As Clyde has pointed out on this list, if the original variables have a meaningful metric, then you're often best off not standardizing. I may understand the meaning of adding $1 million in sales more easily than the impact of adding 1 sd of sales.
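    To make the rescaling concrete, here is a minimal sketch on simulated data. The variable names (id, year, sales) and the data-generating process are assumptions for illustration, not the original poster's data. It contrasts pooled standardization, which is what -egen, std()- does when used without -by-, with a within-panel version built from -egen, mean()- and -egen, sd()-.

```stata
* Hypothetical panel: 50 units observed for 4 years each (simulated data)
clear
set seed 12345
set obs 200
generate id = ceil(_n/4)
bysort id: generate year = 2000 + _n
generate double sales = exp(rnormal(3, 1))
xtset id year

* Pooled standardization: one mean and one sd for the whole sample
egen double z_sales = std(sales)

* Within-panel standardization: each unit's own mean and sd
bysort id: egen double m_sales = mean(sales)
bysort id: egen double s_sales = sd(sales)
generate double z_sales_w = (sales - m_sales) / s_sales

* z_sales has mean 0 and sd 1 overall; z_sales_w has mean 0 within each id
summarize z_sales z_sales_w
```

    The two versions answer different questions: the pooled one measures distance from the overall sample mean, the within-panel one measures distance from each unit's own mean. Which one is appropriate depends on the model, so treat this as a sketch of the mechanics rather than a recommendation.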

    Logging and standardizing are totally different things. Logging really changes the functional form so you are estimating a fundamentally different model. Usually, standardizing retains functional form but changes the parameters. You need a good reason to log variables.
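    A short simulated example may help show the difference. It assumes a single strictly positive regressor x and uses plain -regress- rather than a panel estimator to keep the sketch small; the names and the data-generating process are hypothetical. Standardizing x is a linear rescaling, so the slope changes but its t statistic does not, whereas logging x fits a different functional form and generally gives a different t statistic.

```stata
* Simulated cross-section (hypothetical data, not from the thread)
clear
set seed 2024
set obs 500
generate double x = exp(rnormal(0, 1))       // right-skewed, strictly positive
generate double y = 1 + 0.5*ln(x) + rnormal()

egen double z_x = std(x)                      // standardized x
generate double ln_x = ln(x)                  // logged x

* Raw regressor
regress y x
scalar t_raw = _b[x] / _se[x]

* Standardized regressor: same model, rescaled slope, same t statistic
regress y z_x
scalar t_std = _b[z_x] / _se[z_x]

* Logged regressor: a different functional form
regress y ln_x
scalar t_log = _b[ln_x] / _se[ln_x]

display "t raw = " t_raw "   t standardized = " t_std "   t logged = " t_log
```

    If the standardized and raw regressions disagree on significance in real data, something other than the rescaling is going on (for example, a different estimation sample or standardized interaction terms).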



    • #3
      Originally posted by Phil Bromiley

      Dear Phil,

      Thank you very much! I have read that it is a misunderstanding that variables must be normal, and thus that correcting right skewness or making data normally distributed is not a reason to log variables. Many economics papers simply log variables like GDP (per capita), geographical distance and population, and leave ratio variables and year unchanged.

      So is it right to log only those variables that are better understood in terms of percentage change (e.g. GDP and population), without considering the distribution of the variables (or heteroskedasticity and stationarity)?
