
  • Log School-variable

    Hello all,

    does it make sense to transform an independent school-variable (school = years of schooling) into a log-variable? Especially, when my dependent variable is logwage (natural logarithm of wage rate per hour)? How would I interpret them then?

    Thank you
    TS

  • #2
    You didn't get a quick response. You'll increase your chances of a useful answer by following the FAQ on asking questions: provide Stata code in code delimiters, readable Stata output, and sample data using dataex.

    Whether logging a variable makes sense is really a question about the substance of your area - you must be the expert on that. I'd check what others have done in your discipline. Just because you log the dv does not mean you should log the iv's. [The exception would be if you start with a multiplicative model like a Cobb-Douglas production function and log everything to produce a linear regression.] After you use factor variable notation for your interactions (if any), you should use the margins routine to help you understand your results.
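
    As a rough illustration of the factor-variable and margins advice, here is a minimal sketch. It assumes the nlsw88 dataset shipped with Stata as a stand-in for the poster's data (wage and grade play the roles of wage rate and SCHOOL), and the interaction with union is purely hypothetical:

    ```stata
    * Stand-in data: nlsw88 ships with Stata; not the original poster's data
    sysuse nlsw88, clear
    generate ln_wage = ln(wage)

    * Factor-variable notation: c. marks continuous, i. marks categorical
    regress ln_wage c.grade##i.union

    * Predicted log wage by union status at 8, 12 and 16 years of schooling
    margins union, at(grade=(8 12 16))

    * Slope of log wage on schooling, separately by union status
    margins union, dydx(grade)
    ```

    The margins predictions here are on the log-wage scale, so differences between them read approximately as proportional differences in wage.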



    • #3
      Tom:
      as an aside to Phil's helpful reply, it is also a matter of ease of interpretation (especially when you have a mix of logged and non-logged predictors).
      In your case, with logwage as the dependent variable, the coefficient of logged years of schooling is the elasticity of wage with respect to years of schooling.
      In the following toy example, a 1% increase in -mpg- corresponds to a change of about -0.83% in -price-:
      Code:
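      * Load the example dataset shipped with Stata
      sysuse auto.dta
      * Natural logs of the outcome (price) and the predictor (mpg)
      g ln_price=ln(price)
      g ln_mpg=ln(mpg)
      * Log-log regression: the slope is an elasticity
      reg ln_price ln_mpg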
      
            Source |       SS           df       MS      Number of obs   =        74
      -------------+----------------------------------   F(1, 72)        =     31.00
             Model |  3.37819527         1  3.37819527   Prob > F        =    0.0000
          Residual |  7.84533782        72  .108963025   R-squared       =    0.3010
      -------------+----------------------------------   Adj R-squared   =    0.2913
             Total |  11.2235331        73  .153747029   Root MSE        =     .3301
      
      ------------------------------------------------------------------------------
          ln_price |      Coef.   Std. Err.      t    P>|t|     [95% Conf. Interval]
      -------------+----------------------------------------------------------------
            ln_mpg |   -.826847   .1484986    -5.57   0.000    -1.122873   -.5308204
             _cons |   11.14146   .4507755    24.72   0.000     10.24286    12.04007
      ------------------------------------------------------------------------------
      Kind regards,
      Carlo
      (Stata 19.0)
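
      To make the interpretation question concrete for the wage case, a minimal sketch comparing the two specifications side by side. It assumes the nlsw88 dataset shipped with Stata as a stand-in (wage and grade are that dataset's variables, not the original poster's):

      ```stata
      * Stand-in data: nlsw88 ships with Stata; not the original poster's data
      sysuse nlsw88, clear
      generate ln_wage = ln(wage)
      * ln() returns missing for zero years of schooling, so such observations drop out
      generate ln_grade = ln(grade)

      * Log-level: one more year of schooling changes wage by 100*(exp(b)-1) percent
      regress ln_wage grade
      display "Percent change in wage per extra year: " 100*(exp(_b[grade]) - 1)

      * Log-log: the slope is the elasticity of wage with respect to schooling
      regress ln_wage ln_grade
      display "Elasticity of wage w.r.t. schooling: " _b[ln_grade]
      ```

      The two models are fitted on slightly different samples whenever zero-schooling observations exist, which is worth keeping in mind when comparing them.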



      • #4
        Logging education years as a predictor implies that the difference in effect between 2 and 6 years is the same as that between 6 and 18 years. That's rather pessimistic for the readers of this forum who are immensely more likely to have reached Master's or doctoral level than to have dropped out of education at age 11 or so.

        In real modelling, however, clear and clean effects are all too likely to be muddied by other predictor effects, noise and outliers.
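
        The ratio point above can be checked with a few lines of arithmetic. With a logged predictor the effect of moving from a to b years is the coefficient times ln(b) - ln(a), which depends only on the ratio b/a:

        ```stata
        * Going from 2 to 6 years and from 6 to 18 years both triple schooling
        display ln(6) - ln(2)
        display ln(18) - ln(6)
        * Both differences equal ln(3), roughly 1.0986
        display ln(3)
        ```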



        • #5
          I have decided not to log SCHOOL, because when I plot LOGWAGE against SCHOOL the relationship looks roughly linear. Or is there another explanation that would be more consistent? Thx!
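
          One possible way to back up that visual check, sketched with the nlsw88 dataset shipped with Stata as a stand-in for LOGWAGE and SCHOOL (the variable names are that dataset's, and the quadratic term is only one of several ways to probe curvature):

          ```stata
          * Stand-in data: nlsw88 ships with Stata; not the original poster's data
          sysuse nlsw88, clear
          generate ln_wage = ln(wage)

          * Overlay a straight-line fit and a lowess smooth on the scatter;
          * if the two curves track each other, linearity is plausible
          twoway (scatter ln_wage grade, jitter(2)) (lfit ln_wage grade) (lowess ln_wage grade)

          * Rough numerical check: does a squared schooling term add anything?
          regress ln_wage c.grade##c.grade
          test c.grade#c.grade
          ```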
