  • Centering variables

    I have two questions related to centering variables. First, when is it appropriate to center a variable around the first value of its range? Second, when estimating a piecewise linear model, can you grand mean center?

  • #2
    It depends on your objective. One centers a covariate before creating interactions so that the coefficient on the other variable's main effect is its partial effect at the centering value. Usually the mean or median of a covariate is more interesting than its smallest value, but not always.
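A minimal sketch of that point, in Python. Nothing here comes from the thread: the coefficients, the centering value `c`, and the helper names `predict` and `recenter` are hypothetical. It only illustrates the algebra: rewriting y = b0 + b1*x + b2*z + b3*x*z in terms of (z - c) leaves the fitted values unchanged and turns the coefficient on x into b1 + b3*c, the partial effect of x at z = c.

```python
def predict(coefs, x, z):
    """Prediction from y = b0 + b1*x + b2*z + b3*x*z."""
    b0, b1, b2, b3 = coefs
    return b0 + b1 * x + b2 * z + b3 * x * z


def recenter(coefs, c):
    """Coefficients of the same model rewritten in terms of (z - c)."""
    b0, b1, b2, b3 = coefs
    return (b0 + b2 * c, b1 + b3 * c, b2, b3)


# Hypothetical coefficients and centering value, chosen for illustration only.
raw = (1.0, 0.5, 2.0, 0.25)
c = 40.0

centered = recenter(raw, c)

# Partial effect of x at z = c, computed from the uncentered model.
partial_effect_at_c = raw[1] + raw[3] * c

# The "main effect" of x in the centered model is that same quantity.
main_effect_after_centering = centered[1]
```

The interaction coefficient and the coefficient on z are untouched; only the intercept and the coefficient on x move, which is why the choice of centering value is a question of interpretation and not of fit.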



    • #3
      Centering at the lowest value is sometimes nice when you center a variable like age, where you always think about getting a year older (and can only dream about getting a year younger).

      I actually don't like centering at the mean or median, as that is often quite peculiar depending on how the sample was drawn and on non-response. Instead I tend to just choose values that make sense to my intended audience. Centering at the mean occupational status is more abstract and harder to imagine than centering at the status of a watch-maker.
      ---------------------------------
      Maarten L. Buis
      University of Konstanz
      Department of history and sociology
      box 40
      78457 Konstanz
      Germany
      http://www.maartenbuis.nl
      ---------------------------------



      • #4
        Just to add a bit more: I agree that, with a variable like age, centering at the smallest value can make sense before using a quadratic. So, if 25 is the smallest age in the sample, use (age - 25)^2 as the quadratic term alongside age. This forces the coefficient on age to be the marginal effect at age = 25.
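A small Python sketch of the quadratic case, under stated assumptions: the outcome is generated without noise from a made-up curve (10 + 2*age - 0.02*age^2), and `ols` is a bare-bones normal-equations solver written for this example, not a substitute for a regression routine. Regressing the outcome on age and (age - 25)^2 should return a coefficient on age equal to the slope of the curve at age 25.

```python
def ols(X, y):
    """Least-squares coefficients via the normal equations (X'X) b = X'y."""
    k = len(X[0])
    # Augmented matrix [X'X | X'y].
    A = [[sum(row[i] * row[j] for row in X) for j in range(k)]
         + [sum(row[i] * yi for row, yi in zip(X, y))]
         for i in range(k)]
    # Gaussian elimination with partial pivoting.
    for col in range(k):
        pivot = max(range(col, k), key=lambda r: abs(A[r][col]))
        A[col], A[pivot] = A[pivot], A[col]
        for r in range(col + 1, k):
            factor = A[r][col] / A[col][col]
            for j in range(col, k + 1):
                A[r][j] -= factor * A[col][j]
    # Back substitution.
    b = [0.0] * k
    for i in reversed(range(k)):
        tail = sum(A[i][j] * b[j] for j in range(i + 1, k))
        b[i] = (A[i][k] - tail) / A[i][i]
    return b


# Hypothetical data: ages 25..60 and a noiseless quadratic outcome.
B0, B1, B2 = 10.0, 2.0, -0.02
ages = [float(a) for a in range(25, 61)]
y = [B0 + B1 * a + B2 * a * a for a in ages]

# Regress y on a constant, age, and the quadratic centered at the minimum age.
min_age = min(ages)
X = [[1.0, a, (a - min_age) ** 2] for a in ages]
const, coef_age, coef_sq = ols(X, y)

# Slope of the generating curve at the minimum age: B1 + 2 * B2 * age.
marginal_effect_at_min = B1 + 2.0 * B2 * min_age
```

Here `coef_age` matches `marginal_effect_at_min` up to floating-point error, while `coef_sq` is the curvature term B2, which centering does not change.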

        I also agree with Maarten that centering discrete variables often doesn't make sense. Do we want to estimate the effect of job training for a person who is a fraction of a union member? Having said that, if one doesn't like mean centering generally, then that means you're not a fan of average treatment effects. In models with interactions, mean centering the covariates before interacting them with a treatment dummy makes the coefficient on the treatment dummy the estimated average treatment effect. Centering at the means of the treated subsample instead gives the average treatment effect on the treated. So, I would think mean centering is often very interesting because it gives us the average treatment effect estimated across the entire sample or an interesting subsample.
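The equivalence can be checked numerically. The following Python sketch is illustrative only: the simulated data, the seed, and the helper names `ols` and `fit_centered_at` are all assumptions made for this example. It fits y on a treatment dummy w, a covariate x centered at some value, and their interaction, then compares the coefficient on w with the regression-adjustment estimate b_w + b_wx * (centering value) from the uncentered fit.

```python
import random


def ols(X, y):
    """Least-squares coefficients via the normal equations (X'X) b = X'y."""
    k = len(X[0])
    A = [[sum(row[i] * row[j] for row in X) for j in range(k)]
         + [sum(row[i] * yi for row, yi in zip(X, y))]
         for i in range(k)]
    for col in range(k):  # Gaussian elimination with partial pivoting
        pivot = max(range(col, k), key=lambda r: abs(A[r][col]))
        A[col], A[pivot] = A[pivot], A[col]
        for r in range(col + 1, k):
            factor = A[r][col] / A[col][col]
            for j in range(col, k + 1):
                A[r][j] -= factor * A[col][j]
    b = [0.0] * k
    for i in reversed(range(k)):  # back substitution
        tail = sum(A[i][j] * b[j] for j in range(i + 1, k))
        b[i] = (A[i][k] - tail) / A[i][i]
    return b


# Hypothetical data: treatment is more likely at high x, and the
# treatment effect grows with x.
rng = random.Random(0)
n = 200
x = [rng.uniform(0, 10) for _ in range(n)]
w = [1 if xi + rng.uniform(-3, 3) > 5 else 0 for xi in x]
y = [1 + 0.5 * xi + wi * (2 + 0.3 * xi) + rng.uniform(-1, 1)
     for xi, wi in zip(x, w)]


def fit_centered_at(center):
    """Regress y on 1, w, (x - center), and w * (x - center)."""
    X = [[1.0, wi, xi - center, wi * (xi - center)] for xi, wi in zip(x, w)]
    return ols(X, y)


xbar = sum(x) / n
xbar_treated = sum(xi for xi, wi in zip(x, w) if wi) / sum(w)

# Coefficient on w after centering at the full-sample or treated-sample mean.
ate_from_centering = fit_centered_at(xbar)[1]
att_from_centering = fit_centered_at(xbar_treated)[1]

# Regression adjustment from the uncentered fit: average of b_w + b_wx * x_i.
b = fit_centered_at(0.0)
ate_regression_adjustment = b[1] + b[3] * xbar
att_regression_adjustment = b[1] + b[3] * xbar_treated
```

The two routes agree up to floating-point error for each target. The ATE and ATT estimates themselves need not coincide, because the covariate mean among the treated generally differs from the full-sample mean.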



        • #5
          Originally posted by Jeff Wooldridge:
          I also agree with Maarten that centering discrete variables often doesn't make sense.
          I have fallen into the trap of assuming everybody knows the jargon used in my discipline, sorry about that: occupational status is typically treated as a continuous variable, while occupational class is used for an ordinal or categorical representation of occupational groups. An occupational status variable is typically created by asking for the respondent's occupation as an open question, which is then coded in a detailed occupational classification, e.g. 4-digit ISCO codes. There are maps that relate these classification codes to status scores like ISEI, which is why you can talk about the status of a watch-maker.

          I take your point on average treatment effects. However, I would emphasize the importance of properly accounting for the sampling design and differential response rates, to make sure that the mean actually corresponds to something in the population.
          ---------------------------------
          Maarten L. Buis
          University of Konstanz
          Department of history and sociology
          box 40
          78457 Konstanz
          Germany
          http://www.maartenbuis.nl
          ---------------------------------



          • #6
            Many thanks, Maarten and Jeff. This is very helpful.
