  • How to correctly specify dummy variables in regression equations

    Hello.

    I have a theoretical question. Suppose I have a dependent variable Y, independent variables X1 and X2, and a categorical variable "RACE" stored in long format as a single variable with 3 categories (white, black, other).

    What is the difference between running:

    Y X1 X2 RACE

    and

    Y X1 X2 i.RACE

    I understand what i.RACE does (it provides estimates for 2 categories relative to the benchmark category), but I don't know what the coefficient on RACE alone means or how to interpret it. Which approach is correct, and why? I am a little confused, since I have found mixed opinions on the internet.

    Thx

    Jack

  • #2
    In your first model, RACE will be treated as continuous. Is this what you want? (Note that if there were only 2 categories, the two models would be "the same".)
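
    To make the distinction concrete, the two specifications can be written out side by side. This is only a sketch: it assumes RACE is coded 1 = white, 2 = black, 3 = other (the thread does not state the coding), and the symbols \gamma, \delta_2, \delta_3 are illustrative names, not Stata output labels.

    ```latex
    % Model 1: RACE entered as a single numeric regressor
    Y = \beta_0 + \beta_1 X_1 + \beta_2 X_2 + \gamma \,\mathrm{RACE} + \varepsilon

    % Model 2: i.RACE, with white (code 1) as the base category
    Y = \beta_0 + \beta_1 X_1 + \beta_2 X_2
        + \delta_2 \,\mathbf{1}[\mathrm{RACE}=2]
        + \delta_3 \,\mathbf{1}[\mathrm{RACE}=3] + \varepsilon

    % Under the assumed 1/2/3 coding, Model 1 is Model 2 with the restriction
    \delta_2 = \gamma, \qquad \delta_3 = 2\gamma
    ```

    Under that assumed coding, \gamma is the expected change in Y for each one-step move along the codes, so the black-white gap is forced to equal the other-black gap. With only 2 categories coded one unit apart, there is a single gap to estimate and the restriction has no bite, which is why the two models then coincide.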


    • #3
      Hello Rich.

      Yes, I am aware that using just "RACE" treats this variable as continuous. I was just trying to understand the meaning of the resulting coefficient. What does it mean, and how exactly should I interpret it?

      Thx


      • #4
        Hi Jack,
        Race as a continuous variable really doesn't make sense here. Unless you had something like skin tone (where a one-unit increase in skin tone would yield an increase of b in Y, but this is NOT race), I cannot imagine that you would want to use race in this way.
        Stata/MP 14.1 (64-bit x86-64)
        Revision 19 May 2016
        Win 8.1
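
        For anyone who wants to see the two specifications next to each other, here is a minimal sketch on simulated data. Everything in it is an assumption made for illustration: the 1/2/3 coding of RACE, the value label name racelbl, the seed, the sample size, and the coefficients used to generate Y. None of it comes from Jack's data.

        ```stata
        * Simulated data only; coding 1=white, 2=black, 3=other is assumed
        clear
        set seed 12345
        set obs 300
        generate RACE = ceil(3*runiform())      // integer codes 1, 2, 3
        label define racelbl 1 "white" 2 "black" 3 "other"
        label values RACE racelbl
        generate X1 = rnormal()
        generate X2 = rnormal()
        * Group effects are deliberately not equally spaced across the codes
        generate Y = 1 + 0.5*X1 - 0.3*X2 + 2*(RACE==2) - 1*(RACE==3) + rnormal()

        * RACE as one numeric regressor: a single slope per unit of the code
        regress Y X1 X2 RACE

        * RACE as a factor variable: one coefficient per non-base category
        regress Y X1 X2 i.RACE

        * Joint test that the category indicators are all zero
        testparm i.RACE
        ```

        Comparing the two regress outputs should show one RACE coefficient in the first model and two (black and other, relative to white) in the second. The single slope depends entirely on the arbitrary order and spacing of the codes, which is why it has no meaningful interpretation for a nominal variable.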


        • #5
          That's what I thought. Thanks.
          Jack
