
  • Understanding/Reading Regression Output With Categorical Variables

    Hello;

    I have experience with regression, but I haven't done it in a while and am SUPER confused at the moment, so I thought it would be worth asking; it's a really easy question.



    I am doing a regression analysis, and some of my control variables are categorical variables with values ranging from 1 to 4.

    For example, one variable is health, where 1=excellent, 2=good, 3=fair, and 4=poor

    If I were to run a regression, what value would the coefficient use as a reference?

    Would I be reading it as if health is excellent, or poor? Etc?

    Thanks!

  • #2
    You can't use a single variable; that would treat the 1-4 codes as a continuous scale. You need dummies for 3 of the 4 categories, excluding 1 to avoid the dummy variable trap. I'd exclude "poor," so all dummies are in reference to "poor".

    You could use i.health as a variable, or b4.health, which sets "poor" as the base.
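
    As a minimal sketch of how the two specifications read, the following simulates a toy dataset and fits the model with each base category. The variables y and x1 and the simulated health values are hypothetical, not the original poster's data.
    Code:
    clear
    set seed 12345
    set obs 200
    * health takes the integer values 1-4 (1=excellent, 2=good, 3=fair, 4=poor)
    generate health = ceil(4*runiform())
    generate x1 = rnormal()
    generate y = 2 + 0.5*health + x1 + rnormal()
    * default base is health==1 (excellent): rows 2, 3, 4 are differences from excellent
    regress y i.health x1
    * base set to health==4 (poor): rows 1, 2, 3 are differences from poor
    regress y b4.health x1

    In the first model, the coefficient on 2.health is the expected difference in y between "good" and "excellent", holding x1 fixed. In the second, the coefficient on 1.health is the difference between "excellent" and "poor". The fitted model is the same either way; only the comparison category changes.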



    • #3
      You should definitely follow George's suggestion to use Stata's factor variable notation rather than create separate indicator (dummy) variables. If you are unfamiliar with factor variable notation, consult the output of
      Code:
      help fvvarlist



      • #4
        Thank you!
        Last edited by Paige LaPierre; 27 Sep 2021, 16:43.



        • #5
          Originally posted by George Ford
          You can't use a single variable; that would treat the 1-4 codes as a continuous scale. You need dummies for 3 of the 4 categories, excluding 1 to avoid the dummy variable trap. I'd exclude "poor," so all dummies are in reference to "poor".

          You could use i.health as a variable, or b4.health, which sets "poor" as the base.

          Speaking of that, by default Stata uses the lowest value as the base category. You can tell it to use a different category instead; e.g., as already stated, ib4.health tells Stata to use level 4 as the base.
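
          If the same base is wanted across several commands, one possible approach is to store it with fvset rather than retyping the ib4. prefix. This sketch assumes a hypothetical simulated dataset with variables y, x1, and health coded 1-4.
          Code:
          clear
          set seed 12345
          set obs 200
          generate health = ceil(4*runiform())
          generate x1 = rnormal()
          generate y = 2 + 0.5*health + x1 + rnormal()
          * store level 4 as the base for health; i.health now omits level 4
          fvset base 4 health
          regress y i.health x1
          * remove the stored setting so the default (lowest value) applies again
          fvset clear health
          * alternatively, choose the highest level as base for one command only
          regress y ib(last).health x1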

          Say you decide that your coding scheme isn't what you really wanted. The recode command is a nice shortcut.
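
          A minimal sketch of that shortcut, assuming the goal is to reverse the scale so that higher numbers mean better health. The simulated data and the names health_rev and health_rev_lbl are hypothetical.
          Code:
          clear
          set seed 12345
          set obs 200
          generate health = ceil(4*runiform())
          * reverse the coding into a new variable, leaving the original untouched
          recode health (1=4) (2=3) (3=2) (4=1), generate(health_rev)
          * attach value labels so the output shows category names
          label define health_rev_lbl 1 "poor" 2 "fair" 3 "good" 4 "excellent"
          label values health_rev health_rev_lbl
          * confirm that the mapping is what was intended
          tabulate health health_rev

          With this recoding, i.health_rev uses "poor" as the base by default, since it is now the lowest value.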
          Be aware that it can be very hard to answer a question without sample data. You can use the dataex command for this. Type help dataex at the command line.

          When presenting code or results, please use the code delimiters to format them. Use the # button on the formatting toolbar, between the " (double quote) and <> buttons.
