  • Creating dummy variables

    I'm working with data on countries, running separate country models. I have one variable that is coded as 0, 1, 2 and 3 and wish to create dummy variables from it. As one country has values of 0, 1 and 2 (i.e., there is no 3) on this variable, I created three dummy variables with one as the reference category. For a different country which had values of 0 and 1 on this variable, I created two dummy variables with one as the reference category. I was wondering if this is correct? Thank you.

  • #2
    You likely don't need to create the dummies as most estimation commands nowadays support factor variables. See

    Code:
    help fvvarlist
    You can specify the same base across regressions using

    Code:
    ib#.catvar
    where you replace # with the specific level and "catvar" with the name of the categorical variable.
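
    As a minimal sketch of fixing the base level across separate regressions, assuming Stata's shipped auto dataset as a stand-in (foreign plays the role of the country identifier and rep78 the role of the categorical variable; neither comes from the original question):

    Code:
    sysuse auto, clear
    * See which levels of rep78 occur in each group
    tabulate rep78 foreign
    * Same syntax and same base level (3) in both group-specific models
    regress price ib3.rep78 mpg if foreign == 0
    regress price ib3.rep78 mpg if foreign == 1
    Levels that do not occur in a group's estimation sample are simply not included in that model, so no group-specific dummy variables have to be created by hand.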



    • #3
      Thank you Andrew. Even so, does the way I created the dummy variables make sense? Many thanks.



      • #4
        At most, you need one fewer indicator variable (dummy, in your terminology) than the number of distinct values: one if there are two distinct values, two if there are three, and so forth.

        But Andrew Musau is right. It's most unlikely that you need to generate any extra variables at all. Also, if you are running separate models for each country, you should still use the same syntax as far as possible.
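
        To illustrate the "one fewer" rule, here is a short sketch, again assuming the shipped auto dataset rather than the poster's data (the rep_ prefix is an arbitrary name chosen for the example):

        Code:
        sysuse auto, clear
        * rep78 has five distinct nonmissing values, so this creates rep_1 ... rep_5
        tabulate rep78, generate(rep_)
        * Enter four of the five indicators; the omitted one (rep_1) is the reference
        regress price rep_2 rep_3 rep_4 rep_5
        * The same model in factor-variable notation, with no extra variables needed
        regress price i.rep78
        Both regressions should give the same fit, because i.rep78 uses the lowest level as its base by default.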



        • #5
          gen dummy_med_tna_sum=0
          replace dummy_med_tna_sum=1 if tna_sum>=dummy_med_tna_sum

          Please, is the above command correct for generating a variable?



          • #6
            Originally posted by NANA YAW POKU
            gen dummy_med_tna_sum=0
            replace dummy_med_tna_sum=1 if tna_sum>=dummy_med_tna_sum

            Please, is the above command correct for generating a variable?
            One needs to know the context to say whether or not the code does what it is supposed to do. You can simplify the code to:

            Code:
            gen dummy_med_tna_sum = tna_sum>0 & !missing(tna_sum)
            This assigns 1 to the generated variable if and only if "tna_sum" is positive and not missing, and 0 otherwise.
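
            Note that the one-line version is not an exact equivalent of the two-line original. Because the new variable is still 0 when replace runs, the original condition amounts to tna_sum >= 0, and Stata treats missing values as larger than any number, so zeros and missing values are flagged as well. A minimal sketch on four made-up observations (the data and the names d_orig and d_pos are illustrative assumptions, not from the thread):

            Code:
            clear
            input tna_sum
            -1
            0
            2
            .
            end
            * Original two-step code: flags tna_sum >= 0, including missing values
            generate byte d_orig = 0
            replace d_orig = 1 if tna_sum >= d_orig
            * One-line version: flags strictly positive, nonmissing values only
            generate byte d_pos = tna_sum > 0 & !missing(tna_sum)
            list
            The two variables disagree for the zero and for the missing observation, so which version is right depends on what the indicator is meant to capture.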



            • #7
              I created the dummies using the commands below, which I think do the same thing. These are the commands I'm used to using for dummy variables (though they are slightly less neat than yours).

              generate var=.
              *I then copied the values of this variable (coded as 0 or 1) into the data editor
              generate var1=0
              replace var1 = 1 if var == 0
              generate var2 = 0
              replace var2 = 1 if var == 1

              tab var1
              tab var2

              I ran the frequencies on the dummies just to check and everything looked fine.

              Thank you very much.

