  • Binary Variable Creation

    Hi, this is my first time using Stata, and I would like to know how to create binary variables. I've looked through the help files, the Stata forum, and online PDF instructions.
    I am trying to create binary variables for continent and for sex from categorical/string variables.
    I keep getting r(109), a type mismatch, with the following code:
    generate Sex_numerical = 0
    replace Sex_numerical = 1 Sex == "Female"
    replace Sex_numerical = . if Sex == .
    destring Sex_numerical, replace force

    Note that Sex is a variable already in the dataset.

    For the continent dummies, I want to do something similar, where 0 = not in the continent and 1 = in the continent, for Oceania, Asia, North America, South America, Africa, and Europe,
    using the Country_Code variable to generate these.


    I'm sorry, this is probably a simple question, but thanks for any help you can provide.
    Last edited by Jame Warrell; 17 Nov 2023, 06:27.

  • #2
    This would be legal if Sex is a string variable:

    Code:
    generate Sex_numerical = 0
    replace Sex_numerical = 1 if Sex == "Female"
    but this is shorter and to the point:

    Code:
    generate Sex_numerical = Sex == "Female"
    Now

    Code:
    replace Sex_numerical = . if Sex == .
    That's illegal because we're supposing that Sex is a string variable, so it can't be compared with a numeric value such as numeric missing (.).
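
    As a minimal sketch of a legal alternative, assuming Sex is a string variable and that only the empty string should count as missing, the missing() function can be used instead of a comparison with numeric missing. The toy data below are made up for illustration and are not from the original dataset.

    ```stata
    * Toy data, made up for illustration only.
    clear
    input str6 Sex
    "Female"
    "Male"
    ""
    end

    generate byte Sex_numerical = Sex == "Female"

    * For a string variable, missing() is true only for the empty string "".
    replace Sex_numerical = . if missing(Sex)
    list, clean
    ```

    Note that a string such as "." or "missing" is not empty, so it would still be coded 0 here.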

    If you want an indicator that is, say, 1 if Female, 0 if Male, and missing if anything else, you would need different code. Or, we need to see a table

    Code:
    tabulate Sex, missing
    and your rules about what you want for each reply mentioned.
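
    One possible version of such code, as a sketch only: it assumes Sex is a string variable whose values of interest are spelled exactly "Female" and "Male", which the thread has not confirmed. The toy data, including the value "Unknown", are hypothetical.

    ```stata
    * Toy data, made up for illustration only.
    clear
    input str7 Sex
    "Female"
    "Male"
    ""
    "Unknown"
    end

    * 1 if Female, 0 if Male; the if qualifier leaves every other value missing.
    generate byte Female = Sex == "Female" if inlist(Sex, "Female", "Male")
    list, clean
    ```

    The expression before the if qualifier supplies the 1 or 0, while the qualifier decides which observations get a value at all.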

    Code:
    destring Sex_numerical, replace force
    That's legal but pointless because Sex_numerical is numeric, as you created it.
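
    For the continent indicators asked about in #1, the same idea carries over. This is a rough sketch that assumes Country_Code is a string variable holding three-letter ISO country codes; the thread does not show its actual values, so the data and the short country lists below are illustrative and incomplete.

    ```stata
    * Hypothetical data: assumes Country_Code holds ISO alpha-3 codes as strings.
    clear
    input str3 Country_Code
    "AUS"
    "NZL"
    "JPN"
    "FRA"
    end

    * Illustrative, incomplete lists; extend them to cover your own codes.
    * inlist() returns 1 if the first argument matches any later argument, else 0.
    generate byte Oceania = inlist(Country_Code, "AUS", "NZL", "FJI")
    generate byte Asia    = inlist(Country_Code, "JPN", "CHN", "IND")
    generate byte Europe  = inlist(Country_Code, "FRA", "DEU", "ITA")
    list, clean
    ```

    inlist() accepts only a limited number of string arguments, so a long list of countries needs several inlist() calls joined with |.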

    More at

    https://journals.sagepub.com/doi/pdf...36867X19830921

    https://journals.sagepub.com/doi/pdf...867X1601600117

    https://www.stata.com/support/faqs/d...true-and-false

    Don't be disheartened by these little errors, which are part of the learning process.



    • #3
      You should read the Cox/Schechter article in the Stata Journal: https://www.stata-journal.com/sj19-1.html

      Please also read the FAQ.



      • #4
        Thank you so much for the help!



        • #5
          Hi Jame Warrell. Welcome to Statalist. You've already received some good advice from Nick & Rich, but as you are a brand new Stata user, I'll chip in with one more offering that might be helpful to you.

          Code:
          * Read in some data to illustrate.
          clear
          input str8 Sex
          "Female"
          "female"
          "FEMALE"
          "fEmAlE"
          "Male"
          "male"
          "MALE"
          "MaLe"
          ""
          "."
          "missing"
          end
          
          * If you want 1=Female & 0=Male for your indicator
          * variable, I suggest naming the variable Female
          * rather than Sex_numerical. And if there might
          * be a mixture of uppercase & lowercase as in
          * the current data, use strlower() function
          * to make the code more bullet-proof.
          
          generate byte Female = 1 if strlower(Sex) == "female"
          replace Female = 0 if strlower(Sex) == "male"
          list, clean
          
          * Notice that Female is missing when Sex is not
          * some variant of "Female" or "Male".  I assume
          * this is what you want.
          Here is the result:
          Code:
          . list, clean
          
                     Sex   Female  
            1.    Female        1  
            2.    female        1  
            3.    FEMALE        1  
            4.    fEmAlE        1  
            5.      Male        0  
            6.      male        0  
            7.      MALE        0  
            8.      MaLe        0  
            9.                  .  
           10.         .        .  
           11.   missing        .  
          
          .
          . * Notice that Female is missing when Sex is not
          . * some variant of "Female" or "Male".  I assume
          . * this is what you want.
          --
          Bruce Weaver
          Email: [email protected]
          Version: Stata/MP 18.5 (Windows)
