  • Combining Three or More Categorical Variables Into One, But Keeping the Originals As They Are, Possible?

    Hi,

    I would like to combine three different categorical variables into one new variable while keeping the original variables unchanged. I have heard of the egen command for this, but I am told that it works best for numerical variables. Which Stata command would work best for what I want to do? If you have a recommendation, could you also post the proper syntax here?

  • #2
    Hello Roman. It will help a lot if you post a couple short data listings showing what the data look like now and what you want it to look like afterwards. See the FAQ for details on using -dataex-, code delimiters, etc. HTH.

    --
    Bruce Weaver
    Email: [email protected]
    Version: Stata/MP 18.5 (Windows)



    • #3
      egen works fine here, contrary to misinformation.

      https://www.stata-journal.com/sjpdf....iclenum=dm0034 is a miniature review.

      As Bruce advises, do give a data example if you need more detailed help.
      Last edited by Nick Cox; 08 Jun 2018, 09:14.
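
For readers without the linked review to hand, here is a minimal sketch of one egen route to a composite categorical variable, using the group() function with its label option. The variable names are borrowed from later in the thread, the data values are made up for illustration, and the new variable name combo is hypothetical. This builds one category per distinct combination of values, which may or may not be what the original question needs.

```stata
* Toy data: names follow the thread, values are invented for illustration
clear
input byte(citizenp yrsinus geobrth)
2 3 3
2 4 3
1 5 3
1 5 2
2 3 3
end

* One integer code per distinct combination of the three variables;
* the label option attaches the underlying values as value labels
egen combo = group(citizenp yrsinus geobrth), label

* The three original variables are left as they were
tabulate combo
```

Because egen creates a new variable, the originals stay in the dataset unchanged.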



      • #4
        I would like to combine three variables: citizenp, geobrth, and yrsinus.

        dataex citizenp yrsinus geobrth

        ----------------------- copy starting from the next line -----------------------
        Code:
        * Example generated by -dataex-. To install: ssc install dataex
        clear
        input byte(citizenp yrsinus geobrth)
        2 3 3
        2 4 3
        1 5 3
        1 5 2
        1 5 3
        2 4 3
        2 5 3
        2 5 3
        2 2 3
        2 3 3
        1 5 3
        2 5 3
        1 5 3
        1 5 2
        1 5 2
        2 4 3
        2 5 3
        1 3 3
        2 5 3
        2 5 3
        2 4 3
        2 5 3
        2 5 3
        1 5 3
        1 5 3
        2 4 3
        2 5 3
        2 5 3
        1 5 2
        2 5 3
        1 5 3
        2 4 3
        2 5 3
        2 5 3
        2 2 3
        2 2 3
        1 5 3
        1 5 3
        2 4 3
        2 4 3
        2 4 3
        1 5 3
        1 5 3
        1 5 3
        1 5 3
        2 4 3
        1 5 3
        1 3 3
        1 5 3
        2 4 3
        2 5 3
        1 5 3
        1 5 3
        2 5 3
        2 3 3
        2 2 3
        2 3 3
        2 5 3
        1 5 3
        2 3 3
        2 5 3
        1 5 3
        1 5 3
        2 5 3
        1 5 3
        1 5 3
        2 5 3
        1 3 3
        1 5 3
        1 4 3
        1 5 3
        2 5 3
        2 5 3
        2 4 3
        1 4 3
        1 4 2
        2 2 3
        2 4 3
        1 5 3
        1 5 3
        2 5 3
        1 5 3
        1 5 3
        1 4 3
        1 5 3
        1 5 3
        2 5 3
        2 3 3
        1 5 3
        1 5 3
        2 5 3
        2 4 3
        1 5 3
        2 5 3
        2 5 3
        1 3 2
        1 5 3
        1 5 3
        2 5 3
        1 5 2
        end
        ------------------ copy up to and including the previous line ------------------

        Listed 100 out of 1358 observations
        Use the count() option to list more

        I would like category 3 of geobrth, categories 3, 4, and 5 of yrsinus, and category 2 of citizenp combined into one new variable that I will name "naturalized". How would I do this?



        • #5
          Is your combination

          Code:
          gen naturalized = (geobrth == 3) & inlist(yrsinus, 3, 4, 5) & (citizenp == 2) 
          or should there be | instead of & there?



          • #6
            Yes, I believe so because I want each of those things to be true at once. Would I use the egen command instead of the gen command?



            • #7
              If you want all conditions to be true, you need & as the operator.

              There is no choice here. egen doesn't support expressions of the kind in #5.
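
To make the difference between & and | concrete, here is a small sketch that applies both versions of the expression from #5 to four made-up observations. The data values and the variable names all_true and any_true are assumptions for illustration only.

```stata
* Toy data: names follow the thread, values are invented for illustration
clear
input byte(citizenp yrsinus geobrth)
2 4 3
2 2 3
1 5 3
1 1 1
end

* & : 1 only when all three conditions hold, 0 otherwise
generate byte all_true = (geobrth == 3) & inlist(yrsinus, 3, 4, 5) & (citizenp == 2)

* | : 1 when at least one condition holds, 0 only when none does
generate byte any_true = (geobrth == 3) | inlist(yrsinus, 3, 4, 5) | (citizenp == 2)

list
```

Only the first observation satisfies all three conditions, so all_true is 1 there and 0 elsewhere, while any_true is 1 for the first three observations and 0 for the last.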



              • #8
                Thanks, Nick:

                I see that I have two categories (0 & 1) that should all just represent those who are naturalized citizens. Is this correct?

                dataex naturalized

                ----------------------- copy starting from the next line -----------------------
                Code:
                * Example generated by -dataex-. To install: ssc install dataex
                clear
                input float naturalized
                1
                1
                0
                0
                0
                1
                1
                1
                0
                1
                0
                1
                0
                0
                0
                1
                1
                0
                1
                1
                1
                1
                1
                0
                0
                1
                1
                1
                0
                1
                0
                1
                1
                1
                0
                0
                0
                0
                1
                1
                1
                0
                0
                0
                0
                1
                0
                0
                0
                1
                1
                0
                0
                1
                1
                0
                1
                1
                0
                1
                1
                0
                0
                1
                0
                0
                1
                0
                0
                0
                0
                1
                1
                1
                0
                0
                0
                1
                0
                0
                1
                0
                0
                0
                0
                0
                1
                1
                0
                0
                1
                1
                0
                1
                1
                0
                0
                0
                1
                0
                end
                ------------------ copy up to and including the previous line ------------------

                Listed 100 out of 1358 observations
                Use the count() option to list more




                • #9
                  Hi Roman. If you used the code Nick gave in #5, variable naturalized = 1 for each subject who meets all 3 conditions, and naturalized = 0 for subjects who do not meet all 3 conditions. HTH.
                  --
                  Bruce Weaver
                  Email: [email protected]
                  Version: Stata/MP 18.5 (Windows)
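
As a sketch of how the 0/1 coding described above could be checked and labelled, the following uses made-up data, a hypothetical label name natlbl, and hypothetical label text. It also assumes that some observations might have missing values in the original variables, which the thread does not confirm; with the expression from #5, such observations are coded 0 rather than missing unless handled separately.

```stata
* Toy data: names follow the thread, values are invented for illustration
clear
input byte(citizenp yrsinus geobrth)
2 4 3
1 5 3
2 2 3
. 4 3
end

* Indicator as in #5: 1 if all three conditions hold, 0 otherwise
generate byte naturalized = (geobrth == 3) & inlist(yrsinus, 3, 4, 5) & (citizenp == 2)

* Optional value labels so that tables are easier to read
label define natlbl 0 "Does not meet all 3 conditions" 1 "Meets all 3 conditions"
label values naturalized natlbl
tabulate naturalized

* A missing citizenp fails the test citizenp == 2, so that row is 0 above.
* If missing is preferred whenever any input is missing, one option is:
replace naturalized = . if missing(citizenp, yrsinus, geobrth)
```

Whether 0 or missing is the right code for incomplete cases depends on the analysis.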



                  • #10
                    Thank you, Bruce.
