
  • auto creating an ordinal categorical variable from a continuous variable

    Hi, I would like to create an ordinal categorical variable (4 values) from a continuous variable in Stata. What command can I use?

    For example, if I have a continuous variable that runs from 1 to 100, how do I use a command to automatically create 4 ordinal categories? Thank you!

  • #2
    There is nothing "automatic" about creating an ordinal variable from a continuous variable. The cutpoints used to divide the range of the continuous variable should be chosen thoughtfully, reflecting whatever it is you are trying to accomplish by doing this. It requires taking into account the real-world meaning of the original variable and what the real-world associations of the continuous values are and what they entail for your planned analyses.

    Finally, I can't leave this without pointing out that categorizing a continuous variable is usually a bad idea. It discards information and adds noise. It treats values close to but on opposite sides of a cutpoint as radically different even though in the real world they are usually quite similar. And it treats values at the opposite ends of the range of one of the categories as the same, even though in the real world they are often quite different. So you should only do this if you really have a good reason that outweighs these drawbacks, or in the fortunate situation of being able to choose cutoffs that minimize or eliminate this problem.

    You might want to read https://www.fharrell.com/post/errmed/#dichotomania for more about the drawbacks of categorizing continuous variables. While much of Frank Harrell's discussion there is specifically about dichotomizing, everything said there applies equally to any number of categories.
    Last edited by Clyde Schechter; 27 Dec 2024, 13:42.



    • #3
      Thanks, Clyde. I understand your perspective.

      I thought there was something that could be done with the egen command, but I can't figure it out.



      • #4
        While I strongly agree with what Clyde Schechter says, and while your use of "auto creating" is not clear to me, the "cut" function in the -egen- command may help; see
        Code:
        h egen
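
        For what it's worth, here is a minimal sketch of how the cut() function might be used. It assumes a variable named score (a hypothetical name, as are score_cat and score_grp) that takes the values 1 to 100, and the cutpoints in at() are placeholders rather than a recommendation.

        ```stata
        * Hypothetical data: 100 observations with score = 1, 2, ..., 100
        clear
        set obs 100
        generate score = _n

        * at() lists the left-hand end of each interval plus an upper limit.
        * Values below the first cutpoint, or at or above the last, become missing.
        * icodes requests integer codes 0, 1, 2, ... instead of the left-hand ends.
        egen score_cat = cut(score), at(1 26 51 76 101) icodes

        * group(4) instead asks for four groups of roughly equal frequency,
        * coded 0 to 3
        egen score_grp = cut(score), group(4)

        tabulate score_cat
        tabulate score_grp
        ```

        The at() form suits cutpoints chosen in advance, while group() derives them from the data, so the resulting categories depend on the sample at hand.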



        • #5
          I echo Clyde Schechter's comments on the drawbacks of carving quantitative variables into categories. In case you are interested, here is a short conference presentation (with a rather cheeky title) I gave on this topic several years ago.
          Free Advice for Power-Hungry Researchers: Do Not Categorize Quantitative Variables!
          Cheers,
          Bruce
          --
          Bruce Weaver
          Version: Stata/MP 18.5 (Windows)



          • #6
            Thanks, all. I'm just trying to show some categorization to describe the characteristics of individuals in a study for a Table 1, and I thought there was a command with which Stata automatically generates ordinal variables from continuous variables.



            • #7
              I agree with the previous comments regarding the drawbacks of categorizing a continuous variable, particularly the loss of information. However, to answer your question, you can automate the generation of a categorical variable if the cutoffs are deterministic. If they are additionally evenly spaced, e.g., if the original scale is 1–100 and you need 1–25 = 1, 26–50 = 2, 51–75 = 3, and 76–100 = 4, a simple application of the floor or ceiling function will achieve this. Consider the following example:

              Code:
              * simulate 30 integer values between 1 and 100
              clear
              set obs 30
              set seed 12282024
              gen continuous = runiformint(1, 100)
              * 1-25 -> 1, 26-50 -> 2, 51-75 -> 3, 76-100 -> 4
              gen wanted = ceil(continuous/25)
              Result:


              Code:
              . l, sep(0)
              
                   +-------------------+
                   | contin~s   wanted |
                   |-------------------|
                1. |       62        3 |
                2. |       83        4 |
                3. |       19        1 |
                4. |       66        3 |
                5. |       60        3 |
                6. |       73        3 |
                7. |       10        1 |
                8. |        9        1 |
                9. |       19        1 |
               10. |       15        1 |
               11. |       69        3 |
               12. |       75        3 |
               13. |       95        4 |
               14. |       22        1 |
               15. |       75        3 |
               16. |       88        4 |
               17. |      100        4 |
               18. |       52        3 |
               19. |       61        3 |
               20. |       97        4 |
               21. |       22        1 |
               22. |       42        2 |
               23. |       28        2 |
               24. |       66        3 |
               25. |       15        1 |
               26. |       90        4 |
               27. |       60        3 |
               28. |       12        1 |
               29. |       75        3 |
               30. |       28        2 |
                   +-------------------+
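
              If the cutoffs are fixed but not evenly spaced, the irecode() function is one possible alternative to ceil(). The following is only a sketch: the data are made up to cover every value from 1 to 100, the cutoffs 25, 50 and 75 are kept so that the result can be checked against the ceil() approach, and the label name wanted_lbl and the label text are hypothetical.

              ```stata
              * Hypothetical data covering every value from 1 to 100
              clear
              set obs 100
              generate continuous = _n

              * irecode() returns 0 if x <= 25, 1 if 25 < x <= 50,
              * 2 if 50 < x <= 75, and 3 if x > 75; adding 1 gives codes 1 to 4
              generate byte wanted = 1 + irecode(continuous, 25, 50, 75)

              * With evenly spaced cutoffs this matches the ceil() approach
              assert wanted == ceil(continuous/25)

              * Value labels (hypothetical wording) make tables easier to read
              label define wanted_lbl 1 "1-25" 2 "26-50" 3 "51-75" 4 "76-100"
              label values wanted wanted_lbl
              tabulate wanted
              ```

              Changing the cutoffs inside irecode() to, say, 10, 40 and 90 would give unequal-width categories without any other change to the code.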
              Last edited by Andrew Musau; 28 Dec 2024, 03:00.
