  • How to create a non-mutually exclusive categorical variable

    Hello, I know this topic has come up before, but I have read several posts and could not find one that quite worked for me. I have a non-mutually exclusive variable in which clinicians are asked the age of their patients: 0-2, 3-5, 7-10, 10-18, 18+. They can select more than one answer. Unfortunately, it is a string variable, which I cannot use (a cell holds something like 1, 3, 4), so I created dummy variables for each option. Now I would like to recreate the initial variable so that it tells me how many clinicians said 0-2, how many said 3-5, and so on; the percentages would not add up to 100. As some of you have pointed out, just creating a new variable with the gen and replace commands does not work. Any ideas? Thanks

  • #2
    It’s difficult to follow what you are asking. Please provide an example of how your variable looks and your intended results, preferably using the dataex command. About five observations should be enough.

    Code:
    help dataex


    • #3
      I agree with Andrew Musau. I've read #1 several times but am not clear at all on what you seek. There is no data example. You allude to previous posts without citing any. Are you working from a string variable or from the dummy (indicator) variables?


      • #4
        Initially, it was a string variable, Ageofpatients, but I cannot work with that. So I destringed it, which gave me dummy variables for each category. The original variable looked like this:

        Ageofpatients
        1,3,4 (for someone who said 0-2, 7-10, 10-18)
        1, 3 (for someone who said 0-2, 7-10)
        etc.

        I cannot work with this, so I destringed it, but the only way to do so was to create dummy variables for each category: age0-2, age7-10, age10-18, age18+.

        Now I would like to combine the dummy variables into one:

        Age of patients
        0-2 number of observations
        3-5 number of observations
        etc
        and the total will exceed my number of respondents.

        I cannot use the gen and replace commands because the categories are not mutually exclusive: each one gets swallowed by the previous category. I hope this makes more sense now. It should not be that complicated to do, and I am not sure why I have not been able to find a simple solution. Thanks for any help you can provide.


        • #5
          Hi Jennifer, It's not a matter of whether what you say makes sense or not. Andrew and Nick are asking for a data example (generated using the -dataex- command) so that they can see the structure of the data themselves and so they can work with an example of the data on their computers.

          I'd like to know what you think the combined variable should look like. For example, should there be a different category for every possible combination of age ranges? If you have 4 ranges, that's (2^4)-1 or 15 categories in the resulting variable.

          Also, why do you need to combine these ranges in the first place? Why not use separate dummy variables in (e.g.) a regression? Is this the outcome you are modeling?
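
          If the goal is only a count per age band, the separate indicators may already be enough. Here is a minimal sketch on made-up data, assuming hypothetical variable names (age0_2 and so on, since Stata variable names cannot contain - or +). The sum of each 0/1 indicator is the number of clinicians who ticked that band.

          ```stata
          * Made-up data: one row per clinician, one 0/1 indicator per age band.
          * Variable names are hypothetical, not taken from the real dataset.
          clear
          input age0_2 age3_5 age7_10 age10_18 age18plus
          1 0 1 1 0
          1 0 1 0 0
          0 1 0 0 1
          end

          * Summing a 0/1 indicator counts the clinicians who selected that band.
          tabstat age0_2 age3_5 age7_10 age10_18 age18plus, statistics(sum) columns(statistics)
          ```

          In this toy data the five sums total 7 across 3 clinicians, which is the "total exceeds the number of respondents" behaviour described in #4.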


          • #6
            Daniel Schaefer makes very good points. My own guess is that -tabsplit- from the package tab_chi on SSC may help. Search the forum for mentions. I am away from a computer at the moment.
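
            A sketch of what that might look like, untested and on made-up data. The parse() option is written from memory of the tabsplit help, so please check help tabsplit after installing. The variable name Ageofpatients comes from #4; id, choice and seq are hypothetical names. The second half aims at the same counts using official commands only.

            ```stata
            * Made-up data in the comma-separated form described in #4.
            clear
            input str10 Ageofpatients
            "1,3,4"
            "1, 3"
            "2,5"
            end

            * Strip blanks so that "1, 3" and "1,3" are parsed identically.
            replace Ageofpatients = subinstr(Ageofpatients, " ", "", .)

            * Option 1: tabsplit from tab_chi on SSC (install once).
            * The option syntax is an assumption; see help tabsplit.
            ssc install tab_chi, replace
            tabsplit Ageofpatients, parse(,)

            * Option 2: official commands only.
            generate long id = _n                                    // respondent identifier
            split Ageofpatients, parse(,) generate(choice) destring  // choice1, choice2, ...
            reshape long choice, i(id) j(seq)                        // one row per selected answer
            drop if missing(choice)                                  // drop unused answer slots
            tabulate choice                                          // frequency of each category
            ```

            Either way the frequencies are per answer rather than per respondent, so they can add up to more than the number of clinicians.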
