  • Reporting the largest value from an observation

    Dear Statausers

    I have a dataset containing total percentage of religious adherents looking like this:

    year state christianity islam buddhism
    1987 Norway 67 21 8

    That is, 67% of the Norwegian population is Christian, 21% Muslim, etc.

    I want to create a new variable called religion that reports which of the three religions most Norwegians adhere to.

    E.g.
    gen religion=0
    replace religion=1 if christianity is the largest of the three
    replace religion=2 if islam is the largest of the three
    replace religion=3 if buddhism is the largest of the three

    Any help would be appreciated.

  • #2
    What happens with ties?

    Code:
    gen max = max(christianity, islam, buddhism)
    
    gen largest = cond(christianity == max, 1, cond(islam == max, 2, 3))
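
    On the ties question, a minimal sketch of one way to see whether any observations are tied, assuming the three variables are numeric. The names rowmax and ntied are made up for illustration and are not part of the suggestion above.

    ```stata
    * rowmax and ntied are hypothetical names used only in this sketch
    gen rowmax = max(christianity, islam, buddhism)

    * count how many of the three variables equal the maximum in each observation
    gen ntied = (christianity == rowmax) + (islam == rowmax) + (buddhism == rowmax) if !missing(rowmax)

    * observations where two or more religions share the largest value
    list if ntied > 1 & !missing(ntied)
    ```

    With the nested cond() above, a tie goes to whichever variable is tested first.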



    • #3
      Dear Nick

      Thank you for the quick response to my post.

      The first command reports back the highest value, e.g. 67, since christianity is the highest. The second command, however, reports back the following:

      gen largest= cond(christ==max, 1) cond(islm==max, 2) con(budd==max, 3))
      invalid syntax
      r(198);

      Regards



      • #4
        Thomas:
        I would check the brackets in your code.
        Kind regards,
        Carlo
        (StataNow 18.5)



        • #5
          Thomas,

          there is at least one typo in your version of Nick's command: "con" when it should be "cond"



          • #6
            I meant what I suggested! You tried something quite different and didn't say why.

            Here's proof of concept.

            Code:
            clear 
            input christianity islam buddhism
            67 21 8
            1 2 97 
            30 40 40 
            end 
            
            gen max = max(christianity, islam, buddhism)
            
            gen largest = cond(christianity == max, 1, cond(islam == max, 2, 3))
            
            list 
            
                 +---------------------------------------------+
                 | christ~y   islam   buddhism   max   largest |
                 |---------------------------------------------|
              1. |       67      21          8    67         1 |
              2. |        1       2         97    97         3 |
              3. |       30      40         40    40         2 |
                 +---------------------------------------------+



            • #7
              I'm sorry. I didn't explain the entire structure of my data.

              I have observations for several states, not just Norway. E.g.

              year state christianity islam buddhism
              1987 Norway 67 21 8
              1988 Norway 67 22 7
              1987 Sweden 55 41 3
              1988 Sweden 53 43 3

              Would the command "by state year: gen max...." do the trick?



              • #8
                The structure makes no difference here to my recommendation. You want maximum across an observation; what's in other observations makes no difference.
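
                To illustrate the point, here is a sketch using the four observations from #7. The only assumption is that state is read in as a string variable (str6); the two generate lines are the same as in #6, with no by: prefix.

                ```stata
                clear
                input year str6 state christianity islam buddhism
                1987 Norway 67 21 8
                1988 Norway 67 22 7
                1987 Sweden 55 41 3
                1988 Sweden 53 43 3
                end

                * max() compares the variables within each observation,
                * so the panel structure (state, year) plays no role
                gen max = max(christianity, islam, buddhism)
                gen largest = cond(christianity == max, 1, cond(islam == max, 2, 3))

                list
                ```

                Christianity is the largest in every one of these rows, so largest should be 1 throughout.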



                • #9
                  OK. Now I've got it. Thanks a lot. I think some of the variables were coded as strings.
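
                  For anyone hitting the same problem, a rough sketch of how string-coded variables might be checked and converted, assuming the strings hold nothing but numeric text. The variable names are the ones from the original example.

                  ```stata
                  * storage types beginning with str indicate string variables
                  describe christianity islam buddhism

                  * convert variables holding numeric text to numeric;
                  * variables containing nonnumeric characters are left unchanged
                  destring christianity islam buddhism, replace
                  ```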

                  If I had more than 3 religions, would the coding look like this?

                  gen max = max(christianity, islam, judaism, buddhism, shintoism, confusionism)
                  gen largest = cond(christianity == max, 1, cond(islam == max, 2, cond(judaism == max, 3, cond(buddhism == max, 4)....

                  I tried it with the total of 14 religions and I got this error:

                  . gen largest = cond(christ == max, 1, cond(jue == max, 2, cond(islm == max, 3, cond(bud == max, 4, cond(zoro == max, 5, cond(hind == max, 6, cond(sihk == max, 7, cond(shint == max, 8, cond(bah == max, 9, cond(tao == max, 10, cond(jai == max, 11, cond(confu == max, 12, cond(sync == max, 13, 14))
                  too few ')' or ']'



                  • #10
                    Stata is telling you the problem: you need to end with as many ) as you put down (, just as in school algebra, so the 13 calls to cond() need 13 closing parentheses.

                    With 14 cases I wouldn't want to count up to 13.

                    I would try something different, an extension of

                    Code:
                    clear
                    input christianity islam buddhism
                    67 21 8
                    1 2 97
                    30 40 40
                    end
                    
                    gen max = .
                    gen biggest = .
                    local j = 1
                    quietly foreach v in christianity islam buddhism {
                        replace max = max(max, `v')
                        replace biggest = `j' if `v' == max
                        local ++j
                    }
                    
                    list
                    
                         +---------------------------------------------+
                         | christ~y   islam   buddhism   max   biggest |
                         |---------------------------------------------|
                      1. |       67      21          8    67         1 |
                      2. |        1       2         97    97         3 |
                      3. |       30      40         40    40         3 |
                         +---------------------------------------------+
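
                    A possible extension of the loop above, offered as a sketch rather than a definitive version. It makes two additions that are assumptions beyond the code above: a !missing() guard, because . == . is true in Stata and a missing value would otherwise match a still-missing max, and value labels so that biggest shows the variable name. Extend the variable list to all 14 religions as needed.

                    ```stata
                    clear
                    input christianity islam buddhism
                    67 21 8
                    1 2 97
                    30 40 40
                    end

                    gen max = .
                    gen biggest = .
                    local j = 1
                    quietly foreach v of varlist christianity islam buddhism {
                        replace max = max(max, `v')
                        * guard added in this sketch: skip missing values of the current variable
                        replace biggest = `j' if `v' == max & !missing(`v')
                        * build up a value label mapping each code to its variable name
                        label define biggest `j' "`v'", modify
                        local ++j
                    }
                    label values biggest biggest

                    list
                    ```

                    As in the loop above, a tie goes to the last tied variable in the list, whereas the nested cond() in #6 gives it to the first.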
                    See also https://www.stata-journal.com/sjpdf....iclenum=pr0046
                    Last edited by Nick Cox; 04 Jun 2018, 13:18.
