
  • How to compute the sum of the first occurrence within group(s) in Stata?

    I have a small long-format dataset, shown below.
    * Example generated by -dataex-. For more info, type help dataex
    clear
    input byte (gender period pr)
    1 1 11
    1 1 11
    1 1 11
    1 2 15
    1 2 15
    1 2 15
    2 1 11
    2 1 11
    2 1 11
    2 2 12
    2 2 12
    2 2 12
    end

    I want to create a new variable, sum_pr, that equals the sum, within each gender,
    of the first value of pr in each period, as shown below:
    * Example generated by -dataex-. For more info, type help dataex
    clear
    input byte (gender period pr sum_pr)
    1 1 11 26
    1 1 11 26
    1 1 11 26
    1 2 15 26
    1 2 15 26
    1 2 15 26
    2 1 11 23
    2 1 11 23
    2 1 11 23
    2 2 12 23
    2 2 12 23
    2 2 12 23
    end

    How can I achieve this in Stata?
    Thank you!




  • #2
    Code:
    . generate seq = _n
    
    . bysort gender (period seq): egen sum_pr = total(cond(period!=period[_n-1],pr,0))
    
    . drop seq
    
    . list, sepby(gender)
    
         +-------------------------------+
         | gender   period   pr   sum_pr |
         |-------------------------------|
      1. |      1        1   11       26 |
      2. |      1        1   11       26 |
      3. |      1        1   11       26 |
      4. |      1        2   15       26 |
      5. |      1        2   15       26 |
      6. |      1        2   15       26 |
         |-------------------------------|
      7. |      2        1   11       23 |
      8. |      2        1   11       23 |
      9. |      2        1   11       23 |
     10. |      2        2   12       23 |
     11. |      2        2   12       23 |
     12. |      2        2   12       23 |
         +-------------------------------+
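    As an editorial aside, the same total can be reached without explicit subscripts by tagging one observation per gender-period group and totalling pr over the tagged observations. This is a minimal sketch, not the code from #2: the variable name first is illustrative, and it assumes pr is constant within each gender-period group (as in the example data), so it does not matter which observation is tagged.

    ```stata
    clear
    input byte(gender period pr)
    1 1 11
    1 1 11
    1 1 11
    1 2 15
    1 2 15
    1 2 15
    2 1 11
    2 1 11
    2 1 11
    2 2 12
    2 2 12
    2 2 12
    end

    * Mark exactly one observation in each gender-period group (1 = marked, 0 = not)
    egen byte first = tag(gender period)

    * Within gender, add up pr over the marked observations only
    egen sum_pr = total(pr * first), by(gender)

    list, sepby(gender)
    ```

    On the example data this should give 26 for gender 1 (11 + 15) and 23 for gender 2 (11 + 12), matching the requested sum_pr.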



    • #3
      Thank you!



      • #4
        Sorry to bother you all again. I don't think I expressed myself clearly.
        What I really want is shown below: it should be a running sum rather than a total.
        * Example generated by -dataex-. For more info, type help dataex
        clear
        input byte (gender period pr sum_pr)
        1 1 11 11
        1 1 11 11
        1 1 11 11
        1 2 15 26
        1 2 15 26
        1 2 15 26
        2 1 11 11
        2 1 11 11
        2 1 11 11
        2 2 12 23
        2 2 12 23
        2 2 12 23
        end

        Could someone help me do this in Stata?
        Thank you!



        • #5
          Sorry, no, not the way you have posted this data. Please take a moment to read the output of -help dataex- and post code and output within the code delimiters. This will make it easy for us to read and, if necessary, copy data into Stata.



          • #6
            If I did something wrong, please forgive me. I had to post again here to ask for help.



            • #7
              Thank you for editing your post in #4 to make the dataset usable after dataex. It can be a bit tricky to ask questions here and get the formatting right, etc. No need to apologize, it's a learning curve.

              This will get you a running sum, within gender, of the first value of pr in each period.

              Code:
              clear
              input byte (gender period pr sum_pr)
              1 1 11 11
              1 1 11 11
              1 1 11 11
              1 2 15 26
              1 2 15 26
              1 2 15 26
              2 1 11 11
              2 1 11 11
              2 1 11 11
              2 2 12 23
              2 2 12 23
              2 2 12 23
              end
              
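              * Tag one observation in each gender-period group
              egen first = tag(gender period)
              * Running sum of pr within gender, over the tagged observations only
              by gender (period), sort: gen want = sum(pr) if first
              * Fill the remaining observations with the first observation's value in each group
              by gender period, sort: replace want = want[1] if mi(want)
              list, sepby(gender period)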

              Result:

              Code:
              . list, sepby(gender period)
              
                   +----------------------------------------------+
                   | gender   period   pr   sum_pr   first   want |
                   |----------------------------------------------|
                1. |      1        1   11       11       1     11 |
                2. |      1        1   11       11       0     11 |
                3. |      1        1   11       11       0     11 |
                   |----------------------------------------------|
                4. |      1        2   15       26       1     26 |
                5. |      1        2   15       26       0     26 |
                6. |      1        2   15       26       0     26 |
                   |----------------------------------------------|
                7. |      2        1   11       11       1     11 |
                8. |      2        1   11       11       0     11 |
                9. |      2        1   11       11       0     11 |
                   |----------------------------------------------|
               10. |      2        2   12       23       1     23 |
               11. |      2        2   12       23       0     23 |
               12. |      2        2   12       23       0     23 |
                   +----------------------------------------------+
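              As an editorial aside, the running sum can also be built in one pass, with no missing values to fill in afterwards. This is a sketch under stated assumptions rather than the code from #7: the variable name first is illustrative, pr is assumed constant within each gender-period group (as in the example), and periods are assumed to accumulate in the sort order of period.

              ```stata
              clear
              input byte(gender period pr)
              1 1 11
              1 1 11
              1 1 11
              1 2 15
              1 2 15
              1 2 15
              2 1 11
              2 1 11
              2 1 11
              2 2 12
              2 2 12
              2 2 12
              end

              * 1 on the first observation of each gender-period block, 0 elsewhere
              bysort gender period: gen byte first = _n == 1

              * The data are still sorted by gender period, so within gender the
              * running sum steps up exactly where a new period block begins
              by gender: gen want = sum(pr * first)

              list, sepby(gender period)
              ```

              On the example data, want should be 11 then 26 for gender 1 and 11 then 23 for gender 2, the same values as the sum_pr requested in #4. Because the increment lands on the first row of each block, every row in the block already carries the right value.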



              • #8
                Thank you very much!



                • #9
                  You're welcome.
