  • How to calculate the cumulative mean with different weights by groups?

    The weight is the variable order.

    I know that to calculate the cumulative mean by groups, we can use

    Code:
    rangestat (mean) cumulative=score, interval(order -18 -1) by(ID)
    However, if I want to weight the cumulative mean by order, how do I calculate it?

    For example, for observation 2 the expected value is 68.8*1, and for observation 3 the expected value is 68.8*1/(1+2) + 73.7*2/(1+2).

    Many thanks!

    Code:
    * Example generated by -dataex-. For more info, type help dataex
    clear
    input double score byte order float ID
                 68.8  1 1
                 73.7  2 1
    76.46000000000001  3 1
                71.74  4 1
                 57.8  5 1
                 70.2  6 1
                 62.4  7 1
    77.97999999999999  8 1
    69.46000000000001  9 1
                 79.1 10 1
    67.96000000000001 12 1
    69.03999999999999 13 1
                68.16 15 1
    76.03999999999999 16 1
                 63.9  1 2
                 68.8  2 2
                   60  3 2
                 64.8  4 2
                 78.1  5 2
                 75.9  6 2
                 71.2  7 2
                 58.2  8 2
                 59.5  9 2
                 64.2 10 2
                 60.6 11 2
                 67.9 12 2
                 74.5 13 2
                 70.3 14 2
                 66.8 15 2
                 64.4 16 2
                 77.3 17 2
    end
    Last edited by Fred Lee; 13 Dec 2021, 06:53.

  • #2
    rangestat is from SSC, as you are asked to explain.

    Does this help?

    Code:
    clear
    input double score byte order float ID
                 68.8  1 1
                 73.7  2 1
    76.46000000000001  3 1
                71.74  4 1
                 57.8  5 1
                 70.2  6 1
                 62.4  7 1
    77.97999999999999  8 1
    69.46000000000001  9 1
                 79.1 10 1
    67.96000000000001 12 1
    69.03999999999999 13 1
                68.16 15 1
    76.03999999999999 16 1
                 63.9  1 2
                 68.8  2 2
                   60  3 2
                 64.8  4 2
                 78.1  5 2
                 75.9  6 2
                 71.2  7 2
                 58.2  8 2
                 59.5  9 2
                 64.2 10 2
                 60.6 11 2
                 67.9 12 2
                 74.5 13 2
                 70.3 14 2
                 66.8 15 2
                 64.4 16 2
                 77.3 17 2
    end
    
    bysort ID (order) : gen double numer = sum(order * score)
    by ID: gen denom = sum(order)
    
    gen double wanted = numer / denom 
    
    list, sepby(ID)
    
         +--------------------------------------------------+
         | score   order   ID     numer   denom      wanted |
         |--------------------------------------------------|
      1. |  68.8       1    1      68.8       1        68.8 |
      2. |  73.7       2    1     216.2       3   72.066667 |
      3. | 76.46       3    1    445.58       6   74.263333 |
      4. | 71.74       4    1    732.54      10      73.254 |
      5. |  57.8       5    1   1021.54      15   68.102667 |
      6. |  70.2       6    1   1442.74      21   68.701905 |
      7. |  62.4       7    1   1879.54      28   67.126429 |
      8. | 77.98       8    1   2503.38      36   69.538333 |
      9. | 69.46       9    1   3128.52      45   69.522667 |
     10. |  79.1      10    1   3919.52      55      71.264 |
     11. | 67.96      12    1   4735.04      67   70.672239 |
     12. | 69.04      13    1   5632.56      80      70.407 |
     13. | 68.16      15    1   6654.96      95   70.052211 |
     14. | 76.04      16    1    7871.6     111   70.915315 |
         |--------------------------------------------------|
     15. |  63.9       1    2      63.9       1        63.9 |
     16. |  68.8       2    2     201.5       3   67.166667 |
     17. |    60       3    2     381.5       6   63.583333 |
     18. |  64.8       4    2     640.7      10       64.07 |
     19. |  78.1       5    2    1031.2      15   68.746667 |
     20. |  75.9       6    2    1486.6      21   70.790476 |
     21. |  71.2       7    2      1985      28   70.892857 |
     22. |  58.2       8    2    2450.6      36   68.072222 |
     23. |  59.5       9    2    2986.1      45   66.357778 |
     24. |  64.2      10    2    3628.1      55   65.965455 |
     25. |  60.6      11    2    4294.7      66   65.071212 |
     26. |  67.9      12    2    5109.5      78    65.50641 |
     27. |  74.5      13    2      6078      91   66.791209 |
     28. |  70.3      14    2    7062.2     105   67.259048 |
     29. |  66.8      15    2    8064.2     120   67.201667 |
     30. |  64.4      16    2    9094.6     136   66.872059 |
     31. |  77.3      17    2   10408.7     153   68.030719 |
         +--------------------------------------------------+
    
    With any loosely similar problem, I would probably prefer exponential smoothing.
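
    For readers unfamiliar with the idea, here is a minimal sketch of simple exponential smoothing done by hand within each ID. It is only an illustration, not code from this thread: the smoothing constant alpha = 0.3 and the variable name smooth are assumptions, and the recursion treats successive rows as equally spaced, so gaps in order are ignored.

    ```stata
    * A few rows copied from the dataex example in #1
    clear
    input double score byte order float ID
    68.8 1 1
    73.7 2 1
    76.46 3 1
    63.9 1 2
    68.8 2 2
    60 3 2
    end

    * Assumed smoothing constant (hypothetical choice, not from the thread)
    local alpha = 0.3

    * Start each ID at its first score ...
    bysort ID (order): gen double smooth = score if _n == 1
    * ... then update recursively; replace works down the rows in order,
    * so smooth[_n-1] is already filled in when row _n is computed
    by ID: replace smooth = `alpha' * score + (1 - `alpha') * smooth[_n-1] if _n > 1

    list, sepby(ID)
    ```

    Unlike weighting by order, the weights here decline geometrically the further back an observation lies.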

    • #3
      Thank you! Since I want the cumulative mean of previous observations only, this works:
      Code:
      bysort ID (order) : gen double numer = sum(order * score) - order * score
      by ID: gen denom = sum(order) - order
      gen double wanted = numer / denom
      I am wondering whether there is an easier way, especially setting an option of weight when calculating the mean?

      • #4
        This is perhaps easier than what you had

        Code:
        bysort ID (order) : gen double numer = sum(order * score)
        by ID: gen denom = sum(order)
        by ID: gen double wanted = numer[_n-1] / denom[_n-1]

        "especially setting an option of weight when calculating the mean"

        An option of which command?

        You want several means at once. summarize for example can't help you, except within a loop.

        Note that you could rewrite the code above as one line. After 4 years using Stata and > 300 posts here, that could be an exercise.....
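
        One possible one-line version, as a sketch rather than the only answer to the exercise: subtract the current term from each running sum so that only earlier observations remain. The few data rows are copied from #1; the variable name wanted follows the code above.

        ```stata
        * A few rows copied from the dataex example in #1
        clear
        input double score byte order float ID
        68.8 1 1
        73.7 2 1
        76.46 3 1
        63.9 1 2
        68.8 2 2
        60 3 2
        end

        * Weighted mean of all previous observations within each ID, in one line.
        * In the first row of each ID the denominator is 0, so the result is missing.
        bysort ID (order): gen double wanted = (sum(order * score) - order * score) / (sum(order) - order)

        list, sepby(ID)
        ```

        For observation 3 of ID 1 this gives (68.8*1 + 73.7*2)/(1+2), which is the value asked for in #1.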

        • #5
          Originally posted by Nick Cox
          This is perhaps easier than what you had

          Code:
          bysort ID (order) : gen double numer = sum(order * score)
          by ID: gen denom = sum(order)
          by ID: gen double wanted = numer[_n-1] / denom[_n-1]



          an option of which command?

          You want several means at once. summarize for example can't help you, except within a loop.

          Note that you could rewrite the code above as one line. After 4 years using Stata and > 300 posts here, that could be an exercise.....
          Thanks, Nick! I will try!

          • #6
            OK

            Detail: There is no point in quoting the entirety of a previous post. You can just refer to #5, or whatever. The point of quotation is to be selective.
