Announcement

Collapse
No announcement yet.
X
  • Filter
  • Time
  • Show
Clear All
new posts

  • Calculating cumulative percentage for panel data

    Hi Everyone,

    I am looking for some help to calculate the cumulative percentages for a panel data.

    Code:
             year         id   income
     2001    1    34   
    2001    1    34    
    2001    1    34    
    2001    2    45    
    2001    2    45    
    2001    2    45   
    2002    1    23   
    2002    1    23   
    2002    2    67    
    2002    2    67    
    2002    2    67    
    2002    2    67
    This is a panel data sorted by the year and the id. The income is the mean income of all the individuals with a particular id. For example, in 2001, the mean income for all the individuals with id 1 is 34.

    The result should be like this:
    Code:
    Year    Id    income    Rank    Frequency    cumfreq    cummPercentage
    2001    1    34    2    3    3    50
    2001    1    34    2    3    3    50
    2001    1    34    2    3    3    50
    2001    2    45    1    3    6    100
    2001    2    45    1    3    6    100
    2001    2    45    1    3    6    100
    2002    1    23    2    2    2    33
    2002    1    23    2    2    2    33
    2002    2    67    1    4    6    100
    2002    2    67    1    4    6    100
    2002    2    67    1    4    6    100
    2002    2    67    1    4    6    100
    Please note that since this is a panel data, I need the cumulative percentages for each year.

    I have gone through some informative threads on related topics, but most of them demonstrate for cross sectional data. I could not find anything on panel data.

    I would appreciate any help.

    Thanks a ton,

    J
    Last edited by Jason Cruso; 21 May 2021, 00:40.

  • #2
    I only follow some of this. For example, if id 1 has mean 34 in 2001 and id 2 has mean 45, then the cumulative percents are not 50 and 100.

    Also, is the calculation supposed to ignore the different numbers in each household?

    You may be able to adapt this to your real problem It may be easier to work out total income for each household first.

    Code:
    clear 
    input year id   income
    2001    1    34   
    2001    1    34    
    2001    1    34    
    2001    2    45    
    2001    2    45    
    2001    2    45   
    2002    1    23   
    2002    1    23   
    2002    2    67    
    2002    2    67    
    2002    2    67    
    2002    2    67
    end 
    
    sort year id 
    egen tag = tag(year id)
    by year (id) : gen total = sum(tag * income)
    bysort year id (total) : replace total = total[1]
    by year: gen percent = 100 * total/total[_N]
    
    list, sepby(id year)
    
        +---------------------------------------------+
         | year   id   income   tag   total    percent |
         |---------------------------------------------|
      1. | 2001    1       34     1      34   43.03798 |
      2. | 2001    1       34     0      34   43.03798 |
      3. | 2001    1       34     0      34   43.03798 |
         |---------------------------------------------|
      4. | 2001    2       45     1      79        100 |
      5. | 2001    2       45     0      79        100 |
      6. | 2001    2       45     0      79        100 |
         |---------------------------------------------|
      7. | 2002    1       23     1      23   25.55556 |
      8. | 2002    1       23     0      23   25.55556 |
         |---------------------------------------------|
      9. | 2002    2       67     1      90        100 |
     10. | 2002    2       67     0      90        100 |
     11. | 2002    2       67     0      90        100 |
     12. | 2002    2       67     0      90        100 |
         +---------------------------------------------+

    Comment


    • #3
      Hi Nick,
      Thank you so much for your inputs!
      You have made me think if I am doing this right.
      I was interpreting cumulative percentage as:
      cumulative percentage = (cumulative frequency รท n) x 100
      Where n is the total count of the individuals for a given year.

      J

      Comment


      • #4
        I can't tell you what you want. This code works on total household income.

        Code:
        clear 
        input year id   income
        2001    1    34   
        2001    1    34    
        2001    1    34    
        2001    2    45    
        2001    2    45    
        2001    2    45   
        2002    1    23   
        2002    1    23   
        2002    2    67    
        2002    2    67    
        2002    2    67    
        2002    2    67
        end 
        
        bysort year id : gen hhtotal = income * _N 
        egen tag = tag(year id)
        by year (id) : gen total = sum(tag * hhtotal)
        bysort year id (total) : replace total = total[1]
        by year: gen percent = 100 * total/total[_N]
        
        list, sepby(id year)
        
             +-------------------------------------------------------+
             | year   id   income   hhtotal   tag   total    percent |
             |-------------------------------------------------------|
          1. | 2001    1       34       102     1     102   43.03798 |
          2. | 2001    1       34       102     0     102   43.03798 |
          3. | 2001    1       34       102     0     102   43.03798 |
             |-------------------------------------------------------|
          4. | 2001    2       45       135     1     237        100 |
          5. | 2001    2       45       135     0     237        100 |
          6. | 2001    2       45       135     0     237        100 |
             |-------------------------------------------------------|
          7. | 2002    1       23        46     1      46   14.64968 |
          8. | 2002    1       23        46     0      46   14.64968 |
             |-------------------------------------------------------|
          9. | 2002    2       67       268     1     314        100 |
         10. | 2002    2       67       268     0     314        100 |
         11. | 2002    2       67       268     0     314        100 |
         12. | 2002    2       67       268     0     314        100 |
             +-------------------------------------------------------+

        Comment


        • #5
          Thanks Nick! I get it now.
          I appreciate your help.

          Comment


          • #6
            Hi!
            I am new to Stata and I'm trying to make a very simple graph. I have units that do a certain action, in a given year, and units that don't ever perform the action. My aim is to show that the percentage of units doing that action increases over time. I start with a % around 0 (very few units join the "program", so to speak) and am likely to end up with 20% of units doing so.
            I would like to do this with a by(var) and overlay, in order to plot three different sectors of units growing in percentage of take-up with respect to their peers, as time goes by.

            Also, my data is from a survey, so i need to add probabilistic weights to the percentages to get them right with respect to the population. The survey is also stratified (with 420 strata). I used svy to tell Stata this, but it is not compatible with plots and graphs.

            Comment

            Working...
            X