
  • Generating variable that is average of different time periods by id

    Hi everyone,

    I am trying to generate a variable that is the average of base_share over four different periods. Below I show my variables modate2 (month date), hh_zip (zip code), and base_share. I want to do the following: say, for zip code 15210, I want to create a variable that holds the average of base_share for periods 2, 3, 4, and 5 (the excerpt below does not contain every period for that zip code, but that is only because you don't see all of my data here). So it would take .655 + .604 + etc. and give me a variable that is the average over the four periods. How would I do something like this? Thanks!

    Code:
    * Example generated by -dataex-. For more info, type help dataex
    clear
    input float modate2 long hh_zip float base_share
    1 15110  .7272727
    2 15136  .3714286
    2 15212  .6041667
    2 15136  .3714286
    2 15212  .6041667
    2 15212  .6041667
    2 15136  .3714286
    2 15212  .6041667
    3 15210  .6559139
    3 15210  .6559139
    3 15210  .6559139
    3 15210  .6559139
    3 15210  .6559139
    3 15210  .6559139
    3 15210  .6559139
    4 15332         .
    4 15132  .3611111
    4 15101       .25
    4 15216 .16666667
    4 15101       .25
    4 15101       .25
    4 15219  .8095238
    4 15132  .3611111
    4 15147        .7
    4 15136  .3714286
    4 15071         0
    4 15212  .6041667
    4 15221  .6153846
    4 15025  .5882353
    4 15136  .3714286
    4 15110  .7272727
    4 15122  .5714286
    4 15205  .4210526
    4     .  .3076923
    4 48503         .
    4 15223     .3125
    4 15235  .7142857
    4     .  .3076923
    4     0         .
    4 15101       .25
    4 15227        .5
    4 15132  .3611111
    4 15221  .6153846
    4 15136  .3714286
    4 15206  .6891892
    4 15120  .8235294
    4 15214  .6153846
    4 15202  .7142857
    4 15202  .7142857
    4 15210  .6559139
    4 15211 .14285715
    4 15202  .7142857
    4 15147        .7
    4 15132  .3611111
    4 15214  .6153846
    4 15215  .6666667
    4 15206  .6891892
    4 15129       .75
    4 15120  .8235294
    4 15065  .6363636
    4 15065  .6363636
    4 15206  .6891892
    4 15210  .6559139
    4 15221  .6153846
    4 15237  .6666667
    4 15132  .3611111
    4 15132  .3611111
    4 15137  .3333333
    4 15219  .8095238
    4 15136  .3714286
    4 15219  .8095238
    4 15219  .8095238
    4 15215  .6666667
    4 15136  .3714286
    4 15025  .5882353
    4 15229  .7142857
    4     0         .
    4 15101       .25
    4 15212  .6041667
    4 15136  .3714286
    4 15212  .6041667
    4 15202  .7142857
    4 15136  .3714286
    4 15216 .16666667
    4 15057         1
    4 15204  .4210526
    4 15108 .58536583
    4 15206  .6891892
    4 15202  .7142857
    4 15207       .68
    4 15211 .14285715
    4 15212  .6041667
    4 15122  .5714286
    4 15236  .5263158
    4 15132  .3611111
    4 15132  .3611111
    4     .  .3076923
    4 15136  .3714286
    4 15212  .6041667
    4 15215  .6666667
    end

  • #2
    I believe you want
    Code:
    by hh_zip (modate2), sort: egen wanted = mean(cond(inlist(modate2, 2, 3, 4, 5), ///
        base_share, .))
    Note: As you acknowledged, your example data does not provide a good substrate for this code. It has only a single observation for which modate2 is anything other than 2, 3, 4, or 5. And in that hh_zip, the value of base_share is exactly the same for that observation as it is for the sole other observation in that hh_zip; consequently it is impossible to distinguish the restricted mean from the overall mean. Nevertheless, I believe the above code will do what you want.
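
Since the posted example cannot show the restricted mean differing from the overall mean, here is a minimal sketch on toy data. The values are made up for illustration and are not from the thread, and the variable names wanted_per_period and first are hypothetical. It applies the same cond(inlist()) approach. It also shows a variant that counts each period once per zip code, which may be closer to "average over the four periods" when a zip code has several observations in the same month. That variant assumes base_share is constant within hh_zip and modate2, as it appears to be in the posted data.

```stata
* Toy data: hypothetical values chosen so the results are easy to check by hand
clear
input float modate2 long hh_zip float base_share
1 15210 .1
2 15210 .4
2 15210 .4
2 15210 .4
3 15210 .6
4 15210 .8
5 15210  1
2 15136 .3
6 15136 .9
end

* Mean over all observations in periods 2-5, filled in for every row of the zip code
bysort hh_zip (modate2): egen wanted = mean(cond(inlist(modate2, 2, 3, 4, 5), base_share, .))

* Variant: tag one observation per zip-period so each period counts once
* (assumes base_share does not vary within hh_zip and modate2)
egen byte first = tag(hh_zip modate2)
bysort hh_zip (modate2): egen wanted_per_period = mean(cond(first & inlist(modate2, 2, 3, 4, 5), base_share, .))

list, sepby(hh_zip)
```

For zip code 15210, wanted is (.4 + .4 + .4 + .6 + .8 + 1)/6 = .6, because period 2 contributes three observations, while wanted_per_period is (.4 + .6 + .8 + 1)/4 = .7. The period 1 observation is excluded from both. Because the condition sits inside mean() rather than in an if qualifier, the result is also stored in observations outside periods 2 to 5.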
