
  • How to get summary statistics across groups rather than within groups?

    Hi everyone,

    I'm using Stata to deal with a hierarchical dataset, where I have level one data (individual) and level two data (household). My data looks like this:

    ind_id    household    income    housesize
    1            1            12000        57
    2            1            39000        57
    3            2            27560        140
    4            2            39000        140
    5            2            25000        140
    6            3            27435        120
    7            3            22390        120

    So here each individual from the same household has the same housesize. I want to get the mean and standard deviation of housesize across households, counting each household once. For instance, with the data here the mean should be (57+140+120)/3. Which command should I use?

    Since I have many variables like housesize, and I want the mean, sd, and max of all of them in one file, is there a command that does this directly?

    Thank you so much!
    Last edited by Lucrecia Lei; 24 Apr 2022, 12:31.

  • #2
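    * flag exactly one observation per household
    egen tag = tag(household)
    * summarize over tagged observations only, so each household counts once
    sum housesize if tag
    * store the household-level mean and sd as variables
    egen wanted1 = mean(cond(tag, housesize, .))
    egen wanted2 = sd(cond(tag, housesize, .))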
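
    On the follow-up question about getting mean, sd, and max for several household-level variables at once, here is a minimal sketch using the data from post #1. Once one observation per household is tagged, tabstat accepts a variable list and a statistics() option, so it may be enough on its own. Only housesize comes from the original post; any other household-level variables are an assumption and would simply be added to the variable list.

    ```stata
    clear
    input ind_id household income housesize
    1 1 12000 57
    2 1 39000 57
    3 2 27560 140
    4 2 39000 140
    5 2 25000 140
    6 3 27435 120
    7 3 22390 120
    end

    * flag exactly one observation per household
    egen tag = tag(household)

    * mean, sd, and max computed over households, not individuals;
    * further household-level variables can be listed after housesize
    tabstat housesize if tag, statistics(mean sd max) columns(statistics)
    ```

    With these seven rows the reported mean should match the (57+140+120)/3 from the question, because only three tagged observations enter the calculation.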


    • #3
      Originally posted by Andrew Musau
      egen tag= tag(household)
      sum housesize if tag
      egen wanted1= mean(cond(tag, housesize, .))
      egen wanted2= sd(cond(tag, housesize, .))
      Thank you Andrew! This is exactly what I want!


      • #4
        I also just figured out that another way is to use collapse, but having to restore the data afterwards is annoying to me, so tag is much better!
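
        For comparison, the collapse route mentioned here might look roughly like the sketch below. It is not code from the thread: the result names mean_hs, sd_hs, and max_hs and the file name household_stats are made up for illustration. The preserve/restore pair is what brings the individual-level data back afterwards.

        ```stata
        clear
        input ind_id household income housesize
        1 1 12000 57
        2 1 39000 57
        3 2 27560 140
        4 2 39000 140
        5 2 25000 140
        6 3 27435 120
        7 3 22390 120
        end

        preserve
        * reduce to one row per household; housesize is constant within household
        collapse (first) housesize, by(household)
        * household-level statistics (result names are hypothetical)
        collapse (mean) mean_hs=housesize (sd) sd_hs=housesize (max) max_hs=housesize
        list
        * hypothetical file name; writes the one-row summary to disk
        save household_stats, replace
        restore
        ```

        The extra step compared with tag is visible here: the data in memory are replaced by the summary, so they have to be restored before further work, but the summary ends up in a file of its own.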

