  • Summarize command for panel data

    With the -summarize- command for panel data, what exactly does the mean show? For example, with the income variable, does it show the mean income amongst all individuals in one year or the mean income across all individuals in all years?

  • #2
    -summarize- will give you the mean income across all individuals in all years. -summarize- is not aware of panel structure in your data: its results are exactly what you would get if your data were not panel data. If you would like to get within and between statistics for your panel data, the command you need is -xtsum-, not -summarize-.
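To make the distinction concrete, here is a minimal sketch in Python (not Stata) of the two means the question asks about. The tiny income panel is made up for illustration, and the variable names are hypothetical. The pooled mean is the one that corresponds to what -summarize- reports.

```python
from statistics import mean

# Hypothetical long-format panel: (person id, year, income). Values are made up.
panel = [
    (1, 2019, 30.0), (1, 2020, 32.0), (1, 2021, 34.0),
    (2, 2019, 50.0), (2, 2020, 54.0), (2, 2021, 58.0),
]

# Pooled mean over every person-year row: this is what -summarize- gives.
pooled_mean = mean(income for _, _, income in panel)

# Per-year means, shown only for contrast (this is NOT what -summarize- does).
years = sorted({year for _, year, _ in panel})
mean_by_year = {
    y: mean(income for _, year, income in panel if year == y) for y in years
}

print(pooled_mean)   # one number for the whole data set
print(mean_by_year)  # one number per year
```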

    • #3
      What do overall, between, and within mean in -xtsum- output?

      • #4
        The overall results are the same results you would get from -summarize-.

        The between results are those you would get by taking the average value in each group, and then looking at the distribution of those averages (with only one observation for each group represented).

        The within results are obtained by subtracting the group mean from each observed value and providing descriptive statistics for those deviations. Well, sort of. The mean deviation from the group mean is necessarily 0 within groups. So, to make the within results more comparable to the overall results, the grand mean is added back.
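The three definitions above can be sketched as follows. This is an illustration in Python of the arithmetic as described in this post, not Stata's actual implementation. The data are made up, and using the sample (n - 1) standard deviation throughout is an assumption of the sketch.

```python
from statistics import mean, stdev

# Hypothetical balanced panel: person id -> incomes over three years (made up).
panel = {
    1: [30.0, 32.0, 34.0],
    2: [50.0, 54.0, 58.0],
}

all_values = [x for xs in panel.values() for x in xs]
grand_mean = mean(all_values)

# Overall: every observation pooled, ignoring the group structure.
overall_sd = stdev(all_values)

# Between: one mean per group, then the spread of those group means.
group_means = {g: mean(xs) for g, xs in panel.items()}
between_sd = stdev(list(group_means.values()))

# Within: each value minus its own group mean, with the grand mean added back.
within_values = [
    x - group_means[g] + grand_mean for g, xs in panel.items() for x in xs
]
within_sd = stdev(within_values)

print(overall_sd, between_sd, within_sd)
print(mean(within_values) == grand_mean)  # within values are centered on the grand mean
```

In this toy panel most of the variation is between people rather than within a person over time, which is why the between standard deviation is much larger than the within one.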

        • #5
          Would you say that the between results are best when trying to describe the mean and standard deviation for each variable?

          • #6
            And for the within row I don't get a mean. Why is that?

            • #7
              The mean is the same whether you are looking overall, between, or within (strictly, the between mean matches only when every group has the same number of observations), so Stata only shows it once. The standard deviations and ranges, however, differ, so Stata reports them separately.

              As for which is the "best" one to report, it depends on what you're trying to describe. In my own work, I usually report overall results when describing my sample. To the extent that I want to show how the nesting of observations within groups alters things, I usually just report an intraclass correlation, but if a more detailed presentation is needed, then the between and within results could be useful as well. It really depends on what your audience is interested in seeing.

              • #8
                Thank you! Also, why is there a large "N" and a small "n" in the observations column?

                • #9
                  N refers to the number of observations in the entire data set (for example, person-years) that have a nonmissing value for the variable. Small n refers to the number of groups.
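As a rough illustration of how the two counts relate, here is a small Python sketch on a made-up, unbalanced panel. The rows and names are hypothetical. The ratio N / n is the average number of observations per group, which -xtsum- shows as T (or T-bar when the panel is unbalanced).

```python
# Hypothetical unbalanced panel: (person id, year, income); None marks a missing value.
rows = [
    (1, 2019, 30.0), (1, 2020, 32.0), (1, 2021, None),
    (2, 2019, 50.0), (2, 2020, 54.0), (2, 2021, 58.0),
    (3, 2020, 41.0),
]

# Keep only rows where income is observed.
used = [(pid, income) for pid, _, income in rows if income is not None]

N = len(used)                      # total nonmissing observations
n = len({pid for pid, _ in used})  # number of distinct groups (people)
T_bar = N / n                      # average observations per group

print(N, n, T_bar)
```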
