  • Creating person mean

    I am working with long-format data with repeated measures. I want to calculate a person mean, so that each person has the mean of their responses across three time points. How do I do this in Stata? I am a new user, so any help would be appreciated. Thanks a lot!

  • #2
    Well, hoping that your scantily described data matches what I imagine you have:
    Code:
    by person_id (time), sort: egen wanted = mean(response)
    In the future, when asking for help with code, it is best to leave nothing to the imagination about the organization of your data set. Even small details are often critical. And describing data sets in words, though necessary to some extent, is only rarely sufficient. Instead, show example data, and use the -dataex- command to do that. If you are running version 18, 17, 16 or a fully updated version 15.1 or 14.2, -dataex- is already part of your official Stata installation. If not, run -ssc install dataex- to get it. Either way, run -help dataex- to read the simple instructions for using it. -dataex- will save you time; it is easier and quicker than typing out tables. It includes complete information about aspects of the data that are often critical to answering your question but cannot be seen from tabular displays or screenshots. It also makes it possible for those who want to help you to create a faithful representation of your example to try out their code, which in turn makes it more likely that their answer will actually work in your data.
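    As a hedged illustration (the variable names person_id, time and response are the guesses used in the code above, not names confirmed by the original poster), an example could be produced with something like:
    Code:
    * show the first 12 observations of the three variables as a reproducible example
    dataex person_id time response in 1/12
    The output can then be copied from the Results window and pasted directly into a forum post.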


    • #3
      Quick question, Clyde Schechter. I noticed that you included time in the above code, but is that necessary or, perhaps more appropriately, when would that be necessary?

      For example, using the pig data, which is balanced, the following commands give the same mean weight for each pig:
      Code:
      webuse pig, clear
      bysort id (week): egen p_wt = mean(weight)
      bysort id: egen p_wt2 = mean(weight)
      With the following results:
      Code:
        +--------------------------+
        | id       p_wt      p_wt2 |
        |--------------------------|
        |  1   48.66667   48.66667 |
        |  1   48.66667   48.66667 |
        |--------------------------|
        |  2   51.33333   51.33333 |
        |  2   51.33333   51.33333 |
        |--------------------------|
        |  3   48.38889   48.38889 |
        |  3   48.38889   48.38889 |
        |--------------------------|
        |  4   48.55556   48.55556 |
        |  4   48.55556   48.55556 |
      Last edited by Erik Ruzek; 28 Jan 2024, 13:56. Reason: Fixed code


      • #4
        The inclusion of -(time)- is not necessary for the purpose of calculating the group means.

        However, it serves a secondary purpose. In order to use -by-, the data must be -sort-ed on person_id. But person_id is not a unique identifier in this data set. So if I just code -by person_id, sort: egen wanted = mean(response)-, the sorting may randomize the order of the observations within person_id. That side effect will sometimes be undesirable. At the least, the sudden scrambling of the order of the observations may be puzzling to somebody viewing the data. At the worst, it may necessitate re-sorting the data back to its original order before proceeding with further data management or analysis. So, where possible and practical, it is my practice when -by ..., sort:-ing to include one or more parenthesized variables that will, jointly with the by-group variable(s), uniquely identify observations and will either preserve the existing sort order or establish a "canonical" sort order (such as panel_variable time_variable, or chronological order as here in a repeated-measures data set).
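
        A minimal sketch of this practice, assuming the pig data used in #3, where id and week are expected to jointly identify observations (the variable name p_wt is only for illustration):
        Code:
        webuse pig, clear
        * confirm that id and week together uniquely identify observations
        isid id week
        * sort by id and, within id, by week; the mean is still taken over each id
        by id (week), sort: egen p_wt = mean(weight)
        * inspect the first few observations, now ordered by week within pig
        list id week weight p_wt in 1/9, sepby(id)
        Because the sort key is unique, rerunning this code should leave the observations in the same order every time.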

        I adopted this practice many years ago, and I have been wondering when somebody would raise this question. You are the first to have done so.


        • #5
          Thank you for the clarification, Clyde! That makes a lot of sense. It seems like a practice worth incorporating into my own code.


          • #6
            I appreciate this a lot, Clyde! I will try this out. By the way, thanks for the tip on -dataex-.
