

  • Generate variable using difference from the mean of others in same category


    I'm fairly new to Stata, and for a project I'm trying to do something a bit more complex than what I know how to do.

    In this case, I have two relevant variables: income (int), and country (string).

    I want to generate a new variable: the difference of the variable 'income', from the mean of all other observations with the same value for 'country'. In other words, difference in income from one's country's mean income.

    I would like to post what I've tried so far, but in reality I'm sorta lost.

    Thanks if you can help.

  • #2
    bysort country: egen mean = mean(income)
    gen wanted = mean - income
    Edited to correct an error pointed out in #3. Note that mean(income) in #3 should be mean_income.
    Last edited by Ali Atia; 18 Apr 2022, 14:23.


    • #3
      by country, sort: egen mean_income = mean(income)
      gen diff_from_mean = income - mean(income)
      In the future, when asking for help with code, it is a good idea to show example data. While your description proved adequate for the present question, in general data descriptions do not provide all the information needed for writing code. So use the -dataex- command to create example data and post it here. If you are running version 17, 16 or a fully updated version 15.1 or 14.2, -dataex- is already part of your official Stata installation. If not, run -ssc install dataex- to get it. Either way, run -help dataex- to read the simple instructions for using it. -dataex- will save you time; it is easier and quicker than typing out tables. It includes complete information about aspects of the data that are often critical to answering your question but cannot be seen from tabular displays or screenshots. It also makes it possible for those who want to help you to create a faithful representation of your example to try out their code, which in turn makes it more likely that their answer will actually work in your data.

      When asking for help with code, always show example data. When showing example data, always use -dataex-.
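As a rough illustration of that advice for this thread, the call might look like the sketch below. It assumes the dataset in memory contains the two variables named in the question, `country` and `income`; the `in 1/20` range is an arbitrary choice to keep the posted example short.

```stata
* Sketch only: assumes variables country and income exist in the data in memory.
* Produces a copy-and-paste block describing the first 20 observations.
dataex country income in 1/20
```

The output can be pasted into a post, and others can run it to recreate the same example data.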

      Added: Crossed with #2, which starts out the same as my solution. But I believe the second line in #2 is incorrect and will produce an error message pointing out that no variable named difference can be found.


      • #4
        Originally posted by Ali Atia
        bysort country: egen mean = mean(income)
        gen wanted = mean - difference
        Thank you very much, this is exactly what I needed!


        • #5
          For completeness, note the typo in #3 from Clyde Schechter

           gen diff_from_mean = income - mean(income)
          should be

          gen diff_from_mean = income - mean_income
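Putting the corrected lines from #3 and #5 together, a self-contained sketch follows. The five observations are made up purely for illustration and are not from the original poster's data. The last part is an optional variant for the literal reading of "the mean of all other observations" in #1, which excludes the observation itself; the names `sum_income`, `n_income`, `mean_others` and `diff_from_others` are hypothetical.

```stata
clear
* Made-up toy data, for illustration only
input str10 country income
"A" 10
"A" 20
"A" 30
"B" 5
"B" 15
end

* Country-level mean of income, repeated on every row of that country
bysort country: egen mean_income = mean(income)

* Difference of each observation's income from its country mean
gen diff_from_mean = income - mean_income

* Optional variant: mean of the OTHER observations in the same country
bysort country: egen sum_income = total(income)
bysort country: egen n_income = count(income)
gen mean_others = (sum_income - income) / (n_income - 1) if n_income > 1
gen diff_from_others = income - mean_others

list, sepby(country)
```

In the variant, a country with only one non-missing income has no "other" observations, so `mean_others` is left missing there rather than dividing by zero.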

