  • Replacing the variable's missing observations with the existing mean of that variable

    Hello Everyone,
    I have data on income and I would like to replace the variable's missing observations with the mean of its nonmissing values. The command that I am using now takes three lines. I would greatly appreciate it if you could advise me on whether I could write that command more efficiently, ideally in a single line. Thank you very much in advance!

    The command that I have now is the following:

    Code:
    egen income_mean = mean(income_recoded)
    replace income_recoded = income_mean if income_recoded == .
    drop income_mean
    And below please find that variable:

    Code:
    * Example generated by -dataex-. To install: ssc install dataex
    clear
    input float income_recoded
     .75
    2.75
    2.25
    2.25
    2.75
    2.75
    2.25
     .25
    4.75
    1.75
    4.25
    3.25
    1.75
    6.25
    6.25
     .75
    3.25
    2.25
    2.25
    2.25
    3.75
    4.75
    3.75
    3.75
    3.25
    1.75
    4.25
    3.25
    1.25
    1.75
    3.25
    3.75
    3.25
    3.25
    1.25
    1.75
    2.25
    4.75
    2.75
    2.75
    4.25
    3.75
    3.75
    3.75
    2.25
    3.25
    3.75
    4.75
    1.25
    6.25
    6.25
    3.25
    6.25
    8.75
     .75
    2.25
    4.25
    3.75
    4.75
    4.25
    6.25
    3.75
    1.75
    1.75
    2.75
    4.25
    4.75
    3.75
    1.75
    2.25
    3.75
    2.25
    3.75
    4.25
    1.25
    3.25
    2.75
    2.75
    3.75
    2.25
    6.25
       .
       .
    3.75
    1.75
    1.25
    1.75
     .75
    3.25
    2.25
    2.75
    6.25
       .
    2.75
    2.75
    2.75
    6.25
       .
    1.75
    2.25
    end
    Last edited by Nick Baradar; 08 Jun 2022, 05:16.

  • #2
    The same effect is achieved with

    Code:
    su income_recoded, meanonly 
    replace income_recoded = r(mean) if income_recoded == .
    without bothering with a new variable. The bigger issue is that mean imputation would be widely reckoned to be the simplest and also the most problematic imputation method imaginable, but I guess you know that. Note also that mean income is an unstable summary.
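
    A minimal sketch of the same replacement that also records which observations were filled in, so that the imputed values can be identified or excluded later. The variable name income_imputed is hypothetical and does not appear in the original posts. missing() is used in place of == . so that extended missing values (.a to .z) would be caught as well.

    ```stata
    * Flag the observations that are about to be filled in
    * (income_imputed is a hypothetical name chosen for illustration)
    generate byte income_imputed = missing(income_recoded)

    * Mean of the nonmissing values; meanonly suppresses the output and leaves r(mean)
    summarize income_recoded, meanonly

    * Replace only the flagged observations with that mean
    replace income_recoded = r(mean) if income_imputed == 1
    ```

    With the flag in place, any later analysis can be rerun with the qualifier if income_imputed == 0 to check how much the results depend on the imputed values.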