
  • How to create code for averaging marks with missing information in the data?

    Hi, I have the following data. I show a sample for only one student, but the full data set includes 100+ students.

    For some subjects, I do not know which semester they belong to within a given academic year, so I cannot record semester-wise marks for them.

    I want Stata to take the average of the subjects whose semester number is unknown and put it in the missing values of the semester-wise marks. For example, in the following case, I want the average of the English and Science marks for the academic year 2010-11, which is 52.5, inserted into the missing cells of the semester-wise marks column. Similarly, for 2011-12 I want the average of the subjects whose semester number is unknown inserted into that year's missing cells. Can anyone suggest code for this? I need to do it for the whole data set of 100+ students.

    Student number  Year     Subject name  Semester number       Marks  Semesterwise marks
    1               2010-11  Maths         Semester 1            60     60
    1               2010-11  English       Don't know (missing)  50     .
    1               2010-11  Science       Don't Know            55     .
    1               2010-11  History       Semester 2            62     62
    1               2011-12  Geography     Don't know            63     .
    1               2011-12  Maths-II      Semester 1            56     56
    1               2011-12  Eng-ll        Don't know            60     .
    1               2011-12  Sci-II        Semester 2            57     57

  • #2
    Code:
    * Example generated by -dataex-. For more info, type help dataex
    clear
    input byte studentnumber str7 year str9 subjectname str20 semesternumber byte(marks semesterwisemarks)
    1 "2010-11" "Maths"     "Semester 1"           60 60
    1 "2010-11" "English"   "Don't know (missing)" 50  .
    1 "2010-11" "Science"   "Don't Know"           55  .
    1 "2010-11" "History"   "Semester 2"           62 62
    1 "2011-12" "Geography" "Don't know"           63  .
    1 "2011-12" "Maths-II"  "Semester 1"           56 56
    1 "2011-12" "Eng-ll"    "Don't know"           60  .
    1 "2011-12" "Sci-II"    "Semester 2"           57 57
    end
    
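    * Within each student-year, average the marks of rows whose semesterwise mark is missing
    bys studentnumber year: egen mean = mean(cond(missing(semesterwisemarks), marks, .))
    * Fill the missing semesterwise marks with that average
    replace semesterwisemarks = mean if missing(semesterwisemarks)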

    Result:

    Code:
    . l, sepby( studentnumber year )
    
         +---------------------------------------------------------------------------------+
         | studen~r      year   subject~e         semesternumber   marks   semest~s   mean |
         |---------------------------------------------------------------------------------|
      1. |        1   2010-11       Maths             Semester 1      60         60   52.5 |
      2. |        1   2010-11     English   Don't know (missing)      50       52.5   52.5 |
      3. |        1   2010-11     Science             Don't Know      55       52.5   52.5 |
      4. |        1   2010-11     History             Semester 2      62         62   52.5 |
         |---------------------------------------------------------------------------------|
      5. |        1   2011-12   Geography             Don't know      63       61.5   61.5 |
      6. |        1   2011-12    Maths-II             Semester 1      56         56   61.5 |
      7. |        1   2011-12      Eng-ll             Don't know      60       61.5   61.5 |
      8. |        1   2011-12      Sci-II             Semester 2      57         57   61.5 |
         +---------------------------------------------------------------------------------+
    
    .
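
The code above identifies unknown-semester rows by a missing semesterwisemarks value. If the full data might also have semesterwisemarks missing for other reasons, one possible variant is to flag the rows from the text of semesternumber instead. This is only a sketch: the variable names unknown and mean_unknown are made up for illustration, and it assumes every unknown-semester row contains the phrase "don't know" in some capitalization, as in the sample.

```stata
* Sketch only: unknown and mean_unknown are illustrative names, not from the thread
clear
input byte studentnumber str7 year str9 subjectname str20 semesternumber byte(marks semesterwisemarks)
1 "2010-11" "Maths"     "Semester 1"           60 60
1 "2010-11" "English"   "Don't know (missing)" 50  .
1 "2010-11" "Science"   "Don't Know"           55  .
1 "2010-11" "History"   "Semester 2"           62 62
1 "2011-12" "Geography" "Don't know"           63  .
1 "2011-12" "Maths-II"  "Semester 1"           56 56
1 "2011-12" "Eng-ll"    "Don't know"           60  .
1 "2011-12" "Sci-II"    "Semester 2"           57 57
end

* Flag unknown-semester rows from the text, ignoring case
* (assumption: such rows always contain the phrase "don't know")
gen byte unknown = strpos(lower(semesternumber), "don't know") > 0

* Average the marks of the flagged rows within each student-year
bysort studentnumber year: egen mean_unknown = mean(cond(unknown, marks, .))

* Fill only flagged rows that are still missing; known semesters are left alone
replace semesterwisemarks = mean_unknown if unknown & missing(semesterwisemarks)
```

On the sample data this should give the same filled values as the code above (52.5 for 2010-11 and 61.5 for 2011-12). The difference only matters if some row has a known semester but a missing semesterwisemarks value.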



    • #3
      Thank you
