  • How to Use Stata Code to Count How Many Students Took an Exam More Than Once?

    I have a small dataset with just 2 variables: id (the student ID) and exam_times (how many times a student took the exam).
    I want to use Stata code to find how many students took the exam more than once.
    Thank you.

    clear
    input str10 id byte exam_times
    1 1
    1 2
    2 1
    3 1
    3 2
    3 3
    3 4
    4 1
    4 2
    4 3
    end

  • #2
    This looks like homework (coursework) to me; see the Stata Forum's extra advice #4.

    Using -help- you should have a look at -count-, -if-, and -missing-.

    The latter is important. For example, how many students would you expect to have taken an exam more than once if an additional 5 cases of your example have missing values of "exam_times"? To get a wrong answer to this expanded exercise, try this:
    Code:
    set obs 15
    count if exam_times > 1
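    A minimal sketch of why this count goes wrong, using the example data from the first post: Stata treats numeric missing values as larger than any number, so the condition exam_times > 1 is also true for the 5 added empty observations. Excluding them with -missing()- gives the number of observations (not yet students) with a nonmissing exam_times above 1.

    ```stata
    clear
    input str10 id byte exam_times
    1 1
    1 2
    2 1
    3 1
    3 2
    3 3
    3 4
    4 1
    4 2
    4 3
    end
    * add 5 observations; their exam_times is missing (.)
    set obs 15
    * missing counts as greater than 1, so this reports 11
    count if exam_times > 1
    * excluding missing values leaves the 6 nonmissing values above 1
    count if exam_times > 1 & !missing(exam_times)
    ```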

    • #3
      I don't think your code is correct. Thank you!

      • #4
        Why do you think the code is wrong?

        • #5
          There are just 4 persons in the dataset, but the result is 6 with your code.

          • #6
            There are 3 persons in the dataset who took the exam more than once (id = 1, 3, and 4).

            • #7
              It seems that you didn't read my answer carefully enough.

              Using your data and my code, the wrong answer you get is 11 (not 6). There are 10 cases in your dataset (not 4), and after running my code there are 15 cases. This is why you should have a look not only at -count- and -if- but also at -missing- (if you want Stata code that produces the correct answer under all circumstances).

              • #8
                My dataset is correct; this is its original state. I just want to count the number of students who took the exam more than once.

                • #9
                  I have a small dataset with just 2 variables: id (the student ID) and exam_times (the nth exam the student took).
                  I want to use Stata code to find how many students took the exam more than once.
                  Thank you.

                  clear
                  * exam_times: the nth exam the student took
                  input str10 id byte exam_times
                  1 1
                  1 2
                  2 1
                  3 1
                  3 2
                  3 3
                  3 4
                  4 1
                  4 2
                  4 3
                  end

                  Above is the updated note on the variable name.

                  • #10
                    If any student took more than 1 exam, then at some point they took a 2nd exam and

                    Code:
                    count if exam_times == 2
                    counts each such student once only. Isn't that how you could solve the problem without Stata?

                    • #11
                      Sorry, it was me who did not read your question carefully enough: I overlooked the student ID issue. A general solution (taking into account possible missing values of exam_times) could be:
                      Code:
                      bys id: egen max_exam = max(exam_times)
                      egen pick_id = tag(id)
                      count if pick_id & max_exam > 1

                      • #12
                        Sorry again (too hasty today): #11 is not general with respect to possible missing values. It should have been:
                        Code:
                        bys id: egen max_exam = max(exam_times)
                        egen pick_id = tag(id)
                        count if pick_id & max_exam > 1 & max_exam < .
                        But as Nick Cox cleverly pointed out, it is sufficient to count the cases where exam_times equals 2. My solution is useful only if you have messy data where students have missing values (e.g. with respect to the 1st and 2nd exam).

                        • #13
                          Originally posted by Nick Cox
                          If any student took more than 1 exam, then at some point they took a 2nd exam and

                          Code:
                          count if exam_times == 2
                          counts each such student once only. Isn't that how you could solve the problem without Stata?
                          Thank you. The real dataset has about 456,000 observations, which is why I asked this question here. In fact, I want to find how many students took the exam more than 3 times: all students took the exam at least once, and some of them took it more than 3 times, even 10 times or so. The real dataset has no missing values.
                          Last edited by smith Jason; 17 Apr 2023, 06:27.

                          • #14
                            Originally posted by Dirk Enzmann
                            Sorry, it was me who did not read your question carefully enough: I overlooked the student ID issue. A general solution (taking into account possible missing values of exam_times) could be:
                            Code:
                            bys id: egen max_exam = max(exam_times)
                            egen pick_id = tag(id)
                            count if pick_id & max_exam > 1
                            Thank you.

                            • #15
                              It’s the same problem with a different value in place of 2! Any student who sat the exam more than 3 times took it for the 4th time.
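
                              As a sketch of that point, using the example data from the first post: if exam_times numbers each student's attempts consecutively (1, 2, 3, ...) with no gaps or missing values, which is an assumption about the real data, then counting the 4th attempts counts each student with more than 3 attempts exactly once. The max_exam and pick_id lines reuse the approach from #12 with the threshold changed to 3, for data where that assumption may not hold.

                              ```stata
                              clear
                              input str10 id byte exam_times
                              1 1
                              1 2
                              2 1
                              3 1
                              3 2
                              3 3
                              3 4
                              4 1
                              4 2
                              4 3
                              end
                              * each student with more than 3 attempts has exactly one 4th attempt
                              count if exam_times == 4
                              * alternative: use each student's highest nonmissing attempt number
                              bysort id: egen max_exam = max(exam_times)
                              * pick_id is 1 for one observation per student and 0 otherwise
                              egen pick_id = tag(id)
                              count if pick_id & max_exam > 3 & max_exam < .
                              ```

                              On the example data both counts are 1, because only id 3 reached a 4th exam.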
