Announcement

Collapse
No announcement yet.
X
  • Filter
  • Time
  • Show
Clear All
new posts

  • Stata Find Smallest Value Greater Than 0

    I have data such as

    Code:
    input student score
    1 1
    1 0
    1 2
    2 0
    2 6
    2 7
    3 9
    3 2
    3 0
    4 0
    4 0
    4 0
    and wish to make data such as:

    Code:
    input student score WANT
    1 1 1
    1 0 1
    1 2 1
    2 0 6
    2 6 6
    2 7 6
    3 9 2
    3 2 2
    3 0 2
    4 0 .
    4 0 .
    4 0 .
    where WANT equals to the SMALLEST value of 'score' greater than 0 for each 'student' and if all student scores equals to 0 then it sets to missing.

  • #2
    I extended your data example to include a few more cases, because you don't say what should happen if scores are missing (presumably they should be ignored).

    Code:
    input student score
    1 1
    1 0
    1 2
    2 0
    2 6
    2 7
    3 9
    3 2
    3 0
    4 0
    4 0
    4 0
    5 .
    5 0
    6 1
    6 2
    7 .
    8 0
    end
    
    // Start here
    bysort student: egen min_score = min(score) if inrange(score, 0, .)
    by student: egen max_score = max(score) if inrange(score, 0, .)
    by student: gen int want = min_score if min_score != max_score & max_score != 0
    
    list, sepby(student)
    And the listing output

    Code:
         +----------------------------------------------+
         | student   score   min_sc~e   max_sc~e   want |
         |----------------------------------------------|
      1. |       1       1          0          2      0 |
      2. |       1       0          0          2      0 |
      3. |       1       2          0          2      0 |
         |----------------------------------------------|
      4. |       2       0          0          7      0 |
      5. |       2       6          0          7      0 |
      6. |       2       7          0          7      0 |
         |----------------------------------------------|
      7. |       3       9          0          9      0 |
      8. |       3       2          0          9      0 |
      9. |       3       0          0          9      0 |
         |----------------------------------------------|
     10. |       4       0          0          0      . |
     11. |       4       0          0          0      . |
     12. |       4       0          0          0      . |
         |----------------------------------------------|
     13. |       5       .          .          .      . |
     14. |       5       0          0          0      . |
         |----------------------------------------------|
     15. |       6       1          1          2      1 |
     16. |       6       2          1          2      1 |
         |----------------------------------------------|
     17. |       7       .          .          .      . |
         |----------------------------------------------|
     18. |       8       0          0          0      . |
         +----------------------------------------------+

    Comment


    • #3
      Leonardo Guizzetti Thank you but that is not quite what I am looking for if you look at the 'WANT' column in my example.

      Comment


      • #4
        Quite right, my mistake. In that case, this is modified by one extra pass of egen for the min and max score.

        Code:
        bysort student: egen min_score = min(score) if score > 0
        bysort student: egen min_score2 = min(min_score)
        by student: egen max_score = max(score) if score > 0
        by student: egen max_score2 = max(score)
        
        by student: gen int want = min_score2 if min_score2 != max_score2 & max_score2 != 0
        drop min_score* max_score*
        which now shows the results you are after.

        Code:
             +------------------------+
             | student   score   want |
             |------------------------|
          1. |       1       1      1 |
          2. |       1       0      1 |
          3. |       1       2      1 |
             |------------------------|
          4. |       2       0      6 |
          5. |       2       6      6 |
          6. |       2       7      6 |
             |------------------------|
          7. |       3       9      2 |
          8. |       3       2      2 |
          9. |       3       0      2 |
             |------------------------|
         10. |       4       0      . |
         11. |       4       0      . |
         12. |       4       0      . |
             |------------------------|
         13. |       5       .      . |
         14. |       5       0      . |
             |------------------------|
         15. |       6       1      1 |
         16. |       6       2      1 |
             |------------------------|
         17. |       7       .      . |
             |------------------------|
         18. |       8       0      . |
             +------------------------+

        Comment


        • #5
          Here is another way to do it:


          Code:
          clear 
          input student score WANT
          1 1 1
          1 0 1
          1 2 1
          2 0 6
          2 6 6
          2 7 6
          3 9 2
          3 2 2
          3 0 2
          4 0 .
          4 0 .
          4 0 .
          end 
          
          egen wanted = min(cond(score > 0, score, .)), by(student)
          
          l, sepby(student)
          
               +---------------------------------+
               | student   score   WANT   wanted |
               |---------------------------------|
            1. |       1       1      1        1 |
            2. |       1       0      1        1 |
            3. |       1       2      1        1 |
               |---------------------------------|
            4. |       2       0      6        6 |
            5. |       2       6      6        6 |
            6. |       2       7      6        6 |
               |---------------------------------|
            7. |       3       9      2        2 |
            8. |       3       2      2        2 |
            9. |       3       0      2        2 |
               |---------------------------------|
           10. |       4       0      .        . |
           11. |       4       0      .        . |
           12. |       4       0      .        . |
               +---------------------------------+
          
          .

          For an attempt to embed this within a more systematic discussion, see Section 9 of https://www.stata-journal.com/articl...article=dm0055

          Comment

          Working...
          X