  • Issue with the stored `r(min)' value from a summarize command

    Hello Statalisters,
    I am having an issue using the stored `r(min)' value from a summarize command. I am looking at test results for schools, and I want to keep all schools whose test value is less than or equal to the minimum among the schools I have flagged. When I run the code, it does not keep the school that has the minimum value reported by summarize. What am I doing wrong?

    Here is a sample of my data
    Code:
    * Example generated by -dataex-. To install: ssc install dataex
    clear
    input long schl float(keep percent_on_gl2015Math)
    7846 .     0
    3849 . .0363
    6492 . .0388
    3847 . .0467
    7855 . .0533
    8148 1 .0588
    8308 . .0588
    8303 . .0625
    7852 . .0714
    5286 . .0714
    6172 . .0714
    7506 . .0769
    7922 . .0816
    5275 . .0909
    7245 . .0921
    3857 . .0956
    7293 . .1026
    3851 . .1038
    6675 .  .104
    7255 . .1041
    end

    Here is a simplified example of the issue I am running into.
    When I run the following sum command to get the minimum test value for all schools that have a 1 in the keep variable, I get:
    Code:
    sum percent_on_gl2015Math if keep ==1
    
        Variable |        Obs        Mean    Std. Dev.       Min        Max
    -------------+---------------------------------------------------------
    percen~5Math |         14    .3142571    .1791266      .0588      .6587
    I then run the following to try to keep all values that are less than or equal to .0588 (the minimum value in the above sum output):
    Code:
    . keep if percent_on_gl2015Math <= `r(min)'

    It does not keep the 2 observations that are equal to the minimum value:

    Code:
    * Example generated by -dataex-. To install: ssc install dataex
    clear
    input long schl float(keep percent_on_gl2015Math)
    7846 .     0
    3849 . .0363
    6492 . .0388
    3847 . .0467
    7855 . .0533
    end
    Thank you all for your help!

    I am using IC 15.1 for Windows 64 bit

  • #2
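    It's a precision problem. Your variable is stored as a float, while a literal such as 0.0588 is evaluated in double precision, and the two are not exactly equal. With your example data, compare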


    Code:
    . list if percent <= 0.0588 
    
         +------------------------+
         | schl   keep   percen~h |
         |------------------------|
      1. | 7846      .          0 |
      2. | 3849      .      .0363 |
      3. | 6492      .      .0388 |
      4. | 3847      .      .0467 |
      5. | 7855      .      .0533 |
         +------------------------+
    
    . 
    . list if percent <= float(0.0588)
    
         +------------------------+
         | schl   keep   percen~h |
         |------------------------|
      1. | 7846      .          0 |
      2. | 3849      .      .0363 |
      3. | 6492      .      .0388 |
      4. | 3847      .      .0467 |
      5. | 7855      .      .0533 |
         |------------------------|
      6. | 8148      1      .0588 |
      7. | 8308      .      .0588 |
         +------------------------+
    
    . di %21x 0.0588
    +1.e1b089a027525X-005
    
    . di %21x float(0.0588)
    +1.e1b08a0000000X-005
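
    The same effect can be reproduced without the full dataset. The following is a minimal sketch, not part of the original reply; the variable name x is hypothetical and stands in for percent_on_gl2015Math.

    ```stata
    * Sketch: a float variable compared with a double-precision literal.
    * x is a hypothetical stand-in for percent_on_gl2015Math.
    clear
    set obs 1
    generate float x = 0.0588      // value is rounded to float precision on storage

    * The literal 0.0588 is held as a double. The %21x output above shows
    * float(0.0588) is slightly larger, so this comparison finds no match.
    count if x <= 0.0588

    * Rounding the literal to float precision puts both sides on the same
    * footing, so the observation is counted.
    count if x <= float(0.0588)
    ```

    Wrapping the literal in float() works here because the variable itself is a float; if the variable were stored as a double, the plain literal would be the right comparison.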



    • #3
      I believe the code should be:
      Code:
      keep if percent_on_gl2015Math <= r(min)
      Red Owl
      Stata/IC 16.0 (Windows 10, 64-bit)



      • #4
        Thank you Nick!

        I have updated my relevant code to be
        Code:
        keep if percent_on_gl2015Math <= float(`r(min)')
        and it now works correctly.

        Best,
        Sam



        • #5
          I agree with Red Owl that in general it's better to use r(min) than its local macro persona. The difference is usually small but for non-integers it's not guaranteed to be zero.
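
          As a minimal sketch of that suggestion, using four rows from the example data in #1: the comparison below refers to r(min) directly rather than to the macro `r(min)', so no text conversion takes place. This illustrates the idea under those assumptions and is not a tested replacement for the original poster's full workflow.

          ```stata
          * Sketch: use the stored result r(min) directly, without macro quotes.
          clear
          input long schl float(keep percent_on_gl2015Math)
          7855 . .0533
          8148 1 .0588
          8308 . .0588
          8303 . .0625
          end

          * Minimum among the flagged schools; the result is left in r(min).
          summarize percent_on_gl2015Math if keep == 1

          * r(min) is evaluated as a number at full precision, so schools
          * tied at the minimum should be retained along with those below it.
          keep if percent_on_gl2015Math <= r(min)
          list
          ```

          Note that r(min) must be used before any other r-class command runs, since the next such command replaces the stored results.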



          • #6
            Thank you Red Owl as well.

            I will play with both options in future projects.

            Best,
            Sam



            • #7
              They aren't really different options. `r(min)' and friends are odd beasts even in Stata terms, but the broad truth is that local macros can hold less precision than scalars, so there is no reason to use local macros for calculation when scalars are available. The difference is usually trivial but it will bite in knife-edge decisions, which is where we came in.
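
              One way to see the difference is to display both forms in hexadecimal, as in #2. This is a minimal sketch; the variable x and the scalar name s_min are hypothetical, chosen only for illustration.

              ```stata
              * Sketch: compare a stored result, a scalar copy, and the macro form.
              clear
              set obs 1
              generate float x = 0.0588
              quietly summarize x

              * Copy the result into a scalar so it survives later r-class commands.
              scalar s_min = r(min)

              display %21x r(min)           // the stored result itself
              display %21x scalar(s_min)    // scalar copy, same bits as r(min)
              display %21x `r(min)'         // macro text re-read as a number

              * 1 if the macro round trip preserved the value, 0 if it did not.
              * The failed keep in #1 suggests 0 for this particular value.
              display r(min) == `r(min)'
              ```

              Using scalar(s_min) rather than the bare name avoids any clash with a variable of the same or an abbreviated name.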

