  • Second minimum - Function

    Dear users,

    Let me get straight to the point with what may be a silly question.

    Is there a function to get the 2nd, 3rd, 4th, ... minimum value of a variable?

    I know about
    Code:
    egen wanted=min(variable)
    But that gives only the first minimum, and the other Stata functions related to -min- seem to be no help.

    Would you be so kind as to give some advice on that?

    Thanks

  • #2
    The smallest four and the largest four are reported in:
    Code:
    summarize A_VAR, detail
    Another way is to generate a rank for it using -egen rank()-:
    Code:
    sysuse auto, clear
    egen order = rank(mpg), unique
    sort order
    rank() has several options that decide how ties are treated; the example above uses -unique-, which breaks ties arbitrarily. Use -help egen- to learn more.
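
    As a rough illustration of how the tie-handling options of rank() differ, here is a minimal sketch on a made-up four-value variable (the data and the variable names are assumptions for illustration only; run it from a do-file so that the trailing comments are accepted):
    Code:
    clear
    input x
    1
    2
    2
    3
    end
    egen r_default = rank(x)          // ties share the average rank: 1, 2.5, 2.5, 4
    egen r_track   = rank(x), track   // 1 + number of lower values: 1, 2, 2, 4
    egen r_field   = rank(x), field   // 1 + number of higher values: 4, 2, 2, 1
    egen r_unique  = rank(x), unique  // ranks 1 to 4, ties broken arbitrarily
    list, clean
    With -unique-, the observation ranked k is the k'th smallest with ties counted separately; with -track-, tied values share a rank, so some ranks may not occur at all.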


    • #3
      There is no built-in function to do this. In any case, your question is ambiguous. Suppose the data are: 1, 2, 2, 3, 3, 3, 4. Is the fourth minimum value 3 (the fourth value once the data are sorted), or is it 4 (the fourth distinct value in the data)? If you want the former, just sort the data and use the i'th observation of the variable. If you want the latter, it's a bit more complicated:

      Code:
      clear*
      
      //  CREATE A TOY DATA SET TO ILLUSTRATE THE CODE
      set obs 100
      set seed 1234
      gen x = rnormal(0, 1)
      //  BUILD IN SOME TIED VALUES
      replace x = x[_n-1] if _n > 1 & runiform() < 0.2
      
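      //  NUMBER THE DISTINCT VALUES OF x IN ASCENDING ORDER
      sort x
      gen  vgroup = sum(x != x[_n-1])
      //  KEEP THE GROUP NUMBER ON ONLY ONE OBSERVATION PER DISTINCT VALUE
      by vgroup, sort: replace vgroup = . if _n > 1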
      
      forvalues i = 1/5 {
          summ x if vgroup == `i', meanonly
          display as text "`i'th minimum value = " `r(mean)'
      }
      Added: Crossed with #2.
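
      For the first interpretation above (ties counted separately), here is a minimal sketch using the auto data shipped with Stata; the choice of mpg and of k = 4 is only for illustration:
      Code:
      sysuse auto, clear
      sort mpg
      //  AFTER SORTING, OBSERVATION k HOLDS THE k'TH SMALLEST VALUE (MISSING VALUES SORT LAST)
      local k = 4
      display as text "Smallest value number `k' = " mpg[`k']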


      • #4
        And another way is just to sort your data:

        Code:
        . sysuse auto, clear
        (1978 Automobile Data)
        
        . sort rep price
        
        . by rep: gen smallest = price[1]
        
        . by rep: gen secondsmallest = price[2]
        etc. Again there is the question of what to do when you have ties, and when you have, say, only 2 observations in a group but want the third smallest (price[3] is then missing).


        • #5
          Dear Ken Chui, Clyde Schechter and Joro Kolev,

          Thanks for looking into this issue. I really appreciate it.

          Yes, the question was ambiguous and not totally clear. My variable of interest is the daily number of transactions, with the security as the unit of analysis.
          I was able to pull out the minimum for each security, but I struggled with the second minimum.

          The second minimum value has to be the second distinct minimum value (no matter the order).

          Joro Kolev, I'm wondering whether your solution still works in my setting, or whether it is better to use Clyde Schechter's solution (I am struggling to follow the flow of the code right away).

          The final result should be a variable named, let's say, "secondminimum", which holds, for each security and each month, the second minimum of the daily number of transactions.

          Many thanks for your kindness and patience
          Last edited by Marco Errico; 27 Apr 2021, 09:03.


          • #6
            No, what I showed does not work for what you want. If you cannot adapt Clyde's solution, you might try my command -levelstovar-.

            First, put the levelstovar.ado file attached to this post somewhere on your ado path. On my computer, for example, the file is in
            c:\ado\personal\l\levelstovar.ado

            Then you do something like this:

            Code:
            bysort security (transactions): levelstovar leveltransact = transactions
            by security: gen secondlow = leveltransact[2]
            For example with the auto data:

            Code:
            . sysuse auto, clear
            (1978 Automobile Data)
            
            . keep rep price
            
            . bysort rep (price): levelstovar levelsprice = price
            
            . by rep: gen secondlow = levelsprice[2]
            
            . list in 1/20, sepby(rep)
            
                 +--------------------------------------+
                 | rep78   levels~e    price   second~w |
                 |--------------------------------------|
              1. |     1       4195    4,195       4934 |
              2. |     1       4934    4,934       4934 |
                 |--------------------------------------|
              3. |     2       3667    3,667       4010 |
              4. |     2       4010    4,010       4010 |
              5. |     2       4060    4,060       4010 |
              6. |     2       4172    4,172       4010 |
              7. |     2       5104    5,104       4010 |
              8. |     2       5886    5,886       4010 |
              9. |     2       6342    6,342       4010 |
             10. |     2      14500   14,500       4010 |
                 |--------------------------------------|
             11. |     3       3291    3,291       3299 |
             12. |     3       3299    3,299       3299 |
             13. |     3       3895    3,895       3299 |
             14. |     3       3955    3,955       3299 |
             15. |     3       4082    4,082       3299 |
             16. |     3       4099    4,099       3299 |
             17. |     3       4181    4,181       3299 |
             18. |     3       4187    4,187       3299 |
             19. |     3       4296    4,296       3299 |
             20. |     3       4482    4,482       3299 |
                 +--------------------------------------+
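
            If installing a user-written command is not an option, the same result can be sketched with built-in commands only. This is a minimal sketch, not part of -levelstovar-; the variable names newvalue, level and secondlow are made up for illustration:
            Code:
            sysuse auto, clear
            //  WITHIN EACH GROUP, FLAG THE FIRST OCCURRENCE OF EACH DISTINCT PRICE
            bysort rep78 (price): gen byte newvalue = price != price[_n-1]
            //  RUNNING COUNT OF DISTINCT VALUES: 1 = LOWEST, 2 = SECOND LOWEST, ...
            by rep78: gen level = sum(newvalue)
            //  SPREAD THE LEVEL-2 VALUE TO EVERY OBSERVATION IN THE GROUP
            by rep78: egen secondlow = min(cond(level == 2, price, .))
            sort rep78 price
            list rep78 price secondlow in 1/20, sepby(rep78)
            On the auto data this should agree with the secondlow column listed above. Groups with only one distinct value get a missing secondlow, and because missing values sort last they cannot displace a nonmissing second minimum. For the security data the prefix would become something like -bysort security month (transactions):-, assuming a variable named month identifies the month.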

            Attached Files
            Last edited by Joro Kolev; 27 Apr 2021, 10:01.


            • #7
              Thanks, Joro Kolev, for your code. It was really helpful, and I have implemented it in my analysis.
              I couldn't find any other solution.

              Really kind of you!
