
  • #16
    Niels: Some confusion here. My comment in #13 refers to program varlast. It is entirely relevant to that program if used by itself. Gerald's post in #12 was not what I was replying to. It wasn't visible to me when I posted my comment.

    Anyone: Note further that varlast's criterion for non-missing is "not equal to system missing (.)". It therefore counts the extended missing values .a to .z as non-missing.



    • #17
      Sorry, Nick. I misunderstood. With regard to
      Code:
       st_numscalar("r(last)", max(select((1..rows(st_data(.,.)))', st_data(., varname) :!= .)))
      in varlast, you are of course correct. But I think it would be fairer to mention that changing != to < solves that, i.e.
      Code:
       st_numscalar("r(last)", max(select((1..rows(st_data(.,.)))', st_data(., varname) :< .)))
      This can also be seen by:
      Code:
      . mata
      : x = 1,2,.,.a, .g
      
      : x :< .
             1   2   3   4   5
          +---------------------+
        1 |  1   1   0   0   0  |
          +---------------------+
      Kind regards

      nhb



      • #18
        Niels: You are of course right that using < rather than != gets you non-missings in Stata's usual sense, but I have no idea whether Gerald was deliberate in coding that way. I just pointed out how the program behaves and (perhaps wrongly) took it that the change to get other behaviour was evident.
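
        A minimal Stata-side sketch of the same distinction, using the values from the Mata example in #17. The variable name x is only illustrative and is not part of varlast.

        ```stata
        clear
        input float x
        1
        2
        .
        .a
        .g
        end
        count if x != .         // 4: the extended missings .a and .g pass this test
        count if x < .          // 2: only the genuinely non-missing values 1 and 2
        count if !missing(x)    // 2: same result as x < .
        ```

        All missing values sort above every number, with . lowest among them, so x < . excludes every kind of missing while x != . excludes only system missing.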



        • #19
          Thanks guys, I incorporated that change. Thank you for the explanation. I am still a long way from being any good at Mata, so this was much appreciated.



          • #20
            Just a quick question on this: how do we keep the last non-missing maximum row by group?

            Code:
            * Example generated by -dataex-. To install: ssc install dataex
            clear
            input float(group value)
            1   10
            1   20
            2   10
            2   30
            3   10
            3   30
            3    .
            4   10
            5 49.5
            5   50
            end
            Edit: Just to add context. I would like to keep the row that contains the highest value for each group. As there is a missing value I would like to ignore that and keep the last non-missing row in that group. Hope this helps
            Last edited by Fahad Mirza; 22 Aug 2021, 12:39.



            • #21
              It sounds like you want:
              Code:
              egen mvalue = max(value), by(group) // does not treat missing as max
              keep if value == mvalue // keep only rows that attain the group maximum
              gen tiebreaker = runiform() // random order among tied rows
              bysort group (tiebreaker): keep if _n == 1



              • #22
                #20 is a non-Mata question. See https://www.stata.com/support/faqs/d...t-occurrences/ in any case. In addition, as you don't want to keep missings, you can simplify things by dropping them directly and then running

                Code:
                drop if missing(value)
                bysort group (value) : keep if _n == _N
                As Mike Lacy hints, what happens with ties for maximum may need some thought.

                Code:
                bysort group (value) : keep if value == value[_N]
                keeps the ties for maximum.
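
                If exactly one row per group is wanted and ties for the maximum should be broken by original position, one possible sketch is to record the original order first. The helper variable obsno is hypothetical and introduced only for illustration.

                ```stata
                gen long obsno = _n                 // hypothetical helper: remembers the original row order
                drop if missing(value)              // missings can never be the row we keep
                bysort group (value obsno) : keep if _n == _N   // highest value; among ties, the latest row
                drop obsno
                ```

                Sorting on value and then obsno within group makes the choice among tied rows reproducible, unlike a sort on value alone, where the order of tied rows is not guaranteed.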
