
  • How to get a vector of chars

    Hi
    I would like to get a vector of chars, i.e. strings of length 1.
    If I use char() as below, I get a single string scalar instead.
    Code:
    : f = jumble((1,1,2,2,3,4,4,4,5,5,5)')
    
    : char((f :+ 96)')
      acdeeebdabd
    The Mata function -tokens- might be a solution, but in that case I can't figure out what the separator should be.

    Looking forward to hearing from you
    Kind regards

    nhb

  • #2
    Niels --

    While I don't know whether it's efficient in your case, you could install and use the moremata function mm_pieces. For example:
    Code:
    : f = jumble((1,1,2,2,3,4,4,4,5,5,5)')
    
    : C=char((f:+96)')
    
    : chars=mm_pieces(C,1)
    Hope that helps!

    Matt Baker



    • #3
      Hi Matthew, sorry for not answering sooner.
      Thank you very much. I'll look into it.
      I had moremata installed already.
      I just didn't notice that one.
      Thank you very much.
      Kind regards

      nhb



      • #4
        Niels --

        Glad to help! It also occurred to me that if you had to do this with a matrix, you could write a little loop that breaks things up manually:
        Code:
        mata:
        strings="abcdefg" \ "bdfd"
        chars=J(rows(strings),0,"")
        maxlen=max(strlen(strings))
        
        for (i=1;i<=maxlen;i++) chars=chars,substr(strings,i,1)
        end
        This code works even when the individual strings are of different length and can handle a bunch of strings at once if speed is important.
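        A loop-free variant of the same idea, as a minimal sketch: it assumes that substr() accepts r-conformable arguments (a column of strings together with a row of start positions), as the Mata documentation for substr() describes. The name chars2 is only illustrative.

        ```mata
        mata:
        strings = "abcdefg" \ "bdfd"
        maxlen  = max(strlen(strings))

        // one row per string, one column per character position;
        // positions past the end of a shorter string come back as ""
        chars2 = substr(strings, 1..maxlen, 1)
        end
        ```

        With the two example strings this should give a 2 x 7 string matrix, matching what the loop builds column by column.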

        Best,

        Matt Baker





        • #5
          Hi Matthew
          Thank you very much for your elegant code sample.
          It is nice to see that substr() returns an empty string rather than raising an error when the start position (its second argument) is past the end of the string.
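
          A quick way to check that behavior, using only the built-in substr() and strlen(); the variable name s is illustrative:

          ```mata
          mata:
          // start position 5 is past the end of "abc", so s is the empty string
          s = substr("abc", 5, 1)
          strlen(s)
          end
          ```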


          I've been working in another direction, inspired by Python and functional programming.
          First I define a function that applies a given function to every value in a list, e.g. a vector:
          Code:
          function map(f, row)
          {
              if (cols(row) == 1){
                  return((*f)(row[1]))
              } else {
                  return((*f)(row[1]), map(&(*f), row[2::cols(row)]))
              }
          }
          Note that I use a function pointer here.
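
          The recursive version copies the tail of row at every level, so for long vectors a loop may be the safer choice. Below is a minimal sketch of the same idea without recursion. The names map_loop and square are hypothetical, and the sketch assumes that row is a non-empty row vector and that f returns a scalar.

          ```mata
          mata:
          function map_loop(f, row)
          {
              real scalar i
              transmorphic rowvector result

              // apply f to the first element to fix the element type of the result
              result = J(1, cols(row), (*f)(row[1]))
              for (i = 2; i <= cols(row); i++) result[i] = (*f)(row[i])
              return(result)
          }

          // small user-defined function to map over the vector
          function square(x) return(x^2)

          map_loop(&square(), (1, 2, 3))
          end
          ```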

          With map defined, I can do the following:
          Code:
          : f = jumble((1,1,2,2,3,4,4,4,5,5,5)')
          : function tochars(nbr) return(char(nbr))
          
          : strofreal(f), map(&tochars(), (f :+ 96)')'
                  1   2
               +---------+
             1 |  3   c  |
             2 |  5   e  |
             3 |  5   e  |
             4 |  2   b  |
             5 |  5   e  |
             6 |  1   a  |
             7 |  4   d  |
             8 |  4   d  |
             9 |  4   d  |
            10 |  1   a  |
            11 |  2   b  |
               +---------+
          Regrettably, I have to write my own wrapper around char, tochars, to get this to work. Otherwise:
          Code:
          : map(&char(), (f :+ 96)')
          char() built-in, may not evaluate address
          r(3000);
          Another function that I miss in Mata is reduce:
          Code:
          function reduce(f, row)
          {
              if (cols(row) == 1){
                  return(row[1])
              } else {
                  return((*f)(row[1], reduce(&(*f), row[2::cols(row)])))
              }
          }
          Usage could be like:
          Code:
          : function product(x, y) return(x*y)
          
          : reduce(&product(), f')
            96000
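
          Note that reduce as written folds from the right, which makes no difference for an associative operation such as multiplication. A loop-based left fold is sketched below under the same assumption of a non-empty row vector; reduce_loop and times are hypothetical names.

          ```mata
          mata:
          function reduce_loop(f, row)
          {
              real scalar i
              transmorphic scalar acc

              // start from the first element and fold the rest in from the left
              acc = row[1]
              for (i = 2; i <= cols(row); i++) acc = (*f)(acc, row[i])
              return(acc)
          }

          // two-argument function to fold with
          function times(x, y) return(x*y)

          reduce_loop(&times(), (1,1,2,2,3,4,4,4,5,5,5))
          end
          ```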
          I can't guarantee that the functions map and reduce are efficient, but it would be nice if they were built into Mata.
          Then they would be optimized and could probably use built-ins like char directly.
          I know that we can do a lot with the element-by-element operators such as :* and :==, but not nearly enough.

          So here I'm starting my wish list for Stata 15 or later:
          I wish for map and reduce to be built into Mata in some form!
          Maybe map could be integrated into the J function somehow?

          If I have missed some elegant way of doing what map and reduce do above, I apologize (and of course in that case I withdraw my wish).
          Kind regards

          nhb
