  • Inrange in string values

    Hi,

    I have a huge dataset with a string variable that has many empty values (which is OK), and that is why it is not meaningful to use dataex in this case. The string variable has values that start with different letters and digits. What I want is to count how many observations have a value of that string variable within the range "V01" to "V899", bearing in mind that some values have only two digits following the letter and some have three. For instance: V01, V010, V019, V020, ..., V89, V899. So it would be hectic to list the whole range. I tried to use inrange() with asterisks, like "V01*", "V89*", but it produces different results from inrange(stringvar,"V01","V899"). So can you please advise which command is best in this case, as it is nearly impossible to check whether the results match and whether all observations within the specified range are captured?

    Thanks in advance for any suggestion!
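A minimal sketch of why the two string calls in #1 can disagree, assuming made-up values (none are from the poster's data). With string arguments, inrange() compares in character sort order, and "*" is an ordinary character rather than a wildcard. The variable names code, in_plain, and in_star are hypothetical.

```stata
* Toy data: hypothetical values, chosen only to illustrate the comparison
clear
input str6 code
"V01"
"V010"
"V89"
"V899"
"V90"
"V1000"
end

* Plain string bounds: character-by-character comparison, not numeric
gen byte in_plain = inrange(code, "V01", "V899")

* Bounds with asterisks: "*" is compared as a literal character
gen byte in_star = inrange(code, "V01*", "V89*")

list, noobs
```

On this toy data, in_plain is 1 for every value except V90, so V1000 is counted while V90 is not. in_star is 0 for V01, V899, and V90: "V01" sorts before "V01*", and "V899" sorts after "V89*" because "9" sorts after "*". Whether either result is the intended one depends on how the codes are meant to be ordered.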

  • #2
    I think you want
    Code:
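    * Numeric value of everything after the first character (missing if that part is not a number)
    gen int numeric = real(substr(variable, 2, .))
    * Flag observations that start with "V" and whose numeric part is between 1 and 899
    gen byte wanted = (substr(variable, 1, 1) == "V") & inrange(numeric, 1, 899)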

    • #3
      Thanks Clyde, that is exactly what I want. But would this command also include values that start with V01 and V010, V02 and V020, V019, V029, V210, V219, etc.?

      • #4
        Yes, it would. As long as the first character is "V" and the rest of the string is a number between 1 and 899, it will be captured.

        Added: Wait, no! "Start with???" You didn't say that in #1. This will catch values of the variable that are, in their entirety, V01 through V899. But if you have something like V234X, no, that will not be picked up by this code. And it also won't pick up V8990 (but it would pick up V0899).
        Last edited by Clyde Schechter; 18 Jun 2023, 13:17.
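One way to check the cases described in #4 is to run the code from #2 on a few made-up values. This is only a sketch: the values below are hypothetical, and variable is the placeholder name used in #2.

```stata
* Toy data: hypothetical values mirroring the cases discussed in #4
clear
input str6 variable
"V01"
"V010"
"V899"
"V0899"
"V8990"
"V234X"
"E010"
end

* The same two commands as in #2
gen int numeric = real(substr(variable, 2, .))
gen byte wanted = (substr(variable, 1, 1) == "V") & inrange(numeric, 1, 899)

list, noobs

* Number of flagged observations
count if wanted
```

Here V01, V010, V899, and V0899 are flagged, so the count is 4. V8990 is out of range, V234X has a non-numeric remainder (so numeric is missing), and E010 does not start with "V".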

        • #5
          Yes, that is what I mean: V01 up to V899. The values have at most two or three digits after the initial letter. I was confused by inrange(numeric, 1, 899) in the command, and I was wondering about values with zeros, like V01, V010, V02, and V020. I thought it would only capture V01, V011, ..., V019.
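On the question about zeros, a short sketch of what real() does with the digit strings may help. The literals are illustrative only. The test is applied to the converted number, so leading and trailing zeros simply give different numbers in the 1 to 899 range.

```stata
* real() turns a numeric-looking string into a number; leading zeros drop out
display real("01")
display real("010")
display real("020")
display real("899")

* A remainder that is not a number becomes missing, and inrange() rejects missing
display real("234X")
display inrange(real("234X"), 1, 899)
```

The first four lines display 1, 10, 20, and 899, all of which pass inrange(numeric, 1, 899). The last two display . and 0.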
