
  • Problem with truncating long string var, using real(substr(var,.,.))

    Hi everyone,

    I am trying to truncate an identifier variable in my data (string variable). The identifier is made of 16 digits.
    For example: 0104038003105701
    I want to keep only the first 14 digits, i.e., I would like to transform the above identifier into 01040380031057

    I have used real(substr()) many times, but for some reason Stata here returns a variable consisting entirely of missing values.
    I am running:

    gen int hhid2 = real(substr(hhid,1,14))

    where hhid is my original identifier.

    It works with shorter substrings, for example: gen int hhid2 = real(substr(hhid,1,3)) would work.

    Another problem is that for identifiers starting with 0, as in the example above, the leading 0 gets omitted. So for the identifier 0104038003105701, real(substr(hhid,1,3)) would only return 10, and not 010 as I want.

    Many thanks for your help,

    Regards

    Basile

  • #2
    you are trying to generate an "int" variable and that is too small for 14 digits; the max of an "int" is 32,740; see "h int"

    also, why not make a string variable and leave it as such?
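
    A minimal sketch of the storage-type issue, assuming a toy dataset holding only the example identifier from #1 (the names hhid_int, hhid_dbl and hhid_str are made up for illustration). It contrasts an int, a double, and a string result for the same 14-character prefix:

```stata
* Toy data: the example identifier from #1, stored as a string
clear
input str16 hhid
"0104038003105701"
end

* int tops out at 32,740, so a 14-digit value is out of range and becomes missing
gen int hhid_int = real(substr(hhid, 1, 14))

* double stores integers exactly up to 2^53 (about 9.0e15), so 14 digits fit,
* but the leading zero is not part of the numeric value
gen double hhid_dbl = real(substr(hhid, 1, 14))
format hhid_dbl %14.0f

* a string result keeps all 14 characters, including the leading zero
gen str14 hhid_str = substr(hhid, 1, 14)

list
```

    Only the string version preserves the leading 0, which is one reason to leave an identifier as a string.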


    • #3
      Thanks for your quick reply.
      I'm not sure what you mean by "why not make a string variable and leave it as such?"
      I am doing panel analysis, and in the second wave of data the identifiers are 16 digits long, while they are 14 digits long in the first wave. The information contained in the extra 2 digits in wave 2 is of no interest to me, so I'd like to truncate them so that I can match identifiers across waves.
      Thanks for your help


      • #4
        the implication of your subject is that they are already string variables; thus,
        Code:
        gen hhid2=substr(hhid,1,14)
        should do the trick (based on what you say in #1)

        since this is an identifier, there appears no reason for it to be numeric
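
        A rough sketch of how the truncated string could then be used to match the two waves as described in #3. Everything here is an assumption for illustration: the toy data, the variable names hhid14, w1_var and w2_var, and the guess that several wave-2 records may share one 14-character prefix (hence m:1 rather than 1:1):

```stata
* Wave 1 (hypothetical toy data): 14-character string identifiers
clear
input str14 hhid14 byte w1_var
"01040380031057" 1
end
tempfile wave1
save `wave1'

* Wave 2 (hypothetical toy data): 16-character string identifiers
clear
input str16 hhid byte w2_var
"0104038003105701" 2
end

* keep the first 14 characters as a string, so leading zeros survive
gen str14 hhid14 = substr(hhid, 1, 14)

* m:1 assumes hhid14 is unique in wave 1 but possibly repeated in wave 2
merge m:1 hhid14 using `wave1'
tab _merge
```

        If the truncated identifier is also unique in wave 2, merge 1:1 would be the stricter choice.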

        also, you don't respond at all to my first comment


        • #5
          Hi Rich,
          many thanks for your help, it does work with gen hhid2=substr(hhid, 1, 14)
          Sorry I did not respond to the first comment, but I was still trying to figure out the meaning of "h int", as I have never come across this notation before. What exactly does it refer to?
          Thanks again for the help, greatly appreciated

          Basile


          • #6
            "h" is short for help, so "h int" means show me the help file for int


            • #7
              Basile: The request from February http://www.statalist.org/forums/foru...ticular-values still stands.
