  • adding leading zeros to string variable

    Hi,

    I want to merge two datasets on two variables, CIK and Year. However, in one dataset the CIK codes have leading zeros, while in the other they do not: for instance, 0000001750 versus 1750, or 0012480089 versus 12480089. I want to add leading zeros to the values that lack them. I assume this mismatch is the problem when I try to merge.

    When I try this command:
    Code:
    format CIK %010.0f
    I get the "Type mismatch" error

    I also tried these codes:
    Code:
    gen str10 z = string(CIK, "%10.0f")
    gen str10 z = string(CIK, "%10.0g")
    I just don't get which command to use. I need to get 10-digit numbers, and my variable is a string. The screenshot shows how my data looks now.

    Thanks in advance, if you need more information please let me know!

    Josephine

  • #2
    Code:
    gen z = string(real(CIK),"%010.0f")



    • #3
      If your variable is already string, then string() is rejected with a type mismatch, because it is intended to convert numeric values to strings, not to change existing strings. See

      Code:
      help string()
      In recent versions of Stata the preferred name is strofreal() but string() still works.
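
      For comparison, here is a minimal sketch of strofreal() applied to a numeric variable, which is the case it is meant for. The variable names cik_num and cik_str are made up for illustration and are not from the original data.

```stata
clear
* hypothetical numeric identifier, entered without leading zeros
input long cik_num
1750
12480089
end

* strofreal() needs a numeric argument; %010.0f pads with zeros to width 10
gen str10 cik_str = strofreal(cik_num, "%010.0f")

list
```

      Applied to a string variable, the same call stops with a type mismatch, which is what happened in #1.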

      Your problem is different: you want to pad your existing string, as in

      Code:
      clear
      input str8 wanting
      "1750"
      "12480089"
      end
      
      gen wanted1 = substr(10 * "0", 1, 10 - length(wanting)) + wanting
      
      gen wanted2 = string(real(wanting), "%010.0f")
      
      list
      
           +------------------------------------+
           |  wanting      wanted1      wanted2 |
           |------------------------------------|
        1. |     1750   0000001750   0000001750 |
        2. | 12480089   0012480089   0012480089 |
           +------------------------------------+



      In words:

      1. You can find out the length of the string, which lets you work out how many zeros to add as a prefix. 10 * "0" is a convenient way to write "0000000000". At the same time, you may know that you will never need more than 6 zeros or whatever.

      2. You can convert to real and then convert back insisting on a leading zero format. That can all be done in one line.


      Yet another solution, possibly easier than either, is to destring the string variable in the dataset that has leading zeros, whereupon the leading zeros disappear. The key then needs to be numeric in both datasets.
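
      A minimal sketch of the destring route, assuming the identifier is stored as a string made up only of digits. The variable name cik is made up for illustration.

```stata
clear
* hypothetical string identifier with leading zeros
input str10 cik
"0000001750"
"0012480089"
end

* convert to numeric in place; leading zeros have no numeric meaning and vanish
destring cik, replace

list
```

      Doing the same in the other dataset would leave the identifier numeric on both sides, which merge requires of a key variable.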

      EDIT: Crossed with #2, which gives one of these solutions, and one is all you need.
