adding leading zeros to string variable

Josephine Nicolai

Join Date: Jun 2021

Posts: 20
#1

01 Dec 2022, 03:13

Hi,

I want to merge two datasets on two variables, CIK and Year. However, in one dataset the CIK codes have leading zeros, while in the other dataset they do not: for instance, 0000001750 versus 1750, or 0012480089 versus 12480089. I want to add leading zeros to the values that lack them. I assume this mismatch is what goes wrong when I try to merge.

When I try this command:

Code:

format CIK %010.0f

I get the "Type mismatch" error.

I also tried these commands:

Code:

gen str10 z = string(CIK, "%10.0f")
gen str10 z = string(CIK, "%10.0g")

I just don't get which command to use. I need to get 10-digit numbers, and my variable is a string. The screenshot shows how my data looks now.

Thanks in advance, if you need more information please let me know!

Josephine

Øyvind Snilsberg

Join Date: Oct 2021

Posts: 591
#2

01 Dec 2022, 04:14

Code:

gen z = string(real(CIK),"%010.0f")
Nick Cox

Join Date: Mar 2014

Posts: 35698
#3

01 Dec 2022, 04:26

If your variable is already string, then the string() function is rejected as being intended to convert numeric values to string, not to change existing strings. See

Code:

help string()

In recent versions of Stata the preferred name is strofreal() but string() still works.

Your problem is different: you want to pad your existing string, as in

Code:

clear
input str8 wanting
"1750"
"12480089"
end

gen wanted1 = substr(10 * "0", 1, 10 - length(wanting)) + wanting
gen wanted2 = string(real(wanting), "%010.0f")

list

     +------------------------------------+
     |  wanting      wanted1      wanted2 |
     |------------------------------------|
  1. |     1750   0000001750   0000001750 |
  2. | 12480089   0012480089   0012480089 |
     +------------------------------------+

In words:

1. You can find out the length of the string, which lets you work out how many zeros to add as a prefix. 10 * "0" is a convenient way to write "0000000000". At the same time, you may know that you will never need more than 6 zeros or whatever.

2. You can convert to real and then convert back insisting on a leading zero format. That can all be done in one line.
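Applied to the variable in the original question, the two approaches might look like the sketch below. It assumes CIK is a string variable holding digits only, with no surrounding spaces; the names CIK10a and CIK10b are made up for illustration. The first line counts values that real() cannot convert, since those would come out as missing rather than as padded codes.

```stata
* Sketch only: assumes CIK is a string variable containing digits only.
* Count values that real() cannot convert; these would not pad correctly.
count if missing(real(CIK))

* Approach 1: prefix as many zeros as are needed to reach 10 characters.
* (CIK10a is a hypothetical variable name.)
gen CIK10a = substr(10 * "0", 1, 10 - length(CIK)) + CIK

* Approach 2: convert to numeric, then back to string with a leading-zero format.
* (CIK10b is a hypothetical variable name.)
gen CIK10b = string(real(CIK), "%010.0f")

* Checks: the two results should agree and be 10 characters long throughout.
assert CIK10a == CIK10b
assert length(CIK10a) == 10
```

If either assert fails, the offending observations are worth listing before merging, as they probably contain something other than plain digits.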

Yet another solution, possibly easier than either, is to destring the string variable, on which the leading zeros will disappear.
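A minimal sketch of the destring route follows. It assumes that CIK is converted in the same way in each dataset before merging, that CIK and Year jointly identify observations in both, and that the second dataset is saved under the made-up name other_dataset.

```stata
* Sketch only: run destring on CIK in each dataset before merging.
* destring leaves CIK unchanged if it finds nonnumeric characters.
destring CIK, replace

* Optional: display with leading zeros; the stored values are unaffected.
format CIK %010.0f

* "other_dataset" is a hypothetical file name; 1:1 assumes CIK and Year
* together uniquely identify observations in both datasets.
merge 1:1 CIK Year using other_dataset
```

Once CIK is numeric, the format command from the original question no longer fails, because a numeric display format is then being applied to a numeric variable.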

EDIT: Crossed with #2, which gives one of these solutions, and one is all you need.