Dear Stata community:
I am wondering whether there is a bug in the Stata 17 -codebook- command affecting what it reports about the %tc date-time format, or whether I am misunderstanding date-time formats.
Background: When I apply the %td format to a numeric variable, -codebook- tells me that the variable is a daily date, as shown with date2 in the example below. This makes sense to me: I have a numeric variable, and because I applied the %td format, Stata knows it is a daily date, and -codebook- tells me so.
. codebook date2 // Starting with a numeric variable.
------------------------------------------------------------------------------------------------------------
date2 (unlabeled)
------------------------------------------------------------------------------------------------------------
Type: Numeric (float)
Range: [22471,22986] Units: 1
Unique values: 49 Missing .: 0/76
Mean: 22900.8
Std. dev.: 103.416
Percentiles:       10%       25%       50%       75%       90%
                 22883     22894   22922.5   22953.5     22978
. format date2 %td // Applying daily date formatting so the variable is human-readable.
. codebook date2 // Output is as I expect--Yay!
------------------------------------------------------------------------------------------------------------
date2 (unlabeled)
------------------------------------------------------------------------------------------------------------
Type: Numeric daily date (float)
Range: [22471,22986] Units: 1
Or equivalently: [10jul2021,07dec2022] Units: days
Unique values: 49 Missing .: 0/76
Mean: 22900.8 = 12sep2022(+ 18 hours)
Std. dev.: 103.416
Percentiles:       10%       25%       50%       75%       90%
                 22883     22894   22922.5   22953.5     22978
             26aug2022 06sep2022 04oct2022 04nov2022 29nov2022
In contrast, when I apply %tc formatting to a numeric variable, -codebook- does NOT indicate that it is a date-time variable, even though the variable displays as a date-time, the way I want it to, in -browse- and -list-, as shown with date1 below. Why doesn't -codebook- produce output with a human-readable date1?
. codebook date1 // Starting with a numeric variable.
------------------------------------------------------------------------------------------------------------
date1 (unlabeled)
------------------------------------------------------------------------------------------------------------
Type: Numeric (float)
Range: [1.977e+12,1.987e+12] Units: 100000
Unique values: 65 Missing .: 10/76
Mean: 2.0e+12
Std. dev.: 3.0e+09
Percentiles:       10%       25%       50%       75%       90%
               2.0e+12   2.0e+12   2.0e+12   2.0e+12   2.0e+12
. format date1 %tc // Applying date-time formatting so the variable is human-readable.
. codebook date1 // Output is not as I expect: there is no indication that date1 is formatted as a date-time variable.
------------------------------------------------------------------------------------------------------------
date1 (unlabeled)
------------------------------------------------------------------------------------------------------------
Type: Numeric (float)
Range: [1.977e+12,1.987e+12] Units: 100000
Unique values: 65 Missing .: 10/76
Mean: 2.0e+12
Std. dev.: 3.0e+09
Percentiles:       10%       25%       50%       75%       90%
               2.0e+12   2.0e+12   2.0e+12   2.0e+12   2.0e+12
. list date1 // However this output IS as I expect! Yay! (But why doesn't the codebook reflect the %tc formatting?)
     +--------------------+
     |              date1 |
     |--------------------|
  1. | 29aug2022 10:28:25 |
  2. | 30aug2022 09:57:27 |
.
.
.
I would appreciate any insight into the differences in how the codebook displays %tc vs. %td variables!
Melissa