I have a dataset that has been imported from Excel, I do not have the spreadsheet, and the variable Sex should be either M,F or missing, unfortunately some of the values included a trailing space. When I summarise the data I get these results using tab, fre(from SSC) and table:
My query is why does table does not recognise all the values of Sex?
Thank you,
Martyn
Stata 17 MP
Code:
* Example generated by -dataex-. For more info, type help dataex clear input str2 Sex int _freq "" 5 "F" 47 "M" 104 "M " 1 end
Code:
. expand _freq (153 observations created) . chartab Sex decimal hexadecimal character | frequency unique name ------------------------------------+---------------------------------------- 32 \u0020 | 1 SPACE 70 \u0046 F | 47 LATIN CAPITAL LETTER F 77 \u004d M | 105 LATIN CAPITAL LETTER M ------------------------------------+---------------------------------------- freq. count distinct ASCII characters = 153 3 Multibyte UTF-8 characters = 0 0 Unicode replacement character = 0 0 Total Unicode characters = 153 3 . tab Sex Sex | Freq. Percent Cum. ------------+----------------------------------- F | 47 30.92 30.92 M | 104 68.42 99.34 M | 1 0.66 100.00 ------------+----------------------------------- Total | 152 100.00 . fre Sex Sex -- Sex ----------------------------------------------------------- | Freq. Percent Valid Cum. --------------+-------------------------------------------- Valid F | 47 29.94 30.92 30.92 M | 104 66.24 68.42 99.34 M | 1 0.64 0.66 100.00 Total | 152 96.82 100.00 Missing | 5 3.18 Total | 157 100.00 ----------------------------------------------------------- . table Sex -------------------- | Frequency --------+----------- Sex | F | 47 M | 1 Total | 152 -------------------- .
Thank you,
Martyn
Stata 17 MP
Comment