Hello,
I am creating a new variable from continuous distance data. I want to categorise the data into quartiles (4 quantile groups).
(variable name: dist_bt_fi, new variable name: dist_bt_fi_4)
My variable has 133 zeros and ranges from 0 to 9300, with most values in the 1000s.
When I run the following:
sort dist_bt_fi
xtile dist_bt_fi_4 = dist_bt_fi, nq(4)
tab dist_bt_fi_4
tabstat dist_bt_fi, stat(n mean min max sd p50) by(dist_bt_fi_4)
the minimum and maximum values reported for each quantile are wrong by a long shot. The largest maximum reported is 54, although my data range up to 9300:
dist_bt_fi_4 |       min       max
-------------+--------------------
           1 |         1         2
           2 |         3         3
           3 |         4        23
           4 |        24        54
-------------+--------------------
I cannot tell where these numbers are coming from. I've heard it might have to do with having lots of zeros in the dataset. Could that be the cause?
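In case it helps narrow things down, here is a minimal sketch of checks that could be run on the variable. It assumes dist_bt_fi is stored as a numeric variable, which I have not confirmed. If it carries a value label (for example after encode), the values used by xtile and tabstat would be the underlying integer codes rather than the distances shown in the Data Editor.

```stata
* Diagnostic sketch only; variable names are the ones used in this post
* Storage type and any attached value label
describe dist_bt_fi
* Range and percentiles as Stata actually stores them
summarize dist_bt_fi, detail
* Number of exact zeros and of missing values
count if dist_bt_fi == 0
count if missing(dist_bt_fi)
* Show stored values rather than value labels, if any are attached
tabulate dist_bt_fi, nolabel
```

If summarize reports a maximum of 54 rather than 9300, the quartile limits above would be consistent with the stored values, and the question becomes how the variable was created.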
Any thoughts would be greatly appreciated,
Anna