I don't understand how fastxtile() assigns bins when a group has fewer observations than the requested number of bins.
I know one may question why anyone would do this in the first place, but grouping by a certain variable can create groups that have fewer observations than the number of bins.
When I tested several bin numbers, fastxtile() behaved in a seemingly regular pattern, but I can't work out what that pattern is.
I used code similar to the following:
egen temp_bin = fastxtile(mv), by(time_var) n(5)
The following are sample cases for a given number of bins (bin_num) and a given number of observations in the group (num_obs), together with the bins that fastxtile() assigned to those observations.
bin_num=5:
# num_obs=5: 1,2,3,4,5
# num_obs=4: 1,2,3,4
# num_obs=3: 1,2,4
# num_obs=2: 1,3
# num_obs=1: 1
So, for example, with bin_num=5 and 3 observations, the observations were assigned to bins 1, 2, and 4.
bin_num=7:
# num_obs=7: 1,2,3,4,5,6,7
# num_obs=6: 1,2,3,4,5,6
# num_obs=5: 1,2,3,5,6
# num_obs=4: 1,2,4,6
# num_obs=3: 1,3,5
# num_obs=2: 1,4
# num_obs=1: 1
bin_num=9:
# num_obs=9: 1,2,3,4,5,6,7,8,9
# num_obs=8: 1,2,3,4,5,6,7,8
# num_obs=7: 1,2,3,4,6,7,8
# num_obs=6: 1,2,3,5,7,8
# num_obs=5: 1,2,4,6,8
# num_obs=4: 1,3,5,7
# num_obs=3: 1,4,7
# num_obs=2: 1,5
# num_obs=1: 1
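In case it helps anyone reproduce the bin_num=5 cases, here is a minimal sketch with simulated data. The data, the seed, and the group sizes are made up for illustration; only the names mv, time_var, and temp_bin mirror my actual code, and it assumes the fastxtile egen function is already installed.

```stata
* Minimal sketch: one group per value of time_var, with group sizes 1 through 5.
* The data are simulated; they are not my real data.
clear
set seed 12345
set obs 5
generate time_var = _n        // five groups, numbered 1..5
expand time_var               // group g ends up with g observations
generate mv = runiform()      // continuous draws, so ties within a group are very unlikely

* Same call as above; assumes the fastxtile egen function is installed.
egen temp_bin = fastxtile(mv), by(time_var) n(5)

* Show the assigned bins group by group, ordered by mv.
sort time_var mv
list time_var mv temp_bin, sepby(time_var)
```

Changing the number in set obs and in n() should give the corresponding setup for bin_num=7 or bin_num=9.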
I'd appreciate any form of feedback.
Also, this is my first time posting a question here, so I'd be happy to clarify the question further if needed.