I'm trying to bootstrap from data that represents a frequency table. To simplify, say x is a variable and n is its frequency, and say I have data called original, which summarizes the distribution as follows:
x n
1 40
2 30
3 30
So the original data summarizes 100 cases with x=(1,2,3) in proportions 40:30:30.
What I'd like to do is generate another dataset representing the distribution of 100 cases drawn at random, with replacement, from the distribution described by the original data. In fact, I'd like to do that 200 times and stack the results. I'm open to different ways of representing the results, but they might look something like this:
sample x n
1 1 42
1 2 32
1 3 26
2 1 35
2 2 30
2 3 35
....
200 1 34
200 2 27
200 3 39
Neither "bsample 3, weight(n)" nor "bsample 100, weight(n)" does this.
Many thanks for any suggestions.
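For what it's worth, here is a minimal sketch of one way the resampling described above might be done. It assumes the table is small enough to expand to one row per case. As far as I understand it, weight() in bsample names a variable that receives the sampling frequencies rather than supplying input frequencies, which would explain why the commands above don't help. The tempfile names (cases, stacked) and the seed are illustrative choices, not part of the original data.

```stata
* Sketch: draw 200 bootstrap samples of 100 cases from a frequency table
* and stack the resulting counts. Names and seed are illustrative.
clear
input x n
1 40
2 30
3 30
end

expand n                 // one row per case, 100 rows in total
drop n
tempfile cases stacked
save `cases'

set seed 12345           // arbitrary seed, only for reproducibility

forvalues s = 1/200 {
    use `cases', clear
    bsample                      // resample _N = 100 rows with replacement
    contract x, freq(n)          // collapse back to a frequency table
    generate sample = `s'
    if `s' > 1 {
        append using `stacked'   // add the samples collected so far
    }
    save `stacked', replace
}

sort sample x
order sample x n
list in 1/6, sepby(sample)
```

A value of x that happens not to be drawn in a given sample gets no row for that sample rather than a row with n = 0. That is very unlikely with proportions like 40:30:30, but it could matter for sparser tables.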