Hi,
I am currently working on a project where I want to do the following:
I have a dataset with a large set of characteristics (e.g. age) of people in year t. I also have a file with a population forecast of how many people of age X there will be in year t+30. I now want to draw from my dataset to build a population that fits the forecasted demography, in order to see how the change in the age composition of the population affects some other variables (assuming that everything other than the age composition stays the same). That also implies oversampling some age groups (there are fewer old people now than there will be in the future).
One way of doing this is to first expand the original dataset, so that the restriction that the bsample command cannot draw a sample larger than the dataset it draws from is no longer a problem. I would then use a loop that draws a sample per age category and saves each one as a separate dataset (I might use frames instead, but with the limit of 100 frames and limited working memory this will likely become an issue), and finally append the files to get the final dataset.
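To make the workaround concrete, here is a minimal sketch of what I have in mind. The file and variable names (people.dta, forecast.dta, age, n_target) are placeholders, not my actual data, and I assume the forecast file has one row per age with an integer target count. Instead of saving one file per age, the sketch appends to a single temporary file as it goes.

```stata
* Placeholder names: people.dta holds the micro data (with variable age),
* forecast.dta holds one row per age with the target count n_target.
use people, clear
merge m:1 age using forecast, keep(match) nogenerate

* Number of observations currently available in each age cell
bysort age: generate long n_now = _N

* Duplicate observations so that every age cell holds at least
* n_target observations (bsample cannot draw more than _N)
expand ceil(n_target / n_now)

* Draw n_target observations with replacement, one age cell at a time
levelsof age, local(ages)
tempfile result
local first = 1
foreach a of local ages {
    preserve
    keep if age == `a'
    local n = n_target[1]
    bsample `n'
    if !`first' append using `result'
    save `result', replace
    local first = 0
    restore
}

* The accumulated draws form the synthetic t+30 population
use `result', clear
```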
However, this feels more like a workaround than an efficient solution, so I wonder whether there is a smarter or faster way to do this.
Bonus question: It seems that using the values from the population forecast, which are stored in a separate frame, as input for the 'bsample' command works just fine. However, doing the same with the 'sample' command does not work, and I get '_frval found where number expected'. Is there any way to make this work?
Best
Henrik