  • Random sampling with criteria depending on the sum of values

    Following a very useful FAQ, I generated subsamples corresponding to a number of criteria. For instance, I'm selecting X geographies that meet a certain threshold with respect to one indicator or another. My dataset has a variable representing the population size of each geography. I would like to add a sampling condition so that I can draw a random group of observations that meet one of the previously defined criteria and also satisfy a criterion on total population size. As such, I am interested not in the number of observations but in the total population of the whole subsample. The population criterion should apply to the whole subsample, not to a single geography. In short, I would like to obtain a subsample in which every geography meets some criterion on the value of the selected indicator and the total population of the subsample does not exceed a given figure. At present, I'm generating my subsample with the following code:
    Code:
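    * Flag geographies whose indicator is at most 3 (missing values are not flagged)
    generate lowindicator = (indicator <= 3)
    label variable lowindicator "Indicator X, lower than or equal to 3%"
    label values lowindicator IncludeExclude
    * Sort flagged geographies to the end, ordered by random within each group
    sort lowindicator random
    * Mark the last 50 observations, provided they are flagged
    generate insamplelowindicator = lowindicator & (_N - _n) < 50
    label variable insamplelowindicator "In sample"
    label values insamplelowindicator IncludeExclude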
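To make the population condition concrete, here is a rough sketch of what I have in mind, not a tested solution. It reuses lowindicator and random from the code above; the variable name population, the local popcap, its value of 1000000, and the new variables cumpop and insamplepopcap are all placeholders I made up for illustration.

```stata
* Sketch only: population, popcap, cumpop and insamplepopcap are assumed names
local popcap 1000000                       // hypothetical ceiling on total population

* Put flagged geographies first, ordered by random within each group
gsort -lowindicator random

* Running total of population over the flagged geographies only
generate double cumpop = sum(population) if lowindicator

* Select flagged geographies while the running total stays within the ceiling
generate byte insamplepopcap = lowindicator & cumpop <= `popcap'
```

As far as I can tell, this would select a random run of flagged geographies and stop at the first one that pushes the running total over the ceiling, so the selected total could end up well below the ceiling if that geography is large.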
    As usual, I will be grateful for any help.
    Kind regards,
    Konrad
    Version: Stata/IC 13.1