Extract IDs where condition is met, looped over multiple time points

Josephine George

Join Date: Dec 2018

Posts: 34
#1


14 Dec 2021, 09:52

I have aggregate data summarising individual-level data. In the aggregate data, each row describes a combination of country and generation, each variable is a survey round from the original data, and the values are the number of responses in a given category and survey round. I have substantially abbreviated the example data here; there are at least 20 survey rounds for some of my data sources.

Code:

* Example generated by -dataex-. For more info, type help dataex
clear
input str26 category int(_1997_Oct _1999_March _2000_Nov)
" France prewar"  246 241 243
" France boomers" 408 384 365
" France genx"    323 313 298
" France millen"   28  62  97
" France genz"      0   0   0
end

I am trying to end up with a list of category-survey rounds where there were fewer than n observations. In the above example, if n=100, I would want to end up with something like:
Oct1997         March1999       Nov2000
France millen   France millen   France millen
France genz     France genz     France genz

I also have the individual-level data, currently in "long" format, if what I want can be achieved more easily from that.
Josephine George

Join Date: Dec 2018

Posts: 34
#2

14 Dec 2021, 10:00

This turned out to be slightly more straightforward than I thought. In case it helps anyone, this code seems to work:

Code:

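* Collect every variable except category (i.e. the survey rounds) in r(varlist)
ds category, not
* For each round, list the categories with between 1 and 99 responses
foreach x of varlist `r(varlist)' {
    list category if `x'>0 & `x'<100
}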

Edit: I am still struggling to figure out how to get the output into a more useful format (ideally Excel rather than lists in the Stata window).

Last edited by Josephine George; 14 Dec 2021, 10:24.
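
One possible route to Excel, offered as a minimal sketch rather than a tested solution: reshape the wide counts to long, keep the cells below the threshold, and write them out with -export excel-. It assumes the example data from #1 are in memory and that category uniquely identifies rows. The names n_, round, and small_cells.xlsx are illustrative and do not come from the posts above. The result is a long list (one row per category-round) rather than the side-by-side layout shown in #1.

```stata
* Sketch only: assumes the example data from #1 are in memory.
* The names n_, round and small_cells.xlsx are illustrative choices.
local n 100

* Give the round variables a common stub so -reshape- can find them
rename _* n_*

* One row per category-round; round holds e.g. "1997_Oct"
reshape long n_, i(category) j(round) string

* Keep the cells under the threshold (zeros included) and write them out
keep if n_ < `n'
sort round category
export excel round category n_ using "small_cells.xlsx", firstrow(variables) replace
```

This keeps cells with zero responses, matching the desired output in #1; adding `& n_ > 0` to the -keep- line would instead mirror the condition used in #2. Wrapping the block in -preserve- and -restore- would leave the wide data in memory afterwards.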