How to count "at least one"

Jacob Alexander

Join Date: Jul 2019

Posts: 4
#1

10 Jul 2019, 04:08

Hello,

I apologise for what I'm sure is a trivial question but I've been searching through the forums for an answer to no avail.

I have data from a cluster randomised trial with 3 arms and 10 clusters in each arm. Each cluster contains 20 households, and the data record every window in each household and whether that window has a blind (a binary outcome, coded 0 or 1). I have missing data for 5 households out of the 600. Each entry in the dataset is one window, identified by its household number (1-20), cluster number (1-30), arm number (1-3) and outcome (0 or 1), giving a total of 3162 entries, but there is no unique household identifier.

I'm trying to work out the proportion of households in each arm in which at least one of the windows has a blind.

As far as I can tell, I have to do two things:

(1) Generate a unique identifier for each household, which will usually have multiple entries because most houses have multiple windows.

The code I've come up with (which seems to work) is:

Code:

egen household_id = group(cluster_number household_number)

(2) Write some code which effectively says "if any of the entries under a particular household_id has a value of 1 for the outcome, count that household; if not, don't", with the counts separated by the three arms.

But I'm really stuck on how to do part (2).

Any assistance would be greatly appreciated.

Jacob
Nick Cox

Join Date: Mar 2014

Posts: 35699
#2

10 Jul 2019, 04:22

See https://www.stata.com/support/faqs/d...ble-recording/

Something like this should help.

Code:

egen wanted = max(window), by(household_id)
egen tag = tag(household_id)
tab cluster_number wanted if tag
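The original question asked for results by arm rather than by cluster, so here is a minimal sketch of the same technique tabulated by arm. The six rows of data are invented purely for illustration, and the variable names arm, cluster_number, household_number and window are assumed from the posts above.

```stata
* Invented toy data: one row per window (values are made up)
clear
input byte(arm cluster_number household_number window)
1 1 1 0
1 1 1 1
1 1 2 0
2 2 1 0
2 2 1 0
2 2 2 1
end

* One identifier per household (household numbers repeat across clusters)
egen household_id = group(cluster_number household_number)

* 1 if any window in the household has a blind, 0 if none does;
* max() ignores missing values and is missing only if all are missing
egen wanted = max(window), by(household_id)

* Flag exactly one row per household so that each is counted once
egen tag = tag(household_id)

* Households by arm, with row percentages
tabulate arm wanted if tag, row
```

The row percentage in the column for wanted equal to 1 is then the share of households in each arm with at least one blind.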
Jacob Alexander

Join Date: Jul 2019

Posts: 4
#3

10 Jul 2019, 08:42

Dear Nick,

Thanks for your reply, which has worked everything out nicely. I'd looked at that webpage but hadn't worked out the tag section of code.

Best wishes,

Jacob
Jacob Alexander

Join Date: Jul 2019

Posts: 4
#4

11 Jul 2019, 04:06

Hello,

Similar to yesterday's question, I'm now trying to work out the number of blinds per 100 households, by arm and overall, and the 95% confidence intervals of these means.

I've been through these forums over and over and just can't work out what to type (or even the name of the process I'm looking for). I'm new to STATA so please forgive my ignorance.

Best wishes,
Jacob
Nick Cox

Join Date: Mar 2014

Posts: 35699
#5

11 Jul 2019, 05:37

Please study FAQ Advice #12 on how best to help us -- by giving a data example. No need for repeated apologies -- we were all beginners once -- but a real need for concrete examples.

https://www.statalist.org/forums/help#version

While you're there swing by

https://www.statalist.org/forums/help#spelling

If I understand you correctly, your first need is a dataset based on households, which in terms of the variables in #2 would be got with

Code:

keep if tag

This invented dataset shows some technique -- and some personal choices (e.g. I like the option jeffreys in ci proportions).

Code:

* Example generated by -dataex-. To install: ssc install dataex
clear
input float(window household_id cluster)
1  1 1
1  2 1
0  3 1
0  4 2
0  5 2
1  6 2
0  7 3
0  8 3
0  9 3
1 10 4
1 11 4
1 12 4
end

statsby, by(cluster) total clear : ci proportions window, jeffreys

su cluster, meanonly
local totalid = r(max) + 1
replace cluster = `totalid' if cluster == .
label def cluster `totalid' "Total"
label val cluster cluster

twoway scatter proportion cluster, mc(blue) ///
|| rcap lb ub cluster, lc(blue) ///
ytitle(Mean proportions and 95% confidence intervals) ///
scheme(s1color) title(Households with windows) legend(off) ///
yla(0 "0" 1 "1" 0.2(0.2)0.8, ang(h) format(%02.1f)) xla(, valuelabel)

You can copy and paste this code into a do-file editor window and run it all at once.
Jacob Alexander

Join Date: Jul 2019

Posts: 4
#6

11 Jul 2019, 09:04

Dear Nick,

Thank you for the reply. Points noted about Stata vs STATA, repeated apologies and giving data examples. Particularly the point about giving data examples because what you've kindly shown me how to do is not quite what I'm trying to do, which I'll now try and explain better.

I can't use

Code:

keep if tag

because what I'm trying to do is calculate the mean number of blinds per 100 households (an index I'm interested in). It's a mean, not a proportion, and it will be above 100. And because of the way the data is arranged, if I exclude all non-tag data I'll be limiting myself to a maximum of 100 for the index.

What I've done so far is this:

Code:

. egen household_id = group(cluster_number household_number)

. egen tag = tag(household_id)

. tab window arm if window==1

           |               arm
    window |         1          2          3 |     Total
-----------+---------------------------------+----------
         1 |       276        210        181 |       667
-----------+---------------------------------+----------
     Total |       276        210        181 |       667

. tab arm if tag

        arm |      Freq.     Percent        Cum.
------------+-----------------------------------
          1 |        199       33.45       33.45
          2 |        198       33.28       66.72
          3 |        198       33.28      100.00
------------+-----------------------------------
      Total |        595      100.00

What I want to do now is divide the number of blinds by the number of households in each arm (and overall) then multiply them by 100. So (276/199)*100 and so on. And then generate a 95% confidence interval for the index for each arm, and also have the index in a form such that I can do negative binomial regression along the lines of

Code:

nbreg newindex i.arm

afterwards. I hope this is a lot clearer and thanks for giving me tips on how to use the forum correctly.
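As a quick check of the arithmetic described above, the index can be computed directly from the counts in the two tabulations in this post. This is only the point estimate, with no confidence interval; the numbers are copied from the tables and nothing else is assumed.

```stata
* Blinds per 100 households = 100 * (number of blinds) / (number of households)
* Arm 1
display %6.1f 100 * 276/199
* Arm 2
display %6.1f 100 * 210/198
* Arm 3
display %6.1f 100 * 181/198
* All arms combined
display %6.1f 100 * 667/595
```

These work out to 138.7, 106.1, 91.4 and 112.1 blinds per 100 households respectively.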

Best wishes,
Jacob
Nick Cox

Join Date: Mar 2014

Posts: 35699
#7

11 Jul 2019, 09:19

You did say that: my fault. Other way round, still no data example! My latest guess is

Code:

egen total = total(window), by(household_id)
egen tag = tag(household_id)
keep if tag
statsby, by(cluster) total clear : ci mean total

and then it's the same code as before. I stopped at trying to understand your last request, given other things to do.
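For what it is worth, here is a sketch of how the same idea might be grouped by arm and rescaled to "per 100 households", which seems to be what #6 is after. The data are invented, the names nblinds and hh are hypothetical (chosen so as not to clash with total and tag above), and the intervals from ci means treat households as independent, so they take no account of the cluster design.

```stata
* Invented toy data: one row per window (values are made up)
clear
input byte(arm cluster_number household_number window)
1 1 1 1
1 1 1 1
1 1 2 0
1 2 1 1
2 3 1 0
2 3 1 1
2 3 2 0
2 4 1 1
2 4 1 1
end

egen household_id = group(cluster_number household_number)

* Number of blinds in each household; total() counts missing values as 0
egen nblinds = total(window), by(household_id)

* Reduce to one row per household
egen hh = tag(household_id)
keep if hh

* Mean blinds per household and 95% limits, by arm and overall;
* the overall row has arm missing
statsby mean=r(mean) lb=r(lb) ub=r(ub), by(arm) total clear : ci means nblinds

* Rescale from "per household" to "per 100 households"
foreach v in mean lb ub {
    replace `v' = 100 * `v'
}

list arm mean lb ub, noobs
```

Because statsby with the clear option replaces the data in memory, any model for the household counts would have to be fitted on the one-row-per-household dataset before that step. One guess at what was intended in #6 is a count model for nblinds on i.arm with standard errors clustered on cluster_number, rather than a regression on the index itself.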