I have a dataset of clinic attendances. Each record includes a unique identifier for the attendance, characteristics of the person attending, a set of binary variables that each indicate whether the attendance resulted in a certain diagnosis (or category of diagnosis), and some other characteristics of the attendance such as duration and cost. The diagnosis variables are not mutually exclusive.
Something like this:
ID | age | sex | Diag1_cardiac | Diag2_renal | Diag3_fever | duration | cost |
A | 1 | M | 1 | 0 | 1 | 6 | 2000 |
B | 4 | F | 0 | 1 | 0 | 3 | 300 |
C | 34 | F | 1 | 0 | 0 | 1 | 500 |
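For anyone who wants to try this, the three example rows above could be entered in Stata roughly as follows. This is only a sketch: the values are the made-up ones from the table, and the variable names assume the capitalisation used in the output tables (Diag1_cardiac and so on), which may not match the real dataset.

```stata
* Illustrative data only: the three example attendances from the table above.
* Variable names are an assumption (capitalised as in the output tables).
clear
input str1 ID age str1 sex Diag1_cardiac Diag2_renal Diag3_fever duration cost
"A"  1 "M" 1 0 1 6 2000
"B"  4 "F" 0 1 0 3  300
"C" 34 "F" 1 0 0 1  500
end

* Show the data; attendance A has two diagnoses flagged,
* which is what makes the diagnoses non-mutually-exclusive.
list, noobs
```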
I would like to produce several tables from this, some of which will include counts and %, e.g.:
| Number of attendances | % of all attendances |
Diag1_cardiac | | |
Diag2_renal | | |
Diag3_fever | | |
All attendances | xx | 100 |
Others would include means and 95% CIs, e.g.:
| Mean age | 95% CI age |
Diag1_cardiac | | |
Diag2_renal | | |
Diag3_fever | | |
All attendances | | |
I can't find a simple way of doing this. The best that I have found is tabout, which allows me to produce the counts, % and means for multiple variables. However, it has two problems. Firstly, for each diagnosis it reports the count and % when the diagnosis is 1, when it is 0, and the total, whereas I am only interested in when the diagnosis is 1. Given that I have a large number of diagnoses, it is clunky having to filter out the rows that I need from the output tables. Secondly, there doesn't seem to be a way to produce 95% CIs for the means.
I thought there must be a way of collapsing the data but I can't think of a way of doing this given that the diagnoses are not mutually exclusive.
Does anyone have any suggestions for how I can do this?
Thanks very much,
Jamie