statplot

Tara Boyle

Join Date: Nov 2022

Posts: 136
#1

09 Mar 2023, 03:53

Code:

statplot Gender-Asa-ethnicity-social_depriv-provider-robotic, bar(1,bfcolor(green*0.4)) over(procedure_type) recast(bar) yla(0 0.25 "25" 0.5 "50" 0.75 "75", ang(h)) ytitle(Percent)

Stata reports 'invalid name'.

I have tried putting double quotes around Percent.

What am I doing wrong?
Tara Boyle

Join Date: Nov 2022

Posts: 136
#2

09 Mar 2023, 04:01

I just removed the hyphens and it worked. With the hyphens, Stata read the whole list as a single variable name.
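For context, a hyphen in a Stata varlist denotes a range of variables in dataset order, so a chain of hyphens is not read as a list of separate variables. A minimal sketch, assuming the auto dataset shipped with Stata rather than the data in this thread:

```stata
sysuse auto, clear

* a hyphen means "from the first variable through the second", in dataset order
unab range : price-rep78
display "`range'"

* separate variables are listed with spaces between them
summarize price mpg rep78
```

The same rule applies to the varlist given to statplot: names separated by spaces are treated as distinct variables.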
Tara Boyle

Join Date: Nov 2022

Posts: 136
#3

09 Mar 2023, 04:39

Questions about statplot for Nick Cox

Code:

statplot gender asa ethnicity provider, bar(1, bfcolor(green*0.4)) over(procedure) recast(bar) yla(0 0.25 "25" 0.5 "50" 0.75 "75", ang(h)) ytitle(Percent)

1. I manually changed the labels '1 male or 2 female' to angled and vsmall. How can I add this to the code?

2. Y axis: I have shown this as percent, but I would actually like to show counts. How do I do this in the code? Do I just delete the section yla(0 0.25 "25" 0.5 "50" 0.75 "75", ang(h)), because otherwise it doesn't work?

3. How do I give THR and TKR two different colours? Right now they are both green.

4. With regard to ethnicity, as you can see I have categories coded 1-8. I assume Stata is just totalling the number of observations in ethnicity?
I am planning to show the proportions before propensity score matching, and then show how the THR and TKR populations become similar after matching.
However, as ethnicity has several categories (1-8), I don't understand how Stata is interpreting and plotting it. Can you please confirm?

Please note:
Rather than the labels showing up as '1 male or 2 female', I will change the variable labels to something more readable, e.g. 'Gender'.

Code:

* Example generated by -dataex-. For more info, type help dataex
clear
input float(procedure gender asa Anesthesia ethnicity provider)
1 1 1 0 1 0
1 1 2 1 2 1
1 2 2 1 3 1
2 2 3 1 4 0
2 1 2 0 5 1
2 2 1 1 6 1
2 1 1 0 8 0
2 1 3 1 8 1
end
label values procedure type
label def type 1 "THR", modify
label def type 2 "TKR", modify
label values gender sex
label def sex 1 "Male", modify
label def sex 2 "female", modify
label values asa ana
label def ana 1 "general", modify
label def ana 2 "regional", modify
label values ethnicity ethn
label def ethn 1 "white", modify
label def ethn 2 "mixed", modify
label def ethn 3 "black", modify
label def ethn 4 "african", modify
label def ethn 5 "Indian", modify
label def ethn 6 "black other", modify
label def ethn 8 "pakistani", modify
label values provider prov
label def prov 0 "nhs", modify
label def prov 1 "private", modify
Nick Cox

Join Date: Mar 2014

Posts: 35444
#4

09 Mar 2023, 10:53

statplot is from SSC, as you are asked to explain. (People are asked to say where community-contributed commands come from: FAQ Advice #12.)

Your existing plot is showing the means of various categorical variables, as documented.

statistic() specifies the summary statistic used to summarize and plot varlist. The default is mean. See collapse for a full list of accepted statistics.
Note that only one statistic may be specified.

but the mean of a categorical variable is in general only interesting or useful if that categorical variable is binary and coded (0, 1). Mean ethnicity is especially useless, as it depends arbitrarily on the coding.

You don't want means, but if you want frequencies for all of

THR or TKR by gender: a 2 x 2 table

ditto by asa but a 2 x 3 table

ditto by ethnicity at least a 2 x 7 table

ditto by provider: a 2 x 2 table

all on one graph, I think that is beyond statplot.
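The point about means of categorical variables can be checked directly. This is a small sketch using three of the variables from the dataex example in #3; the variable ethnicity2 and the swap of codes 1 and 8 are hypothetical, made up only to show the dependence on coding:

```stata
clear
input float(procedure ethnicity provider)
1 1 0
1 2 1
1 3 1
2 4 0
2 5 1
2 6 1
2 8 0
2 8 1
end

* provider is coded 0/1, so its mean is the proportion coded 1
summarize provider, meanonly
display r(mean)

* ethnicity codes are arbitrary: swapping two codes changes the mean
summarize ethnicity, meanonly
display r(mean)
recode ethnicity (1 = 8) (8 = 1), generate(ethnicity2)
summarize ethnicity2, meanonly
display r(mean)
```

The mean of provider is interpretable as a proportion (5 of 8 observations, 0.625), whereas the mean of ethnicity changes when the same categories are merely renumbered.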
Tara Boyle

Join Date: Nov 2022

Posts: 136
#5

09 Mar 2023, 10:59

I found this solution

Code:

statplot gender asa ethnicity provider, statistic(count) over(procedure) recast("bar")

That addressed Q2, but I still have to change the labels manually. It would be great if someone could point out how to set the labels to vsmall in the code.

Following propensity score matching, some observations will be dropped, so the distributions of each variable in the two groups should approach each other.
This shouldn't be an issue for the binary variables, which take only the values 0 or 1.

But for ethnicity, which has more than two categories, how can I make sure that only the observations used in matching are plotted?
Perhaps I need to use fweights?

Here in this paper:
http://www.lindenconsulting.org/docu...ce_Article.pdf

He states the following, which I take to be equivalent to frequency:
In a histogram, the data are divided into non-overlapping intervals (bins), and the number of data points within each interval is counted. The graph depicts these frequency counts – the bar is centred at the midpoint of each interval – and its height reflects the average number of data points in the interval.
Nick Cox

Join Date: Mar 2014

Posts: 35444
#6

09 Mar 2023, 12:31

Code:

statplot gender asa ethnicity provider, statistic(count) over(procedure) recast("bar")

That's not a good solution to anything. It's just the category counts for procedure repeated regardless of the other variables.
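A sketch of why the counts repeat, assuming that statplot's statistic(count) behaves like the count statistic of collapse (its help refers to collapse for the accepted statistics). The data are the dataex example from #3 without Anesthesia:

```stata
clear
input float(procedure gender asa ethnicity provider)
1 1 1 1 0
1 1 2 2 1
1 2 2 3 1
2 2 3 4 0
2 1 2 5 1
2 2 1 6 1
2 1 1 8 0
2 1 3 8 1
end

* (count) is the number of nonmissing values, whatever those values are
collapse (count) gender asa ethnicity provider, by(procedure)
list, noobs
```

With no missing values, every variable has the same count within each procedure (3 for THR, 5 for TKR), so the bars carry no information about gender, asa, ethnicity or provider.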
Tara Boyle

Join Date: Nov 2022

Posts: 136
#7

09 Mar 2023, 16:43

Originally posted by Nick Cox

Code:

statplot gender asa ethnicity provider, statistic(count) over(procedure) recast("bar")

That's not a good solution to anything. It's just the category counts for procedure repeated regardless of the other variables.

Thanks for your input Prof Cox, I suppose you wouldn’t have any other suggestions in terms of using frequencies and presenting them in a histogram as described by Prof Linden…

Last edited by Tara Boyle; 09 Mar 2023, 16:46.

Nick Cox

Join Date: Mar 2014
Posts: 35444
#8

09 Mar 2023, 17:44

I have not read Ariel's paper, but I looked at his Figures. All your data are categorical, so several of his graphs aren't pertinent.

If by histogram you mean a bar chart of category counts, there are many ways to do it. Here are three:

Code:

* Example generated by -dataex-. For more info, type help dataex
clear
input float(procedure gender asa Anesthesia ethnicity provider)
1 1 1 0 1 0
1 1 2 1 2 1
1 2 2 1 3 1
2 2 3 1 4 0
2 1 2 0 5 1
2 2 1 1 6 1
2 1 1 0 8 0
2 1 3 1 8 1
end
label values procedure type
label def type 1 "THR", modify
label def type 2 "TKR", modify
label values gender sex
label def sex 1 "male", modify
label def sex 2 "female", modify

set scheme s1color 
graph bar (count), over(procedure) over(gender) name(G1, replace)

* ssc install catplot 
catplot procedure gender, recast(bar) name(G2, replace)

* install from Stata Journal 
tabplot procedure gender, showval name(G3, replace)

In this case, but not always, catplot echoes graph bar (count).

There are naturally many options to tune what is shown. You can also change variable roles.
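As one illustration of such tuning, here is a hedged sketch that touches the earlier questions about small angled category labels and separate colours for THR and TKR. It uses only official graph bar options; the colours, the 45-degree angle, the axis title and the graph name G4 are arbitrary choices, not taken from the thread. The /// line continuations work in a do-file, not when typed interactively.

```stata
clear
input float(procedure gender)
1 1
1 1
1 2
2 2
2 1
2 2
2 1
2 1
end
label define type 1 "THR" 2 "TKR"
label values procedure type
label define sex 1 "male" 2 "female"
label values gender sex

* asyvars treats the first over() variable as separate series, so each
* procedure gets its own bar colour and a legend entry
graph bar (count), over(procedure) ///
    over(gender, label(angle(45) labsize(vsmall))) ///
    asyvars bar(1, color(green*0.4)) bar(2, color(orange*0.6)) ///
    ytitle("Number of patients") name(G4, replace)
```

Because catplot is a wrapper for graph bar, many of the same options should carry over, but its help file is the place to check.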


Tara Boyle

Join Date: Nov 2022

Posts: 136
#9

09 Mar 2023, 23:17

Yes, in this case I was referring to his presentation of categorical data, which is where my quotation came from; he plotted a histogram showing frequency counts in each interval.

I believe that in Stata a histogram is termed a bar chart. What is the difference between catplot, tabplot and statplot? You are still using graph bar (count), and I thought you had advised against this in post #6, although I may have misunderstood.
Nick Cox

Join Date: Mar 2014

Posts: 35444
#10

10 Mar 2023, 01:10

Stata has histogram and twoway histogram and graph bar and twoway bar, and some others, and so those are the distinctions it makes.

Those distinctions aren't that restrictive. Only yesterday I discovered, or re-discovered, that Stata won't draw histograms with its own commands when you have analytic weights, so I cheerfully did the calculation myself and fired up twoway bar.
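A minimal sketch of the general idea of doing the calculation yourself and handing the result to twoway bar. This is an unweighted illustration using the ethnicity codes from #3, not the analytic-weights calculation mentioned above; the variable name n and the graph name G5 are arbitrary.

```stata
clear
input float ethnicity
1
2
3
4
5
6
8
8
end

* reduce the data to one row per category, with its frequency in n
contract ethnicity, freq(n)

* draw the frequencies as bars at the category codes
twoway bar n ethnicity, barwidth(0.8) xlabel(1(1)8) ///
    ytitle("Frequency") name(G5, replace)
```

With weights, one option would be to sum the weight variable within categories using collapse (sum) before plotting, although that choice depends on what the weights mean.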

They don't necessarily bear on how users think about their graphs. Some people in statistical communities insist that a histogram is only a bar chart representation with touching bars and that (e.g.) a bar chart showing frequencies of a nominal variable is not a histogram. That's perhaps historic usage, but I am happy to think that a histogram is just a particular kind of bar chart and don't see any strong objection to any bar chart whatsoever showing frequencies, proportions, percents or densities being called a histogram. But reviewers and examiners might have prejudices on this detail.

I wrote catplot in 2004 because Stata did not directly support what some years later was implemented as graph bar (count), but my syntax was necessarily different. catplot is however a wrapper for graph bar or graph hbar or graph dot depending what the user chooses, so many of its options are as documented for those official commands. I suspect that catplot is a little more versatile, and I consider that it is more transparent about how to work with percents, than the official command, but always distrust a programmer familiar with their own work. Also, I haven't used graph bar (count) that much. If it had existed in 2004 catplot would possibly not have been written.

tabplot is a wrapper for twoway bar and its syntax is again a mix of syntaxes. It mostly has different goals.

The point in #6 is quite different and was only that statplot (SSC) with your syntax doesn't do anything useful.

Some people prefer not to use community-contributed commands or are unable to install them because of workplace policy on downloads.

Otherwise the best way to find out about differences between these commands is to study the help files and run some of the examples.
