Dear Stata-listers
In my quest to get both N and mean values (or percentages) reported in my much-needed graphs, I have run into yet another problem. I have worked out an example using auto.dta to illustrate it. I have also made two graphs (graph1 and graph2) that look the way I need them to look, but to get this result I had to make changes manually in Stata's Graph Editor. I will be making many such graphs in the future, so I need them to be produced "automagically" from written code, not by tampering manually with the Graph Editor.
I have two questions, but before I ask them I will explain a couple of things (I hope this is not too much explaining):
In this example I need you to pretend that mpg, trunk and turn are percentage variables that together sum to 100% (so that, for example, mpg would be 25, trunk 25 and turn 50, or mpg 0, trunk 75 and turn 25, etc.). I know they do not do this now, but the general idea is what matters here. In my own data I have three time-use variables that sum to 100%: each respondent has given the percentage of their work time spent on three main groups of work tasks during a regular work week. Since the three variables have to sum to 100%, each person has a valid value for each variable, and hence N is the same for each variable within each group of respondents. In other words, I do not need a separate N for each variable within the same group of respondents, which is what my current code produces (since I have not succeeded in stacking the bars).
There are two groups of respondents in my data (Master's degree respondents and Bachelor's degree respondents), but I also need the average for MA and BA respondents combined. To get this, I have generated another group called "Total" (I also do this in the auto.dta example below).
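To make the "pretend" part concrete, here is a minimal sketch of how mpg, trunk and turn could be rescaled so that they sum to 100 for each car. The variable names total3, pct_mpg, pct_trunk and pct_turn are hypothetical and are not used in my code further down; this only illustrates the assumption and is not part of my actual data preparation.

```stata
sysuse auto, clear
keep foreign mpg trunk turn

// Row total of the three variables (hypothetical helper variable)
egen double total3 = rowtotal(mpg trunk turn)

// Rescale each variable to its share of the row total, in percent
foreach v of varlist mpg trunk turn {
    generate double pct_`v' = 100 * `v' / total3
}

// Check that the three shares sum to 100 for every observation
assert abs(pct_mpg + pct_trunk + pct_turn - 100) < 1e-6

// Group means of the shares also sum to 100 within each group
tabstat pct_mpg pct_trunk pct_turn, by(foreign) statistics(mean n)
```

With variables like these, N is necessarily the same for all three variables within a group, which is the situation in my real data.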
I have two questions:
1) Is there a way to stack the variables for each of the three groups, so that mpg, trunk and turn are represented in one horizontal bar that sums to 100%, while still keeping N to the left of the bar and a different color for each variable?
--> See example graph1 below (generated manually with the Graph Editor)
2) Assuming these were not percentage variables summing to 100%, but just three arbitrary variables with different means on a Likert scale and different numbers of observations per variable, is there a way to stack the variables for each of the groups (in auto.dta these groups would be Foreign and Domestic; in my data they are a number of study programmes), so that we get one bar per group with three "stacks" that each have the mean value and N reported inside them?
--> See example graph2 below (also generated manually with the Graph Editor)
Here is "my" code (really, it is mostly Nick Cox's code, which he kindly helped me with in one of my previous posts on a related problem), rewritten as an auto.dta example:
Code:
sysuse auto, clear
keep foreign mpg trunk turn

// Recode every car into a third group, "Total", and save that copy
replace foreign = 2 if foreign != .
label define origin 2 "Total", add
save auto2, replace

// Append the original data so that Domestic, Foreign and Total are all present
append using "C:\Program Files (x86)\Stata14\ado\base\a\auto.dta"
keep foreign mpg trunk turn

// One row per group, with N and mean for each variable
collapse ///
    (count) countmpg=mpg ///
    (count) counttrunk=trunk ///
    (count) countturn=turn ///
    (mean) meanmpg=mpg ///
    (mean) meantrunk=trunk ///
    (mean) meanturn=turn ///
    , by(foreign)
browse

// One row per group and variable
reshape long count mean, i(foreign) j(which) string
capture drop order2
capture drop group2
capture drop detail2
label define order2 1 mpg 2 trunk 3 turn
encode which, gen(order2) label(order2)

// Blank out the variable names so that only N is shown on the axis
label def order2 1 " ", modify
label def order2 2 " ", modify
label def order2 3 " ", modify
gen detail2 = " ({it:n} = " + string(count) + ")"
egen group2 = group(order2 detail2), label

// Creates mean1, mean2 and mean3, one per variable
separate mean, by(order2)
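As an aside, the "Total" group could probably also be built without saving a copy and appending from a hard-coded path. A minimal sketch, assuming expand with its generate() option is acceptable here (the variable name istotal is hypothetical); it should give the same observations as the save/append steps above:

```stata
sysuse auto, clear
keep foreign mpg trunk turn

// Duplicate every observation; istotal is 0 for originals, 1 for copies
expand 2, generate(istotal)

// Recode the copies as a third group and label it
replace foreign = 2 if istotal == 1
label define origin 2 "Total", add

// auto.dta has 74 cars, so the "Total" group should also hold 74
count if foreign == 2
assert r(N) == 74
drop istotal
```

This would also run on a machine where Stata is installed somewhere other than the path used in my code.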
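The code below produces a graph that reports N and mean values (percentages), but not the way I need it to look. I need it to look like graph1, i.e. stacked for each group of respondents, with N reported to the left. Graph2 shows how I need the graph to look if we imagine the variables (mpg, trunk and turn) as mean values on a Likert scale (1 - 5): stacked, with the mean value and N reported inside each "stack".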
Code:
#delimit ;
graph hbar (asis) mean?,
    over(group2, label(labsize(small)))
    over(foreign, label(labsize(small)))
    nofill
    bar(1, fcolor(eggshell) lcolor(black))
    bar(2, fcolor(khaki) lcolor(black))
    bar(3, fcolor(olive_teal) lcolor(black))
    blabel(bar, pos(outside) format(%12.1f) size(vsmall))
    legend(lab(1 "MPG") lab(2 "Trunk") lab(3 "Turn") size(vsmall)
        keygap(0.5) symxsize(5) col(6) position(-2) ring(-1))
    plotregion(lcolor(none))
    scheme(s1mono)
    title("Rated importance for car owner (%)" " ", size(medlarge))
    ytitle(" ")
    note("Note: Some note", size(small) span)
    ylab(none)
    name(fig15statalist2, replace);
graph save fig15statalist2, replace;
#delimit cr
This results in a graph that is not stacked and shows "superfluous" N's (since N is the same for each variable within each group of respondents).
If anyone can help me with these matters, I will appreciate it more than I can convey in words. My supervisor needs these graphs to look a particular way and to have N reported for each variable. I have been working hard, but I end up spending too much time struggling with code that takes me only part of the way as far as N and "stacking" are concerned.
Any help is appreciated.
Best wishes,
Johanne
Example graphs, graph1 (the way I need this graph to look) and graph2 (the way I need some other graphs to look):