
  • Stacking variables with hbar and having both N and mean values reported?

    Dear Stata-listers

    In my quest for getting both N and mean values (or percentages) reported in my much needed graphs, I have encountered yet another problem. I have worked out an example using auto.dta, in order to better illustrate. I have also made a couple of graphs (graph1 and graph2) which look the way I need them to look, but in order to get this result I have had to make changes manually using Stata's graph editor. In the future I will be making a lot of such graphs, so I need them to be produced auto"magically" from written code, not manually, from me tampering with the graph editor.

    I have two questions, but before I ask them I will explain a couple of things (hope this is not too much explaining):

    In this example, please pretend that mpg, trunk and turn are percentage variables that together sum to 100% (so that, for example, mpg would be 25, trunk 25 and turn 50, or mpg 0, trunk 75 and turn 25, etc.). I know they don't do this now, but the general idea is what is important here. In my own data I have three time-use variables which sum to 100%: each respondent has given the percentage of their work time spent on three main groups of work tasks during a regular work week. Since the three variables have to sum to 100%, each person has a valid value for each variable, and hence N is the same for each variable within each group of respondents. In other words, I don't need a separate "N" for each variable within the same group of respondents, which is what my current code produces (since I have not succeeded in stacking the bars). There are two groups of respondents in my data (Master degree respondents and Bachelor degree respondents), but in addition I need the average for MA and BA respondents combined. To get this, I have generated another group called "Total" (I also do this in the auto.dta example below).
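    For readers who want the auto.dta example to behave like real percentage data, here is a minimal sketch that rescales the three variables into row shares summing to 100. The names total, pct_mpg, pct_trunk, pct_turn and check are hypothetical and do not appear in the code further down.

```stata
* Sketch only: total, pct_* and check are hypothetical names,
* not part of the code posted in this thread.
sysuse auto, clear
keep foreign mpg trunk turn

* Rescale each row so the three variables become shares of their row total
egen double total = rowtotal(mpg trunk turn)
foreach v of varlist mpg trunk turn {
    gen double pct_`v' = 100 * `v' / total
}

* Every row should now sum to 100 (up to rounding error)
egen double check = rowtotal(pct_mpg pct_trunk pct_turn)
assert abs(check - 100) < 1e-8
```

    With shares built this way, no row has a missing value in one share but not in another, so N is identical across the three variables within each group.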

    I have two questions:

    1) Is there a way to stack the variables for each of the three groups, so that mpg, trunk and turn are represented in one horizontal bar which sums to 100%? (While still keeping N to the left of the bar and a different color for each variable.)

    --> See example graph1 below (generated manually with the graph editor)

    2) Assuming these were not percent-variables summing to 100%, but just three random variables with different averages on a Likert scale and also different number of observations per variable, is there a way to stack the variables for each of the groups (in the auto.dta these groups would be Foreign and Domestic, but my groups are a number of study programmes), so that we get one bar per group with three "stacks" that each have the mean value and N reported inside them?

    --> See example graph2 below (also generated manually with the graph editor)

    Here is "my" code (really, it is mostly Nick Cox's code, which he so kindly helped me with in one of my previous posts on a related problem), re-written into an auto.dta-example:


    Code:
    sysuse auto, clear
    keep  foreign mpg trunk turn
    
    replace foreign=2 if foreign !=.
                
    label define origin 2 "Total", add
    
    save auto2, replace
                
    * locate auto.dta on the adopath instead of hard-coding the install folder
    findfile auto.dta
    append using "`r(fn)'"
                
    keep  foreign mpg trunk turn
    
    collapse  ///
    (count) countmpg=mpg ///
    (count) counttrunk=trunk  ///
    (count) countturn=turn  ///
    (mean) meanmpg=mpg    ///
    (mean) meantrunk=trunk   ///
    (mean) meanturn=turn   ///
    , by(foreign)
    
    browse
    
    reshape long count mean, i(foreign) j(which) string
    
    capture drop order2
    capture drop group2
    capture drop detail2
    
    label define order2 1 mpg  2 trunk 3 turn  
    
    encode which, gen(order2) label(order2)
    
    label def order2 1 " ", modify
    label def order2 2 " ", modify
    label def order2 3 " ", modify
    
    gen detail2 = " ({it:n} = " + string(count) + ")"  
    
    egen group2 = group(order2 detail2), label
    
    separate mean, by(order2)
    The code below produces a graph which reports N and mean values (percentages), but not the way I need it to look. The way I need the graph(s) to look is illustrated in graph1, i.e. stacked for each group of respondents and with N reported to the left.

    Also, graph2 shows how I need the graph to look if we imagine the variables (mpg, trunk and turn) as mean values on a Likert scale (1 - 5), stacked, with mean value and N reported inside each "stack".


    Code:
    #delimit ;
    graph hbar (asis) mean?, over(group2, label(labsize(small)))  over(foreign, label(labsize(small))) nofill
    bar(1, fcolor(eggshell) lcolor(black)) bar(2, fcolor(khaki) lcolor(black)) bar(3, fcolor(olive_teal) lcolor(black))
    blabel(bar, pos(outside) format(%12.1f) size(vsmall))
        legend(lab(1 "MPG") lab(2 "Trunk") lab(3 "Turn") size(vsmall)
        keygap(0.5) symxsize(5) col(6) position(-2) ring(-1))
        plotregion(lcolor(none))
        scheme(s1mono)
        title("Rated importance for car owner (%)" " ", size(medlarge) )
        ytitle(" ")
        note("Note: Some note", size(small) span)
        ylab(none)
        name(fig15statalist2, replace);
        graph save fig15statalist2, replace;
    #delimit cr
    This results in this graph, which is not stacked and shows "superfluous" N's (since N is the same for each variable within each group of respondents):

    [Image: not the right way.png]




    If anyone can help me on these matters I will appreciate it more than I am able to convey in words. My supervisor needs these graphs to look a particular way and have N reported for each variable, and while I have been working hard, I end up using too much time struggling with code which takes me only so far, but not really there as far as N and "stacking" is concerned.

    Any help is appreciated.

    Best wishes,
    Johanne

    Example graphs, graph1 (the way I need this graph to look) and graph2 (the way I need some other graphs to look):

    [Image: graph1.png]

    [Image: graph2.png]
    Last edited by Johanne Karlsen; 21 May 2017, 11:17.

  • #2
    Originally posted by Johanne Karlsen:
    I have also made a couple of graphs (graph1 and graph2) which look the way I need them to look, but in order to get this result I have had to make changes manually using Stata's graph editor. In the future I will be making a lot of such graphs, so I need them to be produced auto"magically" from written code, not manually, from me tampering with the graph editor.
    You can record changes made in the Graph Editor and apply them to other graphs with the gr_edit command. See this post for an example.



    • #3
      You can record changes made in the Graph Editor and apply them to other graphs with the gr_edit command.
      Thank you for commenting and trying to help me out.

      I know of this feature of the Graph Editor and I have successfully used it previously. However, in this case it is not useful, since I have a lot of graphs to produce, and N and mean values are different for each variable and thus different for each graph. As such, I would still have to manually edit each gr_edit command to incorporate the correct number of observations and mean values, and I would also have to run the analyses to get the observations and means. What I need is to write code that makes the graphs and collects the correct number of observations and mean values for the particular variables automatically, without me having to write the numbers into the code or run any additional analyses.



      • #4
        Code:
        * continue from the data left by -separate- in #1
        ren mean1 mpg
        ren mean2 trunk
        ren mean3 turn
        
        * one row per group: N plus the mean of each variable
        collapse count mpg trunk turn, by(foreign)
        
        * append "(N = ...)" to each group's value label
        levelsof foreign, local(levels)
        foreach l of local levels {
          local lab`l' : label origin `l'
          sum count if foreign == `l'
          lab def origin `l' "`lab`l'' (N = `r(mean)')", modify
        }
        
        * stacked bars, rescaled to 100% within each group
        set scheme s1mono
        graph hbar mpg trunk turn, ///
          stack percent over(foreign) ///
          blabel(bar, format(%2.0f) pos(inside)) ///
          legend(lab(1 "MPG") lab(2 "Trunk") lab(3 "Turn") ///
          pos(12) row(1))
        [Image: bars.png]
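        The code above covers question 1. For question 2 (mean and N inside each stacked segment, with N differing by variable), one possible sketch uses twoway rbar with a scatter layer that only draws marker labels. This is an illustration under assumptions, not a tested answer from this thread: it starts again from the original auto.dta, so there is no "Total" group, and the names n, avg, lower, upper, mid and text are made up for the example.

```stata
* Sketch only: n, avg, lower, upper, mid and text are illustrative names.
sysuse auto, clear

* One row per group with N and mean for each of the three variables
collapse (count) n1=mpg n2=trunk n3=turn ///
         (mean) avg1=mpg avg2=trunk avg3=turn, by(foreign)

* Long layout: one row per group x variable (which = 1 mpg, 2 trunk, 3 turn)
reshape long n avg, i(foreign) j(which)

* Cumulative means give the start and end of each segment of the stack
bysort foreign (which): gen double upper = sum(avg)
gen double lower = upper - avg
gen double mid = (lower + upper) / 2

* Text to place inside each segment: mean and N
gen text = string(avg, "%4.1f") + " (n = " + string(n) + ")"

* Three horizontal range-bar layers plus an invisible-marker label layer
twoway (rbar lower upper foreign if which == 1, horizontal barwidth(0.6)) ///
       (rbar lower upper foreign if which == 2, horizontal barwidth(0.6)) ///
       (rbar lower upper foreign if which == 3, horizontal barwidth(0.6)) ///
       (scatter foreign mid, msymbol(none) mlabel(text) mlabposition(0)), ///
       ylabel(0 1, valuelabel angle(horizontal) noticks) yscale(reverse) ///
       ytitle("") xtitle("Sum of means") ///
       legend(order(1 "MPG" 2 "Trunk" 3 "Turn") row(1))
```

        Because each segment's label is built from the collapsed count and mean, nothing has to be typed in by hand. A "Total" group could be added by appending a recoded copy of the data before the collapse, as in #1, and extending ylabel() to include the extra level.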
