
  • Overlay a bar graph and line graph

    Hello all,

    I'm trying to replicate this graph using different data:

    [Image: Quintile.png]

    However my output is showing me this:


    [Image: Coverage.png]

    Here is my data and my code, respectively:

    Data

    Code:
    Sector    Coverage    SubminimumwageheadcountRatio
    Agriculture; hunting; forestry and fish    .85813358    0.52
    Mining and Quarrying    .89187858    0.13
    Manufacturing    .36191116    0.29
    Electricity; gas and water supply    .68654997    0.18
    Construction    .13923316    0.36
    Wholesale and Retail Trade    .62995644    0.35
    Transport; storage and communication    .34724069    0.27
    Financial Services    .75544079    0.22
    CSP    .90706484    0.31
    Private Households    .9931992    0.80
    Code

    Code:
    twoway (bar Coverage Sector1, yaxis(1)) || (line SubminimumwageheadcountRatio Sector1, connect(l) yaxis(2))


  • #2
    This is a little hard to follow. You show us Sector, with either string values or value labels, but your graph is in terms of Sector1, which is clearly numeric; if it has value labels, they are not shown.

    Your immediate problem, however, is the lack of a sort option on the line graph.

    If I were reviewing this graph as part of a paper for a journal, I would suggest something quite different, a side by side dot chart with horizontal text.

    Your graph design has no scope for showing the labels legibly as each must be squeezed into 1/10 of the axis space. The only evident solution there is putting the text vertically. A line chart has some logic for quintile groups, which have a defined order, but little or no logic for qualitatively distinguished sectors.

    Using dataex (SSC) as we recommend would have reduced the engineering needed here to show your data. Your own graph could improve on this by echoing variable labels. I've sorted on coverage (alphabetical order does not help) but sorting on the other variable may make more sense. I edited the sector names to reduce use of upper case, but I missed one.

    The graph you are copying shows coverage as a percent; in your own graph you show it as a fraction but your variable label says %. I don't understand that, but it looks like a slip.


    Code:
    clear
    input str42 Sector    Coverage    SubminimumwageheadcountRatio
    "Agriculture; hunting; forestry and fish"    .85813358    0.52
    "Mining and quarrying"    .89187858    0.13
    "Manufacturing"    .36191116    0.29
    "Electricity; gas and water supply"    .68654997    0.18
    "Construction"    .13923316    0.36
    "Wholesale and Retail Trade"    .62995644    0.35
    "Transport; storage and communication"    .34724069    0.27
    "Financial services"    .75544079    0.22
    "CSP"    .90706484    0.31
    "Private households"    .9931992    0.80
    end
    
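    compress
    set scheme s1color
    * dot chart of both variables as stored (asis); C and Sub abbreviate
    * Coverage and SubminimumwageheadcountRatio; sort(1) orders sectors by the first
    graph dot (asis) C Sub, over(Sector, sort(1)) ///
    marker(1, ms(Oh)) marker(2, ms(X)) legend(row(2)) ///
    linetype(line) lines(lcolor(gs12) lw(vvthin))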
    [Image: rooney2.png]
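
The percent-versus-fraction mismatch mentioned in #2 could be resolved by rescaling before graphing. The following is only a sketch on a three-row subset of the posted data; the variable name Coverage_pct and the axis title are assumptions made for illustration, not anything used in the thread.

```stata
* Sketch only: Coverage_pct is a hypothetical name, not from the thread
clear
input str42 Sector Coverage
"Manufacturing"      .36191116
"Construction"       .13923316
"Private households" .9931992
end

* rescale the fraction to a percent so the values match a "%" label
generate Coverage_pct = 100 * Coverage
label variable Coverage_pct "Coverage (%)"
format Coverage_pct %4.1f

* same dot chart design as above, now on a 0-100 scale
graph dot (asis) Coverage_pct, over(Sector, sort(1)) ytitle("Coverage (%)")
```

Keeping the original Coverage variable untouched means the fraction is still available if the other variable should stay on a 0 to 1 scale.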

    Last edited by Nick Cox; 10 Nov 2015, 01:51.



    • #3
      Nick gave a very nice and shiny alternative graph. Here is another one in case you want to replicate your example above exactly (you'll probably need to trim the long labels or use horizontal bars instead). I share Nick's comment about your data. If your Sector1 variable is numeric with value labels, just skip the two lines before the twoway command.

      Code:
      clear*
      input str40 Sector    Coverage    SubminimumwageheadcountRatio
      "Agriculture; hunting; forestry and fish"    .85813358    0.52
      "Mining and Quarrying "   .89187858    0.13
      "Manufacturing "   .36191116    0.29
      "Electricity; gas and water supply "   .68654997    0.18
      "Construction "   .13923316    0.36
      "Wholesale and Retail Trade "   .62995644    0.35
      "Transport; storage and communication "   .34724069    0.27
      "Financial Services "   .75544079    0.22
      "CSP "   .90706484    0.31
      "Private Households "   .9931992    0.80
      end
      
      g Sector1=_n
      labmask Sector1, val(Sector) // ssc install labmask
      
      twoway bar Coverage Sector1, ylab(0(.2)1, notick) barwidth(.7) xtitle("") ytitle("") xla(1/10, valuelabel notick ang(90)) || ///
      line SubminimumwageheadcountRatio Sector1, sort



      • #4
        I prefer my own suggestion (surprise!), but two generic points that deserve emphasis:

        1. Bars aren't guaranteed to start at zero. Oded's code achieves this by insisting that the y axis labels include zero.

        2. With a combined bar and line plot, it is difficult to see why one variable is shown one way and not the other way round. It's also hard to give the two variables equal emphasis. The second may be regarded as a feature.
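
Point 1 can be made explicit rather than left to the axis labels. The following is a minimal sketch on a three-row subset of the posted data; sector_id is a hypothetical identifier introduced here, and yscale(range(0)) is one way, not necessarily the only one, to ask for the axis to reach zero.

```stata
* Sketch only: sector_id is a hypothetical name, not from the thread
clear
input byte sector_id Coverage
1 .36191116
2 .62995644
3 .9931992
end

* yscale(range(0)) asks for the y axis to extend to zero,
* so the full height of each bar is shown;
* ylabel(0(.2)1) additionally places a label at zero
twoway bar Coverage sector_id, barwidth(.7) ///
    yscale(range(0)) ylabel(0(.2)1) xlabel(1/3)
```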



        • #5
          Two papers in this territory:

          SJ-11-3 gr0049 . . . . . . . . . . Stata tip 102: Highlighting specific bars
          . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . N. J. Cox
          Q3/11 SJ 11(3):474--477
          tip on highlighting a subset of observations in a bar or
          dot chart

          http://www.stata-journal.com/sjpdf.h...iclenum=gr0049

          SJ-8-2 gr0034 . . . . . . . . . . Speaking Stata: Between tables and graphs
          (help labmask, seqvar if installed) . . . . . . . . . . . . N. J. Cox
          Q2/08 SJ 8(2):269--289
          outlines techniques for producing table-like graphs

          http://www.stata-journal.com/sjpdf.h...iclenum=gr0034



          • #6
            Hi, all -
            I agree with you, Nick. I'm not sure if it's just my friends but I've had a hard time communicating findings with dot plots and other visinfo techniques so near-and-dear to me, in no small part due to your tireless innovations and contributions.

            In addition, many journals charge additional fees for color graphics, which pushes authors toward monochrome.

            As well, most social scientists I encounter prefer the numbers in the graphs.

            Accordingly, Nick, I've used your hbar package to create two bar charts, in monochrome, with summary statistics. Please be forgiving! I'll try to swing by with colorful charts when I can, but until then I may have to make do with bar charts.


            Code:
            ssc install hbar
            graph hbar (asis) C Sub, over(Sector, sort(1)) ///
            blabel(bar, position(above) format(%3.1f)) ///
            scheme(s1mono) legend(row(2)) ///
            plotregion(style(none))
            Nathan E. Fosse, PhD


            • #7
              Thanks for the agreement (and the disagreement too, it seems).

              Some confusion to be cleared up: I wrote an hbar package from 1997 to 1999 for Stata 6 and it remains on SSC but it's nothing at all to do with graph hbar except in an ectoplasmic sense that StataCorp developers were aware of that work when they rewrote Stata's graphics for Stata 8. When anyone types graph hbar they are not invoking my hbar, regardless of whether they installed it first.

              More importantly

              1. The use of colour in #2 is the least important detail and utterly dispensable. That's one reason I chose different markers. Indeed the graph may seem simpler and thus improved if presented in monochrome.

              2. I have nothing against horizontal bar charts, as should already be evident. I've found in straw polls of student groups a preference for bar charts over dot charts for the same data but no-one has a better reason than familiarity. Familiar beats unfamiliar with nothing else said, but it's a comfort to know that your colleagues at Harvard (and elsewhere) are just as fearful as my students of the idea that the position of a point symbol against a scale conveys quantity. It's not true in the current example, but there are many examples in which a bar chart is a lousy choice because the bar origin would be an enormous distance to the left or right of the data or is not even defined. It beats me that researchers will happily yearn for the most complicated models that happen to be fashionable in current literature but often revert to what they used in high school when choosing graphs.

              3. Numbers on graphs are dear to me too and encouraged in various papers of mine. See the 2008 paper cited in #5 and designplot for examples. On the latter

              http://www.statalist.org/forums/foru...riptive-tables

              http://www.stata-journal.com/article...article=gr0061
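
The remark in point 2 about a bar origin lying far from the data can be illustrated with a dot chart that does not force zero onto the scale. This is a sketch with made-up values: the variables site and level and all the numbers are hypothetical, chosen only so that zero is a long way from the data.

```stata
* Sketch only: site, level and the values are invented for illustration
clear
input str8 site level
"A" 1012.4
"B" 1013.1
"C" 1011.8
"D" 1014.0
end

* exclude0 lets the numeric axis omit zero, so differences among
* the points stay visible; bars anchored at zero would all look alike
graph dot (asis) level, over(site, sort(1)) exclude0
```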



              • #8
                Thanks, Nick. Well put regarding the disjuncture between complex models (e.g., SEM) and "high school" graphs. Thanks for correcting the provenance of hbar, which somehow slipped my mind as well. Thanks for the reference to designplot - I'll continue to try those too.

                In any case some graphs are better than no graphs. Andy Gelman pointed out in his review of Freakonomics how surprising it is that such a "rogue" set of analyses present no visinfo of the data!
                Nathan E. Fosse, PhD


                • #9
                  Hi all,

                  Thanks for this lively discussion.

                  The graph is not for a journal (at least not yet) but rather a report for a government department.

                  I share the sentiments of Nick and Nathan's students regarding dot graphs. In my case, it is definitely due to familiarity and as you said, I feel that the dots would not convey quantity.

                  Having said that, I will endeavour to think of a dot plot next time I am confronted with indecision on which type of graph to choose.

                  I have saved your papers Nick so I can read them in future.



                  • #10
                    Thanks for the closure.

                    Do you avoid scatter plots too? Same principle: position of point relative to axis conveys magnitude. (Treat the question as rhetorical if you like.)



                    • #11
                      Originally posted by Nick Cox:
                      Thanks for the closure.

                      Do you avoid scatter plots too? Same principle: position of point relative to axis conveys magnitude. (Treat the question as rhetorical if you like.)
                      No, I freely use scatterplots, but that is generally when I want to show a relationship/correlation between two variables. To my mind, the magnitudes aren't so important in these cases. For example, if I was graphing GDP Per Capita v average electricity consumption, the actual magnitude of the two variables wouldn't matter to me.



                      • #12
                        I understand that emphasis. But a scatter plot is useful only insofar as it conveys actual magnitudes. And often it is essential to look at the axes to see what is going on, e.g. to think about outliers.
