  • vennbar now available from SSC

    Thanks as ever to Kit Baum, a new package vennbar is now available from SSC from Tim Morris and myself.

    This completes publication (making public) of a trio of commands from a project started on 8 September 2022 in the Fitzroy Tavern, Charlotte Street, London during the London Stata Users' meeting.

    The trio started with jaccard (a small deal)

    https://www.statalist.org/forums/for...lable-from-ssc

    and continued with upsetplot (a bigger deal than jaccard).

    https://www.statalist.org/forums/for...lable-from-ssc

    vennbar honours John Venn (he of Venn diagrams) but not in the observance: as with upsetplot the aim is to present alternatives to annotated Venn diagrams showing counts (more generally abundances) of overlapping subsets.

    So why are there two commands? The answer is twofold. First, a piece of personal history that you need not care about. Second, a more fundamental reason.

    When Tim showed me upset plots -- at this point I must have forgotten Andrew Musau's fine post at https://www.statalist.org/forums/for...mptoms-graphic -- my very first reaction was that an existing command already was available, which was nonsense, so no more of that. My first reaction after that was that I didn't much like the design and thought that you could get something as good or even better just by working out frequencies and feeding them to graph hbar -- or if you prefer graph bar or graph dot. So that is how vennbar started. Later it became clear that some things are easier or work better with a wrapper of that kind -- while other things are easier or work better with a wrapper for graph twoway, which is how upsetplot started. Also, I like the upsetplot design much more than I did. It's a platitude of monumental proportions that you have to try graph types out on real data, not just read about them.
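
    The idea of "working out frequencies and feeding them to graph hbar" can be sketched with official Stata commands only. This is a minimal illustration under stated assumptions, not the code that vennbar runs: the dataset and indicator variables are borrowed from the example later in this thread, and the variable names n_obs and combo are hypothetical.

    ```stata
    * A sketch only, NOT vennbar's implementation.
    * Dataset and indicators are taken from the nlswork example in this thread.
    webuse nlswork, clear

    * Reduce the data to one observation per distinct combination of the
    * indicators, with its frequency in n_obs (hypothetical name);
    * nomiss drops observations with missing values on any indicator.
    contract nev_mar c_city collgrad south, freq(n_obs) nomiss

    * Build a text label such as "1 0 0 1" for each combination
    * (combo is a hypothetical name).
    egen combo = concat(nev_mar c_city collgrad south), punct(" ")

    * One horizontal bar per combination, largest frequency first.
    graph hbar (asis) n_obs, over(combo, sort(1) descending) ///
        ytitle("Number of observations")
    ```

    The continuation marker /// means the last command should be run from a do-file or the Do-file Editor. A wrapper such as vennbar adds much more than this (see its help file), so treat the sketch as a picture of the starting point only.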

    So much for the personal history. The fundamental reason that remains is that there are two commands because they are wrappers for quite different Stata graph commands and neither has a monopoly of virtues.

    At this point I am going to assume that if this interests you, then either you have looked at the recent posts first linked above, or you can now do that to get more flavour. In any case, the help file for vennbar really is quite detailed. So, just a few token graphs by way of a taster.

    [Image: VB1.png]

    [Image: VB2.png]

    [Image: VB9.png]



    PS Tim asked earlier -- If one command is called upsetplot why is this one not called vennbarchart? to which the only answer I could think of was that I didn't want to keep typing vennbarchart.
    Last edited by Nick Cox; 10 Jan 2023, 09:41.

  • #2
    Not as colorful as -upsetplot-, it is nevertheless easier to interpret.


    • #3
      Chen Samulsion That's one vote. I think you would be in a minority there!


      • #4
        I like both. I have explored upsetplot in the last few days. Maybe in practice I will hedge: use -upsetplot- to present (audiences like colorful and innovative things), and use -vennbar- to interpret.


        • #5
          I recently reviewed Alan Smith. 2022. How Charts Work. Harlow: Pearson. This is what he says:

          The journey towards infographic hell always begins by valuing style over meaningful communication. (p.216)


          • #6
            Nick, I just read your book review on Amazon. I agree with you that some people are more interested in flaunting their creativity than in what the data may say. I like Stata's slogan "Your data tell a story", and I think the overriding aim of statistical graphs is to tell a story accurately in an uncluttered form. When we value style over meaning, we should remind ourselves that too much water drowned the miller.


            • #7
              Hi,

              I am trying to include bar labels, as Nick shows in #1, but this time using the upsetplot command.

              Code:
              * EXAMPLE 4
              * various indicators in nlswork.dta
              webuse nlswork, clear
              local toptitle "t1title(Number of people)"
              label var nev_mar "never married"
              label var c_city "central city"
              label var collgrad "college graduate"
              label var south "South"
              upsetplot nev_mar c_city collgrad south, varlabels baropts(`toptitle' `bcolour') name(UP10, replace) ms(none) mlabc(black) mla(_count) mlabpos(12) mlabsize(small)
              However, some labels are missing:

              [Image: Capturar.PNG]


              • #8
                Sorry for repeating the image; it is the same.
                I need to read the section on uploading images in the forum's tutorial again.


                • #9
                  Thanks for the report. We'll take a look to try to work out what's happening.


                  • #10
                    (NB despite the posting #7 in this thread on vennbar, this is indeed a query about upsetplot.)

                    The resolution is as follows. By adding these options at the end of the command

                    Code:
                     ms(none) mlabc(black) mla(_count) mlabpos(12) mlabsize(small)
                    you are in effect over-writing the first marker symbol call. If you run the command without these options, you will see solid circles for the 8/16 subsets in which "never married" is 1. You are replacing those with marker labels. The reason that some counts are missing is that in 8/16 subsets -- those in which "never married" is 0 -- there is no marker symbol to replace. The remedy is to put your extra options within a bag or container option

                    Code:
                    labelopts(ms(none) mlabc(black) mla(_count) mlabpos(12) mlabsize(small))
                    the result of which incidentally shows the reason for mlabpos(12) as the marker labels are placed on the top of each bar.
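
                    Put together with the example in #7, the whole command would look something like the sketch below. It assumes that upsetplot has been installed from SSC and is the version current at the time of this post; a later post in this thread reports that _count was renamed _freq and that bars are then labelled by default, so with newer files this may need adjusting or may be unnecessary. The undefined local `bcolour' in #7 expands to nothing and is left out here.

                    ```stata
                    * Sketch combining the example in #7 with labelopts() from this post.
                    * Assumes upsetplot is installed (ssc install upsetplot) in the version
                    * current at the time of this post; _count may be _freq in later versions.
                    webuse nlswork, clear
                    local toptitle "t1title(Number of people)"
                    label var nev_mar "never married"
                    label var c_city "central city"
                    label var collgrad "college graduate"
                    label var south "South"

                    * Marker label options go inside labelopts(), not at the top level.
                    upsetplot nev_mar c_city collgrad south, varlabels ///
                        baropts(`toptitle') name(UP10, replace)        ///
                        labelopts(ms(none) mlabc(black) mla(_count) mlabpos(12) mlabsize(small))
                    ```

                    Because the local macro and the /// continuation lines are used, this should be run in one go from a do-file or the Do-file Editor.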

                    Last edited by Nick Cox; 20 Mar 2023, 16:38.


                    • #11
                      Revised files are now up on SSC, thanks as always to Kit Baum.

                      We have made many minor tweaks to both code and help file. In particular, a variable hitherto called _count is now called _freq and the default is to label bars with the magnitudes they show. We hope no existing user is upset by any changes.


                      • #12
                        Stata Journal paper now out at https://journals.sagepub.com/doi/pdf...6867X241258010
