
  • Creating connected scatter plot with pre and post data

    Hello,

    I am trying to create a scatter (bubble?) plot showing the values of my pre- and post-intervention data for both control and intervention participants. I have two variables holding my outcome (one for each time period), named prop2011 and prop2017. I would like 2011 and 2017 to be on the x-axis and the values of prop2011 and prop2017 to appear on the y-axis. I would also like a line connecting the two points for each ID number (idn). Additionally, I would like the size of each circle to correspond to the variables denom2011 and denom2017. I have included a dataex sample of my data below. Any help would be much appreciated!

    Thank you,

    Sarah


    Code:
    * Example generated by -dataex-. To install: ssc install dataex
    clear
    input float(idn intervention prop2011 prop2017) double(denom2011 denom2017)
     1 0          0  .15151516   19   33
     2 0        .35   .4545455   40   33
     3 0  .14035088        .48   57   25
     4 0  .14583333    .171875  144  192
     5 0   .3421053   .4864865   76   37
     6 0  .24489796  .17567568   98   74
     7 0   .3333333   .8333333    6    6
     8 0   .2108626   .3072289  313  166
     9 0  .05769231  .06060606   52   66
    10 0  .14285715   .0888889   14   45
    11 0   .3529412  .08333334   34   48
    12 0  .14503817        .15  131  280
    13 0   .4340659         .4  182  110
    14 0   .3305085   .3970588  236  136
    15 0   .1948052   .2368421   77   38
    16 0  .23741007  .28703704  139  216
    17 0  .12244898   .1764706   49   17
    18 0  .24324325   .3571429  111  168
    19 0          0  .13761468   40  109
    20 0  .06666667      .1875   30   32
    21 0  .11688311  .15966387   77  119
    22 0   .0909091          0   11    2
    23 0  .14814815   .1627907   27   43
    24 0  .15463917  .20689656   97   29
    25 0   .3376623  .21518987   77   79
    26 0  .26666668  .24444444   15   45
    27 0   .2857143   .3809524  105   63
    28 0   .2105263   .3181818   19   22
    29 1   .6666667        .25    6    8
    30 0  .19786096  .14141414  187  198
    31 0  .15068494   .2352941   73   34
    32 0  .04166667   .1521739   72   92
    33 0   .1392405  .23030303  158  165
    34 0         .2  .22702703   15  185
    35 0   .2142857  .05555556   14   18
    36 0  .09803922  .25581396   51   86
    37 0   .0775862  .10454545  116  220
    38 0   .1846154  .13043478   65   46
    39 0  .11428571  .06896552   35   58
    40 0  .19902913   .2747604  206  313
    41 0  .06896552  .16030534   58  131
    42 0  .12121212   .1719457  132  221
    43 0   .7733333 .037037037  150   27
    44 0  .13513513  .10526316   37   76
    45 1  .13793103        .25   29   20
    46 0          0          0    1   10
    47 0  .07792208   .1588785   77  107
    48 0        .25   .2857143   16    7
    49 0  .02222222   .0406504   90  123
    50 0  .09302326          0   43    3
    51 0   .3918919      .2875   74   80
    52 0  .06666667         .2   45   45
    53 0        .44   .6578947   25   38
    54 0   .0952381  .16666667   21   18
    55 0  .08571429  .27272728   35   22
    56 0  .12322275   .1796875  211  128
    57 0  .12658228  .15384616   79  117
    58 0   .0923077  .14346895  325  467
    59 0  .11111111  .09243698  135  119
    60 0   .3448276  .17021276   29   47
    61 0  .27407408        .25  135   56
    62 1     .21875   .1764706   32   17
    63 0  .08396947  .13061224  131  245
    64 0  .29166666  .15384616   24   26
    65 0  .22330096   .3219178  103  146
    66 0   .0909091  .08333334   44   36
    67 1  .05813954         .2   86  120
    68 0  .06451613   .3235294   31   34
    69 1  .13333334        .05   15   60
    70 0  .16216215  .27184466  111  103
    71 0   .3370166   .4136126  181  191
    72 0  .15315315  .30612245  222  147
    73 0   .2826087   .2063492   92   63
    74 0  .13475177  .14754099  141   61
    75 0   .4166667  .13043478   36   23
    76 0  .29411766   .2352941   17   17
    77 0   .3333333   .6666667   30    6
    78 0  .14285715  .29032257   56   62
    79 0   .7333333   .6666667   15    6
    80 0  .24752475   .3492064  101   63
    81 0  .08163265       .125   49   48
    82 0   .2631579        .25   19    8
    83 0   .1818182        .25   55   20
    84 0   .3303965   .4039216  227  255
    85 0   .1728395  .19148937   81   94
    86 0  .06324111 .027333334 1265 1500
    87 0  .08333334  .05882353   60   51
    88 1  .03448276 .071428575   29   28
    89 0  .04225352  .10447761   71  134
    90 0  .25757575   .2093023  132  129
    91 1  .10714286        .04   28   50
    92 0  .12280702  .09920635  114  252
    93 0       .125   .2857143   16    7
    94 0 .069518715  .20714286  187  140
    end

  • #2
    Thanks for the data example.

    A while back a meme in graphics circles was that you should want readers to say Aha! (meaning, I see the structure in these data), not Wow! (meaning, how did you do that? or -- perhaps -- there is a lot of information in there!). My own thought was that the meme needed at least a third, Huh? (meaning, this just looks like a mess, unfortunately).

    I am not a fan of bubble plots, which seemed to work well for Hans Rosling but, in my experience, rarely otherwise. The trick is to select the data for which the method works....

    All that said as preamble, or prejudice, I think the first graph here is what you're asking for, but I can't admire it. I am hard put to it to suggest something better, but I played around a bit. Notice that I've not even begun to look at intervention and control.

    [Figure: bubble.png]

    [Figure: arrow.png]

    [Figure: dot.png]


    Here's the code I used

    Code:
    set scheme s1color
    * constant x positions for the two time points
    gen y2011 = 2011
    gen y2017 = 2017

    * graph 1: spikes join each ID's 2011 and 2017 values; bubble size reflects the denominator
    twoway pcspike prop2011 y2011 prop2017 y2017, lc(black) ///
    || scatter prop2011 y2011 [w=denom2011], ms(Oh) mc(red) ///
    || scatter prop2017 y2017 [w=denom2017], ms(Oh) mc(blue) ///
    ysc(r(-0.1 .)) yla(0 "0" 0.2(0.2)0.8, format("%02.1f") ang(h)) ytitle(proportion) ///
    xla(2011 2017) xsc(r(2010 2018)) legend(off) name(bubble, replace)

    * graph 2: arrows from (denom2011, prop2011) to (denom2017, prop2017), denominator on a log scale
    scatter prop2011 denom2011, ms(Oh) mc(red) ///
    || pcarrow prop2011 denom2011 prop2017 denom2017, lc(blue) mc(blue) ///
    xsc(log) xla(1 3 10 30 100 300 1000) ytitle(proportion) xtitle(denominator) ///
    legend(order(1 "2011" 2 "to 2017")) yla(0 "0" 0.2(0.2)0.8, format("%02.1f") ang(h)) name(arrow, replace)

    * graph 3: dot chart of both years, sorted on the 2017 value and split into three panels
    sort prop2017
    label var prop2017 "2017"
    label var prop2011 "2011"
    gen which = ceil(3 * _n/_N)
    graph dot (asis) prop2011 prop2017, over(idn, sort(2) label(labsize(small))) ///
    by(which, subtitle(Proportion of whatever, pos(12)) col(3) note("")) ///
    nofill subtitle("", pos(9) nobox nobexpand) ///
    marker(1, ms(Oh) mc(red)) marker(2, ms(+) mc(blue)) ysc(r(-0.05 .) alt) ///
    yla(0 "0" 0.2(0.2)0.8, format("%02.1f")) linetype(line) lines(lc(gs12) lw(vthin)) name(dot, replace)



    • #3
      Just to add that none of these designs will look any clearer with many more than 94 subjects.



      • #4
        Hi Nick! Wow! Thank you for all of this. I suppose something like the first graph was what I was looking for except I was hoping that the colors would distinguish the intervention participants (intervention=1) from the control participants (intervention=0). Also, do you know a way to make the dots closed? I agree, it's a little messy looking... please do let me know if you think of a more elegant way.



        • #5
          Yes, naturally, you can have filled in circles with ms(O). That surely makes a bad problem worse.

          There is more technique at https://www.stata-journal.com/sjpdf....iclenum=gr0041
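
          For the colouring asked about in #4, one possibility is to draw the control and intervention groups as separate plots within the same twoway call, each with its own colour and with ms(O) for filled circles. This is only a sketch, not code posted in the thread: it assumes the dataex example from #1 is in memory, and the colours (gs10, red) and the graph name bubble2 are arbitrary choices made for illustration.

          Code:
          * assumes the dataex example from #1 is in memory
          capture gen y2011 = 2011    // x positions; skipped if they already exist, as after the code in #2
          capture gen y2017 = 2017
          * controls are drawn first so that the few intervention cases sit on top
          twoway pcspike prop2011 y2011 prop2017 y2017 if intervention == 0, lc(gs10) ///
          || pcspike prop2011 y2011 prop2017 y2017 if intervention == 1, lc(red) ///
          || scatter prop2011 y2011 if intervention == 0 [w=denom2011], ms(O) mc(gs10) ///
          || scatter prop2017 y2017 if intervention == 0 [w=denom2017], ms(O) mc(gs10) ///
          || scatter prop2011 y2011 if intervention == 1 [w=denom2011], ms(O) mc(red) ///
          || scatter prop2017 y2017 if intervention == 1 [w=denom2017], ms(O) mc(red) ///
          ysc(r(-0.1 .)) yla(0 "0" 0.2(0.2)0.8, format("%02.1f") ang(h)) ytitle(proportion) ///
          xla(2011 2017) xsc(r(2010 2018)) legend(order(3 "control" 5 "intervention")) ///
          name(bubble2, replace)

          The legend picks up plots 3 and 5 (the 2011 bubbles for each group) because colour, not year, is what distinguishes the groups here. Filled bubbles hide one another more than open ones do, so the caution above about ms(O) still applies.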

