Sparkline with line breaks

Fahad Mirza

Join Date: Sep 2018
Posts: 241

07 Mar 2022, 14:30

Hello,

I have a question regarding sparkline by Nick Cox.

I am trying to make the plot and would like to see gaps where data are missing for some dates. However, when I use the cmissing(n) option, it has no effect. The data are as follows:

Code:

* Example generated by -dataex-. For more info, type help dataex
clear
input float(mth pct_Dist_obsm1 pct_Dist_obsm2 pct_Dist_obsm3 pct_Dist_obsm4)
709  41.09735 14.581884  14.33607 29.984695
710  20.50663  23.56221   11.3355  44.59566
711 31.557364 35.394505 10.668147 22.379984
712 33.949966 22.997416  11.68076  31.37186
713  29.22156  53.09381   4.91018  12.77445
714  36.69355  31.38441 13.239247 18.682796
715         .         .         .         .
716  31.59121  47.27031 11.617843  9.520639
717 33.119682   41.9415  6.957231 17.981586
718    25.508 15.890158 18.045576  40.55627
719  41.26322  20.69178 12.274025 25.770975
720  25.46957  22.52475 12.605562  39.40012
721 33.942047   25.3424  11.94447  28.77108
722         .         .         .         .
723         .         .         .         .
724         .         .         .         .
725         .         .         .         .
726         .         .         .         .
727         .         .         .         .
728 21.392765 33.484325  13.04584  32.07707
729  41.59512  20.48501 13.277602 24.642265
730 36.606453  19.53065  12.05805  31.80485
731 30.773497  42.40366  12.18464 14.638204
732 38.100067   33.1643 10.141988 18.593645
733  24.38507  13.69805 14.291773  47.62511
734         .         .         .         .
735 34.950703 14.799165 18.471027 31.779106
736  32.90823 26.678156  13.36493 27.048685
737  29.32084 25.089773  12.47463 33.114754
738  27.54687 14.353735 17.438745  40.66065
739  32.05805 21.405014  12.05475  34.48219
740  20.69971 23.790087 18.381924  37.12828
741 29.435526  25.36409   12.8415 32.358883
742  24.72497 12.734748 16.724081   45.8162
743   26.7293 31.071676  9.638135  32.56089
end

These are the final data that need to be plotted using sparkline.

Can anyone help me figure out how to get this done? Or is there a way to make a sparkline-type plot using twoway line?

I tried using the following example code for making a sparkline type plot with line breaks:

Code:

    generate pct_Dist_obsm2_2 = pct_Dist_obsm2 + 50
    generate pct_Dist_obsm3_2 = pct_Dist_obsm3 + 110
    generate pct_Dist_obsm4_2 = pct_Dist_obsm4 + 150

    twoway (line pct_Dist_obsm1 mth, cmissing(n)) ///
        (line pct_Dist_obsm2_2 mth, cmissing(n)) ///
        (line pct_Dist_obsm3_2 mth, cmissing(n)) ///
        (line pct_Dist_obsm4_2 mth, cmissing(n))

It would be great if someone could help out.

Thank you.

P.S. As a suggestion to Nick Cox: it would be great if this command allowed adding plots in some way.

Tags: graph, line, line break, sparkline, twoway

Nick Cox

Join Date: Mar 2014
Posts: 35698

07 Mar 2022, 18:39

Thanks for the interesting question and data example.

When people (from Edward Tufte on) show sparkline examples they often look great. Who would choose poor examples? For many datasets -- in my experience -- they often turn out disappointing.

sparkline is from SSC (2013), as you are asked to explain (FAQ Advice #12). I've not used it much since I wrote it. Somewhere in the middle of the code I drop missing values temporarily for some reason, which is why cmissing(n) is legal but changes nothing. There is possibly a rewriting of the code that gets you what you asked for. You're welcome to clone and rewrite it under a different name. But the way it is written essentially rules out anything like addplot(). It's rearranging data in a space designed for the purpose, and no other purpose is compatible.
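The effect of dropping missing values can be reproduced with plain twoway line. This is a minimal sketch of the general mechanism, not sparkline's actual code; the single variable name pct and the graph names are illustrative only.

```stata
clear
input float(mth pct)
713 29.22156
714 36.69355
715 .
716 31.59121
717 33.119682
end

* Missing value still in the plotted data: cmissing(n) breaks the line at 715
twoway line pct mth, cmissing(n) name(gap, replace)

* Missing observation excluded first: 714 is joined straight to 716,
* so cmissing(n) is accepted but has nothing to act on
twoway line pct mth if !missing(pct), cmissing(n) name(nogap, replace)
```

Comparing the two graphs shows why the option is legal yet makes no visible difference once missing values have been removed.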

However, I didn't find sparkline helpful for these variables, even with that limitation. They are evidently components that add to 100% and so arguably should be presented on the same scale.
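A quick way to confirm that the components add to 100% is to check the row totals. The sketch below uses four rows from the data example; the variable name total and the tolerance of 0.01 are choices made here for illustration.

```stata
clear
input float(mth pct_Dist_obsm1 pct_Dist_obsm2 pct_Dist_obsm3 pct_Dist_obsm4)
713  29.22156  53.09381   4.91018  12.77445
714  36.69355  31.38441 13.239247 18.682796
715         .         .         .         .
716  31.59121  47.27031 11.617843  9.520639
end

* Row total; with the missing option the total stays missing
* when all four components are missing
egen double total = rowtotal(pct_Dist_obsm?), missing

* Allow for rounding in the stored floats
assert abs(total - 100) < 0.01 if !missing(total)

list mth total
```

The same two commands can be run on the full dataset; the assert stops with an error if any non-missing row strays from 100 by more than the tolerance.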

This seems to me to be a better representation, one that is honest about the gaps, and there will be others as good or better. This is what I tend to do instead: rely heavily on a by() option to do most of the work.

For the tiny trickery with year labels, see https://www.stata-journal.com/articl...article=gr0030

For the general idea of using by() here see https://www.stata-journal.com/articl...article=gr0085
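The year-label trick puts ticks at the boundaries between years and text labels at the middle of each year. As a hedged sketch, the positions can be computed rather than typed, assuming that mth counts months since 1960m1 (Stata's %tm convention); the thread does not state how mth is coded.

```stata
* Assumption: mth follows Stata's monthly date convention (0 = 1960m1)
display ym(2019, 1)     // January 2019 as a monthly date
display %tm 709         // the first month in the data example, under that assumption

* Year boundaries: half a month before each January
forvalues y = 2019/2022 {
    display "boundary before `y': " ym(`y', 1) - 0.5
}

* Label positions: midway between January and December of each year
forvalues y = 2019/2021 {
    display "label for `y': " (ym(`y', 1) + ym(`y', 12)) / 2
}
```

Under that assumption the boundaries and label positions come out one month earlier than the 708.5 and 714.5 used in the posted code, so it is worth checking how mth is coded in your own data before copying either set of numbers.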

Code:

* Example generated by -dataex-. For more info, type help dataex
clear
input float(mth pct_Dist_obsm1 pct_Dist_obsm2 pct_Dist_obsm3 pct_Dist_obsm4)
709  41.09735 14.581884  14.33607 29.984695
710  20.50663  23.56221   11.3355  44.59566
711 31.557364 35.394505 10.668147 22.379984
712 33.949966 22.997416  11.68076  31.37186
713  29.22156  53.09381   4.91018  12.77445
714  36.69355  31.38441 13.239247 18.682796
715         .         .         .         .
716  31.59121  47.27031 11.617843  9.520639
717 33.119682   41.9415  6.957231 17.981586
718    25.508 15.890158 18.045576  40.55627
719  41.26322  20.69178 12.274025 25.770975
720  25.46957  22.52475 12.605562  39.40012
721 33.942047   25.3424  11.94447  28.77108
722         .         .         .         .
723         .         .         .         .
724         .         .         .         .
725         .         .         .         .
726         .         .         .         .
727         .         .         .         .
728 21.392765 33.484325  13.04584  32.07707
729  41.59512  20.48501 13.277602 24.642265
730 36.606453  19.53065  12.05805  31.80485
731 30.773497  42.40366  12.18464 14.638204
732 38.100067   33.1643 10.141988 18.593645
733  24.38507  13.69805 14.291773  47.62511
734         .         .         .         .
735 34.950703 14.799165 18.471027 31.779106
736  32.90823 26.678156  13.36493 27.048685
737  29.32084 25.089773  12.47463 33.114754
738  27.54687 14.353735 17.438745  40.66065
739  32.05805 21.405014  12.05475  34.48219
740  20.69971 23.790087 18.381924  37.12828
741 29.435526  25.36409   12.8415 32.358883
742  24.72497 12.734748 16.724081   45.8162
743   26.7293 31.071676  9.638135  32.56089
end

reshape long pct_Dist_obsm, i(mth) j(which)

rename pct_Dist_obsm pct 

separate pct, by(which) veryshortlabel 

set scheme s1color 

twoway bar pct? mth, ///
    by(which, compact col(1) note("") legend(off) r1title(some story here)) ///
    ytitle(pct of whatever) yla(0(10)50, ang(h)) ///
    xtick(708.5(12)744.5, tlen(*4)) ///
    xla(714.5 "2019" 726.5 "2020" 738.5 "2021", tlc(none)) xtitle("") ///
    subtitle(, pos(3) size(*1.3) nobox nobexpand fcolor(none)) ///
    blcolor(red blue black magenta) ///
    bfcolor(red*0.2 blue*0.2 black*0.2 magenta*0.2)
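To see what the reshape and separate steps hand to the graph command, it may help to list the result for a couple of months. This is a small sketch on two rows of the data example, with nothing new beyond the list command.

```stata
clear
input float(mth pct_Dist_obsm1 pct_Dist_obsm2 pct_Dist_obsm3 pct_Dist_obsm4)
714  36.69355  31.38441 13.239247 18.682796
715         .         .         .         .
end

* Wide to long: one observation per month and component (2 x 4 = 8 rows)
reshape long pct_Dist_obsm, i(mth) j(which)
rename pct_Dist_obsm pct

* One variable per component: pct1 holds values only where which == 1, and so on
separate pct, by(which) veryshortlabel

list mth which pct pct1-pct4, sepby(mth) noobs
```

Because each of pct1 to pct4 is non-missing in only one group of which, the wildcard pct? plots one variable per by() panel, and each variable can be given its own colour.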

[Image: sparkline_not.png]

Nick Cox

Join Date: Mar 2014
Posts: 35698

08 Mar 2022, 03:35

Naturally a line or connected version of this is possible.

Code:

twoway connected pct? mth, cmissing(n n n n) ///
    by(which, compact col(1) note("") legend(off) r1title(some story here)) ///
    ytitle(pct of whatever) yla(0(10)50, ang(h)) ///
    xtick(708.5(12)744.5, tlen(*4)) ///
    xla(714.5 "2019" 726.5 "2020" 738.5 "2021", tlc(none)) xtitle("") ///
    subtitle(, pos(3) size(*1.3) nobox nobexpand fcolor(none)) ///
    mcolor(red blue black magenta) lcolor(red blue black magenta)

[Image: sparkline_not2.png]

Nick Cox

Join Date: Mar 2014
Posts: 35698

08 Mar 2022, 09:26

You could get to #4 with multiline from SSC. https://www.statalist.org/forums/for...ailable-on-ssc

Code:

set scheme s1color 

forval j = 1/4 { 
    label var pct_Dist_obsm`j' "`j'"
}

multiline pct* mth, missing cmissing(n n n n) recast(connect) ///
    by(compact col(1) note("") legend(off) r1title(some story here)) ///
    ytitle(pct of whatever) yla(0(10)50, ang(h)) ///
    xtick(708.5(12)744.5, tlen(*4)) ///
    xla(714.5 "2019" 726.5 "2020" 738.5 "2021", tlc(none)) xtitle("") ///
    subtitle(, pos(3) size(*1.3) nobox nobexpand fcolor(none)) ///
    mcolor(red blue black magenta) lcolor(red blue black magenta) separate


Fahad Mirza

Join Date: Sep 2018

Posts: 241
#5

09 Mar 2022, 01:10

These are awesome examples! I like how the bar version depicts the information much better than the line version. Thank you also for showing the use of by(); I did not know that it could be used to arrange plots like this.
Nick Cox

Join Date: Mar 2014

Posts: 35698
#6

09 Mar 2022, 01:45

Glad it helped. My conclusions:

1. For your data, a bar chart has the edge, given their compositional character.

2. sparkline from SSC is there for experiment and if you like the results, that’s good. I am much more likely to maintain and develop multiline, also from SSC.

Last edited by Nick Cox; 09 Mar 2022, 01:51.

Nick Cox

Join Date: Mar 2014
Posts: 35698

09 Mar 2022, 11:49

This gets the bars directly with multiline (SSC).

Code:

* Example generated by -dataex-. For more info, type help dataex
clear
input float(mth pct_Dist_obsm1 pct_Dist_obsm2 pct_Dist_obsm3 pct_Dist_obsm4)
709  41.09735 14.581884  14.33607 29.984695
710  20.50663  23.56221   11.3355  44.59566
711 31.557364 35.394505 10.668147 22.379984
712 33.949966 22.997416  11.68076  31.37186
713  29.22156  53.09381   4.91018  12.77445
714  36.69355  31.38441 13.239247 18.682796
715         .         .         .         .
716  31.59121  47.27031 11.617843  9.520639
717 33.119682   41.9415  6.957231 17.981586
718    25.508 15.890158 18.045576  40.55627
719  41.26322  20.69178 12.274025 25.770975
720  25.46957  22.52475 12.605562  39.40012
721 33.942047   25.3424  11.94447  28.77108
722         .         .         .         .
723         .         .         .         .
724         .         .         .         .
725         .         .         .         .
726         .         .         .         .
727         .         .         .         .
728 21.392765 33.484325  13.04584  32.07707
729  41.59512  20.48501 13.277602 24.642265
730 36.606453  19.53065  12.05805  31.80485
731 30.773497  42.40366  12.18464 14.638204
732 38.100067   33.1643 10.141988 18.593645
733  24.38507  13.69805 14.291773  47.62511
734         .         .         .         .
735 34.950703 14.799165 18.471027 31.779106
736  32.90823 26.678156  13.36493 27.048685
737  29.32084 25.089773  12.47463 33.114754
738  27.54687 14.353735 17.438745  40.66065
739  32.05805 21.405014  12.05475  34.48219
740  20.69971 23.790087 18.381924  37.12828
741 29.435526  25.36409   12.8415 32.358883
742  24.72497 12.734748 16.724081   45.8162
743   26.7293 31.071676  9.638135  32.56089
end

set scheme s1color 

forval j = 1/4 { 
    label var pct_Dist_obsm`j' "`j'"
}

multiline pct* mth, missing cmissing(n n n n) recast(bar) ///
    by(compact col(1) note("") legend(off) r1title(some story here)) ///
    ytitle(pct of whatever) yla(0(10)50, ang(h)) ///
    xtick(708.5(12)744.5, tlen(*4)) ///
    xla(714.5 "2019" 726.5 "2020" 738.5 "2021", tlc(none)) xtitle("") ///
    subtitle(, pos(3) size(*1.3) nobox nobexpand fcolor(none)) ///
    blcolor(red blue black magenta) ///
    bfcolor(red*0.2 blue*0.2 black*0.2 magenta*0.2) separate


Fahad Mirza

Join Date: Sep 2018

Posts: 241
#8

10 Mar 2022, 00:13

I was not previously aware of multiline. Thank you for sharing; I will definitely be exploring it more!