Break in both x and y axis

Annabelle Coleman

Join Date: May 2023
Posts: 44

25 Oct 2023, 04:41

Dear Statalist users,

I am having trouble creating a break in the axis of a graph. I have tried to use https://www.stata.com/support/faqs/g.../scale-breaks/ but I am having trouble creating the break and combining the graphs with a break in both x and y axes.

This is the code I am currently using. I am trying to make a break so the data points from 0-80 are the majority of the graph and then points at 120 and 240 are on the end with breaks.

I have also attached an image of the graph and I am more than happy to share the data to work with.

Code:

        drop if Objective!="A00"
        drop if Processingdelay==4 | NfL==.
        keep NfL ID_visit sample_delay disease_grp
        reshape wide NfL , i(ID_visit) j(sample_delay) string //creates variables of sample_type Ven_p Ven_S Cap_P Cap_S for scatter plots

        ds ID* dis* , not // lists every variable except ID* and dis*, i.e. the reshaped NfL variables (Ven_p Ven_S Cap_P Cap_S)
        local Obj_type `r(varlist)' // local for variables  Ven_p Ven_S Cap_P Cap_S
        
        local axislabelformat labsize(medium) tlwidth(medium)
        local axislineformat aspectratio(1) xscale(lwidth(vthin)) yscale(lwidth(vthin))
        local scattercircle msymbol(circle) msize(large) mlcolor(none)
        
        include "0.1 GRAPHICS_OD.do" //contains colours for scatter plots
        
    foreach s_type1 in `Obj_type' { //** Open double loop to run through code comparing each sample_type in varlist with each other in list
        foreach s_type2 in `Obj_type'  {
            
            
        twoway (lfitci `s_type1' `s_type2', clwidth(vthin) clcolor("`INF_Blue_Light'")) ///
            (scatter `s_type1' `s_type2' if disease_grp==0 , `scattercircle' mcolor("`INF_grey'")) ///
            (scatter `s_type1' `s_type2' if disease_grp==1 , `scattercircle' mcolor("`INF_Red_Light'")) ///
            (scatter `s_type1' `s_type2' if disease_grp==2 , `scattercircle' mcolor("`INF_Red'")) ///
            (scatter `s_type1' `s_type2' if disease_grp==3 , `scattercircle' mcolor("`INF_Indigo_Light'")) ///
            (scatter `s_type1' `s_type2' if disease_grp==4 , `scattercircle' mcolor("`INF_Indigo'")) ///    
            (scatter `s_type1' `s_type2' if disease_grp==5 , `scattercircle' mcolor("`INF_Indigo_Dark'")) ///    
            (scatter `s_type1' `s_type2' if disease_grp==6 , `scattercircle' mcolor("`INF_Green_Light'")) ///
            (scatter `s_type1' `s_type2' if disease_grp==7 , `scattercircle' mcolor("`INF_green'")) ///    
            (scatter `s_type1' `s_type2' if disease_grp==8 , `scattercircle' mcolor("`INF_gold'")) ///    
            (function y=x , range(``obj'`analyte'y') lcolor("`INF_Blue'") lwidth(vthin)), ///
            legend(off) ytitle("`s_type1'", size(medthick)) xtitle("`s_type2'", size(medthick)) ///
            xlabel("0(20)80", `axislabelformat') xscale(range("0 80")) ///
            ylabel("0(20)80", `axislabelformat') yscale(range("0 80")) ///
            `axislineformat' ///
            name(`s_type1'_`s_type2'_Scat,replace)
          
        }
        
    }

I look forward to your help and responses. Please do let me know if you require more information.

Annabelle

[Attached image: Screenshot 2023-10-25 at 11.38.50.png]

Last edited by Annabelle Coleman; 25 Oct 2023, 05:34.


Maarten Buis

Join Date: Mar 2014

Posts: 3456
#2

25 Oct 2023, 05:04

Why not use a log scale for both y and x?
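A minimal sketch of what log scales on both axes could look like. The data are simulated and purely hypothetical (the variables x and y merely stand in for two of the NfL variables), and the label values are an arbitrary choice for illustration.

```stata
* Hypothetical, simulated data standing in for two NfL variables
clear
set seed 2023
set obs 60
generate x = exp(rnormal(3, 0.8))
generate y = x * exp(rnormal(0, 0.2))

* Scatter plus line of equality, both axes on a log scale
twoway (scatter y x, msymbol(circle))                        ///
       (function y = x, range(x) lwidth(vthin)),             ///
       xscale(log) yscale(log)                               ///
       xlabel(2 5 10 20 50 100 200)                          ///
       ylabel(2 5 10 20 50 100 200)                          ///
       aspectratio(1) legend(off)
```

With the same labels on both axes and aspectratio(1), the line of equality stays on the diagonal, and values near 120 and 240 are pulled in without any break.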

---------------------------------
Maarten L. Buis
University of Konstanz
Department of history and sociology
box 40
78457 Konstanz
Germany
http://www.maartenbuis.nl
---------------------------------
Nick Cox

Join Date: Mar 2014

Posts: 35698
#3

25 Oct 2023, 05:09

Sorry, but I can't follow everything you're doing here and indeed -- as you say -- much of the code and all of your data are hidden from us.

I can't see that you're trying to implement a scale break, so what is the problem otherwise?

Evidently the text with regression results is badly placed and you're missing how to show superscript 2, and so forth. I can't quickly see the associated code.

Someone on social media once commented that the FAQ you cite (I'm first author) was intended to make you feel guilty about wanting a scale break. That was sharp and exaggerated but not absolutely wrong. Scale breaks in my view are usually a bad idea and Stata doesn't really support them. Anything you do is a work-around.

More fundamentally I don't off-hand know how to break both axes at once in a simple way, as that implies four sub-graphs.
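For what it is worth, the four sub-graphs could be sketched roughly as follows. This is only one possible work-around under stated assumptions, not the method of the FAQ: the data are simulated and hypothetical, the cut-off of 80 and the far range of 100 to 260 are arbitrary choices, and the fxsize() and fysize() values are guesses that would need tuning.

```stata
* Hypothetical data: most points within 0-80, two far points near 120 and 240
clear
set seed 2023
set obs 50
generate x = runiform(0, 80)
generate y = 0.9 * x + runiform(0, 8)
replace x = 120 in 49
replace y = 118 in 49
replace x = 240 in 50
replace y = 235 in 50

* Axis settings for the main (0-80) and far (beyond the break) segments
local xmain xscale(range(0 80))    xlabel(0(20)80)
local xfar  xscale(range(100 260)) xlabel(120 240)
local ymain yscale(range(0 80))    ylabel(0(20)80)
local yfar  yscale(range(100 260)) ylabel(120 240)

* One panel per quadrant; fxsize()/fysize() shrink the far panels
scatter y x if x <= 80 & y >  80, `xmain' `yfar' fysize(30) ///
    name(ul, replace) nodraw
scatter y x if x >  80 & y >  80, `xfar' `yfar' fysize(30) fxsize(30) ///
    name(ur, replace) nodraw
scatter y x if x <= 80 & y <= 80, `xmain' `ymain' ///
    name(ll, replace) nodraw
scatter y x if x >  80 & y <= 80, `xfar' `ymain' fxsize(30) ///
    name(lr, replace) nodraw

* The gaps between panels play the role of the scale breaks
graph combine ul ur ll lr, cols(2) imargin(small)
```

Two of the four panels are empty with these data, which illustrates how much space a double break can spend on nothing. A fitted line or line of equality would also have to be drawn separately in each panel.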

How about using logarithmic scales? Or one graph with all of the data and one omitting the moderate outliers?
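The second suggestion, one graph with all of the data beside one omitting the moderate outliers, might be sketched like this. The data are simulated and hypothetical, and the cut-off of 80 is assumed from the description in #1.

```stata
* Hypothetical data: most points within 0-80, two far points near 120 and 240
clear
set seed 2023
set obs 50
generate x = runiform(0, 80)
generate y = 0.9 * x + runiform(0, 8)
replace x = 120 in 49
replace y = 118 in 49
replace x = 240 in 50
replace y = 235 in 50

* Left panel: everything; right panel: only values up to 80 on both axes
scatter y x, title("All data") aspectratio(1) ///
    name(all, replace) nodraw
scatter y x if x <= 80 & y <= 80, title("Values up to 80 only") ///
    xlabel(0(20)80) ylabel(0(20)80) aspectratio(1) ///
    name(zoom, replace) nodraw

graph combine all zoom, rows(1)
```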

As commented recently red and green are best not shown together on a graph.

Distinguishing 9 groups clearly is a hard task without something different, such as a front-and-back plot. https://www.statalist.org/forums/for...ailable-on-ssc

I don't know what NfLCap_S_0 or NfLCap_P_0 is and presumably when you get this graph improved you will define variable labels or equivalently better axis titles.

EDIT I see that Maarten Buis has also suggested logarithmic scales.
John Mullahy

Join Date: Dec 2016

Posts: 751
#4

25 Oct 2023, 08:45

Nick Cox is aware that he and I have an amicable gentlemen's disagreement on this topic.

For me the issue often comes down to a tradeoff among (a) displaying the data "as is" versus (b) using some method (e.g. log transformation) that will pull in the extreme data points but will also visually compress the non-extreme data cloud versus (c) using an axis break that provides a compromise between the two. I believe that reasonable people can disagree about how they prefer to manage the tradeoff.

I wrote a little note last year that should give you some idea of how to introduce axis breaks in Stata graphs. https://uwmadison.box.com/s/kt0lsncj...ye54wka5j0f3rb
Nick Cox

Join Date: Mar 2014

Posts: 35698
#5

25 Oct 2023, 09:21

Indeed, as many serious researchers have wanted to show scale breaks, I am happy to think that they are otherwise smart people.

The purpose of visualization is to make things clear. As John knows, I see quite often boundary cases where to me visualization requires something a bit awkward like log(y + c) even though that is not the best way towards a model.
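A small sketch of the log(y + c) idea, with the transformed axis labelled in the original units. The data are simulated and hypothetical, c = 1 is an arbitrary choice, and the names ty and labels exist only for this illustration.

```stata
* Hypothetical data including exact zeros, plus two far values
clear
set seed 2023
set obs 60
generate x = _n
generate y = max(0, rnormal(20, 25))
replace y = 120 in 59
replace y = 240 in 60

* Transform with an arbitrary constant c so that zeros survive
local c = 1
generate ty = log(y + `c')

* Build axis labels: positions on the transformed scale, text in original units
local labels
foreach v in 0 5 20 80 240 {
    local pos = log(`v' + `c')
    local labels `labels' `pos' "`v'"
}

scatter ty x, ylabel(`labels') ytitle("y (log(y + `c') scale)")
```

The choice of c changes how strongly the values near zero are spread out, so it is worth stating in the axis title or caption.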

A scale break is easy to explain but harder to assimilate. But then some people tell me that log scales are hard to assimilate.
John Mullahy

Join Date: Dec 2016

Posts: 751
#6

25 Oct 2023, 09:24

"otherwise"
Nick Cox

Join Date: Mar 2014

Posts: 35698
#7

25 Oct 2023, 09:30

There are scale breaks and scale breaks. Like many people of my generation or any older or some younger, I drew many graphs by hand long before I ever used a computer. We were encouraged, indeed instructed, that a y axis that didn't start at zero should be indicated as such by some kind of jagged line. These are no longer fashionable or even easy in many programs.

The scale breaks here are within the range of the data and a different kettle of fish.

According to legend, the late statistician David Wallace called "take logarithms" the first rule of applied statistics. (I wish I could remember where I saw this.)