  • Break in both x and y axis

    Dear Statalist users,

    I am having trouble creating breaks in the axes of a graph. I have tried to follow https://www.stata.com/support/faqs/g.../scale-breaks/ but I cannot work out how to create the breaks and then combine the graphs so that both the x and y axes are broken.

    This is the code I am currently using. I would like the data points from 0 to 80 to take up most of the graph, with the points at 120 and 240 shown at the end of each axis, after a break.

    I have also attached an image of the graph, and I am more than happy to share the data to work with.

    Code:
            drop if Objective!="A00"
            drop if Processingdelay==4 | NfL==.
            keep NfL ID_visit sample_delay disease_grp
            reshape wide NfL , i(ID_visit) j(sample_delay) string //creates variables of sample_type Ven_p Ven_S Cap_P Cap_S for scatter plots
    
            ds ID* dis* , not // lists all variables except ID* and dis*, i.e. the reshaped NfL variables (Ven_p Ven_S Cap_P Cap_S)
            local Obj_type `r(varlist)' // local for variables  Ven_p Ven_S Cap_P Cap_S
            
            local axislabelformat labsize(medium) tlwidth(medium)
            local axislineformat aspectratio(1) xscale(lwidth(vthin)) yscale(lwidth(vthin))
            local scattercircle msymbol(circle) msize(large) mlcolor(none)
            
            include "0.1 GRAPHICS_OD.do" //contains colours for scatter plots
            
        foreach s_type1 in `Obj_type' { //** Open double loop to run through code comparing each sample_type in varlist with each other in list
            foreach s_type2 in `Obj_type'  {
                
                
            twoway (lfitci `s_type1' `s_type2', clwidth(vthin) clcolor("`INF_Blue_Light'")) ///
                (scatter `s_type1' `s_type2' if disease_grp==0 , `scattercircle' mcolor("`INF_grey'")) ///
                (scatter `s_type1' `s_type2' if disease_grp==1 , `scattercircle' mcolor("`INF_Red_Light'")) ///
                (scatter `s_type1' `s_type2' if disease_grp==2 , `scattercircle' mcolor("`INF_Red'")) ///
                (scatter `s_type1' `s_type2' if disease_grp==3 , `scattercircle' mcolor("`INF_Indigo_Light'")) ///
                (scatter `s_type1' `s_type2' if disease_grp==4 , `scattercircle' mcolor("`INF_Indigo'")) ///    
                (scatter `s_type1' `s_type2' if disease_grp==5 , `scattercircle' mcolor("`INF_Indigo_Dark'")) ///    
                (scatter `s_type1' `s_type2' if disease_grp==6 , `scattercircle' mcolor("`INF_Green_Light'")) ///
                (scatter `s_type1' `s_type2' if disease_grp==7 , `scattercircle' mcolor("`INF_green'")) ///    
                (scatter `s_type1' `s_type2' if disease_grp==8 , `scattercircle' mcolor("`INF_gold'")) ///    
                (function y=x , range(``obj'`analyte'y') lcolor("`INF_Blue'") lwidth(vthin)), ///
                legend(off) ytitle("`s_type1'", size(medlarge)) xtitle("`s_type2'", size(medlarge)) ///
                xlabel(0(20)80, `axislabelformat') xscale(range(0 80)) ///
                ylabel(0(20)80, `axislabelformat') yscale(range(0 80)) ///
                `axislineformat' ///
                name(`s_type1'_`s_type2'_Scat,replace)
              
            }
            
        }
    I look forward to your help and responses. Please do let me know if you require more information.

    Annabelle
    [Attached image: Screenshot 2023-10-25 at 11.38.50.png]
    Last edited by Annabelle Coleman; 25 Oct 2023, 05:34.

  • #2
    Why not use a log scale for both y and x?
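
    A minimal sketch of what log scales on both axes could look like. The data here are simulated stand-ins (the variable names x and y and all values are assumptions, not the poster's data), and log scales need strictly positive values, so any zeros would have to be handled first.

    ```stata
    * Hypothetical data: most values between 1 and 80, two points at 120 and 240
    clear
    set seed 2023
    set obs 60
    generate x = runiform(1, 80)
    replace x = 120 in 59
    replace x = 240 in 60
    generate y = x * exp(rnormal(0, 0.15))

    * Scatter plus the line y = x, with both axes on a log scale
    twoway (scatter y x) (function y = x, range(1 240)), ///
        xscale(log) yscale(log) ///
        xlabel(1 3 10 30 100 240) ///
        ylabel(1 3 10 30 100 240, angle(horizontal)) ///
        legend(off) aspectratio(1)
    ```

    The tick labels are still shown in the original units, so the graph remains readable without any mention of logarithms on the axes.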
    ---------------------------------
    Maarten L. Buis
    University of Konstanz
    Department of history and sociology
    box 40
    78457 Konstanz
    Germany
    http://www.maartenbuis.nl
    ---------------------------------



    • #3
      Sorry, but I can't follow everything you're doing here and indeed -- as you say -- much of the code and all of your data are hidden from us.

      I can't see that you're trying to implement a scale break, so what is the problem otherwise?

      Evidently the text with regression results is badly placed and you're missing how to show superscript 2, and so forth. I can't quickly see the associated code.

      Someone on social media once commented that the FAQ you cite (I'm first author) was intended to make you feel guilty about wanting a scale break. That was sharp and exaggerated but not absolutely wrong. Scale breaks in my view are usually a bad idea and Stata doesn't really support them. Anything you do is a work-around.

      More fundamentally I don't off-hand know how to break both axes at once in a simple way, as that implies four sub-graphs.
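
      To make the "four sub-graphs" point concrete, here is a rough sketch of the kind of work-around that would be involved: one panel for each combination of low and high x and y, sized with fxsize() and fysize() and stitched together with graph combine. The data and the names x, y, tl, tr, bl and br are all hypothetical, the cut-off at 80 is taken from the question, and no break marks are drawn, so treat this as an illustration under those assumptions rather than a recommended solution. It needs to be run from a do-file because of the /// continuation lines.

      ```stata
      * Hypothetical stand-in data: x and y are assumed names, not the poster's variables
      clear
      set seed 2023
      set obs 60
      generate x = runiform(0, 70)
      generate y = min(max(x + rnormal(0, 3), 0), 80)
      * Place a few points beyond the breaks so that every panel has data
      replace x = 40  in 57
      replace y = 120 in 57
      replace x = 120 in 58
      replace y = 60  in 58
      replace x = 120 in 59
      replace y = 120 in 59
      replace x = 240 in 60
      replace y = 240 in 60

      * Axis settings shared by the low (0 to 80) and high (beyond the break) panels
      local xlo xscale(range(0 80))    xlabel(0(20)80)
      local xhi xscale(range(100 260)) xlabel(120 240)
      local ylo yscale(range(0 80))    ylabel(0(20)80, angle(horizontal))
      local yhi yscale(range(100 260)) ylabel(120 240, angle(horizontal))

      * One panel per quadrant; fxsize() and fysize() give the low range most of the space
      twoway scatter y x if x <= 80 & y >  80, `xlo' `yhi' xtitle("") ///
          fxsize(75) fysize(30) name(tl, replace) nodraw
      twoway scatter y x if x >  80 & y >  80, `xhi' `yhi' xtitle("") ytitle("") ///
          fxsize(30) fysize(30) name(tr, replace) nodraw
      twoway scatter y x if x <= 80 & y <= 80, `xlo' `ylo' ///
          fxsize(75) fysize(75) name(bl, replace) nodraw
      twoway scatter y x if x >  80 & y <= 80, `xhi' `ylo' ytitle("") ///
          fxsize(30) fysize(75) name(br, replace) nodraw

      * graph combine fills row by row: top-left, top-right, bottom-left, bottom-right
      graph combine tl tr bl br, cols(2) imargin(zero)
      ```

      Fitted lines or a line of equality would have to be repeated in each panel, and the plot regions will only line up approximately, which is part of why this is awkward.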

      How about using logarithmic scales? Or one graph with all of the data and one omitting the moderate outliers?
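
      The second suggestion could be sketched as follows, again with simulated data and assumed names (x, y, all, main), and with 80 as the cut-off mentioned in the question.

      ```stata
      * Hypothetical data: bulk of values in 0 to 80, two moderate outliers on x
      clear
      set seed 2023
      set obs 60
      generate x = runiform(0, 80)
      replace x = 120 in 59
      replace x = 240 in 60
      generate y = x + rnormal(0, 5)

      * Left panel: everything; right panel: outliers omitted
      twoway scatter y x, title("All data") name(all, replace) nodraw
      twoway scatter y x if x <= 80 & y <= 80, title("Values up to 80") ///
          name(main, replace) nodraw
      graph combine all main, cols(2)
      ```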

      As commented recently, red and green are best not shown together on a graph.

      Distinguishing 9 groups clearly is a hard task without something different, such as a front-and-back plot. https://www.statalist.org/forums/for...ailable-on-ssc

      I don't know what NfLCap_S_0 or NfLCap_P_0 is and presumably when you get this graph improved you will define variable labels or equivalently better axis titles.

      EDIT I see that Maarten Buis has also suggested logarithmic scales.



      • #4
        Nick Cox is aware that he and I have an amicable gentlemen's disagreement on this topic.

        For me the issue often comes down to a tradeoff among (a) displaying the data "as is" versus (b) using some method (e.g. log transformation) that will pull in the extreme data points but will also visually compress the non-extreme data cloud versus (c) using an axis break that provides a compromise between the two. I believe that reasonable people can disagree about how they prefer to manage the tradeoff.

        I wrote a little note last year that should give you some idea of how to introduce axis breaks in Stata graphs. https://uwmadison.box.com/s/kt0lsncj...ye54wka5j0f3rb



        • #5
          Indeed. As many serious researchers have wanted to show scale breaks, I am happy to think that they are otherwise smart people.

          The purpose of visualization is to make things clear. As John knows, I see quite often boundary cases where to me visualization requires something a bit awkward like log(y + c) even though that is not the best way towards a model.
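
          For illustration only, a log(y + c) display might be set up like this. The data, the variable names and the choice c = 1 are all assumptions; the loop simply places axis labels at transformed positions while printing the original values.

          ```stata
          * Hypothetical data including zeros, which rule out a plain log scale
          clear
          set seed 2023
          set obs 60
          generate x = _n
          generate y = floor(runiform(0, 80))
          replace y = 240 in 60

          * Transform for display only; c = 1 is an arbitrary illustrative choice
          local c = 1
          generate ly = log(y + `c')

          * Build labels: position on the transformed scale, text in original units
          local labels
          foreach v of numlist 0 5 20 80 240 {
              local pos = log(`v' + `c')
              local labels `labels' `pos' "`v'"
          }

          twoway scatter ly x, ylabel(`labels', angle(horizontal)) ///
              ytitle("y, plotted as log(y + `c')")
          ```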

          A scale break is easy to explain but harder to assimilate. But then some people tell me that log scales are hard to assimilate.



          • #6
            "otherwise"



            • #7
              There are scale breaks and scale breaks. Like many people of my generation, or any older, or some younger, I drew many graphs by hand long before I ever used a computer. We were encouraged, indeed instructed, that a y axis that didn't start at zero should be indicated as such by some kind of jagged line. These are no longer fashionable or even easy in many programs.

              The scale breaks here are within the range of the data and a different kettle of fish.

              According to legend, the late statistician David Wallace called "take logarithms" the first rule of applied statistics. (I wish I could remember where I saw this.)

