  • How to add lines in parts of a twoway graph?

    Hi everyone,

    I'm creating a sort of box plot using twoway graphs, but I don't know how to add lines indicating whether there's a significant difference between my groups. See picture below. The horizontal square brackets and asterisks in the picture (indicating differences between group 1 and groups 2-5) were added using PowerPoint, which isn't really optimal. Does anyone know how to do this within Stata?

    The variables I'm looking at are on a continuous scale with 20-200 observations per group. I'm using the Stata command ranksum to test for differences between groups. My code and a fake example of my data are below.

    I'm using Stata v16.1 (on Mac).

    Many thanks!

    Gunnar

    Code:
        sort Group
        by Group: egen med = median(Protein_1)
        by Group: egen lqt = pctile(Protein_1), p(25)
        by Group: egen uqt = pctile(Protein_1), p(75)
    
        twoway     (rbar lqt med Group if Group==1, barw(.5) fcolor(red) lcolor(black) ) /// 
                (rbar lqt med Group if Group==2, barw(.5) fcolor(midgreen) lcolor(black) ) ///
                (rbar lqt med Group if Group==3, barw(.5) fcolor(orange) lcolor(black) ) ///
                (rbar lqt med Group if Group==4, barw(.5) fcolor(olive_teal) lcolor(black) ) ///
                (rbar lqt med Group if Group==5, barw(.5) fcolor(midblue) lcolor(black) ) ||  ///
                (rbar med uqt Group if Group==1, barw(.5) fcolor(red) lcolor(black) ) /// 
                (rbar med uqt Group if Group==2, barw(.5) fcolor(midgreen) lcolor(black) ) ///
                (rbar med uqt Group if Group==3, barw(.5) fcolor(orange) lcolor(black) ) ///
                (rbar med uqt Group if Group==4, barw(.5) fcolor(olive_teal) lcolor(black) ) ///
                (rbar med uqt Group if Group==5, barw(.5) fcolor(midblue) lcolor(black) ) ||  ///
                (scatter Protein_1 Group if diagnosis3==1, graphregion(fcolor(gs15)) mfcolor(red) mlcolor(black) msymbol(o) legend(off)) ///
                (scatter Protein_1 Group if diagnosis3==2, graphregion(fcolor(gs15)) mfcolor(midgreen) mlcolor(black) msymbol(o) legend(off)) ///
                (scatter Protein_1 Group if diagnosis3==3, graphregion(fcolor(gs15)) mfcolor(orange) mlcolor(black) msymbol(o) legend(off)) ///
                (scatter Protein_1 Group if diagnosis3==4, graphregion(fcolor(gs15)) mfcolor(olive_teal) mlcolor(black) msymbol(o) legend(off)) ///
                (scatter Protein_1 Group if diagnosis3==5, graphregion(fcolor(gs15)) mfcolor(midblue) mlcolor(black) msymbol(o) legend(off)), ///
                xlabel(1 "Group 1" 2 "Group 2" 3 "Group 3" 4 "Group 4" 5 "Group 5", labsize(small)) ///
                xtitle("") ///
                ytitle("") ///
                yscale(log range(50 22000)) ///
                ylabel(1000 "100" 10000 "1,000" 100000 "10,000", labsize(vsmall) angle(horizontal) grid gmin gmax) ///
                graphregion(color(white)) ///
                ysize(7) xsize(5) ///
                title("A. Protein 1", position(11) size(medlarge))



    Fake example of my data
    clear
    input float(Group Protein_1)
    1 123
    1 245
    1 1044
    1 334
    1 535
    1 355
    1 455
    2 355
    2 354
    2 432
    2 132
    2 82
    end


  • #2
    The picture is not showing up, and the fake data does not permit a run of your program.


    • #3
      I'm sorry. I don't know what happened. I'll try to repost the picture here.

      I wish I could share my data but unfortunately I can't.

      Gunnar

      [Attached image: Protein 1.png]
      Last edited by Gunnar Ek; 06 Oct 2022, 22:52.


      • #4
        See https://www.statalist.org/forums/for...l-significance
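
A minimal sketch of one way to draw such a bracket, using the fake data from post #1. This is not necessarily the code in the linked thread. The bracket heights (1500, 1350, 1700), the 0.05 cutoff, and the "ns" label are assumptions chosen for illustration, and the p-value comes from the normal approximation that ranksum reports via r(z).

```stata
* Fake data from post #1 (groups 1 and 2 only)
clear
input float(Group Protein_1)
1 123
1 245
1 1044
1 334
1 535
1 355
1 455
2 355
2 354
2 432
2 132
2 82
end

* Wilcoxon rank-sum test; two-sided p-value from the normal approximation
ranksum Protein_1, by(Group)
local p = 2*normal(-abs(r(z)))

* Hypothetical labelling rule: "*" if p < 0.05, otherwise "ns"
local star = cond(`p' < 0.05, "*", "ns")

* Horizontal line from Group 1 to Group 2 at y = 1500, hooks dropping
* to y = 1350, and the label centred just above (heights chosen by eye)
twoway (scatter Protein_1 Group, msymbol(o)) ///
       (scatteri 1500 1 1500 2, recast(line) lc(black)) ///
       (scatteri 1500 1 1500 2, recast(dropline) base(1350) mc(none) lc(black)), ///
       text(1700 1.5 "`star'") ///
       yscale(log range(50 2000)) xscale(range(0.5 2.5)) ///
       xlabel(1 "Group 1" 2 "Group 2") legend(off)
```

The first scatteri draws the horizontal line, the second recasts the same two points as droplines to make the hooks, and text() places the label at the given y and x coordinates.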


        • #5
          That's great! Thanks a lot, Andrew. I searched but didn't find that before. Your code from Jan 13 is indeed very helpful!

          I'm using a logarithmic scale on the y axis. Would you know if there's a way to get the same visual distance between each horizontal line that I add? Right now it seems I have to fiddle around to get an approximately similar distance between each line.


          • #6
            I do not understand the question. Can you present a data example illustrating the issue?


            • #7
              Hi Andrew,

              Maybe if I show you this in code and a graph.

              My code is now as shown below.

              With this code I get the attached graph with the new horizontal lines in the upper part of the graph. The "visual" distance between the lines decreases when moving up the y axis because I'm using a logarithmic scale on the y axis. My question is whether there's a way to make this visual distance between each line the same, or whether I have to manually adjust each graph so that the distances between the lines (as well as the small vertical hooks on each horizontal line) look approximately similar.

              Best
              Gunnar


              Code:
              
              sort Group
                  by Group: egen med = median(Protein_1)
                  by Group: egen lqt = pctile(Protein_1), p(25)
                  by Group: egen uqt = pctile(Protein_1), p(75)
              
                  twoway     (rbar lqt med Group if Group==1, barw(.5) fcolor(red) lcolor(black) ) ///
                          (rbar lqt med Group if Group==2, barw(.5) fcolor(midgreen) lcolor(black) ) ///
                          (rbar lqt med Group if Group==3, barw(.5) fcolor(orange) lcolor(black) ) ///
                          (rbar lqt med Group if Group==4, barw(.5) fcolor(olive_teal) lcolor(black) ) ///
                          (rbar lqt med Group if Group==5, barw(.5) fcolor(midblue) lcolor(black) ) ||  ///
                          (rbar med uqt Group if Group==1, barw(.5) fcolor(red) lcolor(black) ) ///
                          (rbar med uqt Group if Group==2, barw(.5) fcolor(midgreen) lcolor(black) ) ///
                          (rbar med uqt Group if Group==3, barw(.5) fcolor(orange) lcolor(black) ) ///
                          (rbar med uqt Group if Group==4, barw(.5) fcolor(olive_teal) lcolor(black) ) ///
                          (rbar med uqt Group if Group==5, barw(.5) fcolor(midblue) lcolor(black) ) ||  ///
                          (scatter Protein_1 Group if diagnosis3==1, graphregion(fcolor(gs15)) mfcolor(red) mlcolor(black) msymbol(o) legend(off)) ///
                          (scatter Protein_1 Group if diagnosis3==2, graphregion(fcolor(gs15)) mfcolor(midgreen) mlcolor(black) msymbol(o) legend(off)) ///
                          (scatter Protein_1 Group if diagnosis3==3, graphregion(fcolor(gs15)) mfcolor(orange) mlcolor(black) msymbol(o) legend(off)) ///
                          (scatter Protein_1 Group if diagnosis3==4, graphregion(fcolor(gs15)) mfcolor(olive_teal) mlcolor(black) msymbol(o) legend(off)) ///
                          (scatter Protein_1 Group if diagnosis3==5, graphregion(fcolor(gs15)) mfcolor(midblue) mlcolor(black) msymbol(o) legend(off)) ///
                          (scatteri 13000 1 13000 2, recast(line) lw(medthick) mc(none) lc(black) lp(solid)) ///
                          (scatteri 13000 1 13000 2, recast(dropline) base(12500) lw(medthick) mc(none) lc(black) lp(solid)) ///
                          (scatteri 11000 1 11000 3, recast(line) lw(medthick) mc(none) lc(black) lp(solid)) ///
                          (scatteri 11000 1 11000 3, recast(dropline) base(10500) lw(medthick) mc(none) lc(black) lp(solid)) ///
                          (scatteri 9000 1 9000 4, recast(line) lw(medthick) mc(none) lc(black) lp(solid)) ///
                          (scatteri 9000 1 9000 4, recast(dropline) base(8500) lw(medthick) mc(none) lc(black) lp(solid)) ///
                          (scatteri 7000 1 7000 5, recast(line) lw(medthick) mc(none) lc(black) lp(solid)) ///
                          (scatteri 7000 1 7000 5, recast(dropline) base(6500) lw(medthick) mc(none) lc(black) lp(solid)), ///
                          xlabel(1 "Group 1" 2 "Group 2" 3 "Group 3" 4 "Group 4" 5 "Group 5", labsize(small)) ///
                          xtitle("") ///
                          ytitle("") ///
                          yscale(log range(500 15000)) ///
                          ylabel(100 "100" 1000 "1,000", labsize(vsmall) angle(horizontal) grid gmin gmax) ///
                          graphregion(color(white)) ///
                          ysize(7) xsize(5)

              [Attached image: Skärmavbild 2022-10-12 kl. 11.56.18.png]

              Last edited by Gunnar Ek; 12 Oct 2022, 04:53.


              • #8
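                Equal visual distances on a log axis correspond to equal differences in logs, so pick the point on the log scale and take the antilog to get the value on the axis. Stata's log() is the natural log (base e), so you have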

                Code:
                di log(100)
                di log(1000)
                Res.:


                Code:
                . di log(100)
                4.6051702
                
                . di log(1000)
                6.9077553
                If you want the next point to be 2 log units above 1,000, then the question becomes: what number has the log value 8.9 (that is, 6.9 + 2)? The inverse of the log is the exponential function.

                Code:
                di exp(8.9)
                Res.:

                Code:
                . di exp(8.9)
                7331.9735
                and that is your approximate value on the axis.
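
The same idea can be applied to all brackets at once: equal steps in ln(y) give equal visual gaps on a log axis, and hooks of equal length in ln(y) look equally long. A minimal sketch, assuming a hypothetical top bracket at 13000, a gap of 0.2 log units between brackets, and hooks 0.04 log units long (none of these values come from the thread):

```stata
* Hypothetical settings, all on the natural-log scale
local top  = 13000
local step = 0.2
local hook = 0.04

* Bracket k sits k steps below the top; its hooks drop a further `hook'
forvalues k = 0/3 {
    local y`k'    = exp(ln(`top') - `k'*`step')
    local base`k' = exp(ln(`top') - `k'*`step' - `hook')
    display "bracket `k': y = " %9.1f `y`k'' "   hook base = " %9.1f `base`k''
}
```

The locals can then replace the hard-coded heights in the scatteri plots, for example (scatteri `y1' 1 `y1' 3, recast(line) ...) together with (scatteri `y1' 1 `y1' 3, recast(dropline) base(`base1') ...), provided they are defined in the same do-file run as the graph command.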
                Last edited by Andrew Musau; 12 Oct 2022, 04:54.


                • #9
                  Thanks for the quick reply. It seems some manual work is required to make the graphs look good. At least I don't have to use PowerPoint.

                  Best
                  Ulf
