  • Help with graph of mean by group






    [Image: Screenshot 2020-03-02 at 1.38.16 pm.png]


    Hi,

    My data are in the format shown above. I want to plot the mean scores, with error bars, at the three time points (0, 1 and 2) separately for each tag group (0 or 1). When I tried, the plot did not show the correct time points and put the tag group on the X axis, which is not what I'm after. Ideally the score would be on the Y axis and time on the X axis.

    Any help with how to go about doing this would be much appreciated.

    Thanks





  • #2
    How do you define an error bar? Otherwise it's hard to comment when we have an image rather than a decent data example and you're alluding to code you don't show us.

    https://www.statalist.org/forums/help#stata gives guidance here.
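
    As a minimal sketch of what that guidance asks for: assuming the variable names that appear in the data example later in this thread, and the 22 observations shown there, a data example could be produced with dataex. As far as I know, dataex is bundled with recent official Stata releases, so the ssc install line should only be needed on older installations.

    Code:
    * run with the dataset in memory; variable names are assumed from #3
    * ssc install dataex    // only if dataex is not already installed
    dataex DSsum_av0 DSsum_av1 DSsum_av2 tag, count(22)

    Pasting the resulting output between CODE delimiters lets others recreate the data exactly.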



    • #3
      Apologies.

      This is the data example:

      Code:
      * Example generated by -dataex-. To install: ssc install dataex
      clear
      input double(DSsum_av0 DSsum_av1 DSsum_av2) float tag
          
           9.5       10    9 0
             9        9    . 0
             7      6.5    7 0
          6.25      6.5    . 0
            10        9    . 0
            10      9.5    9 0
             6        5    5 0
             9      8.5  9.5 0
           6.5        8    6 0
             7      5.5    . 0
             7      5.5  7.5 0
             9        8    8 1
             7        8    . 1
             7      6.5    5 1
             7      8.5    7 1
            10       10   10 1
          8.25      7.5    . 1
             8        8  7.5 1
           8.5      6.5    . 1
             6        8    7 1
             9      8.5    . 1
             9        8    9 1
      end
      By error bar I mean a bar showing the standard error of the mean (SEM). Initially I tried to create a mean of the values by group using
      Code:
       bysort tag : egen DSsum__av1 = mean(DSsum_av1)
      etc., but when I tried to
      Code:
       scatter
      the three values, I got just three dots on the plot. What I am after is a plot of the mean values of the three scores (DSsum_av0, DSsum_av1 and DSsum_av2) for both of my groups (0 and 1), with the score on the Y axis and the three time points (0, 1 and 2) on the X axis: essentially 6 data points, each with SEM bars.

      I hope I've explained it better now.

      Cheers
      Stan



      • #4
        Looking at your code it is clear that you calculated the means for DSsum_av1 by tag, so the scatter plot (full command not given) could at most show two means for that variable. If you want means for all three variables, you need something more like this, with the extra twist that offsets make the graph more readable.

        Code:
        * Example generated by -dataex-. To install: ssc install dataex
        clear
        input double(DSsum_av0 DSsum_av1 DSsum_av2) float tag
            
             9.5       10    9 0
               9        9    . 0
               7      6.5    7 0
            6.25      6.5    . 0
              10        9    . 0
              10      9.5    9 0
               6        5    5 0
               9      8.5  9.5 0
             6.5        8    6 0
               7      5.5    . 0
               7      5.5  7.5 0
               9        8    8 1
               7        8    . 1
               7      6.5    5 1
               7      8.5    7 1
              10       10   10 1
            8.25      7.5    . 1
               8        8  7.5 1
             8.5      6.5    . 1
               6        8    7 1
               9      8.5    . 1
               9        8    9 1
        end
        
        sort tag
        
        forval j = 0/2 {
            by tag : egen mean`j' = mean(DSsum_av`j')
            by tag : egen SD`j'= sd(DSsum_av`j')
            by tag : egen count`j' = count(DSsum_av`j')
            gen upper`j' = mean`j' + SD`j' / sqrt(count`j')
            gen lower`j' = mean`j' - SD`j' / sqrt(count`j')
        }
        
        gen tag0 = tag - 0.2
        gen tag2 = tag + 0.2
        
        set scheme s1color
        
        twoway rcap upper0 lower0 tag0, lc(red) || rcap upper1 lower1 tag, lc(black) || rcap upper2 lower2 tag2, lc(blue) ///
        || scatter mean0 tag0, mc(red)  || scatter mean1 tag, ms(D) mc(black) || scatter mean2 tag2 , ms(T) mc(blue)                 ///
        legend(order(- "groups:" 4 "0" 5 "1" 6 "2") row(1)) ytitle(something explanatory avoiding jargon) ///
        yla(, ang(h))  xla(0 1, tlc(none)) xtitle(tag) xsc(r(-0.3 1.3) alt) xli(0.5, lc(gs8))
        [Image: meanplusbar.png]



        That said, these reductions throw away much detail that could be useful or interesting, to readers if not yourself. There are some puzzling gaps and repetitions in the data you show us, for example.
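
        One way to see those gaps, and to cross-check the quantities behind the bars, is a summary table by group. The sketch below assumes the data from the code above are still in memory. It reports the number of nonmissing values, the mean, the SD and the standard error of the mean (SD divided by the square root of the nonmissing count) for each score within each tag group.

        Code:
        * assumes DSsum_av0-DSsum_av2 and tag from the example above are in memory
        * n = nonmissing count; semean = sd / sqrt(n)
        tabstat DSsum_av0 DSsum_av1 DSsum_av2, by(tag) statistics(n mean sd semean) columns(statistics)

        Because the count differs across the three scores, the standard errors for DSsum_av2 rest on fewer observations than those for DSsum_av0 and DSsum_av1.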



        • #5
          Thank you very much! This is way further than what I got to so it's super useful.

          I'm thinking of using this just as a starting point to present my data before I go into the analyses - what I've given as an example is part of a larger dataset (around 6000 people) and these are their cognitive test scores over 3 years with 0s being non-diabetics and 1s being diabetics.

          Excuse the awful hand drawing, but could Stata produce something similar to it?

          Thanks a lot
          Stan
          [Image: IMG_0092 copy.jpg]



          • #6
            There are several ways to achieve your drawing. One is given below. You need to reshape the data from wide to long format. See the code below, which is based on the data example provided by Nick in #4:

            Code:
            gen rowid=_n //generate row id
            reshape long DSsum_av, i(rowid) j(time) //reshape data to long format
            
            replace time=time+.1 if tag==1 // .1 decimal added to time for tag-group-1 for visualisation
            
            ssc install lgraph, replace //install the lgraph program
            
            #delimit ;
            lgraph DSsum_av time tag, errortype(se)
                lop(con(none)) xl(0 "0" 1 "1"  2 "2") xscale(range(0 3))
                legend(order(1 "Tag-0" 2 "Tag-1")
                    pos(2) ring(0) col(1)
                    region(col(white)) size(small))
                yti("Mean-DS (S.E.)" " ") xti("Time")
                yl( ,ang(hor) format(%12.2f))
                ti("Mean DS score of two groups")
            ;
            #delimit cr
            [Image: test.png]
            Roman
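
            For anyone who prefers to avoid a community-contributed command, here is a rough sketch of a similar graph using only official commands: reshape to long, collapse to one row per group and time point, then overlay capped bars and means. It starts again from the wide data in #3. The names id, mean_ds, se_ds, n_ds, upper, lower and timex are my own, and the 0.05 offset, the colours and the connecting lines are arbitrary choices rather than anything taken from the graph above.

            Code:
            clear
            input double(DSsum_av0 DSsum_av1 DSsum_av2) float tag
             9.5   10    9 0
               9    9    . 0
               7  6.5    7 0
            6.25  6.5    . 0
              10    9    . 0
              10  9.5    9 0
               6    5    5 0
               9  8.5  9.5 0
             6.5    8    6 0
               7  5.5    . 0
               7  5.5  7.5 0
               9    8    8 1
               7    8    . 1
               7  6.5    5 1
               7  8.5    7 1
              10   10   10 1
            8.25  7.5    . 1
               8    8  7.5 1
             8.5  6.5    . 1
               6    8    7 1
               9  8.5    . 1
               9    8    9 1
            end

            * wide to long: one row per person and time point
            gen long id = _n
            reshape long DSsum_av, i(id) j(time)

            * one row per group and time: mean, SEM and nonmissing count
            collapse (mean) mean_ds=DSsum_av (semean) se_ds=DSsum_av (count) n_ds=DSsum_av, by(tag time)

            * bars of mean +/- 1 standard error
            gen upper = mean_ds + se_ds
            gen lower = mean_ds - se_ds

            * small horizontal offset so the two groups do not overlap
            gen timex = time + cond(tag == 1, 0.05, -0.05)

            twoway (rcap upper lower timex if tag == 0, lc(blue))                    ///
                   (connected mean_ds timex if tag == 0, lc(blue) mc(blue))          ///
                   (rcap upper lower timex if tag == 1, lc(red))                     ///
                   (connected mean_ds timex if tag == 1, lc(red) mc(red) ms(D)),     ///
                   legend(order(2 "tag 0" 4 "tag 1") row(1))                         ///
                   xla(0 1 2) xtitle("Time") ytitle("Mean score (+/- 1 SE)") yla(, ang(h))

            Replacing connected with scatter drops the connecting lines. Keeping n_ds in the collapsed data makes it easy to check how many observations each mean is based on.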



            • #7
              Thanks a lot, Roman! This is exactly what I was looking for.

              Much appreciated
              Stan
