
  • Making a bar graph for a subsection of data; including 'missing' data and average value

    Hi,

    I would like to create a bar graph showing the distribution of a certain value (the amount, in pounds (£), that survey respondents are willing to donate to a given charity), together with a thin vertical line showing the average amount donated as a reference point.

    The 'problems' I am facing are:
    1) How do I show all categories, whether or not their frequency is 0? (The bar graph currently only shows bars for values that respondents actually gave. E.g. no one gave £2, so £2 is omitted. I presume it is good practice to include these values so that the graph is easier to read at a glance.)
    2) How do I graph these values for only a subset of the data? (I collected the data via a survey. There are 2 treatments, coded '1' and '2' in Groupid below. I would like to show the distribution of donations for Group 1 only. However, so far I haven't been able to separate them.)
    3) How do I add a thin vertical line to show the average value as a reference point? (Is there a command? So far I have only found how to draw a line manually.)

    Included below: (i) the current graph, (ii) the intended graph, (iii) the code and a dataex example.

    [Figure: Graph.png (current graph)]

    [Figure: Graph intended-min.png (intended graph)]


    Code:
    graph bar, over(Donation, label(labsize(vsmall))) graphregion(color(white)) ytitle("percent", size(small)) title("The distribution of individual willingness to fight global warming", size(small)) b1title("Donation (£)")
    Code:
    * Example generated by -dataex-. For more info, type help dataex
    clear
    * dataex ResponseId Donation meanprice_for_group Groupid
    
    input str17 ResponseId int Donation float meanprice_for_group long Groupid
    "R_2bTC6yfJ9D3ZXql"  20 64.46154 1
    "R_Okvpu1Ga2S8cVUt"  52 64.46154 1
    "R_xzMwnLib43TGJPz" 110 64.46154 1
    "R_1gbT3pBht6XY1B6" 110 64.46154 1
    "R_1pmKGHiY7Zm6MFc" 100 64.46154 1
    "R_1igs0Ghw9Mlfdov"  50 64.46154 1
    "R_1DDbg6O6SDOCnrw"  80 64.46154 1
    "R_2rqmaKA6PEkEgfx"  80 64.46154 1
    "R_3HN682nKJ5Y1ycc"  20 64.46154 1
    "R_3qmbAsxwMlpEuiD"  60 64.46154 1
    "R_1OGKiQFyrd3TXmB"  84 64.46154 1
    "R_1CHgL5Cu0cJCjWX"  90 64.46154 1
    "R_43gV9fXPvO8iuVX"  95 64.46154 1
    "R_2D5K9tiKWYJXRXF"  30 64.46154 1
    "R_2EiApI6yUDBLlU8"  59 64.46154 1
    "R_rq2e4vek7UmF3r3"   1 64.46154 1
    "R_cMZrpeA00gjVIm5"  20 64.46154 1
    "R_e5SsOmIywfRwzTz"   1 64.46154 1
    "R_2CvbbvbXEzsJmG5"  90 64.46154 1
    "R_3nkKMH9SmNvNkld"  75 64.46154 1
    "R_XgFprN8sZrei7O9"  70 64.46154 1
    "R_1hLx1nTSNHDMAFe"  50 64.46154 1
    "R_23eofu3GpLXUeME" 110 64.46154 1
    "R_2sSenyiBiUJ0yCL"  30 64.46154 1
    "R_1PRwRAVy4xbeYpO" 110 64.46154 1
    "R_pQo42C2xKLt0r0l" 110 64.46154 1
    "R_3itk79Q8s3t8ycr" 110 64.46154 1
    "R_3OjlGGpuAqY11Oe"  71 64.46154 1
    "R_2R99vPGcdzNcZeG"  55 64.46154 1
    "R_3qVFZSidyrpekEJ" 110 64.46154 1
    "R_3qVBDmamjMnLpDc"  50 64.46154 1
    "R_1JDdR96i8TgYYJO"  64 64.46154 1
    "R_Wrsjwj8SOK6s48N"  95 64.46154 1
    "R_YbQgP2HxqDWAbM5"  10 64.46154 1
    "R_1YqSJ647fatsbCx"  39 64.46154 1
    "R_1LG2NFd7lLHLQgB"  84 64.46154 1
    "R_3Ld9vEyqBRqWM5b" 110 64.46154 1
    "R_uq8nRMxvVQ7ScOB"  90 64.46154 1
    "R_1FgK3GkHVgzrucq" 110 64.46154 1
    "R_RlycIrfuYWrEvbr"  82 64.46154 1
    "R_1lhRbnasoNVg3Yu" 110 65.63158 2
    "R_1LIqtfuaHirqNji"   0 65.63158 2
    "R_2B4muV5wfLQCQyP"  63 65.63158 2
    "R_1K8yADjlYukIKN9"   0 65.63158 2
    "R_6glzKqroF1tqA7f" 110 65.63158 2
    "R_SCy0Vvt3D6VnNT3"  10 65.63158 2
    "R_2SDlsAnTe4KjIMU"  84 65.63158 2
    "R_2pYyqdKDMIC2EEE"  90 65.63158 2
    "R_2SDnolevyhqSeg9"  20 65.63158 2
    "R_O8dN5Cj0zoj3NjX" 110 65.63158 2
    "R_1cUgIo8U2W2d5MA" 110 65.63158 2
    "R_yC2wrVmNkQCpWkp" 110 65.63158 2
    end
    label values Groupid Groupid
    label def Groupid 1 "Control", modify
    label def Groupid 2 "Empirical", modify
    Any help or guidance is greatly appreciated while I learn to find my way around Stata! Please let me know if any useful information is missing.

    Best regards

  • #2
    See the -if- qualifier to select groups.

    Code:
    help if
    Thanks for the data example. You can try twoway hist with the -discrete- option.

    Code:
    * Example generated by -dataex-. For more info, type help dataex
    clear
    input str17 ResponseId int Donation float meanprice_for_group long Groupid
    "R_2bTC6yfJ9D3ZXql"  20 64.46154 1
    "R_Okvpu1Ga2S8cVUt"  52 64.46154 1
    "R_xzMwnLib43TGJPz" 110 64.46154 1
    "R_1gbT3pBht6XY1B6" 110 64.46154 1
    "R_1pmKGHiY7Zm6MFc" 100 64.46154 1
    "R_1igs0Ghw9Mlfdov"  50 64.46154 1
    "R_1DDbg6O6SDOCnrw"  80 64.46154 1
    "R_2rqmaKA6PEkEgfx"  80 64.46154 1
    "R_3HN682nKJ5Y1ycc"  20 64.46154 1
    "R_3qmbAsxwMlpEuiD"  60 64.46154 1
    "R_1OGKiQFyrd3TXmB"  84 64.46154 1
    "R_1CHgL5Cu0cJCjWX"  90 64.46154 1
    "R_43gV9fXPvO8iuVX"  95 64.46154 1
    "R_2D5K9tiKWYJXRXF"  30 64.46154 1
    "R_2EiApI6yUDBLlU8"  59 64.46154 1
    "R_rq2e4vek7UmF3r3"   1 64.46154 1
    "R_cMZrpeA00gjVIm5"  20 64.46154 1
    "R_e5SsOmIywfRwzTz"   1 64.46154 1
    "R_2CvbbvbXEzsJmG5"  90 64.46154 1
    "R_3nkKMH9SmNvNkld"  75 64.46154 1
    "R_XgFprN8sZrei7O9"  70 64.46154 1
    "R_1hLx1nTSNHDMAFe"  50 64.46154 1
    "R_23eofu3GpLXUeME" 110 64.46154 1
    "R_2sSenyiBiUJ0yCL"  30 64.46154 1
    "R_1PRwRAVy4xbeYpO" 110 64.46154 1
    "R_pQo42C2xKLt0r0l" 110 64.46154 1
    "R_3itk79Q8s3t8ycr" 110 64.46154 1
    "R_3OjlGGpuAqY11Oe"  71 64.46154 1
    "R_2R99vPGcdzNcZeG"  55 64.46154 1
    "R_3qVFZSidyrpekEJ" 110 64.46154 1
    "R_3qVBDmamjMnLpDc"  50 64.46154 1
    "R_1JDdR96i8TgYYJO"  64 64.46154 1
    "R_Wrsjwj8SOK6s48N"  95 64.46154 1
    "R_YbQgP2HxqDWAbM5"  10 64.46154 1
    "R_1YqSJ647fatsbCx"  39 64.46154 1
    "R_1LG2NFd7lLHLQgB"  84 64.46154 1
    "R_3Ld9vEyqBRqWM5b" 110 64.46154 1
    "R_uq8nRMxvVQ7ScOB"  90 64.46154 1
    "R_1FgK3GkHVgzrucq" 110 64.46154 1
    "R_RlycIrfuYWrEvbr"  82 64.46154 1
    "R_1lhRbnasoNVg3Yu" 110 65.63158 2
    "R_1LIqtfuaHirqNji"   0 65.63158 2
    "R_2B4muV5wfLQCQyP"  63 65.63158 2
    "R_1K8yADjlYukIKN9"   0 65.63158 2
    "R_6glzKqroF1tqA7f" 110 65.63158 2
    "R_SCy0Vvt3D6VnNT3"  10 65.63158 2
    "R_2SDlsAnTe4KjIMU"  84 65.63158 2
    "R_2pYyqdKDMIC2EEE"  90 65.63158 2
    "R_2SDnolevyhqSeg9"  20 65.63158 2
    "R_O8dN5Cj0zoj3NjX" 110 65.63158 2
    "R_1cUgIo8U2W2d5MA" 110 65.63158 2
    "R_yC2wrVmNkQCpWkp" 110 65.63158 2
    end
    label values Groupid Groupid
    label def Groupid 1 "Control", modify
    label def Groupid 2 "Empirical", modify
    
    *SAVE MEAN DONATION IN A LOCAL
    sum Donation if Groupid==1
    local mean= r(mean)
    
    *GRAPH
    set scheme s1color
    tw hist Donation if Groupid==1, discrete xlab(0(5)110) lcolor(gray) fcolor(gray%50) percent xtitle("Donation (£)") xline(`mean')
    [Figure: Graph.png]
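
    For readers who want to see the pieces in isolation, here is a minimal sketch of the same approach on a toy dataset. The values are hypothetical and are not the survey data; only the variable names follow the thread. It assumes the lines are run together from a do-file, since a local macro disappears once the run that defined it ends.

    ```stata
    * Toy data: hypothetical values, NOT the survey data
    clear
    input int Donation byte Groupid
    0 1
    1 1
    1 1
    3 1
    5 1
    5 1
    5 1
    4 2
    end

    * Mean for group 1 only, copied from r(mean) into a local macro
    summarize Donation if Groupid == 1, meanonly
    local mean = r(mean)
    display "Group 1 mean = " %6.3f `mean'

    * -discrete- gives each distinct integer its own bar of width 1.
    * The x axis is numeric, so values nobody chose in group 1 (2 and 4)
    * show up as gaps rather than being dropped.
    twoway histogram Donation if Groupid == 1, discrete percent ///
        xlabel(0(1)5) xline(`mean') xtitle("Donation (£)")
    ```

    Because the axis is numeric rather than categorical, this addresses all three questions at once: empty values keep their place, -if- restricts the sample, and xline() draws the reference line.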



    • #3
      I join Andrew Musau in his implication that graph bar is not a good idea here, whereas a histogram may indeed be helpful.

      Much depends on whether #1 gives all the data, or you're talking about say 500 or 5000 people.

      Other ideas to throw into the discussion are quantile plots and even stem-and-leaf plots. The large number of values of GBP 110 is striking.
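
      A rough sketch of those alternatives, using the official quantile and stem commands on toy values (hypothetical numbers, not the survey data; only the variable names follow #1):

      ```stata
      * Toy data: hypothetical values, NOT the survey data
      clear
      input int Donation byte Groupid
      1 1
      20 1
      50 1
      110 1
      110 1
      110 1
      0 2
      110 2
      end

      * How many respondents in group 1 gave the top value?
      count if Donation == 110 & Groupid == 1

      * Quantile plot (ordered values against fraction of the data)
      quantile Donation if Groupid == 1

      * Stem-and-leaf display, printed in the Results window
      stem Donation if Groupid == 1
      ```

      A spike at one value appears as a flat run at the top of the quantile plot and as a long row of leaves in the stem-and-leaf display.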




      • #4
        Hi Andrew and Nick,

        Thank you very much for your help. I had been trying all sorts for many many hours until this point and couldn't figure it out so I really appreciate your advice.

        #1 is a small section of the data. I will look further into the commands used to better my understanding and look into your suggestions.

        Best wishes,
        Janina



        • #5
          Here is a token quantile plot using qplot from the Stata Journal. The distributions could be superimposed if desired.

          Code:
           qplot Donation, by(Groupid, note("")) scheme(s1color) yla(0(10)110, grid) xla(0 0.25 0.5 0.75 1)
          [Figure: qplot_gleed.png]
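
          As a sketch of the superimposed version, the two groups can be drawn in one panel. This assumes qplot is installed and that its over() option is available in the installed version. The data are toy values (hypothetical, not the survey data), and the guard simply skips the plot when qplot is missing.

          ```stata
          * Toy data: hypothetical values, NOT the survey data
          clear
          input int Donation byte Groupid
          1 1
          20 1
          50 1
          110 1
          0 2
          10 2
          90 2
          110 2
          end

          * qplot is community-contributed; check that it is installed first
          capture which qplot
          if _rc == 0 {
              * over() superimposes the groups; by() would give separate panels
              qplot Donation, over(Groupid) yla(0(10)110, grid) xla(0 0.25 0.5 0.75 1)
          }
          else {
              display "qplot not found; try: search qplot, sj"
          }
          ```

          Superimposing makes it easier to compare the groups quantile by quantile, at the cost of some clutter when the curves overlap.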



          • #6
            I hadn’t considered a quantile plot – I will look into these more as they seem to provide a very useful visualisation of the data. Thank you very much Nick!
