
  • Bar graph with standard errors for survey data

    Dear Statalist,

    I am trying to produce bar graphs with standard errors in Stata 14 using survey data.
    The easiest way I can think of is the following:

    Code:
    *Convert pweights into fweights
    local k = 2
    gen fwt = round(10^(`k')*weight,1)
    *Collapse data
    collapse (mean) meanX = X  (sd) sdX = X (count) n = X [fw=fwt], by(region)
    *Upper and lower values of confidence interval
    gen hiX = meanX + invttail(n-1,0.025)* sdX / sqrt(n)
    gen lowX = meanX - invttail(n-1,0.025)* sdX / sqrt(n)
    *create bar graph
    graph twoway (bar meanX region if region==1) ///
                         (bar meanX region if region==2) ///
                         (bar meanX region if region==3) ///
                         (bar meanX region if region==4)
    Is this right, or am I missing anything? Computed this way, the upper and lower bounds of the CI seem really close together.


    Thanks!
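    One likely reason the bounds in #1 sit so close together: once the weights are multiplied by 10^k and treated as frequency weights, n in the collapse is the sum of fwt rather than the number of respondents, so sdX / sqrt(n) shrinks by roughly a factor of 10 when k = 2. The sketch below illustrates this under stated assumptions. It uses Stata's nhanes2 example data, with bpsystol standing in for X and finalwgt (rescaled to mean 1) standing in for weight; none of these names come from the original post.

```stata
* Sketch only: nhanes2, bpsystol and finalwgt stand in for the poster's data
webuse nhanes2, clear

* Rescale the sampling weights to mean 1, then apply the 10^k trick with k = 2
summarize finalwgt, meanonly
gen double w = finalwgt / r(mean)
gen long fwt = round(100 * w)

* Standard error when the weights are treated as probability weights
mean bpsystol [pweight = w]
matrix P = r(table)
scalar se_pw = P[2,1]

* Standard error when the inflated weights are treated as frequency counts
mean bpsystol [fweight = fwt]
matrix F = r(table)
scalar se_fw = F[2,1]

display "SE with pweights: " se_pw
display "SE with fweights: " se_fw
display "ratio:            " se_pw / se_fw
```

    The point estimates should agree closely, while the frequency-weighted standard error is much smaller. Note that plain pweights still ignore strata and clusters; svy: mean accounts for those as well.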

  • #2
    In Stata terms you certainly need to start with twoway bar and add confidence intervals with other twoway commands if that is what you want.

    Some general terms here are detonator and dynamite plots.

    http://www.statalist.org/forums/foru...ver-dichotomic is a recent thread with links.

    http://biostat.mc.vanderbilt.edu/wik...de/Poster3.pdf is direct and devastating.

    Neither of these links offers detailed advice on tweaks with survey data. I would use mean's saved results to get confidence intervals and then plot those on graphs showing all the data.
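    A minimal sketch of that suggestion, for readers who do want bars: run svy: mean with over(), pull the estimates and confidence limits from r(table), and plot them with twoway bar and twoway rcap. The dataset (nhanes2) and variable (bpsystol) are stand-ins chosen for illustration, and the sketch assumes regions are coded 1 to 4 in the order the estimates are returned.

```stata
* Sketch only: nhanes2 and bpsystol stand in for the poster's data and X
webuse nhanes2, clear

* Design-based means by region; this example dataset is already svyset
svy: mean bpsystol, over(region)

* r(table) holds b, se, ll, ul, ... in rows; transpose so each region is a row
matrix T = r(table)'

* Replace the data in memory with one observation per region
clear
svmat double T, names(col)
gen region = _n

* Bars for the means, capped spikes for the 95% confidence limits
twoway (bar b region, barwidth(0.6)) ///
       (rcap ul ll region), ///
       xlabel(1(1)4) xtitle("Region") ///
       ytitle("Mean systolic blood pressure") legend(off)
```

    Because the limits come from svy: mean, they reflect the declared survey design rather than a sample size inflated by rescaled weights.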



    • #3
      cibar, a new-ish package, written by Alexander Staudt, might provide what you want. Type ssc install cibar, and read the help file. Without a snippet of your data I can't be sure (incidentally please ssc install dataex and use it), but the code you want is probably something like this

      Code:
      * run on the original data, before any collapse; X and fwt as defined in #1
      cibar X [fweight=fwt], over1(region)
      Last edited by Chris Larkin; 24 May 2017, 11:45. Reason: graphregion(col(white)) does not work as an option. I forgot that i'd edited the ado when I installed it



      • #4
        I just saw that I forgot to include the last line of the code I am currently using. So the right one is:
        Code:
          *Convert pweights into fweights local k = 2 gen fwt = round(10^(`k')*weight,1) *Collapse data collapse (mean) meanX = X  (sd) sdX = X (count) n = X [fw=fwt], by(region) *Upper and lower values of confidence interval gen hiX = meanX + invttail(n-1,0.025)* sdX / sqrt(n) gen lowX = meanX - invttail(n-1,0.025)* sdX / sqrt(n) *create bar graph graph twoway (bar meanX region if region==1) ///                      (bar meanX region if region==2) ///                      (bar meanX region if region==3) ///                      (bar meanX region if region==4) ///                      (rcap hiX lowX region)
        Can I use this one? PS: I want to stick to bar graphs, so the alternative of dotplot is not what I am looking for



        • #5
          Ahhh, I don't know why the code is being displayed in one row now.
          I just added one line at the bottom:

          Code:
             
                                (rcap hiX lowX region)
          Can I use this with the code posted above ?
          PS: I want to stick to bar graphs, so the alternative of dotplots is not what I am looking for



          • #6
            Your preferences are yours to follow, but as other people may be interested in this thread, I'll pursue #2. The Statalist link there is important to understand what I'm doing.

            Lacking a data example in #1 I turned to Stata's own examples. The help for mean includes this code

            Code:
            webuse highschool, clear
            svy: mean weight
            svy: mean weight, over(sex)
            so I started to play with plotting the data.

            The first discoveries, not surprisingly, are that the sample sizes are so large that standard errors or confidence intervals barely show on graphs and that weight is right-skewed, so comparisons on a transformed scale make more sense.

            Code:
            webuse highschool, clear
            
            gen log_weight = log(weight) 
            
            svy: mean log_weight, over(sex)
            mat table = r(table) 
            local mean1 = table[1,1] 
            local mean2 = table[1,2] 
            local ll1 = table[5,1] 
            local ll2 = table[5,2] 
            local ul1 = table[6,1] 
            local ul2 = table[6,2] 
            
            * install with -ssc inst mylabels- 
            mylabels 100(50)300, myscale(log(@)) local(yla) 
            
            * install from SJ site 
            qplot log_weight, over(sex) trscale(invnormal(@)) yla(`yla') aspect(1) ///
            ytitle(Weight (pounds)) mc(blue red) ///
            addplot(scatteri `mean1' -4 `mean1' 4, recast(line) lcolor(blue) /// 
            || scatteri `mean2' -4 `mean2' 4, recast(line) lcolor(red)) ///
            xtitle(normal quantile) yla(, ang(h)) /// 
            legend(order(1 2) pos(11) ring(0) col(1)) note(lines show geometric means)
            Although I won't plot confidence intervals, the code above gives examples of retrieving them from Stata's results.
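            As a hedged follow-up sketch, the same r(table) matrix can be back-transformed so that the means and confidence limits are reported on the original scale, as geometric means in pounds. The row positions (1 = estimate, 5 = lower limit, 6 = upper limit) are the ones used in the code above; the display layout is only an illustration.

```stata
* Sketch only: repeats the setup above, then back-transforms with exp()
webuse highschool, clear
gen log_weight = log(weight)

svy: mean log_weight, over(sex)
matrix T = r(table)

* Row 1 = mean of log weight, rows 5 and 6 = 95% confidence limits
forvalues j = 1/2 {
    display "group `j': geometric mean = " %6.1f exp(T[1,`j']) ///
        "  95% CI = [" %6.1f exp(T[5,`j']) ", " %6.1f exp(T[6,`j']) "]"
}
```

            Exponentiating the limits gives an interval for the geometric mean, which is generally not symmetric around it.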

            So, this is what I found:

            [Image: highschool.png]




            The moral is no more than any introductory text should explain. If you leap towards highly reduced summaries such as means +/- SE, you may miss structure in the data that could be interesting or important. Researchers in the field deserve the display even if they dismiss it as detail. There is more going on there than just the shift of distributions shown most strongly in the middle.



            • #7
              @ Chris: The cibar package also looks good. What is the difference in the calculation between the cibar command and the code(s) in #1 and #5 I posted above?

