  • How to draw a stacked bar graph in a twoway graph?

    Hello,

    I am a new user learning Stata.

    First, here is a description of the data I have built.

    area: house floor area, in 7 categories
    (~20㎡, 20~40㎡, 40~60㎡, 60~85㎡, 85~100㎡, 100~130㎡, 130㎡~)
    con_ym: contract year and month (2022.12~2023.12)
    trade_volume: trade volume for the given area category in each month
    total_trade_volume: total housing transaction volume in each month
    ratio: trade_volume / total_trade_volume, as a percentage

    Data Example
    area con_ym trade_volume total_trade_volume ratio
    ~20㎡ 202212 3,020 97,176 3.1
    ~20㎡ 202301 2,560 97,918 2.6
    ~20㎡ 202302 2,680 129,347 2.1
    ~20㎡ 202312 2,301 103,627 2.2
    20~40㎡ 202212 10,156 97,176 10.5
    20~40㎡
    20~40㎡ 202312 11,817 103,627 11.4
    40~60㎡ 202212 34,067 97,176 35.1
    40~60㎡
    40~60㎡ 202312 37,657 103,627 36.3
    60~85㎡ 202212 40,927 97,176 42.1
    60~85㎡
    60~85㎡ 202312 42,325 103,627 40.8
    85~100㎡ 202212 1,573 97,176 1.6
    85~100㎡
    85~100㎡ 202312 1,530 103,627 1.5
    100~130㎡ 202212 5,264 97,176 5.4
    100~130㎡
    100~130㎡ 202312 5,633 103,627 5.4
    130㎡~ 202212 2,169 97,176 2.2
    130㎡~
    130㎡~ 202312 2,364 103,627 2.3

    The data run from December 2022 to December 2023, with the 5 variables shown above.

    First, I used the code below to produce the stacked bar graph that follows.

    u "data.dta"

    gen trade_volume = 1

    collapse (count) trade_volume, by(area con_ym)

    egen total_trades_volume = total(trade_volume), by(con_ym)

    gen ratio = trade_volume / total_trades_volume * 100

    format %09.1fc ratio

    gen yyyymm = con_ym
    tostring yyyymm, replace
    gen year_month = substr(yyyymm, 3,2) + "." + substr(yyyymm, 5,2)

    * stacked bar graph *
    graph bar (asis) ratio, over(area) over(year_month) asyvars stack name(StackedBar, replace) ///
    title("2022.12 - 2023.12 Percentage of transactions by size") ///
    ylabel(, angle(horizontal)) ///
    blabel(bar, position(center) format(%9.1f)) ///
    ytitle("Transaction Share (%)") ///
    legend(order(1 "~20㎡" 2 "20㎡~40㎡" 3 "40㎡~60㎡" 4 "60㎡~85㎡" 5 "85㎡~100㎡" 6 "100㎡~130㎡" 7 "130㎡~") rows(2) position(6))

    [Figure: StackedBar.jpg, the stacked bar graph produced by the code above]




    However, I would like to draw the stacked bars with twoway rather than graph bar, because I want to show month and year as two levels of x-axis labels, with a dividing line between 2022 and 2023, as in the failed graph below.

    When I run the code below, the x-axis is displayed in the desired form, but the bars are not stacked properly.


    u "data.dta"

    gen trade_volume = 1

    collapse (count) trade_volume, by(area con_ym)

    egen total_trades_volume = total(trade_volume), by(con_ym)

    gen ratio = trade_volume / total_trades_volume * 100

    format %09.1fc ratio

    gen mdate = ym(con_y,con_m)

    su mdate if con_y == 2022, meanonly
    local m1 = r(mean)

    su mdate if con_y == 2023, meanonly
    local m2 = r(mean)

    levelsof mdate, local(labels)

    twoway (bar ratio mdate) ///
    , xla(`labels', noticks format(%tmNN)) ///
    xmla(`m1' "2022" `m2' "2023", tlength(*6) tlc(none) labsize(medium)) ///
    xli(`=ym(2022, 12) + 0.5', lp(dash) lw(thin) lc(gray)) ///
    xtitle("")


    [Figure: fail_sample.jpg, the twoway graph produced by the code above, with the desired x-axis but without properly stacked bars]
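    Note that con_y and con_m are not among the variables kept by the collapse shown above. A minimal sketch of one way to derive them, and mdate, from con_ym follows. It assumes con_ym is numeric in YYYYMM form (for example 202212), and the three-observation dataset is made up purely for illustration.

```stata
* Toy data standing in for the real file: con_ym assumed numeric YYYYMM
clear
set obs 3
gen long con_ym = 202212 in 1
replace con_ym = 202301 in 2
replace con_ym = 202312 in 3

* Split YYYYMM into year and month, then build a Stata monthly date
gen con_y = floor(con_ym / 100)
gen con_m = mod(con_ym, 100)
gen mdate = ym(con_y, con_m)
format mdate %tm

list con_ym con_y con_m mdate
```

    With a %tm format attached, mdate can be passed straight to xla() and xmla() as in the twoway call above.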




    To summarize, my questions are as follows.

    1. Is there a way to show two levels of x-axis labels and a year separator with graph bar?
    2. Is there a way to draw stacked bars in a twoway graph with that x-axis labelling?

    If possible, I would prefer method 2.

    Thank you for your expert help.

    Best wishes for peace and happiness in your home.
    Last edited by YEONOH HAN; 08 Feb 2024, 01:19.

  • #2
    Please show us the results of

    Code:
    dataex ratio mdate



    • #3
      Mr. Nick Cox,

      I am always trying new things with Stata and learning from your expert advice.

      Here are the results of running the command you asked for.

      I tried to be as detailed as possible, but perhaps I did not explain things adequately.




      dataex ratio mdate

      ----------------------- copy starting from the next line -----------------------
      Code:
      * Example generated by -dataex-. For more info, type help dataex
      clear
      input float(ratio mdate)
       19.09989 755
       7.354556 755
       29.74753 755
       31.83315 755
      1.2074643 755
       6.256861 755
       4.500549 755
       .2551386 756
      .26373878 756
      1.5107646 756
       1.745836 756
      .02293381 756
      .28380588 756
       .1863372 756
       .2866726 757
       .4558094 757
      2.8609924 757
       2.820858 757
      .09460195 757
       .4844767 757
      .27807242 757
      .26087207 758
      1.1868246 758
       2.952728 758
       3.377003 758
      .12040249 758
       .5991457 758
       .3726744 758
       .3296735 759
       .4816099 759
       3.377003 759
       3.973282 759
      .12326922 759
       .7654158 759
       .4414758 759
       .3067397 760
        .564745 760
       3.672276 760
       4.102285 760
      .14906974 760
       .8514176 760
       .4959436 760
       .4558094 761
       .7109481 761
       4.165353 761
      4.2914886 761
      .20640427 761
       .9718201 761
      .55614483 761
      .21500444 762
      .56187826 762
      3.8213456 762
      4.3803573 762
      .15193647 762
       .8800849 762
       .5733452 762
      .54181117 763
      .50167704 763
       4.099418 763
      4.4921594 763
      .15193647 763
       .9402861 763
      .56761175 763
       .1748703 764
       .5446779 764
      3.5633404 764
       4.056417 764
       .1748703 764
       .9173523 764
       .4672763 764
       .2293381 765
      .52174413 765
      2.2876472 765
      2.6517215 765
       .1003354 765
       .6048791 765
      .43574235 765
      .28093913 766
       .4787432 766
       1.642634 766
      2.0869765 766
      .09173523 766
      .47014305 766
       .3382736 766
      .24940516 767
      .28093913 767
       1.588166 767
      1.7515695 767
      .08600178 767
      .41567525 767
      .21787117 767
      end
      ------------------ copy up to and including the previous line ------------------

      Listed 91 out of 91 observations



      I look forward to your advice.

      Thank you as always.



      • #4
        Thanks for your data example. Now, however, I have lost track of what you are trying to do. Your last command in #1 was

        Code:
        twoway (bar ratio mdate)
        with various extra options. That makes sense to me as matching the last graph showing one value for ratio for each date.

        But you've presented a data example with several values for each mdate.

        I guess you need to show us the values of area too.
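        One quick way to see the mismatch is to count observations per date. twoway bar draws one bar per observation, so several observations sharing an mdate are drawn on top of each other rather than stacked. A small sketch on made-up data; nobs is a name introduced here for illustration.

```stata
* Toy data: 2 dates with 3 observations each
clear
set obs 6
gen mdate = 755 + (_n > 3)
gen ratio = _n

* Number of observations sharing each date
bysort mdate: gen nobs = _N
tab mdate

* isid returns a nonzero code when mdate does not identify observations uniquely
capture isid mdate
display _rc
```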



        • #5
          In addition to the dataex ratio mdate results you requested previously,
          I have added the area variable and am sharing the data again.

          I really want to get this resolved.

          Thank you as always for being so proactive and helpful.

          Here is the result of running the code.


          ----------------------- copy starting from the next line -----------------------
          Code:
          * Example generated by -dataex-. For more info, type help dataex
          clear
          input float(ratio mdate area)
           19.09989 755 1
           7.354556 755 2
           29.74753 755 3
           31.83315 755 4
          1.2074643 755 5
           6.256861 755 6
           4.500549 755 7
           .2551386 756 1
          .26373878 756 2
          1.5107646 756 3
           1.745836 756 4
          .02293381 756 5
          .28380588 756 6
           .1863372 756 7
           .2866726 757 1
           .4558094 757 2
          2.8609924 757 3
           2.820858 757 4
          .09460195 757 5
           .4844767 757 6
          .27807242 757 7
          .26087207 758 1
          1.1868246 758 2
           2.952728 758 3
           3.377003 758 4
          .12040249 758 5
           .5991457 758 6
           .3726744 758 7
           .3296735 759 1
           .4816099 759 2
           3.377003 759 3
           3.973282 759 4
          .12326922 759 5
           .7654158 759 6
           .4414758 759 7
           .3067397 760 1
            .564745 760 2
           3.672276 760 3
           4.102285 760 4
          .14906974 760 5
           .8514176 760 6
           .4959436 760 7
           .4558094 761 1
           .7109481 761 2
           4.165353 761 3
          4.2914886 761 4
          .20640427 761 5
           .9718201 761 6
          .55614483 761 7
          .21500444 762 1
          .56187826 762 2
          3.8213456 762 3
          4.3803573 762 4
          .15193647 762 5
           .8800849 762 6
           .5733452 762 7
          .54181117 763 1
          .50167704 763 2
           4.099418 763 3
          4.4921594 763 4
          .15193647 763 5
           .9402861 763 6
          .56761175 763 7
           .1748703 764 1
           .5446779 764 2
          3.5633404 764 3
           4.056417 764 4
           .1748703 764 5
           .9173523 764 6
           .4672763 764 7
           .2293381 765 1
          .52174413 765 2
          2.2876472 765 3
          2.6517215 765 4
           .1003354 765 5
           .6048791 765 6
          .43574235 765 7
          .28093913 766 1
           .4787432 766 2
           1.642634 766 3
          2.0869765 766 4
          .09173523 766 5
          .47014305 766 6
           .3382736 766 7
          .24940516 767 1
          .28093913 767 2
           1.588166 767 3
          1.7515695 767 4
          .08600178 767 5
          .41567525 767 6
          .21787117 767 7
          end
          label values area area_categorize8
          label def area_categorize8 1 "~20㎡", modify
          label def area_categorize8 2 "20㎡~40㎡", modify
          label def area_categorize8 3 "40㎡~60㎡", modify
          label def area_categorize8 4 "60㎡~85㎡", modify
          label def area_categorize8 5 "85㎡~100㎡", modify
          label def area_categorize8 6 "100㎡~130㎡", modify
          label def area_categorize8 7 "130㎡~", modify
          ------------------ copy up to and including the previous line ------------------

          Listed 91 out of 91 observations




          • #6
            Thanks for the extra detail. I still don't understand everything here, but I can show you some technique.

            The ratio values for December 2022 are percents of the December 2022 total, but the values for each month in 2023 are percents of the total for the whole of 2023. That doesn't seem consistent to me.

            Code:
            . tabstat ratio, by(mdate) s(sum)
            
            Summary for variables: ratio
            Group variable: mdate 
            
               mdate |       Sum
            ---------+----------
                 755 |       100
                 756 |  4.268555
                 757 |  7.281483
                 758 |   8.86965
                 759 |  9.491729
                 760 |  10.14248
                 761 |  11.35797
                 762 |  10.58395
                 763 |   11.2949
                 764 |  9.898805
                 765 |  6.831408
                 766 |  5.389445
                 767 |  4.589628
            ---------+----------
               Total |       200
            --------------------
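            If the intention is for each month's values to sum to 100, the percents can be recomputed within month. A minimal sketch on made-up numbers; the variable names total and pct are assumptions for illustration, not taken from the original data.

```stata
* Toy data: 2 months x 3 areas, with ratios that do not sum to 100 within month
clear
set obs 6
gen mdate = 755 + (_n > 3)
gen area  = mod(_n - 1, 3) + 1
gen ratio = cond(mdate == 755, 10 * area, area)

* Rescale so that percents sum to 100 within each month
egen total = total(ratio), by(mdate)
gen pct = 100 * ratio / total

* Check: the sum for each month should now be 100, up to rounding
tabstat pct, by(mdate) s(sum)
```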
            The area bins are ambiguous. Would 20 (for example) go in the lower class or the upper class? That is do your bins run > 20 etc. or >= 20 etc.? I have taken the upper limit as being inclusive and removed the awkward repetition of a tiny character for square metres.
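            When the categories are built from a continuous area variable, the treatment of the boundaries can be made explicit. The sketch below uses irecode(), which treats each cutpoint as an inclusive upper limit. The names area_m2 and areacat and the five test values are hypothetical, as the original floor areas are not shown in this thread.

```stata
* Toy data with values on and just beyond the bin boundaries
clear
set obs 5
gen area_m2 = 20 in 1
replace area_m2 = 20.5 in 2
replace area_m2 = 40 in 3
replace area_m2 = 130 in 4
replace area_m2 = 131 in 5

* irecode() returns 0 if x <= 20, 1 if 20 < x <= 40, ..., 6 if x > 130
gen areacat = 1 + irecode(area_m2, 20, 40, 60, 85, 100, 130)

label def areacat 1 "-20" 2 "-40" 3 "-60" 4 "-85" 5 "-100" 6 "-130" 7 "more"
label val areacat areacat
list
```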

            More importantly, I think your stacked bar chart suffers as most do from crowding. Here first is an alternative using tabplot from the Stata Journal.



            Code:
            * Example generated by -dataex-. For more info, type help dataex
            clear
            input float(ratio mdate area)
             19.09989 755 1
             7.354556 755 2
             29.74753 755 3
             31.83315 755 4
            1.2074643 755 5
             6.256861 755 6
             4.500549 755 7
             .2551386 756 1
            .26373878 756 2
            1.5107646 756 3
             1.745836 756 4
            .02293381 756 5
            .28380588 756 6
             .1863372 756 7
             .2866726 757 1
             .4558094 757 2
            2.8609924 757 3
             2.820858 757 4
            .09460195 757 5
             .4844767 757 6
            .27807242 757 7
            .26087207 758 1
            1.1868246 758 2
             2.952728 758 3
             3.377003 758 4
            .12040249 758 5
             .5991457 758 6
             .3726744 758 7
             .3296735 759 1
             .4816099 759 2
             3.377003 759 3
             3.973282 759 4
            .12326922 759 5
             .7654158 759 6
             .4414758 759 7
             .3067397 760 1
              .564745 760 2
             3.672276 760 3
             4.102285 760 4
            .14906974 760 5
             .8514176 760 6
             .4959436 760 7
             .4558094 761 1
             .7109481 761 2
             4.165353 761 3
            4.2914886 761 4
            .20640427 761 5
             .9718201 761 6
            .55614483 761 7
            .21500444 762 1
            .56187826 762 2
            3.8213456 762 3
            4.3803573 762 4
            .15193647 762 5
             .8800849 762 6
             .5733452 762 7
            .54181117 763 1
            .50167704 763 2
             4.099418 763 3
            4.4921594 763 4
            .15193647 763 5
             .9402861 763 6
            .56761175 763 7
             .1748703 764 1
             .5446779 764 2
            3.5633404 764 3
             4.056417 764 4
             .1748703 764 5
             .9173523 764 6
             .4672763 764 7
             .2293381 765 1
            .52174413 765 2
            2.2876472 765 3
            2.6517215 765 4
             .1003354 765 5
             .6048791 765 6
            .43574235 765 7
            .28093913 766 1
             .4787432 766 2
             1.642634 766 3
            2.0869765 766 4
            .09173523 766 5
            .47014305 766 6
             .3382736 766 7
            .24940516 767 1
            .28093913 767 2
             1.588166 767 3
            1.7515695 767 4
            .08600178 767 5
            .41567525 767 6
            .21787117 767 7
            end
            
            label values area area_categorize8
            label def area_categorize8 1 "-20", modify
            label def area_categorize8 2 "-40", modify
            label def area_categorize8 3 "-60", modify
            label def area_categorize8 4 "-85", modify
            label def area_categorize8 5 "-100", modify
            label def area_categorize8 6 "-130", modify
            label def area_categorize8 7 "more", modify 
            
            * my code starts here 
            
            forval d = 755/767 { 
                local lbl : di %tmMon `d'
                label def mdate `d' "`lbl'", add 
            }
            
            label val mdate mdate 
            
            tabplot area mdate [iw=ratio], percent(mdate) showval yreverse xtitle("") ///
                xmla(1 "2022" 7.5 "2023", tlength(*5) labsize(medium) tlc(none)) ///
                ytitle(Area (m{sup:2})) separate(area) subtitle(%) name(NJC1, replace)
            I am not sure it's much better but (1) you get to see the percents displayed (2) you lose the legend, always an advance. The bars could easily be made wider.

            [Figure: housearea1.png, tabplot of percent by area and month]

            Getting a stacked bar chart out of twoway is definitely possible, but not especially easy. Here is one way to do it. The bars could always be made narrower. I've not tried to superimpose the numbers on the graph.


            Code:
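            * one observation per month, one ratio variable per area category
            reshape wide ratio, i(mdate) j(area) 
            egen total = rowtotal(ratio?)
            
            * legend text for each area category
            local lbl1 "-20"
            local lbl2 "-40"
            local lbl3 "-60"
            local lbl4 "-85"
            local lbl5 "-100"
            local lbl6 "-130"
            local lbl7 "more" 
            
            * percent within month and cumulative percent; 
            * the legend order is built in reverse so that it matches the stacking 
            forval j = 1/7 { 
                gen pc`j' = 100 * ratio`j'/total
                local jm1 = `j' - 1 
                if `j' == 1 gen cumul1 = pc1 
                else gen cumul`j' = cumul`jm1' + pc`j'
                local J = 8 - `j'
                local lgdcall `lgdcall' `J' "`lbl`J''"
            }
            
            * first segment starts at 0; each later segment spans successive cumulative percents
            local call bar pc1 mdate 
            
            forval j = 2/7 { 
                local jm1 = `j' - 1 
                local call `call' || rbar cumul`jm1' cumul`j' mdate 
            }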
            
            twoway `call', legend(order(`lgdcall')) xtitle("") ///
                xla(755/767, noticks valuelabel) ///
                xmla(755 "2022" 761.5 "2023", tlength(*5) labsize(medium) tlc(none)) ///
                ytitle(Percent by area (m{sup:2})) name(NJC2, replace)
            [Figure: housearea2.png, stacked bar chart drawn with twoway]
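            For superimposing the numbers, one possibility is to keep the data in long form, compute the lower and upper edge of each segment with a running sum, and add a scatter layer with marker labels at the segment midpoints. This is only a sketch on made-up data with three areas and two months. The names lower, upper, mid and show are introduced for illustration, and the legend text is a placeholder.

```stata
* Toy data: 2 months x 3 areas, percents summing to 100 within month
clear
set obs 6
gen mdate = 755 + (_n > 3)
gen area  = mod(_n - 1, 3) + 1
gen pct   = cond(area == 1, 20, cond(area == 2, 30, 50))

* Running sum within month gives the top of each segment
bysort mdate (area): gen upper = sum(pct)
gen lower = upper - pct
gen mid   = (lower + upper) / 2
gen show  = string(pct, "%4.1f")

* One rbar layer per area, plus invisible markers carrying the labels
twoway (rbar lower upper mdate if area == 1, barw(0.8))         ///
       (rbar lower upper mdate if area == 2, barw(0.8))         ///
       (rbar lower upper mdate if area == 3, barw(0.8))         ///
       (scatter mid mdate, ms(none) mlabel(show) mlabpos(0)     ///
           mlabcolor(black)),                                   ///
       legend(order(3 "area 3" 2 "area 2" 1 "area 1"))          ///
       xla(755 756, format(%tmMon_CCYY)) xtitle("")             ///
       ytitle("Percent") name(labelled, replace)
```

            Working in long form avoids the reshape, at the cost of one if condition per area in the twoway call. Very thin segments may need their labels suppressed to stay readable.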



            • #7
              That should be < 20 or <= 20, etc.



              • #8
                Nick Cox

                With the code you provided, I am sure I can produce the graph I was aiming for.

                Thanks for your help as always.

                May you always have peace and happiness in your life.

                Thanks again.

