  • Using statplot to plot averages of one variable, grouped by a list of different variables

    Hello,

    Apologies for the potentially confusing title. This is my first post on Statalist and I couldn't quite think how to word the title.

    This is my issue. I'm working with data about market traders in Sierra Leone and Liberia. The questionnaire included a multiple-choice question (select all that apply) that asked people 'Which of the following do you sell? Vegetables, frozen food, meat, pharmaceuticals, etc.'

    I have managed to get a nice chart using statplot, which looks a bit like something I could have got with the 'multiple response' option in SPSS, using this code:

    Code:
    statplot sell_veg-sell_pharm, ///
    by(market_num,title("Items sold among traders in Duala and Kpetema market", span) subtitle("(Not including 'other')")) ///
    varopts(relabel(1 "Vegetables" 2 "Cloth" 3 "Meat" ///
    4 "Frozen food" 5 "Frozen fish" 6 "Dry fish" 7 "Dry farm produce" 8 "Electrical goods" ///
    9 "Furniture" 10 "Jewellery" 11 "Toiletries" 12 "Pharmaceuticals" 13 "Other") sort(1) descending) ///
    ytitle("Percentage of traders selling item") ///
    blabel(bar,position(outside) color(black) format(%9.1f))
    [Image: Items most sold.png]

    I have another variable which asks 'Does your stall require electricity?'. In my dataset, this is called 'elec_yn'. What I'm trying to get is the chart above, but instead of the mean of each variable (the percentage of traders who sell that item), I'd like it to show, for each item, the percentage of traders selling that item who require electricity.

    I can get this in a bunch of tables, by using the following code:

    Code:
    foreach v of varlist sell_veg-sell_pharm {
        table `v' (market_num elec_yn ), statistic(percent,across(elec_yn)) nototal
    }
    Which gives the following (just one example here)

    [Image: Items sold and electricity.png]

    I'm interested in the '34.62' and '21.74' figures - i.e. the percentage of traders that sell dry fish in Duala, and Kpetema, that say their business requires electricity.
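
    As a cross-check on a single cell of that table, something like the following should give the same kind of figure. This is only a sketch: it assumes elec_yn and sell_dryfish are both coded 0/100, as in the data example further down, and on that extract the results will not necessarily match 34.62 and 21.74.

```stata
* The mean of a 0/100 variable is the percentage answering "yes".
* Restricting to dry fish sellers gives the share of them requiring electricity.
tabstat elec_yn if sell_dryfish == 100, by(market_num) statistics(mean n) format(%9.2f)
```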

    I tried to get all of these tables into a chart format, and tried the following 'statplot' code:


    Code:
    statplot sell_veg-sell_pharm , ///
    over(elec_yn) ///
    by(market_num,title("Whether businesses require electricity, by item sold - Duala",span)) ///
    varopts(relabel(1 "Vegetables" 2 "Cloth" 3 "Meat" 4 "Frozen food" 5 "Frozen fish" 6 "Dry fish" 7 "Dry farm produce" 8 "Electrical goods" 9 "Furniture" 10 "Jewellery" 11 "Toiletries" 12 "Pharmaceuticals" 13 "Other") sort(1) descending) ///
    blabel(bar,position(outside) color(black) format(%9.1f))
    This gave me the following chart, which isn't what I want. I think it shows, of those who do (or don't) require electricity, the proportion who sell each of these items:

    [Image: Electricity yes no and products.png]

    Looking at the dry fish example, the percentage on the left chart that is 'yes' should be 34.6, and on the right chart should be 21.7.
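
    To make the direction of the conditioning concrete, here is a hedged sketch of what the over(elec_yn) chart appears to compute for one item in one market, assuming statplot is plotting means of the 0/100 indicators within each elec_yn group. The denominator here is traders in each electricity group, not traders selling dry fish, which would explain why the bars differ from the table.

```stata
* Percentage selling dry fish within each electricity group (Duala only).
* This conditions on elec_yn, the reverse of what the table reports.
tabstat sell_dryfish if market_num == 1, by(elec_yn) statistics(mean n) format(%9.1f)
```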

    Will I be able to do this with statplot? I can't think how to write the ordering of the variables, or if I need to do something a bit more clever. If this isn't possible in statplot, is there another way to do it?

    Here is a data example produced with dataex:

    Code:
    * Example generated by -dataex-. For more info, type help dataex
    clear
    input long market_num byte(sell_veg sell_cloth sell_meat sell_frozfood sell_freshfish sell_dryfish sell_dryfarm sell_elec sell_furn sell_jewell sell_toil sell_pharm) float elec_yn
    1   0 100   0   0   0   0   0   0   0 0   0   0 100
    1   0 100   0   0   0   0   0   0   0 0   0   0 100
    1   0   0   0   0   0   0   0   0   0 0   0   0 100
    1 100   0   0   0   0   0   0   0   0 0   0   0   0
    1   0   0   0   0   0   0 100 100   0 0   0   0 100
    1   0   0   0   0   0   0   0   0   0 0   0   0 100
    1 100   0   0   0   0   0   0   0   0 0   0   0   0
    1   0   0   0   0   0 100   0   0   0 0   0   0   0
    1   0   0   0   0 100   0   0   0   0 0   0   0   0
    1   0   0   0   0   0   0   0 100   0 0   0   0   0
    1   0   0 100   0   0 100   0   0   0 0   0   0   0
    1   0   0   0   0   0   0   0 100   0 0   0   0 100
    1   0   0   0   0   0   0 100   0   0 0   0   0   0
    1   0 100   0   0   0   0   0   0   0 0   0   0   0
    1   0 100   0   0   0   0   0   0   0 0   0   0 100
    1   0   0   0   0   0   0 100   0   0 0   0   0   0
    1 100   0   0   0   0   0   0   0   0 0   0   0   0
    1   0   0   0   0   0   0   0   0 100 0   0   0 100
    1 100   0   0   0   0   0 100   0   0 0   0   0   0
    1   0   0   0   0   0   0   0 100   0 0   0   0 100
    1   0   0   0   0   0   0   0 100   0 0   0   0 100
    1   0   0   0 100   0   0   0   0   0 0   0   0 100
    1 100   0   0   0   0   0   0   0   0 0   0   0 100
    1 100   0   0   0   0   0 100   0   0 0   0   0 100
    1   0   0   0   0 100   0   0   0   0 0   0   0 100
    1   0   0   0   0 100   0   0   0   0 0   0   0 100
    1 100   0   0 100   0   0 100   0   0 0   0   0   0
    1 100   0   0   0   0   0 100   0   0 0   0   0   0
    1   0   0 100   0   0 100   0   0   0 0   0   0 100
    1 100   0   0   0   0   0   0   0   0 0   0   0   0
    1 100   0   0   0   0   0 100   0   0 0   0   0 100
    1 100   0   0   0   0   0   0   0   0 0   0   0 100
    1   0   0   0   0   0   0 100   0   0 0   0   0   0
    1   0   0   0   0   0   0 100   0   0 0   0   0 100
    1   0   0   0   0   0   0   0   0   0 0   0   0 100
    1   0   0   0   0   0   0   0   0   0 0   0   0 100
    1   0   0   0   0   0   0   0   0   0 0   0   0 100
    1   0   0   0   0   0 100   0   0   0 0   0   0 100
    1   0   0   0   0   0   0   0   0   0 0   0   0 100
    1   0   0   0   0   0   0   0   0   0 0   0   0 100
    1   0   0   0   0   0   0   0   0   0 0   0   0 100
    1   0   0   0   0   0   0   0   0   0 0   0   0 100
    1   0   0   0   0   0   0   0   0   0 0   0   0 100
    1 100   0   0   0   0   0   0   0   0 0   0   0   0
    1   0   0 100 100   0   0   0   0   0 0   0   0   0
    1   0   0   0   0   0   0   0   0   0 0   0   0 100
    1   0   0   0   0   0   0 100   0   0 0   0   0   0
    1   0   0   0   0   0   0   0 100   0 0   0   0 100
    1   0   0   0   0   0   0   0   0   0 0   0   0 100
    2 100   0   0   0   0   0   0   0   0 0   0   0   0
    .   0   0   0   0 100   0   0   0   0 0   0   0   0
    2 100   0   0   0   0   0   0   0   0 0   0   0   0
    2 100   0   0   0   0   0   0   0   0 0   0   0   0
    . 100   0   0   0   0   0   0   0   0 0   0   0   0
    . 100   0   0   0   0   0   0   0   0 0   0   0   0
    2   0   0   0   0   0   0 100   0   0 0   0   0   0
    2   0   0   0 100   0   0   0   0   0 0   0   0 100
    2   0   0   0   0   0 100   0   0   0 0   0   0 100
    2 100   0   0   0   0   0   0   0   0 0   0   0 100
    2   0   0   0   0   0 100   0   0   0 0   0   0 100
    2 100   0   0   0   0   0   0   0   0 0   0   0 100
    2   0   0   0   0 100   0   0   0   0 0   0   0   0
    2 100   0   0   0   0   0   0   0   0 0   0   0   0
    2   0   0   0   0   0   0   0   0   0 0   0   0   0
    2 100   0   0   0   0   0   0   0   0 0   0   0   0
    2 100   0   0   0   0   0 100   0   0 0   0   0   0
    2 100   0   0   0   0   0   0   0   0 0   0   0   0
    2 100   0   0   0   0   0   0   0   0 0   0   0   0
    2 100   0   0   0 100 100   0   0   0 0   0   0   0
    2 100   0   0   0   0   0   0   0   0 0   0   0   0
    2 100   0   0   0   0 100   0   0   0 0   0   0   0
    2 100   0   0   0   0   0   0   0   0 0   0   0   0
    2 100   0   0   0   0   0   0   0   0 0   0   0   0
    2 100   0   0   0   0 100   0   0   0 0   0   0 100
    2 100   0   0   0   0 100   0   0   0 0   0   0   0
    2   0   0   0   0   0   0   0   0   0 0   0   0   0
    2   0   0   0   0   0   0   0   0   0 0   0   0   0
    2   0   0   0   0   0   0   0   0   0 0   0 100   0
    2 100   0   0   0   0   0   0   0   0 0   0   0   0
    2   0   0   0   0   0   0   0   0   0 0   0   0 100
    2   0   0   0   0   0 100   0   0   0 0   0   0   0
    2   0   0   0   0   0   0   0   0   0 0   0   0   0
    1   0   0   0   0   0   0   0   0   0 0   0   0 100
    1   0 100   0   0   0   0   0   0   0 0   0   0 100
    1   0   0   0   0   0   0   0   0   0 0   0   0 100
    1   0   0 100   0   0   0   0   0   0 0   0   0 100
    1   0   0   0   0   0   0   0   0   0 0   0   0 100
    1   0   0   0   0   0   0 100   0   0 0   0   0   0
    1   0   0   0   0 100   0   0   0   0 0   0   0   0
    1   0   0   0   0   0   0   0   0   0 0   0   0 100
    1   0   0   0   0   0   0   0 100   0 0   0   0 100
    1   0 100   0   0   0   0   0   0   0 0   0   0   0
    1 100   0   0   0   0   0   0   0   0 0   0   0   0
    1   0   0   0   0   0   0 100   0   0 0   0   0 100
    1   0   0   0   0   0   0   0   0   0 0   0 100 100
    1   0   0   0   0   0   0   0   0   0 0   0   0 100
    1   0   0   0   0   0   0 100   0   0 0   0   0 100
    1   0   0   0   0   0   0   0   0   0 0 100   0   0
    1   0   0   0   0   0   0   0   0   0 0   0   0   0
    1   0 100   0   0   0   0   0   0   0 0   0   0   0
    end
    label values market_num market_num
    label def market_num 1 "Duala", modify
    label def market_num 2 "Kpetema", modify
    label values elec_yn elec_yn
    label def elec_yn 0 "No", modify
    EDIT:
    I can get the chart for just one of the market products, using the following code:

    Code:
    graph hbar, over(elec_yn) over(sell_dryfish,relabel(0 "Doesn't sell dry fish" 100 "Sells dry fish")) ///
    by(market_num,title("Whether respondent requires electricity," "by whether they sell dry fish or not",span)) ///
    asyvars percentages ///
    blabel(bar,position(inside) color(white) format(%9.1f))
    Which gives me this chart. I can't make the category labels work: 0 should be 'doesn't sell dry fish' and 100 should be 'sells dry fish'. The values are the same as in the table, so the chart itself is correct.
    [Image: Dry fish vs electricity correct.png]

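    On the labels in the dry fish chart: relabel() in graph bar and graph hbar refers to categories by their position (1, 2, ...) rather than by their stored values, so a variant along these lines may fix them. This is a sketch only, assuming sell_dryfish takes just the values 0 and 100, so that 0 is the first category and 100 the second.

```stata
* relabel() is indexed by category position (1 = first, 2 = second),
* not by the underlying values 0 and 100
graph hbar, over(elec_yn) ///
    over(sell_dryfish, relabel(1 "Doesn't sell dry fish" 2 "Sells dry fish")) ///
    by(market_num, title("Whether respondent requires electricity," "by whether they sell dry fish or not", span)) ///
    asyvars percentages ///
    blabel(bar, position(inside) color(white) format(%9.1f))
```
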
    What I'm basically after is one chart which shows only the '100' and 'yes' values, for 11 different variables / market products. E.g. it would look something like this (apologies for the basic Excel mock-up; I realise it's the wrong forum!):

    [Image: Item and electricity_small.png (Excel mock-up)]

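    One possible route to a chart like the mock-up, sketched under assumptions rather than tested on the full data: reshape the item indicators to long form, keep only the rows where the item is sold, and plot the mean of elec_yn for each item. The names id and item are introduced here purely for illustration, and the sketch assumes the sell_* variables and elec_yn are coded 0/100 as in the data example.

```stata
preserve
keep market_num elec_yn sell_veg-sell_pharm
gen long id = _n                            // hypothetical trader identifier
reshape long sell_, i(id) j(item) string    // one row per trader-item pair
keep if sell_ == 100                        // keep only traders who sell the item
* elec_yn is 0/100, so its mean is the percentage requiring electricity
graph hbar (mean) elec_yn, over(item, sort(1) descending) ///
    by(market_num, title("Traders requiring electricity, by item sold", span)) ///
    ytitle("Percentage of sellers requiring electricity") ///
    blabel(bar, position(outside) color(black) format(%9.1f))
restore
```

    The bars would be labelled with the variable suffixes (veg, cloth, ...); a relabel() suboption inside over(item, ...) could replace them with readable names. Traders who sell several items contribute one row per item, which matches the select-all-that-apply structure of the question.
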
    Thank you in advance.
    Last edited by Luke Armitage; 03 Apr 2022, 16:12.