Announcement

Collapse
No announcement yet.
X
  • Filter
  • Time
  • Show
Clear All
new posts

  • Summarizing essential information by some conditions

    Hello!
    Now I am working with many files from 2006 to 2017 (each file for each year). In every of them there are date and month of interview, cost for food in this period, year of birth of a person and so on. I need to make a column with the number of week (1, 2, 3, 4) for each month and a column with money that he/she spent on food. I have already made a column for weeks. So, for every row I defined a number from 1 to 4. I do not know how to calculate the sum of costs for a particular week and month. Eventually, I need to make a linear graph, where on X axis I will have the number of week and month, Y axis - costs.
    For example,
    week month costs
    4 12 200
    1 11 500
    2 10 300
    3 10 550
    4 12 350
    1 11 400
    2 12 250
    And I think that I need to get something like
    week month costs
    1 11 900
    2 10 300
    2 12 250
    3 10 550
    4 12 550
    Later I suppose I will need to sort it in order to make a graph, but, actually, now I do not know, how to draw it, when I have two conditions on a week and a month.
    Could you help me with these problems, please?
    Thank you in advance

  • #2
    Here is a sketch. You will need to incorporate year into the code given that you have multi-year data. With many time periods, you may also consider switching the axes to improve readability. I use labmask from Stata Journal, authored by Nick Cox.

    Code:
    * Example generated by -dataex-. To install: ssc install dataex
    clear
    input float(week month costs)
    4 12 200
    1 11 500
    2 10 300
    3 10 550
    4 12 350
    1 11 400
    2 12 250
    end
    
    bys week month: egen tc= total(costs)
    egen time= group(week month)
    label define month 1 "Jan" 2 "Feb" 3 "Mar" 4  "Apr" 5 "May" 6 "Jun" 7 "Jul" 8 "Aug" 9 "Sep" 10 "Oct" 11 "Nov" 12 "Dec"
    lab values month month
    decode month, gen(strmonth)
    gen lab_time= "Week "+ string(week) + " " +strmonth
    labmask time, values(lab_time)
    line tc time, xlab(, val) ytitle("Expenditure") xtitle("") xscale(r(0.75 5.3)) scheme(s1color)
    Click image for larger version

Name:	Graph.png
Views:	1
Size:	18.9 KB
ID:	1541240





    ADDED IN EDIT: Note that the x-axis is not correctly ordered since I grouped by week-month and not month-week. So you want to group by year-month-week. The best way to achieve this is to create a time variable that Stata understands.
    Last edited by Andrew Musau; 13 Mar 2020, 16:54.

    Comment


    • #3
      Originally posted by Andrew Musau View Post
      Here is a sketch. You will need to incorporate year into the code given that you have multi-year data. With many time periods, you may also consider switching the axes to improve readability. I use labmask from Stata Journal, authored by Nick Cox.

      Code:
      * Example generated by -dataex-. To install: ssc install dataex
      clear
      input float(week month costs)
      4 12 200
      1 11 500
      2 10 300
      3 10 550
      4 12 350
      1 11 400
      2 12 250
      end
      
      bys week month: egen tc= total(costs)
      egen time= group(week month)
      label define month 1 "Jan" 2 "Feb" 3 "Mar" 4 "Apr" 5 "May" 6 "Jun" 7 "Jul" 8 "Aug" 9 "Sep" 10 "Oct" 11 "Nov" 12 "Dec"
      lab values month month
      decode month, gen(strmonth)
      gen lab_time= "Week "+ string(week) + " " +strmonth
      labmask time, values(lab_time)
      line tc time, xlab(, val) ytitle("Expenditure") xtitle("") xscale(r(0.75 5.3)) scheme(s1color)
      [ATTACH=CONFIG]n1541240[/ATTACH]




      ADDED IN EDIT: Note that the x-axis is not correctly ordered since I grouped by week-month and not month-week. So you want to group by year-month-week. The best way to achieve this is to create a time variable that Stata understands.
      Andrew, thank you very much!
      I will try.

      Comment

      Working...
      X