Announcement

Collapse
No announcement yet.
X
  • Filter
  • Time
  • Show
Clear All
new posts

  • calculating food consumption by HH

    Hi
    I am a PhD student working on 24 hours food consumption data at household level.Food items are entered in different column as variables below. I have also counted who ate fish and not in fish column below:
    Fish food1 quantity food2 quantity food3 HH ID
    0 veg 250 egg 50 rice 1
    0 rice 540 lentils 165 milk 1
    1 fish 150 meat 150 veg 1
    0 lentils 20 rice 678 fruits 2
    1 egg 33 fish 56 vitaA 2
    0 fruits 70 meat 67 meat 3
    1 milk 250 cake 89 fish 3
    I want to calculate total quantity of fish consumed by the household. Any support in this regard highly appreciated.

    Thank you,
    ​​​​​​​Rumana

  • #2
    Let's start out on the right track here. This is not from a Stata data set. It can't be, because you cannot have two variables both named quantity. Even if we correct that, this data cannot provide an answer to the question you pose because no quantity is provided for food3.

    If you have not yet imported your data into Stata, then it is premature to ask for help with data management or analysis (other than how to import it if you do not know.) So first create a real Stata data set. Then post back showing an example of your data from there. Please use the -dataex- command to do that. If you are running version 15.1 or a fully updated version 14.2, it is already part of your official Stata installation. If not, run -ssc install dataex- to get it. Either way, run -help dataex- to read the simple instructions for using it. -dataex- will save you time; it is easier and quicker than typing out tables. It includes complete information about aspects of the data that are often critical to answering your question but cannot be seen from tabular displays or screenshots. It also makes it possible for those who want to help you to create a faithful representation of your example to try out their code, which in turn makes it more likely that their answer will actually work in your data.

    If you present the data better, I am confident you will get a timely and helpful response.

    Comment


    • #3
      Thank you so much Schechter for the reply! very sorry for the confusion, I just created a dummy data (in excel) to explain the situation (did not notice repeated name). It would be quantity_1 and quantity_2 and so on as below. Fish column '0' means not taken fish and '1' means eaten fish. Yes I am using version 15.1. I am very new in STATA and my data set is already in state. Please see the dummy data below and let me know if it make sense now. I want to calculate how much (quantity) fish is consumed by the household. The measurement was taken in grams. Data is last 24 hours food consumption of the Ohs.

      Thank you so much!
      Fish food_1 quantity_1 food_2 quantity_2 food_3 quantity_3 HH ID
      0 veg 250 egg 50 rice 677 1
      0 rice 540 lentils 165 milk 98 1
      1 fish 150 meat 150 veg 500 1
      0 lentils 20 rice 678 fruits 230 2
      1 egg 33 fish 56 vitaA 150 2
      0 fruits 70 meat 67 meat 80 3
      1 milk 250 cake 89 fish 170 3

      Comment


      • #4
        OK. Now, this data layout is a hybrid of long and wide, so it is going to be very difficult to work with. The first step is to put it into fully long layout, so that each observation contains variables HHID, food, and quantity, with just a single food in each observation. Then it is easy to total up, within each household, just the quantities that are associated with food == "fish".

        Code:
        * Example generated by -dataex-. To install: ssc install dataex
        clear
        input byte fish str7 food_1 int quantity_1 str7 food_2 int quantity_2 str6 food_3 int quantity_3 byte hhid
        0 "veg"     250 "egg"      50 "rice"   677 1
        0 "rice"    540 "lentils" 165 "milk"    98 1
        1 "fish"    150 "meat"    150 "veg"    500 1
        0 "lentils"  20 "rice"    678 "fruits" 230 2
        1 "egg"      33 "fish"     56 "vitaA"  150 2
        0 "fruits"   70 "meat"     67 "meat"    80 3
        1 "milk"    250 "cake"     89 "fish"   170 3
        end
        
        gen long obs_no = _n
        
        reshape long food_ quantity_, i(obs_no) j(_j)
        by hhid (obs_no _j), sort: egen total_fish_consumed = total(cond(food_ == "fish", quantity_, .))
        I don't know if this is the way you want your results organized. If what you want is something like your original data layout, with an extra variable called total_fish_consumed, containing the same value in each observation associated with the same HHID, you could just add -reshape wide- to the end of the above code and that is what you will have. But I strongly recommend against that because any further analyses you want to do with this data will probably be difficult or impossible with that layout.

        If what you really want is for a data set to contain one observation per household, and that observation to contain just the HHID and the total fish consumed, then instead of the -by hhid (obs_no _j).....- command, you would instead run
        Code:
        keep if food_ == "fish"
        collapse (sum) quantity_, by(hhid)
        Now, once again, your data display cannot have come from a real Stata data set, because HH ID is an illegal variable name: embedded blanks are not legal. I gave you extensive advice on how to find and use the -dataex- command to show the actual example data in my last post. In the current post, I have imported your data, corrected the illegal variable name, and used -dataex- to create the example data set in the code above. I cannot overemphasize the importance of using -dataex- to show example data. In this instance, the omissions and deficiencies of a typed-out table were mostly unimportant (except for the variable name issue) and easily overcome. But often they are not. There is no substitute for -dataex-. Please use it exclusively in the future. It will also actually save you time: it is much faster than hand-constructing tables like yours.

        When asking for help with code, always show example data. When showing example data, always use -dataex-.


        Comment


        • #5
          Thank you so much Schechter for the quick responses! As most of the food name in Bangla, therefore was reluctant to share original state data! Here comes the original data: very sorry for taking your too much of your time. In the data set I just kept the households who ate fish (fish '1') for sure along with other foods. 'a01' is the household ID and x1_07_01, x1_07_02, x1_07_03 are name of different foods including fish (green color) and x1_08_01, x1_08_02, x1_08_03 are weight of all the consumed foods in grams including fish.

          May I know please how can I get the weight of fish consumed by each household. I highly appreciate your support.

          Sincerely,
          Rumana
          a01 x1_07_01 x1_08_01 x1_07_02 x1_08_02 x1_07_03 x1_08_03 x1_07_04 fish
          1 Lau Shak 250 Koi fish 500 Potato 300 Onion 1
          2 Water gour 2500 Tengra fish 250 Onion 10 Turmeric ( 1
          2 Sweet gour 500 Bele fish 200 Onion 48 Turmeric ( 1
          3 Beher gura 45 Potato 60 Rui fish 20 Onion 1
          3 cod fish 250 Potato 150 Lau Shak 250 Onion 1
          4 Rui fish 200 Potato 99 Dried chil 3 Turmeric ( 1
          4 Mung 60 Coconut 40 Rui fish 120 Potato 1
          5 Lau Shak 250 Puti fish 250 Potato 250 Onion 1
          5 tuna fish 250 Onion 35 Garlic 10 Ginger 1
          6 Tatkeni fish 250 Lau Shak 60 Onion 28 Garlic 1
          6 Telapia fish 360 Green bana 160 Onion 48 Garlic 1
          7 tuna fish 150 Dried chil 3 Salt (Iodi 6 Soybean 1
          7 Telapia fish 500 Green chil 15 Onion 200 Dried chil 1
          8 Water gour 1000 Golda fish 250 Onion 40 Garlic 1
          Last edited by rumana akter; 09 Jul 2018, 22:47.

          Comment


          • #6
            Well, I don't know what some of these foods are. The code below will calculate the total quantity of foods that contain the word "fish". But if, say, Water gour is a kind of fish, that will be missed. Without knowing exactly which of these foods are fish, I can't offer a better solution.

            Code:
            * Example generated by -dataex-. To install: ssc install dataex
            clear
            input byte a01 str12 x1_07_01 int x1_08_01 str11 x1_07_02 int x1_08_02 str10 x1_07_03 int x1_08_03 str10 x1_07_04 byte fish
            1 "Lau Shak"      250 "Koi fish"    500 "Potato"     300 "Onion"      1
            2 "Water gour"   2500 "Tengra fish" 250 "Onion"       10 "Turmeric (" 1
            2 "Sweet gour"    500 "Bele fish"   200 "Onion"       48 "Turmeric (" 1
            3 "Beher gura"     45 "Potato"       60 "Rui fish"    20 "Onion"      1
            3 "cod fish"      250 "Potato"      150 "Lau Shak"   250 "Onion"      1
            4 "Rui fish"      200 "Potato"       99 "Dried chil"   3 "Turmeric (" 1
            4 "Mung"           60 "Coconut"      40 "Rui fish"   120 "Potato"     1
            5 "Lau Shak"      250 "Puti fish"   250 "Potato"     250 "Onion"      1
            5 "tuna fish"     250 "Onion"        35 "Garlic"      10 "Ginger"     1
            6 "Tatkeni fish"  250 "Lau Shak"     60 "Onion"       28 "Garlic"     1
            6 "Telapia fish"  360 "Green bana"  160 "Onion"       48 "Garlic"     1
            7 "tuna fish"     150 "Dried chil"    3 "Salt (Iodi"   6 "Soybean"    1
            7 "Telapia fish"  500 "Green chil"   15 "Onion"      200 "Dried chil" 1
            8 "Water gour"   1000 "Golda fish"  250 "Onion"       40 "Garlic"     1
            end
            
            gen long obs_no = _n
            rename x1_07_(##) food(#)
            rename x1_08_(##) quantity(#)
            reshape long food quantity, i(obs_no) j(_j)
            replace food = lower(food)
            keep if strpos(food, "fish")
            collapse (sum) total_fish_consumed = quantity, by(a01)
            By the way, I strongly recommend that when you work with this data, that you rename the variables so that the names reflect what the variable means. While there are often good reasons for survey data curators to use variable naming systems like x_07_01 etc., they are difficult to work with. Certainly if you have to come back to this 6 months from now because a question about your work with it has arisen, you will struggle to remember what each variable is and why you used it the way you did. If you rename them suggestively (as in my code above), you will find it much easier for yourself, and also for anybody else who might have to review your code. You don't have to go to the trouble of renaming every variable in the survey data set, just those you plan to actually work with.

            Once again, I ask that going forward you use -dataex- to show example data. Your table in #5 was workable, but I am sure it was more work for you to create it than it would have been to use -dataex-, and it would have been easier for me to just start from a -dataex- example as well.
            Last edited by Clyde Schechter; 09 Jul 2018, 22:56.

            Comment


            • #7
              Thank you so much! let me try the way you have suggested and see if I can manage it!

              Kind regards,
              Rumana

              Comment


              • #8
                Thank you so much! These syntax are amazing! Few more things to know which you have perfectly pointed at the beginning that there are some foods where 'fish' is not written as 'fish'; just used the local name like 'Takeni'. In that case should I replace all the names of fish as fish using the code number of each fish. In that case, what would be the syntax. Second question is what does '_j' variable mean.
                I would be so grateful if you have time to explain these. This is surveillance data of IFPRI which seems so big for me as a newcomer!

                Sincerely,
                Rumana
                Last edited by rumana akter; 10 Jul 2018, 00:29.

                Comment


                • #9
                  The actually data was as below. I added fish manually that you can understand who is fish:
                  a01 x1_07_01 x1_08_01 x1_07_02 x1_08_02 x1_07_03 x1_08_03 x1_07_04 fish
                  1 Lau Shak 250 Koi 500 Potato 300 Onion 1
                  2 Water gour 2500 Tengra 250 Onion 10 Turmeric ( 1
                  2 Sweet gour 500 Bele 200 Onion 48 Turmeric ( 1
                  3 Beher gura 45 Potato 60 Rui fish 20 Onion 1
                  3 cod 250 Potato 150 Lau Shak 250 Onion 1
                  4 Rui 200 Potato 99 Dried chil 3 Turmeric ( 1
                  4 Mung 60 Coconut 40 Rui 120 Potato 1
                  5 Lau Shak 250 Puti fish 250 Potato 250 Onion 1
                  5 tuna 250 Onion 35 Garlic 10 Ginger 1
                  6 Tatkeni 250 Lau Shak 60 Onion 28 Garlic 1
                  6 Telapia 360 Green bana 160 Onion 48 Garlic 1
                  7 tuna fish 150 Dried chil 3 Salt (Iodi 6 Soybean 1
                  7 Telapia fish 500 Green chil 15 Onion 200 Dried chil 1
                  8 Water gour 1000 Golda 250 Onion 40 Garlic 1

                  Comment


                  • #10
                    Sorry to ask you too many questions! I have the code numbers of all fish (in row) by name within each variable like x1_07_01 (column) and there are 7 variables like this. How can I replace all the names of fish as 'fish' using the code number that at the end I am able to calculate the weight of fish as you did above.

                    Thank you so much in advance!

                    Comment


                    • #11
                      I don't understand what you're saying here. You refer to code numbers, but there are no variables in your data set that look like code numbers for the foods. You say you have a list of the code numbers of all fish, but you don't show it or even explain in what form it is. The data in #9 does not in any way direct way help me figure out which foods are fish. For example, in the second row of that table, while I am sure that turmeric and onion are not fish, I do not know whether that row contains a fish item because of Tengra or Water gour, having never heard of either of those foods.

                      May I ask, by the way, what is the reason for your continuing refusal to use -dataex- to show your example?

                      Comment


                      • #12
                        I think you misunderstood what Clyde suggested at #6. He did not advocate to change the fish-codings rather suggested you to change the name of the fish-variables (if they are fish at all !!!). For example one of your variables name is "x_07_01" and you can change the name for your convenience so that you can identify it distinctively. Use the code below to change the name of the variable "x_07_01":
                        Code:
                        ren x_07_01 bizarre_fish
                        As Clyde pointed out, you clearly do not have all items fishes there (onion, chilli, salt, potato !!). Moreover, you have same items in different variables i.e. Rui in both x1_07_01 and in x1_07_03. Collapsing mean over those items will be wrong.
                        Finally, please read through the FAQ.section on how to make a meaningful post. This will help you to receive meaningful answers and will help others who are willingly helping you. Specially read the sections for using "dataex" to provide data example and use of code delimiters when providing Stata codes. And we like to see the exact command/codes that you have used in Stata and what Stata gave you as output.
                        Last edited by Roman Mostazir; 10 Jul 2018, 08:49.
                        Roman

                        Comment


                        • #13
                          Sorry as I did not use -dataex- before, therefore did not dare to use it! Please see below if it works now:

                          input byte a01 str12 x1_07_01 int x1_08_01 str11 x1_07_02 int x1_08_02 str10 x1_07_03 int x1_08_03 str10 x1_07_04 byte fish
                          1 "Lau Shak" 250 "Koi" 500 "Potato" 300 "Onion" 1
                          2 "Water gour" 2500 "Tengra" 250 "Onion" 10 "Turmeric (" 1
                          2 "Sweet gour" 500 "Bele" 200 "Onion" 48 "Turmeric (" 1
                          3 "Beher gura" 45 "Potato" 60 "Rui" 20 "Onion" 1
                          3 "cod" 250 "Potato" 150 "Lau Shak" 250 "Onion" 1
                          4 "Rui" 200 "Potato" 99 "Dried chil" 3 "Turmeric (" 1
                          4 "Mung" 60 "Coconut" 40 "Rui" 120 "Potato" 1
                          5 "Lau Shak" 250 "Puti" 250 "Potato" 250 "Onion" 1
                          5 "tuna" 250 "Onion" 35 "Garlic" 10 "Ginger" 1
                          6 "Tatkeni" 250 "Lau Shak" 60 "Onion" 28 "Garlic" 1
                          6 "Telapia" 360 "Green bana" 160 "Onion" 48 "Garlic" 1
                          7 "tuna" 150 "Dried chil" 3 "Salt (Iodi" 6 "Soybean" 1
                          7 "Telapia" 500 "Green chil" 15 "Onion" 200 "Dried chil" 1
                          8 "Water gour" 1000 "Golda" 250 "Onion" 40 "Garlic" 1
                          end

                          All these colored names are fish.How can I calculate their weight to see the amount of fish re consumed by the households.

                          What I also tried to say in the previous post about coding is that if I do- tab x1_07_01, nol then I can see the numeric codes of each food item than name. Very sorry if I confused you. I am using state for the first time! Thank you so much in advance!

                          Comment


                          • #14
                            Code:
                            clear
                            input byte a01 str12 x1_07_01 int x1_08_01 str11 x1_07_02 int x1_08_02 str10 x1_07_03 int x1_08_03 str10 x1_07_04 byte fish
                            1 "Lau Shak" 250 "Koi" 500 "Potato" 300 "Onion" 1
                            2 "Water gour" 2500 "Tengra" 250 "Onion" 10 "Turmeric (" 1
                            2 "Sweet gour" 500 "Bele" 200 "Onion" 48 "Turmeric (" 1
                            3 "Beher gura" 45 "Potato" 60 "Rui" 20 "Onion" 1
                            3 "cod" 250 "Potato" 150 "Lau Shak" 250 "Onion" 1
                            4 "Rui" 200 "Potato" 99 "Dried chil" 3 "Turmeric (" 1
                            4 "Mung" 60 "Coconut" 40 "Rui" 120 "Potato" 1
                            5 "Lau Shak" 250 "Puti" 250 "Potato" 250 "Onion" 1
                            5 "tuna" 250 "Onion" 35 "Garlic" 10 "Ginger" 1
                            6 "Tatkeni" 250 "Lau Shak" 60 "Onion" 28 "Garlic" 1
                            6 "Telapia" 360 "Green bana" 160 "Onion" 48 "Garlic" 1
                            7 "tuna" 150 "Dried chil" 3 "Salt (Iodi" 6 "Soybean" 1
                            7 "Telapia" 500 "Green chil" 15 "Onion" 200 "Dried chil" 1
                            8 "Water gour" 1000 "Golda" 250 "Onion" 40 "Garlic" 1
                            end
                            
                            local fishlist1 `""koi", "tengra", "bele", "rui", "cod", "puti""'
                            local fishlist2 `""tuna", "tatkeni", "telapia", "tuna", "golda""'
                            
                            rename x1_07_* item*
                            rename item(##) item(#)
                            rename x1_08_* quantity*
                            rename quantity(##) quantity(#)
                            
                            gen long obs_no = _n
                            reshape long item quantity, i(obs_no) j(_j)
                            drop if missing(quantity, item)
                            replace item = trim(itrim(lower(item)))
                            keep if inlist(item, `fishlist1') | inlist(item, `fishlist2')
                            
                            collapse (sum) total_fish_consumed = quantity, by(a01)
                            
                            list, noobs clean
                            I use two different lists for fish names because the -inlist()- function will only allow a maximum of 9 options to compare with, and there are ten different fish types in your data. If there are still others in your full data, you may need to add them to the lists shown above (never allowing more than 9 per list), or start additional lists.

                            Thank you for the -dataex- example.

                            Comment


                            • #15
                              Thank you so much Clyde! I highly appreciate your time and efforts on this!
                              Yes this is a national level surveillance data, therefore list of foods are quite long. I might need to make couple of fish list as above.

                              I would like to acknowledge your name to my thesis!

                              Kind regards,
                              Rumana

                              Comment

                              Working...
                              X